combinational loops

alb

Hi Mike,

Have a look at the RTL viewer. Some variable array is too large or some
block RAM does not fit the device.

If you don't have a viewer, start out with the free Quartus tools.

as Andy said, the synthesizer did not quite realize there was an FSM.
The fact that the next state was taken from a register confused it (I
wonder why, since a state is just a register...).

Start with just the reset procedure:
check the reset pulse and all static outputs.

reset procedure and static output look ok. The whole code simulated ok.
Add a clock and counter and check that. etc.

I agree, I need to verify at each step what the rtl viewer shows, in
order to be confident about what I'm doing.
I never go near the bench until the RTL view is as expected.

Pressures from above unfortunately... but I should admit that I looked
at the RTL only *after* finding the HW did not work.
I would read one burst at a time and check in simulation that the bus
port has Z, 1, 0 at the right times.

I have a 1-wire slave in the TB and sims ok, all sequences are in the
correct place as expected. What is not taken into account is the powerup
time which might be different in sim and HW.
 
Andy

Al,

I checked your update_ports procedure. In a clocked process, you cannot have a combinatorial in-to-out path. I'm surprised it is not flagged as a warning by the synthesis tool. I really don't know what circuitry you might get from using input ports in the update_ports procedure, but it is possible that, on a clock-cycle basis, the circuitry might still match simulation.

All outputs driven by update_ports must be functions only of process variables or constants/generics, NOT input ports or signals. The references to variables in update_ports are to the registered values thereof (remember, update_ports also runs on the falling edge of the clock), and any logic (decision making, expression, etc.) in the procedure is combinatorial after the registers.

Just like in a combinatorial process, you also have to make sure that any ports that may be assigned in update_ports are always assigned, or else a latch will be inferred to remember the last value assigned to the port.

If update_ports were called inside the clocked if-then-else, the cycle-based timing of the outputs would be identical, but the input variables would be combinatorial, along with any logic in the procedure, and the ports would be registered. In this case, you would also need to reset the output port registers.

The synchronous process template with three procedures (init_regs, update_regs, update_ports) works well within limits, but those limits must be observed, and are not always well understood. I never really bought into the idea that anything was improved by using procedures here; I tend to see the three regions of the clocked process as having those three distinct functions anyway, procedure or not. Reuse of a procedure multiple times would be a good reason, if not for the synthesis bug I mentioned.
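
A minimal sketch of the three-procedure template described above, assuming a hypothetical 8-bit counter (the entity name counter_tpl and the ports en and count are illustrative, not taken from the design under discussion). update_ports reads only a process variable and assigns the port unconditionally, so the port follows the register with no latch and no in-to-out path:

<code>
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity counter_tpl is
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;
    en    : in  std_logic;
    count : out std_logic_vector(7 downto 0));
end entity counter_tpl;

architecture rtl of counter_tpl is
begin
  main : process (clk, rst) is
    variable count_v : unsigned(7 downto 0);

    -- Region 1: asynchronous reset values for every register variable.
    procedure init_regs is
    begin
      count_v := (others => '0');
    end procedure init_regs;

    -- Region 2: clocked updates; inputs may be read here.
    procedure update_regs is
    begin
      if en = '1' then
        count_v := count_v + 1;
      end if;
    end procedure update_regs;

    -- Region 3: wire registered variables to ports, nothing else.
    procedure update_ports is
    begin
      count <= std_logic_vector(count_v);
    end procedure update_ports;
  begin
    if rst = '1' then
      init_regs;
    elsif rising_edge(clk) then
      update_regs;
    end if;
    update_ports;  -- runs on every process trigger, outside the clocked branch
  end process main;
end architecture rtl;
</code>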

Andy
 
alb

Hi Andy,

On 25/09/2013 16:26, Andy wrote:
[]
I checked your update_ports procedure. In a clocked process, you
cannot have a combinatorial in-to-out path. I'm surprised it is not
flagged as a warning from the synthesis tool. I really don't know
what circuitry you might get from using input ports in the
update_ports procedure, but it is possible that, on a clock-cycle
basis, the circuitry might still match simulation.

you are definitely right, that went completely unnoticed. Funnily enough
the RTL looks like a multiplexer, as wanted.

All outputs driven by update_ports must be functions only of process
variables or constants/generics, NOT input ports or signals. The
references to variables in update_ports are to the registered values
thereof (remember, update_ports also runs on the falling edge of the
clock), and any logic (decision making, expression, etc.) in the
procedure is combinatorial after the registers.

Since the logic in update_ports is combinatorial, shouldn't I run into a
simulation mismatch between pre- and post-synthesis because of the
sensitivity list for the combinational process?

As far as I know synthesis tools tend to ignore sensitivity lists,
resulting in a different behavior for the combinational logic...

Just like in a combinatorial process, you also have to make sure that
any ports that may be assigned in update_ports are always assigned,
or else a latch will be inferred to remember the last value assigned
to the port.

I could potentially use a registered port with the following template
instead:
<code>
procedure template_a_rst is
begin  -- Has proven equivalent to v_rst for synthesis.
  if RST = '1' then
    init_regs;
    update_ports;
  elsif rising_edge(CLK) then
    update_regs;
    update_ports;
  end if;
end procedure template_a_rst;
</code>

and I just happened to notice a comment referring to the equivalence with
the template I previously posted.

If update_ports were called inside the clocked if-then-else, the
cycle-based timing of the outputs would be identical, but the input
variables would be combinatorial, along with any logic in the
procedure, and the ports would be registered. In this case, you would
also need to reset the output port registers.

See above.
The synchronous process template with three procedures (init_regs,
update_regs, update_ports) works well within limits, but those limits
must be observed, and are not always well understood. I never really
bought into the idea that anything was improved by using procedures
here; I tend to see the three regions of the clocked process as
having those three distinct functions anyway, procedure or not. Reuse
of a procedure multiple times would be a good reason, if not for the
synthesis bug I mentioned.

I agree on the limits and it is true that when sailing in unknown waters
it is certainly not clear what will happen and instead of debugging code
we might end up tricking the tools.

The main issue I have with this approach is that variables are 'locally
global', in the sense that they are local to the process, but global to
every procedure in the code, potentially losing the encapsulation
benefit provided by subprograms in the first place. I'd like to verify
if there's a better way to isolate code which operates on globally
defined data structures.

The main merit I see in this approach is to clearly separate the three
functions you referred to. I also find that readability is increased
with this approach, even though I should really move the reused
states into a separate FSM.
 
Mike Treseler

Hi Andy,
On 25/09/2013 16:26, Andy wrote:

Thanks Andy, Hi Alb,

In the update_ports block I use only,
my_port <= my_variable_v

Keep other expressions in the update_regs block
or in a function declared in architecture scope.

-- Mike Treseler
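
A small sketch of that advice, under assumed names (the entity parity_reg, the function xor_reduce and the ports data and parity are hypothetical). The expression lives in a function declared in architecture scope and is evaluated in the clocked region; the port region contains a single variable-to-port assignment:

<code>
library ieee;
use ieee.std_logic_1164.all;

entity parity_reg is
  port (
    clk    : in  std_logic;
    rst    : in  std_logic;
    data   : in  std_logic_vector(7 downto 0);
    parity : out std_logic);
end entity parity_reg;

architecture rtl of parity_reg is
  -- Pure function in architecture scope: holds the expression logic.
  function xor_reduce (v : std_logic_vector) return std_logic is
    variable r : std_logic := '0';
  begin
    for i in v'range loop
      r := r xor v(i);
    end loop;
    return r;
  end function xor_reduce;
begin
  main : process (clk, rst) is
    variable parity_v : std_logic;
  begin
    if rst = '1' then
      parity_v := '0';                 -- init_regs region
    elsif rising_edge(clk) then
      parity_v := xor_reduce(data);    -- update_regs region
    end if;
    parity <= parity_v;                -- update_ports region: wiring only
  end process main;
end architecture rtl;
</code>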
 
rickman

Hi Rick,

On 25/09/2013 06:52, rickman wrote:
[]
Ok, at this point I am showing my ignorance of using procedures in VHDL.
I never realized that scope would work this way. In fact, this sounds
very unlike VHDL. I'm still a bit confused about a couple of things.

There's only one process in the entire entity. Variables have local
scope in the process, but since there's nothing except one process they
can be considered 'global' in the entity. Moreover every procedure
defined within the process has access to all local variables as well.
The procedures init_regs and update_ports may be in a clocked process,
but they are *not* in the clocked region of the clocked process and so
can generate combinatorial logic.

the init_regs procedure is called in the reset branch of the clocked
process, therefore will run to 'reset' all initial states of any locally
defined variable.

In update_ports all appropriate variables are 'wired' to ports (no
signals). Meaning that every time the process is triggered they will be
assigned to a port. Since update_ports is only called when the process
is triggered I'm not sure if you can have an output port which is mapped
to a pure combinatorial logic of a set of inputs.
Again, I have not looked at the code
in detail as that would require me to copy to a text file and open it in
a decent editor rather than this limited newsreader. Are you sure there
are no issues in one of those two procedures?

Yeah I apologize for the code format, I try to keep everything under 80
characters for this very purpose but I'm not always so strict :-/

I'm trying to remove most of the stuff and synthesize piece by piece.
Honestly I do not see how the init_regs and update_ports procedures can
be broken.

I don't really see anything either. Looking at the RTL diagram may show
you something useful. I often look at the schematics produced by
synthesis when I am concerned about the hardware used. I don't recall
needing to look at schematics to find bugs. It may help though if your
circuit is small enough.

AFAIK VHDL passes arguments by value, making a local copy of the
parameter which has local scope to the subprogram. In this case I do not
know how I can have my 'state' variable retain its value through
clock cycles.

I'm not sure what that means. Procedures have inputs and outputs. Pass
the inputs in and the outputs out. Where's the problem?

To be honest, I have looked at using procedures before and didn't find
any utility to taking a section of code delimited by the control
structure of the registered process and partitioning it into procedures
as blocks. If the procedures and/or functions are describing some
logical entity, then it makes sense to me. But partitions for the sake
of partitions don't help a lot when you don't have an easy way to test
them as functional units.

BTW, why are you using variables rather than signals? Variables are
certainly useful at times. But I get the impression you are using them
solely because that is what is used when writing software.
 
alb

Hi Rick,

On 25/09/2013 23:46, rickman wrote:
[]
I don't really see anything either. Looking at the RTL diagram may show
you something useful. I often look at the schematics produced by
synthesis when I am concerned about the hardware used. I don't recall
needing to look at schematics to find bugs. It may help though if your
circuit is small enough.

as Andy said, the update_ports was broken. That was a misuse of the
template, forgetting about the whole idea about the single process.

That said, it seemed the synthesis tool ignored the sensitivity list and
did not infer any latch; on the contrary, it instantiated a multiplexer
just as I intended. But I agree with Andy and Mike that this is not the
proper way to do it.

[]
I'm not sure what that means. Procedures have inputs and outputs. Pass
the inputs in and the outputs out. Where's the problem?

Indeed I can use an 'inout' parameter for variables which need to retain
their value (they will be locally modified and globally propagated), while
I can use an 'in' parameter for variables that the procedure needs to be
sensitive to.

Using the same name for the global variable and the formal parameter
might be a trick to 'protect' the global variable from being
inadvertently overwritten by a procedure which is supposed to have
read-only access (with 'in' mode).

To be honest, I have looked at using procedures before and didn't find
any utility to taking a section of code delimited by the control
structure of the registered process and partitioning it into procedures
as blocks. If the procedures and/or functions are describing some
logical entity, then it makes sense to me. But partitions for the sake
of partitions don't help a lot when you don't have an easy way to test
them as functional units.

I see two main advantages in using subprograms:

1. reuse
2. readability

There's another subtle advantage in the template I posted: the three
functional steps grouped in init_regs, update_regs, update_ports make
clear what often is not so evident in logic designs.

I do not want to say that you cannot make readable code with several
processes and signals to make them communicate, but I too often have
seen single-flop processes here and there with little attention to the
overall function.

I personally like this approach, regardless of the problems that I'm
currently facing (most probably related to my inexperience with the
template itself). The template does not prevent having a separate part
of the entity where input-output combinational logic is inserted.

BTW, why are you using variables rather than signals? Variables are
certainly useful at times. But I get the impression you are using them
solely because that is what is used when writing software.

I used to use only signals because I did not quite understand the rules
for variables. This is currently changing. While I consider abstraction
a great benefit (just consider how easy it is to describe a counter just
by saying 'cnt = cnt + 1', instead of describing connections between
flops), the advantage of using variables is not only related to
simulation performance (the memory footprint of a signal is bigger), but
also that it follows, IMO, the way of sequential reasoning more naturally.
 
rickman

Hi Rick,



That said, it seemed the synthesis tool ignored the sensitivity list and
did not infer any latch; on the contrary, it instantiated a multiplexer
just as I intended. But I agree with Andy and Mike that this is not the
proper way to do it.

Synthesis tools *often* ignore sensitivity lists.

Indeed I can use an 'inout' parameter for variables which need to retain
their value (they will be locally modified and globally propagated), while
I can use an 'in' parameter for variables that the procedure needs to be
sensitive to.

Sounds like a plan.

Using the same name for the global variable and the formal parameter
might be a trick to 'protect' the global variable from being
inadvertently overwritten by a procedure which is supposed to have read
only access (with 'in' mode).

??? If a procedure has an input with the same name as a global, it
won't be able to access the global. But then as I said, I'm not fluent
with procedures. I don't use globals for a variety of reasons including
reuse and readability.

I see two main advantages in using subprograms:

1. reuse
2. readability

Yeah... maybe you can explain those. I think the procedure based
approach is *less* readable, but that is subjective so I doubt we will
agree. But as to reuse??? How does taking the code in a clocked process
and breaking it into sections with wrappers around them make the code
easier to reuse?

Reuse comes from planning a design for reuse and making modules which
are reusable. "Modules" does not have to be procedures. I typically
use entities.

There's another subtle advantage in the template I posted: the three
functional steps grouped in init_regs, update_regs, update_ports make
clear what often is not so evident in logic designs.

Again, subjective. I have no trouble reading a standard clocked process
without breaking it into sections. Heck, update_ports isn't even needed
if you aren't using variables.

I do not want to say that you cannot make readable code with several
processes and signals to make them communicate, but I too often have
seen single-flop processes here and there with little attention to the
overall function.

Yes, but this is a separate issue. Any method can be abused.

I personally like this approach, regardless of the problems that I'm
currently facing (most probably related to my inexperience with the
template itself). The template does not prevent having a separate part
of the entity where input-output combinational logic is inserted.

What is your background? Have you always been a hardware designer?

I used to use only signals because I did not quite understand the rules
for variables. This is currently changing. While I consider abstraction
a great benefit (just consider how easy it is to describe a counter just
by saying 'cnt = cnt + 1', instead of describing connections between
flops), the advantage of using variables is not only related to
simulation performance (the memory footprint of a signal is bigger), but
also that it follows, IMO, the way of sequential reasoning more naturally.

I don't follow. I can use an integer, a signed or an unsigned type and
use the exact same code with a signal. The only difference is I don't
have to assign my result to a signal to use it outside the process
because it is already a signal.
 
Andy

Al,

The update_ports procedure (or the region of the clocked process in which it runs) is not its own process. Any code there runs anytime the process is triggered (changes in reset or clk, including both edges of each). There is no separate sensitivity list.

Since update_ports should only be reading variables that are only updated when init_regs or update_regs run, the output value of the ports never actually changes on the falling edge, but they are reassigned (there is a transaction but no event). If you looked at a driven port's 'transaction attribute, that falling edge transaction would be apparent. And since there is no re-execution of the process to generate new output values, those output values change in the same simulation delta cycle as any signal updated on the rising edge of the clock (e.g. a register).

I'm not sure what kind of circuitry your modified template would produce (I assume you meant "process" template_a_rst instead of "procedure" template_a_rst?). Try it and see. But beware: unlike the standard template, which is compliant with 1076.6-2004, the VHDL RTL Synthesis Standard, this behavior does not appear to be supported. You could create an init_ports procedure that assigned constant values to the ports during reset, and that should create registered ports. Maybe synthesis could propagate the constants assigned to the registers on to the outputs.

VHDL inherited its lack of static local variables from Ada. They only appear to be static in a process because the process only executes once, while suspending and awakening many times.

Passing lots of variables into and out of a function or procedure can be made much simpler by using a record.
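
A short sketch of the record idea, with hypothetical names throughout (rec_demo, regs_t, step, REGS_INIT). All register variables travel through one inout parameter, so the procedure signature does not grow as registers are added:

<code>
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity rec_demo is
  port (
    clk  : in  std_logic;
    rst  : in  std_logic;
    tick : out std_logic);
end entity rec_demo;

architecture rtl of rec_demo is
  -- All registers of the process bundled in one record.
  type regs_t is record
    cnt   : unsigned(3 downto 0);
    pulse : std_logic;
  end record regs_t;

  constant REGS_INIT : regs_t := (cnt => (others => '0'), pulse => '0');

  -- Decade counter step: pulse is high for one clock every ten clocks.
  procedure step (variable r : inout regs_t) is
  begin
    if r.cnt = 9 then
      r.cnt   := (others => '0');
      r.pulse := '1';
    else
      r.cnt   := r.cnt + 1;
      r.pulse := '0';
    end if;
  end procedure step;
begin
  main : process (clk, rst) is
    variable r_v : regs_t;
  begin
    if rst = '1' then
      r_v := REGS_INIT;
    elsif rising_edge(clk) then
      step(r_v);
    end if;
    tick <= r_v.pulse;
  end process main;
end architecture rtl;
</code>

The whole record has a single mode, so this does not give per-field read-only protection.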

If synthesis supported protected types and their method calls, they would allow complete encapsulation like entities, but with procedural instead of signal interfaces. Something to think about...

Andy
 
Andy

I'm not sure what that means. Procedures have inputs and outputs. Pass the inputs in and the outputs out. Where's the problem?

Inout ports can be used for this, but then the procedure does not remember the last state, the process does.

For in ports to procedures that do not span time, it is best to use a constant port kind, instead of variable or signal kind. A constant port can be associated with a signal, a variable, or an expression of either or both. But a variable port can only be associated with a variable, and a signal port can only be associated with a signal.

Inout and Out ports must be either signal or variable, and cannot be associated with the other.
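
To illustrate the parameter classes just described, here is a sketch with made-up names (class_demo, accumulate, din, dout). The constant-class input accepts a signal, a variable or an expression as its actual, while the variable-class inout accepts only a variable:

<code>
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity class_demo is
  port (
    clk  : in  std_logic;
    rst  : in  std_logic;
    din  : in  unsigned(7 downto 0);
    dout : out unsigned(7 downto 0));
end entity class_demo;

architecture rtl of class_demo is
  -- 'a' is constant class: its actual may be a signal, a variable or an
  -- expression. 'acc' is variable class: its actual must be a variable.
  procedure accumulate (
    constant a   : in    unsigned(7 downto 0);
    variable acc : inout unsigned(7 downto 0)) is
  begin
    acc := acc + a;
  end procedure accumulate;
begin
  main : process (clk, rst) is
    variable acc_v  : unsigned(7 downto 0);
    variable step_v : unsigned(7 downto 0);
  begin
    if rst = '1' then
      acc_v  := (others => '0');
      step_v := to_unsigned(1, 8);
    elsif rising_edge(clk) then
      accumulate(din, acc_v);           -- actual is a signal (input port)
      accumulate(step_v, acc_v);        -- actual is a variable
      accumulate(din + step_v, acc_v);  -- actual is an expression
    end if;
    dout <= acc_v;
  end process main;
end architecture rtl;
</code>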

Also, VHDL cannot recognize the signature difference between two procedures whose only difference is variable vs signal port declarations. So writing two different, otherwise identical procedures for a variable port and a signal port is useless: the compiler cannot choose between them.

Pass by value vs reference is kinda moot except for inout (you cannot write an input anyway), and the standard mentions somewhere that it is an error to depend on the simulator implementation one way or the other.

Andy
 
Andy

I used to use only signals because I did not quite understand the rules
for variables. This is currently changing. While I consider abstraction
a great benefit (just consider how easy it is to describe a counter just
by saying 'cnt = cnt + 1', instead of describing connections between
flops), the advantage of using variables is not only related to
simulation performance (the memory footprint of a signal is bigger), but
also that it follows, IMO, the way of sequential reasoning more
naturally.

I prefer variables to signals for several reasons:

1) Variables are declared locally in the process or subprogram (encapsulation). You can surround a process with a block statement in which you can declare signals that are used only by that process, but it takes more code.

2) Use of variables can infer combinatorial or registered logic in a clocked process.

3) Variables allow you to create combinatorial or registered outputs from the same clocked process.

4) The update semantics of variable assignments are more intuitive from a SW/code point of view. Signal assignments in a process are only pseudo-sequential; in some aspects, order of execution matters, in others it does not. Variables are purely sequential.

5) Not only is the variable's memory footprint less than a signal's, the overhead of the separate update at suspension consumes less compute time.
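
Points 2 and 4 can be seen in a few lines. In this sketch (entity and port names are invented for illustration) the same variable is read once before and once after its assignment inside the clocked branch. The read before the write sees the value stored at the previous clock edge, so a register is implied for a_v; the read after the write sees the new value directly:

<code>
library ieee;
use ieee.std_logic_1164.all;

entity var_demo is
  port (
    clk : in  std_logic;
    d   : in  std_logic;
    q1  : out std_logic;   -- d delayed by one clock
    q2  : out std_logic);  -- d delayed by two clocks
end entity var_demo;

architecture rtl of var_demo is
begin
  main : process (clk) is
    variable a_v : std_logic;
  begin
    if rising_edge(clk) then
      q2  <= a_v;  -- read before write: value kept from the previous edge
      a_v := d;    -- variable updates immediately
      q1  <= a_v;  -- read after write: behaves like q1 <= d
    end if;
  end process main;
end architecture rtl;
</code>

Reordering the three statements changes the hardware described, which is the order dependence Andy refers to. A synthesis tool may well merge the register for a_v with the one for q1, since they hold the same value.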

Andy
 
alb

Hi Andy,

On 24/09/2013 19:21, Andy wrote:
[]
After reviewing your FSM though, I think you might do better with
separate, smaller state machine(s) for the reusable states, and
implement a hierarchical state machine. There is no point in
combining all the reusable and unique states into the same FSM.

I came up with this new attempt to implement hierarchical FSMs:

<code>
procedure update_regs is
begin  -- purpose: call the procedures above in the desired order
  writ_read (state_v, mstate_v, nstate_v);  -- low level write/read
  pres_puls (state_v, mstate_v, nstate_v);  -- presence pulse
  auto_read (state_v, mstate_v, nstate_v);  -- top level fsm
end procedure update_regs;
</code>

the three FSMs run in 'parallel' and they pass each other the three main
state variables. state_v, mstate_v and nstate_v [1] are globally
accessible but the key point is in the formal parameter definitions of
the three procedures:

<code>
procedure writ_read (
  state_v  : inout state_t;
  mstate_v : in    mstate_t;
  nstate_v : in    nstate_t) is
-- ...
procedure pres_puls (
  state_v  : in    state_t;
  mstate_v : inout mstate_t;
  nstate_v : in    nstate_t) is
-- ...
procedure auto_read (
  state_v  : in    state_t;
  mstate_v : in    mstate_t;
  nstate_v : inout nstate_t) is
-- ...
</code>

While all states are available to all FSMs, only one of them is
changeable by each of them. Moreover, giving the parameters the
same names as the global variables prevents accidental write access to
global variables from inside the procedure, providing some degree of
'protection'.
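
A self-contained sketch of this naming trick, reduced to two toy state machines (the entity shadow_demo, the procedures low_fsm and top_fsm, and the state names are all hypothetical, not the 1-wire design from the thread). Inside each procedure the formal parameter hides the process variable of the same name, so a write to a state passed with mode 'in' is rejected at compile time:

<code>
library ieee;
use ieee.std_logic_1164.all;

entity shadow_demo is
  port (
    clk  : in  std_logic;
    rst  : in  std_logic;
    go   : in  std_logic;
    busy : out std_logic);
end entity shadow_demo;

architecture rtl of shadow_demo is
  type state_t  is (IDLE, RUN);
  type mstate_t is (OFF, ARMED);
begin
  main : process (clk, rst) is
    variable state_v  : state_t;
    variable mstate_v : mstate_t;

    -- Formals reuse the process variable names on purpose: within the
    -- body, each name refers to the formal, not to the process variable.
    procedure low_fsm (
      state_v  : inout state_t;
      mstate_v : in    mstate_t) is
    begin
      case state_v is
        when IDLE => if mstate_v = ARMED then state_v := RUN; end if;
        when RUN  => state_v := IDLE;
      end case;
      -- mstate_v := OFF;  -- illegal here: mstate_v is an 'in' formal
    end procedure low_fsm;

    procedure top_fsm (
      state_v  : in    state_t;
      mstate_v : inout mstate_t) is
    begin
      -- state_v already holds the value low_fsm just assigned.
      if state_v = RUN then
        mstate_v := OFF;
      elsif go = '1' then
        mstate_v := ARMED;
      end if;
    end procedure top_fsm;
  begin
    if rst = '1' then
      state_v  := IDLE;
      mstate_v := OFF;
    elsif rising_edge(clk) then
      low_fsm(state_v, mstate_v);
      top_fsm(state_v, mstate_v);
    end if;
    if state_v = RUN then busy <= '1'; else busy <= '0'; end if;
  end process main;
end architecture rtl;
</code>
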
Does Synplify actually recognize your FSM as an FSM (what does it
look like in the RTL viewer or FSM explorer)? Sometimes if you assign
the next state from the contents of a register other than the current
state, Synplify will not treat it as a state machine, which then
excludes optimizations normally performed on FSMs.

With the current implementation Synplify Pro recognizes all the FSMs and
operates accordingly, resulting in an RTL which makes much more sense.

That said, my stupid logic does not seem to work, for other reasons that I
now start to believe reside in my simulation setup not being 100%
representative of the HW setup (ouch!).

[1] I should really ask myself why I've chosen this set of names for
the state variables... the fact they are not equal length is a worrying
sign of my 'post 25 y.o. decline'
 
alb

Hi Andy,

The update_ports procedure (or the region of the clocked process in
which it runs) is not its own process. Any code there runs anytime
the process is triggered (changes in reset or clk, including both
edges of each). There is no separate sensitivity list.

this is clear, and it is the main reason why update_ports cannot
implement a purely combinatorial function of inputs to produce outputs.

That said, don't synthesis tools ignore - for some strange reason -
the sensitivity list?

Since update_ports should only be reading variables that are only
updated when init_regs or update_regs run, the output value of the
ports never actually changes on the falling edge, but they are
reassigned (there is a transaction but no event). If you looked at a
driven port's 'transaction attribute, that falling edge transaction
would be apparent. And since there is no re-execution of the process
to generate new output values, those output values change in the same
simulation delta cycle as any signal updated on the rising of the
clock (e.g. a register).

Now I see why the template_rst and template_v_rst are functionally
equivalent in Mike's example:

http://hdfs.googlecode.com/svn/trunk/hdfs/lib/uart/uart.vhd
I'm not sure what kind of circuitry your modified template would produce
(I assume you meant "process" template_a_rst instead of "procedure"
template_a_rst?).

You are right, I should have said 'process' and should have modified the
procedure to be a process instead, but I was lazy and copied it
from the example above.

Try it and see. But beware unlike the standard
template which is compliant with 1076.6-2004, the VHDL RTL Synthesis
Standard, this behavior does not appear to be supported. You could
create an init_ports procedure that assigned constant values to the
ports during reset, and that should create registered ports. Maybe
synthesis could propagate the constants assigned to the registers on
to the outputs.

It is probably not worth the effort, and certainly if I start to depend
on a tool's implementation then I'm on my own.

VHDL inherited its lack of static local variables from Ada. They only
appear to be static in a process because the process only executes
once, while suspending and awakening many times.

as I showed in an earlier post, you
may come to some higher level of encapsulation, allowing a procedure to
modify only a specific global variable, while still providing read
access to the others.

Passing lots of variables into and out of a function or procedure can
be made much simpler by using a record.

but a record type does not allow different modes for its elements,
which limits the benefit of passing them through formal parameters.

If synthesis supported protected types and their method calls, they
would allow complete encapsulation like entities, but with procedural
instead of signal interfaces. Something to think about...

Learning to use protected types is the next point on my to-do list ;-)
 
rickman

Hi Andy,



this is clear, and it is the main reason why update_ports cannot
implement a purely combinatorial function of inputs to produce outputs.

That said, don't synthesis tools ignore - for some strange reason -
the sensitivity list?

I'm not sure I follow. What is being ignored about the sensitivity
list? But I will say I have read about many times when there are errors
in the sensitivity list and the simulation fails because of it, but the
synthesis works just the same. But I expect this is very different from
what you are talking about.
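
A minimal illustration of the kind of sensitivity-list error mentioned here, using invented names (sens_demo, y_bad, y_ok). Both processes describe the same AND gate, but the first one omits b from its sensitivity list:

<code>
library ieee;
use ieee.std_logic_1164.all;

entity sens_demo is
  port (
    a, b  : in  std_logic;
    y_bad : out std_logic;
    y_ok  : out std_logic);
end entity sens_demo;

architecture rtl of sens_demo is
begin
  -- b is missing from the sensitivity list: in simulation y_bad is only
  -- re-evaluated when a changes, so a change on b alone goes unnoticed.
  bad : process (a) is
  begin
    y_bad <= a and b;
  end process bad;

  -- Complete sensitivity list: simulation follows both inputs.
  ok : process (a, b) is
  begin
    y_ok <= a and b;
  end process ok;
end architecture rtl;
</code>

Synthesis tools typically build a plain AND gate for both processes, usually with a warning for the first, so only the simulation of y_bad differs from the hardware.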
 
rickman

Hi Andy,

On 24/09/2013 19:21, Andy wrote:
[]
After reviewing your FSM though, I think you might do better with
separate, smaller state machine(s) for the reusable states, and
implement a hierarchical state machine. There is no point in
combining all the reusable and unique states into the same FSM.

I came up with this new attempt to implement hierarchical FSMs:

<code>
procedure update_regs is
begin  -- purpose: call the procedures above in the desired order
  writ_read (state_v, mstate_v, nstate_v);  -- low level write/read
  pres_puls (state_v, mstate_v, nstate_v);  -- presence pulse
  auto_read (state_v, mstate_v, nstate_v);  -- top level fsm
end procedure update_regs;
</code>

the three FSMs run in 'parallel' and they pass each other the three main
state variables. state_v, mstate_v and nstate_v [1] are globally
accessible but the key point is in the formal parameter definitions of
the three procedures:

<code>
procedure writ_read (
  state_v  : inout state_t;
  mstate_v : in    mstate_t;
  nstate_v : in    nstate_t) is
-- ...
procedure pres_puls (
  state_v  : in    state_t;
  mstate_v : inout mstate_t;
  nstate_v : in    nstate_t) is
-- ...
procedure auto_read (
  state_v  : in    state_t;
  mstate_v : in    mstate_t;
  nstate_v : inout nstate_t) is
-- ...
</code>

While all states are available to all FSMs, only one of them is
changeable by each of them. Moreover, giving the parameters the
same names as the global variables prevents accidental write access to
global variables from inside the procedure, providing some degree of
'protection'.

Maybe you are accustomed to this style, but I would find it rather
complex and difficult to code for. Even though these variables will
result in registers when used in a clocked process, as variables their
values are updated immediately rather than waiting for a delta step like
signals do. So the first procedure will run and update its state
variable. Then the second procedure will run having to be aware that
the first state variable has *already changed*... etc. for the third
state variable.

I'm not sure what FSM would have three independent state variables like
this. I am assuming the other variables would have to do with outputs
or something else, dunno. I just know I would never use a style like
this for FSM work.

Typically any FSM has two signals or variables, present_state and
next_state with obvious uses. Next_state is a function of present_state
and inputs. Present_state is updated from next_state on the clock edge.
So present_state is the registered value and next_state is in essence
the value of the input to the state register.

In most hardware implementations any other FSM will depend on the
registered version of the other FSM state variables. Likewise the
output of a FSM will typically only depend on the registered value. But
when using a strict Mealy machine it can be useful to access the input
to the state register.
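
For comparison, the conventional style described in the last two paragraphs might look like the following sketch (the entity fsm_demo, its ports and the two states are illustrative assumptions). present_state is the register, next_state is the input to that register, and the output depends only on the registered state:

<code>
library ieee;
use ieee.std_logic_1164.all;

entity fsm_demo is
  port (
    clk    : in  std_logic;
    rst    : in  std_logic;
    start  : in  std_logic;
    done   : in  std_logic;
    active : out std_logic);
end entity fsm_demo;

architecture rtl of fsm_demo is
  type state_t is (IDLE, BUSY);
  signal present_state, next_state : state_t;
begin
  -- State register: present_state is updated from next_state on the clock edge.
  state_reg : process (clk, rst) is
  begin
    if rst = '1' then
      present_state <= IDLE;
    elsif rising_edge(clk) then
      present_state <= next_state;
    end if;
  end process state_reg;

  -- Next-state logic: a function of present_state and the inputs.
  next_logic : process (present_state, start, done) is
  begin
    next_state <= present_state;  -- default assignment avoids a latch
    case present_state is
      when IDLE => if start = '1' then next_state <= BUSY; end if;
      when BUSY => if done = '1' then next_state <= IDLE; end if;
    end case;
  end process next_logic;

  -- Moore-style output: depends only on the registered state.
  active <= '1' when present_state = BUSY else '0';
end architecture rtl;
</code>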

Does Synplify actually recognize your FSM as an FSM (what does it
look like in the RTL viewer or FSM explorer)? Sometimes if you assign
the next state from the contents of a register other than the current
state, Synplify will not treat it as a state machine, which then
excludes optimizations normally performed on FSMs.

With the current implementation Synplify Pro recognizes all the FSMs and
operates accordingly, resulting in an RTL which makes much more sense.

That said, my stupid logic does not seem to work, for other reasons that I
now start to believe reside in my simulation setup not being 100%
representative of the HW setup (ouch!).

[1] I should really ask myself why I've chosen this set of names for
the state variables... the fact they are not equal length is a worrying
sign of my 'post 25 y.o. decline'

Hmmm... I've not suffered a post 25 decline. I only sharpened through
my 30s and 40s, but I am seeing a post 50 decline. I slow down and
think more about what I do. lol
 
rickman

Inout ports can be used for this, but then the procedure does not remember the last state, the process does.

For in ports to procedures that do not span time, it is best to use a constant port kind, instead of variable or signal kind. A constant port can be associated with a signal, a variable, or an expression of either or both. But a variable port can only be associated with a variable, and a signal port can only be associated with a signal.

Inout and Out ports must be either signal or variable, and cannot be associated with the other.

Also, VHDL cannot recognize the signature difference between two procedures whose only difference is variable vs signal port declarations. So writing two different, otherwise identical procedures for a variable port and a signal port is useless: the compiler cannot choose between them.

Pass by value vs reference is kinda moot except for inout (you cannot write an input anyway), and the standard mentions somewhere that it is an error to depend on the simulator implementation one way or the other.

I still don't see the problem with using the parameter lists with
procedures. Maybe you need to spell it out more clearly for me?
 
alb

Hi Rick,

On 02/10/2013 02:35, rickman wrote:
[]
I came up with this new attempt to implement hierarchical FSMs:

<code>
procedure update_regs is
begin  -- purpose: call the procedures above in the desired order
  writ_read (state_v, mstate_v, nstate_v);  -- low level write/read
  pres_puls (state_v, mstate_v, nstate_v);  -- presence pulse
  auto_read (state_v, mstate_v, nstate_v);  -- top level fsm
end procedure update_regs;
</code>

the three FSMs run in 'parallel' and they pass each other the three main
state variables. state_v, mstate_v and nstate_v [1] are globally
accessible but the key point is in the formal parameter definitions of
the three procedures:

<code>
procedure writ_read (
state_v : inout state_t;
mstate_v : in mstate_t;
nstate_v : in nstate_t) is
-- ...
procedure pres_puls (
state_v : in state_t;
mstate_v : inout mstate_t;
nstate_v : in nstate_t) is
-- ...
procedure auto_read (
state_v : in state_t;
mstate_v : in mstate_t;
nstate_v : inout nstate_t) is
-- ...
</code>

While all states are available to all FSMs only one of them is
changeable by each of them. Moreover, calling the parameters with the
same name as the global variables prevents accidental write access to
global variables from inside the procedure, providing some degree of
'protection'.

Maybe you are accustomed to this style, but I would find it rather
complex and difficult to code for. Even though these variables will
result in registers when used in a clocked process, as variables their
values are updated immediately rather than waiting for a delta step like
signals do. So the first procedure will run and update its state
variable. Then the second procedure will run having to be aware that
the first state variable has *already changed*... etc. for the third
state variable.

I do not see what's wrong with the second procedure needing to be aware
of the changed variable. When the second procedure runs all previous
variables have been updated accordingly and depending on their values
the procedure runs accordingly.

To infer a register simply use the variable before it is assigned.
I'm not sure what FSM would have three independent state variables like
this. I am assuming the other variables would have to do with outputs
or something else, dunno. I just know I would never use a style like
this for FSM work.

this is what the RTL view is showing and what the FSM explorer has found.
Typically any FSM has two signals or variables, present_state and
next_state with obvious uses. Next_state is a function of present_state
and inputs. Present_state is updated from next_state on the clock edge.
So present_state is the registered value and next_state is in essence
the value of the input to the state register.
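
As a rough illustration of that conventional structure, here is a two-process sketch. The entity, the state names and the start/done/busy ports are invented for the example and are not taken from alb's design:

<code>
library ieee;
use ieee.std_logic_1164.all;

entity two_proc_fsm is
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;
    start : in  std_logic;
    done  : in  std_logic;
    busy  : out std_logic);
end entity two_proc_fsm;

architecture rtl of two_proc_fsm is
  type state_t is (IDLE, RUN, FINISH);
  signal present_state, next_state : state_t;
begin
  -- State register: present_state is loaded from next_state on the clock edge.
  state_reg : process (clk) is
  begin
    if rising_edge(clk) then
      if rst = '1' then
        present_state <= IDLE;
      else
        present_state <= next_state;
      end if;
    end if;
  end process state_reg;

  -- Next-state logic: a function of present_state and the inputs only.
  next_logic : process (present_state, start, done) is
  begin
    next_state <= present_state;  -- default assignment, so no latch
    case present_state is
      when IDLE =>
        if start = '1' then
          next_state <= RUN;
        end if;
      when RUN =>
        if done = '1' then
          next_state <= FINISH;
        end if;
      when FINISH =>
        next_state <= IDLE;
    end case;
  end process next_logic;

  -- Output decoded from the registered state only.
  busy <= '1' when present_state = RUN else '0';
end architecture rtl;
</code>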

exactly the same here. In each individual FSM the inout mode in the
parameter list is for the state parameter that the FSM will run with.
Like in one process FSM, you do not need to formally separate the
'present state' and the 'next state'.
In most hardware implementations any other FSM will depend on the
registered version of the other FSM state variables. Likewise the
output of a FSM will typically only depend on the registered value. But
when using a strict Mealy machine it can be useful to access the input
to the state register.

I've thrown Mealy and Moore definitions away long ago. I think about the
function, not a particular implementation.
I prefer to have registered outputs and in my case those outputs are
described directly in the appropriate states of the FSM, whether this is
Mealy or Moore I could care less with all due respect.

[]
[1] I should really ask myself why I've chosen this set of names for
the state variables... the fact they are not equal length is a worrying
sign of my 'post 25 y.o. decline'

Hmmm... I've not suffered a post 25 decline. I only sharpened through
my 30s and 40s, but I am seeing a post 50 decline. I slow down and
think more about what I do. lol

At 25 y.o. I spent 3 months on the red roads of the Australian inland, a
year later only 21 days in Chile and two years later only 15 days in New
Zealand. Nowadays I can have a sore back after a night in a tent... why
should I expect my brain to have followed a different course? ;-)
 

Andy

The local/global thing is not so important in HDL the way most
people use it. Very few designers put their entire design in
one process with a large number of procedures. Rather they
modularize by using multiple processes to describe the hardware
they are designing a section at a time. When using signals they
can only be driven validly by one process, so you get warnings,
no, actually errors, when a signal is driven by multiple
processes.

Encapsulation is often less about which code can modify/drive a variable/signal than it is about which code can read it, depend on it, and quit working if it is modified.

For example, say I have a couple of processes in an architecture. One uses a counter to step through some data, and it is not important which order it is processed, so the author decides to use an up-counter.

Another process that can see that counter, uses that same counter's value to control something else within it, but it depends on the implementation decision in the first process to use an up-counter.

What happens if the first process is modified to optimize the counter by converting it to a down-counter? If the counter had been a local variable, then there would be nothing outside that process that could be directly dependent upon its behavior, and changing the direction of the counter would have no impacts elsewhere. But if it is a signal, then the entire architecture has to be understood to make sure that any changes to the counter's behavior do not have an unforeseen impact.

Sometimes shared counters are a good thing; great, make them signals so that it is known that it is intended to be shared. Otherwise, keep it local so that it cannot be shared. Better yet, if the counter is shared among only two of the processes, put those two processes in a block, and declare the counter signal locally within the block. This protects the counter from dependencies in the other processes in the architecture.
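
A small sketch of the block idea, with invented names (shared_cnt, count, wrap, other): the counter signal is declared inside the block, so only the two processes in that block can read it, and the third process cannot come to depend on it.

<code>
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity block_scope_demo is
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;
    wrap  : out std_logic;
    other : out std_logic);
end entity block_scope_demo;

architecture rtl of block_scope_demo is
begin
  shared_cnt : block is
    signal count : unsigned(3 downto 0);  -- visible only inside this block
  begin
    counter : process (clk) is
    begin
      if rising_edge(clk) then
        if rst = '1' then
          count <= (others => '0');
        else
          count <= count + 1;
        end if;
      end if;
    end process counter;

    -- The only intended reader of 'count'.
    consumer : process (clk) is
    begin
      if rising_edge(clk) then
        if count = 15 then
          wrap <= '1';
        else
          wrap <= '0';
        end if;
      end if;
    end process consumer;
  end block shared_cnt;

  -- Outside the block 'count' is not visible, so a reference to it
  -- from this process would be a compile error.
  unrelated : process (clk) is
  begin
    if rising_edge(clk) then
      other <= rst;
    end if;
  end process unrelated;
end architecture rtl;
</code>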

Al's solution of passing state variables around between different processes is another example. Generally, state variables are pure implementation, and should not be shared. A better solution might be to define the interfaces between the procedures as explicit control (start) and status (finished) parameters, so that one procedure can be modified to change the way its FSM works, while maintaining the interface signals' functionality, and the other procedures would not be impacted.
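
One possible shape for such an interface, as a sketch only. The procedures top_level and low_level, the state types and the go/busy ports are all hypothetical and are not Al's code; there is no reset, the variables just rely on their initial values. Each procedure touches only its own state variable (by convention here, since both live in the same process) and the two communicate through start and finished alone:

<code>
library ieee;
use ieee.std_logic_1164.all;

entity handshake_demo is
  port (
    clk  : in  std_logic;
    go   : in  std_logic;
    busy : out std_logic);
end entity handshake_demo;

architecture rtl of handshake_demo is
begin
  main : process (clk) is
    type top_t is (T_IDLE, T_WAIT);
    type low_t is (L_IDLE, L_BUSY);
    variable top_state_v : top_t := T_IDLE;  -- used by top_level only
    variable low_state_v : low_t := L_IDLE;  -- used by low_level only
    variable cnt_v       : natural range 0 to 7 := 0;
    variable start_v     : boolean := false;
    variable finished_v  : boolean := false;

    procedure top_level (
      request  : in  std_logic;
      finished : in  boolean;
      start    : out boolean) is
    begin
      start := false;
      case top_state_v is
        when T_IDLE =>
          if request = '1' then
            start       := true;
            top_state_v := T_WAIT;
          end if;
        when T_WAIT =>
          if finished then
            top_state_v := T_IDLE;
          end if;
      end case;
    end procedure top_level;

    procedure low_level (
      start    : in  boolean;
      finished : out boolean) is
    begin
      finished := false;
      case low_state_v is
        when L_IDLE =>
          if start then
            cnt_v       := 7;
            low_state_v := L_BUSY;
          end if;
        when L_BUSY =>
          if cnt_v = 0 then
            finished    := true;
            low_state_v := L_IDLE;
          else
            cnt_v := cnt_v - 1;
          end if;
      end case;
    end procedure low_level;
  begin
    if rising_edge(clk) then
      -- finished_v is read before it is written: registered status.
      -- start_v is written before it is read: same-cycle control.
      top_level(go, finished_v, start_v);
      low_level(start_v, finished_v);
      if top_state_v = T_WAIT then
        busy <= '1';
      else
        busy <= '0';
      end if;
    end if;
  end process main;
end architecture rtl;
</code>

The internals of low_level (here a simple countdown) could be replaced without touching top_level, as long as finished still pulses once per start.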
The other way the global/local thing is not an issue is because
most designers don't even use a single entity. Each entity has
its own set of signals making them all local.

If designers are using a single process per entity, then yes, there is no practical difference in scope between a signal and a variable. Most designers use multiple processes per entity, so there is a difference for most designers.

I have never wanted to add combinatorial logic to a clock process,
I keep the combinatorial logic separate because it is... well,
separate. This is really an issue of what you are comfortable with.
I don't care for what are fairly subtle usages to create such logic
in a clocked process. It will confuse many designers and is subject
to error.

This is a matter of how most designers are taught HDL: by examples of what kind of code structure creates what kind of circuit, and then just write concurrent islands of code that generates those circuits, and wire them up (in code).

Sure, it is important to know what kind of circuit will be created from a certain piece of code. But the problem is, the synthesis tool is analyzing the behavior of the code, not the structure, and inferring the circuit from that behavior. The problem is that designers are taught that "code that looks like this" creates a register, and "code that looks like that" creates a combinatorial circuit.

Designers should be taught that "code that BEHAVES like this" creates a register, etc. It is amazing to me how many different approaches to avoiding latches in RTL are based on a fundamental misunderstanding of the behavior that infers a latch (which is very similar to the behavior that creates a register).
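
To make the behavioral view concrete, here is a small sketch with invented port names. In all three processes the question is the same: does the output have to remember a value during some execution in which it is not assigned?

<code>
library ieee;
use ieee.std_logic_1164.all;

entity latch_demo is
  port (
    clk     : in  std_logic;
    en      : in  std_logic;
    d       : in  std_logic;
    q_latch : out std_logic;
    q_comb  : out std_logic;
    q_reg   : out std_logic);
end entity latch_demo;

architecture rtl of latch_demo is
begin
  -- q_latch must hold its old value while en = '0'.
  -- Remembering, gated by a level: a latch is inferred.
  infers_latch : process (en, d) is
  begin
    if en = '1' then
      q_latch <= d;
    end if;
  end process infers_latch;

  -- q_comb is assigned on every execution, so nothing has to be
  -- remembered: purely combinatorial logic.
  no_latch : process (en, d) is
  begin
    q_comb <= '0';
    if en = '1' then
      q_comb <= d;
    end if;
  end process no_latch;

  -- q_reg must also hold its old value, but the update is gated
  -- by a clock edge rather than a level: a register is inferred.
  infers_reg : process (clk) is
  begin
    if rising_edge(clk) then
      q_reg <= d;
    end if;
  end process infers_reg;
end architecture rtl;
</code>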

Design productivity can only progress so far by continuing to focus on describing the circuitry (gates and registers). To improve design productivity, we have to start designing more at the behavioral level (functions, throughput and latency). Why do you think high level synthesis tools (that can synthesize untimed models in C, etc.) are becoming so popular? I don't think it is the language as much as it is the concept of describing behavior separate from throughput and latency (those are provided to the HLS tool separately), and getting working hardware out the other end.
Isn't this the same as 2? No, wait, you can't output variables, so you have to use signals to output anything from a clocked process, no?

Of course, any output from a process must be a signal. But for that signal to be a combinatorial function of registered values in the same process, the registers must be inferred from variables. If you use a signal for the register in the process, you have to use a separate process for the combinatorial function.
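
A minimal sketch of that pattern, assuming a hypothetical 4-bit counter with a terminal-count output (count_v and tc are made-up names). The register is inferred from the variable; the output assignment sits outside the clocked if, so it is a combinatorial decode of the registered value. Synthesis tools differ in how well they handle this template, so it is worth checking the RTL viewer for the tool in use:

<code>
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity comb_out_demo is
  port (
    clk : in  std_logic;
    rst : in  std_logic;
    tc  : out std_logic);
end entity comb_out_demo;

architecture rtl of comb_out_demo is
begin
  main : process (clk) is
    variable count_v : unsigned(3 downto 0) := (others => '0');
  begin
    if rising_edge(clk) then
      if rst = '1' then
        count_v := (others => '0');
      else
        count_v := count_v + 1;
      end if;
    end if;

    -- Outside the clocked 'if': this runs on every clk event (both
    -- edges) and reads the stored value of count_v, so tc is a
    -- combinatorial function of the register, not a second register.
    if count_v = 15 then
      tc <= '1';
    else
      tc <= '0';
    end if;
  end process main;
end architecture rtl;
</code>

Note that tc depends only on the variable, never directly on an input port, which matches the rule given earlier for update_ports.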
Intuitive is a nice word for "it matches my bias". If your background
is sequential code like C, then yes, you will likely feel more
comfortable with variables. I learned such languages, but when I
learned HDL I was taught to describe the hardware (after all, we
were hardware designers learning to use an HDL in place of schematics)
and processes and signals are natural tools for that.

Perhaps so, but my background is hardware design (analog and digital circuit cards and later, FPGAs), not SW. My first few XC3090 FPGA designs were by schematic entry. I did not immediately embrace HDL design (I actually lobbied management against it), but once I tried it, I was hooked. My first VHDL training was for simulation, not synthesis, so maybe that too has influenced the way I use VHDL even for synthesis. Over the decades I have seen first hand the value of designing the desired behavior of a circuit, rather than describing the circuit itself. There are times where performance or synchronization still require focus on the circuit. But even for those, I tend to tweak the behavior I am describing to get the circuit I need (using RTL & Technology viewers in the synthesis tool), rather than brute-force the circuit description.
I have never had an issue with running time in a simulation that didn't
involve the amount of data being recorded rather than the simulation
itself. But then I don't design huge FPGAs, I tend to do small designs,
about the same complexity as you might find on a small MCU. I have a
different approach to HDL use. My background is likely very different
from yours. Also, my approach is widely used by the industry and so it
is well debugged in terms of the tools supporting it. I seem to have good
productivity and my designs are solid (as long as I don't make newbie
mistakes like I did recently where I forgot the synchronizing FF on an
input).

Among causes for slow simulations, using signals where variables would work is pretty low on the list of big hitters. But using lots of combinatorial processes is a much bigger hitter (gate level models are the extreme example of this). Some simulators can merge execution of processes that share the same sensitivity list, saving the overhead of separately starting and stopping the individual processes. Combinatorial processes rarely share the same sensitivities, so they are rarely combined, and the performance shows it.
So different horses for different courses.

Of course!

Andy
 

KJ

Another process that can see that counter, uses that same counter's value to control
something else within it, but it depends on the implementation decision in the first
process to use an up-counter.

What happens if the first process is modified to optimize the counter by converting it
to a down-counter? If the counter had been a local variable, then there would be nothing
outside that process that could be directly dependent upon its behavior, and changing
the direction of the counter would have no impacts elsewhere. But if it is a signal,
then the entire architecture has to be understood to make sure that any changes to
the counter's behavior do not have an unforeseen impact.

This is a weak argument. If a signal that is a counter can be 'seen' and used for some other than the original intended purpose, then so can a variable... you simply put the 'other' code inside that same process and watch things not work in the same way as when the counter is a signal. Pretending you have some sort of firewall here doesn't make it one.

If something no longer works because a behavior was changed it will show up in the testbench. Even if you don't do that then you should be able to catch it when you do a simple text search to see where else the signal or variable that you're modifying gets used elsewhere in the design. Making a behaviour change without doing this simple text search to see what else can be affected is a design process that should be modified.

One clear advantage of signals over variables is simply that signals can be added to a wave window for debug after the fact, variables cannot. When an assertion is triggered or some other anomaly is noticed, the fact that you can drag the signal over to the wave window and see the entire history and not have to restart the simulation can be a big time saving advantage. If your sims are short then restarting the sim is not an issue... but then if the sim is short there is likely little wall clock time advantage to use variables either.

Kevin Jennings
 

rickman

Hi Rick,

On 02/10/2013 02:35, rickman wrote:
[]
I came up with this new attempt to implement hierarchical FSMs:

<code>
procedure update_regs is
begin -- purpose: call the procedures above in the desired order
writ_read (state_v, mstate_v, nstate_v); -- low level write/read
pres_puls (state_v, mstate_v, nstate_v); -- presence pulse
auto_read (state_v, mstate_v, nstate_v); -- top level fsm
end procedure update_regs;
</code>

the three FSMs run in 'parallel' and they pass each other the three main
state variables. state_v, mstate_v and nstate_v [1] are globally
accessible but the key point is in the formal parameter definitions of
the three procedures:

<code>
procedure writ_read (
state_v : inout state_t;
mstate_v : in mstate_t;
nstate_v : in nstate_t) is
-- ...
procedure pres_puls (
state_v : in state_t;
mstate_v : inout mstate_t;
nstate_v : in nstate_t) is
-- ...
procedure auto_read (
state_v : in state_t;
mstate_v : in mstate_t;
nstate_v : inout nstate_t) is
-- ...
</code>

While all states are available to all FSMs only one of them is
changeable by each of them. Moreover, calling the parameters with the
same name as the global variables prevents accidental write access to
global variables from inside the procedure, providing some degree of
'protection'.

Maybe you are accustomed to this style, but I would find it rather
complex and difficult to code for. Even though these variables will
result in registers when used in a clocked process, as variables their
values are updated immediately rather than waiting for a delta step like
signals do. So the first procedure will run and update its state
variable. Then the second procedure will run having to be aware that
the first state variable has *already changed*... etc. for the third
state variable.

I do not see what's wrong with the second procedure needing to be aware
of the changed variable. When the second procedure runs all previous
variables have been updated accordingly and depending on their values
the procedure runs accordingly.

That *is* the problem. The second procedure sees the next state rather
than the current state.
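
A small sketch of that effect, using hypothetical procedures step_a and step_b rather than alb's actual FSMs. Called after step_a, step_b sees the value a_v will have after this clock edge; to see the current (registered) value it needs a copy taken before step_a ran:

<code>
library ieee;
use ieee.std_logic_1164.all;

entity order_demo is
  port (
    clk    : in  std_logic;
    rst    : in  std_logic;
    a_out  : out std_logic;
    b_live : out std_logic;
    b_snap : out std_logic);
end entity order_demo;

architecture rtl of order_demo is
begin
  main : process (clk) is
    variable a_v, b_live_v, b_snap_v : std_logic := '0';
    variable a_prev_v                : std_logic := '0';

    procedure step_a (a : inout std_logic) is
    begin
      a := not a;
    end procedure step_a;

    procedure step_b (a : in std_logic; b : out std_logic) is
    begin
      b := a;
    end procedure step_b;
  begin
    if rising_edge(clk) then
      if rst = '1' then
        a_v       := '0';
        b_live_v  := '0';
        b_snap_v  := '0';
      else
        a_prev_v := a_v;             -- copy of the current state
        step_a(a_v);                 -- a_v now holds the next state
        step_b(a_v, b_live_v);       -- sees the next state: tracks a
        step_b(a_prev_v, b_snap_v);  -- sees the current state: lags a
      end if;
      a_out  <= a_v;
      b_live <= b_live_v;
      b_snap <= b_snap_v;
    end if;
  end process main;
end architecture rtl;
</code>

Here b_live equals a_out on every cycle, while b_snap follows a_out one clock later, which is what a separate process reading a as a signal would produce. Swapping the order of the calls changes b_live but not b_snap.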

To infer a register simply use the variable before it is assigned.

I thought they were all registers? If not, they aren't state variables.

this is what the RTL view is showing and what the FSM explorer has found.


exactly the same here. In each individual FSM the inout mode in the
parameter list is for the state parameter that the FSM will run with.
Like in one process FSM, you do not need to formally separate the
'present state' and the 'next state'.

And therein lies the problem. When using a single variable like this,
if some state variables have been updated but not others, the FSM gets
very complex. The isolated procedures for updating each state variable
in your FSM have to be aware of one another and the order in which they
are invoked. This greatly complicates the code and understanding of it.
I would find that to be impossibly difficult to use. I don't consider
this to be useful decomposition.

I've thrown Mealy and Moore definitions away long ago. I think about the
function, not a particular implementation.
I prefer to have registered outputs and in my case those outputs are
described directly in the appropriate states of the FSM, whether this is
Mealy or Moore I could care less with all due respect.

Yes, Mealy and Moore are not often used in a strict sense, but the point
is access to the *next* value of the state rather than the current
value. This lets you get registered outputs out on *this* clock edge
rather than having them wait a clock cycle.
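
In single-process style this can be sketched as follows. The states and the start/stop/busy ports are invented for the example. Because the output is assigned after the state variable has been updated, the busy register is loaded from the next state and changes on the same clock edge as the state register:

<code>
library ieee;
use ieee.std_logic_1164.all;

entity next_state_out_demo is
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;
    start : in  std_logic;
    stop  : in  std_logic;
    busy  : out std_logic);
end entity next_state_out_demo;

architecture rtl of next_state_out_demo is
begin
  main : process (clk) is
    type state_t is (IDLE, RUN);
    variable state_v : state_t := IDLE;
  begin
    if rising_edge(clk) then
      if rst = '1' then
        state_v := IDLE;
      else
        case state_v is
          when IDLE =>
            if start = '1' then
              state_v := RUN;
            end if;
          when RUN =>
            if stop = '1' then
              state_v := IDLE;
            end if;
        end case;
      end if;

      -- state_v already holds the next state here, so busy is a
      -- registered output that goes high on the very edge on which
      -- the state register enters RUN.
      if state_v = RUN then
        busy <= '1';
      else
        busy <= '0';
      end if;
    end if;
  end process main;
end architecture rtl;
</code>

If the state were a signal instead, the same output assignment would read the old state and busy would follow one clock later.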

[1] I should really ask myself why I've chosen this set of names for
the state variables... the fact they are not equal length is a worrying
sign of my 'post 25 y.o. decline'

Hmmm... I've not suffered a post 25 decline. I only sharpened through
my 30s and 40s, but I am seeing a post 50 decline. I slow down and
think more about what I do. lol

At 25 y.o. I spent 3 months on the red roads of the Australian inland, a
year later only 21 days in Chile and two years later only 15 days in New
Zealand. Nowadays I can have a sore back after a night in a tent... why
should I expect my brain to have followed a different course? ;-)

I give up..? Trick question?

The brain develops as does the body. I can't do the things I could do
at 25, but other things I do much better. I'm not willing to stop
learning, but I have goals and my learning fits those goals.
 

rickman

Encapsulation is often less about which code can modify/drive a variable/signal than it is about which code can read it, depend on it, and quit working if it is modified.

For example, say I have a couple of processes in an architecture. One uses a counter to step through some data, and it is not important which order it is processed, so the author decides to use an up-counter.

Another process that can see that counter, uses that same counter's value to control something else within it, but it depends on the implementation decision in the first process to use an up-counter.

What happens if the first process is modified to optimize the counter by converting it to a down-counter? If the counter had been a local variable, then there would be nothing outside that process that could be directly dependent upon its behavior, and changing the direction of the counter would have no impacts elsewhere. But if it is a signal, then the entire architecture has to be understood to make sure that any changes to the counter's behavior do not have an unforeseen impact.

That makes no sense to me. Whether the counter implementation affects
other logic depends on whether the other logic uses the value of the
counter. Why wouldn't you know this if a signal is used?

Sometimes shared counters are a good thing; great, make them signals so that it is known that it is intended to be shared. Otherwise, keep it local so that it cannot be shared. Better yet, if the counter is shared among only two of the processes, put those two processes in a block, and declare the counter signal locally within the block. This protects the counter from dependencies in the other processes in the architecture.

Sometimes??? A counter is part of a design, created by a designer. If
the counter is intended to be shared it is shared, otherwise it is not.
You are talking about a totally different situation than the OP is
talking about using procedures.

Al's solution of passing state variables around between different processes is another example. Generally, state variables are pure implementation, and should not be shared. A better solution might be to define the interfaces between the procedures as explicit control (start) and status (finished) parameters, so that one procedure can be modified to change the way its FSM works, while maintaining the interface signals' functionality, and the other procedures would not be impacted.

I don't follow exactly. My problem with alb's implementation is that
the order of the procedure calls affects the values read by each
procedure, all within *one process*. That has got to be clumsy if not
impossible to make work. Or maybe I read his example wrong. If they
are separate processes then they communicate by signals, no?

If designers are using a single process per entity, then yes, there is no practical difference in scope between a signal and a variable. Most designers use multiple processes per entity, so there is a difference for most designers.

Yeah, but I don't buy into the idea that using signals creates problems
from lack of isolation. Modularization allows isolation. I use
entities, you want to use processes, I don't see much difference. I put
different state machines into different processes for clarity, I think
you (or alb) are putting different state machines into different
procedures in the same process. But I can't see how this would work the
way he shows it with one variable for each state variable. With no
isolation between the present state and next state I can't see how to
code separate procedures.

This is a matter of how most designers are taught HDL: by examples of what kind of code structure creates what kind of circuit, and then just write concurrent islands of code that generates those circuits, and wire them up (in code).

Sure, it is important to know what kind of circuit will be created from a certain piece of code. But the problem is, the synthesis tool is analyzing the behavior of the code, not the structure, and inferring the circuit from that behavior. The problem is that designers are taught that "code that looks like this" creates a register, and "code that looks like that" creates a combinatorial circuit.

Designers should be taught that "code that BEHAVES like this" creates a register, etc. It is amazing to me how many different approaches to avoiding latches in RTL are based on a fundamental misunderstanding of the behavior that infers a latch (which is very similar to the behavior that creates a register).

Design productivity can only progress so far by continuing to focus on describing the circuitry (gates and registers). To improve design productivity, we have to start designing more at the behavioral level (functions, throughput and latency). Why do you think high level synthesis tools (that can synthesize untimed models in C, etc.) are becoming so popular? I don't think it is the language as much as it is the concept of describing behavior separate from throughput and latency (those are provided to the HLS tool separately), and getting working hardware out the other end.

I don't agree really. RTL doesn't describe literal registers and gates.
It describes behavior at the level of registers. If you need that
level of control, which many do just so they can understand what is
being produced, then there is nothing wrong with RTL. Abstractions
create obfuscation with the hardware produced. I often have trouble
predicting and controlling the size and efficiency of a design. What
you are describing would likely make that much worse.

Of course, any output from a process must be a signal. But for that signal to be a combinatorial function of registered values in the same process, the registers must be inferred from variables. If you use a signal for the register in the process, you have to use a separate process for the combinatorial function.

I don't have a problem with that although I would like to learn the
technique better, I might end up liking it. I have seen it, but never
used it. I'm usually too busy designing the circuit in my head to worry
about the coding really. I just don't see problems with the coding.
I'd like to be better at test benches though. There I sometimes code
purely behaviorally. But only when timing is not such an issue.

Perhaps so, but my background is hardware design (analog and digital circuit cards and later, FPGAs), not SW. My first few XC3090 FPGA designs were by schematic entry. I did not immediately embrace HDL design (I actually lobbied management against it), but once I tried it, I was hooked. My first VHDL training was for simulation, not synthesis, so maybe that too has influenced the way I use VHDL even for synthesis. Over the decades I have seen first hand the value of designing the desired behavior of a circuit, rather than describing the circuit itself. There are times where performance or synchronization still require focus on the circuit. But even for those, I tend to tweak the behavior I am describing to get the circuit I need (using RTL & Technology viewers in the synthesis tool), rather than brute-force the circuit description.

There is also the issue that I don't use FPGAs and HDL every day, or any
other tool for that matter. I move around enough that I want to learn a
way to use a tool and then tend to stick with it so I don't have to keep
relearning. The tools change enough as it is.

Among causes for slow simulations, using signals where variables would work is pretty low on the list of big hitters. But using lots of combinatorial processes is a much bigger hitter (gate level models are the extreme example of this). Some simulators can merge execution of processes that share the same sensitivity list, saving the overhead of separately starting and stopping the individual processes. Combinatorial processes rarely share the same sensitivities, so they are rarely combined, and the performance shows it.


Of course!

Andy

Well, like I said, next design I do I will try the combinatorial output
from a clocked process to see how I like it. Not sure when that will be.
 
