Style of coding complex logic (particularly state machines)

KJ

rickman said:
The fanout of an async reset in an FPGA is not an issue because the
signal is a dedicated net.

My point was that if timing is not met due to the large fanout, that
the typical fitter will allow for the fanout to be limited by the user
if necessary. But to directly answer the original question, 'no' I
haven't had reset signal fanout as a problem but if I did I know I
could fix it by limiting the fanout on the fitter side without having
to change the source code. But I also tend to reset only those things
that really need resetting which, by itself, cuts down on the fanout as
well.

The timing is an issue as all the FFs have
to be released in a way that does not allow counters and state machines
to run part of their FFs before the others. But this can be handled by
ways other than controlling the release of the reset. Typically these
circuits only require local synchronization which can be handled easily
by the normal enable in the circuit. For example most state machines
do nothing until an input arrives. So synchronization of the release
of the reset is not important if the inputs are not asserted. Of
course this is design dependent and you must be careful to analyze your
design in regards to the release of the reset.

I agree.

That is what I addressed above. Whether the circuit will malfunction
depends on the circuit as well as the inputs present. It is often not
hard to assure that one or the other prevents the circuit from changing
any state while the reset is released.

But simply synchronizing the reset in the first place will do that as
well...two different approaches to the problem, each equally valid.

Since the dedicated global reset can not be synchronized to a clock of
even moderately high speed, you can provide local synchronous resets to
any logic that actually must be brought out of reset cleanly. I
typically use three FFs in a chain that are reset to zero and require
three clock cycles to clock a one through to the last FF.

Agreed, but one can also view these locally generated resets as simply
synchronized versions of the original reset. In fact, the synthesizer
would probably do just that seeing that you have (for example) 4 places
throughout the design where you've generated a local reset which is
simply the raw reset signal brought into a flip flop chain (I think
that's what you're describing). So it would take those four instances
and probably generate a single shift chain and local reset signal to
send to those 4 places. So all you've really done is written the
source code for the local reset 3 more times than is needed. Had you
taken the view that the reset input to those 4 places must be a
synchronized reset signal in the first place you probably would've
written the reset shift chain logic one time at a top level and
connected it up to those four inputs yourself and not written it on the
receiver side.

This is not a problem when you use the dedicated reset net.

I agree, but I was referring more to the reset signal distribution on a
board rather than inside an FPGA.

Even though there are FFs that do not need a reset, it does not hurt to put
the entire device in a known state every time.

OK, it doesn't 'hurt', but it doesn't 'help' either in the sense that
both approaches would meet the exact same requirements of the
functional specification for that part.

It is not hard to miss a FF that needs to be reset otherwise.

Inside the FPGA it doesn't matter since if you discover something that
you now realize needs to be reset you re-route and get a new file. Not
routing it to a part on a board and then discovering you need it is a
bit more of an issue. Resolving that issue by routing reset to every
part and then using it asynchronously is where problems have come up
when there are a lot of parts on the board.

Personally I think the noise issue is a red herring.

If it's a red herring then I can safely say that I have slayed several
red herrings over my career...but actually not many of late....not
since a certain couple designers moved on to greener pastures to be
brutally honest.

If you have noise
problems on the board, changing your reset to sync will not help in
general. You would be much better off designing a board so it does not
have noise problems.

Maybe. But remember the scenario when you're brought in to fix a
problem with an existing board that you trace back to some issue with
reset. In that situation, a programmable logic change is more likely
the more cost effective solution.

KJ
 
rickman

KJ said:
My point was that if timing is not met due to the large fanout, that
the typical fitter will allow for the fanout to be limited by the user
if necessary. But to directly answer the original question, 'no' I
haven't had reset signal fanout as a problem but if I did I know I
could fix it by limiting the fanout on the fitter side without having
to change the source code. But I also tend to reset only those things
that really need resetting which, by itself, cuts down on the fanout as
well.

I don't know exactly what you mean by fanout. If a sync reset has to
go to 100 FFs, then there is nothing you can do to tell the fitter to
change that. The async reset is free, or actually already paid for, so
if it does the job why not use it?

But simply synchronizing the reset in the first place will do that as
well...two different approaches to the problem, each equally valid.

Both valid, but typically I find the async reset takes less effort and
resources. Only a small portion of my typical design has to be
controlled coming out of reset.

Agreed, but one can also view these locally generated resets as simply
synchronized versions of the original reset. In fact, the synthesizer
would probably do just that seeing that you have (for example) 4 places
throughout the design where you've generated a local reset which is
simply the raw reset signal brought into a flip flop chain (I think
that's what you're describing). So it would take those four instances
and probably generate a single shift chain and local reset signal to
send to those 4 places. So all you've really done is written the
source code for the local reset 3 more times than is needed. Had you
taken the view that the reset input to those 4 places must be a
synchronized reset signal in the first place you probably would've
written the reset shift chain logic one time at a top level and
connected it up to those four inputs yourself and not written it on the
receiver side.

Yes, that is exactly how I think of it, a local sync'd reset. Putting
it where it is needed is both very clear and saves resources. I never
use this in place of the async reset, but rather to supplement it for
synchronization. Much of the logic has to be reset, but very little of
it has to be synchronously released from reset.
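
A minimal sketch of the local sync'd reset described above: three FFs that are asynchronously reset to zero and clock a one through to the last FF. The entity name, port names and the active-high polarity are assumptions made for illustration, not taken from anyone's actual design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity and port names, for illustration only.
entity local_reset_sync is
  port (
    clk       : in  std_logic;
    async_rst : in  std_logic;  -- dedicated global reset, assumed active high
    run       : out std_logic   -- '0' while in reset, '1' once cleanly released
  );
end entity local_reset_sync;

architecture rtl of local_reset_sync is
  signal chain : std_logic_vector(2 downto 0) := (others => '0');
begin
  process (clk, async_rst)
  begin
    if async_rst = '1' then
      chain <= (others => '0');          -- asynchronous assertion
    elsif rising_edge(clk) then
      chain <= chain(1 downto 0) & '1';  -- clock a one through the chain
    end if;
  end process;

  -- Goes high on the third clock edge after async_rst is released.
  run <= chain(2);
end architecture rtl;
```

Only the logic that must leave reset in step (a counter, a state machine) would use run as its enable or synchronous reset; everything else stays on the global async reset.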

I agree, but I was referring more to the reset signal distribution on a
board rather than inside an FPGA.

I understand, but noise still can upset a sync reset. This is just not
a workable solution to noise.

OK, it doesn't 'hurt', but it doesn't 'help' either in the sense that
both approaches would meet the exact same requirements of the
functional specification for that part.

I don't agree. By globally resetting the device, you have handled all
FFs so that if your requirement misses one, you don't find out about it
after the unit is in the field.

Inside the FPGA it doesn't matter since if you discover something that
you now realize needs to be reset you re-route and get a new file. Not
routing it to a part on a board and then discovering you need it is a
bit more of an issue. Resolving that issue by routing reset to every
part and then using it asynchronously is where problems have come up
when there are a lot of parts on the board.

The question is when do you find out about the missing reset? It is
easy for this sort of thing to slip totally through testing and only
show up in the user's hands.

If it's a red herring then I can safely say that I have slayed several
red herrings over my career...but actually not many of late....not
since a certain couple designers moved on to greener pastures to be
brutally honest.

I assume you mean board designers who were not producing quiet boards?

Maybe. But remember the scenario when you're brought in to fix a
problem with an existing board that you trace back to some issue with
reset. In that situation, a programmable logic change is more likely
the more cost effective solution.

I am in a fairly long thread in comp.arch.embedded about how to design
boards so that you don't have SI and EMI issues. I think this sort of
problem should be dealt with before you make the board, not after it is
in the field. Too many engineers learn to cover their butts rather
than to produce good designs. I am tired of working that way and not
really knowing if my design will work before it is shipped. The one
universal rule I learned very early on is that you can not prove a
product works correctly by testing. It has to be designed to work
correctly by using design methods based on understanding what you are
doing. I have never seen a board noise issue that could be fixed by an
FPGA design change.
 
KJ

rickman said:
I don't know exactly what you mean by fanout. If a sync reset has to
go to 100 FFs, then there is nothing you can do to tell the fitter to
change that.

Yes you can. If for example, the timing analysis failed because of
reset then you can tell the fitter to limit fanout to say, 20. Then
what the fitter would do is replicate the flip flop that generates the
reset signal so that there are 5 of them and distribute those 5
(logically identical) resets to those 100 loads.
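
For illustration only, a sketch of how such a fanout limit is sometimes expressed as a synthesis attribute in the source rather than as a fitter setting. The attribute name and type below follow what Xilinx tools accept as far as I recall; other vendors use different names, and the entity and signal names are hypothetical. Check your tool's documentation before relying on it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: registers a reset and asks the tool to limit its fanout.
entity sync_reset_source is
  port (
    clk     : in  std_logic;
    rst_in  : in  std_logic;
    rst_out : out std_logic
  );
end entity sync_reset_source;

architecture rtl of sync_reset_source is
  signal rst_reg : std_logic := '1';

  -- Assumption: vendor-specific attribute (Xilinx-style spelling).
  -- With a limit of 20 and 100 loads the tool would be expected to
  -- replicate rst_reg into 5 logically identical flops.
  attribute max_fanout : integer;
  attribute max_fanout of rst_reg : signal is 20;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      rst_reg <= rst_in;
    end if;
  end process;

  rst_out <= rst_reg;
end architecture rtl;
```

The same limit can usually be applied from the fitter or constraint side, which is the route that leaves the source code untouched.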

We can debate the extra resource usage of those 4 extra flops or that
maybe there wouldn't have been 100 in the first place, but I think
we've both made our points already.

The async reset is free, or actually already paid for, so
if it does the job why not use it?

At what point do you want to find out that the answer to the question
"if it does the job..." turns out to be "No, it doesn't do the job"
because the designer of some hunk of code that you're integrating in
didn't pay as close attention to resets as they should have and that
the way that the code 'used' the reset, while implying it could be
asynchronous really was not the case and that it needed to be
synchronous after all? (Either that or 'fix' the errant hunk of code
of course).

Both valid, but typically I find the async reset takes less effort and
resources. Only a small portion of my typical design has to be
controlled coming out of reset.

And unless that small portion is actually 'zero' you'll need some
synchronizer somewhere. In that case, I've found that resource
differences are negligible or non-existent. I'll accept that you may
have seen differences and I don't want to get into the nitty gritty but
I'll bet that those differences that you saw were pretty small as well.
If not, then to what did you attribute the large difference would be
interesting to know.

As for effort, the only effort I see in either case is the coding which
is identical. It's just a question of where you physically put the "if
(Reset = '1') then"....or is there some other effort that you mean?

I understand, but noise still can upset a sync reset. This is just not
a workable solution to noise.

I'm not sure what solution you're referring to. All I'm saying is that
use of a synchronous reset is less susceptible to a noise issue than an
asynchronous one because it requires the noise to be somewhat
coincident with the clock in order for it to have any effect. On a
given board design though that coincidence will tend to either be near
0 or near 100%....but those near 0 ones don't need to be fixed because
they're not broken if used synchronously.

I don't agree. By globally resetting the device, you have handled all
FFs so that if your requirement misses one, you don't find out about it
after the unit is in the field.

Only if your requirement is that the flip flop be set to the state that
you happened to have coded for it and not the other state. In any
case, it's not sporting to say that one design approach is better
because it has a chance that it just happened to code correctly for a
missing requirement. Actual reset signal requirements are usually
pretty benign and in many cases NOT coding it as a matter of course
could lead one to finding this missing requirement earlier....during
simulation. The scenario I'm thinking of here is that OK, the
functional requirements has an as yet unidentified reset state. Based
on that I code the design and do not do anything to signal 'ABC' as a
result of reset. During simulation I find that I just can't get signal
'ABC' into the proper state (since it is an unknown at the end of
reset) and that I need to because the logic tree that it feeds into
requires 'ABC' to be in the proper state. In that situation the
simulation has immediately hit on to the missing functional requirement
and you can investigate, whereas coding to a specific value you have
the chance of getting it right or not and not finding out until product
is in the field. Starting with 'U' states in simulation and seeing
your system simulation model drive the 'U' out as a result of signals
other than 'reset' is a good indicator of things I've found.

The question is when do you find out about the missing reset? It is
easy for this sort of thing to slip totally through testing and only
show up in the user's hands.

Simulation and the 'U' value in the std_logic type is the key here I've
found to getting all initialization issues properly identified really
early on, long before prototypes.
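
A small testbench sketch of that idea, assuming a hypothetical register 'abc' that is deliberately left out of the reset. It starts as 'U', and the assertions show where the unknown is still present and where something other than reset finally drives it out. All names and timings are made up for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity tb_uninit is
end entity tb_uninit;

architecture sim of tb_uninit is
  signal clk   : std_logic := '0';
  signal reset : std_logic := '1';
  signal d     : std_logic := '0';
  signal abc   : std_logic;          -- no initial value: starts as 'U'
  signal done  : boolean := false;
begin
  -- 10 ns clock that stops once the test is finished.
  clk <= not clk after 5 ns when not done else '0';

  -- Register under test: 'abc' is intentionally not reset.
  process (clk)
  begin
    if rising_edge(clk) then
      if reset = '0' then
        abc <= d;
      end if;
    end if;
  end process;

  stimulus : process
  begin
    wait for 22 ns;
    -- Expected to fire in this sketch: nothing has driven abc yet.
    assert not is_x(abc)
      report "abc is still unknown at the end of reset" severity warning;
    reset <= '0';
    wait until rising_edge(clk);
    wait for 1 ns;
    -- The 'U' has now been driven out by the data path, not by reset.
    assert not is_x(abc)
      report "abc is still unknown after the first active clock" severity error;
    done <= true;
    wait;
  end process;
end architecture sim;
```

If downstream logic needed abc in a known state at the moment reset ends, the first assertion is where the missing requirement shows up, long before hardware.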
I assume you mean board designers who were not producing quiet boards?

And that then had problems that needed to be fixed.

I am in a fairly long thread in comp.arch.embedded about how to design
boards so that you don't have SI and EMI issues. I think this sort of
problem should be dealt with before you make the board, not after it is
in the field.

Totally agree. But being realistic here, if you DO have boards out in
the field with this problem, there is also the issue of what is the
cost effective way to fix the problem from the perspective of both your
company and your customer?

Too many engineers learn to cover their butts rather
than to produce good designs. I am tired of working that way and not
really knowing if my design will work before it is shipped. The one
universal rule I learned very early on is that you can not prove a
product works correctly by testing.

Agreed.

It has to be designed to work
correctly by using design methods based on understanding what you are
doing. I have never seen a board noise issue that could be fixed by an
FPGA design change.

Here's a hypothetical one (but not far from what I've seen) for you
then. You've got a 'blip' on reset where it gets above threshold and
that lasts for...maybe 1 ns at the receiver. You trace it down and
find out exactly what output switching condition is causing the blip to
happen. You can also characterize and analyze it to say that it will
never be able to couple and cause this blip to exist for more than 2
ns. On the receiver you have a clocked device that receives this reset
signal.

The 'proper' solution of course is to re-route the board to get the
reset away from the noise initiator, guard it appropriately,
etc.....the 'soft' design change is to change the code in the receiving
device to ignore resets that last for only two clocks or less (or
whatever works for you). Granted, the reset response of the device has
been degraded (by that clock cycle or two) but in many cases, that's OK
as well. You need to investigate it of course to validate but under
the right circumstances it would work just as flawlessly as the PCB
re-route.
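
A rough sketch of such a 'soft' fix, assuming an active-high board reset and a clock slow enough that the blip can never be seen on three consecutive edges. The entity and signal names are hypothetical, and the depth of the filter is an arbitrary choice for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical reset filter: ignores pulses seen on fewer than three
-- consecutive clock edges.
entity reset_filter is
  port (
    clk       : in  std_logic;
    rst_raw   : in  std_logic;  -- board-level reset, may carry short blips
    rst_clean : out std_logic
  );
end entity reset_filter;

architecture rtl of reset_filter is
  signal samples : std_logic_vector(3 downto 0) := (others => '0');
  signal rst_q   : std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- samples(0) is the first synchronizing stage and is not used
      -- in the decision below.
      samples <= samples(2 downto 0) & rst_raw;

      if samples(3 downto 1) = "111" then
        rst_q <= '1';            -- reset held long enough: accept it
      elsif samples(3 downto 1) = "000" then
        rst_q <= '0';            -- reset released long enough: let go
      end if;                    -- anything else: keep the previous value
    end if;
  end process;

  rst_clean <= rst_q;
end architecture rtl;
```

The cost is the one mentioned above: the reset response is delayed by a few clocks, and the filter only works while the clock is running.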

The point being that just because a solution does not tackle the root
cause does not necessarily imply that it is in any way less robust.
And I'll also accept that in some (possibly many) situations there may
be no 'soft' solution...if you'll also accept that in some (possibly
many) situations that there really might.

Now, you've got "N" boards in the field. What is the 'best' solution,
not only from the perspective of your company (presumably the 'soft'
update is easier to distribute) but from your customers as well (who
would have down time to swap out the PCBA....oops, that board is in a
deep sea sensor? On the way to Mars? Inside average Joe user's PC?)

KJ
 
Mike Treseler

Eli said:
I also try to avoid variables for another reason (in addition to the
ones you stated). Somehow, when variables are used I can't be 100% sure
if the resulting code is synthesizable, because it can turn out not to
be.

If you mean that a variable does not always infer a register I agree.
If you mean that synthesis does not always produce a netlist that
simulates the same as the code, I disagree.

Additionally, since I do use signals, variables create the mixup of
"update now" and "update later" statements which make the process more
difficult to understand. With signals only it's all "update later".

I agree, and this is exactly why
I do not declare any signals for synthesis.
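
For readers following along, a minimal sketch of the "update now" versus "update later" difference being discussed. The entity and port names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity update_demo is
  port (
    clk   : in  std_logic;
    a     : in  unsigned(7 downto 0);
    s_out : out unsigned(7 downto 0);
    v_out : out unsigned(7 downto 0)
  );
end entity update_demo;

architecture rtl of update_demo is
  signal s : unsigned(7 downto 0) := (others => '0');
begin
  process (clk)
    variable v : unsigned(7 downto 0) := (others => '0');
  begin
    if rising_edge(clk) then
      s     <= a + 1;  -- "update later": s changes after the process suspends
      s_out <= s;      -- reads the OLD s, so s_out is a + 1 delayed two clocks

      v     := a + 1;  -- "update now": v changes immediately
      v_out <= v;      -- reads the NEW v, so v_out is a + 1 delayed one clock
    end if;
  end process;
end architecture rtl;
```

Here v is written before it is read on every clock, so it does not infer a register of its own, while s does.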

-- Mike Treseler
 
Duane Clark

KJ said:
...
The drawback of signals is that they take longer simulation time...wasted
time too. I'm trying to resurrect the test code that I had comparing
use of variables versus signals but I seem to remember about a 10% hit
for signals...

I would be interested in whether anyone has theories on why variables
would simulate faster than signals. And whether this behavior has been
seen on different simulators, or only Modelsim.
 
Andy

KJ said:
5. There was also a post either here or in comp.lang.vhdl in the past couple
months that talked about how using the generally listed template can result
in gated clocks getting synthesized when you have some signals that you want
reset, and other signals that you don't. Being in the same process and all,
the original poster found that gated clocks were being synthesized in order
to implement this logic. The correct form of the template (that rarely gets
used by anyone posting to either this group or the vhdl group) is of the
form
process(clk, reset)
begin
  if rising_edge(clk) then
    s1 <= Something;
    s2 <= Something_else;
  end if;
  if (reset = '1') then
    s1 <= '0';
    -- s2 does not need to be reset,
  end if;
end process;

Again, the scenario here is that you have
- More than one signal being assigned in this process
- At least one of those signals is not supposed to change as a result of
reset (either this is by intent, or by unintentionally forgetting to put the
reset equation)

Depending on the synthesis tool, this could result in a gated clock getting
generated as the clock to signal 's2' in the above example.

KJ

KJ,

I may be the previous poster you are speaking of...

The standard template with "if reset then... elsif rising_edge(clk)
then ..." will not cause a gated clock, but rather a clock enabled
register, disabled during reset, for those signals not reset in the
reset clause. This is also independent of whether reset is coded as a
synchronous or an asynchronous input (because of the elsif). The
template you used above would allow the normal clocked statements to
execute, and then override those signals that are reset, leaving the
unreset ones to retain their normal clocked behavior, thus avoiding the
need to disable them during reset.
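
For comparison with KJ's template, a sketch of the standard "elsif" form with one signal left out of the reset clause. The names are hypothetical; the comments mark where the implied clock enable comes from.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity elsif_template is
  port (
    clk, reset : in  std_logic;
    d1, d2     : in  std_logic;
    s1, s2     : out std_logic
  );
end entity elsif_template;

architecture rtl of elsif_template is
begin
  process (clk, reset)
  begin
    if reset = '1' then
      s1 <= '0';
      -- s2 is not assigned here, so it must hold its value while reset = '1'.
    elsif rising_edge(clk) then
      s1 <= d1;
      -- Only reached when reset = '0': for s2 this behaves like a
      -- clock enable of "not reset".
      s2 <= d2;
    end if;
  end process;
end architecture rtl;
```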

Other comments on this thread:

If one disables all retiming and other sequential optimizations, then
there is definite merit in a descriptive style that explicitly
describes combinatorial behavior separately from registered behavior
(i.e combinatorial processes or concurrent statements separate from
clocked processes). But once retiming, etc. are enabled, all bets are
off. In those cases, I believe one is better off focusing on the
behavioral (temporal and logical) description and getting it right, and
not paying so much attention to specific gates and registers which will
not exist in the final result anyway. Since I enable retiming by
default, I use single, clocked processes by default as well.

One aspect that has not been touched upon is data scoping. One
convenient aspect of using variables is that their scope limits their
visibility to within the same process. The comment about "related"
functions being described in the same process is important in this
aspect. There is no need for unrelated functions to have visibility to
unrelated signals. Within "one big process" for the whole architecture,
scoping can be implemented with blocks, functions, or procedures
declared within the process to create islands of related functionality,
with limited visibility. I generally prefer to separate unrelated
functions to separate processes, but all my processes are clocked.

State variables are one such scoping application. I generally don't
want any process but the state machine process itself to have any idea
of what states I am using, and what they mean (the concept of
"information hiding" comes to mind). If I need something external to
happen when the state machine is in a certain state, I do it from
within the state machine process, either by handling it directly (e.g.
adding one to a count), or "exporting" a signal to indicate when it
should happen. The same effect can be accomplished with local signal
declarations inside a block statement that contains the combinatorial
next state process, the output process (if applicable), and the state
register process.
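
A small sketch of that scoping style, with the state type and state variable declared inside the process so nothing else in the architecture can see them. The states, ports and the synchronous reset are assumptions chosen for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity scoped_fsm is
  port (
    clk, reset    : in  std_logic;
    start, finish : in  std_logic;
    busy          : out std_logic
  );
end entity scoped_fsm;

architecture rtl of scoped_fsm is
begin
  fsm : process (clk)
    -- Both the type and the variable are visible only inside this process.
    type state_t is (st_idle, st_run);
    variable state : state_t := st_idle;
  begin
    if rising_edge(clk) then
      if reset = '1' then
        state := st_idle;
      else
        case state is
          when st_idle =>
            if start = '1' then state := st_run; end if;
          when st_run =>
            if finish = '1' then state := st_idle; end if;
        end case;
      end if;

      -- "Exported" signal: the only thing the outside world learns
      -- about the state machine.
      if state = st_run then
        busy <= '1';
      else
        busy <= '0';
      end if;
    end if;
  end process fsm;
end architecture rtl;
```

Renaming or reorganizing the states then touches this one process and nothing else.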

Andy
 
Charles Bailey

Martin Gagnon said:
txgen_state_machine_proc:
process(clk, reset_n)
begin
  if reset_n = '0' then
    prev_state_buf <= st_idle ;
    cur_state_buf <= st_idle ;

  elsif rising_edge(clk) then
    prev_state_buf <= cur_state_buf ;

    case cur_state_buf is
      when st_idle =>
        if sync = '1' then
          cur_state_buf <= st_gotsync ;
        else
          cur_state_buf <= cur_state_buf;
        end if;
      ...

You have a lot of unnecessary "else" clauses in your state machine code.
All you need is this:
    ...
    case cur_state_buf is
      when st_idle =>
        if sync = '1' then
          cur_state_buf <= st_gotsync ;
        end if;
      ...

In other words, if cur_state_buf=st_idle and sync='0', for example,
cur_state_buf will keep its current value. You don't have to explicitly
reload it with the current value. Synthesis tools will automatically
insert the proper gating logic.

Charles Bailey
 
Andy

Variables simulate faster because there is no scheduling of a later
value update, as with signals (signal values do not actually update
until after the assigning process suspends). If the signal has
processes that are sensitive to it (i.e. separate combinatorial and
registered processes), then there is the process invocation overhead as
well.

Most modern simulators also merge all processes that are sensitive to
the same signal(s), to avoid the duplicate overhead of separate process
invocations. Combinatorial processes, because of their widely varying
sensitivity lists, foil this optimization.

By using only clocked processes with variables, one can write
synthesizable RTL that simulates at speeds approaching that of
cycle-based code on cycle accurate simulators.

Andy
 
Mike Treseler

Andy said:
I may be the previous poster you are speaking of...

The standard template with "if reset then... elsif rising_edge(clk)
then ..." will not cause a gated clock, but rather a clock enabled
register, disabled during reset, for those signals not reset in the
reset clause. This is also independent of whether reset is coded as a
synchronous or an asynchronous input (because of the elsif).

Exactly. Synthesis will go through asynchronous contortions
to *prevent* a register from being reset.
This is why I reset all registers the same way
and why I don't touch my process template
between _begin_ and _end_.

But once retiming, etc. are enabled, all bets are
off. In those cases, I believe one is better off focusing on the
behavioral (temporal and logical) description and getting it right, and
not paying so much attention to specific gates and registers which will
not exist in the final result anyway.

Well said.

-- Mike Treseler
 
KJ

Andy said:
KJ,

I may be the previous poster you are speaking of...

The standard template with "if reset then... elsif rising_edge(clk)
then ..." will not cause a gated clock, but rather a clock enabled
register, disabled during reset, for those signals not reset in the
reset clause.
<snip>
No, you weren't the one Andy although you and I did discuss resets on
that thread as well. The one I'm referring to is from June 15 in
comp.lang.vhdl called "alternate synchronous process template" started
by "Jens" (all that just in case the link below doesn't work)
http://groups.google.com/group/comp.lang.vhdl/browse_frm/thread/77006ae7297b6e86/?hl=en#

At the time, nobody seemed to dispute Jens's claim that the gated clock
could be created....I dunno, don't use them async resets ;)

Other comments on this thread:

If one disables all retiming and other sequential optimizations, then
there is definite merit in a descriptive style that explicitly
describes combinatorial behavior separately from registered behavior
(i.e combinatorial processes or concurrent statements separate from
clocked processes).

I'm not sure what merit you see in that. I'm describing the
functionality of the entity. If there is some need for what amounts to
a combinatorial function of the current state I'll do it with a
concurrent statement whereas you and Mike T will do it with a variable.
In either case, we would be trying to implement the same function
whether optimizations were on or off.

But once retiming, etc. are enabled, all bets are
off. In those cases, I believe one is better off focusing on the
behavioral (temporal and logical) description and getting it right, and
not paying so much attention to specific gates and registers which will
not exist in the final result anyway.

The "focusing on the behavioral (temporal and logical) description..."
is what I'm focused on as well. I also couldn't care less about
"specific gates and registers which will
not exist in the final result anyway". I'm just trying to get the
function and timing to meet the goal, if it all gets mushed together in
the synthesis process that's fine...that's what I pay for the tool to
do....

Either that or I'm missing what your point is, I've been known to do
that.

One aspect that has not been touched upon is data scoping. One
convenient aspect of using variables is that their scope limits their
visibility to within the same process. The comment about "related"
functions being described in the same process is important in this
aspect. There is no need for unrelated functions to have visibility to
unrelated signals.

I'll agree but add that that is somewhat of a 'religious' statement.
If taken to the other extreme yes you have a huge mass of only global
signals (and I'm not advocating that) but if one breaks the problem
down into manageable sized entities you don't (or should I say, I
don't) tend to have hundreds of signals in the architecture either.
It's a manageable size, say from 0 to 2 dozen as a rough guess.

Within "one big process" for the whole architecture,
scoping can be implemented with blocks, functions, or procedures
declared within the process to create islands of related functionality,

This wouldn't address the issue I brought up about the use of
Modelsim's Dataflow window as a debug aid, but OK....my islands of
related functionality are the multiple processes and the concurrent
statements.

with limited visibility. I generally prefer to separate unrelated
functions to separate processes, but all my processes are clocked.

As do I.

State variables are one such scoping application. I generally don't
want any process but the state machine process itself to have any idea
of what states I am using, and what they mean (the concept of
"information hiding" comes to mind).

I would consider that to be a 'religion' thing. I wouldn't draw the
somewhat arbitrary boundary, I consider all of the logic implemented in
an entity to be closely enough related that they can at least talk
amongst themselves if it is helpful to get the overall function
implemented. Not really disagreeing with you, just saying that there
is no reason that relates back to the functional spec that would
justify this hiding so I wouldn't necessarily break them apart unless
the 'process fits on a screen' fuzzy rule starts kicking up.

If I need something external to
happen when the state machine is in a certain state, I do it from
within the state machine process, either by handling it directly (e.g.
adding one to a count), or "exporting" a signal to indicate when it
should happen.

And that tends to muddy the waters somewhat for someone following the
code since they can't perceive the interaction between the state
machine and the outputs all in one fell swoop that they could if it was
put together (and it didn't violate the 'process fitting on a screen'
fuzzy rule).

Good points, I don't necessarily disagree with the idea of local
scoping and information hiding as a general guiding principle but it
can be taken as dogma too that results in hiding things from those who
have a need to know (i.e. those other statements, processes, etc. that
are all within the same entity/architecture).

If you view all of those statements and processes in an entity as being
part of the same 'team' doing their little bit to get the overall
function of the entity implemented none is really more important than
the other, they all live and die together. By that rather crude sports
analogy the idea of 'information hiding' should be taken with some
suspicion. And yes, I realize that VHDL has nothing to do with sports
just thought I'd toss out an unrelated analogy to break up the day.

But which approach one takes is definitely a function of just how 'big'
the function being implemented is. One architecture with hundreds of
signals would be far worse than multiple processes with local variables
all scoped properly. But if you have hundreds of signals I'll bet you have
thousands of lines of code all within one architecture and I'll bet it
would be a good candidate for some refactoring and breaking it down
into multiple subentities that could be understood individually instead
of only as some large collective.

KJ
 
Andy

Wow, I never even noticed he said "gated clock" in the OP of that other
thread. I have never seen that, just the clock-disabled registers
(which creates a problem when the reset asynchronously asserts, all
mine synchronously deassert anyway).

The synthesis tool is not just trying to keep those unreset registers
from resetting, it is keeping them from doing anything else while the
other registers are reset, which is exactly the way the code simulates,
because of the elsif. Avoiding the elsif by using a parallel if
statement (whether synchronous or asynchronous) at the end avoids the
clock-disables. The main place where I have run into this is when
inferring memories from arrays. The array cannot be reset, otherwise
you get a slew of registers. But if it is in a process with other
registers that do get reset, then that creates a problem, which is
solved by putting the reset clause in parallel, at the end.
Occasionally, resets cause optimization or routing problems when I'm
trying to squeeze the most performance from a design, and I'll remove
the reset from those registers as well if it is not needed. My general
preference is to reset everything though, and I generally use the
traditional form since it will give me a warning if something is not
reset.
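
A sketch of the memory case described above: an array that is never reset, sharing a process with a register that is, with the reset clause placed in parallel at the end. The entity, sizes and the 'valid' flag are hypothetical, and whether a given tool infers a block RAM from this is tool dependent.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity ram_with_reset is
  port (
    clk, reset, we : in  std_logic;
    addr           : in  unsigned(3 downto 0);
    din            : in  std_logic_vector(7 downto 0);
    dout           : out std_logic_vector(7 downto 0);
    valid          : out std_logic
  );
end entity ram_with_reset;

architecture rtl of ram_with_reset is
  type ram_t is array (0 to 15) of std_logic_vector(7 downto 0);
  signal ram : ram_t;  -- intentionally never reset
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        ram(to_integer(addr)) <= din;
      end if;
      dout  <= ram(to_integer(addr));  -- registered read
      valid <= '1';

      -- Reset in parallel, at the end: it overrides only the signals
      -- assigned here and leaves the array and dout untouched, so no
      -- "not reset" enable is implied on the memory.
      if reset = '1' then
        valid <= '0';
      end if;
    end if;
  end process;
end architecture rtl;
```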

I don't take data scoping to a religious level, but I do keep it in
mind, even below the architecture level.

When coding state machines and their outputs, I prefer to see
everything associated with one state in one place. If it is not there,
it does not have visibility of the state anyway, the way I code it.
That way if I change my mind about the organization or naming of the
states, the effects of such a change are limited to one place and one
place only. It is more for maintenance than anything else, to try to
limit the extent to which all those signals become interwoven and
impossible to untangle. VHDL makes it relatively easy to see what all
the inputs are to a function, but finding all the places where a signal
goes is another matter. That's what the text search function is for...

As to when to isolate different processes in a separate
entity/architecture, that is a touchy-feely type of decision. I
usually know it when I see it, but trying to describe a set of rules
for it is much more difficult than just doing it. Because my coding
styles are generally more compact than those with separate processes
for combo and registered logic, I generally get more in an architecture
before it gets too big. So a lightweight scoping mechanism is useful to
deal with more complexity within a given architecture. Let's just say
it helps keep a borderline too-complex description from overflowing
into multiple entity/architectures.

I like your "what fits on a screen" standard for processes. That seems
to work well for me too. That could be extended to functions and
procedures too, although mine are not usually anywhere near that long,
and they are usually defined within the process anyway.

My point about merits of separate combinatorial and clocked processes
is that most proponents of that style like the fact that they can
easily visualize what is gates and what is registers. I try to
encourage them to lift their visual ceilings (and floors, to some
extent) and focus on behavior since, especially with retiming and other
sequential optimizations, their original description will have little
in common with the synthesis output, except for the behavior which is
often obscured by the separation of registers from gates in the first
place. The same argument applies to using variables for registers
and/or combinatorial logic.

Thanks for the ideas...

Andy
 
Eli Bendersky

Mike said:
If you mean that a variable does not always infer a register I agree.
If you mean that synthesis does not always produce a netlist that
simulates the same as the code, I disagree.

Is all code using variables always synthesizable, and can you tell at a
single glance how many clock cycles the update of all values takes? I'd
really appreciate a simple example or two.
Thanks in advance
 
mikegurche

Eli said:
Is all code using variables always synthesizable, and can you tell at a
single glance how many clock cycles the update of all values takes? I'd
really appreciate a simple example or two.
Thanks in advance

In VHDL, a variable is a more "abstract" construct. Unlike a
signal, which is mapped to a wire or a wire with memory (i.e., a latch
or FF), a variable has no direct hardware counterpart, and the
synthesized circuit depends on the context in which the variable is
used.

A variable in VHDL is "static", which means that its value is
kept between process invocations. This implies that a variable may
need to keep its previous value. Thus, a variable infers a latch or an
FF if it is "used before assigned a value" in a process, and infers
a combinational circuit if it is "assigned a value before used".
In this respect, a variable is usually synthesizable. I personally use
variables in a very restricted way:
- don't use a variable to infer memory
- avoid self reference (e.g., n := n+1)
- use it as shorthand for a function.
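
As a simple illustration of the "assigned before used" versus "used before assigned" distinction, here is a minimal sketch. The entity and all names in it are hypothetical, and the comments describe the usual synthesis outcome rather than a guarantee for every tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: two variables in one clocked process.
entity var_demo is
  port (
    clk       : in  std_logic;
    a, b      : in  std_logic;
    y_now     : out std_logic;
    y_delayed : out std_logic
  );
end entity var_demo;

architecture rtl of var_demo is
begin
  process (clk)
    variable tmp : std_logic;         -- assigned before it is used
    variable dly : std_logic := '0';  -- used before it is assigned
  begin
    if rising_edge(clk) then
      -- tmp is written first and read afterwards, so it is only a name
      -- for the AND gate output that feeds the y_now register.
      tmp   := a and b;
      y_now <= tmp;

      -- dly is read first, so the value seen here is the one stored on
      -- the previous clock edge. dly therefore needs its own flip-flop,
      -- and y_delayed lags (a xor b) by two clock edges.
      y_delayed <= dly;
      dly       := a xor b;
    end if;
  end process;
end architecture rtl;
```

Counting the clock cycles comes down to reading the process top to bottom and noting, for each variable, whether the read comes before or after the write.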

Although I don't do it, this approach can even be used in a clocked
process to obtain combinational, unbuffered output (see my previous
post on the 1-process FSM example).

In synthesis, the problem is normally the abuse of sequential
statements, rather than the use of variables. I have seen people trying
to convert a C code segment into a VHDL process (you can have variables,
for loop, while loop, if, case, and even break inside a process) and
expecting the synthesis software to figure out everything.

My 2 cents.

Mike G
 
KJ

Martin Gagnon wrote:
Hi. I've read this PDF and it looks very interesting. It shows how many
different types of state machine implementations there are, etc. But the
way I code my state machine is different from all of them, and I don't
know if it's good, and I'm not sure which one mine is equivalent to.
I'd classify it with the 'one process' state machine folks since it
doesn't involve a combinatorial process to compute the next state
followed by a synchronous process to transform next state into current
state. That's usually the litmus test between the 'one process' versus
'two process' folks.

Martin Gagnon wrote:
Here's one of my state machine examples.

    txgen_state_machine_proc:
    process(clk, reset_n)
    begin
      if reset_n = '0' then
        prev_state_buf <= st_idle;
        cur_state_buf  <= st_idle;

      elsif rising_edge(clk) then
        prev_state_buf <= cur_state_buf;

        case cur_state_buf is
          when st_idle =>
            if sync = '1' then
              cur_state_buf <= st_gotsync;
            else
              cur_state_buf <= cur_state_buf;
            end if;
        end case;
      end if;
    end process;

What do you think about the way I do my state machine?

Well since you asked...

1. Since the reset input into a state machine almost always needs to be
synchronized I would lean towards using a synchronous reset template
(i.e. take reset out of the sensitivity list and move the "if (reset_n
= '0')" inside the "if rising_edge(clk) then"). I almost hesitate to
bring this up since it's already been debated on this thread and
others, so I'll leave it as that's my 2 cents on the reset.

2. The "else cur_state_buf <= cur_state_buf;" construct that shows up
on every case is redundant and makes the overall source code roughly
twice as big as it would otherwise need to be.

Now, it could be argued that adding the else branch makes it 'clearer'
about what state the state machine goes to next but the following
counterarguments could be made as well.
- Read up some more on VHDL (not necessarily you, but the reader that
you're trying to make it 'clearer' to). If left unassigned, a signal
will retain its current state. Add a comment if you need to if you're
trying to guide the new guy that doesn't realize this, but don't double
the code size.
- Even most software languages are defined this way (i.e. if not
explicitly assigned a new value, every variable retains its current
state).

Some other things to keep in mind...
- Question yourself about which is 'clearer', the 50 line process with
the else branches or the 25 line process without?
- Since the synthesizer will output the exact same output whether you
have the redundant else branches or not, question yourself why are you
explicitly writing lines of code (which have some non-zero probability
of having an error) that will get chucked out the window on step #1 of
synthesis? Is that a good use of time?
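
Putting points 1 and 2 together, a minimal sketch of the posted state machine might look like the following. The entity wrapper, the state type, the sync_seen output, and the behaviour of st_gotsync are assumptions added only to make the fragment complete; they are not part of the original post.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper around the posted process.
entity txgen_fsm is
  port (
    clk       : in  std_logic;
    reset_n   : in  std_logic;  -- assumed already synchronized to clk
    sync      : in  std_logic;
    sync_seen : out std_logic   -- assumed output, for illustration only
  );
end entity txgen_fsm;

architecture one_process of txgen_fsm is
  type state_t is (st_idle, st_gotsync);
  signal prev_state_buf, cur_state_buf : state_t;
begin
  txgen_state_machine_proc :
  process (clk)  -- reset_n removed from the sensitivity list
  begin
    if rising_edge(clk) then
      if reset_n = '0' then
        -- Synchronous reset: only evaluated on a clock edge.
        prev_state_buf <= st_idle;
        cur_state_buf  <= st_idle;
      else
        prev_state_buf <= cur_state_buf;

        case cur_state_buf is
          when st_idle =>
            if sync = '1' then
              cur_state_buf <= st_gotsync;
            end if;  -- no else: cur_state_buf keeps its value
          when st_gotsync =>
            null;    -- assumed: stay here until reset
        end case;
      end if;
    end if;
  end process;

  -- One-clock pulse on entry to st_gotsync, using prev_state_buf.
  sync_seen <= '1' when cur_state_buf = st_gotsync
                    and prev_state_buf = st_idle else '0';
end architecture one_process;
```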

Don't read this as harsh criticism, just read it as the 2 cents of
input that you asked for (and that nobody until now has responded to,
that I can see).

KJ
 
Martin Gagnon

Thanks for your 2 cents. I like to have that kind of feedback. It's
true, I specified the else to be explicit and because I was not
absolutely sure what the synthesizer outputs in that case. But I think
you are right. I will probably cut some explicit code on my next
projects.

I'm more concerned about my single process state machine with my
prev_state and current_state with everything clocked, as opposed to what
is suggested in the PDF.

Thanks for your answer.
 
Mike Treseler

Eli said:
Is all code using variables always synthesizable, and can you tell at a
single glance how many clock cycles the update of all values takes?

The variables are updated every clock,
but that "update" may be to keep the
same value.

Eli said:
I'd really appreciate a simple example or two.

The advantages of a variable logic description
*increase* with complexity, so a persuasive
yet simple example is a challenge.


My favorite simple example is the
"clock enabled counters" source here:

http://home.comcast.net/~mike_treseler/

The focus is on updating values for simulation
rather than recipes for gates and flops.
The procedure "update_regs"
only describes value updates
required for the slow, medium and fast counts.
Note that I read carry bits and immediately clear
them without worrying about what that means in gates or flops.

Note in the RTL schematic view (object) that synthesis does
just fine working out how the carries and enables
work and where registers are not needed.
Also note that a process-per-block description using this view
would be more complicated than the example source.
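
For readers who cannot reach the linked source, here is a minimal sketch of the same idea, cut down to two counters. It is not the "clock enabled counters" source referred to above; the entity, the variable names, and the widths are all hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical two-stage counter described purely as value updates.
entity cascaded_counters is
  port (
    clk    : in  std_logic;
    reset  : in  std_logic;  -- assumed synchronous to clk
    fast_q : out unsigned(3 downto 0);
    slow_q : out unsigned(3 downto 0)
  );
end entity cascaded_counters;

architecture sketch of cascaded_counters is
begin
  main : process (clk)
    variable fast_v  : unsigned(3 downto 0) := (others => '0');
    variable slow_v  : unsigned(3 downto 0) := (others => '0');
    variable carry_v : boolean := false;
  begin
    if rising_edge(clk) then
      if reset = '1' then
        fast_v  := (others => '0');
        slow_v  := (others => '0');
        carry_v := false;
      else
        -- Fast count: runs every clock and flags a carry on rollover.
        fast_v := fast_v + 1;
        if fast_v = 0 then
          carry_v := true;
        end if;

        -- Slow count: read the carry and immediately clear it.
        if carry_v then
          slow_v  := slow_v + 1;
          carry_v := false;
        end if;
      end if;

      -- Drive the ports from the updated variables.
      fast_q <= fast_v;
      slow_q <= slow_v;
    end if;
  end process main;
end architecture sketch;
```

In this sketch carry_v is always false again by the end of each clock edge, so a synthesis tool would normally be expected to reduce it to plain logic rather than a flip-flop.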

-- Mike Treseler
 
KJ

Martin Gagnon wrote:
I'm more concerned about my single process state machine with my
prev_state and current_state with everything clocked, as opposed to what
is suggested in the PDF.

The two process folks (those advocating the combinatorial 'next' state
process followed by the clocked process) and the one process folks
(those advocating what you've posted, plus possible use of concurrent
statements for combinatorial logic if required) all do agree on one
point:

Either method will produce identically functioning code that will
synthesize to the exact same output design.

With at least grudging agreement, the two process folks will also have
to agree that the one process approach requires fewer lines of source
code entry.

Take those two points as the two great truths in the 'great debate' and
draw your own conclusions.

All the other stuff about setting outputs a clock cycle earlier or
later, localizing references to signals, sim time using variables or
signals, etc. are just more interesting talk and tips but should have
no effect on which overall approach you adopt.
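
For comparison, a minimal sketch of what the two process style looks like for the same small state machine. The entity wrapper, the state type, the sync_locked output, and the behaviour of st_gotsync are assumptions for illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two process version of the posted state machine.
entity txgen_fsm_two_process is
  port (
    clk         : in  std_logic;
    reset_n     : in  std_logic;  -- assumed already synchronized to clk
    sync        : in  std_logic;
    sync_locked : out std_logic   -- assumed output, for illustration only
  );
end entity txgen_fsm_two_process;

architecture two_process of txgen_fsm_two_process is
  type state_t is (st_idle, st_gotsync);
  signal cur_state, next_state : state_t;
begin
  -- Combinatorial process: computes the next state.
  next_state_proc : process (cur_state, sync)
  begin
    next_state <= cur_state;  -- default assignment avoids a latch
    case cur_state is
      when st_idle =>
        if sync = '1' then
          next_state <= st_gotsync;
        end if;
      when st_gotsync =>
        null;                 -- assumed: stay here until reset
    end case;
  end process next_state_proc;

  -- Clocked process: turns next state into current state.
  state_reg_proc : process (clk)
  begin
    if rising_edge(clk) then
      if reset_n = '0' then
        cur_state <= st_idle;
      else
        cur_state <= next_state;
      end if;
    end if;
  end process state_reg_proc;

  sync_locked <= '1' when cur_state = st_gotsync else '0';
end architecture two_process;
```

The extra next_state signal, the second process, and the default assignment are where the additional source lines come from.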

KJ
 
rickman

mikegurche said:
In synthesis, the problem is normally the abuse of sequential
statements, rather than the use of variables. I have seen people trying
to convert a C code segment into a VHDL process (you can have variables,
for loop, while loop, if, case, and even break inside a process) and
expecting the synthesis software to figure out everything.

Personally I think most problems in using HDLs in this way come not
directly from the way signals or variables are used, but rather from
the use of an HDL to describe the solution in an abstract way. I
nearly always design in terms of registers and "clouds" for the logic.
I get a feel for how large the design is and if I need to optimize at
this block diagram level. I can even get an idea of how complex the
logic part is by looking at the equations that describe it. Then I use
an HDL to "describe" the hardware rather than describing the
functionality and letting the tool decide what hardware to invoke.

If I know I want a register, I add the code that will infer a register.
If I need a certain logic, I can include those equations in the
register process or I can use combinatorial descriptions separately. I
never start writing the HDL before I have a clear understanding of what
the hardware should look like. To me the HDL is just the lowest level
description of a successive decomposition of the design. The HDL is
never used to "program" a solution. This seldom results in the types
of problems you are discussing.

Just for the record, I do use integer variables for memory or other
sequential logic like counters. Memories simulate much faster when
coded with integer variables. This is both because of the integer and
the variable, IIRC.
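
A minimal sketch of a memory coded with an integer variable, as described above. The entity, port names, and sizes are hypothetical, and whether a given tool maps this form onto a block RAM is tool dependent.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 256 x 8 memory stored as an array of integers.
entity int_var_mem is
  port (
    clk   : in  std_logic;
    we    : in  std_logic;
    waddr : in  unsigned(7 downto 0);
    raddr : in  unsigned(7 downto 0);
    din   : in  unsigned(7 downto 0);
    dout  : out unsigned(7 downto 0)
  );
end entity int_var_mem;

architecture rtl of int_var_mem is
begin
  process (clk)
    subtype word_t is integer range 0 to 255;
    type mem_t is array (0 to 255) of word_t;
    variable mem : mem_t := (others => 0);
  begin
    if rising_edge(clk) then
      if we = '1' then
        -- Variable assignment: updates immediately, no signal event.
        mem(to_integer(waddr)) := to_integer(din);
      end if;
      -- The read follows the write, so a read of the address written
      -- on this same edge returns the new data.
      dout <= to_unsigned(mem(to_integer(raddr)), dout'length);
    end if;
  end process;
end architecture rtl;
```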
 
mikegurche

rickman said:
Personally I think most problems in using HDLs in this way come not
directly from the way signals or variables are used, but rather from
the use of an HDL to describe the solution in an abstract way. I
nearly always design in terms of registers and "clouds" for the logic.
I get a feel for how large the design is and if I need to optimize at
this block diagram level. I can even get an idea of how complex the
logic part is by looking at the equations that describe it. Then I use
an HDL to "describe" the hardware rather than describing the
functionality and letting the tool decide what hardware to invoke.

I agree with you completely. What I am trying to say is that a variable
may not be synthesizable if you write the code with a "C
mentality."

Mike G.
 