Signals in sensitivity list... and reset


kennheinrich

I want to second Rickman's comments; it is important to learn this
aspect of VHDL; that will allow you to proceed and, if you wish, carry
on to find out the real cause of the performance degradation in your
5-state SM.

I agree wholeheartedly. The reality of design, though, is that you will
*still* be forced to tailor your code to the tools. In any high-
performance, large production design there is a large amount of
"generic" VHDL combined with some ugly tool- and vendor-specific
hacks. I'd extend rickman's comments to say that you need to use your
knowledge of the "pure" language to remind yourself which is which.
I'd even go so far as to argue that one of the high-level metrics of a
design's "quality" is related to the ratio of the two and their degree
of separation. If you're lucky, you can use this knowledge to organize
your design better to improve reusability. If you're not, you can at
least gripe at the tool vendor more intelligently :-(

- Kenn
 

jacko

Hi

I just don't choose to use the said 'feature'. On the subject of
sensitivity lists, I tend to exclude things which have no relevance
until clock, or selection. This possibly allows such designs to be
synthesized using latches based on sensitivity, preventing many
possible power wasting transitions.

cheers
jacko
 

jacko

It is called postponed assignment; within a single process, it is the
same thing. The two or more assignments resolve into a single
assignment; the last one executed within the process.

I want to second Rickman's comments; it is important to learn this
aspect of VHDL; that will allow you to proceed and, if you wish, carry
on to find out the real cause of the performance degradation in your
5-state SM.

The way signal assignments work in VHDL processes, with the delta cycle
model, is one of VHDL's strongest points. It makes parallel processes
and inter-process communication unambiguous and reliable.

- Brian
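
The two points above (the last assignment executed in a process wins, and
signal updates are deferred to the next delta cycle) can be tried out with
a small simulation-only sketch. The entity and signal names here are
hypothetical and are not taken from any design in this thread.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity last_assignment_demo is
end entity;

architecture sim of last_assignment_demo is
  signal clk : std_logic := '0';
  signal a   : std_logic := '0';
  signal b   : std_logic := '1';
  signal y   : std_logic := '0';
begin
  -- Four clock cycles, then stop driving so the simulation ends.
  stimulus : process
  begin
    for i in 1 to 4 loop
      clk <= '0'; wait for 5 ns;
      clk <= '1'; wait for 5 ns;
    end loop;
    wait;
  end process;

  -- Both right-hand sides are read before either signal is updated,
  -- so a and b exchange values on every rising edge.
  swap : process (clk)
  begin
    if rising_edge(clk) then
      a <= b;
      b <= a;
    end if;
  end process;

  -- Two assignments to y in one process: only the last one executed
  -- in a given run of the process takes effect.
  override : process (clk)
  begin
    if rising_edge(clk) then
      y <= '0';            -- default assignment
      if a = '1' then
        y <= '1';          -- replaces the default when a is '1'
      end if;
    end if;
  end process;
end architecture;
```

Each signal has a single driving process, so nothing here relies on
resolution between drivers; the behaviour comes purely from the scheduling
rules described above.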

The performance degradation was due to looped combinational logic;
insertion of a registered carry oll bit doubled the speed.

cheers
jacko
 

jacko

I agree wholeheartedly. The reality of design, though, is that you will
*still* be forced to tailor your code to the tools. In any high-
performance, large production design there is a large amount of
"generic" VHDL combined with some ugly tool- and vendor-specific
hacks.  I'd extend rickman's comments to say that you need to use your
knowledge of the "pure" language to remind yourself which is which.
I'd even go so far as to argue that one of the high-level metrics of a
design's "quality" is related to the ratio of the two and their degree
of separation. If you're lucky, you can use this knowledge to organize
your design better to improve reusability. If you're not, you can at
least gripe at the tool vendor more intelligently :-(

 - Kenn

Fully generic, has generic string option which is not used at present.
In a full VHDL design, I think it may be possible to enumerate the
instruction opcodes, and have the 'program' compile to the bit states
which are optimal. This would require an understanding of how to
implement a code compiler in VHDL, inferring the ROM containing the
code. Maybe an abstract type of entity_bus which is an enumeration of
all std_logic_vector possible states, with parallel or serial
interconnect to max enumeration, and width. Any thoughts?

cheers
jacko
 

KJ

jacko said:
Take nibz12.vhd from http://nibz.googlecode.com and eliminate the
enumeration state indqq. This is to prevent post increment on register
assign.

It does more than just that from a logic perspective. Just a simple perusal
by searching for 'indqq' shows that deleting the 'indqq' enumeration would
change the following:
- Logic for signals 'p', 'q', 'r' and 's' would be different. As written,
there would be instances (i.e. when 'indirect = indqq') where p,q,r,s would
not be updated; by deleting the 'indqq' enumeration one of these four would
always be doing something. Refer to lines 122-134 of nibz12.vhd, the
snippet of the code is at the end of this post as "Nibz12.vhd example #1".
There are other instances of this as well.
- Logic for signal 'pre' would be different. Refer to lines 223-234 of
nibz12.vhd or "Nibz12.vhd example #2" at the end of this post. Without
'indqq' as an enumeration, the statement "pre <= q;" that occurs in the
case when 'indirect = indqq' would need some modifications.
- Logic for the signal 'ir' would be different. Refer to the case statement
starting at line 273 (or "Nibz12.vhd example #3" at the end of this post)
and in particular, the assignment "ind <= indqq;" on line 282 which would
produce a compile error if you deleted the 'indqq' enumeration.

I have no idea why you would toss this out as an example of the particular
sub-topic of demonstrating what you claim to be differences when there are
multiple assignments to a signal within a single process...but at this point
I don't really much care.

Obviously you didn't even take the time to see that simply deleting the
'indqq' enumeration...
- Would produce code that wouldn't compile
- Would change the logical function being implemented.
- Is not an example to support your claim that multiple assignments to a
signal within a process produce different synthesis results in terms of
either resource usage or performance.

I'm guessing that your claims are more based on the arrogance of ignorance
than anything else.
According to you the fact that post-increment code occurs
before register assignment code, register assignment should overide,
and the state indqq would not be required.

No I didn't say that at all. What I said was that from the perspective of
- Logic function
- Synthesis resources
- Synthesized performance
the following two forms are exactly identical. Perhaps you should take some
more time reading and understanding what is being presented instead of going
off on various tangents stating things that you don't really know about.
Unfortunately, a person who doesn't know what they don't know is in a far
worse situation than someone who at least knows what they don't know.

-- #1
process(clk)
begin
  if rising_edge(clk) then
    if (reset = '1') then
      -- do some sync resets
    else
      -- do something else
    end if;
  end if;
end process;

-- #2
process(clk)
begin
  if rising_edge(clk) then
    -- do something else
    if (reset = '1') then
      -- do some sync resets
    end if;
  end if;
end process;
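
For anyone who wants to try the comparison, here is one way the two
templates might be filled in: a minimal sketch using a hypothetical 4-bit
counter (the entity, port, and signal names are assumptions for
illustration, not code from this thread).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity reset_forms is
  port (
    clk    : in  std_logic;
    reset  : in  std_logic;
    count1 : out unsigned(3 downto 0);
    count2 : out unsigned(3 downto 0)
  );
end entity;

architecture rtl of reset_forms is
  signal c1, c2 : unsigned(3 downto 0) := (others => '0');
begin
  -- Form #1: reset and normal operation in an if/else.
  form1 : process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        c1 <= (others => '0');
      else
        c1 <= c1 + 1;
      end if;
    end if;
  end process;

  -- Form #2: normal operation first, reset assigned last so it overrides.
  form2 : process (clk)
  begin
    if rising_edge(clk) then
      c2 <= c2 + 1;
      if reset = '1' then
        c2 <= (others => '0');
      end if;
    end if;
  end process;

  count1 <= c1;
  count2 <= c2;
end architecture;
```

In simulation count1 and count2 should track each other on every clock.
One caveat worth keeping in mind: the equivalence relies on every signal
assigned in the "something else" part also being assigned in the reset
part. A signal left out of the reset branch holds its value during reset in
form #1 but keeps updating in form #2.
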
I have tried it, it makes it larger and slower! A real example of
using or avoiding 'double' assignment.

It's a real example of something, but it is not an example of how avoiding a
'double' assignment changes anything. It may be an example of pipelining,
I'm not interested enough to find out. In order to show your point, you
would have to produce two designs that are
- Logically exactly equivalent (every signal has the same value in both
designs at every clock cycle).
- The only source code difference is that at least one signal has a 'double'
assignment in the one design but not the other.
- Demonstrates different resource usage or clock cycle performance
cheers
jacko

p.s. wouldn't consider doing a pointless simple test as reduction of
logic form just too obvious to any silicon compiler.

But not so obvious to you for some reason.

The "pointless simple test" as you call it has nothing to do with scale,
those two templates would produce the exact same results whether the process
in question was one of a handful of lines (as was presented) or 10,000 lines
with multiple loops, case, if statements and whatever. You seem to feel
otherwise, even in spite of
- The comments of multiple people who know what they are talking about.
- The presentation of a complete design (not just a snippet of the relevant
code) that was provided in the previous post, which you could use to
prove it to yourself by simply copying it and trying it out.

In any case, I took the time to review what you suggested and pointed out
the flaws in your argument. The reasons for your resource usage and clock
cycle differences have nothing to do with double assignment in the source
code; they have everything to do with changing the logical function itself,
which generally does produce changes in both of these metrics. A simple example
of this is pipelining where you break up a computationally 'expensive' logic
function into smaller ones that span several clock cycles. While at some
higher level of abstraction the two designs can be thought of as being
equivalent, the fact remains that the one with pipelining has more latency,
it will produce results at a different (later) time, the logic function
being implemented is different. That is not news, that is well known.

As a final point, since you pointed me to your code, here are some other
suggestions:
* You don't know how to name signals and constants in a meaningful way to
indicate what they logically represent. Some examples are...
signal p, q, r, s, c, x0, a0, x1, a1, car, ctmp
constant z, z4

* You don't understand what signals belong in the sensitivity list of a
synchronous process. Example:
process (CLK_I, RST_I, ACK_I) -- KJ: ACK_I is not needed.

* You don't understand what signals belong in the sensitivity list of a
combinatorial process. Example:
process(ir)
But this process (starting at line 206 of nibz12.vhd) depends on the
following signals as well: 'cycle', 'indirect', 'dir', etc. This will
synthesize to something that is functionally different than simulation.
That is a huge blunder, debugging in the simulator is way more productive
than on the bench...once you have sufficient skill that is.

* You probably don't simulate your source code.
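
As an illustration of the combinatorial-process point, a complete
sensitivity list names every signal the process reads. The sketch below
uses hypothetical names (decode_sketch, ir_i, cycle_i, sel_o) and a made-up
decode function; it is not the nibz12.vhd decoder.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity decode_sketch is
  port (
    ir_i    : in  std_logic_vector(3 downto 0);
    cycle_i : in  std_logic;
    sel_o   : out std_logic_vector(1 downto 0)
  );
end entity;

architecture rtl of decode_sketch is
begin
  -- Every signal read inside the process appears in the list, so the
  -- simulator re-evaluates it whenever any input changes.
  decode : process (ir_i, cycle_i)
  begin
    sel_o <= "00";                 -- default, so no latch is implied
    if cycle_i = '1' then
      case ir_i is
        when "0000" => sel_o <= "01";
        when "0001" => sel_o <= "10";
        when others => null;       -- keep the default
      end case;
    end if;
  end process;
end architecture;
```

With a VHDL-2008 capable tool, writing process (all) instead of the
explicit list has the same effect and cannot go stale as the code changes.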

Good luck on your learning experience, I'm done with this one.

Kevin Jennings

---- Nibz12.vhd example #1
case indirect is
  --pre decrement??
  when indp =>
    p <= ADR_O;
  when indq =>
    q <= ADR_O;
  when indr =>
    r <= ADR_O;
  when inds =>
    s <= ADR_O;
  when indqq =>

end case;

---- Nibz12.vhd example #2
case indirect is
  when indp =>
    pre <= p;
  when indq =>
    pre <= q;
  when indr =>
    pre <= r;
  when inds =>
    pre <= s;
  when indqq =>
    pre <= q;
end case;

---- Nibz12.vhd example #3
case ir(3 DOWNTO 0) is
  when "0000" =>
    -- BAck
    ind <= indr;
    wrt <= rd;
    dir <= dirp;
    post <= din;
  when "0001" =>
    -- Fetch In
    ind <= indqq;
    wrt <= rd;
    dir <= dirq;
    post <= din;
....
 

kennheinrich

Fully generic, has generic string option which is not used at present.
In a full VHDL design, I think it may be possible to enumerate the
instruction opcodes, and have the 'program' compile to the bit states
which are optimal. This would require an understanding of how to
implement a code compiler in VHDL, infering the ROM containing the
code. Maybe an abstract type of entity_bus which is an enumeration of
all std_logic_vector possible states, with parallel or serial
interconnect to max enumeration, and width. Any thoughts?

cheers
jacko

Jacko,

Thoughts? My first thought is that this is not only a non-sequitur,
but it's very difficult to understand what you're asking. I *think*
that what you're asking is "can I write an assembler in VHDL, to let
me initialize program memory directly from text/assembly source
code?". The answer is possibly yes, although there are rules about
static initializers and user-defined functions that may interfere with
your goals. Simply defining std_logic_vector constants that match the
opcode values might be enough for your testbenches.
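
A rough sketch of that last suggestion, assuming 4-bit opcodes: the two
encodings "0000" (BAck) and "0001" (Fetch In) are taken from the case
statement quoted earlier in the thread, while the package name, constant
names, and the program contents are made up for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package opcode_sketch is
  subtype opcode_t is std_logic_vector(3 downto 0);

  -- Named opcodes instead of bare bit strings.
  constant OP_BACK     : opcode_t := "0000";
  constant OP_FETCH_IN : opcode_t := "0001";

  -- A tiny "program" written with the named opcodes; a testbench or a
  -- ROM model can index into this array.
  type rom_t is array (natural range <>) of opcode_t;
  constant PROGRAM : rom_t(0 to 3) := (
    OP_FETCH_IN, OP_BACK, OP_FETCH_IN, OP_BACK
  );
end package;
```

This stays within plain constant initialisation, so it avoids the
questions about static initializers and user-defined functions mentioned
above.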

Before embarking on that effort, though, you'd be well served by
cleaning up your code as Kevin suggested in another post. He made a
number of helpful suggestions, pointed out some real errors, and it's
worth trying to understand them.

- Kenn
 

jacko

hi

Obviously you didn't even take the time to see that simply deleting the
'indqq' enumeration...
- Would produce code that wouldn't compile

Well I got a version to compile! see below
- Would change the logical function being implemented.

It did the same.
- Is not an example to support your claim that multiple assignments to a
signal within a process produces different synthesis results in terms of
either resource usage or performance.

I'm guessing that your claims are more based on the arrogance of ignorance
than anything else.


No I didn't say that at all.  What I said was that from the perspective of
- Logic function
- Synthesis resources
- Synthesized performance
the following two forms are exactly identical.  Perhaps you should take some
more time reading and understanding what is being presented instead of going
off on various tangents stating things that you don't really know about.
Unfortunately, a person who doesn't know what they don't know is in a far
worse situation than someone who at least knows what they don't know.

if they knew what they don't know I'd say they were confused!
It's a real example of something, but it is not an example of how avoiding a
'double' assignment changes anything.  It may be an example of pipelining,
I'm not interested enough to find out.  In order to show your point, you
would have to produce two designs that are
- Logically exactly equivalent (every signal has the same value in both
designs at every clock cycle).
- The only source code difference is that at least one signal has a 'double'
assignment in the one design but not the other.
- Demonstrates different resource usage or clock cycle performance



But not so obvious to you for some reason.

The "pointless simple test" as you call it has nothing to do with scale,
those two templates would produce the exact same results whether the process
in question was one of a handful of lines (as was presented) or 10,000 lines
with multiple loops, case, if statements and whatever.  You seem to feel
otherwise, even in spite of
- The comments of multiple people who know what they are talking about.
- The presentation of a complete design (not just a snippet of the relevant
code) that was provided in the previous post, which you could use to
prove it to yourself by simply copying it and trying it out.

In any case, I took the time to review what you suggested and pointed out
the flaws in your argument.  The reasons for your resource usage and clock
cycle differences have nothing to do with double assignment in the source
code; they have everything to do with changing the logical function itself,
which generally does produce changes in both of these metrics.  A simple example
of this is pipelining where you break up a computationally 'expensive' logic
function into smaller ones that span several clock cycles.  While at some
higher level of abstraction the two designs can be thought of as being
equivalent, the fact remains that the one with pipelining has more latency,
it will produce results at a different (later) time, the logic function
being implemented is different.  That is not news, that is well known.

As a final point, since you pointed me to your code, here are some other
suggestions:
* You don't know how to name signals and constants in a meaningful way to
indicate what they logically represent.  Some examples are...
    signal p, q, r, s, c, x0, a0, x1, a1, car, ctmp
    constant z, z4

For such an entity I think register names p etc. are fine.
* You don't understand what signals belong in the sensitivity list of a
synchronous process.  Example:
   process (CLK_I, RST_I, ACK_I)  -- KJ: ACK_I is not needed.

Deleted, ok 1 to you :)
* You don't understand what signals belong in the sensitivity list of a
combinatorial process.  Example:
   process(ir)
   But this process (starting at line 206 of nibz12.vhd) depends on the
following signals as well:  'cycle', 'indirect', 'dir', etc.  This will
synthesize to something that is functionally different than simulation.
That is a huge blunder, debugging in the simulator is way more productive
than on the bench...once you have sufficient skill that is.

My intent is that instruction decode only has to be valid when a new
instruction is in the instruction register.
* You probably don't simulate your source code.

ModelSim with FlexLM: tried, could not get the FlexLM licence to work.
Good luck on your learning experience, I'm done with this one.

Kevin Jennings

---- Nibz12.vhd example #1
    case indirect is
     --pre decrement??
     when indp =>
      p <= ADR_O;
     when indq =>
      q <= ADR_O;
     when indr =>
      r <= ADR_O;
     when inds =>
      s <= ADR_O; -- lines deleted
    end case;

---- Nibz12.vhd example #2
  case indirect is
   when indp =>
    pre <= p;
   when indq =>
    pre <= q;
   when indr =>
    pre <= r;
   when inds =>
    pre <= s; -- lines deleted
  end case;

---- Nibz12.vhd example #3
   case ir(3 DOWNTO 0) is
    when "0000" =>
     -- BAck
     ind <= indr;
     wrt <= rd;
     dir <= dirp;
     post <= din;
    when "0001" =>
     -- Fetch In
     ind <= indq; -- modified line
     wrt <= rd;
     dir <= dirq;
     post <= din;
   ....

And the post increment based on indirect when indp has an
if(not(dir=dirq)) then ... end if; wrapper to prevent double
assignment, or no wrapper as assignment overridden.

Making the alu 16 bit wide instead of 16/32 bit wide (excluding c
register from calculation) actually pushes up synthesis by 34 LE,
although lifting fmax by 1.5MHz. Curious. I wonder if the ALU high 16
bit is combined with some routing?

Now as some people have said, and maybe would say about me, it is often
not to do with a lack of understanding, I'm just an eratic genious.

cheers
jacko
 

KJ

jacko said:
Hi

On the subject of
sensitivity lists, I tend to exclude things which have no relevance
until clock, or selection.

This statement demonstrates a lack of knowledge of what the following type
of synthesis warning means:
"Incomplete sensitivity list: assuming completeness"

Given that premise, I would hazard a guess to say that you also don't
understand the implications of this 'warning' and how it means that your
simulation model will behave differently under certain conditions than the
real hardware....ah well, life's lessons are best remembered when taught by
direct exposure.
This possibly allows such designs to be
synthesized using latches based on sensitivity, preventing many
possible power wasting transitions.

Not to mention creating opportunities for creating a design that behaves
differently as a function of temperature (i.e. warm the part up or cool it
off and watch it stop working) because the targeted part either doesn't
have a hardware latch as a basic resource or the synthesis tool doesn't map
it to such a latch for whatever reason. Most people don't consider correct
operation only under very stringent temperature conditions to be much of a
feature, they typically want the entire commercial operating temperature
range or something close to that.

KJ
 

KJ

if they knew what they don't know I'd say they were confused!

And if you would understand what was actually written maybe you wouldn't
come off looking rather foolish/arrogant in your postings.
Now as some people have said, and maybe would say about me, it is often
not to do with a lack of understanding, I'm just an eratic genious.

Rest assured, I would not be counted among those who might consider you to
be any sort of genious.

KJ
 

jacko

This statement demonstrates a lack of knowledge of what the following type
of synthesis warning means:
"Incomplete sensitivity list:  assuming completeness"

If assumption is possible, and accurate to control design, then purpose
of inclusion is $/klocs wage scheme or to provide late combination
entry performance queues and uPower latch halting of logic oscillation/
ringing/set-up transitions.
Given that premise, I would hazard a guess to say that you also don't
understand the implications of this 'warning' and how it means that your
simulation model will behave differently under certain conditions than the
real hardware....ah well, life's lessons are best remembered when taught by
direct exposure.

I expect a simulation model to behave differently, after all it is
just a simulation, where accuracy is not perfect. I assume you're
talking about VHDL simulation, and not spice modeling of resulting
synthesized design, with stripline modelling of PCB. Yes, slew rate
hold time violations due to power drain of extra 'hidden' transitions,
could be an effect, but lazy simulators which don't imply what my
synthesizer does are not the type of simulator I am after. Monte Carlo
analysis?
Not to mention creating opportunities for creating a design that behaves
differently as a function of temperature (i.e. warm the part up or cool it
off and watch it stop working) because the targeted part either doesn't
have a hardware latch as a basic resource or the synthesis tool doesn't map
it to such a latch for whatever reason.  Most people don't consider correct
operation only under very stringent temperature conditions to be much of a
feature, they typically want the entire commercial operating temperature
range or something close to that.

If the synthesizer does not know the holding properties of the part,
I'd say you're well screwed.

cheers
jacko
 

KJ

If assumption is possible, and accurate to control design, then purpose
of inclusion is $/klocs wage scheme or to provide late combination
entry performance queues and uPower latch halting of logic oscillation/
ringing/set-up transitions.

I'll refrain from stating what this statement seems to demonstrate about
your expertise in this area.
Given that premise, I would hazard a guess to say that you also don't
understand the implications of this 'warning' and how it means that your
simulation model will behave differently under certain conditions than the
real hardware....ah well, life's lessons are best remembered when taught by
direct exposure.

Yes, VHDL simulation...you still don't have a clue about what I'm
talking about or what the differences are, though, now do you?

This process...
process(a)
begin
  c <= a and b;
end process;

will simulate one way (i.e. changes in 'b' will not cause a change in
'c', not until a change in 'a' also occurs to 'wake up' the process). Get a
simulator, try it out, hold 'a' constant and toggle 'b' all you want and
watch 'c' not change at all.

Synthesis will give you a warning about 'b' not being in the sensitivity
list and generate code equivalent to this...
c <= a and b;

These are functionally different things, they will behave differently.
Depending on what you intended, you might like the way the synthesis tools
handled it, or you might not, the point is that they are going to do
radically different things so you won't be able to use the simulator to
debug a problem in the real world...you don't seem to grasp that this is a
very bad situation to be in, but that's because you don't simulate...which
is yet another problem.
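
The experiment described above can be packaged as a small self-contained
testbench. This is only a sketch; the entity and signal names
(sens_list_demo, c_incomplete, c_concurrent) are hypothetical. It puts the
process with the incomplete sensitivity list next to the concurrent
assignment that synthesis would effectively build, and toggles 'b' while
'a' is held constant.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity sens_list_demo is
end entity;

architecture sim of sens_list_demo is
  signal a, b         : std_logic := '0';
  signal c_incomplete : std_logic;
  signal c_concurrent : std_logic;
begin
  -- Wakes up only on changes of 'a'; 'b' is read but not listed.
  incomplete : process (a)
  begin
    c_incomplete <= a and b;
  end process;

  -- What synthesis is expected to produce: sensitive to both inputs.
  c_concurrent <= a and b;

  stimulus : process
  begin
    a <= '1'; b <= '0';
    wait for 10 ns;
    b <= '1';                -- only 'b' changes from here on
    wait for 10 ns;
    report "c_incomplete = " & std_logic'image(c_incomplete) &
           ", c_concurrent = " & std_logic'image(c_concurrent);
    wait;
  end process;
end architecture;
```

After 'b' goes high with 'a' held at '1', c_concurrent follows to '1'
while c_incomplete stays at '0' because its process never wakes up. That
gap is the simulation versus hardware mismatch being discussed.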

This is a simple example to demonstrate the point, your code has more
complicated examples that will misbehave on you as well.
If the synthesizer does not know the holding properties of the part,
I'd say you're well screwed.

Not true at all. If you ignore the warnings that the synthesizer gives you
then you'll be screwed. Every synthesis tool will properly generate a
bitstream that can be used to program a part to implement the following
transparent latch...

process(c, d)
begin
  if (c = '1') then
    q <= d;
  end if;
end process;

If the targeted device does not have a hard latch primitive then it will be
cobbled together from the basic logic elements that are available...and that
device will likely fail either immediately or when the device is heated up
or cooled down a bit. In any case, it's not the tool's fault, it implemented
the logic that you specified with the most appropriate elements available to
it. The fault lies with the designer for using that code in a device that
does not have hardware transparent latches.
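
For comparison, on a device without latch primitives the usual substitute
is a clocked register with an enable. The sketch below is an assumption
about what one might write instead, with hypothetical names, and it is not
a drop-in replacement: it samples 'd' only on clock edges rather than
being transparent while the enable is high.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity enabled_reg is
  port (
    clk : in  std_logic;
    en  : in  std_logic;
    d   : in  std_logic;
    q   : out std_logic
  );
end entity;

architecture rtl of enabled_reg is
begin
  -- Edge-triggered storage: q changes only on a rising clock edge,
  -- and only when the enable is asserted.
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        q <= d;
      end if;
    end if;
  end process;
end architecture;
```

Only the clock appears in the sensitivity list here, which is complete for
a purely synchronous process.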

KJ
 

Mike Treseler

jacko said:
Now as some people have said, and maybe would say about me, it is often
not to do with a lack of understanding, I'm just an eratic genious.


Genius | Gen"ius |
n.; pl. E. Geniuses; in sense 1, L. Genii.

1. A good or evil spirit, or demon, supposed by the ancients
to preside over a man's destiny in life; a tutelary deity;
a supernatural being; a spirit, good or bad. Cf. Jinnee.
 

jacko

Hi
Not true at all.  If you ignore the warnings that the synthesizer gives you
then you'll be screwed.  Every synthesis tool will properly generate a
bitstream that can be used to program a part to implement the following
transparent latch...

process(c, d)
begin
  if (c = '1') then
    q <= d;
  end if;
end process;

If the targeted device does not have a hard latch primitive then it will be
cobbled together from the basic logic elements that are available...and that
device will likely fail either immediately or when the device is heated up
or cooled down a bit.  In any case, it's not the tool's fault, it implemented
the logic that you specified with the most appropriate elements available to
it.  The fault lies with the designer for using that code in a device that
does not have hardware transparent latches.

Yes, this is why the DAT_I, DAT_O and ADR_O signals are now generated
in the final clocked process, as process(ir) is two clocks per process
exec, to prevent any synth from inferring a power saving address
generator which had incorrect function. In fact I have slightly
over-specified the sensitivity list of the last process.

cheers
jacko
 
