Discrepancy between functional simulation and post-synthesis simulation when inferring a block-RAM

Cesar

Hello:

I've always coded VHDL in the 'data-flow' style. For my latest module, I
tried coding it in the 'one-process' style. Functional simulation was OK,
but post-synthesis simulation gives different results. I use XST and
Modelsim, and my device is a Spartan-3.
I used the RTL viewer from Xilinx ISE 11.4 to check the
synthesis results and discovered that there is a problem
inferring a block-RAM.

I only read the block-RAM (it is used as a ROM). When reading, a block-RAM
should latch the address on the active clock edge and, after Tco, the data
should appear at DO. Synchronously speaking, reading the block-RAM
should take one clock period.
When inferring the block-RAM in the 'one-process' style, XST automatically
and always adds a register for the address input and a register for the
data output (regardless of the VHDL code you have).
As a result, functional simulation does not match post-synthesis
simulation, which shows an additional clock period of delay when reading
the block-RAM.
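
The one-clock read latency described above can be sketched as a stand-alone synchronous ROM. This is only a minimal illustration, not the code from the module in question: the entity name sync_rom_sketch, the default generic values and the placeholder contents are assumptions, and whether XST actually maps it onto a block-RAM depends on the tool version and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity name and generic defaults, for illustration only.
entity sync_rom_sketch is
  generic (
    NR_BITS_ADDR : positive := 4;
    NR_BITS_DATA : positive := 8
  );
  port (
    clk  : in  std_logic;
    addr : in  std_logic_vector(NR_BITS_ADDR - 1 downto 0);
    data : out std_logic_vector(NR_BITS_DATA - 1 downto 0)
  );
end entity sync_rom_sketch;

architecture rtl of sync_rom_sketch is
  type rom_t is array (0 to 2**NR_BITS_ADDR - 1) of
    std_logic_vector(NR_BITS_DATA - 1 downto 0);

  -- Placeholder contents (assumption): each word holds its own address.
  function init_rom return rom_t is
    variable rom_v : rom_t;
  begin
    for i in rom_t'range loop
      rom_v(i) := std_logic_vector(
        resize(to_unsigned(i, NR_BITS_ADDR), NR_BITS_DATA));
    end loop;
    return rom_v;
  end function init_rom;

  constant rom_c : rom_t := init_rom;
begin
  -- Single registered read: the address is sampled on the rising edge
  -- and the addressed word is visible on data after that edge, i.e. one
  -- clock period later from the point of view of the surrounding logic.
  p_read : process (clk)
  begin
    if rising_edge(clk) then
      data <= rom_c(to_integer(unsigned(addr)));
    end if;
  end process p_read;
end architecture rtl;
```

If the logic driving addr adds its own register in front of this entity, the total latency seen from that logic becomes two clock periods, which is the kind of extra stage worth looking for in the RTL viewer.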

Has anybody had a similar problem? Any solution?
Unfortunately, I think I'll have to recode my module in the old style :-(

Regards,
César
 
KJ

Hello:

I've been always coding VHDL in 'data-flow' way. For my last module, I
tried to code it in 'one-process' style. Functional simulation was ok,
but post-synthesis simulation has different results.

And is the post-synthesis sim result that an output pin (or pins)
happen exactly one clock cycle late as you later suggest? Are the
setup and hold times to all of the input pins of the device in the
simulation meeting the requirements that pop out of the timing
report? A testbench such as the following likely will cause a timing
problem and therefore a sim mismatch

process
begin
  wait until rising_edge(Clock);
  ...
  Some_Input <= '1';
  ...
end process;

Instead the signal assignment should be
Some_Input <= '1' after 5 ns; -- As an example
I use XST and
Modelsim and my device is a Spartan-3.
I've employed RTL viewer from Xilinx ISE 11.4 to check out the
synthesis results and I've discovered that there is a problem
inferring a block-RAM.

I only read the block-RAM (ROM). When reading, a block-RAM should
latch-in the address in the active clock edge and, after Tco, the data
should be output at DO. Synchronously speaking, reading the block-RAM
should imply one clock period delay.
When inferring the block-RAM in 'one-process' style, XST automatically
and always add a register for the address input and a register for the
data output (independently of the VHDL code you have).

You have to be a little careful here because the logic for how those
address signals are generated as well as the logic that uses the data
outputs could be influencing what is going on and you may be
misinterpreting what you're seeing in the viewer. Synthesis is only
supposed to generate a cycle for cycle match at the device pin
boundary. There is no guarantee that internal signals will match. Not
saying that what you're suggesting is wrong, just suggesting something
else to check first.
This fact implies that functional simulation does not meet post-
synthesis one and adds an additional clock period delay when reading
the block-RAM.

Only if the setup and hold times on every input pin are correct. Some
people think that they can take their testbench and simply plop down
the post-synthesis model and expect it to functionally work in the
same manner as the original code. This is only true if you've met
each of the following conditions:
- Setup and hold times of all input pins meet the requirements that
  are specified by the timing report
- Sampling time of all output pins only occurs after the timing report
  specified clock to output requirements are met (or propagation delay
  for combinatorial outputs)
- No synthesis warnings about things that cannot be synthesized
  exactly. Some examples are:
  * x <= y after 10 ns; -- Can't synthesize delays
  * Incomplete sensitivity lists
  * Combinatorial loops
  * Other things that I may have forgotten
Has anybody had a similar problem? Any solution?
Unfortunately I think I'll have to recode my module in the old style :-(

Before you go down that path, I'd suggest you take the time to
investigate this a bit further. It's quite possible that when you
recode to your 'old style' that you might also introduce a subtle
difference that accounts for the discrepancy and you'll go off on the
mistaken belief that it has something to do with your coding style.

The thing to do is as Brian suggested, create a testbench that
instantiates both parts and compares the two outputs. I'd add the
above mentioned cautions about how to generate inputs and when to
sample outputs. It also might be easier to simply have the block ram
I/O be the device I/O in order to remove all of the clutter and see if
it really does have something to do with the coding style.
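
A comparison testbench along those lines might look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the library name postsynth_lib (where the post-synthesis netlist is assumed to be compiled), the entity name my_module, its port list, and the timing constants, which would have to be taken from the actual timing report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical library holding the compiled post-synthesis netlist.
library postsynth_lib;

entity tb_compare is
end entity tb_compare;

architecture sim of tb_compare is
  constant CLK_PERIOD : time := 20 ns;  -- assumed clock period
  constant T_INPUT    : time := 5 ns;   -- inputs change this long after the edge
  constant T_SAMPLE   : time := 15 ns;  -- outputs sampled this long after the edge

  signal clk        : std_logic := '0';
  signal rst        : std_logic := '1';
  signal some_input : std_logic := '0';
  signal dout_rtl   : std_logic_vector(7 downto 0);
  signal dout_gate  : std_logic_vector(7 downto 0);
begin
  clk <= not clk after CLK_PERIOD / 2;

  -- Original RTL description (hypothetical entity and ports).
  u_rtl : entity work.my_module
    port map (clk => clk, rst => rst,
              some_input => some_input, data_out => dout_rtl);

  -- Post-synthesis netlist of the same entity.
  u_gate : entity postsynth_lib.my_module
    port map (clk => clk, rst => rst,
              some_input => some_input, data_out => dout_gate);

  -- Stimulus: every input changes T_INPUT after the clock edge so that
  -- setup and hold times at the netlist inputs are respected.
  p_stim : process
  begin
    wait until rising_edge(clk);
    wait until rising_edge(clk);
    rst <= '0' after T_INPUT;
    wait until rising_edge(clk);
    some_input <= '1' after T_INPUT;
    wait;
  end process p_stim;

  -- Checker: compare both outputs only after the clock-to-output delay.
  p_check : process
  begin
    wait until rising_edge(clk);
    wait for T_SAMPLE;
    if rst = '0' then
      assert dout_rtl = dout_gate
        report "RTL and post-synthesis outputs differ"
        severity error;
    end if;
  end process p_check;
end architecture sim;
```

The clock runs forever in this sketch, so the simulation has to be stopped by a run-time limit in the simulator.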

Kevin Jennings
 
Mike Treseler

Cesar said:
I've been always coding VHDL in 'data-flow' way. For my last module, I
tried to code it in 'one-process' style. Functional simulation was ok,
but post-synthesis simulation has different results. I use XST and
Modelsim and my device is a Spartan-3.
I've employed RTL viewer from Xilinx ISE 11.4 to check out the
synthesis results and I've discovered that there is a problem
inferring a block-RAM.
I only read the block-RAM (ROM).

If all you need is a ROM, consider something like this.
http://mysite.verizon.net/miketreseler/sync_rom.vhd
When reading, a block-RAM should
latch-in the address in the active clock edge and, after Tco, the data
should be output at DO. Synchronously speaking, reading the block-RAM
should imply one clock period delay.
When inferring the block-RAM in 'one-process' style, XST automatically
and always add a register for the address input and a register for the
data output (independently of the VHDL code you have).

Here's a template for a block ram that works for brand A.
It might work for X also.

http://mysite.verizon.net/miketreseler/block_ram.vhd
Has anybody had a similar problem? Any solution?
Unfortunately I think I'll have to recode my module in the old style :-

If I'm in a hurry I don't try synthesis experiments,
especially with templates for vendor specific features.
If I am writing rtl code, single process entities work well.

-- Mike Treseler
 
Cesar

And is the post-synthesis sim result that an output pin (or pins)
happen exactly one clock cycle late as you later suggest?  Are the
setup and hold times to all of the input pins of the device in the
simulation meeting the requirements that pop out of the timing
report?  A testbench such as the following likely will cause a timing
problem and therefore a sim mismatch

process
begin
  wait until rising_edge(Clock);
  ...
  Some_Input <= '1';
  ...
end process;

Instead the signal assignment should be
  Some_Input <= '1' after 5 ns; -- As an example


I don't think it is a setup or hold time issue. I've recoded my
module in the 'data-flow' style and post-synthesis simulation is OK. I'll
look into it and let you know.




You have to be a little careful here because the logic for how those
address signals are generated as well as the logic that uses the data
outputs could be influencing what is going on and you may be mis-
interpreting what you're seeing in the viewer.  Synthesis is only
supposed to generate a cycle for cycle match at the device pin
boundary.  There is no guarantee that internal signals will match.  Not
saying that what you're suggesting is wrong, just suggesting something else to
check first.


Only if the setup and hold times on every input pin are correct.  Some
people think that they can take their testbench and simply plop down
the post-synthesis model and expect it to functionally work in the
same manner as the original code.  This is only true if you've met
each of the following conditions:
- Setup and hold times of all input pins meet the requirements that
are specified by the timing report
- Sampling time of all output pins only occurs after the timing report
specified clock to output requirements are met (or propagation delay
for combinatorial outputs)
- No synthesis warnings about things that cannot be synthesized
exactly.  Some examples are:
* x <= y after 10 ns; -- Can't synthesize delays
* Incomplete sensitivity lists
* Combinatorial loops
* Other things that I may have forgotten


Before you go down that path, I'd suggest you take the time to
investigate this a bit further.  It's quite possible that when you
recode to your 'old style' that you might also introduce a subtle
difference that accounts for the discrepancy and you'll go off on the
mistaken belief that it has something to do with your coding style.

I believe it is something I've done wrong, since it's my first try
with this coding style.

The thing to do is as Brian suggested, create a testbench that
instantiates both parts and compares the two outputs.  I'd add the
above mentioned cautions about how to generate inputs and when to
sample outputs.  It also might be easier to simply have the block ram
I/O be the device I/O in order to remove all of the clutter and see if
it really does have something to do with the coding style.
Kevin Jennings

I've created that testbench and the outputs are different. As I told
you, I'm going to spend more time thinking about it and I'll report back.

Thanks,
César
 
Cesar

If all you need is a rom, consider something like this. http://mysite.verizon.net/miketreseler/sync_rom.vhd

Hello Mike:

In fact, I've been using the templates from your site for this
module.
Since I was in a hurry, I had to code my module in the 'data-flow' style
and it worked.
But I want to understand why the 'one-process' module does not.

I made a testbench instantiating my module and the post-synthesis
module, as suggested by KJ. The outputs are different.
The difficulty comes when I try to look inside the module to understand
where the problem is. I'm not used to working with variables (instead of
signals), and watching the variables in ModelSim is a little messy for
me. I don't know whether they represent the registered value or not.
When I attach variables to test-point ports, they always get
registered at the port, so it is impossible to see the combinatorial
value in the current cycle. When you have many variables, it is hard to
remember which test point is being used combinatorially within the
module and which one is used registered.

Anyway, I now have a 'data-flow' module that gets synthesized as I
expected. I'll compare the RTL views of both modules to
understand what's going on and post the result.

Regards,
César
 
Mike Treseler

Cesar said:
I made a testbench instantiating my module and the post-synthesis
module, as suggested by KJ. The outputs are different.
The problem is when I try to look into the module to understand where
the problem is. I'm not used to work with variables (instead of
signals), and watching the variables in ModelSim is a little messy for
me. I don't know if they represent the registered value or not.

I don't know that with signals either.
The advantage to using abstractions such as variables or
signals is that I shouldn't have to care as long as
synthesis works. Unfortunately, synthesis is weak
on fixed vendor logic like block_rams.
When I attach variables to test-point ports, they always get
registered at the port.

That is a result of using a synchronous process.
So it is impossible to see in the actual cycle
the combinatorial value. When you have many variables, it is hard to
remember which test_point is being used combinatorially within the
module and which one is used registered.

With variables, I trace code using the step command.
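
The point about test-point ports being registered can be illustrated with a small sketch. It is not taken from anyone's code in this thread: the entity var_demo and its ports are hypothetical, and the comments describe the usual synthesis interpretation of variables in a clocked process, which is worth confirming in the RTL viewer for a given tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity, for illustration only.
entity var_demo is
  port (
    clk      : in  std_logic;
    din      : in  unsigned(7 downto 0);
    tp_sum   : out unsigned(7 downto 0);
    tp_count : out unsigned(7 downto 0)
  );
end entity var_demo;

architecture rtl of var_demo is
begin
  p_main : process (clk)
    variable count_v : unsigned(7 downto 0) := (others => '0');
    variable sum_v   : unsigned(7 downto 0);
  begin
    if rising_edge(clk) then
      -- sum_v is written before it is read on every pass, so inside the
      -- process it behaves like a combinational value.
      sum_v := din + count_v;

      -- count_v is read above before being written here, so its value
      -- must be held between clock edges: it implies a register.
      count_v := count_v + 1;

      -- Both assignments sit inside the clocked branch, so each port is
      -- driven by a register, even though sum_v itself is combinational
      -- within the process.
      tp_sum   <= sum_v;
      tp_count <= count_v;
    end if;
  end process p_main;
end architecture rtl;
```

In a waveform view tp_sum therefore changes only after a clock edge, while stepping through the process shows sum_v taking its new value immediately.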

Any way, now I have a 'data-flow' module that gets synthetized as I
expected. Then, I'll compare the RTL-view of both modules to
understand what's going on and I'll post it.

Thanks.

-- Mike Treseler
 
Cesar

If all you need is a rom, consider something like this. http://mysite.verizon.net/miketreseler/sync_rom.vhd


Here's a template for a block ram that works for brand A.
It might work for X also.

http://mysite.verizon.net/miketreseler/block_ram.vhd

I finally made it work properly. At first, I was trying to infer the
blockRAM as a ROM from a single VHDL module (the single-process
module). I made it like this:

p_main: process(clk)
  type rom_t is array(0 to 2**NR_BITS_ADDR - 1) of
    std_logic_vector(NR_BITS_DATA - 1 downto 0);
  constant rom_c: rom_t := ...; -- (read from a file)
  variable data_read_v: unsigned(NR_BITS_DATA - 1 downto 0);
  variable addr_read_v: std_logic_vector(NR_BITS_ADDR - 1 downto 0);

  procedure update_regs is
  begin
    -- ...
    -- USE addr_read_v before updating it (since blockRAM is registered).
    -- The type conversions assume ieee.numeric_std.
    data_read_v := unsigned(rom_c(to_integer(unsigned(addr_read_v))));
    -- ...
    -- UPDATE addr_read_v after using it.
    addr_read_v := std_logic_vector(unsigned(addr_read_v) + 1);
    -- ...
  end procedure update_regs;

begin
  if rising_edge(clk) then
    if rst = '1' then
      init_regs;
    else
      update_regs;
    end if;
  end if;
  update_ports;
end process p_main;

Logic simulation with Modelsim was OK, but post-synthesis simulation
(ISE) added an additional register stage.

Then, I tried to infer the blockRAM (ROM) from an independent VHDL
module following the template suggested by Mike. At first it did not
work, and that was my fault: I had not taken into account the
additional register stage added when addr_read_v goes out through a
port from the 'one-process style' module towards the blockRAM module.
After fixing that, it worked OK.

Thank you all for your help,
César
 
Mike Treseler

Cesar said:
...
Then, I tried to infer the blockRAM (ROM) from an independent VHDL
module following the template suggested by Mike. At first it did not
work because of my fault. I did not take into consideration the
additional register stage added when the addr_read_v goes out to a
port from the 'one-process style' module towards the blockRAM module.
After debugging that, it worked ok.

Thanks for reporting your results.

It is good to simulate everything when learning variables.
(or any other time actually ;)
For vendor-specific structures like this, trial synthesis is needed also.

-- Mike Treseler
 
