Uncle Noah
ANNOUNCE: YARDstick - custom processor development toolset
Dear friends,
I am very pleased and proud to announce YARDstick:
http://electronics.physics.auth.gr/people/nkavv/yardstick,
a custom processor development toolset with an impressive list of
features.
YARDstick is a novel design automation tool for custom processor
development flows that focuses on the hard part: generating and
evaluating application-specific hardware extensions. YARDstick is a
powerful building block for ASIP (application-specific instruction-set
processor) development, since it integrates application analysis and
ultra-fast algorithms for custom instruction generation and selection
with user-defined compiler intermediate representations (IRs). As of
September 2007, YARDstick also integrates retargetable compiler
features for the targeted IRs/architectures. Remarkable features of
YARDstick are the following:
- retargetable to user-defined IRs by machine description.
- can be targeted to low-level compiler IRs, assembly-level
representations of virtual machines, or assembly code for existing
processors.
- fully parameterized custom instruction generation and selection
engine.
- lightning-fast code selector for multiple-input multiple-output
(MIMO) patterns based on graph matching. The code selector scales very
well with the instruction node count of basic block data-dependence
graphs (successfully tested with custom instruction patterns of more
than 30 nodes).
- virtual register assignment for virtual machine targets.
- an extensive set of backends, including an assembly code emitter, a C
backend, visualization backends for Graphviz and VCG (or aiSee), an
XML format amenable to graph rewriting, and others.
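
To make the idea of a parameterized MIMO custom instruction more concrete, here is a minimal Python sketch of the kind of legality check such an engine typically performs on a candidate pattern: a limit on inputs, a limit on outputs, a list of forbidden nodes, and convexity of the subgraph. This is not YARDstick code. The graph encoding (a map from each node to its operand nodes) and all names (`successors`, `is_convex`, `is_legal_pattern`, `live_out`) are assumptions made for illustration only.

```python
from typing import Dict, List, Set


def successors(preds: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Invert a node -> operands map into a node -> users map."""
    succs: Dict[str, Set[str]] = {n: set() for n in preds}
    for node, operands in preds.items():
        for op in operands:
            succs.setdefault(op, set()).add(node)
    return succs


def is_convex(preds: Dict[str, List[str]], cand: Set[str]) -> bool:
    """True if no data-dependence path leaves the candidate and re-enters it."""
    succs = successors(preds)
    # Outside nodes that consume a value produced inside the candidate.
    frontier = [s for n in cand for s in succs.get(n, ()) if s not in cand]
    seen: Set[str] = set()
    while frontier:
        node = frontier.pop()
        if node in seen:
            continue
        seen.add(node)
        for s in succs.get(node, ()):
            if s in cand:
                return False  # path re-enters the candidate
            frontier.append(s)
    return True


def is_legal_pattern(preds: Dict[str, List[str]], cand: Set[str],
                     live_out: Set[str], max_in: int, max_out: int,
                     forbidden: Set[str]) -> bool:
    """Check forbidden nodes, input/output limits and convexity (hypothetical API)."""
    if cand & forbidden:
        return False
    succs = successors(preds)
    # Inputs: distinct values produced outside the candidate and read inside it.
    inputs = {op for n in cand for op in preds.get(n, []) if op not in cand}
    # Outputs: candidate nodes whose value is live-out or used outside.
    outputs = {n for n in cand
               if n in live_out or any(s not in cand for s in succs.get(n, ()))}
    return (len(inputs) <= max_in and len(outputs) <= max_out
            and is_convex(preds, cand))


# Toy data-dependence graph of one basic block (invented for this example):
#   t1 = a * b;  t2 = t1 + c;  t3 = load(t2);  t4 = t2 + t3
dfg = {"a": [], "b": [], "c": [], "t1": ["a", "b"],
       "t2": ["t1", "c"], "t3": ["t2"], "t4": ["t2", "t3"]}
```

With this toy graph, `{"t1", "t2"}` is a legal 3-input, 1-output pattern, while `{"t2", "t4"}` is rejected as non-convex because the path through the load `t3` leaves the pattern and re-enters it.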
YARDstick comes along with a cross-platform GUI written in Tcl/Tk 8.5.
The ultimate goal of YARDstick is to liberate the designer's
development infrastructure from compiler and simulator idiosyncrasies.
With YARDstick, the ASIP designer is free to specify the target
architecture of choice and to add new implementations of analyses and
custom instruction generation/selection methods.
At this moment, YARDstick is being heavily used for developing a new
processor architecture of mine with many never-before-seen features,
mostly targeting FPGAs. A status report on the processor architecture
should be expected in late October 2007.
Depending on the target architecture, YARDstick typically obtains 2x
to 15x speedups fully automatically for benchmark applications
(optimized ANSI C source code). Speedups are evaluated against a
typical scalar RISC architecture.
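
For readers unfamiliar with how such speedup figures are usually derived, the following Python sketch shows one common way to estimate them from dynamic basic block statistics: weight the cycle cost of each block by its execution count, once without and once with custom instructions, and take the ratio. This is only an assumed model, not necessarily the estimator YARDstick uses. The `BlockProfile` type, the function names, and the numbers in `example_blocks` are invented for illustration and are not measured results.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class BlockProfile:
    """Profile of one basic block (hypothetical record layout)."""
    name: str
    exec_count: int   # dynamic execution count from profiling
    base_cycles: int  # cycles per execution on the baseline scalar RISC
    ci_cycles: int    # cycles per execution with custom instructions


def total_cycles(blocks: Iterable[BlockProfile], use_ci: bool) -> int:
    """Sum of per-block cycles weighted by execution frequency."""
    return sum(b.exec_count * (b.ci_cycles if use_ci else b.base_cycles)
               for b in blocks)


def estimate_speedup(blocks: Iterable[BlockProfile]) -> float:
    """Baseline cycles divided by cycles with custom instructions."""
    blocks = list(blocks)
    accelerated = total_cycles(blocks, use_ci=True)
    if accelerated == 0:
        raise ValueError("no cycles recorded; cannot compute a speedup")
    return total_cycles(blocks, use_ci=False) / accelerated


# Illustrative numbers only: a hot loop body that benefits from custom
# instructions and a cold block that does not.
example_blocks: List[BlockProfile] = [
    BlockProfile("loop_body", exec_count=1000, base_cycles=20, ci_cycles=5),
    BlockProfile("prologue", exec_count=100, base_cycles=10, ci_cycles=10),
]

print(estimate_speedup(example_blocks))  # 21000 / 6000 = 3.5
```

The example also shows why overall speedup is lower than the 4x gain inside the hot block: the cold block still costs the same number of cycles in both configurations.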
Detailed feature list:
1. Analysis engines generating both static and dynamic statistics:
- Data types
- Operation-level statistics
- Basic block statistics (ranking)
- Performance estimations with/without custom instructions.
2. Generation of CDFGs (Control-Data Flow Graphs).
3. Backend engines:
- ANSI C
- dot (Graphviz)
- VCG (GDL, aiSee)
- XML (GGX for the AGG graph rewriting tool)
- Retargetable assembly emitter for entire translation units
(single files with multiple functions/procedures).
- CDFG formats for various RTL synthesis tools.
4. Custom instruction engines:
- Fully parameterized MIMO custom instruction generation algorithm.
Features:
* Fast heuristic
* Configurable number of inputs
* Configurable number of outputs
* List of forbidden nodes
* Node sorting strategies (3 different strategies)
* Transformation rule library for applying CFG transformation
strategies
5. Custom instruction selection:
- Based on priority metrics (2 choices at the moment).
6. Graph (and graph-subgraph) isomorphism features for eliminating
redundant patterns. Multiple algorithms supported.
7. Visualization of custom instructions, basic blocks, control-flow
graphs and control-data flow graphs (basic block nodes expanded to
their constituent instructions).
8. Basic retargetable compiler features (alpha state):
- Code selector for MIMO instructions (tested with large cases).
- Virtual register assignment (allocation for a VM).
- Hard register allocator in the works.
9. Miscellaneous features:
- single constant multiplication optimizer
- elimination of false data-dependences in assembly-level CDFGs.
- beautification options for visualization
- interfacing (co-operation) with external tools such as peephole
optimizers, profilers, code generators etc.
- features related to the custom processor architecture (not to be
disclosed yet)
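
As background for the single constant multiplication item above: the goal of such an optimizer is to replace a multiplication by a known constant with a short sequence of shifts, additions and subtractions. The Python sketch below uses canonical signed digit (non-adjacent form) recoding, which is a well-known baseline for this problem. It is not claimed to be the algorithm implemented in YARDstick, and the function names `csd_digits` and `multiply_by_constant` are hypothetical.

```python
from typing import List


def csd_digits(c: int) -> List[int]:
    """Non-adjacent form of c >= 0, least-significant digit first.

    Every digit is -1, 0 or +1 and no two adjacent digits are nonzero.
    """
    if c < 0:
        raise ValueError("this sketch handles non-negative constants only")
    digits: List[int] = []
    while c != 0:
        if c & 1:
            d = 2 - (c & 3)  # +1 if c % 4 == 1, -1 if c % 4 == 3
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits


def multiply_by_constant(x: int, c: int) -> int:
    """Compute x * c using only shifts, additions and subtractions."""
    result = 0
    for shift, d in enumerate(csd_digits(c)):
        if d == 1:
            result += x << shift
        elif d == -1:
            result -= x << shift
    return result


# 7 * x becomes (x << 3) - x: two nonzero digits instead of three set bits.
print(csd_digits(7))
print(multiply_by_constant(5, 7))
```

Each nonzero digit corresponds to one shifted operand, so the number of additions/subtractions is the number of nonzero digits minus one. More elaborate optimizers can do better by sharing intermediate terms.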
The following application benchmarks have been tested with YARDstick
(the compiler features are not yet fully tested):
- ADPCM encoder and decoder (typically: 4x speedup)
- Video processing kernels: full-search block-matching motion
estimation, logarithmic search motion estimation, motion compensation
- Image processing kernels: steganography (hide/uncover), edge
detection, matrix multiplication
- Cryptographic kernels: crc32, rc5, raiden (7x speedup, 12x for
unrolled version)
At the YARDstick homepage:
http://electronics.physics.auth.gr/people/nkavv/yardstick/
you can find some additional material:
- 2-page brochure
- 2-page abstract for the DATE'07 University Booth
- a more extended presentation on YARDstick
The above material refers to the status of April 2007.
Expected enhancements to YARDstick in the near future:
- linear-scan and integer-linear programming based register
allocators
- bitwidth analysis
- CDFG->VHDL generation of custom instruction hardware
- algorithm implementation for CDFG pipelining
Interested parties are welcome to contact me for details on how to get
access to a demo version of the YARDstick toolset.
Kind regards
Nikolaos Kavvadias
Computer Architecture Specialist - Compiler Developer
Ph.D. candidate
M.Sc. Electronics Engineering
B.Sc. Physics
You may contact me at:
Nikolaos Kavvadias <[email protected]>
http://www.geocities.com/kaveirious/
http://electronics.physics.auth.gr/tomeas/en/kavvadias.html