
Transport Triggered CPU Architecture - peter_d_sherman
https://en.wikipedia.org/wiki/Transport_triggered_architecture
======
peter_d_sherman
Excerpt(s):

"Transport triggering _exposes some microarchitectural details that are
normally hidden from programmers_. This greatly simplifies the control logic
of a processor, because many decisions normally done at run time are fixed at
compile time.

However, it also means that a binary compiled for one TTA processor will not
run on another one without recompilation if there is even a small difference
in the architecture between the two.

TTAs can be seen as "exposed datapath" VLIW architectures. While VLIW is
programmed using operations, TTA splits the operation execution to multiple
move operations. The low level programming model enables several benefits in
comparison to the standard VLIW. For example, a TTA architecture can provide
_more parallelism with simpler register files than with VLIW_.

As the programmer is in control of the timing of the operand and result data
transports, the complexity (the number of input and output ports) of the
register file (RF) need not be scaled according to the worst case
issue/completion scenario of the multiple parallel instructions."
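
To make "splits the operation execution to multiple move operations" concrete, here is a minimal sketch of a toy transport-triggered machine. It is not any real TTA toolchain: the port names (`add.o`, `add.t`, `add.r`) and the `run` helper are made up for illustration, and the model assumes a single adder with one-cycle latency.

```python
# Toy transport-triggered machine: the only instruction is "move src -> dst".
# Port names (add.o, add.t, add.r) are hypothetical, chosen for this sketch.

def run(moves, regs):
    """Execute a list of (src, dst) moves and return the final state."""
    state = dict(regs)
    state.update({"add.o": 0, "add.t": 0, "add.r": 0})
    for src, dst in moves:
        # An int source is treated as an immediate; a str names a register or port.
        value = src if isinstance(src, int) else state[src]
        state[dst] = value
        if dst == "add.t":
            # Writing the trigger port is what starts the operation.
            state["add.r"] = state["add.o"] + state["add.t"]
    return state

# r3 = r1 + r2, expressed as three transports instead of one "add r3, r1, r2".
program = [
    ("r1", "add.o"),   # operand move
    ("r2", "add.t"),   # trigger move: the add fires here
    ("add.r", "r3"),   # result move
]

final = run(program, {"r1": 2, "r2": 3, "r3": 0})
print(final["r3"])  # 5
```

The point of the sketch is that the "add" never appears as an opcode; it happens as a side effect of the move into the trigger port, which is why the compiler (or programmer) ends up scheduling the datapath directly.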

[...]

"An important unique software optimization enabled by the transport
programming is called software bypassing. In case of software bypassing, the
programmer bypasses the register file write back by moving data directly to
the next functional unit's operand ports.

When this optimization is applied aggressively, the original move that
transports the result to the register file can be eliminated completely, thus
_reducing both the register file port pressure and freeing a general purpose
register for other temporary variables. The reduced register pressure, in
addition to simplifying the required complexity of the RF hardware, can lead to
significant CPU energy savings_ , an important benefit especially in mobile
embedded systems.[1] [2]"
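
A rough sketch of what software bypassing buys, using a toy move machine. Everything here is an assumption for illustration: the port names (`add.o`, `add.t`, `add.r`), the `run` helper, and the simple counting of register file reads and writes are not taken from a real TTA implementation.

```python
# Toy move machine that counts register file (RF) traffic.
# Port names and the counting model are hypothetical.

def run(moves, regs):
    """Run (src, dst) moves; return (state, rf_reads, rf_writes)."""
    state = dict(regs)
    state.update({"add.o": 0, "add.t": 0, "add.r": 0})
    rf_reads = rf_writes = 0
    for src, dst in moves:
        if src in regs:
            rf_reads += 1       # source is a general purpose register
        if dst in regs:
            rf_writes += 1      # destination is a general purpose register
        state[dst] = state[src]
        if dst == "add.t":
            # Writing the trigger port fires the adder.
            state["add.r"] = state["add.o"] + state["add.t"]
    return state, rf_reads, rf_writes

regs = {"r1": 1, "r2": 2, "r3": 3, "r4": 0, "r5": 0}

# r5 = (r1 + r2) + r3, with the intermediate sum written back to r4.
plain = [
    ("r1", "add.o"), ("r2", "add.t"), ("add.r", "r4"),
    ("r4", "add.o"), ("r3", "add.t"), ("add.r", "r5"),
]

# Same computation, but the first result goes straight to the next
# operand port, so the r4 write-back and re-read disappear.
bypassed = [
    ("r1", "add.o"), ("r2", "add.t"), ("add.r", "add.o"),
    ("r3", "add.t"), ("add.r", "r5"),
]

plain_state, plain_reads, plain_writes = run(plain, regs)
byp_state, byp_reads, byp_writes = run(bypassed, regs)
print(plain_state["r5"], plain_reads, plain_writes)  # 6 4 2
print(byp_state["r5"], byp_reads, byp_writes)        # 6 3 1
```

In the bypassed version `r4` is never touched, which corresponds to the "freeing a general purpose register" part of the excerpt, and the program is one move shorter.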

------
PaulHoule
The greatest potential of this is building a custom CPU that has a desired set
of functions built in. If you do a lot of addition, add more adders. You
choose the balance of fixed or floating point, or you could put in decimal
floating point operations, etc.

This would be fun to implement on an FPGA since you could add and remove
operations at will.

A very intelligent toolchain could synthesize alternative CPUs based on the
bottlenecks it finds in some particular piece of software; that kind of
codesign could produce very good parts.
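
A very rough sketch of that codesign loop, to show the shape of the idea: given a count of operations per type from some profiled program and a budget of functional units, keep giving the next unit to whichever operation type is currently the bottleneck. The names (`estimate_cycles`, `allocate_units`, the `profile` numbers) are invented for illustration, and the model assumes all operations are independent, every unit costs the same area, and transport buses are never the limit, none of which holds for a real design.

```python
import math

def estimate_cycles(op_counts, units):
    """Crude lower bound: the busiest unit type determines the cycle count."""
    return max(math.ceil(op_counts[op] / units[op]) for op in op_counts)

def allocate_units(op_counts, budget):
    """Start with one unit per op type, then greedily feed the bottleneck."""
    units = {op: 1 for op in op_counts}
    for _ in range(budget - len(op_counts)):
        bottleneck = max(
            op_counts, key=lambda op: math.ceil(op_counts[op] / units[op])
        )
        units[bottleneck] += 1
    return units

# Hypothetical operation mix for an addition-heavy program.
profile = {"add": 120, "mul": 30, "fadd": 10}

baseline = {op: 1 for op in profile}
units = allocate_units(profile, budget=6)

print(units)                              # {'add': 4, 'mul': 1, 'fadd': 1}
print(estimate_cycles(profile, baseline)) # 120
print(estimate_cycles(profile, units))    # 30
```

With this mix, all three spare units go to adders, which is the "if you do a lot of addition, add more adders" rule falling out of the numbers.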

~~~
peter_d_sherman
I did not think of that; good ideas!

In fact, the more I think of it... excellent ideas!!!

It's sort of like this: whatever a standard college/online EE CPU design
course told you a CPU only needs one of (an ALU, for example), you could
actually have 2..n of, given the right circuit design, if the software to be
run warranted it and there was enough space on your FPGA. Yes, the actual
underlying circuit implementation might be far more complex, but as a
high-level design idea, it's a great one!

For the right software/application -- I can visualize it working really well!

