

A new CPU (2013) - kumarski
http://www.fpgarelated.com/showarticle/44.php

======
danbruc
TL;DR Instead of _call [32-bit-start-address-of-function]_ use _call [8-bit-
index-into-table-of-start-addresses-of-functions]_ plus a mechanism to build
such a table, and make the look-up dependent on the current value of the program
counter because otherwise you could only ever call 256 different methods.

IMHO needlessly complex just to make call instructions smaller. And on its own
it does not provide any more security - messing with a call instruction will no
longer let you jump into the middle of nowhere but only to the beginning of a
basic block, but you can still just mess with the table of addresses unless
it is protected by other measures.

~~~
danbruc
On second thought, it is actually hard to save any memory at all. The
proposed look-up is _table[(PC >> 4) + index]_, so the table will have _code-size
/16 + 255_ entries of 32 bits each, making it _code-size /4 + 1020_ bytes large.
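
A minimal sketch of that look-up and the resulting table size, assuming the shift is applied before the index is added and that the code starts at address 0. The names `table_entries` and `resolve_call` and the target addresses are made up for illustration; they are not from the article.

```python
SHIFT = 4        # window slides one table slot per 16 bytes of code
INDEX_BITS = 8   # width of the index stored in the call instruction
ENTRY_BYTES = 4  # each table entry is a 32-bit address


def table_entries(code_size: int) -> int:
    """Slots needed so that any PC in [0, code_size) can use any 8-bit index."""
    highest_base = (code_size - 1) >> SHIFT
    return highest_base + (1 << INDEX_BITS)


def resolve_call(table: list[int], pc: int, index: int) -> int:
    """Return the target address for a call with the given 8-bit index at pc."""
    if not 0 <= index < (1 << INDEX_BITS):
        raise ValueError("index must fit in 8 bits")
    return table[(pc >> SHIFT) + index]


code_size = 4096
entries = table_entries(code_size)  # code_size / 16 + 255 for multiples of 16
table = [0x1000 + 16 * i for i in range(entries)]  # hypothetical targets
table_size_bytes = entries * ENTRY_BYTES           # code_size / 4 + 1020
```

Note that a call at PC 16 with index 0 resolves to the same slot as a call at PC 0 with index 1, which is what lets the window slide with the program counter.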

Given the normal code size _s_ with _n_ call instructions, you can reduce the
code size to _s - 3n_ because you save three bytes per call, but you have an
additional table of size _s /4 + 1020_. Setting _3n = s /4 + 1020_ yields the
break-even point _n = s /12 + 340_, and that means there has to be more than
one call instruction every 12 bytes on average to save any memory at all.
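
The same arithmetic as a quick check, under the assumptions stated above (three bytes saved per call, 32-bit table entries, a shift of 4). The function names are just for illustration.

```python
from fractions import Fraction


def table_bytes(code_size: int) -> Fraction:
    """Size of the address table: s / 4 + 1020 bytes."""
    return Fraction(code_size, 4) + 1020


def net_saving(code_size: int, calls: int) -> Fraction:
    """Bytes saved overall: 3 per call, minus the table."""
    return 3 * calls - table_bytes(code_size)


def break_even_calls(code_size: int) -> Fraction:
    """Number of calls at which the saving is exactly zero: s / 12 + 340."""
    return Fraction(code_size, 12) + 340


s = 12000
n = break_even_calls(s)
print(n, net_saving(s, int(n)), net_saving(s, int(n) + 1))
```

For 12000 bytes of code the break-even is 1340 calls; anything below that makes the program larger than before.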

~~~
skew
That sounds like a pretty reasonable density to see savings in threaded forth
code like the second post talks about (it also mentions adjusting the shift
for other densities). Second, you don't need to reserve excess state at the
end like that if the linker/compiler takes a little care allocating offsets so
the last instructions use small indices. It definitely seems like a bit of a
specialized trick, but the articles do a decent job describing conditions
where it might make sense.
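
A rough sketch of how the shift changes the required density, assuming 32-bit table entries, three bytes saved per call, and no slack reserved at the end of the table. The function name is hypothetical.

```python
from fractions import Fraction


def bytes_per_call_at_break_even(shift: int,
                                 entry_bytes: int = 4,
                                 saved_per_call: int = 3) -> Fraction:
    """Average code bytes per call at which the table cost equals the saving.

    The table costs entry_bytes / 2**shift bytes per byte of code, and each
    call saves saved_per_call bytes, so break-even is one call every
    saved_per_call * 2**shift / entry_bytes bytes.
    """
    return Fraction(saved_per_call * 2 ** shift, entry_bytes)


for shift in (4, 5, 6):
    print(shift, bytes_per_call_at_break_even(shift))
```

With a shift of 4 this gives the one call per 12 bytes from the parent comment; each extra bit of shift doubles the distance allowed between calls, at the cost of the window moving more coarsely.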

~~~
danbruc
Assuming 32 bit instructions, at least every third instruction would have to be
a call or jump - this does not seem a reasonable assumption. And if you have code
with such a high density of jumps you will run into a lot of other problems,
too, for example branch mispredictions and cache inefficiency due to non-
locality. Maybe there is really a niche where this design has advantages but I
don't think it is good for general purpose code.

~~~
prutschman
> Assuming 32 bit instructions

You don't have to assume, the article itself says the author is contemplating
9-bit instructions.

~~~
danbruc
I did not read it like that. 9 bits is not really enough for an interesting
instruction set for a register machine. It is however enough for a stack
machine or something similar with implicit operands, but then again you need
more instructions for the same task than with a register machine.

~~~
prutschman
If you search for the phrase "To add more 'regular' instructions" in part 1,
you'll see where he references 8 or 9 bit instructions. The author goes on, in
subsequent sections, to expand the idea of tokenized jump references to
tokenized instructions.

Each instruction token may only be 8 or 9 bits, but the secondary memory which
contains jump targets is expanded to also contain full-width instructions.

The operating hypothesis is that just as it's desirable to be able to "reach" any
32-bit destination address despite any given code segment not needing that
flexibility, it's likewise desirable to be able to execute any, say, 36-bit
instruction even though any given code segment only needs a subset of them.

It's really worth a read.

------
vanderZwan
Although it's fine to be critical (and I don't think people here are
particularly vicious in their criticism, just sceptical about the utility of
this technique), I'd like to remind everyone that the only way that we can say
with certainty that there's nothing valuable in pursuing a particular
direction is by actually taking a close look at it.

I think it's great that there are people who take a concept and work it out
until the end in precise detail, even when it is not directly clear what the
benefits would be, if any. It's also wonderful that this person decided to
share the result of all of this work so that others might be inspired by it or
find a useful trick or technique in it.

------
zw123456
It seems to me to be basically a rehash of the 80x86 segmented architecture,
but maybe I am missing something?

------
lnanek2
Seems like a lot of work to just get a slow indirect addressing system which
we have plenty of already...

~~~
gpcz
Parts 2, 3, and 4 explain the practical ramifications of his sliding-window
design (which differs from indirect addressing) much more thoroughly. In Part
3, he actually mentions that HackADay accidentally credited him with PC-
relative jump due to a similar misunderstanding.

