
The Art of Picking Intel Registers (2003) - Tomte
http://www.swansontec.com/sregisters.html
======
bogomipz
>"There are three major processor architectures: register, stack, and
accumulator. In a register architecture, operations such as addition or
subtraction can occur between any two arbitrary registers. In a stack
architecture, operations occur between the top of the stack and other items on
the stack. In an accumulator architecture, the processor has single
calculation register called the accumulator."
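
To make the quoted distinction a bit more concrete, here is a toy sketch (not real hardware, and not from the article) that computes (2 + 3) * 4 on three imaginary machines. The mini instruction sets (`push`, `load`, `li`, and so on) and the function names are made up for illustration.

```python
# Toy illustration only: compute (2 + 3) * 4 on three imaginary machines.

def run_stack(program):
    """Stack machine: operands are implicit (the top items of the stack)."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack.pop()

def run_accumulator(program):
    """Accumulator machine: every operation targets the single accumulator."""
    acc = 0
    for op, value in program:
        if op == "load":
            acc = value
        elif op == "add":
            acc += value
        elif op == "mul":
            acc *= value
    return acc

def run_register(program, nregs=4):
    """Register machine: each instruction names its destination and source."""
    regs = [0] * nregs
    for op, dst, src in program:
        if op == "li":        # load immediate: src is a constant
            regs[dst] = src
        elif op == "add":     # regs[dst] = regs[dst] + regs[src]
            regs[dst] += regs[src]
        elif op == "mul":     # regs[dst] = regs[dst] * regs[src]
            regs[dst] *= regs[src]
    return regs[0]

STACK_PROG = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]
ACC_PROG = [("load", 2), ("add", 3), ("mul", 4)]
REG_PROG = [("li", 0, 2), ("li", 1, 3), ("add", 0, 1), ("li", 1, 4), ("mul", 0, 1)]

if __name__ == "__main__":
    print(run_stack(STACK_PROG), run_accumulator(ACC_PROG), run_register(REG_PROG))
```

All three programs produce the same result; what differs is how each instruction finds its operands.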

I have only ever seen "register" and "stack" used to distinguish between
virtual machine implementations, i.e. a "stack machine" such as the JVM versus
a "register machine" such as Dalvik. The author explains that the x86 is a
register-based architecture, but what are some examples of a stack-based CPU
architecture and an accumulator-based CPU architecture?

~~~
innocenat
For a stack machine, the RTX2010 comes to mind. It's a radiation-hardened
processor used in various spacecraft.

~~~
gh02t
Neat, I didn't know about this chip. Is there anything particular about a
stack architecture that makes it ideal for radhard applications, or is it just
the one they selected because they wanted to use Forth?

~~~
my123
It's mostly about convenience. Hubble uses a radiation hardened 486 instead of
those.

------
cesarb
Another interesting piece of trivia about the x86 registers: while most programmers
think of their order as being EAX, EBX, ECX, EDX, ESI, EDI, ESP, EBP, when you
look at how the registers are numbered in the instruction set, you see that
their true order is different: EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI.
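
As a rough illustration of where that numbering shows up: the 3-bit register number is added directly to the opcode byte in short forms such as `push r32` (0x50 + n) and `mov r32, imm32` (0xB8 + n). The sketch below assumes 32-bit mode, and the helper names are made up for the example.

```python
# Register numbers as used in x86 instruction encodings (32-bit registers).
REG_NUMBER = {
    "eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
    "esp": 4, "ebp": 5, "esi": 6, "edi": 7,
}

def encode_push(reg: str) -> bytes:
    """Encode `push r32`: a single byte, 0x50 plus the register number."""
    return bytes([0x50 + REG_NUMBER[reg]])

def encode_mov_imm32(reg: str, value: int) -> bytes:
    """Encode `mov r32, imm32`: 0xB8 plus the register number, then a
    little-endian 32-bit immediate."""
    opcode = bytes([0xB8 + REG_NUMBER[reg]])
    return opcode + (value & 0xFFFFFFFF).to_bytes(4, "little")

if __name__ == "__main__":
    # Print the registers in encoding order with their one-byte push opcode.
    for name in sorted(REG_NUMBER, key=REG_NUMBER.get):
        print(f"{name}: push = {encode_push(name).hex()}")
```

So `push ebx` comes out as 0x53 rather than 0x51, which is what you would expect if EBX were really "second".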

~~~
teh_klev
What's the significance of this?

~~~
eyer2016
Next time you play trivia at your local bar and get asked, "what's the true
order of x86 registers," you will get 5 points

~~~
teh_klev
Ha ha.... :)

------
dbrgn
This is a great article. I used it to optimize the code for a code golf puzzle
our hackerspace made:
[http://codegolf.coredump.ch/challenges/asm-compass/](http://codegolf.coredump.ch/challenges/asm-compass/)
If you want a challenge to reduce the binary size of a very simple program
(= choosing instructions with smaller opcodes), give it a try :)

~~~
userbinator
I've done a few challenges like that[1], but seeing that the smallest so far is
over 300 bytes makes me think most of those 300 bytes are not actual code but
program header overhead, because it should be doable in far fewer bytes of
actual code. Once you've done these challenges a few times, you can easily
estimate a rough size:

The template is 48 bytes. Add 16 bytes for an [offset,char] table. Maybe
another 30 bytes (let's be generous and round that up to 32) for code to pull
the commandline parameter, do the indexing and template modification, write
the output, and exit. 96 bytes of code. Add 64 bytes of minimal ELF headers[2]
and you're still well below 200 bytes.
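
Spelled out as a quick tally (just the arithmetic from the estimate above; the dictionary keys are labels I made up):

```python
# Rough size budget from the estimate above (all values in bytes).
budget = {
    "template": 48,
    "offset_char_table": 16,
    "code": 32,  # roughly 30, rounded up to be generous
}
ELF_HEADERS = 64  # minimal ELF header overhead assumed in the estimate

payload = sum(budget.values())
total = payload + ELF_HEADERS

if __name__ == "__main__":
    print(payload, total)  # prints "96 160"
```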

I don't have an environment at the moment that does ELF files, so I just
tried it with 16-bit DOS and ended up at... 95 bytes:

[https://pastebin.com/C7vDfFbR](https://pastebin.com/C7vDfFbR)

I wasn't trying very hard so someone will probably find an even smaller
version, and it was just for fun since it doesn't meet the rules anyway.

[1]
[http://www.hugi.scene.org/compo/compoold.htm](http://www.hugi.scene.org/compo/compoold.htm)

[2]
[http://www.muppetlabs.com/~breadbox/software/tiny/teensy.htm...](http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html)

~~~
dbrgn
Thanks for the links! Most of the submitted approaches don't even mess with
the ELF binary structure. But even if you don't do that, there's a lot you can
learn by optimizing (sometimes doing something with more instructions
actually results in a smaller binary because of shorter opcodes).
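
A small example of more instructions giving fewer bytes, using hand-assembled 32-bit encodings (this is only an illustration, not taken from any submission; the constant and function names are made up). Loading 1 into EAX with `mov` takes 5 bytes because of the 32-bit immediate, while `push 1` / `pop eax` takes 3.

```python
# Hand-assembled 32-bit x86 encodings, one bytes object per instruction.
MOV_EAX_1 = [bytes.fromhex("b801000000")]         # mov eax, 1
PUSH_POP_EAX_1 = [bytes.fromhex("6a01"),          # push 1  (imm8 form)
                  bytes.fromhex("58")]            # pop eax
MOV_EAX_0 = [bytes.fromhex("b800000000")]         # mov eax, 0
XOR_EAX_EAX = [bytes.fromhex("31c0")]             # xor eax, eax (also changes flags)

def size(sequence):
    """Total encoded size in bytes of a list of instructions."""
    return sum(len(insn) for insn in sequence)

if __name__ == "__main__":
    for label, seq in [("mov eax, 1", MOV_EAX_1),
                       ("push 1 / pop eax", PUSH_POP_EAX_1),
                       ("mov eax, 0", MOV_EAX_0),
                       ("xor eax, eax", XOR_EAX_EAX)]:
        print(f"{label}: {len(seq)} instruction(s), {size(seq)} byte(s)")
```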

It was the first time I really messed around with x86 assembly and I learned a
lot in the process :)

Edit: Wording

Edit2: Also note that for this challenge you need to submit NASM assembly, and
not the resulting binary. So you can't resort to hex editing the binary
itself.

------
aetherspawn
Don’t get any ideas though. Whilst these instructions might make the output
code smaller, they will probably make the program slower (1). Processors are
optimised to run compiler-generated code fast, and compilers stopped
outputting these instructions a long time ago.

1/ There are edge cases, such as very large programs where smaller code might
fit in the cache better or perhaps dispatch better, but probably not many.

~~~
userbinator
In my experience, cache matters a lot and compilers aren't smart enough to
figure out what parts of the code are time-critical and what parts aren't, so
they just use the fastest (and sometimes significantly bigger) instruction
sequences everywhere. Optimising non-critical-path code by making it smaller
even if it's somewhat slower can actually make the program as a whole faster,
since the cache misses decrease.

