
Why the EAX register of x86 is called that - based2
https://keleshev.com/eax-x86-register-meaning-and-history/
======
kens
One thing directly connected to this history is why the x86 is little-endian.
As the article explains, the 8008 was designed for the Datapoint 2200
terminal. The 8008 was intended as a compatible replacement for the existing
Datapoint processor, which was built from simple TTL chips.

To reduce the chip count, the Datapoint 2200 used a serial processor, which
processed one bit at a time, so you had a 1-bit ALU among other things. One
consequence is that you have to start with the low-order bit when doing
addition, so you can handle carries. And for 16-bit values, you also have to
start with the low-order byte. This forces you into a little-endian
architecture.
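
That LSB-first carry chain is easy to sketch in Python (a toy 1-bit full-adder loop, purely illustrative, not actual Datapoint logic):

```python
def serial_add(a, b, width=16):
    """Add two integers one bit at a time, LSB first, the way a
    bit-serial ALU must: the carry out of each bit position feeds
    the next, so the low-order bit has to be processed first."""
    carry = 0
    result = 0
    for i in range(width):              # bit 0 (LSB) first
        bit_a = (a >> i) & 1
        bit_b = (b >> i) & 1
        s = bit_a ^ bit_b ^ carry       # 1-bit full adder: sum bit
        carry = (bit_a & bit_b) | (bit_a & carry) | (bit_b & carry)
        result |= s << i
    return result & ((1 << width) - 1)

assert serial_add(0x1234, 0x0FFF) == 0x2233
```

Running the loop MSB-first instead would require a second pass to propagate carries, which is exactly the pressure toward storing the low byte first.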

Thus, to be compatible with the Datapoint 2200, the 8008 was also made little-
endian. Unfortunately, Intel was very slow creating the 8008, so Datapoint had
moved on to a parallel 74181-based architecture and didn't want the 8008.
Intel decided to sell the 8008 as a stand-alone product, essentially creating
the microprocessor industry. As the article explains, the x86 grew out of the
8008, so x86 also inherited the little-endian architecture.

~~~
ajross
Pretty much all simple multi-word ALU algorithms need to start with the low
word first. That particular terminal wasn't remotely unique (though for all I
know, it was indeed the specific product that drove the Intel design
decision), nor was x86 the first LE architecture. The PDP-11 made the same
decision for the same reason several years earlier (and the VAX followed for
pseudo-compatibility), etc...

Really, it wasn't until the mid-80's and the RISC revolution, where all of a
sudden people found themselves designing systems that would be 32 bits wide
from the very first silicon that the community "decided" that the only true
byte order should be BE.

And of course, the reason they made that decision was simply that the order of
bytes in memory happened to match the way readers of the Latin alphabet write
Arabic numerals on paper. Had the Arabs invented computing, there would never
even have been a debate.

~~~
greenyoda
> Really, it wasn't until the mid-80's and the RISC revolution, where all of a
> sudden people found themselves designing systems that would be 32 bits wide
> from the very first silicon that the community "decided" that the only true
> byte order should be BE.

IBM 360-series mainframes were 32-bit and big-endian in the 1960s. (Their
earlier computers may have been big-endian too, I'm not sure.)

~~~
cesarb
> (Their earlier computers may have been big-endian too, I'm not sure.)

Yes, their choice of big endian probably came from their earlier unit record
equipment, which used punched cards as input, storage, and output. Since
punched cards were (sort of) directly human readable, it was natural to use
big endian.

~~~
KMag
Also, making punch cards big-endian meant that BCD integers sorted the same
way lexicographically and numerically, so no special sorting mode was
necessary for numeric fields in punch card sorting machines.
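
This can be checked in a couple of lines of Python: zero-padded decimal fields (most significant digit first, as punched into fixed columns) order the same way whether compared as text or as numbers:

```python
# Fixed-width decimal fields, big-endian digit order, as on a punched
# card: lexicographic sort and numeric sort agree, so a card sorter
# needs no special numeric mode.
values = [42, 7, 1000, 999, 30]
fields = ["%05d" % v for v in values]   # 5-column numeric field

assert sorted(fields) == ["%05d" % v for v in sorted(values)]
```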

------
jpxw
From reddit:

“This is basically the report_final_FINAL2.docx of register names.”

~~~
pizlonator
That analogy only works if “report”, “report_final”, and “report_final_FINAL”
were all released, were insanely successful and made you a billionaire and
then you didn’t want to lose the customers that liked those versions when you
made some small edits.

~~~
elsjaako
Half-life 2: Episode 2

------
nayuki
Excellent explanation in the article. Also note that in x86-64 mode, the low 8
bits of all registers can be accessed, namely: AL, BL, CL, DL, SIL, DIL, SPL,
BPL, R8B, R9B, R10B, R11B, R12B, R13B, R14B, R15B. Previously, there was no
equivalent of SIL, DIL, SPL, and BPL.
[https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/x64-architecture](https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/x64-architecture)

~~~
hinkley
AL, BL make sense. But now I need an explanation for why they switched from L
to B for R8B...

Also if you could explain A-R15 that’d be super. Apparently they have learned
absolutely nothing.

~~~
therealcamino
I had the same question about the suffix letters. My uninformed guess is:

      8 bits == byte       == B
     16 bits == word       == W
     32 bits == doubleword == D

~~~
jagrsw
I believe D for doubleword is mostly a Microsoft C/C++ thing, though I'm sure
it appears in other places too.

As for assembler syntax, my guess would be that 32-bit words are mostly
indicated by l (for long), e.g. in AT&T x86 syntax and m68k assemblers:

movl (AT&T x86) or mov.l (m68k).

Edit: Ah, yeah, Intel's x86 asm syntax also uses it, so I guess that's where
the idea of using D for 32-bit values originates.

------
xscott
Most people list the registers in alphabetic order, but numerically they are
encoded with the B registers in the 4th place: EAX=0, ECX=1, EDX=2, EBX=3, ...
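
The encoding order can be sketched as a lookup table (Python, just as an illustration; the one-byte 16-bit `inc` opcodes are 0x40 plus the register number):

```python
# 3-bit register numbers used in x86 instruction encodings,
# in the order described above (BX in slot 3, not slot 1):
ENCODING = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]

# The one-byte 16-bit "inc r16" opcodes are 0x40 + register number,
# so "inc ax" is 0x40 and "inc bx" is 0x43, not 0x41:
assert 0x40 + ENCODING.index("AX") == 0x40
assert 0x40 + ENCODING.index("BX") == 0x43
```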

If you ever find yourself writing an AMD-64 assembler, it really feels like
you're digging through archaeology with all of the weird quirks you need to
implement. The SSE, AVX, and AVX-512 encodings add even more levels of, "why
did they do that?!?" which don't make much sense except in the context of
history.

~~~
userbinator
_but numerically they are encoded with the B registers in the 4th place_

I've never seen an authoritative answer for that but I believe it comes from
the fact that in the 16-bit addressing modes, BX is the only one of the 4 ABCD
registers that can be used to address memory, and the circuitry for decoding
is very slightly simpler to detect two 1 bits than one 1 and one 0 bit. The 4
other registers are, in order, SP BP SI DI.

 _which don't make much sense except in the context of history_

That can be said for a lot of things...

------
falcrist
> B, C, D, E were completely generic and interchangeable.

I was under the impression that it went

Accumulator

Base

Counter

Data

I'm not sure about the D or E registers, but I _am_ sure I remember using B as
the base address register for arrays, and using C as _the_ counter register
for loops and such because the others couldn't be used that way.

It's been a while. Am I misremembering?

~~~
bonzini
Yes, in the Z80 BC was a sort of counter register (e.g. for LDIR or DJNZ
instructions), which is perhaps why BC became CX. In the 8080 there wasn't
much difference between BC and DE.

The interesting part is that BX maps to HL, which explains the weird order
AX/CX/DX/BX in the encoding of 8086 instructions.

------
DmitryOlshansky
E - extended, meaning from 16 to 32 bits.

A - accumulator

X - wildcard for both upper and lower 8-bit parts, this becomes redundant b/c
of E prefix

------
amelius
> I’m afraid there’s no short answer! We’ll have to go back to 1972…

No we don't ... the E stands for "extended".

~~~
BorisTheBrave
And the X also stands for extended.

~~~
Kranar
Alternatively the X is a very old assembly notation for "pair". 8-bit
registers on 8080 processors could be paired together to work as a single
register. Operations performed on these register pairs used the letter X.

You can look here at the original 8080 reference manual:

[https://altairclone.com/downloads/manuals/8080%20Programmers...](https://altairclone.com/downloads/manuals/8080%20Programmers%20Manual.pdf)

For example page 4 lists the name of instructions, INR is "increment register"
while INX is "increment register pair", similar notation is used for several
other instructions where the X is the register paired version of an
instruction.

In the case of the AX register, it likely just refers to the pair of AH and AL
registers.

At any rate, it's really interesting glancing over the various 8086 reference
manuals. Gives me a deep sense of appreciation for how far things have come
and how things have managed to build upon what are otherwise some very simple
and fundamental building blocks.

------
jchw
I was wondering if it was going to cover RAX/amd64, and it does. Nothing
terribly new here, but it's a nice dive into an interesting microcosm of
Intel architecture.

I do somewhat wish AMD managed to get R0-R7 as the standard, though :p oh
well.

~~~
remcob
The R0-R7 naming is pretty widely supported. For example by LLVM's assembler,
so you can use it in Rust or C(++) inline assembly.

------
pizlonator
I love the x86 registers and their names and special roles.

On the one hand, it’s gross that x86 still has this legacy.

On the other hand, it’s a good thing that it’s possible to maintain
compatibility so far back while still having such good perf. I find that
aspect of modern x86 to be super impressive.

------
chkaloon
Interesting how much early processor history was driven by Intel project
delays.

~~~
Tuna-Fish
IMHO the story of the iAPX432 really is the whole industry in microcosm.

Intel hires the best of the best, the true cream of the crop, to design an
ISA meant to crystallize everything known about ISA design into a single,
completely new design that discards all the broken crap of yesterday and will
be future-proof for decades if not forever, and a chip to implement it.

Everything is going to be great forever, but it turns out it's a bit hard to
get done, so they just have the B-team whip up something quick to sell in the
meantime. This something was the 8086, which gets adopted into increasingly
successful products, but no matter, the new shiny thing is going to displace
all of that when it finally ships.

Then it does, and it actually has a hard time competing in performance with
the much older stopgap product. It turns out that the team of superstars they
hired was very theoretical, and built an ISA that was a dream to program
against, but did not really understand what it took to build something that
was going to be fast when implemented in hardware. (Also, the system was meant
to be used mainly with high-level languages, and the compilers really were not
there yet.) Being more expensive, much slower than the competition and with 0
market penetration, the iAPX 432 was dead on arrival in the market.

Luckily, the B-team had been busy working on another extension of the stopgap
product, the 80286, which was again a runaway success, and only partially
because of backwards compatibility with the existing x86 ecosystem. It was
also quite fast for its time.

~~~
JoeAltmaier
I recall the iAPX432 came out years later than the 8086, was a 'capabilities
machine', and took 500 memory cycles to execute "JMP ." (i.e. jump to self),
potentially the simplest possible operation.

------
mkchoi212
So cool. Kinda disappointed in my hacker side, because I never questioned why
the EAX register is actually named "EAX". Wondering what else I take for
granted now :p

------
kazinator
A is the A register, first letter of the alphabet.

Under 16 bits, it has a high and low part, AL and AH. The 16 bit whole got
called AX, because X gets used a lot as a wild card.

EAX is extended (to 32 bits) AX.
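
The layering can be shown with plain bit masks (Python, purely illustrative):

```python
def sub_registers(eax):
    """Slice one 32-bit value the way x86 names its pieces:
    AX is the low 16 bits, AL/AH the low/high bytes of AX."""
    ax = eax & 0xFFFF          # low 16 bits of EAX
    al = ax & 0xFF             # low byte of AX
    ah = (ax >> 8) & 0xFF      # high byte of AX
    return ax, ah, al

assert sub_registers(0xDEADBEEF) == (0xBEEF, 0xBE, 0xEF)
```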

~~~
iforgotpassword
Then you could have kept calling it A.

~~~
danmg
No, because you still have opcodes that operate on just 64/32/16/8 bits of
the register.

~~~
thaumasiotes
Aren't those AL and AH? When we rename A to AL, why do we need to permanently
retire the term "A"? The L in AL stands for "low"; what does the A stand for?

~~~
kazinator
The A stands for AL, in a snippet of 8008 assembly code that you're supposed
to be able to use in the middle of 8080 assembly code, and which is written
in a language that knows nothing about AL or AX. Or that was the idea.

~~~
thaumasiotes
Thanks. So it's not so much that "you still have the opcodes that operate on
just 64/32/16/8 bits" as "ASCII assembly code for any CPU is expected to be
source-compatible with ASCII assembly code for any later CPU"?

Is there any indication in the source demarcating the 8008 assembly from the
8080 assembly?

------
maayank
I LOVE this kind of technical computer history! Does anyone know of other
recommended resources?

~~~
woadwarrior01
If you’re visiting the Bay Area, be sure to visit the Computer History Museum
in Mountain View. It’s the Mecca of computing history. Also, for early
internet history, I’d recommend Katie Hafner’s book: Where Wizards Stay Up
Late.

~~~
maayank
I did and I loved every moment of it :) Also in meatspace, enjoyed Bletchley
Park in the UK.

Re: books, I liked Robert X. Cringely's Accidental Empires, but it only
covers the PC industry up to '91 (or '93, IIRC, for the 2nd edition) and
isn't really technical.

edit: also, of course:
[https://devblogs.microsoft.com/oldnewthing/](https://devblogs.microsoft.com/oldnewthing/)

~~~
Y_Y
The Bletchley Park museum is about ten times better and has lots of running
computers, knowledgeable staff, and stuff to play with.

------
peter_d_sherman
Excerpt:

"You might think—gee, seven is a very odd number of registers—and would be
right! The registers were encoded as three bits of the instruction, so it
allowed for eight combinations. The last one was for a pseudo-register called
M. It stood for memory. M referred to the memory location pointed by the
combination of registers H and L. H stood for high byte, while L stood for low
byte of the memory address. That was the only available way to reference
memory in [an] 8008."
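
The H/L scheme is just a high/low byte split; here's a hypothetical sketch in Python (the 8008 had a 14-bit address space, hence 16 KB of memory):

```python
# The 8008's M pseudo-register: H holds the high byte and L the low
# byte of the address; M then reads/writes memory[HL].
memory = [0] * 0x4000            # 14-bit address space: 16 KB

def m_address(h, l):
    """Combine the H and L registers into one memory address."""
    return ((h << 8) | l) & 0x3FFF

memory[m_address(0x12, 0x34)] = 0x99
assert memory[0x1234] == 0x99
```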

------
perl4ever
Now I want to define an architecture just like the 8008, only with each
register 64-bits and only 64-bits.

~~~
saagarjha
Defining an architecture is easy! All you have to do is write up a document
with instructions and their encodings, and boom! You’ve created a new
architecture. Here’s one I made earlier this year for a class I was teaching,
for example:
[https://github.com/regular-vm/specification](https://github.com/regular-vm/specification)

~~~
perl4ever
Well, yes, but that would be in order to emulate it or maybe implement it on
an fpga or something.

~~~
saagarjha
Yes, that's what we did. Did you have something else in mind?

------
knolax
Assembly mnemonics and their cousin pin names are so excessively terse. I wish
it was mandatory to expand acronyms in a dedicated field in datasheets. So
often you'll find pin names like "NCE" where you're just expected to know a
priori that it means "active-low Chip Enable" and it's so counterproductive.

~~~
tenebrisalietum
Assembly mnemonics, I think, are terse to make them dirt simple for
assemblers to read; the assemblers themselves had to be hand-entered via hex
until they became "self-hosting", and could not be on the same level of
complexity as compilers. This was definitely very much needed when these CPUs
were new in the '70s.

I've always liked how 6502 neatly has everything in 3 letter opcodes. But
that's not scalable given modern CPU capabilities.

~~~
userbinator
...and for humans to read and write them, because what could be a single
symbol in an HLL ('+') turns into several letters ('ADD') in Asm.

Some companies experimented with "symbolic Asm" which was even terser, and
made it look more like an HLL.

------
billfruit
More interestingly why are FS and GS registers called so?

~~~
Narishma
Maybe because they come after DS and ES.

~~~
rzzzt
CS, DS and ES also have letters in alphabetical order, yet they are treated
as acronyms. I'm wondering why there's no meaning (even if fabricated)
attached to FS and GS.

~~~
rwmj
Not really sure what your question means, but originally CS for "code
segment", DS for "data segment" and SS for "stack segment" had distinct use
cases. They weren't general purpose segment registers at all. ES was an
"extra" data segment register. When FS and GS were added later the
alphabetical ordering of CS, DS, ES, FS, GS was natural (with SS still being
the odd one out).

~~~
rzzzt
Precisely what you wrote: CS is "Code Segment", DS is "Data Segment", ES is
"Extra Segment" (even here it feels a bit manufactured), but FS and GS lack
any semi-reasonable expansion.

~~~
masklinn
Because the purpose of FS and GS was not specifically defined at the hardware
level; they're segment registers, but without a specific purpose.

And I guess the complete lack of dedicated purpose of FS and GS is why x64
keeps them available.

~~~
psychoslave
So, are they faint and generic segments? :D

------
jbverschoor
Hmm I guess I'm getting old. ehhh senior

------
based2
[https://www.reddit.com/r/programming/comments/fm2xb9/heres_w...](https://www.reddit.com/r/programming/comments/fm2xb9/heres_why_the_eax_register_of_x86_is_called_like/)

------
JoeAltmaier
The iAPX 432 was long after the 8086, not before.

And the 80286 predated the 80386 as a 32-bit processor, so the 80286 was
Intel's first 32-bit offering. It was still segmented, missing the paging
hardware. I helped write an operating system for it. Short-lived.

~~~
ch_123
The iAPX 432 project started in 1975, but took many years to deliver anything.
The 8086 was indeed intended as a stop gap while the ‘432 was under
development.

The 80286 by most definitions was a 16-bit CPU, having a 16-bit data bus,
16-bit general purpose registers and 16-bit segments. It did have a 24-bit
address space though, up from the 20 bits of the 8086.

EDIT: Even Intel called the 286 a 16-bit CPU, see the Preface of
[http://bitsavers.trailing-edge.com/components/intel/80286/210498-005_80286_and_80287_Programmers_Reference_Manual_1987.pdf](http://bitsavers.trailing-edge.com/components/intel/80286/210498-005_80286_and_80287_Programmers_Reference_Manual_1987.pdf)

~~~
JoeAltmaier
Oh of course! Thanks. The 24-bit address space changed operating system
designs to use a LONG to store physical memory addresses, which I guess
confused my memory of the whole thing.

It had 'real mode' which was the ordinary 8086 addressing mode (capable of
20-bit addressing to 1MB) and 'protected mode' which we called 'imaginary
mode' since nobody used it.

