
Reverse-engineering the 8086's Arithmetic/Logic Unit from die photos - abbeyj
http://www.righto.com/2020/08/reverse-engineering-8086s.html
======
SilasX
That title looked familiar, so I checked, and it looks like we've had recent
submissions from the same site for reverse engineering of different parts of
the 8086:

[https://news.ycombinator.com/item?id=24092605](https://news.ycombinator.com/item?id=24092605)

[https://news.ycombinator.com/item?id=24021415](https://news.ycombinator.com/item?id=24021415)

~~~
kens
Yes, I've been working through different parts of the 8086 lately. My goal is
to understand the microcode. I hope people aren't getting tired of the 8086
:-)

~~~
tyingq
Maybe the NEC V33 would be interesting to compare. 8086 compatible, but all
done with regular logic, no microcode.

~~~
kens
I'm surprised that NEC would do away with the microcode, since there is a lot
of microcode in the 8086. It would be interesting to see how much circuitry it
took to replace the microcode.

~~~
tyingq
Supposedly that made it twice as fast at the same clock speed.

~~~
tpmx
Someone (not me!) edited

[https://en.wikipedia.org/wiki/NEC_V20](https://en.wikipedia.org/wiki/NEC_V20)

to say

"The NEC V33 is a super version of the V30 that separates address bus and data
bus, and executes all instructions with wired logic instead of micro-codes,
making it twice as fast as a V30 for the same clock frequency. V33 has the
performance equivalent to Intel 80286. NEC V33 offers a method to expanding
the memory address space to 16M bytes. It has two additional instructions
BRKXA and RETXA to support extended addressing mode. The 8080 emulation mode
was not supported."

------
kens
Author here if anyone wants to discuss the internals of the 8086.

~~~
abbeyj
Do you know if NOP is really executed as `XCHG AX, AX` (as would be implied
from its position in the opcode map and its cycle count) or if it is special-
cased in some way?

I've always been curious about the XLAT instruction. It performs an operation
like `MOV AL, [BX + AL]` (if that were a valid instruction). I assume this
instruction was designed to do EBCDIC <-> ASCII conversions using a simple
loop consisting of `LODSB / XLAT / STOSB` which would explain why the input
and output are both hardcoded to AL. But it seems like it would have been
fairly easy to allow the programmer to specify different register(s) and the
instruction probably would have been a lot more useful generally if that was
allowed.

It is the only place in the ISA that I know of where an 8-bit value is zero-
extended to 16 bits before being added to another value. Short jumps use a
sign-extended 8-bit value (and are a special case anyway since they deal with
the instruction pointer). Other additions are always 8+8 or 16+16. Is there
some special case in the processor to allow just AL to be zero-extended in
this way? If so, that would go some way toward explaining why this instruction
isn't more flexible. I realize this may be very difficult to determine working
from a die photo, at least without completely reverse engineering the entire
microcode ROM.
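
To make the zero-extension point concrete, here is a minimal Python sketch of XLAT's visible behaviour and of the `LODSB / XLAT / STOSB` loop. It models only the architectural result, not the 8086's internal circuitry or microcode. The segment register is ignored, and `TABLE_BASE`, the helper names, and the uppercase table are hypothetical stand-ins for a real EBCDIC/ASCII table.

```python
def zero_extend8(value):
    """Widen an 8-bit value to 16 bits, filling the upper byte with zeros."""
    return value & 0xFF


def sign_extend8(value):
    """Widen an 8-bit value to 16 bits, copying bit 7 into the upper byte."""
    value &= 0xFF
    return (value | 0xFF00) if (value & 0x80) else value


def xlat(memory, bx, al):
    """Model XLAT: AL <- memory[BX + zero_extend(AL)], offset wraps at 64 KB."""
    offset = (bx + zero_extend8(al)) & 0xFFFF
    return memory[offset]


def translate(memory, bx, source):
    """Model a LODSB / XLAT / STOSB loop over a byte string."""
    return bytes(xlat(memory, bx, byte) for byte in source)


# Hypothetical 256-entry table at an arbitrary offset: uppercases ASCII letters
# and leaves every other byte unchanged.
TABLE_BASE = 0x0100
memory = bytearray(0x10000)
for i in range(256):
    memory[TABLE_BASE + i] = ord(chr(i).upper()) if i < 128 else i

result = translate(memory, TABLE_BASE, b"8086 alu")
```

With a sign-extended index, table entries 80h through FFh would be read from below the table base instead of above it, which is why the distinction matters for a 256-entry lookup.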

~~~
kiwidrew
The 8086/88 does in fact execute 90h (the NOP opcode) as XCHG AX, AX. It does
not generate any change in the user-visible architectural state, however, and
so it appears to be a NOP.

[Internally the CPU uses the "TMP" register as a temporary storage location
during execution of the XCHG operation, and so the NOP will clobber whatever
value was previously held there.]
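
As a rough illustration of the opcode-map position: opcodes 90h through 97h encode `XCHG AX, r16`, with the register number in the low three bits, so 90h falls out as `XCHG AX, AX`. The sketch below is only a decoder for that one opcode row, and the function name is made up for the example.

```python
# 16-bit register numbering used by the 8086 instruction encoding.
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def decode_xchg_ax(opcode):
    """Decode the one-byte XCHG AX, r16 row of the opcode map (90h-97h)."""
    if not 0x90 <= opcode <= 0x97:
        raise ValueError("not in the XCHG AX, r16 opcode row")
    # The low three bits select the register exchanged with AX.
    return f"XCHG AX, {REG16[opcode & 0b111]}"


nop_decoding = decode_xchg_ax(0x90)
```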

One can observe two of the internal registers ("TMP" and "IND") using an
invalid JMP FAR m32 opcode. Normally this is encoded as opcode FF/5 with a
mod-rm byte that specifies a memory location (mod=00, mod=01, or mod=10) from
which to fetch a doubleword CS:IP pointer. Specifying a mod-rm byte with
mod=11 (register) - which is an undefined encoding - skips the "load
doubleword from memory" microcode subroutine and causes the CPU to jump
directly to the location IND:TMP instead, allowing one to observe the values
held in these internal registers.
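
For anyone who wants to spot this encoding in a byte stream, here is a small sketch of splitting the mod-rm byte into its three fields and checking for FF/5 with mod=11. It only classifies the encoding; it does not model what the CPU does with it. The function names are hypothetical.

```python
def split_modrm(byte):
    """Split a mod-rm byte into its (mod, reg, rm) fields."""
    mod = (byte >> 6) & 0b11   # bits 7-6: addressing mode
    reg = (byte >> 3) & 0b111  # bits 5-3: register or opcode extension
    rm = byte & 0b111          # bits 2-0: register or memory operand
    return mod, reg, rm


def is_register_form_jmp_far(opcode, modrm):
    """True for opcode FF with /5 and mod=11, the undefined register form."""
    mod, reg, _ = split_modrm(modrm)
    return opcode == 0xFF and reg == 5 and mod == 0b11


fields = split_modrm(0xE8)  # binary 11 101 000
```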

At some point I'll get around to writing this up in more detail, as I don't
think anyone else has ever described the behaviour of this particular invalid
8086 opcode.

~~~
ajenner
I have written code for a cycle-exact 8088 emulator
[https://github.com/reenigne/reenigne/blob/master/8088/xtce/x...](https://github.com/reenigne/reenigne/blob/master/8088/xtce/xtce.h)
which handles all the invalid opcodes the same way that real hardware does.
There is a more potentially useful way of observing TMP and IND that doesn't
involve jumping to code that you might not control: use the LDS or LES opcodes
with mod=11.

~~~
kiwidrew
Hmm... I thought that would be the case, but when I tested it [on an AMD
80C88], LDS/LES with mod=11 just skipped the EA calculation subroutine and
then proceeded as normal, loading a doubleword from whichever memory address
happens to have been left in the IND register.

But I'll probably go back and re-test LDS/LES more thoroughly at some point
just to make sure I haven't missed something [or more likely, that I'm not
misinterpreting my scattered notes].

By the way, I really appreciated your detailed bus sniffer logs of the 8088
executing various instructions. It was enlightening to read through the traces
and helped me understand what was going on "under the hood" of the CPU.

~~~
ajenner
It might be different on an Intel 8088. What I saw was two bytes loaded from
the bus (at the address that was left in the IND register) placed in ES or DS,
and the offset part was loaded from a second hidden register that contained
the previous word read from or written to the bus (excluding instruction
fetches).

Glad the sniffer logs were useful! I have been using them pretty regularly for
debugging and profiling things.

------
jasonzemos
From note #13:

> The silicon implementation of the lower eight bits of the ALU / registers is
> flipped compared to the upper eight bits. The motivation is to put the ALU
> signals next to the flag circuitry that needs these signals.

This caught me out because I was expecting some more obvious rotation or
discernible mirroring of the upper half and lower half. Is the layout of bits
15:8 entirely different yet logically equivalent? Looking closely though I do
see some similarities in the outer circuits at the very top and very bottom,
yet there is a large gap across the lower bank.

~~~
kens
The top and bottom have almost identical circuitry, even though the layouts
look entirely different. The layout of the ALU is highly optimized; since
everything is repeated 16 times, squeezing out a bit of space makes a big
difference.

I assume the layouts are different due to the surroundings: The top stages
have two bus wires next to them, while the bottom stage just has one bus wire,
so there is a bit more room. Some of the horizontal metal wiring is different
(like the wiring in the gap you saw), and power and ground may be offset
slightly. The top stage needs to fit with the register interface circuitry
just above it. My suspicion is that these relatively small changes resulted in
the layouts being visibly fairly different. Maybe I'll do a gate-by-gate
comparison at some point to check.

------
dm319
This is fascinating. Really enjoyed reading this as a lay-person. The only bit
I struggled to follow was matching up the diagrams with the actual
photomicrographs of the circuits - wasn't quite sure what was meant by
'pullup' - is that just a straight connection? Have you any plans to do
something similar for the 68000?

~~~
nidgood
pullup resistors.

When the switch is closed, it creates a direct connection to ground or VCC,
but when the switch is open, the rest of the circuit would be left floating
(i.e., it would have an indeterminate voltage). For a switch that connects to
ground, a pull-up resistor ensures a well-defined voltage (i.e. VCC, or
logical high) across the remainder of the circuit when the switch is open.
Conversely, for a switch that connects to VCC, a pull-down resistor ensures a
well-defined ground voltage (i.e. logical low) when the switch is open.
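
The pull-up case can be sketched as a tiny idealized model. It treats the switch as perfect and ignores the small current through the resistor; in a chip the "switch" would be a transistor. The 5 V supply value and the function name are assumptions for the example.

```python
VCC = 5.0  # assumed supply voltage, in volts
GND = 0.0


def node_voltage(switch_closed, has_pullup=True):
    """Voltage at a node with a switch to ground and an optional pull-up.

    Returns None when the node is floating (indeterminate voltage).
    """
    if switch_closed:
        # A closed switch connects the node directly to ground.
        return GND
    # With the switch open, only the pull-up (if any) defines the voltage.
    return VCC if has_pullup else None


open_with_pullup = node_voltage(switch_closed=False)
```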

[https://en.m.wikipedia.org/wiki/Pull-up_resistor](https://en.m.wikipedia.org/wiki/Pull-up_resistor)

