How the 8086 processor determines the length of an instruction

hyperman1 · on Feb 28, 2023

You're writing these almost faster than I can read them. Thanks

Question: Does the 1BL thing imply that the 8086 is not capable of detecting useless prefixes? If so, are the next two implications correct?

Eg1: lock cs: clc is just treated as clc, and the lock and cs: are ignored?

Eg2: The 8086 has no 16-byte instruction length limit, unlike some successors. So e.g. 16 segment overrides:

Cs: Ds: Es: Ss: Cs: Ds: Es: Ss: Cs: Ds: Es: Ss: Cs: Ds: Es: Ss: mov [1234],5

Is just ss: mov [1234],5

kens · on Feb 28, 2023

I haven't tested a physical chip to verify, but based on my simulations I think you are correct. For your second example, a side effect is that NMI is blocked until the end of the instruction, so you could block the NMI interrupt for an arbitrary amount of time.
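To make the two examples concrete, here is a minimal Python sketch of the behavior being discussed: each prefix byte is consumed on its own and only updates some state, so nothing counts prefixes and a later segment override replaces an earlier one. The prefix byte values are the documented 8086 encodings; the function `scan_prefixes` and its `state` dictionary are hypothetical names for illustration, not a model of the actual microcode. The byte sequence for `mov [1234],5` assumes a word-sized move and a hexadecimal address.

```python
# Prefix byte values from the 8086 instruction set.
SEGMENT_PREFIXES = {0x26: "es", 0x2E: "cs", 0x36: "ss", 0x3E: "ds"}
LOCK = 0xF0
REP_PREFIXES = {0xF2: "repne", 0xF3: "rep"}

def scan_prefixes(code):
    """Consume prefix bytes one at a time; return (state, opcode_index).

    Hypothetical model: every prefix just updates a piece of state and
    no count is kept, so there is no limit on how many may appear.
    """
    state = {"segment": None, "lock": False, "rep": None}
    i = 0
    while i < len(code):
        b = code[i]
        if b in SEGMENT_PREFIXES:
            state["segment"] = SEGMENT_PREFIXES[b]  # later override replaces earlier
        elif b == LOCK:
            state["lock"] = True
        elif b in REP_PREFIXES:
            state["rep"] = REP_PREFIXES[b]
        else:
            break  # first non-prefix byte is the opcode
        i += 1
    return state, i

# Eg1: lock cs: clc  (F0 2E F8). The prefixes are recorded, but clc
# touches neither memory nor the bus lock, so they have no effect.
eg1 = bytes([0xF0, 0x2E, 0xF8])
print(scan_prefixes(eg1))

# Eg2: cs: ds: es: ss: repeated four times, then mov word [1234h],5
# (assumed encoding: C7 06 34 12 05 00).
eg2 = bytes([0x2E, 0x3E, 0x26, 0x36] * 4 + [0xC7, 0x06, 0x34, 0x12, 0x05, 0x00])
state, opcode_index = scan_prefixes(eg2)
print(state["segment"], opcode_index)  # ss 16
```

In this model only the last override, ss:, survives, and the whole sequence is 22 bytes long, past the 15-byte cap that later x86 processors enforce.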

hyperman1 · on Feb 28, 2023

Oh wow. NMI blocked, and presumably other interrupts too? That means filling a 64K segment with cs: prefixes will lock the CPU completely. IP will wrap around forever, and you have created some kind of infinite-sized instruction. That's kind of cool!
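The wraparound idea can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming a 16-bit IP that wraps from 0xFFFF to 0x0000 and modeling nothing but the cs: prefix; `fetch_until_opcode` is a hypothetical helper, and the fetch cap exists only so the sketch terminates.

```python
CS_PREFIX = 0x2E  # cs: segment override prefix

def fetch_until_opcode(segment, ip, max_fetches):
    """Fetch bytes with a wrapping 16-bit IP until a non-prefix byte appears.

    Returns (ip_of_opcode, prefixes_skipped), or (None, max_fetches) if
    no opcode byte was found within the cap.
    """
    for fetches in range(max_fetches):
        if segment[ip] != CS_PREFIX:
            return ip, fetches
        ip = (ip + 1) & 0xFFFF  # IP is 16 bits: 0xFFFF + 1 wraps to 0x0000
    return None, max_fetches

# A 64K segment holding nothing but cs: prefixes never yields an opcode.
segment = bytearray([CS_PREFIX]) * 0x10000
print(fetch_until_opcode(segment, 0x0100, 3 * 0x10000))  # (None, 196608)

# Put a single clc (F8) below the starting IP: it is reached after wrapping.
segment[0x0050] = 0xF8
print(fetch_until_opcode(segment, 0x0100, 3 * 0x10000))  # (80, 65360)
```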

kens · on Feb 28, 2023

Reset happens immediately, so at least that would let you escape.

anyfoo · on Feb 28, 2023

Presumably, yeah. If other interrupts weren't blocked, unless PC is somehow saved to be the address of the prefix(es), upon exiting from the interrupt you'd resume from the "wrong" ("incomplete", lacking its prefix) instruction.

kens · on Feb 28, 2023

Yes, more 8086 microcode. I'm here for any questions...

PaulCarrack · on Feb 28, 2023

Love the work Ken, been reading your articles since the early 2010s. Do you get paid to write these posts and do the research, or is this just a hobby? I wish I could find the time to do something similar, but between family and work I have zero free time to do anything anymore, unlike my 20s, which were mostly time wasted. Would love to know how you do time management if this is your hobby.

kens · on Feb 28, 2023

No, I don't get paid for this. The time-management secret is to retire :-)

mmastrac · on Feb 28, 2023

Do you ever feel like you'd want to go back to work? I've tried a few sabbaticals and always find myself itching to work again.

kens · on Feb 28, 2023

A bit surprisingly, I don't miss it at all. (Except for the Google meals.) I'm enjoying working on a bunch of projects.

sitkack · on Feb 28, 2023

Given the pace, he doesn't have time for a job.

anyfoo · on Feb 28, 2023

I met your group at some informal meeting where Eric was showing his Monster 6502 (it seems ages ago by now), and it definitely gave me a blueprint of how I'd spend my time once I retire. :D

theresLand · on March 5, 2023

I've recently taken an interest in CPU architecture and your wonderful article couldn't have come at a better time for me. Thank you.

Beginner question: In the example of the 3-byte instruction using the immediate value: ADD AX,1234 Is this instruction 3 bytes long because 'ADD AX' is encoded in 1 byte while the immediate value '1234' is two bytes long?

colejohnson66 · on March 16, 2023

Yes. ADD AX,Iw is encoded as the single byte, 0x05. That byte indicates that a two-byte immediate (Iw, which is 1234 here) follows.
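As a small worked example, the three bytes can be decoded by hand. The sketch below uses opcode 0x05, the 8086 encoding for ADD AX with a 16-bit immediate, and assumes the 1234 in the question is hexadecimal; `decode_add_ax_imm16` is a hypothetical helper name. The immediate is stored low byte first, so the bytes in memory are 05 34 12.

```python
ADD_AX_IMM16 = 0x05  # opcode byte for ADD AX, imm16

def decode_add_ax_imm16(code):
    """Decode 'ADD AX, imm16'; return (immediate, instruction_length)."""
    if code[0] != ADD_AX_IMM16:
        raise ValueError("not ADD AX, imm16")
    imm = code[1] | (code[2] << 8)  # little-endian: low byte comes first
    return imm, 3                   # 1 opcode byte + 2 immediate bytes

imm, length = decode_add_ax_imm16(bytes([0x05, 0x34, 0x12]))
print(hex(imm), length)  # 0x1234 3
```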

anyfoo · on Feb 28, 2023

Does the differing prefetch queue size between 8088 and 8086 lead to any significant differences between the Bus Interface Units (or wherever that affects most) of the two chips, or is it basically just a "parameter" in the design that could be tuned without a lot of knock-on effects?

Also:

> If the queue ran empty, the processor waited until more instruction bytes were fetched from memory into the queue.

Does the CPU make any effort to fill up the queue before it runs empty?

kens · on March 1, 2023

I haven't studied the 8088 super-closely. There are a moderate number of changes. For instance, the prefetch registers in the 8088 need to be updated a byte at a time, so they need separate write control lines for the low and high bytes. The logic that counts queue positions also needs changing; it is optimized logic rather than a generic counter. So it's more than just changing a parameter.

As for the CPU making an effort to fill up the queue, the CPU tries to fill up the queue if the bus is idle. But if memory accesses are happening, you're better off doing the memory accesses that you need rather than performing prefetches which could get discarded.
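The priority described here can be sketched as a toy scheduler. This is a heavily simplified illustration, not the Bus Interface Unit's real logic: it assumes the 8086's 6-byte queue and word-wide fetches, treats each bus slot as either needed for a data access or idle, and ignores the execution unit draining the queue. `run_bus_slots` is a hypothetical name.

```python
QUEUE_SIZE = 6   # the 8086 prefetch queue holds 6 bytes (4 on the 8088)
FETCH_WIDTH = 2  # the 8086 fetches a 16-bit word per bus cycle

def run_bus_slots(slots, queue_len=0):
    """slots: sequence of 'mem' (needed for a data access) or 'idle'.

    Returns (actions, final_queue_len). Simplified policy: data accesses
    always win; an idle slot is used for a prefetch only if the queue
    has room for a whole word.
    """
    actions = []
    for slot in slots:
        if slot == "mem":
            actions.append("data access")
        elif queue_len + FETCH_WIDTH <= QUEUE_SIZE:
            queue_len += FETCH_WIDTH
            actions.append("prefetch")
        else:
            actions.append("bus idle")  # queue full, nothing to fetch
    return actions, queue_len

actions, queue_len = run_bus_slots(["idle", "mem", "idle", "idle", "idle"])
print(actions)    # ['prefetch', 'data access', 'prefetch', 'prefetch', 'bus idle']
print(queue_len)  # 6
```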

anyfoo · on March 1, 2023

That makes a lot of sense, thanks.

manv1 · on Feb 28, 2023

So what you're saying is that the 8086 was sort of stack based (like forth), and a given instruction just consumed the number of bytes off the stack it needed, then the assumption was the next thing on the stack was the next instruction?

kens · on Feb 28, 2023

The instruction bytes were in a queue, not a stack, so it's not really like Forth. It's the same as reading the bytes in order from memory except the queue improved performance by reading instructions when the bus was otherwise free.
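The queue-versus-stack distinction is easy to see in a short sketch. The Python below is only an illustration: bytes leave the queue in the same order they sat in memory (first in, first out), and each instruction takes as many bytes as it needs. The `LENGTHS` table is a hypothetical stand-in covering three opcodes; the real 8086 works out the length from the opcode and, where present, the ModR/M byte.

```python
from collections import deque

# Instruction lengths for the few opcodes used here (illustrative table only).
LENGTHS = {
    0xF8: 1,  # clc
    0x05: 3,  # add ax, imm16
    0x90: 1,  # nop
}

def split_instructions(memory_bytes):
    """Consume bytes front-first from a FIFO queue, one instruction at a time."""
    queue = deque(memory_bytes)  # bytes enter in memory order
    instructions = []
    while queue:
        opcode = queue.popleft()  # oldest byte first, unlike a stack's pop()
        operands = [queue.popleft() for _ in range(LENGTHS[opcode] - 1)]
        instructions.append([opcode, *operands])
    return instructions

program = bytes([0xF8, 0x05, 0x34, 0x12, 0x90])  # clc; add ax,1234h; nop
print(split_instructions(program))  # [[248], [5, 52, 18], [144]]
```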

ajross · on March 1, 2023

Not stack based, as there's no storage for a stack (well, beyond the actual program stack in main memory). It's just a state machine.

sitkack · on Feb 28, 2023

Have you started power glitching or using an epitaxial beam to introduce faults in real hardware to confirm your findings?

Are you in communication with https://stevemorse.org/8086/index.html ?

kens · on Feb 28, 2023

I haven't tried glitching an 8086, but I have chatted with Steve Morse, creator of the 8086 architecture.

_2uwr · on March 1, 2023

> https://stevemorse.org/8086/index.html

This is a book I could read in the flesh instead of online. Getting answers to questions like why they did this, and whether it was a constraint of the day or some other reason, could be illuminating.

sitkack · on March 1, 2023

An online book club with kens and stevemorse using this book would be pretty cool. 8086 masterclass.