the researcher wrote:
> Somewhere around the release of the 8086, Intel decided to add a special caveat to instructions loading the SS register...where loading SS with [`pop ss` or `mov ss`] would force the processor to disable external interrupts, NMIs, and pending debug exceptions.
So it's a really, really old piece of documentation, dating from around 1980.
To call it a 'misinterpretation' rather than a vulnerability is extremely generous, given that most Intel engineers spent entire careers in the presence of code vulnerable to this 'misinterpretation' without calling the OS vendors out on their error.
The specific implication of that for a `mov ss; syscall` pair with a hardware breakpoint set on the first instruction is a lot more subtle.
* When an interrupt, debug exception, ... occurs, the CPU pushes stuff (return address, flags, and so on) on the stack as part of delivering it to the handler.
* The stack is managed by 2 registers: SS and (e/r)SP. To change your stack, you have to change both registers. If an interrupt happens and you've changed only 1, stuff gets pushed on an invalid stack and you're toast.
* To fix this, the CPU has a wild card: When you change SS, you get exactly 1 instruction that will not be interrupted. The idea is you use that instruction to change (e/r)SP and make the stack valid again. If there is a need for an interrupt, it will be delayed for 1 instruction.
* Now, this being security, what happens if you use that second instruction to switch to kernel mode? It turns out the delayed interrupt is delivered before the first kernel-mode instruction, but already inside the kernel.
* And you can trigger the right kind of interrupt with debug exceptions and single stepping.
* And if you do this, the kernel tells the debugger not about the debugged program but about the kernel. Oops.
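A minimal sketch of that trigger sequence, in NASM-style x86-64 assembly. It assumes a debugger has already armed a hardware data breakpoint (e.g. DR0) on `ss_save`, a label I made up for illustration; the published proof of concept differs in detail:

```nasm
section .data
ss_save: dw 0          ; the current SS selector gets stashed here

section .text
global _start
_start:
    mov  [ss_save], ss ; save the current stack selector
    ; ... a debugger now arms a data breakpoint (e.g. DR0) on ss_save ...
    mov  ss, [ss_save] ; the data breakpoint raises #DB, but MOV SS
                       ; inhibits its delivery for one instruction
    int3               ; enter the kernel; the pending #DB now fires on the
                       ; first instruction boundary *inside* the kernel
```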
So to fix this, I suppose the kernel checks the debug exception info from the CPU, and if it is debugging the kernel it fixes things up so you go back 1 instruction.
I wonder: why couldn't they make a single instruction to change both SS and SP?
And they can’t really remove the old instructions because of backwards compatibility.
Whitelisting a limited set of instructions that can follow setting SS and making all others trap might be an option, though. It still would break backwards compatibility, but if the effective impact would be negligible, they could deem it acceptable.
“Loading the SS register with a POP instruction suppresses or inhibits some debug exceptions and inhibits interrupts on the following instruction boundary. (The inhibition ends after delivery of an exception or the execution of the next instruction.) This behavior allows a stack pointer to be loaded into the ESP register with the next instruction (POP ESP) before an event can be delivered. See Section 6.8.3, “Masking Exceptions and Interrupts When Switching Stacks,” in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A. Intel recommends that software use the LSS instruction to load the SS register and ESP together.”
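To illustrate the SDM's recommendation, here is a sketch (NASM-style, 32-bit; `new_stack` and its contents are made-up values) of the fragile two-instruction pair versus the atomic LSS load:

```nasm
section .data
new_stack: dd 0xCAFE0000      ; hypothetical new ESP value
           dw 0x10            ; hypothetical stack segment selector

section .text
; the two-instruction pair, protected only by the inhibition window:
mov  ss, [new_stack + 4]      ; selector half; events inhibited one instruction
mov  esp, [new_stack]         ; offset half; must be the very next instruction

; the recommended alternative: one instruction loads both halves atomically
; from a 48-bit far pointer (32-bit offset followed by a 16-bit selector)
lss  esp, [new_stack]
```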
EDIT: Although I guess it's almost certain to be an error anyway if it's not a whitelisted instruction.
Albeit, they could have made a combined `sti`-and-`hlt` instruction here as well.
A trace exception shouldn't be creating work for a thread context that's ostensibly the one going to sleep with nothing to do, but an external interrupt very likely is.
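For what it's worth, `sti` already has a one-instruction interrupt shadow of its own, which is what makes the classic idle sequence safe without any fused instruction; a minimal sketch:

```nasm
; with interrupts currently disabled (IF=0):
sti     ; set IF, but recognition of interrupts is inhibited until
        ; after the *next* instruction completes
hlt     ; halt; the wake-up interrupt cannot slip in between sti and hlt,
        ; so the thread never sleeps through the event it was waiting for
```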
Maybe they didn't think about it, and fixed it with a quick hack once they became aware of the problem? The whole segment register story was very hacky from the start.
Intel could have dumped the segment registers in i386 32-bit protected mode, as they cleaned up a lot of other troublesome corners around that time. But, well, they didn't, so we have to deal with it today.
* There's an old feature which causes POP SS/MOV SS instructions to delay all interrupts until the next instruction has executed, to safely allow changing both SS and SP without an interrupt firing in between on a bad stack.
* If such an instruction itself raises a debug exception (by triggering a memory breakpoint through the debug registers), that exception is delayed too (as intended).
* The delayed interrupt will fire after the second instruction even if the second instruction disabled interrupts.
* By means of the above, a MOV SS instruction triggering a #DB followed by an INT n instruction will cause the #DB exception to fire before the first instruction of the interrupt handler, even though this should be impossible (as entering the handler sets IF=0, disabling interrupts).
* The OS #DB handler assumes GS has been fixed up by the previous interrupt handler, which is now under user control.
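A rough sketch of the kind of CS-based SWAPGS decision a kernel entry path makes, and why the delayed #DB defeats it (same NASM-style assembly; `debug_entry` is a hypothetical label, and this is illustrative, not the actual Linux entry code):

```nasm
debug_entry:                 ; #DB pushes no error code, so the frame is
    test byte [rsp + 8], 3   ; RIP at [rsp], CS at [rsp+8]; CS low 2 bits = CPL
    jz   .from_kernel        ; CPL 0: assume the kernel GS base is in place
    swapgs                   ; CPL 3: switch from the user to the kernel GS base
.from_kernel:
    ; problem: the delayed #DB arrives with kernel CS (delivery happens
    ; after the INT/SYSCALL transition), so swapgs is skipped even though
    ; the GS base still belongs to the user process
```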
I hope we learn a lot, and take the time to record the experience, for upcoming platforms like RISC-V and others.
Why is there no big CAVEATS document from Intel detailing weird quirks? I strongly suspect the Intel arch engineers are well aware of many of those counter-intuitive behaviours in their products.
But it is likely just kind of distributed, organic knowledge that is hard to condense into a single document. Writing and maintaining such a thing would be a significant project, and (I am speculating here) not the kind of thing that significantly burnishes anyone's performance review.
That said, the whole community of assembly hackers has even broader knowledge of the topic, and could start such a document out in the open. Intel engineers might well contribute their own two cents. (Unless lawyers forbid it.)
This is going to be a busy day!
In this case it seems they just didn't properly specify a piece of insane behaviour though. Hell, I'd consider it an outright CPU bug if I'm reading this right. Seemingly there's a "feature" where loading SS causes interrupts to be delayed until after the next instruction, even if the next instruction disables interrupts - so you can cause an interrupt to fire on the first instruction of the handler (where it should be impossible).
1. The CPU overruns the buffer when reading the array of bits used to determine I/O permission for instructions like `in` and `out`. Every OS which supports the feature has to add an extra byte of 0xFF beyond the end of the array (see the sketch after this list).
2. Returning from a 32-bit OS to a 16-bit process will only update the low 16 bits of the stack pointer. The upper 16 bits can still be read, leaking info about the kernel stack. Linux has a complicated work-around called espfix.
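A sketch of the workaround for quirk 1, assuming a flat all-ports-allowed bitmap (NASM-style data definitions; the surrounding TSS layout is elided):

```nasm
; TSS I/O permission bitmap covering all 65536 ports
io_bitmap:  times 8192 db 0x00  ; 65536 ports / 8 ports per byte: all permitted
            db 0xFF             ; terminator byte the CPU may read past the
                                ; bitmap's nominal end; omit it and the CPU
                                ; reads whatever memory happens to follow
```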
Mayhaps much like legalese.
Just to clarify, this is kernel code. Listing 3 different (+ "other") Linux distros as affected is kinda bogus: it's not that they all made the same mistake, they just all use the same kernel.
Yet the article's title makes it seem like it was the OS implementors' fault instead of Intel's.
I wonder why the website tries to shift or soften Intel's fault on this?
Seems like there's a trend in news articles lately to have titles incoherent with their content; it's really annoying...
“If a sequence of consecutive instructions each loads the SS register (using MOV or POP), only the first is guaranteed to inhibit or suppress events in this way.”
So, we still don’t know. For all I know, it may even depend on the exact CPU used or the CPU state.
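Concretely, the wording leaves the second consecutive load open-ended; a sketch of the ambiguous sequence (registers chosen arbitrarily):

```nasm
mov  ss, ax    ; first load: guaranteed to inhibit events for one instruction
mov  ss, ax    ; second consecutive load: inhibition no longer guaranteed...
mov  esp, ebx  ; ...so on some CPUs an event might be delivered before this
               ; instruction, mid-way through the stack switch
```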
Is this a fair comparison? I feel like the patching techniques must have been easier to develop than for Meltdown/Spectre. Furthermore, since this affected the same kind of people in this community, maybe this time around they benefited from the communication channels built during the previous exercises.
Maybe this isn't a comparison meant to badmouth the previous iteration, but instead just tries to show a general improvement in the industry. I just find it a bit unfair.
And yes, it is all due to the horrible Meltdown/Spectre problem and how that was handled. We were not allowed to work together on that problem, and we do not want that to happen again.
How do you mean?
The other BSDs have different kernels and vastly different development histories as well.
In fact, the Linux fix was a patch I wrote in 2015 (for unrelated reasons) and just never got around to upstreaming.
On the other hand, it is somewhat surprising, because loading anything into SS is a mostly pointless operation.
Where was the security testing at the OS level? Why can't there be automated test suites that catch unauthorized access issues before ship (if not before merge commit)? If your vendor delivers an insecure product and you don't discover it, how much blame do you share?
Usually the search space is too large.
Basic fuzzing probably wouldn’t catch this; as the other comments point out, the search space is probably too large, and the set of vulnerable executions is probably too small for an undirected random search.
If they can’t sign off on the security/correctness of a pile of code, they delete it, even if it means losing functionality.
In this case, they simply didn’t invoke the incomprehensible instruction of doom.