
Speculative execution, variant 4: speculative store bypass - brandon
https://bugs.chromium.org/p/project-zero/issues/detail?id=1528
======
kashyapc
If you are using Linux-based virtualization (KVM), besides requiring updated
kernel and Intel microcode (which is not yet available), you would also need
updates for relevant layers: QEMU and libvirt. Patches are posted[1][2].

Virtual machines now need to have a new Intel CPU feature flag exposed to
them: 'ssbd' (Speculative Store Bypass Disable).
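
As a rough illustration (not part of the posted patches), a guest could confirm that the flag actually arrived by scanning the flags line of /proc/cpuinfo. A minimal C sketch, assuming the guest kernel reports the feature under the name `ssbd`; `has_cpu_flag` is a made-up helper name:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if `flag` appears as a whole,
 * whitespace-separated word in a "flags : ..." line from /proc/cpuinfo.
 * Whole-word matching matters because other flag names may merely
 * contain the string "ssbd". */
bool has_cpu_flag(const char *flags_line, const char *flag)
{
    size_t len = strlen(flag);
    if (len == 0)
        return false;

    const char *p = flags_line;
    while ((p = strstr(p, flag)) != NULL) {
        /* The match must start at the line start or after whitespace... */
        bool starts_word = (p == flags_line) || p[-1] == ' ' || p[-1] == '\t';
        /* ...and must be followed by whitespace or the end of the line. */
        char end = p[len];
        bool ends_word = end == '\0' || end == ' ' || end == '\t' || end == '\n';
        if (starts_word && ends_word)
            return true;
        p++; /* keep searching after this partial match */
    }
    return false;
}
```

Inside the guest, each line read from /proc/cpuinfo with fgets() can be passed to the helper; a line starting with "flags" that yields true means the flag is visible to the VM.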

On microcode, from Red Hat's blog post[3]:

 _In many (but not all) cases, full mitigation will also require updated
microcode from the system microprocessor vendor. Red Hat intends to ship
updated microcode as a convenience to our customers as it is made available to
us. In the interim, customers are strongly advised to contact their OEM, ODM,
or system manufacturer to receive this via a system BIOS update._

[1] [https://www.redhat.com/archives/libvir-
list/2018-May/msg0156...](https://www.redhat.com/archives/libvir-
list/2018-May/msg01560.html)

[2] [https://lists.gnu.org/archive/html/qemu-
devel/2018-05/msg047...](https://lists.gnu.org/archive/html/qemu-
devel/2018-05/msg04795.html)

[3] [https://www.redhat.com/en/blog/speculative-store-bypass-
expl...](https://www.redhat.com/en/blog/speculative-store-bypass-explained-
what-it-how-it-works)

------
my123
AMD guidance:

[https://developer.amd.com/wp-
content/resources/124441_AMD64_...](https://developer.amd.com/wp-
content/resources/124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf)

(Setting a CPU-specific MSR is all that is needed for current CPUs; no
microcode updates are required.)

[https://www.amd.com/en/corporate/security-
updates](https://www.amd.com/en/corporate/security-updates) says: "We have not
identified any AMD x86 products susceptible to the Variant 3a vulnerability in
our analysis to-date."

~~~
tedunangst
I like that it's specex variant 4 and spectre variant 3. Keeps everybody
sharp.

~~~
bonzini
Special register read is called "variant 3a" because it allows you to break
the privilege level separation like Meltdown and, back in November, Meltdown
was called "variant 3". Variant 1 was conditional-branch Spectre (speculative
out of bounds accesses) while variant 2 was indirect-branch Spectre (the one
that could be used to read host memory from a virtual machine).

------
ENOTTY
These are the links I found most explanatory

[https://bugs.chromium.org/p/project-
zero/issues/detail?id=15...](https://bugs.chromium.org/p/project-
zero/issues/detail?id=1528)

[https://software.intel.com/sites/default/files/managed/b9/f9...](https://software.intel.com/sites/default/files/managed/b9/f9/336983-Intel-
Analysis-of-Speculative-Execution-Side-Channels-White-Paper.pdf)

[https://software.intel.com/sites/default/files/managed/c5/63...](https://software.intel.com/sites/default/files/managed/c5/63/336996-Speculative-
Execution-Side-Channel-Mitigations.pdf)

[https://blogs.technet.microsoft.com/srd/2018/05/21/analysis-...](https://blogs.technet.microsoft.com/srd/2018/05/21/analysis-
and-mitigation-of-speculative-store-bypass-cve-2018-3639/)

[https://developer.amd.com/wp-
content/resources/124441_AMD64_...](https://developer.amd.com/wp-
content/resources/124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf)

[https://www.intel.com/content/www/us/en/security-
center/advi...](https://www.intel.com/content/www/us/en/security-
center/advisory/intel-sa-00115.html) uCode update is only for variant 3a (MSR
read) and for the global disable bit in the MSR. The standard mitigation is
still LFENCE.

[https://docs.microsoft.com/en-us/cpp/security/developer-
guid...](https://docs.microsoft.com/en-us/cpp/security/developer-guidance-
speculative-execution) vulnerable code examples
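
To make the LFENCE remark concrete, here is a minimal C sketch of the store-then-load pattern with a fence in between. It assumes GCC or Clang inline assembly. `speculation_barrier` and `replace_and_read` are illustrative names, not taken from any of the linked documents. On non-x86 targets the barrier degrades to a compiler-only barrier.

```c
#include <stdint.h>

/* Illustrative barrier. On x86, LFENCE keeps later instructions from
 * starting until earlier ones have completed locally, which is why it is
 * suggested as a software mitigation between a store and a dependent load. */
static inline void speculation_barrier(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("lfence" ::: "memory");
#else
    __asm__ __volatile__("" ::: "memory"); /* compiler barrier only */
#endif
}

/* Overwrites *slot with `fresh`, then reads through the slot.
 * Without the barrier, the CPU may speculatively execute the load with the
 * stale pointer that was in *slot before the store took effect. */
uint8_t replace_and_read(const uint8_t **slot, const uint8_t *fresh)
{
    *slot = fresh;          /* the store that could be bypassed */
    speculation_barrier();  /* hold the dependent load behind the store */
    return **slot;          /* load through the updated pointer */
}
```

The fence costs performance on every call, so it would normally be placed only where a stale value could steer a later, attacker-observable access.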

------
swonderl
Explained: [https://www.redhat.com/en/blog/speculative-store-bypass-
expl...](https://www.redhat.com/en/blog/speculative-store-bypass-explained-
what-it-how-it-works)

~~~
gruez
This is actually less clear (at least for me) than the project zero post.

~~~
jaytaylor
Can you share the link? I found the redhat article clearer than the current
chromium FP post :)

---

edit: I wasn't able to find anything new since Jan 3rd about the speculative
bypass from Project Zero.

Some additional articles about the newly revealed Variant 4:

[https://wiki.ubuntu.com/SecurityTeam/KnowledgeBase/Variant4](https://wiki.ubuntu.com/SecurityTeam/KnowledgeBase/Variant4)

[https://xenbits.xen.org/xsa/advisory-263.html](https://xenbits.xen.org/xsa/advisory-263.html)

[https://www.cnet.com/news/intel-microsoft-reveal-new-
variant...](https://www.cnet.com/news/intel-microsoft-reveal-new-variant-on-
spectre-meltdown-chip-security-flaws/)

[https://newsroom.intel.com/editorials/addressing-new-
researc...](https://newsroom.intel.com/editorials/addressing-new-research-for-
side-channel-analysis/)

~~~
gruez
>Can you share the link? I found the redhat article clearer than the current
chromium FP post :)

I was talking about [https://bugs.chromium.org/p/project-
zero/issues/detail?id=15...](https://bugs.chromium.org/p/project-
zero/issues/detail?id=1528)

------
cesarb
A commenter over at arstechnica ([https://arstechnica.com/gadgets/2018/05/new-
speculative-exec...](https://arstechnica.com/gadgets/2018/05/new-speculative-
execution-vulnerability-strikes-amd-arm-and-intel/?comments=1&post=35370251))
found an old article explaining the optimization which led to this
vulnerability: "Faster Load Times - Intel Core versus AMD's K8 architecture"
[https://www.anandtech.com/show/1998/5](https://www.anandtech.com/show/1998/5)

------
pedro84
Additional vendor info:

[https://developer.arm.com/support/arm-security-
updates/specu...](https://developer.arm.com/support/arm-security-
updates/speculative-processor-vulnerability)

[https://blogs.technet.microsoft.com/srd/2018/05/21/analysis-...](https://blogs.technet.microsoft.com/srd/2018/05/21/analysis-
and-mitigation-of-speculative-store-bypass-cve-2018-3639/)

[https://www.intel.com/content/www/us/en/security-
center/advi...](https://www.intel.com/content/www/us/en/security-
center/advisory/intel-sa-00115.html)

------
exikyut
Possibly completely unrelated question (this stuff is firmly over my head):
toward the end of the first PoC there's

    
    
          /* if we don't break the loop after some time when it doesn't
      work, in NO_INTERRUPTS mode with SMP disabled, the machine will lock
      up */
    

The bit at the top that says

    
    
      ======== Demo code (no privilege boundaries crossed) ========
    

is suggestive and unambiguous, but the program executions show (with "$"s)
that this is being executed as non-root.

So... is this deadlock fundamentally related to the speculative execution
glitch(es)?

~~~
geogriffin
You wouldn't be able to disable interrupts as non-root. The iopl syscall
allows the PoC to use CLI to disable interrupts. See the "sudo" in the
NO_INTERRUPTS runs:

    
    
      $ gcc -o test test.c -Wall -DHIT_THRESHOLD=50 -DNO_INTERRUPTS
      $ sudo ./test
    

I would guess the deadlock is due to a hardware watchdog timer rebooting the
system, or some other hardware function that needs to be tended to
periodically before it hangs.

~~~
pdkl95
It doesn't look like a deadlock; turning off interrupts prevents the
preemptive scheduler from running. Without a timer interrupt, the only way the
scheduler would run is if it's invoked to put the process to sleep during a
blocking syscall or explicitly with sched_yield(2), pthread_yield(3), _etc_.

If interrupts are off, the PoC program might wait forever for "hits > 32"
if testfun() never detects a "hit". Giving up after 1M busy loops
("cycles < 1000000") should prevent this from happening... _but_... I
wonder...

    
    
        gcc -o test test.c -Wall -DHIT_THRESHOLD=50 -DNO_INTERRUPTS
    

Without -O0, some optimizations are still enabled. Could a modern "clever"
optimizing compiler assume that the speculative "hit" never happens and
therefore conclude that "cycles" is only used _after_ the loop when it is
"guaranteed" to have the value 1000000 and "optimize" the loop into something
like

    
    
        /*long cycles = 0;*/ //DEAD
        while (hits < 32 /*&& cycles < 1000000*/) { //DEAD
            // ... rest of loop body, maybe?
            /* cycles++; */ //DEAD
            pipeline_flush();
        /*}*/ //DEAD
    

and the sprintf() into something like:

    
    
        sprintf(out_, "%c: %s in 1000000 cycles (hitrate: %f%%)\n",
            secret_read_area[idx], results,  100*hits/(double)(1000000));
    

I'm probably worrying about nothing. Or at least I _should_ be worrying about
nothing, but with the current trend of "clever" optimizers exploiting
everything they think is provable, I'm no longer certain. _bleh_

~~~
caf
The pipeline_flush() asm block has a "memory" clobber which will certainly
prevent this kind of optimisation.
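
For reference, a stripped-down C sketch of that loop shape. This is not the PoC itself: `flush_stub`, `probe_fn`, `run_probe_loop` and the two toy probes are hypothetical stand-ins, and the asm body is left empty so that only the "memory" clobber is on display (GCC/Clang syntax assumed).

```c
/* Shared counter, bumped by the probe whenever it sees a "hit". */
static int hits;

/* Stand-in for the PoC's pipeline_flush(): an empty asm statement whose
 * "memory" clobber tells the compiler that memory may have changed, so
 * values cached in registers must be re-read afterwards. */
static inline void flush_stub(void)
{
    __asm__ __volatile__("" ::: "memory");
}

typedef void (*probe_fn)(void);

/* Runs `probe` until 32 hits are seen or `max_cycles` iterations have
 * passed, and returns the number of iterations actually executed. */
long run_probe_loop(probe_fn probe, long max_cycles)
{
    long cycles = 0;
    hits = 0;
    while (hits < 32 && cycles < max_cycles) {
        probe();
        cycles++;
        flush_stub(); /* compiler must assume `hits` may have changed */
    }
    return cycles;
}

/* Two toy probes for trying the loop out. */
void probe_always_hits(void) { hits++; }
void probe_never_hits(void) { /* never records a hit */ }
```

With a probe that never hits, the loop still ends after `max_cycles` iterations, which is the give-up behaviour the PoC comment describes.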

------
rbanffy
The MS advisory: [https://portal.msrc.microsoft.com/en-US/security-
guidance/ad...](https://portal.msrc.microsoft.com/en-US/security-
guidance/advisory/ADV180012)

