
Intel Analysis of Speculative Execution Side Channels [pdf] - bcantrill
https://newsroom.intel.com/wp-content/uploads/sites/11/2018/01/Intel-Analysis-of-Speculative-Execution-Side-Channels.pdf
======
bcantrill
I was surprised that this wasn't already submitted -- and then surprised again
that it's not being upvoted. To give some color: this doc contains the first
public disclosure of some new MSRs being added to allow system software to
help mitigate Spectre. In particular, these are the controls to limit
speculation including Indirect Branch Restricted Speculation (IBRS) to
restrict speculation of indirect branches; Single Thread Indirect Branch
Predictors (STIBP) to prevent indirect branch predictions from being
controlled by the sibling hyperthread (!!) and Indirect Branch Predictor
Barrier (IBPB) to limit influence on later indirect branch predictions. For
those who need to implement system software support for the new microcode,
this is very important information!
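
For anyone wiring this up, here is a minimal sketch of the enumeration and control bits. The MSR addresses and bit positions are taken from Intel's public documentation of these controls rather than from this thread, and the helper names (`decode_speculation_controls`, `spec_ctrl_value`) are hypothetical, purely for illustration. It only decodes and composes integers; actually reading CPUID or writing an MSR needs privileged, platform-specific code that is not shown.

```python
# MSR addresses and bit positions as published by Intel for the
# speculation controls; helper names below are illustrative only.
IA32_SPEC_CTRL = 0x48  # read/write control MSR
IA32_PRED_CMD = 0x49   # write-only command MSR

SPEC_CTRL_IBRS = 1 << 0   # Indirect Branch Restricted Speculation
SPEC_CTRL_STIBP = 1 << 1  # Single Thread Indirect Branch Predictors
PRED_CMD_IBPB = 1 << 0    # Indirect Branch Predictor Barrier

# Enumeration bits in CPUID.(EAX=7,ECX=0):EDX
CPUID_EDX_IBRS_IBPB = 1 << 26
CPUID_EDX_STIBP = 1 << 27


def decode_speculation_controls(cpuid_7_0_edx: int) -> dict:
    """Report which controls a CPUID leaf-7 EDX value advertises."""
    return {
        "IBRS": bool(cpuid_7_0_edx & CPUID_EDX_IBRS_IBPB),
        "IBPB": bool(cpuid_7_0_edx & CPUID_EDX_IBRS_IBPB),
        "STIBP": bool(cpuid_7_0_edx & CPUID_EDX_STIBP),
    }


def spec_ctrl_value(ibrs: bool, stibp: bool) -> int:
    """Compose the value system software would write to IA32_SPEC_CTRL."""
    return (SPEC_CTRL_IBRS if ibrs else 0) | (SPEC_CTRL_STIBP if stibp else 0)
```

IBRS and IBPB share one enumeration bit, which is why the sketch reports them together; IBPB is issued by writing bit 0 of `IA32_PRED_CMD` rather than by setting a sticky mode bit.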

~~~
JdeBP
Actually, thanks to the messy uncoördinated premature disclosure, the MSRs
became publicly known yesterday through code comments, commit messages, Google
Docs, and linux-kernel mailing list discussions.

* [https://news.ycombinator.com/item?id=16072009](https://news.ycombinator.com/item?id=16072009)

* [https://lkml.org/lkml/2018/1/4/432](https://lkml.org/lkml/2018/1/4/432)

* [https://news.ycombinator.com/item?id=16072775](https://news.ycombinator.com/item?id=16072775)

* [https://lkml.org/lkml/2018/1/4/615](https://lkml.org/lkml/2018/1/4/615)

* [https://news.ycombinator.com/item?id=16072806](https://news.ycombinator.com/item?id=16072806)

* [https://news.ycombinator.com/item?id=16075082](https://news.ycombinator.com/item?id=16075082)

A more pertinent observation is that this is the sort of technical paper that
Intel should have published _before_ the press releases that yielded such a
backlash.

* [https://news.ycombinator.com/item?id=16076601](https://news.ycombinator.com/item?id=16076601)

* [https://medium.com/@frankycaron/this-week-in-words-the-langu...](https://medium.com/@frankycaron/this-week-in-words-the-language-of-meltdown-and-spectre-126bf109e5c5) ([https://news.ycombinator.com/item?id=16075588](https://news.ycombinator.com/item?id=16075588))

~~~
bonzini
It's not due to the premature disclosure.

As far as Linux is concerned, the disclosure of Spectre and the hardware
mitigations was _uncoordinated by design_. During the embargo period, distros
were kept siloed and each more or less left on its own, assuming they were
part of the early disclosure at all. This sucks, but until the embargo was
lifted we were pretty much forced to play along. We did not even know who knew
what, so we couldn't do anything about it.

As a result, distros all differ in the number of tunables that they provide,
in the exact behavior, and in the performance hit that you can expect from
fixing CVE-2017-5715. Assuming it's fixed at all (it's not in either Debian or
Fedora, for example).

The silver lining is that all discussions on the design choices for the fixes
are going to be public. Anyway, based on some tweets from Alex Ionescu, it
seems that these MSRs are what Windows uses (and RHEL as well, which is what I
worked on).

~~~
JdeBP
You are talking about the phase where stuff was _not_ publicly known. I talked
about the point of it becoming publicly known.

The way that it actually happened, rather than being the way that Intel
_expected_ it to happen with this paper all nicely ready on the first day
(like Google's happened to be ahead of time), was pretty clearly unfortunately
very much down to the premature disclosure. Witness what Paul Turner said
(hyperlinked from one of the aforegiven discussions) about Google's and
Intel's original goals for the week, for example.

One can speculate about an alternative universe where Linus Torvalds had this
to read _before_ the press releases. There's a whole chapter that addresses
the points that he raised.

* [https://lkml.org/lkml/2018/1/3/797](https://lkml.org/lkml/2018/1/3/797) ([https://news.ycombinator.com/item?id=16066968](https://news.ycombinator.com/item?id=16066968))

It is interesting that you bring up M. Ionescu. I'll see M. Cantrill's
surprise at no-one commenting on Intel's paper, albeit that this is becoming
less and less true by the minute by dint of posts like yours and mine, and
I'll raise him no-one on Hacker News commenting _at all_ on Microsoft; even
though it was a more prominent topic of discussion in my workplace today (when
the office systems administrator found out that updates were not happening)
than any Linux machine or BSD. (-:

* [https://news.ycombinator.com/item?id=16076660](https://news.ycombinator.com/item?id=16076660)

~~~
bonzini
I am sure Linus knew about this before, just as most of the things in the
paper were disclosed to me (in .pptx format).

Based on how the early disclosure was handled, I would have expected the same
mess even without the emergency premature lifting of the embargo. Maybe a
little less scrambling, but the same confusion, plenty of discussions on 0-day
and no fixes for noncommercial distros. Everything else is wishful thinking.

------
wilun
Intel seems committed to writing everywhere that their current processors work
as intended and according to their specification. I'm not mad at them for
Spectre, but Meltdown is ridiculous. If this is the quality of specification
they want to reach, I'll use processors from another company that seems to
have saner and less buggy specifications.

------
infamouscow
> These methods rely on common properties of both high-performance
> microprocessors and modern operating systems and susceptibility is not limited
> to Intel processors, nor does it imply the processor is working outside its
> intended functional specification.

Given a class-action was filed after the Project Zero reveal [1], I suspect
Intel's lawyers are doing everything they can to avoid any indication there is
a defect in their chips.

[1]: [https://www.courthousenews.com/wp-content/uploads/2018/01/In...](https://www.courthousenews.com/wp-content/uploads/2018/01/Intel-Flaws-COMPLAINT.pdf)

~~~
mannykannot
One somewhat self-serving justification Intel might make for this claim is
that the functional specification does not say anything about the specific
behavior that is being used by the exploits.

In addition, functional specifications are sometimes modified to conform to
as-built behavior. When used responsibly, this is a reasonable response, and
in practice unavoidable for something as complex as a modern processor.

On the other hand, Intel's statements are silent about what can be inferred
from the effort they and others are putting into mitigating this feature.

Tangentially, the Titanic sailed with the number of lifeboats it was designed
to carry.

------
puddums
As far as I can tell, one bit of news to me at least in this Intel whitepaper
from today is that the microcode update to mitigate “variant #2” would be
needed for Broadwell+, rather than the Skylake+ that had been stated yesterday
on LKML. From Intel's PDF today:

 _" For Intel® Core™ processors of the Broadwell generation and later, this
retpoline mitigation strategy also requires a microcode update to be applied
for the mitigation to be fully effective."_

vs. what I had seen on the LKML list yesterday, which seemed to indicate
Skylake+.

Sample snippet from LKML[1]:

 _" The x86 IBRS feature requires corresponding microcode support. It
mitigates the variant 2 vulnerability..."_

and related sample snippet from LKML[2]:

 _" On Skylake the target for a 'ret' instruction may also come from the BTB.
So if you ever let the RSB (which remembers where the 'call's came from) get
empty, you end up vulnerable.

Other than the obvious call stack of more than 16 calls in depth, there's also
a big list of other things which can empty the RSB, including an SMI.

Which basically makes retpoline on Skylake+ very hard to use reliably. The
plan is to use IBRS there and not retpoline."_

I'll confess I'm not 100% following all the ins and outs of this, but can
anyone comment on any additional details regarding the Skylake+ vs.
Broadwell+, and/or confirm if there was seemingly a change?

[1] [https://lkml.org/lkml/2018/1/4/615](https://lkml.org/lkml/2018/1/4/615)

[2] [https://lkml.org/lkml/2018/1/4/708](https://lkml.org/lkml/2018/1/4/708)
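
To make the RSB point in the second quote concrete, here is a toy model of the "more than 16 calls in depth" case. It is only a sketch: the depth of 16 comes from the quoted message, the function name `count_rsb_underflows` is made up for illustration, and a real return stack buffer has other ways of being emptied (the quote mentions SMIs) that are not modelled here.

```python
from collections import deque

RSB_DEPTH = 16  # depth taken from the quoted LKML message


def count_rsb_underflows(call_depth: int, rsb_depth: int = RSB_DEPTH) -> int:
    """Toy model: nest `call_depth` calls, then return from all of them.

    Returns how many 'ret' predictions found the RSB empty and would, on
    the parts described in the quote, fall back to the BTB instead.
    """
    rsb = deque(maxlen=rsb_depth)  # oldest entries are overwritten when full
    for return_address in range(call_depth):
        rsb.append(return_address)  # each 'call' pushes a return address
    underflows = 0
    for _ in range(call_depth):
        if rsb:
            rsb.pop()  # 'ret' predicted from the RSB
        else:
            underflows += 1  # RSB empty: prediction comes from elsewhere
    return underflows
```

With a call chain of 20, the last four returns find the buffer empty, and those are the returns whose targets an attacker-influenced BTB could supply on the affected parts.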

~~~
makomk
Presumably they've found a way to make retpoline work on Broadwell using a
microcode update, which is probably better than the alternative of adding a
very expensive kludged way of clearing the indirect branch cache in a
microcode update.

------
vsrinivas
Control Flow Enforcement (ENDBRANCH requirement at branch targets) looks like
a nice feature, looking forward to it.

~~~
jgowdy
Agreed, which is why I’m worried about retpoline: it’s a rather hacky (but
obviously pragmatic) solution that isn’t compatible with shadow stacks, so
mitigation strategies like this one won’t be compatible with it.

------
default-kramer
It almost seems like Intel is saying that this is the new normal. That future
Intel CPUs will have the same weaknesses, and it's up to software (compilers,
OS authors) to deal with it. Or am I speculating incorrectly?

~~~
userbinator
It's almost like Intel is taking the position that side-channels are nearly
impossible to prevent if you're running adversarial code on the same hardware,
which makes sense to me; there are certainly those who don't need to resist
such attacks, but need the performance benefits of speculative execution.

~~~
default-kramer
I'm sure you could do speculative execution without causing side effects that
software can observe. Whether it can be done cost-effectively, I can't say.

------
bogomipz
Section 2.2.1 states:

>"An attacker discovers or causes the creation of ‘confused deputy’ code which
allows the attacker to cause speculative operations to reveal information not
normally accessible to the attacker."

Can someone say what ‘confused deputy’ code means here? This is not a term I
have ever come across before.

The authors then go on to state:

>"If the attacker can identify an appropriate ‘confused deputy’ in a more
privileged level, the attacker may be able to exploit that deputy in order to
deduce the contents of memory accessible to that deputy but not to the
attacker."

Here the reference is to an "appropriate 'confused deputy'". Are these just
weasel words? Can someone shed some light on "confused deputies" and what
makes one "appropriate"?

~~~
detaro
[https://en.wikipedia.org/wiki/Confused_deputy_problem](https://en.wikipedia.org/wiki/Confused_deputy_problem)

That said, I don't think it's a good label for what's going on.

~~~
tlb
It's sort of victim blaming: the code is wrong for having instructions
corresponding to a[*b] anywhere in it, rather than our processor being wrong
for speculatively executing them with visible effects.
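
For readers who haven't seen the pattern, the a[*b] shape is a load whose address depends on another load. Below is a toy simulation of how that leaks when it runs speculatively past a bounds check. Everything in it is an assumption made for illustration: the `ToyMachine` class, the 4096-byte stride, and modelling the cache as a Python set. It is not an exploit and measures no real timing.

```python
CACHE_LINE_STRIDE = 4096  # hypothetical spacing between probe entries


class ToyMachine:
    """Toy model of a bounds-check bypass; not a real exploit."""

    def __init__(self, public: bytes, secret: bytes):
        # Secret bytes sit right after the public array in "memory".
        self.memory = public + secret
        self.public_len = len(public)
        self.cached = set()  # probe-array offsets currently "in cache"

    def victim(self, index: int, mispredict: bool):
        """Models: if (index < len) y = probe[a[index] * STRIDE];"""
        in_bounds = index < self.public_len
        if in_bounds or mispredict:
            # The load runs architecturally, or only speculatively when the
            # bounds check was mispredicted; either way the cache is touched.
            self.cached.add(self.memory[index] * CACHE_LINE_STRIDE)
        # Architecturally, an out-of-bounds call returns nothing.
        return self.memory[index] if in_bounds else None

    def probe(self):
        """Attacker side: a 'fast' probe line reveals the byte value."""
        hits = [offset // CACHE_LINE_STRIDE for offset in self.cached]
        self.cached.clear()  # flush before the next round
        return hits


def leak(machine: ToyMachine, count: int) -> bytes:
    """Recover `count` bytes past the public array via the cache state."""
    out = bytearray()
    for i in range(count):
        machine.cached.clear()
        result = machine.victim(machine.public_len + i, mispredict=True)
        assert result is None  # nothing leaks architecturally
        out.append(machine.probe()[0])
    return bytes(out)
```

The victim never returns an out-of-bounds byte, yet `leak` reconstructs the secret from which probe line ended up cached, which is the "visible effect" being argued about.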

~~~
fyi1183
Those visible effects (caches being primed by speculative execution) are
desirable in many cases though, so it's misleading to say the processor is
wrong.

------
herf
This seems complicated, but my impression is it would mean:

a) a super-smart compiler that can anticipate lots of flaws and fence them or
turn on restrictive modes (maybe V8 is this smart?)

b) you have to turn off more speculation than you wanted to, and it hits
performance

SMAP looks cool, seems like same origin policy for pages, but this is more
Meltdown than Spectre right?

~~~
jbfoo
Problem is that, as far as I understand, compiler "fixes" can only fix the
attacked application. So if JavaScript code is exploiting Spectre to read
passwords from your keepassx, the fixes need to be applied to keepassx and not
to the V8 engine.

(Also, you can probably patch the V8 interpreter to mitigate this issue, but
this is a different story)

~~~
pas
No, it's V8 that has to make sure the privilege check variable is in L1d cache
when the if happens.

KeePass uses HTTP, and by the time it sees the request, it cannot do much if
it's valid.

~~~
bennofs
You would not send requests to keepass in order to leak passwords. Instead,
you would try to set up the branch prediction cache in such a way that, during
ordinary execution, keepass will cause cache accesses dependent on secret
data because of the code that is speculatively executed due to indirect branch
prediction (you set up the branch prediction cache in such a way that it
executes your "gadget" to leak things). So yes, assuming you manage to get
enough control over the addresses of things in memory via javascript (may be
hard, but there are known ways to defeat ASLR via javascript as well), I think
you should be able to attack keepass even if V8 fixed it.
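
A toy sketch of that "set up the branch prediction cache" step, under loud assumptions: real predictor indexing is undocumented, so the 12-bit index, the `ToyBTB` class and all addresses below are invented for illustration. The only point is that a predictor shared across contexts and indexed by partial address bits lets one program choose where another speculates.

```python
BTB_INDEX_BITS = 12  # hypothetical: real predictor indexing is undocumented


class ToyBTB:
    """Branch target buffer shared by attacker and victim code."""

    def __init__(self):
        self.entries = {}

    @staticmethod
    def _index(branch_address: int) -> int:
        # Only the low bits select an entry, so distinct branches can alias.
        return branch_address & ((1 << BTB_INDEX_BITS) - 1)

    def train(self, branch_address: int, target: int) -> None:
        self.entries[self._index(branch_address)] = target

    def predict(self, branch_address: int):
        return self.entries.get(self._index(branch_address))


def speculative_target(btb: ToyBTB, victim_branch: int, real_target: int) -> int:
    """Where the victim runs speculatively before the branch resolves."""
    predicted = btb.predict(victim_branch)
    return predicted if predicted is not None else real_target


# Attacker trains a branch of its own that aliases the victim's branch.
btb = ToyBTB()
victim_branch = 0x7F0000401234   # made-up address of the indirect branch
gadget = 0x7F0000405000          # made-up address of the leak gadget
attacker_branch = 0x10000000 | (victim_branch & 0xFFF)
btb.train(attacker_branch, gadget)
steered = speculative_target(btb, victim_branch, real_target=0x7F0000402000)
```

Here `steered` ends up equal to `gadget`: the victim architecturally still goes to its real target, but the speculative window runs the attacker's chosen code first.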

------
nimbius
Issues were reported to Intel on 2017-06-01. In November, CEO Brian Krzanich
sold roughly $11 million of stock in the company, keeping just the bare minimum.
[https://gizmodo.com/intel-says-ceo-dumping-tons-of-stock-las...](https://gizmodo.com/intel-says-ceo-dumping-tons-of-stock-last-year-unrelate-1821739988)

Part of me is outraged that Intel thinks it gets to have an opinion on the
issue at all, after the absolutely chicken shit press release they issued.
[https://newsroom.intel.com/news/intel-responds-to-security-r...](https://newsroom.intel.com/news/intel-responds-to-security-research-findings/)

the really tragic tale is that more businesses aren't running AMD. Dan Luu
sounded the alarm well before Meltdown, but people buy brands, not technology.

[https://danluu.com/cpu-bugs/](https://danluu.com/cpu-bugs/)

~~~
macintux
I'd wager you're getting downvoted because this stock discussion has been
beaten to death AND is completely irrelevant to the technical details relevant
to this topic. Just in case you were curious.

