
Vulnerability of Speculative Processors to Cache Timing Side-Channel Mechanism - subbu88
https://developer.arm.com/support/security-update
======
mrmondo
Almost as disgusting as Intel's response.

> _" It is important to note that this method is dependent on malware running
> locally which means it's imperative for users to practice good security
> hygiene by keeping their software up-to-date and avoid suspicious links or
> downloads."_

If you're running something on your ARM CPU, it's _running locally_! They're
using careful language to make the problem seem to impact them less than it
does and to lay the blame on the user.

This affects a huge number of processors out there, not just in phones and
tablets but in embedded, industrial and network devices - many of which will
never see workarounds from their OS / software for the hardware fault.

Also OH: _" The bug doesn’t happen if you don’t use our product"_

~~~
userbinator
Their response makes perfect sense and is a careful contrast to the
"everything is broken" hysteria from the other side. This isn't like
Heartbleed or a remote code execution vulnerability --- the attacker _has to
be able to run very specific code on the processor_ in order to exploit
anything. Thus they are basically saying "you're fine if you don't run
untrusted code" --- something which bears repeating since it is the only
positive thing in this whole debacle.

 _but in embedded, industrial and network devices - many of which will never
see workarounds from their OS / software for the hardware fault._

...many of which won't ever run anything but the original firmware they came
with from the factory anyway, making it a somewhat moot point (and if there
are exploits that lead to remote code execution, chances are there are better
things for an attacker to go after than try to exploit a timing side-channel.)

The ones most affected by this are cloud providers and other scenarios where
multiple mutually untrusting users are sharing the same hardware on which they
can run arbitrary code.

The ones least affected (i.e. not at all) are isolated single-user/single-
purpose machines running trusted code. This includes the majority of embedded
systems, which is presumably why ARM is emphasising the point so much compared
to Intel or AMD.

~~~
flukus
> Thus they are basically saying "you're fine if you don't run untrusted code"
> --- something which bears repeating since it is the only positive thing in
> this whole debacle.

Except for that whole web thing that has most of the world running untrusted
code just about every minute of every day. Security hygiene doesn't protect
you from this.

~~~
userbinator
_Except for that whole web thing that has most of the world running untrusted
code just about every minute of every day_

Certainly those who e.g. have JS off by default are currently in the minority,
but perhaps this will be the defining event that causes everyone else to think
more deeply about letting untrusted code run, regardless of how sandboxed it
is.

Things like TEMPEST[1] have shown for many years that side-channel attacks are
extremely difficult to defend against, even for an attacker who is merely in
proximity to the hardware and can't influence it at all; never mind running
code directly on it. It was only a matter of time. A lot of malware
researchers already don't trust VMs and use separate physical hardware,
precisely because of these risks of sharing trusted and untrusted code on the
same hardware.

[1]
[https://en.wikipedia.org/wiki/TEMPEST](https://en.wikipedia.org/wiki/TEMPEST)

~~~
continuational
There’s not a single program that I trust. Not even the programs I’ve written
myself; even having taken all reasonable precautions, it’s far too easy to
introduce a vulnerability.

There are programs that I choose to run with escalated privileges, like the
operating system, out of necessity.

If I can’t run untrusted code, I simply can’t compute. It’s the main thing
hardware is designed to do.

~~~
andrewem
For instance, even if you read your compiler's source code and then compile
it, there's no way to know that the compilation process didn't insert a back
door, as in this classic essay:
[https://www.ece.cmu.edu/~ganger/712.fall02/papers/p761-thomp...](https://www.ece.cmu.edu/~ganger/712.fall02/papers/p761-thompson.pdf)

------
Rebelgecko
Maybe it's just because this is a more technical update, but the responses
from Intel[1] and AMD really seem like night and day to me. This page is for
the most part clearly written and only has a minimal amount of BS before
getting to the point about:

• identifying the problems (without casting aspersions on other vendors or
being FUD-y about issues that don't exist)

• specifying what _specific_ hardware is vulnerable

• info for developers (here's the patches you need if you're building your own
kernel, here's a new compiler intrinsic that will be available soon for your
applications, here [2] is code to implement the intrinsic yourself if you're
not able to update your compiler)

• slightly less useful info for end users (update your stuff, which isn't
necessarily actionable advice for some of us on Android)

• lets us know that future hardware may still require the KPTI patches, but it
sounds like the speculative execution issues that aren't fixed by the patches
("variant 1") will be fixed in hardware

Things that still aren't great:

• no mention of performance impact of their various mitigation strategies that
surely someone would have benchmarked in the 6 months they've been working on
this. However I'm sure before too much longer someone will grab that code from
their github and compare it to regular memory accesses

• this FAQ purporting to be in layman's terms definitely isn't [3]. Imagine
telling your elderly relative that saw an article about scary computer bugs in
the New York Times that the problem is just a "novel use of an existing side-
channel technique" to "access data from privileged memory (DRAM or CPU
cache)."

[1] [https://newsroom.intel.com/news/intel-responds-to-
security-r...](https://newsroom.intel.com/news/intel-responds-to-security-
research-findings/)

[2] [https://github.com/ARM-software/speculation-
barrier/blob/mas...](https://github.com/ARM-software/speculation-
barrier/blob/master/speculation_barrier.h)

[3] [https://developer.arm.com/support/security-
update/frequently...](https://developer.arm.com/support/security-
update/frequently-asked-questions)
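
For anyone wondering what a software mitigation for variant 1 can look like, here is a minimal sketch of branch-free index masking. This is a different technique from the intrinsic in ARM's header [2], and it is shown only for the arithmetic: Python does not speculate, so this neither demonstrates the attack nor constitutes a real fix. The names `index_mask` and `clamp_index` are hypothetical, and the 64-bit width is an assumption.

```python
MASK64 = (1 << 64) - 1  # assumed register width: 64 bits


def index_mask(index, size):
    """Return all ones if 0 <= index < size, else 0, without a comparison.

    Assumes index and size are non-negative and below 2**63.
    """
    # (size - 1 - index) goes negative exactly when index >= size, which
    # sets the top bit once the value is wrapped to 64 bits.
    wrapped = (index | (size - 1 - index)) & MASK64
    out_of_range = wrapped >> 63          # 1 if out of range, else 0
    return (out_of_range - 1) & MASK64    # 0 -> all ones, 1 -> zero


def clamp_index(index, size):
    """In-range indexes pass through unchanged; anything else becomes 0."""
    return index & index_mask(index, size)


print(clamp_index(3, 10))   # 3
print(clamp_index(12, 10))  # 0
```

The point of the design is that the clamped index depends on data rather than on a predicted branch, so a mispredicted bounds check has nothing out of range to load from.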

~~~
chmod775
Nitpick: This is Arm's press release, not AMD's.

------
deckard1
> However, this side-channel mechanism could enable someone to potentially
> extract some information that otherwise would not be accessible to software
> from processors that are performing as designed and not based on a flaw or
> bug.

I'm having trouble parsing this bit of lawyer speak.

In the prior sentence they say nothing is new. Then in this sentence, they are
saying that there _is_ something new. I'd say "potentially extract information
that would not be accessible" is definitely not "performing as designed"
_unless_ it were "based on a flaw or bug." That is, if you take security
seriously. I mean, how could it not be? Are you saying that this attack is by
design?

Is... is this an NSA canary? Did you guys just admit that this is not a flaw
by design? Blink once if the men in black are listening, twice if they know we
know they are listening.

~~~
ealexhudson
To be honest, they're saying two things which are both simultaneously true:

* you can access information that, by design, the processor would ordinarily
  prevent access to
* the fault is not a flaw or bug, but inherent in the design of the processor

The intent of the design was to keep the information secure, and the features
of the design to do that are working as intended. Other features of the
design, intended to increase performance, are also working correctly, but turn
out to leak information in certain circumstances.

There's no clear analogy in the physical world I can think of, but I don't
think this is weasel-speak. Exfiltration of data accidentally through a side-
channel, like here, is very difficult to design against.
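
A toy model may make the "both features work as designed" point more concrete. The sketch below is not real hardware behaviour: the cache, the latencies, the `mispredicted` flag and all names are assumptions made up for illustration. The bounds check never returns the out-of-bounds value, yet the cache fill done by the discarded speculative work still shows up in the timings.

```python
LINE_SIZE = 64        # assumed cache line size in bytes
HIT, MISS = 10, 200   # made-up latencies in cycles


class ToyCache:
    """Tracks which cache lines are present; nothing else."""

    def __init__(self):
        self.lines = set()

    def access(self, addr):
        """Return a simulated latency and leave the line cached afterwards."""
        line = addr // LINE_SIZE
        latency = HIT if line in self.lines else MISS
        self.lines.add(line)
        return latency


def victim(cache, memory, index, limit, probe_base, mispredicted):
    """Bounds-checked read. On a mispredicted branch the body still runs
    'speculatively': its result is discarded, but its cache fill is not."""
    in_bounds = index < limit
    if in_bounds or mispredicted:
        value = memory[index]
        cache.access(probe_base + value * LINE_SIZE)
    return memory[index] if in_bounds else None


def recover_byte(memory, index, limit, probe_base=0x10000):
    """Attacker side: trigger the victim, then time one probe line per value."""
    cache = ToyCache()
    returned = victim(cache, memory, index, limit, probe_base, mispredicted=True)
    timings = [cache.access(probe_base + v * LINE_SIZE) for v in range(256)]
    return returned, timings.index(min(timings))


memory = [1, 2, 3, 4, 42]   # the last byte lies past limit=4: the "secret"
print(recover_byte(memory, index=4, limit=4))   # (None, 42)
```

The access check did its job (the victim returned `None`), and the cache did its job (it kept a recently used line), but together they leak the byte.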

~~~
noway421
"We succeeded on the design of it, but failed on the security goal."

------
mrep
> However, this side-channel mechanism could enable someone to potentially
> extract some information that otherwise would not be accessible to software
> from processors that are performing as designed and not based on a flaw or
> bug.

"performing as designed and not based on a flaw or bug"

I have no words...

------
cududa
White paper direct link as the site seems to be under heavy load:
[https://developer.arm.com/-/media/Files/pdf/Cache_Speculatio...](https://developer.arm.com/-/media/Files/pdf/Cache_Speculation_Side-
channels.pdf?revision=966364ce-10aa-4580-8431-7e4ed42fb90b&la=en)

------
monocasa
> All future Arm Cortex processors will be resilient to this style of attack
> or allow mitigation through kernel patches.

Lol, 'we may or may not fix our HDL.'

------
thrillgore
2018 is off to a great start.

~~~
bitmapbrother
Especially for Google's Project Zero.

~~~
nasredin
Project Zero AKA Google is probably a better choice than "Tailored Access"
guys at the NSA or their Russian or Chinese equivalents.

------
DCKing
Oof. The Cortex-A75 has a hardware security bug _before any silicon with it
has been released_. On the other hand, this luckily means that software
mitigations can be included before release.

Luckily for most unpatched Android phones, they use Cortex A53 and Cortex A7
in-order chips which are not affected because they lack the speculative
execution required for this vulnerability.

~~~
Symmetry
Cortex A53 cores certainly have speculative execution because they have branch
predictors. The branch predictor of the A53 is probably the most sophisticated
thing about it, and other in-order ARM cores like the A8 are vulnerable. But
apparently something about the A53 design prevents speculation on indirect
loads, making them safe.

~~~
fulafel
Speculative execution in computer architecture terminology is reserved for
speculating past branches.

~~~
Symmetry
Yes, and the A53 speculatively executes past branches, otherwise there's
little point in having a branch predictor. The difference between it and an
out of order processor is that the depth of speculation is limited by the
pipeline length rather than the reorder buffer.

EDIT: The pipeline layout severely constrains what sort of speculation can
happen and, thinking about it more, I'm really surprised that the A8 can
finish a load and issue a second one before the mis-speculated branch is
resolved and the loads are quashed.

EDIT2: Well, that's not the only difference obviously. The big one that makes
a processor in order versus out of order is whether a stalled instruction
causes all subsequent instructions in the stream to stall or just ones with
data dependencies on it. But the problem of quashing bad instructions is
shared by all processors that are pipelined and which can throw exceptions,
not just those that continue speculatively issuing past predicted branches.
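
The "can it finish a load and issue a second one" question can be put as a back-of-the-envelope check. This is a deliberately crude sketch: the cycle counts are hypothetical, not measurements of the A8, the A53 or any real core, and `second_load_can_issue` is a made-up name.

```python
def second_load_can_issue(window_cycles, load_latency, issue_cycles=1):
    """True if one load can finish and a dependent load can then issue
    before the mispredicted branch resolves (i.e. inside the window)."""
    return load_latency + issue_cycles <= window_cycles


# Hypothetical numbers, for illustration only.
print(second_load_can_issue(window_cycles=3, load_latency=3))    # False
print(second_load_can_issue(window_cycles=4, load_latency=3))    # True
print(second_load_can_issue(window_cycles=100, load_latency=3))  # True
```

Under this model a short in-order pipeline can sit on either side of the line depending on a cycle or two, while a window the size of a reorder buffer is comfortably on the wrong side.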

~~~
fulafel
I'm not familiar with the A53, maybe it is some kind of hybrid, but certainly
there are branch predictors in pipelined in-order superscalar processors that
don't speculate past branches, like Pentium or 21064. The advantage is that a
correctly predicted branch can be executed quickly and the following
instruction can be fetched & decoded. This doesn't yet require register
renaming, reservation stations etc.

~~~
Symmetry
I think we have a conflict of definition? I would call instruction fetch and
decode behind an unresolved branch speculation. If you have a machine with a
single execution stage which can't issue instructions in parallel with the
branch then of course you can't have these speculated instructions execute
because whether they are valid will always be resolved before they get to the
execution or writeback stages. But the mechanics of putting instructions into
the pipeline on the basis of speculation and quashing them if the branch
resolves differently than predicted are the same and re-use the same hardware
mechanisms that handle exceptions, making the speculative issuing of
instructions a clear win from a design perspective. It's the details of the
pipeline that determine whether it's possible to trigger the Spectre
vulnerability before the branch resolves.

Register renaming isn't required for speculation and you can even design out
of order micro architectures without it. On an in-order processor you just
have to make sure that writeback occurs after the branch is resolved and just
flush the pipeline above the branch if the branch resolves incorrectly. No
need to mess about with checkpoints because the register state is always
valid. Well, there are some considerations with store commands too if you're
using superscalar execution but it's still pretty easy. But again, these are
all mechanisms you need in any event because your processor has to deal with
interrupts that might be thrown at any moment by external triggers or memory
faults.

EDIT: I suppose it's possible in theory that you could look at an upcoming
branch, predict how it will resolve, and then fetch the relevant instruction
data into your instruction cache. But I'd tend to call that an instruction
pre-fetcher by analogy to the data pre-fetcher everyone uses rather than a
branch predictor. And I'm not aware of anyone ever having done that.

~~~
fulafel
Here's the 1997 Intel optimization manual:
[https://www.ece.cmu.edu/~ece548/localcpy/24281603.pdf](https://www.ece.cmu.edu/~ece548/localcpy/24281603.pdf)

In 2.1.3, that pertains to the original Pentium processor, it describes how
the predicted branch target instruction is fetched.

This is not commonly considered speculative execution in computer architecture
terminology.

In 3.2.1 it describes how on later OoO processors the fetched instructions are
speculatively executed.

Here in the description of the 21064, another in-order processor, the pipeline
is described (under "Conditional-Branch Pipeline Flow") as potentially taking
zero cycles for correctly predicted branches, meaning that the instruction
decode is also pipelined:
[http://collaboration.cmc.ec.gc.ca/science/rpn/biblio/ddj/Web...](http://collaboration.cmc.ec.gc.ca/science/rpn/biblio/ddj/Website/articles/DDJ/1995/9516/9516c/9516c.htm)

~~~
Symmetry
You're right, that's speculative issue rather than speculative execution. I
was being sloppy in my terminology when I called it that. It is speculation,
however.

------
MarkSweep
The whitepaper linked from this article introduces a new memory barrier named
"CSDB". Does this work with existing ARM CPUs, and if so, how? I guess they
could have always had this instruction encoding defined but just never gave it
a name, but that seems a little odd.

~~~
palotasb
According to the table here [1] it is a DSB (data synch barrier) instruction.
I don't know if it corresponds to any of the listed valid _option_ values in
[2].

[1] [http://kitoslab-
eng.blogspot.nl/2012/10/armv8-aarch64-instru...](http://kitoslab-
eng.blogspot.nl/2012/10/armv8-aarch64-instruction-encoding.html)

[2]
[http://infocenter.arm.com/help/topic/com.arm.doc.dui0802b/CI...](http://infocenter.arm.com/help/topic/com.arm.doc.dui0802b/CIHGHHIE.html)
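
One way to sanity-check the encoding question, under an assumption that is not stated anywhere in this thread: that CSDB is allocated in the A64 hint space as `HINT #20`, which is how Arm's architecture documentation and the Linux arm64 `csdb()` macro describe it. HINT encodings are the NOP encoding with a 7-bit immediate in bits [11:5]. The helper names below are made up.

```python
A64_NOP = 0xD503201F                        # HINT #0
HINT_IMM_SHIFT = 5                          # immediate occupies bits [11:5]
HINT_IMM_MASK = 0x7F << HINT_IMM_SHIFT


def a64_hint(imm):
    """Encode HINT #imm as a 32-bit A64 instruction word."""
    if not 0 <= imm <= 0x7F:
        raise ValueError("HINT immediate must fit in 7 bits")
    return A64_NOP | (imm << HINT_IMM_SHIFT)


def a64_hint_imm(word):
    """Return the immediate if word is a HINT encoding, else None."""
    if (word & ~HINT_IMM_MASK & 0xFFFFFFFF) != A64_NOP:
        return None
    return (word & HINT_IMM_MASK) >> HINT_IMM_SHIFT


print(hex(a64_hint(20)))          # 0xd503229f
print(a64_hint_imm(0xD5033F9F))   # None (DSB SY is not in the hint space)
```

If that assumption holds, the instruction sits in the hint space rather than among the DSB options, and since unallocated hints execute as NOPs on existing cores, that would be one explanation for how it can be emitted for older CPUs.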

------
sargun
Does TrustZone mitigate this? Can the secure world be compromised via this
side channel?

~~~
bryanbuckley
Your question is a bit too vague. TrustZone by itself doesn't really give you
anything. For instance, by default you run everything w/ TrustZone bit set
unless you do something different during boot.

In general, I would say TrustZone is not special at all (yet) in regards to
immunity to side-channel attacks. See:
[https://twitter.com/bryanbuckley/status/912458210191093760](https://twitter.com/bryanbuckley/status/912458210191093760)
On the other hand, TZ (NS bit) was a bit special for ARM so maybe they paid
more attention to it (i.e. maybe it matters greatly that addresses are tagged
NS/S for the HW, unlike x86 which was not designed to consider addresses part
of securable boundary between modes/rings?). Or maybe ARM still opted for
speed over security, in some surprising/disappointing ways.

However, that may be a moot point if you can execute arbitrary code in SWd and
if something like Spectre might leak SWd usr to SWd usr (since you could then
use normal communication between the SWd/NWd)?

Of course folks will experiment and we will know in the coming months.. if a
TEE vendor does not first make a comment (e.g. linaro). Also remember that the
TEE folks have different OSes (some are micro-kernels that are more security-
focused).

Curiously, I saw the blog post in August last year and it reminded me that it was
during OMAP5 bring-up that I think I sent my first ever patch to lkml, dealing
with aborts being spammed (being originated via speculation; aborting because
we had a region of memory dedicated for TZ). A15 had introduced a deeper
branch prediction buffer, iianm.

I can't find the final patch. Hopefully my sign-off was just omitted rather
than the commit living in some OMAP specific fork. My RFC patch was pretty
dumb (basically I disabled speculation during early linux loading, re-enabled
once "the real" page tables were ready to activate) so better programmers took
the helm and I confirmed the fix..

So anyway.. at least on OMAP the hardware _seemed_ to disallow any access at
all across TZ boundary, even by hardware. Then again, you can't really trust
this reporting/aborting (a good sign, though) and would have to verify
yourself w/ some PoC attack.

And obligatory mention that The Mill seems like a better CPU arch.

------
nnx
Are Apple Ax processors affected?

Could their great performance advantage over other ARM-based chips be derived from a
similar “check later” speculation as Intel’s?

------
Scene_Cast2
The site isn't loading, and there's no google cache yet. Does anyone have a
rip or mirror?

~~~
omeid2
Fun fact: The website linked most probably runs on Intel CPUs as it seems to
be hosted on Azure. Azure doesn't offer Ryzen VMs yet.

And here is an archived link:
[http://archive.is/jVCwc](http://archive.is/jVCwc)

~~~
sandworm101
I wonder if we will ever see a bug dangerous enough to cause clouds to shut
down so many machines that it is impossible to get the word out about a fix.

