Vulnerability of Speculative Processors to Cache Timing Side-Channel Mechanism (arm.com)
193 points by subbu88 6 months ago | 60 comments



Almost as disgusting as Intel's response.

> "It is important to note that this method is dependent on malware running locally which means it's imperative for users to practice good security hygiene by keeping their software up-to-date and avoid suspicious links or downloads."

If you're running something on your ARM CPU, it's running locally! They're using careful language to make the problem seem to impact them less than it does and to lay the blame on the user.

This affects a huge number of processors out there, not just in phones and tablets but in embedded, industrial and network devices - many of which will never see workarounds from their OS / software for the hardware fault.

Also OH: "The bug doesn’t happen if you don’t use our product"


Their response makes perfect sense and is a careful contrast to the "everything is broken" hysteria from the other side. This isn't like Heartbleed or a remote code execution vulnerability --- the attacker has to be able to run very specific code on the processor in order to exploit anything. Thus they are basically saying "you're fine if you don't run untrusted code" --- something which bears repeating since it is the only positive thing in this whole debacle.

> but in embedded, industrial and network devices - many of which will never see workarounds from their OS / software for the hardware fault.

...many of which won't ever run anything but the original firmware they came with from the factory anyway, making it a somewhat moot point (and if there are exploits that lead to remote code execution, chances are there are better things for an attacker to go after than try to exploit a timing side-channel.)

The ones most affected by this are cloud providers and other scenarios where multiple mutually untrusting users are sharing the same hardware on which they can run arbitrary code.

The ones least affected (i.e. not at all) are isolated single-user/single-purpose machines running trusted code. This includes the majority of embedded systems, which is presumably why ARM is emphasising the point so much compared to Intel or AMD.


> Thus they are basically saying "you're fine if you don't run untrusted code" --- something which bears repeating since it is the only positive thing in this whole debacle.

Except for that whole web thing that has most of the world running untrusted code just about every minute of every day. Security hygiene doesn't protect you from this.


> Except for that whole web thing that has most of the world running untrusted code just about every minute of every day

Certainly those who e.g. have JS off by default are currently in the minority, but perhaps this will be the defining event that causes everyone else to think more deeply about letting untrusted code run, regardless of how sandboxed it is.

Things like TEMPEST[1] have shown for many years that side-channel attacks are extremely difficult to defend against, even for an attacker who is merely in proximity to the hardware and can't influence it at all; never mind one running code directly on it. It was only a matter of time. A lot of malware researchers already don't trust VMs and use separate physical hardware, precisely because of these risks of sharing trusted and untrusted code on the same hardware.

[1] https://en.wikipedia.org/wiki/TEMPEST


Sorry, but no way. Local execution of sandboxed or VMed code isn't going away, and shouldn't go away, and suggesting that it should or will is honestly a bit deformation professionnelle, not to say a tiny bit crackers. Neither Spectre nor several more Spectres will change that. Computing mostly got along fine without sandboxing in the 'Seventies, but it could do so because computing in the 'Seventies was a radically different world for many other reasons too. We need more, and more reliable, sandboxing not less. If that means that, for instance, hardware manufacturers have to start getting serious about relative timing guarantees instead of cheerfully doing whatever it takes to beat the benchmarks, well Too Bad Really. It's a direction that things should probably have been going in anyway in the interests of real-time performance guarantees.


IBM offered VM-level sandboxing back then, starting from s/370 hardware in 1972.

https://en.wikipedia.org/wiki/VM_(operating_system)


There’s not a single program that I trust. Not even the programs I’ve written myself; even having taken all reasonable precautions, it’s far too easy to introduce a vulnerability.

There are programs that I choose to run with escalated privileges, like the operating system, out of necessity.

If I can’t run untrusted code, I simply can’t compute. It’s the main thing hardware is designed to do.


For instance, even if you read your compiler's source code and then compile it, there's no way to know that the compilation process didn't insert a back door, as in this classic essay: https://www.ece.cmu.edu/~ganger/712.fall02/papers/p761-thomp...


I'm not saying you're wrong, but you seem to be using a different definition of untrusted than the rest of the thread. Your code may not be perfect, but you certainly don't suspect it's launching a timing attack against your machine's kernel, and exfiltrating what it discovers, right?


> but perhaps this will be the defining event that causes everyone else to think more deeply about letting untrusted code run, regardless of how sandboxed it is.

Your average user: Javascript is on the website not on my computer so it's fine.


Your average user: what’s a Javascript?


Such users simply expect the computer they bought to work as advertised. Is it really their fault when it can’t run a browser correctly?


While not directly related to Meltdown and Spectre, it's not just the web. Almost all code really should be considered untrusted: every game on Steam, every app on the App Store. Many apps use ad or analytics libraries whose innards the app devs don't know. There are plenty of apps that are just skinned web browsers, effectively downloading new code all the time, and while I might trust Facebook or maybe Slack to check all the 3rd-party libraries and updates they import, I doubt your average app dev team does any of that.


> The ones least affected (i.e. not at all) are isolated single user/single purpose machines running trusted code

What about people who use personal machines to browse the web and run untrusted JavaScript?


> What about people who use personal machines to browse the web and run untrusted JavaScript?

They're running untrusted code, and that doesn't sound like a single-purpose machine.


From the spectre paper, https://spectreattack.com/spectre.pdf

"As a proof-of-concept, JavaScript code was written that, when run in the Google Chrome browser, allows JavaScript to read private memory from the process in which it runs (cf. Listing 2). "

That looks like the current limit of a JavaScript-based attack. It doesn't seem to be able to access system resources or execute system commands (yet...).

That kind of JS attack vector can likely be mitigated with a web browser update.
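
For what it's worth, one browser-side mitigation that shipped in early 2018 was coarsening high-resolution timers such as `performance.now()` (alongside disabling `SharedArrayBuffer`). Below is a toy Python sketch of why that helps. The latencies (40 ns for a cache hit, 200 ns for a miss) and the helper names are assumptions made up for illustration, not measurements from any real CPU or browser:

```python
# Toy model: a timer that only ticks every `resolution_ns` nanoseconds.
# HIT_NS and MISS_NS are assumed, illustrative latencies.
HIT_NS = 40
MISS_NS = 200


def coarsen(t_ns: int, resolution_ns: int) -> int:
    """Round a timestamp down to a multiple of the timer resolution."""
    return (t_ns // resolution_ns) * resolution_ns


def measured_duration(start_ns: int, latency_ns: int, resolution_ns: int) -> int:
    """Duration an observer sees when both timestamps are coarsened."""
    end_ns = start_ns + latency_ns
    return coarsen(end_ns, resolution_ns) - coarsen(start_ns, resolution_ns)


def distinguishable(start_ns: int, resolution_ns: int) -> bool:
    """True if a hit and a miss produce different observed durations."""
    hit = measured_duration(start_ns, HIT_NS, resolution_ns)
    miss = measured_duration(start_ns, MISS_NS, resolution_ns)
    return hit != miss
```

With a 5 ns timer the hit and the miss are trivially told apart; with a 100 µs timer both usually read as zero. A measurement that happens to straddle a tick still differs even in this toy model, which is one reason coarsening raises the bar rather than closing the channel.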


So you think the majority of existing IoT devices won't be exploited in the next 5 years? That's rather optimistic when a large portion seem to use hardcoded passwords, and with the rise of easy-to-build botnets.


Yeah, this bothered me as well. It's not just a matter of not clicking suspicious links or downloads; apparently (according to the Spectre PDF) there's a JavaScript attack for this. What does a suspicious page even mean anyway, and how does the average user determine that before clicking?


None of the Cortex M cores seem to be affected, according to ARM. These are the ones mostly used in embedded applications.


Cortex M and all ARM-developed cores prior to v7 are in-order CPUs, so they do not do branch prediction or speculative execution. Without branch prediction this kind of issue cannot exist.


In-order CPUs (e.g. Pentium 1, older Atom and ARM chips, POWER 6) perform branch prediction and speculative execution. They'll predict the branch and start speculatively decoding and executing instructions from the predicted target, then flush the pipeline if there was a misprediction. What they won't do is execute past an instruction that has an unresolved data dependency.


You're right, I was conflating two things. It looks like no in-order ARM cores have branch prediction though.


‘M’ just stands for microcontroller, as opposed to ‘A’ for application.

Many / most complex or multitasking embedded devices will use ‘A’ series to meet their processing requirements, e.g. network routing + OS + firmware / OS programming and multiuser processing.

I probably should have better qualified my use of the term ‘embedded’.


But it does affect the processors used on the Beaglebone and older Raspberry Pi devices.


It does not affect Raspberry Pi devices to my knowledge. Please provide source for your claims!

The CPUs in Raspberry Pi 1-3 are not affected.

  ARM11, Cortex-A7, Cortex-A53

Raspberry Pi 2 v1 uses a Broadcom BCM2836 SoC with a 900 MHz 32-bit quad-core ARM Cortex-A7 processor.

Raspberry Pi 3 (and Pi 2 v1.2) uses a Broadcom BCM2837 SoC with a 1.2 GHz 64-bit quad-core ARM Cortex-A53 processor.

The ARM website https://developer.arm.com/support/security-update specifically says

  "*Only affected cores are listed, all other Arm cores are NOT affected.*" 
and it lists only

  "Cortex-R7, Cortex-R8, Cortex-A8, Cortex-A9, Cortex-A15, 
  Cortex-A17, Cortex-A57, Cortex-A72, Cortex-A73, Cortex-A75"
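
Since core names like Cortex-A7 and Cortex-R7 are easy to mix up, here is a minimal Python sketch of checking a name against the list quoted above. The set is copied from that quote as it stood when this thread was written; the function name and the normalization rules are my own assumptions, and Arm's page remains the authority if the list changes:

```python
# Cores listed as affected in the quote above (Arm's security update page
# at the time of this thread). Not maintained; check Arm's page for updates.
LISTED_AFFECTED = frozenset({
    "cortex-r7", "cortex-r8", "cortex-a8", "cortex-a9", "cortex-a15",
    "cortex-a17", "cortex-a57", "cortex-a72", "cortex-a73", "cortex-a75",
})


def is_listed_affected(core_name: str) -> bool:
    """Return True if the core appears in the quoted list.

    Accepts spellings such as "Cortex-A53", "cortex a53" or " Cortex-A53 ".
    """
    normalized = core_name.strip().lower().replace(" ", "-")
    return normalized in LISTED_AFFECTED
```

With this, `is_listed_affected("Cortex-A7")` and `is_listed_affected("Cortex-A53")` are both false, while `is_listed_affected("Cortex-R7")` is true.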


    and older Raspberry Pi devices
https://www.raspberrypi.org/products/raspberry-pi-2-model-b/

But it looks like I confused the Cortex-A7 with the Cortex-R7, which is not listed.


Cortex M cores don't have MMUs, so they cannot be affected. Also in embedded applications you would only be running trusted code anyway


The MMU isn't part of this attack, and there are ARM cores with MMUs that are not affected. Cortex M cores are in-order CPUs, so they do not have branch prediction or perform speculative execution which the attacks rely on.


The attack is all about bypassing memory protection, which without an MMU to provide any, is moot --- code has access to all of memory to read normally anyway.


Cortex-M has an MPU, not MMU to provide protection for memory regions.

But as said above, the MPU/MMU has no effect on this bug; this is about speculative execution and extracting information through a cache side channel.
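
To make the "speculative execution plus cache side channel" combination concrete, here is a toy Python model of the bounds-check pattern described in the Spectre paper (`if (x < array1_size) y = array2[array1[x] * 4096]`). This is only a simulation: the `ToyCPU` class, the `mispredict` flag and the "cached means fast" check are all invented stand-ins, and nothing here measures or attacks real hardware.

```python
# Toy simulation only. Every name below is hypothetical.
LINE = 4096  # assumed stride so each byte value maps to its own probe line


class ToyCPU:
    def __init__(self, memory: bytes, array1_size: int):
        self.memory = memory            # array1 followed by out-of-bounds data
        self.array1_size = array1_size
        self.cached_lines = set()       # probe addresses currently "in cache"

    def flush(self) -> None:
        """Stand-in for evicting the probe array from the cache."""
        self.cached_lines.clear()

    def victim(self, x: int, mispredict: bool):
        """Models: if x < array1_size: y = array2[array1[x] * LINE]."""
        in_bounds = x < self.array1_size
        if in_bounds or mispredict:
            # Runs architecturally, or speculatively after a misprediction.
            # Either way the dependent load leaves a footprint in the cache.
            value = self.memory[x]
            self.cached_lines.add(value * LINE)
        # The architectural result is discarded when the check fails.
        return self.memory[x] if in_bounds else None

    def is_fast(self, probe_addr: int) -> bool:
        """Stand-in for timing a load: a cached line counts as fast."""
        return probe_addr in self.cached_lines


def recover_byte(cpu: ToyCPU, x: int):
    """Flush, trigger the mispredicted path, then probe all 256 lines."""
    cpu.flush()
    cpu.victim(x, mispredict=True)
    hits = [v for v in range(256) if cpu.is_fast(v * LINE)]
    return hits[0] if len(hits) == 1 else None
```

For example, with `memory = bytes([1, 2, 3, 4]) + b"S"` and `array1_size = 4`, calling `victim(4, mispredict=False)` returns `None` and touches nothing, yet `recover_byte(cpu, 4)` yields `ord("S")`. The bounds check works exactly as designed architecturally; the leak is entirely in the cache state left behind, and no MMU or MPU is involved in the model.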


Maybe it's just because this is a more technical update, but the responses from Intel[1] and AMD really seem like night and day to me. This page is for the most part clearly written and only has a minimal amount of BS before getting to the point about:

• identifying the problems (without casting aspersions on other vendors or being FUD-y about issues that don't exist)

• specifying what _specific_ hardware is vulnerable

• info for developers (here's the patches you need if you're building your own kernel, here's a new compiler intrinsic that will be available soon for your applications, here [2] is code to implement the intrinsic yourself if you're not able to update your compiler)

• slightly less useful info for end users (update your stuff, which isn't necessarily actionable advice for some of us on Android)

• lets us know that future hardware may still require the KPTI patches, but it sounds like the speculative execution issues that aren't fixed by the patches ("variant 1") will be fixed in hardware

Things that still aren't great:

• no mention of performance impact of their various mitigation strategies that surely someone would have benchmarked in the 6 months they've been working on this. However I'm sure before too much longer someone will grab that code from their github and compare it to regular memory accesses

• this FAQ purporting to be in layman's terms definitely isn't [3]. Imagine telling your elderly relative that saw an article about scary computer bugs in the New York Times that the problem is just a "novel use of an existing side-channel technique" to "access data from privileged memory (DRAM or CPU cache)."

[1] https://newsroom.intel.com/news/intel-responds-to-security-r...

[2] https://github.com/ARM-software/speculation-barrier/blob/mas...

[3] https://developer.arm.com/support/security-update/frequently...
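
On the developer-facing mitigations mentioned above: I have not checked how Arm's intrinsic is implemented, so the following is a different, software-only idea, namely clamping an index with a branch-free mask so that an out-of-bounds index becomes 0 even on a mispredicted path. It is the same arithmetic as the generic `array_index_mask_nospec` helper in the Linux kernel, modeled here in Python with explicit 64-bit wraparound. Python itself gives no guarantees about speculation, so treat this purely as an illustration of the arithmetic; the function names are my own.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit unsigned wraparound


def index_mask(index: int, size: int) -> int:
    """All ones if 0 <= index < size, else zero, computed without branching.

    Assumes index and size are both below 2**63.
    """
    # (size - 1 - index) wraps to a value with the top bit set
    # exactly when index >= size.
    combined = (index | ((size - 1 - index) & MASK64)) & MASK64
    top_bit = combined >> 63
    # top_bit 0 -> 0xFFFF...FFFF, top_bit 1 -> 0
    return (top_bit - 1) & MASK64


def clamp_index(index: int, size: int) -> int:
    """Return index unchanged when in bounds, 0 otherwise."""
    return index & index_mask(index, size)
```

In C the caller would still perform the ordinary bounds check and then index with the clamped value, so that a speculative out-of-bounds access reads element 0 instead of attacker-chosen memory.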


Nitpick: This is Arm's press release, not AMD's.


> However, this side-channel mechanism could enable someone to potentially extract some information that otherwise would not be accessible to software from processors that are performing as designed and not based on a flaw or bug.

I'm having trouble parsing this bit of lawyer speak.

In the prior sentence they say nothing is new. Then in this sentence, they are saying that there is something new. I'd say being able to "potentially extract information that would not be accessible" is definitely not "performing as designed" unless it were "based on a flaw or bug." That is, if you take security seriously. I mean, how could it not be? Are you saying that this attack is by design?

Is... is this an NSA canary? Did you guys just admit that this is not a flaw by design? Blink once if the men in black are listening, twice if they know we know they are listening.


To be honest, they're saying two things which are both simultaneously true:

* you can access information that, by design, the processor would ordinarily prevent access to

* the fault is not a flaw or bug, but inherent in the design of the processor

The intent of the design was to keep the information secure, and the features of the design to do that are working as intended. Other features of the design, intended to increase performance, are also working correctly, but turn out to leak information in certain circumstances.

There's no clear analogy in the physical world I can think of, but I don't think this is weasel-speak. Exfiltration of data accidentally through a side-channel, like here, is very difficult to design against.


"We succeeded on the design of it, but failed on the security goal."


> However, this side-channel mechanism could enable someone to potentially extract some information that otherwise would not be accessible to software from processors that are performing as designed and not based on a flaw or bug.

"performing as designed and not based on a flaw or bug"

I have no words...


White paper direct link as the site seems to be under heavy load: https://developer.arm.com/-/media/Files/pdf/Cache_Speculatio...


> All future Arm Cortex processors will be resilient to this style of attack or allow mitigation through kernel patches.

Lol, 'we may or may not fix our HDL.'


2018 is off to a great start.


Especially for Google's Project Zero.


Project Zero AKA Google is probably a better choice than "Tailored Access" guys at the NSA or their Russian or Chinese equivalents.


Oof. The Cortex-A75 has a hardware security bug before any silicon with it has been released. On the other hand, this luckily means that software mitigations can be included before release.

Luckily for most unpatched Android phones, they use Cortex A53 and Cortex A7 in-order chips which are not affected because they lack the speculative execution required for this vulnerability.


The Cortex A53 certainly has speculative execution because it has a branch predictor. The branch predictor of the A53 is probably the most sophisticated thing about it, and other in-order ARM cores like the A8 are vulnerable. But apparently something about the A53 design prevents speculation of indirect loads, making it safe.


Speculative execution in computer architecture terminology is reserved for speculating past branches.


Yes, and the A53 speculatively executes past branches, otherwise there's little point in having a branch predictor. The difference between it and an out of order processor is that the depth of speculation is limited by the pipeline length rather than the reorder buffer.

EDIT: The pipeline layout severely constrains what sort of speculation can happen and, thinking about it more, I'm really surprised that the A8 can finish a load and issue a second one before the mis-speculated branch is resolved and the loads are quashed.

EDIT2: Well, that's not the only difference obviously. The big one that makes a processor in order versus out of order is whether a stalled instruction causes all subsequent instructions in the stream to stall or just ones with data dependencies on it. But the problem of quashing bad instructions is shared by all processors that are pipelined and which can throw exceptions, not just those that continue speculatively issuing past predicted branches.
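
The in-order versus out-of-order distinction in EDIT2 can be sketched with a toy issue model in Python. The assumptions are loud ones: each instruction is a `(dest, sources, latency)` tuple, the in-order machine issues at most one instruction per cycle, and the out-of-order machine is idealized with an unlimited window and no structural limits. Name hazards (the thing register renaming solves) are ignored, so each register should be written only once.

```python
def issue_cycles(program, in_order: bool):
    """Return the cycle at which each instruction issues.

    program: list of (dest_register, [source_registers], latency_in_cycles).
    """
    ready = {}        # register -> cycle at which its value is available
    issued = []
    previous = -1     # issue cycle of the preceding instruction
    for dest, sources, latency in program:
        operands_ready = max((ready.get(r, 0) for r in sources), default=0)
        if in_order:
            # Cannot issue before the previous instruction has issued.
            cycle = max(previous + 1, operands_ready)
        else:
            # Idealized out-of-order: only true data dependencies matter.
            cycle = operands_ready
        ready[dest] = cycle + latency
        issued.append(cycle)
        previous = cycle
    return issued


# A slow load, an add that depends on it, and an independent multiply.
EXAMPLE = [
    ("r1", [], 100),           # load r1 (assumed 100-cycle miss)
    ("r2", ["r1"], 1),         # add  r2 = r1 + ...
    ("r3", ["r4", "r5"], 3),   # mul  r3 = r4 * r5 (independent)
]
```

In this model the in-order machine holds the independent multiply behind the stalled add, while the idealized out-of-order machine lets it go immediately; only the dependent add waits in both cases.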


I'm not familiar with the A53, maybe it is some kind of hybrid, but certainly there are branch predictors in pipelined in-order superscalar processors that don't speculate past branches, like Pentium or 21064. The advantage is that a correctly predicted branch can be executed quickly and the following instruction can be fetched & decoded. This doesn't yet require register renaming, reservation stations etc.


I think we have a conflict of definition? I would call instruction fetch and decode behind an unresolved branch speculation. If you have a machine with a single execution stage which can't issue instructions in parallel with the branch then of course you can't have these speculated instructions execute because whether they are valid will always be resolved before they get to the execution or writeback stages. But the mechanics of putting instructions into the pipeline on the basis of speculation and quashing them if the branch resolves differently than predicted are the same and re-use the same hardware mechanisms that handle exceptions, making the speculative issuing of instructions a clear win from a design perspective. It's the details of the pipeline that determine whether it's possible to trigger the Spectre vulnerability before the branch resolves.

Register renaming isn't required for speculation and you can even design out of order micro architectures without it. On an in-order processor you just have to make sure that writeback occurs after the branch is resolved and just flush the pipeline above the branch if the branch resolves incorrectly. No need to mess about with checkpoints because the register state is always valid. Well, there are some considerations with store commands too if you're using superscalar execution but it's still pretty easy. But again, these are all mechanisms you need in any event because your processor has to deal with interrupts that might be thrown at any moment by external triggers or memory faults.

EDIT: I suppose it's possible in theory that you could look at an upcoming branch, predict how it will resolve, and then fetch the relevant instruction data into your instruction cache. But I'd tend to call that an instruction pre-fetcher by analogy to the data pre-fetcher everyone uses rather than a branch predictor. And I'm not aware of anyone ever having done that.


Here's the 1997 Intel optimization manual: https://www.ece.cmu.edu/~ece548/localcpy/24281603.pdf

In 2.1.3, that pertains to the original Pentium processor, it describes how the predicted branch target instruction is fetched.

This is not commonly considered speculative execution in computer architecture terminology.

In 3.2.1 it describes how on later OoO processors the fetched instructions are speculatively executed.

Here in the description of the 21064, another in-order processor, the pipeline is described (under "Conditional-Branch Pipeline Flow") as potentially taking zero cycles for correctly predicted branches, meaning that the instruction decode is also pipelined: http://collaboration.cmc.ec.gc.ca/science/rpn/biblio/ddj/Web...


You're right, that's speculative issue rather than speculative execution. I was being sloppy in my terminology when I called it that. It is speculation, however.


The whitepaper linked from this article introduces a new memory barrier named "CSDB". Does this work with existing ARM CPUs, and if so, how? I guess they could have always had this instruction encoding defined but just never gave it a name, but that seems a little odd.


According to the table here [1] it is a DSB (data synch barrier) instruction. I don't know if it corresponds to any of the listed valid option values in [2].

[1] http://kitoslab-eng.blogspot.nl/2012/10/armv8-aarch64-instru...

[2] http://infocenter.arm.com/help/topic/com.arm.doc.dui0802b/CI...


Does TrustZone mitigate this? Can the secure world be compromised via this side channel?


Your question is a bit too vague. TrustZone by itself doesn't really give you anything. For instance, by default you run everything w/ TrustZone bit set unless you do something different during boot.

In general, I would say TrustZone is not special at all (yet) in regards to immunity to side-channel attacks. See: https://twitter.com/bryanbuckley/status/912458210191093760 On the other hand, TZ (NS bit) was a bit special for ARM so maybe they paid more attention to it (i.e. maybe it matters greatly that addresses are tagged NS/S for the HW, unlike x86 which was not designed to consider addresses part of securable boundary between modes/rings?). Or maybe ARM still opted for speed over security, in some surprising/disappointing ways.

However, that may be a moot point if you can execute arbitrary code in SWd and if something like Spectre might leak SWd usr to SWd usr (since you could then use normal communication between the SWd/NWd)?

Of course folks will experiment and we will know in the coming months.. if a TEE vendor does not first make a comment (e.g. linaro). Also remember that the TEE folks have different OSes (some are micro-kernels that are more security-focused).

Curiously, I saw the blog post in August last year and it reminded me that it was during OMAP5 bring-up that I think I sent my first ever patch to lkml, dealing with aborts being spammed (being originated via speculation; aborting because we had a region of memory dedicated for TZ). A15 had introduced a deeper branch prediction buffer, iianm.

I can't find the final patch. Hopefully my sign-off was just omitted rather than the commit living in some OMAP specific fork. My RFC patch was pretty dumb (basically I disabled speculation during early linux loading, re-enabled once "the real" page tables were ready to activate) so better programmers took the helm and I confirmed the fix..

So anyway.. at least on OMAP the hardware _seemed_ to disallow any access at all across TZ boundary, even by hardware. Then again, you can't really trust this reporting/aborting (good sign though) and would have to verify yourself w/ some PoC attack.

And obligatory mention that The Mill seems like a better CPU arch.


Are Apple Ax processors affected?

Could their great performance advantage over other ARM-based processors be derived from a similar “check later” speculation as Intel’s?


The site isn't loading, and there's no google cache yet. Does anyone have a rip or mirror?


> Vulnerability of Speculative Processors to Cache Timing Side-Channel Mechanism

> Updated on 03/Jan/2018

> Based on the recent research findings from Google on the potential new cache timing side-channels exploiting processor speculation, here is the latest information on possible Arm processors impacted and their potential mitigations. We will post any new research findings here as needed.

> Cache timing side-channels are a well-understood concept in the area of security research and therefore not a new finding. However, this side-channel mechanism could enable someone to potentially extract some information that otherwise would not be accessible to software from processors that are performing as designed and not based on a flaw or bug. This is the issue addressed here and in the Cache Speculation Side-channels whitepaper.

> It is important to note that this method is dependent on malware running locally which means it's imperative for users to practice good security hygiene by keeping their software up-to-date and avoid suspicious links or downloads.

> The majority of Arm processors are not impacted by any variation of this side-channel speculation mechanism. A definitive list of the small subset of Arm-designed processors that are susceptible can be found below.


Fun fact: The website linked most probably runs on Intel CPUs as it seems to be hosted on Azure. Azure doesn't offer Ryzen VMs yet.

And here is an archived link: http://archive.is/jVCwc


I wonder if we will ever see a bug dangerous enough to cause clouds to shut down so many machines that it is impossible to get the word out about a fix.


It loaded after a minute or two for me, but if that isn't working try this: http://archive.is/jVCwc


Site as an image: https://imgur.com/B15WzJZ



