The Next Vulnerability: Looking Back on Meltdown and Spectre One Year Later (rambus.com)
39 points by DyslexicAtheist 7 days ago | 46 comments

The basics of the Spectre exploit were discoverable 20 years ago and run deep, to the very core of branch speculation. This is nothing but marketing for RISC-V and Rambus; there's no magic in RISC-V that prevents Spectre, except that the ecosystem is so primitive that it barely has any speculative processors...

We could have found the Spectre exploit 20 years ago. To me, the fact that we didn't until now is maybe the most interesting thing.

I think one of the reasons is that nowadays we run a lot more untrusted code on top of our hardware, due to things like JavaScript and virtualization. Meltdown and Spectre would not have been as dangerous 20 years ago.

> The basics of the spectre exploit were discoverable 20 years ago and run deep, to the very core of branch speculation.

Some aspects of these were known 20 years ago.

http://www.cse.psu.edu/~trj1/cse543-f06/papers/vax_vmm.pdf (section VI.E)

> To me, the fact that we didn't until now is maybe the most interesting thing.

I suspect it's because there were still plenty of far simpler and more obvious exploits to look at.

> To me, the fact that we didn't until now is maybe the most interesting thing.

Back then, people were smashing the stack for fun and profit. Think about it! Executable stack! No canary! No ASLR! Can't be more fun.

> These challenges are going away as chipmakers and innovators collectively leverage open-source to develop better solutions and reduce time-to-market. The open source RISC-V architecture is particularly notable for its availability of unencumbered reference implementations and compiler/software support. As a result, RISC-V greatly reduces the amount of ancillary work required for a processor security project, allowing design teams to move more quickly and focus on areas of innovation – including security.

Is this a thing? I have not heard of anyone working on trying to replace x86 with RISC-V because of Spectre…

It probably switched a few orders from Intel to AMD, where the performance hit of the Intel patches vs the AMD ones changed the performance/$ enough.

But probably nothing of enough volume to be notable.

RISC-V would be unlikely, since there's not yet real server-class silicon.

I view the current situation as a good time to invent 32+ core CPUs without fancy features like speculative execution or branch prediction (at least not to the extent that made Spectre/Meltdown possible). And a good time for a next-gen kernel to emerge that can make use of several kernel-reserved cores, with better preemptive scheduling overall.

Not a hardware guy or a driver programmer but I believe this area stands to get some innovation.

We have today 32+ core CPUs with minimal or no branch prediction and speculation. They work spectacularly well on some specific workloads but are terrible for general-purpose computation. They are called GPUs.

Speculation and branch prediction are not going away. On the contrary, they are going to get more and more sophisticated.

What needs to die is the belief that you can rely on just software (i.e. memory safety) for isolation between trusted and untrusted code[1].

[1] Yes, Meltdown bypasses hardware protection, but that's an Intel-specific fuck-up, not an inherent issue with speculation.

> What needs to die is the belief that you can rely on just software (i.e. memory safety) for isolation between trusted and untrusted code[1].

I don't understand how you came to this conclusion. How does Spectre indicate that you can't rely on software for isolation?

> meltdown bypasses hardware protection, but that's Intel specific

It's not Intel specific. Certain ARM and POWER architectures are also vulnerable.

Spectre bypasses hardware protections too, just in a more subtle way.

The idea behind Spectre is that speculative execution bypasses software protections.

Given `if (A) B else C`, one cannot safely assume that B only gets executed when A resolves to true. It is not reasonable to expect programmers to produce safe code in an environment where they have to doubt if statements.

This could be solved without any major hardware overhauls, by making speculative execution actually speculative: not committing any side effects until the CPU knows it went down the correct branch.

Sure, I get it. Perhaps I phrased my response poorly.

This comment:

> What needs to die is the belief that you can rely on just software (i.e. memory safety) for isolation between trusted and untrusted code

is an argument that doubting if statements should be acceptable, and that the assumption that only B has architecturally visible side effects when A is true is not sound. I disagree strongly with that.

You should be able to rely on software checks for some things. Going forward, the solution is not to rewrite software to not depend on software mechanisms (e.g. bounds checks) for protection (unclear what would be used instead...) but to fix the hardware so these software mechanisms work.

What needs to "die" is not "belief that you can rely on just software," but hardware that violates fundamental guarantees.

It sounds like we agree on this, I just wanted to make my point clear since I see now that my original comment was not clear.

FWIW, this is essentially the argument Torvalds made when Intel tried to add feature flags for non-broken speculation.

I don't think Linus has any expectation that spectre v1 will ever be fixed in hardware.

There's a difference between what you expect, and how you think things should be. Linus was probably making a normative statement, not a prediction.

Unfortunately fetching a new cache line from memory is a globally visible side effect. One of the major benefits of speculation is that it enables additional memory-level parallelism.

Even otherwise strictly in-order designs, when aiming for high performance, incorporate some form of memory speculation (scout threading, run-ahead execution).

Right, and CPUs could, in principle, fix this. E.g., they could speculatively fetch the cache line, and only commit it to the cache once they know what branch they took. This still lets you do the slow memory access before you know if you will need it, without causing globally visible side effects.

As soon as you fetch the cache line, every other core will know you have done that via the cache coherency protocol. You have just disclosed an address potentially derived from some sensitive information.

Edit: to be precise they can, for example, see the transition from exclusive to shared.

I suppose this gets into what is considered a "major" overhaul of the architecture. The idea is that you don't actually "fetch" anything until you are no longer speculating. The core that is doing the speculating can pre-fetch the memory, and then commit it to the cache only once it knows that is the correct thing to do. There are still some cache coherence concerns, but nothing that strikes me as insurmountable.

I do not understand your distinction between prefetch and fetch. You either have the data, and you need to notify other cores that you do, or you don't.

Not notifying other CPUs seems pointless because before committing you would have to do a round-trip to the other core to verify whether the data is still valid making the prefetching pointless.

What you are describing here is high cost, because you have to take care of coherence when the line is committed to the cache, which in your scenario would not be speculative. That would require notifying the other cores and potentially refetching if another core has written to the line between your prefetch and commit.

A more practical solution is probably to have protection boundaries within the cache, like http://people.csail.mit.edu/vlk/dawg-micro18.pdf.

> What needs to die is the belief that you can rely on just software (i.e. memory safety) for isolation between trusted and untrusted code[1].

That is definitely true. We need a bit more security in the hardware. I am not saying we should go back to 300MHz CPUs just for the sake of security but surely there's a spectrum where Spectre / Meltdown / Rowhammer attacks are impossible and we still have very fast processors and RAM.

Meltdown for sure. Spectre not so much.

Note that the 2+ decade old 200 MHz Pentium Pro is very likely as vulnerable to Spectre as the current generation. Frequency doesn't really factor into it.

I wasn't clear. I meant that we shouldn't take huge performance hits for having some security at the hardware level.

Meltdown hopefully shouldn't be an issue, as there are high-performance CPUs which are immune. A proper Spectre V2 fix would probably require partitioning the jump-prediction tables by process; in the meantime, flushing seems to work (although at a non-trivial performance cost).

The issue is that the basic Spectre v1 vulnerability, the bounds-check bypass, is so firmly rooted in the basic mechanism of speculative execution that it is very unlikely to be fixable. But it only affects, as a first approximation, processes that try to execute code from different trust domains in the same address space. The existing software mitigations appear to help; address-space separation is probably a safer long-term solution.

There are simple in-order CPUs that have a small speculation window and are, in practice, not affected, but it is not a given that a high-performance in-order core is safe. Denver (an in-order VLIW design) is affected; I can't find anything about POWER6 (the last in-order CPU competitive with OoO).

> But it only affects, as a first approximation, processes that try to execute code from different trust domains in the same address space

This is wrong, even as an approximation. This comment, as well as your other comments on this thread, indicate that you misunderstand spectre v1. Some of the initial proofs of concept of spectre v1 demonstrated the attack with both victim and attacker in the same process, but that is not necessary, and the original materials disclosing spectre make that clear.

The paper [1] describes Spectre v1 as a technique in which "conditional branch misprediction can be exploited by an attacker to read arbitrary memory from another context, e.g., another process" [emphasis mine]. The authors go on to describe a proof of concept in which a vulnerable code sequence is inserted in the kernel via eBPF, but the attacker is in user space: "we use the eBPF code only for the speculatively executed code. We use native code in user space to acquire the covert channel information."

It's not clear whether spectre v1 style attacks within a single process can be prevented by hardware, but spectre v1 attacks in which an attacker process manipulates a vulnerable gadget in a victim process certainly can be.

Running untrusted code in the same process as trusted code has always been dodgy, and now will probably need speculation barriers at bounds checks. But the most worrying part of spectre v1 is how it can be used to bypass process (and user/kernel) isolation, and that can and should be fixed.

[1]: https://spectreattack.com/spectre.pdf

What is ebpf if not untrusted code running in a trusted (the kernel) domain?

The ebpf is not the exploit. The ebpf was used in the proof of concept to generate a gadget in the victim process (in this case the kernel). Similar code patterns already existed in the kernel, the authors just used ebpf to make their own so they didn't have to hunt one down.

If you don't believe me, perhaps the document making its way into the kernel source (and the review comments) will make things clear: https://lkml.org/lkml/2018/12/21/577

If you are still not convinced, take a look at these patches merged into Linux to mitigate a vulnerability you are arguing doesn't exist: https://lkml.org/lkml/2018/1/5/769

Spectre V1 can be exploited by simply passing parameters to code that runs in another process. It does not require that the attacker can run code in the victim process.

"We have today 32+ cores CPUs ... They are called GPUs."

There are also chips like the GA144, which has 144 cores:


To get any reasonable performance benefit out of 32+ cores the code running on those cores would have to be >75% parallelized.[1] I know very little about threading and such, but that seems like it would be a difficult thing to do for many (most?) applications. Point being that I doubt consumer CPUs will have that many cores until there's a clear benefit to it, and thus far there isn't. Even 8-16 cores is pushing it. If anyone reading this has more insight into the matter, I'd love to hear what you think.

[1] https://en.wikipedia.org/wiki/Amdahl%27s_law

The real-world progression of software run over time on processors doesn't always match the assumption in Amdahl's law. Frequently people want to solve harder/bigger problems in the same time, not just the same problem in less time. See, for example, how Slack has taken over doing what IRC could do, just with much higher resource requirements.

For that, see https://en.m.wikipedia.org/wiki/Gustafson%27s_law.

> To get any reasonable performance benefit out of 32+ cores the code running on those cores would have to be >75% parallelized.

Exactly this, yes. But for that we need a language that enforces and guarantees pure functions and disallows global state mutation. For now C is king in systems programming, but I wonder: if certain chips were made with a very different idea from the get-go, could we have a systems functional language?

I also am not very informed on this front, never had enough time to dig deeper. (I did like the idea of the LISP machines several decades ago though.)

I feel uneasy giving up general purpose computing in the name of security. Can't we have both?

Yes, but at a performance cost. I _think_ general-purpose scalar processors lack these timing vulnerabilities since they almost literally do one thing at a time. But you are winding back the performance clock 1-2 decades. That's not to say they are unusable; for instance, the ARM core in the Pi Zero is scalar... but I doubt there is any modern desktop x86 equivalent without speculative execution.

That ARM11 core is actually partially out of order (i.e. it issues in order, but permits out of order completion of independent insns) and has branch prediction.

More like 4 decades. OoO itself is about three decades.

Yup. OoO is a thing from the mid-nineties for us mere mortals, with CPUs like the Pentium Pro and AMD K5. Of course, the first out-of-order CPU is way older: the CDC 6600, back from 1964.

The MIPS R2000 from 1986 could do a simple form of speculative execution.

How many people's computers are known to have gotten hacked via Meltdown or Spectre?

How would you know? It allows reading privileged data without actually taking over privileged processes, and wouldn’t leave behind any signs of what happened.

It really would depend on what they did and how they did it.

If the "privileged data" they read was, say, the root password which they used to gain root access and then started snooping around the system as root or modifying parts of the filesystem, that could be easily detected.

If the exploit itself was performed over the network or if the privileged data they read was transferred over the network, that might also be detected, depending on how it was sent and where it was sent to.

If they tried to launch attacks or probes from the exploited system they could be detected as well.

Network intrusion detection systems and host intrusion detection systems could both help here.

Not necessarily no signs. It involves repeatedly triggering the error path with specifically constructed values. This could easily leave behind traces in the log from which it can be inferred that there was a Spectre attack (of course, determining what attack was being conducted, or its success level and the information exfiltrated, is still difficult).

What strikes me the most is that the state of the art in CPU architecture is immune to Spectre and Meltdown, yet it isn't once mentioned. Makes the article sound like Intel-funded stuttering.

Can you expand on which CPU exactly you are talking about? I'm genuinely curious.

From what I've seen, every major high-end CPU architecture is affected, because they all rely on speculation. Of course, some low-end CPUs (such as ARM Cortex-M cores) don't have speculation, so they aren't vulnerable. But on the high-performance front, I haven't seen a credible alternative that wasn't vulnerable to speculative execution side-channels. Which one am I missing?

Exactly, the low end isn't vulnerable, nor is the Mill architecture on the high end (see millcomputing.com). The Mill isn't silicon yet, but the architecture is sound (and of extreme beauty). It is an architecture for the Intel-sized players out there, though.

The Mill makes the current OOO look bad (Intel, AMD, others). It is striking to see that "experts", on this page and the writer of the article, act as if it didn't exist.

The Mill can save Intel, or destroy it; Intel decides that. The post-x86 era nears and time is running out. Intel needs a new architecture, and RISC-V is only part of the answer.

Given the phrase "Intel funded" 'childintime is probably confused and thinks AMD Zen is not vulnerable.

Intel is under threat, not only from AMD, but from the end of the x86 era. Of course they need to protect their current bread and butter, but AMD isn't their only worry. Far from it.

Meltdown IIRC didn’t affect AMD processors, but Spectre was pretty much across the board.
