Researchers discover seven new Meltdown and Spectre attacks (zdnet.com)
140 points by Garbage on Nov 15, 2018 | 36 comments

I don't know much about low level stuff, but from what I have read:

- these issues are inherent in speculative execution

- speculative execution is critical for performance

- therefore these issues are really hard to eliminate

- these attacks can break out of VMs

Are these points correct? Because if they are, it seems like a safe conclusion that most of the cloud will be compromised.

Eliminating speculation entirely would be an utter disaster for performance, more than a factor of 10. Eliminating speculation across certain security boundaries seems to be viable and not too horrible from a performance standpoint, though you have to be careful to get the boundaries right, and I imagine this'll be something like buffer overflow attacks, where we occasionally find new ones that have to be patched.

Also, as far as I'm aware, the sort of shallow speculation used by most in-order processors doesn't tend to provide enough of a window for an attacker to exfiltrate data after loading it, though it's possible that I'm unaware of one that does.

> Eliminating speculation across certain security boundaries seems to be viable and not too horrible from a performance standpoint, though you have to be careful to get the boundaries right, and I imagine this'll be something like buffer overflow attacks

Can these be detected in the same manner that valgrind can detect buffer overflows?

Given the CPU architectures we are currently using, yes. There are other architectures not affected by it, like the Mill.

I was counting those under shallow in-order processors. The ARM A8 was vulnerable and I'd guess the POWER6 was as well. But the A53s you've got in a modern phone are safe.

I don't know of any VLIW design that's vulnerable to these whether they're open pipeline like the Mill or Transmeta lineage or closed like the Itanium. Or the Hexagon which uses round robin multi-threading to make every operation look like 1 cycle so it can be both ;)

Both Itanium and Mill support load speculation, which introduces Spectre-like vulnerabilities depending on how it is used. Someone on the Mill team said they had to make a compiler change for this: https://news.ycombinator.com/item?id=16125519

Yeah, if the compiler hoists both loads to before the check that would be a problem. But that's not really anything tied to the architecture, you could have the same problem with the prelude of an unrolled loop on a conventional machine.

The same code would be an ordinary compiler correctness bug on a conventional architecture, because the second load could trap when reading from an invalid address. Itanium and Mill offer instructions that don’t trap, return an invalid value instead, and allow the trap to be forced later (if desired). This is fundamentally new functionality that exposes new vulnerabilities if it is used in certain ways.

Ah, right, that makes a lot of sense.

Sure, the problem is that those optimizations are critical to making in-order machines competitive. The idea of Itanium and friends was to move the complexity to the compiler. Speculation and dependency breaking are critical for performance, whether they are done by the compiler, the CPU or the programmer.

I believe that Nvidia Denver, an in-order VLIW design, is vulnerable. In the case of Denver the speculation is mostly done in the firmware JIT frontend, but I wouldn't be surprised if the aggressive optimizations done by compilers to make VLIW and in-order CPUs competitive are similarly exploitable.

I hadn't been thinking of the JIT frontend. It's fair to call that part of the architecture and yeah, I can see how it could be vulnerable to Spectre.

How does the Mill architecture handle branch prediction?

It speculates, of course, but because the speculation is so shallow it can't read and then pass on privileged information before the mis-speculation is caught.

EDIT: Oh, I should also point out that it isn't just branches. A pipelined processor is also speculating if it continues issuing instructions before memory accesses are fully resolved, because those accesses might result in exceptions that render the following instructions invalid.
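The "too shallow to leak" argument above can be sketched as a toy model. This is only an illustration, not how the Mill or any real CPU works: the step-counting `window`, the `run_gadget` and `probe_cache` names, and the memory layout are all hypothetical. The idea is that a Spectre-style gadget needs two dependent steps on the mispredicted path (load the secret, then touch a cache line selected by it), so a window shorter than that gets squashed before anything is encoded in the cache.

```python
# Toy model only: real speculation windows are measured in cycles and
# depend on the microarchitecture. All names here are hypothetical.

def run_gadget(memory, index, array_len, window):
    """Model `if index < array_len: probe[memory[index]]`.

    `window` is the number of instructions that may run speculatively
    before a failed bounds check is caught and the work is squashed.
    Returns the set of probe lines left in the (modelled) cache.
    """
    cache = set()
    # An in-bounds access is architecturally valid, so both steps run.
    # An out-of-bounds access only runs as far as the window allows.
    steps_allowed = 2 if index < array_len else window
    if steps_allowed >= 1:
        value = memory[index]      # step 1: load (possibly a secret)
        if steps_allowed >= 2:
            cache.add(value)       # step 2: dependent load leaves a trace
    return cache


def probe_cache(cache):
    """Attacker's view: which probe line is cached, if any."""
    return next(iter(cache)) if cache else None


# A 4-element public array with a secret byte sitting right after it.
memory = [10, 11, 12, 13, 42]
deep = probe_cache(run_gadget(memory, 4, 4, window=2))      # 42
shallow = probe_cache(run_gadget(memory, 4, 4, window=1))   # None
```

With a window of two steps the out-of-bounds value 42 shows up in the cache; with a window of one step the secret is loaded but never encoded, so the attacker sees nothing.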

What about in terms of performance, given this limited speculation?

I think you're right on everything but "critical for performance".

Some chips (mostly Intel) have seen pretty big performance drops. But nothing like 50%+.

Cloud could cost 50% more. We will bear it.

None of the fixes have completely eliminated speculative execution; they have only eliminated it in a few specific cases. Wiping it out across the board would likely have a massive impact on performance.
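As an illustration of what a targeted fix can look like, one known approach is to leave speculation on and instead clamp an array index with arithmetic rather than relying on a branch, in the same spirit as the Linux kernel's `array_index_nospec` helper. The sketch below only shows the arithmetic; `clamp_index` and `read_element` are hypothetical names, and Python itself makes no guarantees about what the hardware speculates.

```python
def clamp_index(index, size):
    """Return `index` if it is below `size`, otherwise 0, without a branch.

    Assumes 0 <= index < 2**63 and 0 < size < 2**63 (an assumption of
    this sketch, mirroring a 64-bit machine word).
    """
    # (index - size) is negative exactly when index < size, and an
    # arithmetic right shift turns that into -1 (all ones) or 0.
    mask = (index - size) >> 63
    return index & mask


def read_element(array, index):
    """Read through the clamped index.

    A real caller would still perform the normal bounds check; the clamp
    only ensures that even a mispredicted path cannot form an
    out-of-bounds address.
    """
    return array[clamp_index(index, len(array))]


data = [10, 11, 12, 13]
in_bounds = read_element(data, 2)       # 12
out_of_bounds = read_element(data, 9)   # clamped to index 0, so 10
```

The point of the design is that the clamp is a data dependency, not a prediction, so there is nothing for the branch predictor to get wrong.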

Good point. I wonder what the real world impact is, I can imagine all the task switching that happens (at least on my VMs) could have a big impact somehow. Either bad, as in more waiting on memory for instance, or good, as in there wasn't much speculation going on anyway because of it.

> Cloud could cost 50% more. We will bear it.

Likely far less impact than this, as much of the cloud isn't CPU-bound, right?

I think that's his point; cloud is often network-bound, which can be orders of magnitude more costly.

Chandler Carruth has a good talk with an example of why speculative execution is critical for performance: https://www.youtube.com/watch?v=2EWejmkKlxs The relevant part starts at around 36m13s.

That's a fascinating video about how modern processors work, but I don't see here why it's critical for performance. If you built a CPU without speculation, how bad would perf be? What other features could you still use? How much do common algorithms depend on speculation?

Superscalar processors have a deep pipeline with many execution units and keep a lot of instructions in flight, so the penalty of a misprediction or stall is significant. Every time the processor reaches a branch instruction that depends on a result which is not yet available, it has to either speculate or stall. Most programs consist of small amounts of compute code followed by a branch that might depend on the results of that code.
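A back-of-the-envelope version of that argument, as a minimal sketch: the model below charges a fixed penalty per branch, either on every branch (stall) or only on mispredicted ones (speculate). The function name and every number in it are made up for illustration, not measurements of any real chip, and charging the full penalty on every stalled branch is deliberately pessimistic.

```python
def cycles_per_instruction(base_cpi, branch_fraction, penalty, accuracy=None):
    """Average cycles per instruction under a very simple pipeline model.

    base_cpi        -- cycles per instruction with no branch cost at all
    branch_fraction -- fraction of instructions that are branches
    penalty         -- cycles lost when a branch stalls or is mispredicted
    accuracy        -- predictor hit rate; None means "never speculate,
                       stall on every branch"
    """
    if accuracy is None:
        wasted = branch_fraction * penalty
    else:
        wasted = branch_fraction * (1.0 - accuracy) * penalty
    return base_cpi + wasted


# Illustrative assumptions: 4-wide core, 1 in 5 instructions is a branch,
# 15-cycle penalty, 95% prediction accuracy.
stall = cycles_per_instruction(0.25, 0.2, 15)                       # about 3.25
speculate = cycles_per_instruction(0.25, 0.2, 15, accuracy=0.95)    # about 0.40
slowdown = stall / speculate                                        # roughly 8x
```

The ratio moves a lot with the assumed penalty and branch density, which is why estimates of the cost of removing speculation vary so widely.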

Conceivably this could all be put into the compiler.

Conceivable, yes. Practical? Not so much.

This is an idea that I personally love, but that hasn't fared well so far. Compilers are not as good at assigning instruction schedules statically as hardware is at doing so dynamically.

curious as to why hardware can do it dynamically while software can't. It's all logic in the end.

I can understand "not being able to statically compile it because every architecture is different" but, presuming our compiler compiled to a specific platform - why wouldn't it be able to dynamically rearrange in, say, a JITted fashion using exactly whatever logic is available in the hardware?

Hopefully I'll put together a more technical answer in a while, but for now I'll just point out that when talking about performance, reducing things to "it's all logic in the end" makes little sense. We could emulate a modern CPU on an 8-bit microcontroller, but the performance would be bad.

You mean speculative execution? Do you know of a sample implementation?

Meltdown in particular is worse than that, I believe. It would allow user-space programs - even regular sandboxed javascript in a browser - to read memory from the kernel and maybe other processes within the same OS?

I think that is a safe conclusion.

I'm not sure whether, or how well, these things can be mitigated in the hypervisor, but given that current mitigations don't seem to work in all cases, I'm not hopeful for x86 speculative execution.

I'm hopeful that the insanity of running untrusted code in the same network, machine or even process will be over.

Direct link to paper: https://arxiv.org/abs/1811.05441


Sounds like the only solution would be to enable speculative execution on a per-process basis. Something like Visual Studio and its child processes, core Windows processes and the kernel may fully use branch prediction. "Trusted processes" so to speak. The Web browser, not so much, so no branch prediction, no side channel.

Perhaps new cpu instructions?

- nspex - No SPeculative EXecution from here

- espex - Enable SPeculative EXecution from here

> Something like Visual Studio and its child processes, core Windows processes and the kernel may fully use branch prediction.

Even that does not work.

If the kernel can do speculative execution and user code can cause the kernel to do work (that's kind of the kernel's job), then user code can cause the kernel to leak secret data. This was among the first set of attacks.

Are these already patched?
