
>> These attacks leak information through micro-architectural side-channels which we show are not mere bugs, but in fact lie at the foundation of optimization.

So we'll need to have non-speculative execution for cloud CPUs and stronger efforts to keep untrusted code off our high performance CPUs. This may even lead to chips with performance cores and trusted cores.

No. The paper notes that Spectre can, and in the future will be able to, defeat all programming-language-level techniques of isolation. With properly designed OoO, Spectre cannot defeat process isolation. The fundamental lesson that everyone must take to heart is that in the future, any code running on a system, including things like very high-level languages running on interpreters, always has full read access to the address space of the process it runs in. Process isolation is the most granular security boundary that actually works.

Or in other words, running javascript interpreters in the same address space where you manage crypto is not something that can be done. Running code from two different privilege levels in the same VM is not something that can be done. Whenever you need to run untrusted code, you need to spin up a new OS-managed process for it.

So this is something that I've never gotten a full answer to: what is the difference between a "thread" and a "process" in this model?

This isn't a facetious question. A thread is just, at its core, a process that shares memory with another process. (In fact, this is how threads are implemented on Linux.) But all, or virtually all, processes also share memory with other processes. Text pages of DLLs are shared between processes. Browser processes have shared memory buffers, needed for graphics among other things.

What separates processes that share memory from threads that share memory regarding Spectre? Is it the TLB flush when switching between processes that doesn't occur between threads? Or something else?

For Meltdown (Spectre v3, iirc), it's not so much sharing memory as sharing an address space. Processes have different page tables; threads within a process share page tables.

For spectre v1 and v2, right now (on existing hardware) mostly nothing separates threads from processes. In the future, process isolation is a good candidate for designing hardware + system software such that different processes are isolated (via partitioning the caches, etc).

You probably still want threads within a process to share cache hits.

So, if that's true, why is Chrome considered to have solved Spectre? Browser content processes from different domains share some memory. Moreover, if process boundaries don't have any effect on the branch predictor on current hardware, then why is process separation relevant at all? Doesn't all this mean Spectre is still an issue?

I guess I jumped the gun a bit in my comment above.

In terms of the possibility of exploit, as I understand there isn't at this point any isolation between processes.

In terms of the ease of exploit, being able to run untrusted code in the same process as the victim helps quite a bit. Otherwise, you have to find a gadget (i.e. qualifying bounds check for v1, indirect branch for v2) in the victim process that you can exploit from the attacker process. Possible, but quite a bit harder than making your own gadget.

This all ignores the forward looking reasons process isolation is a good idea. I can't keep track of the latest mitigations in Linux, but they pretty much all will only help between processes by flushing various hardware data structures. And hopefully someday we will have hardware actually designed to restore the guarantees of isolation between processes.

I'm pretty sure this is accurate, but I'm just a random guy on the internet so don't trust my word for it too much.

It's not really about process isolation then, but about the amount of control untrusted code can have over a process. Which means that if everything that code can do is masked to some part of the process, it should be possible to achieve the same isolation between such subprocesses within the OS process boundaries. Although the paper claims this is too hard.

Chromium has not fully solved Spectre. It is still too expensive to run one process per domain, so many unrelated pages run in the same process. But Chromium contains a few mitigations that make exploiting Spectre from JS much harder.

The threat model is: code triggering spectre v1 gets read access to the entire address space currently mapped in. ^*

Since process boundaries are enforced by not mapping any ram not usable by the process, this means they don't get violated by spectre v1. If you have two threads which only share part of their address space, the unshared part is protected. Any executable or library mapped into multiple processes is readable from any of them.

^*: With modern CPUs, multiple processes can be mapped in simultaneously using ASIDs; however, this doesn't matter, because ASIDs work as they should and properly isolate the processes. You can just assume the model where only one process is mapped at a time.

Your description implies the existence of another mitigation. Namely: When you enter untrusted code you mprotect() all sensitive areas and remove PROT_READ. When exiting the untrusted code you add the permissions back.

Are you sure that works? As I understand it, the issue with Spectre is the branch predictor, not the memory mappings. The reason why process isolation works is that branch prediction gets reset on context switch (or that this will happen on newer generations of hardware in the future).

Mprotect should in fact work, but it is likely more expensive than actual process separation. Resurrecting segments or using virtualization hardware in userspace (see libdune) might be workable solutions.

The issue is that speculation allows bypassing software-enforced bounds checking, but, discounting Meltdown, the hope is that hardware can still enforce its own checks.

mprotect does not issue a memory barrier (mfence), so whilst the memory is theoretically protected, the protection is practically delayed, and the data can still be read from the cache via side channels. Same issue with the unsafe bzero call. A compiler barrier is not safe enough to delete secrets.

Mprotect should work because even under speculation the CPU shouldn't allow a read to an invalid address to be executed. Meltdown shows that some CPUs speculate past even this sort of check, but it seems that it is not inherently required for a high-performance implementation.

"Text pages of DLLs are shared between processes"

I thought this wasn't possible with ASLR'd relocations all over the place in the text?

Most modern architectures make extensive use of PC-relative instructions for branches and loads/stores. That means when rebasing a binary you only need to modify the pointers in the data segment (things like GOT entries, etc.) and can leave the text untouched.

> With properly designed OoO, Spectre cannot defeat process isolation

It's worth noting that no existing or announced common hardware is "properly designed" according to this condition. Even the "fixed" Intel hardware that's been announced is still vulnerable to spectre v1 across process boundaries.

> It's worth noting that no existing or announced common hardware is "properly designed" according to this condition. Even the "fixed" Intel hardware that's been announced is still vulnerable to spectre v1 across process boundaries.

AMD Zen is.

Spectre v1 (bounds check bypass) only works inside processes. All it allows you to do is to read any memory location currently mapped into your address space, and so it gives anything that can execute code complete read access to the address space of the process it's running in. On Intel CPUs, this also allows reading the kernel address space, unless kpti is used. Eventually, the ability to read kernel memory will be removed, and so kpti becomes unnecessary.

On all AMD post-BD cpus, spectre v1 cannot be used to read kernel address space.

All the rest of spectre (and meltdown) can eventually be fixed, but it is effectively impossible to make a cpu that is both fast and doesn't exhibit spectre v1.

Regardless of the accuracy of your claims regarding spectre v1, I'd like to see a source saying that spectre cannot defeat process isolation on AMD Zen. I've found a lot of sources that don't support that, and none that do. The closest thing I've read is a statement that Zen 2 will have some mitigations for spectre.

> Spectre v1 (bounds check bypass) only works inside processes

I don't think this is true. If it is, why did Linux add speculation barriers to bounds checks in the kernel?

I was in a discussion of this last week on another thread - see my previous comments for why I think spectre v1 has impact across processes.

Hi twtw,

I think you were having that discussion with me.

So, I went and read the whole lkml threads you linked, and if I understood correctly, regarding spectre v1, the kernel is only expected to be vulnerable to BPF-based attacks or similar. As far as I understand, the speculation barriers are used to protect arrays directly accessible by BPF programs.

There is a mention of out of process attacks to other userspace programs, but no details.

By carefully crafting inputs, I'm ready to admit that it might be theoretically possible to attack some exploitable branches, but the big deal with spectre is the high bandwidth that can be attained by directly running code in process.

Do you have any pointer to a description of an even remotely practical out-of-process spectre v1 attack that doesn't involve executing code in process? Repurposing an interface that is not meant to be used to run code (i.e. build your own VM) is fair game.

Here's alan cox saying what I've been trying to say, from https://marc.info/?l=linux-kernel&m=151503218808512&w=2:

> If you read the papers you need a very specific construct in order to not only cause a speculative load of an address you choose but also to then manage to cause a second operation that in some way reveals bits of data or allows you to ask questions.

> BPF allows you to construct those sequences relatively easily and it's the one case where a user space application can fairly easily place code it wants to execute in the kernel. Without BPF you have to find the right construct in the kernel, prime all the right predictions and measure the result without getting killed off. There are places you can do that but they are not so easy and we don't (at this point) think there are that many.

> The same situation occurs in user space with interpreters and JITs, hence the paper talking about javascript. Any JIT with the ability to do timing is particularly vulnerable to versions of this specific attack because the attacker gets to create the code pattern rather than have to find it.


> big deal with spectre is the high bandwidth that can be attained by directly running code in process

That depends on your perspective. If you are an OS developer who strives to guarantee process isolation, then it is a pretty big deal that spectre v1 allows you to read memory from the kernel or from other processes, even if it might be tricky to do so. If you write a JS JIT, then yeah, you are probably most concerned about the single-process case.

> remotely practical

IMO, most spectre attacks are not remotely practical. No, I don't have a pointer. The only actual demonstrations of spectre I've seen is the one included with the original paper (single process).

I'm not aware of a spectre v1 attack for the Altair 8800, or indeed any of the CPUs of that era: Z80, 6502, 8086/8...

But then things moved on: standard ways to add more cache with multiple cores were lapped up, and later on we found a design flaw that echoed back for a decade or more across all these multi-core CPUs.

Though in fairness, and to put some context on all this: CPU design is more complex than writing tax laws, yet we have exploits for tax laws appearing and being used all the time by large corporations. The comparison is not ideal, and some would say unfair, but it does highlight that nothing is perfect, and what we class as perfect today (or darn close) could very well be classed as swiss cheese in the future. It comes down to how far away that future is. After all, we still use encryption that we have (on paper) shown to be flawed against future quantum computers.

But in a world that was aware of Y2K decades before the event, the penchant of business to drive everything to the last minute for profit will always be a factor in advancements. After all, if CPU cores had isolated caches instead of shared ones, that would mitigate so many issues, yet it would cost more to make, and most consumers would not appreciate the extra cost for what is, to them, little value over the cheaper solution. That's business for you, and CPUs are made by businesses for profit.

>> The paper notes that Spectre can, and will in the future be able to defeat all programming language level techniques of isolation.

That's why I said we need trusted cores - i.e. ones that don't implement speculative execution or share cache with other cores. Untrusted code needs to be run in physical isolation, not just virtual isolation.

But the real solution to all of this is not to run untrusted code at all. This raises the question of how we come to trust the code we run. The simplest and most obvious thing we need to do is disable javascript. I mean, how can you possibly trust code that came in a 3rd party payload used for advertising? How can you trust anything from Facebook? Or any of them? The answer is that you can't, and in many cases should not.

What about languages that prevent the program from reading the current time or determining whether some local computation finished before or after some external event? (I'm thinking of something like Haskell code that's executing outside of the IO monad.)

Perhaps we just need to have a more restricted idea about what untrusted code is allowed to do.

In concept this could work. In practice, you basically need to design the language (or at least the compiler and runtime) with this in mind from the beginning, otherwise you might forget about some backdoor.

E.g., if you use Haskell and just verify that the function you are running is not in the IO monad, you might miss some usage of unsafePerformIO. Even if you check their code, if you let them specify dependencies they might manage to sneak a buggy use of unsafePerformIO into a library they submitted to Hackage.

Plus, your restriction is essentially: no clock, no contact with the outside world, no threading, and a carefully considered interface to the host program to prevent time leaks.

For many use cases, this is not workable.

By "something like Haskell code executing outside the IO monad", I meant something like Haskell with the obvious backdoors turned off. (Ghc has a "safe haskell" option to disallow things like unsafePerformIO.)

Disallowing direct access to the outside world is a big restriction, but it may be that a lot of the things you'd want to do inside a sandboxed application that aren't safe could be delegated to trusted code through an appropriate interface.

Threading isn't necessarily a problem; the Haskell Par monad for instance should be fine as there is no program-visible way to know which of two sub-tasks executed in parallel finished first.

Funny you mention Par. I suspect its IVar type could be used as a backdoor. The doc says to not return it, but Haskell does not enforce this.

I looked that up in my copy of "Parallel and Concurrent Programming in Haskell" by Simon Marlow, and indeed you're right -- the runPar function shouldn't let you return an IVar, but it does. This is planned to be "fixed in a future release", but the current online documents say about the same thing.

Presumably, this could be fixed easily by using the phantom type trick (same as ST) but it would make the type signatures ugly and possibly break existing code. (Maybe there's a more modern alternative to phantom types?) So, yeah, you might not want to use the Par monad as it's currently implemented in ghc as your secure parallel sandbox.

The online docs suggest using lvish if you want a safer Par monad interface, which I'm not familiar with (though the lvish docs say that it's not referentially transparent if you cheat and use Eq or Ord instances that lie).

The general idea seems sound, though -- it should be possible to have parallelism in a sandbox environment without allowing the sandboxed program to conditionally execute code based on which of several threads finished some task first.

My point isn't that it is impossible in theory. It is just that, in practice, we do not have a good track record of locking down pre-existing languages. As much as Haskell is one of the more ideologically pure languages, its ecosystem is still written under the general assumption that its programmers are not malicious geniuses.

How would that work?

I think the problem is that runPar is currently allowed to return an IVar to the code that called it, which could then pass that IVar into a different runPar invocation. That's not something the runtime system expects, or the type system should allow.

Putting aside the fact that I don't think you're actually disagreeing with the parent, I don't think it's really accurate to say that all programming level techniques of isolation are defeatable. The paper enumerates a specific set of vulnerable language features (e.g. "Indexed data structures with dynamic bounds checks"). The paper doesn't say much about languages lacking these features. Also, I believe retpoline is a (nearly?) perfect software fix for Intel's microarchitecture.

>and will in the future be able to defeat

umm, "on today's hardware"

Processor affinity would become a security-influenced decision with a CPU like that. That would allow for a whole new class of bugs.

>> Processor affinity would become a security-influenced decision with a CPU like that.

Yes it would.

>> That would allow for a whole new class of bugs.

I think it's necessary, but not easy.

Actually what I think is necessary is for people to stop running code from random places - or even common places. Google could work without running stuff on my machine.

For some types of work, it means there's still a case for on-prem/"private cloud".

Pick two: performance, safety, convenience.

The paper states: confidentiality, integrity, availability

...when discussing something different than what I was.

I hope they still offer CPUs optimised for the non Cloud scenario, where all the programs running are trusted and so these sorts of attacks (which requires local access) are not applicable.

Outside of some very niche scenarios, does this use case even exist? Certainly nothing running a javascript-enabled browser, electron app, or in general any VM of any sort qualifies.

All the VMs run on my employer's servers are running code we trust. None of them run arbitrary code from some third party, because we're not a cloud provider, nor are they used to browse the web. I don't want to slow them down to mitigate vulnerabilities that just aren't a serious risk or even applicable.

Do you guys audit the whole stack then?

Most HPC stuff likely (in my experience) fits this scenario.

Sure. I’m comfortable describing HPC and on-premises fully trusted computing (another response) as “very niche scenarios” though (compared to the much much much larger markets of large cloud farms on the server side, and consumer devices), to the point where I have to wonder whether or not it’s worth it for CPU vendors to cater specific SKUs to them without the silicon mitigations.

Any art creation workflow? E.g. music, video production, 3D modeling & rendering, etc.

I'm curious if this leads to us re-visiting more Itanium-like CPU designs.

No, it won't, because OoO with process isolation is still superior to Itanium-like in all respects.
