Intel should just support ECC RAM on all platforms and possibly make it mandatory. It seems quite clear that DIMMs are not reliable, and the platform should stop assuming that they are.
I agree that rowhammer needs a hardware-based defense. While ECC isn't prohibitively expensive and has negligible performance cost, keep in mind it's only a parity check with a proprietary implementation. There's some number of bit flips that will pass the parity check and still achieve the attacker's intended result, and the exploit is definitely going to get multiple attempts. If you get pwned, the malicious process might be in memory and hammering away for hours; maybe it's even dropped a registry key to make sure it auto-runs next boot.
It's true that it is possible to flip enough bits to bypass ECC and make whatever nefarious change.
However, it seems likely that you're first going to cause a lot of correctable errors, which will slow down your system. Even if you don't notice that, chances are you're going to get a double-bit error, which will halt your system, many times before you get a triple-bit (or whatever) error that achieves the desired exploit.
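To make the single/double/triple distinction concrete, here is a minimal sketch of a SECDED (single-error-correct, double-error-detect) code. It is a toy extended Hamming (8,4) code, chosen only because it is small enough to enumerate. It is not the code any particular memory controller uses (those are typically wider and, as noted above, often proprietary), and the function names are made up for illustration.

```python
from itertools import combinations

def encode(nibble):
    """Encode 4 data bits as an 8-bit extended Hamming (SECDED) codeword."""
    code = [0] * 8  # index 0 = overall parity, indices 1..7 = Hamming(7,4)
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    for p in (1, 2, 4):  # parity bits live at the power-of-two positions
        code[p] = sum(code[i] for i in range(1, 8) if i & p) % 2
    code[0] = sum(code[1:]) % 2  # makes the total number of 1 bits even
    return code

def decode(code):
    """Return (status, nibble) the way a SECDED decoder would see it."""
    code = list(code)
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i  # XOR of set positions; 0 for a valid codeword
    odd = sum(code) % 2
    if syndrome == 0 and not odd:
        status = "ok"
    elif odd:
        # Odd number of flips: the decoder ASSUMES exactly one and "fixes" it.
        code[syndrome] ^= 1  # syndrome 0 means the overall parity bit itself
        status = "corrected"
    else:
        status = "uncorrectable"  # even flip count with a nonzero syndrome
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble

def outcomes(nibble, n_flips):
    """Decode results for every way of flipping n_flips bits of the codeword."""
    results = []
    for positions in combinations(range(8), n_flips):
        code = encode(nibble)
        for pos in positions:
            code[pos] ^= 1
        results.append(decode(code))
    return results

for n in (1, 2, 3):
    print(n, "flip(s):", sorted(set(outcomes(0b1011, n))))
```

In this toy code every single flip is repaired, every double flip is reported as uncorrectable (the "halt your system" case), and every triple flip is silently "corrected" into the wrong data. An attacker therefore has to land three simultaneous flips in one codeword without first producing a detectable two-flip pattern.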
There’s a pretty straightforward mitigation to this: only allow enclaves from software authors you trust. The kernel does not have to allow usermode to run enclaves, which is very much by design.
This solution works, with its own caveats, for end users, but doesn't help cloud providers at all. Then again, with current restrictions on SGX, like having to partition out memory for it at boot time, running SGX workloads from VMs on shared machines seems like plenty difficult already.
Cloud providers should already be concerned about rowhammer-related DoS anyway, and be mitigating/preventing rowhammer through other means (e.g. buying RAM that doesn't suck).
> Then again, with current restrictions on SGX, like having to partition out memory for it at boot time,
Huh? Why is this an issue at all? The most memory SGX can directly use is 128MB. So partition it off; it's a small cost if it's likely that you'll use it.
Enclaves are not limited to 128MB in total, since the OS can page in and out SGX pages...
I think you missed my point. I was talking specifically about providing SGX to VM guests on multi-tenant machines. The difficulty is not insurmountable, but other than some experimental Intel patches, KVM doesn’t support it yet.
There is a special dance the OS has to perform in order to do it. In short, the OS executes the EWB instruction, which makes the CPU read the page from the EPC, encrypt it, and write it to normal memory. To load it back, the OS executes the ELDB instruction, which decrypts the page and writes it back to the EPC. There is slightly more to it because of the version array, which is used to prevent replay attacks, but in general, paging in and out is completely fine in SGX.
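The evict/reload dance and the role of the version array can be sketched with a toy model. This is only an illustration of the idea: `ToyEPC`, its methods, and the hash-based keystream are all invented for the example and do not reflect the real instruction operands or the cryptography the hardware uses.

```python
import hashlib
import hmac
import os

class ToyEPC:
    """Toy model of EPC paging; NOT the real SGX data structures or crypto."""

    def __init__(self):
        self._key = os.urandom(32)  # stands in for a key that never leaves the CPU
        self.pages = {}             # page_id -> plaintext currently "inside the EPC"
        self.version_array = {}     # page_id -> nonce of the one valid evicted copy

    def _keystream(self, nonce, length):
        # Illustrative stream cipher built from SHA-256; not production crypto.
        out = b""
        counter = 0
        while len(out) < length:
            block = self._key + nonce + counter.to_bytes(4, "big")
            out += hashlib.sha256(block).digest()
            counter += 1
        return out[:length]

    def _mac(self, page_id, nonce, ciphertext):
        msg = page_id.to_bytes(8, "big") + nonce + ciphertext
        return hmac.new(self._key, msg, hashlib.sha256).digest()

    def ewb(self, page_id):
        """Evict: encrypt the page and hand the blob to untrusted memory."""
        plaintext = self.pages.pop(page_id)
        nonce = os.urandom(8)  # fresh version for every eviction
        stream = self._keystream(nonce, len(plaintext))
        ciphertext = bytes(a ^ b for a, b in zip(plaintext, stream))
        self.version_array[page_id] = nonce  # kept in protected memory
        return ciphertext, self._mac(page_id, nonce, ciphertext)

    def eld(self, page_id, blob):
        """Reload: accept the blob only if it matches the recorded version."""
        ciphertext, mac = blob
        nonce = self.version_array.get(page_id)
        if nonce is None:
            raise ValueError("page is not currently evicted")
        if not hmac.compare_digest(mac, self._mac(page_id, nonce, ciphertext)):
            raise ValueError("tampered or replayed page")
        del self.version_array[page_id]
        stream = self._keystream(nonce, len(ciphertext))
        self.pages[page_id] = bytes(a ^ b for a, b in zip(ciphertext, stream))
```

The point of the version slot is the replay case: if the OS keeps an old blob, lets the enclave update the page, evicts it again, and then tries to load the old blob, the check in `eld` fails because the recorded nonce belongs to the newer eviction.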
> To the best of my knowledge, all SGX attack research papers disable checks and run unsigned SGX code to demonstrate a proof of concept.
While interesting, papers like these usually have titles that are disingenuous at best. I’ve seen the media report on papers with scary titles and make it sound like the sky is falling, but the scenario is generally (not always) contrived.