
ECCploit is the first Rowhammer attack to defeat error-correcting code - rbanffy
https://arstechnica.com/information-technology/2018/11/potentially-disastrous-rowhammer-bitflips-can-bypass-ecc-protections/
======
userbinator
Rowhammer is simply defective RAM marketed as usable. If your RAM will corrupt
itself with certain access patterns, it is defective and should be replaced.

The manufacturers obviously don't want to, but the only way to stop this
stupidity is to reject/return/refuse the product as defective if it shows this
vulnerability. There have been a lot of efforts to downplay it, to the point
that even some memory testing tools have made the tests for RH _optional_.
This is ridiculous not just from a security standpoint, but for overall
correctness. Memory that just doesn't behave like memory should is not fit for
purpose.

Unfortunately RAM defects are often very subtle --- I remember a particularly
irritating one which happened only when extracting a certain ZIP file; all the
memory testing tools said the RAM was fine even with a few days of continuous
running, and the file extracted correctly on a handful of other systems, but
on this one it would always end up corrupt. Attaching a debugger or otherwise
attempting to trace the cause naturally made it disappear. It was only
swapping the RAM with a new module that fixed it.

~~~
tangohead
As a follow-on from this, I recently went to a talk by the author and asked
whether certain brands of RAM were more vulnerable than others. They said that
the newer the brand (and so the newer the production process), the more bits
are likely to flip, but even more established brands would still have some
vulnerable bits.

So it seems more of an issue with a change in acceptable tolerances - what is
fine for normal usage might not be secure. Also, I might have been mistaken,
but the author's response implied that no brand of RAM was immune.

~~~
kabdib
Hmmm . . . some bright spark in marketing will realize there's an opportunity
in "secure DRAM" now . . . :-/

------
atq2119
I wonder to what extent the encrypted memory enclaves in modern CPUs can
mitigate these kinds of attacks.

If the data stored in the physical memory has been encrypted using a randomly
generated secret key by the memory controller on the CPU, it should be
impossible to generate an exact target data pattern in the victim _even when
the aggressor and victim are in the same enclave_.

That in itself isn't sufficient to guard against all attacks, because the
memory enclaves don't provide data integrity, only encryption. So if all
you're trying to do is to change a victim's boolean that controls whether you
have e.g. root access, then changing that boolean from false (0) to true (any
non-zero value) is going to succeed with high probability despite the
encryption.
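A toy sketch of that last point: without integrity, a single flipped ciphertext bit turns the whole block into pseudorandom garbage on decryption, so an all-zero "false" almost certainly decrypts to something non-zero, i.e. "true". The Feistel cipher below is a stand-in built from SHA-256 for illustration, not the cipher any real memory encryption engine uses:

```python
import hashlib

def _round_fn(key: bytes, half: bytes, rnd: int) -> bytes:
    # Pseudorandom round function derived from SHA-256 (toy construction).
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:8]

def encrypt_block(key: bytes, block: bytes) -> bytes:
    # 4-round Feistel network over a 16-byte block. NOT real crypto --
    # just a stand-in for whatever cipher a memory controller might use.
    l, r = block[:8], block[8:]
    for rnd in range(4):
        l, r = r, bytes(a ^ b for a, b in zip(l, _round_fn(key, r, rnd)))
    return l + r

def decrypt_block(key: bytes, block: bytes) -> bytes:
    l, r = block[:8], block[8:]
    for rnd in reversed(range(4)):
        l, r = bytes(a ^ b for a, b in zip(r, _round_fn(key, l, rnd))), l
    return l + r

key = b"secret-dram-key!"
false_flag = bytes(16)              # a boolean "false" stored as an all-zero block
ct = encrypt_block(key, false_flag)

flipped = bytearray(ct)
flipped[0] ^= 0x01                  # one Rowhammer-style flip in the stored ciphertext
pt = decrypt_block(key, bytes(flipped))
print(pt.hex())                     # pseudorandom garbage: almost surely non-zero
```

The attacker never learns what the garbage plaintext is, but for a "0 means false, anything else means true" check, they don't need to.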

Still, maybe there's an angle here that can help put an end to Rowhammer once
and for all in a few hardware generations?

~~~
Confiks
On the subject of RAM encryption, there's even a kernel modification, TRESOR
[1], that (ab)uses the x86 debug registers to store an AES key and
transparently encrypts/decrypts RAM.

> That [encrypted memory] in itself isn't sufficient to guard against all
> attacks

You might want to read about authenticated encryption [2]. The authenticity of
the decrypted data often lies in the block cipher mode (for example OCB)
rather than the block primitive, whereas TRESOR uses the AES-NI instructions
for the block primitive. I would imagine SGX does something along the same
lines.

[1]
[https://www.usenix.org/legacy/event/sec11/tech/full_papers/M...](https://www.usenix.org/legacy/event/sec11/tech/full_papers/Muller.pdf)

[2]
[https://en.wikipedia.org/wiki/Authenticated_encryption](https://en.wikipedia.org/wiki/Authenticated_encryption)

~~~
BeeOnRope
TRESOR doesn't have anything to do with encrypting RAM - it's just a tweak to
existing _disk_ encryption methods which stores the encryption key in a debug
register, rather than RAM. This gives resistance against attackers trying to
break your disk encryption by trying to read your key directly from RAM.

~~~
Confiks
Yes, indeed. Thanks for correcting that.

------
woliveirajr
(previous post:
[https://news.ycombinator.com/item?id=18503795](https://news.ycombinator.com/item?id=18503795))

> Fortunately, while the attack would be extremely difficult to prevent, it
> also looks to be very difficult to actually pull off in the wild. (...) the
> VU Amsterdam team said a successful attack in a noisy system can take as
> long as a week. (from
> [https://www.theregister.co.uk/2018/11/21/rowhammer_ecc_serve...](https://www.theregister.co.uk/2018/11/21/rowhammer_ecc_server_protection/))

Well, if you don't know that you are under attack, the fact that it takes a
week isn't much of an advantage. And if the attack can be divided among many
agents, even if they don't run in parallel, that can make it even harder to
notice that you're under attack.

------
Dylan16807
This attack depends on using many 1-bit errors to characterize the physical
memory address. In other words, it defeats ECC _if your system ignores
errors_. A lot of systems do ignore errors, but if you don't you can easily
detect this attack before it's ready to strike.
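For instance, on Linux the EDAC subsystem exposes per-memory-controller corrected-error counters in sysfs; a sketch of polling them (path and availability depend on kernel and hardware support):

```python
from pathlib import Path

def corrected_error_counts(edac_root: str = "/sys/devices/system/edac/mc") -> dict:
    """Read per-memory-controller corrected-error counts from Linux EDAC sysfs.

    A sudden burst of corrected (single-bit) errors is exactly the signal the
    templating phase of an ECC Rowhammer attack would produce.
    """
    counts = {}
    root = Path(edac_root)
    if not root.is_dir():
        return counts                    # no EDAC support, or not Linux
    for mc in sorted(root.glob("mc*")):
        ce = mc / "ce_count"             # corrected (single-bit) error counter
        if ce.is_file():
            counts[mc.name] = int(ce.read_text().strip())
    return counts

# Example: compare two polls and flag any controller reporting new errors.
previous = corrected_error_counts()
current = corrected_error_counts()
spikes = {mc: current[mc] - previous.get(mc, 0) for mc in current}
print(spikes)
```

A real monitor would poll periodically and alert on the rate of change, not a single snapshot.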

------
Animats
If someone has gotten this far with relatively ordinary tools and published
the results, someone else probably has an exploit based on it but is not
talking about it. You have to assume that the PLA's Third Department and NSA
are working hard on it. Maybe the guys in St. Petersburg, although they
haven't demonstrated that level of capability so far.

------
FrozenVoid
How does this fare vs 2-bit ECC(e.g. Chipkill)? Article only mentions 1-bit
ECC.

~~~
Dylan16807
They talk about it in the paper. It doesn't really make things more difficult;
it just changes which bits need to be flipped.

------
kingosticks
I always assumed that ECC on these high-end systems was done in hardware by
the memory controller. And thus there would be no timing side-channel. Is that
really not the case?

~~~
avianes
Hardware implementation does not mean that there is no timing side-channel.
Computing the ECC takes a few cycles. To avoid a timing side-channel, the
hardware must be designed for that purpose.

~~~
kingosticks
Yeah, I realised afterwards that at these high clock speeds maybe they do need
some extra cycles to do the correction. But I don't see why you must do it in
hardware to avoid timing side-channels. You just need to provide a constant
latency, i.e. the CPU ucode does some nops when correction is not necessary.

~~~
avianes
I mean, as long as your ECC is built directly into the silicon

~~~
kingosticks
In that case you need to do the exact opposite i.e. avoid the obvious
optimisation of returning data early if no correction is required.
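A sketch of that idea using a toy Hamming(7,4) code (illustrative only, not the SECDED code a real server memory controller implements): the syndrome names the flipped bit position, and the correction is applied unconditionally, so the no-error case does the same work as the corrected case instead of taking a faster early-return path:

```python
def encode(nibble: int) -> int:
    # Hamming(7,4): data bits at positions 3,5,6,7; parity bits at 1,2,4.
    b = [0] * 8
    b[3], b[5], b[6], b[7] = [(nibble >> i) & 1 for i in range(4)]
    b[1] = b[3] ^ b[5] ^ b[7]
    b[2] = b[3] ^ b[6] ^ b[7]
    b[4] = b[5] ^ b[6] ^ b[7]
    return sum(b[p] << (p - 1) for p in range(1, 8))

def decode(cw: int) -> int:
    b = [0] + [(cw >> (p - 1)) & 1 for p in range(1, 8)]
    # The syndrome names the flipped bit position (0 = no error).
    s = (b[1] ^ b[3] ^ b[5] ^ b[7]) \
        | (b[2] ^ b[3] ^ b[6] ^ b[7]) << 1 \
        | (b[4] ^ b[5] ^ b[6] ^ b[7]) << 2
    # Constant-work correction: the tempting "if s == 0: skip correction"
    # fast path is exactly the timing leak. Instead, always apply the mask;
    # (1 << 0) >> 1 == 0, so the no-error case degenerates to a no-op XOR.
    cw ^= (1 << s) >> 1
    b = [0] + [(cw >> (p - 1)) & 1 for p in range(1, 8)]
    return b[3] | (b[5] << 1) | (b[6] << 2) | (b[7] << 3)

for n in range(16):
    cw = encode(n)
    assert decode(cw) == n
    assert all(decode(cw ^ (1 << i)) == n for i in range(7))
```

In real silicon "same work" also has to mean same pipeline latency and same memory traffic, which is harder than it looks in a Python sketch.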

------
dooglius
If you can flip one bit, then it isn't a big leap to flip multiple bits. No
one who knew what they were talking about should have believed that ECC was a
protection against rowhammer-style attacks.

------
ccnafr
Leave it to Goodin to overhype a non-practical attack. I wish this guy would
retire. He's brought that overhyping clickbait style of reporting from The
Register to Ars Technica and I can't stand it anymore.

Rowhammer has been around for four years, yet it has never been seen in
practice. Ever. Can we stop with reporters passing off basic academic research
as a current threat? It's just research pr0n, sci-fi hacking. It's not even
remotely close to being a threat.

~~~
leghifla
FPGA developer here.

In a former project (with a custom board and FPGA, no processor), we
encountered random bugs. We put in many checks and finally came to the
conclusion that our DDR modules were flipping bits randomly, but only under
normal load. All test benches ran fine. Putting the modules in a PC did not
show any problem.

How can you tell your boss "all DDR modules are faulty but run fine in a PC"
and not seem crazy?

It was when I read about rowhammer attacks that I made the link. Changing the
addressing schema completely solved the issue.

All these hardware related failures/attacks may not be a threat (yet), but for
me, the underlying sand castle we are building things on is very worrying.

~~~
SlowRobotAhead
>Changing the addressing schema completely solved the issue.

Can you elaborate at all? Do you mean you changed the FPGA to not write to
exactly the same spots over and over? And just set up something similar to
wear leveling?

~~~
leghifla
It was very simple, I kept all the "logical" addresses the same, but swapped
some low address bits with high address bits at the memory controller
interface.

With this, the slowly moving indexes became big jumps. This avoids reading the
same row over and over again (mixed with other requests).
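A rough sketch of that kind of remapping (the field positions and width here are made up for illustration): swapping a low address field with a high one turns sequential logical addresses into large physical strides, so a slowly moving index no longer hammers one row:

```python
def swap_addr_bits(addr: int, lo: int, hi: int, width: int = 4) -> int:
    """Swap a `width`-bit field at bit `lo` with the one at bit `hi`.

    Stand-in for the remap described above, done at the memory controller
    interface: logical addresses stay the same, physical addresses jump.
    """
    mask = (1 << width) - 1
    lo_bits = (addr >> lo) & mask
    hi_bits = (addr >> hi) & mask
    addr &= ~((mask << lo) | (mask << hi))   # clear both fields
    return addr | (hi_bits << lo) | (lo_bits << hi)

# Sequential addresses 0,1,2,3 become widely separated physical addresses:
print([swap_addr_bits(a, lo=0, hi=20) for a in range(4)])
# [0, 1048576, 2097152, 3145728]
```

The swap is its own inverse, so the same function maps physical addresses back to logical ones.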

~~~
avianes
> This avoids reading the same row over and over again

Didn't you notice a huge drop in performance from producing row-buffer
conflicts over and over?

~~~
leghifla
Good question. Jumping around the memory is clearly not good for throughput.
Before the change, the mixing of requests between processes was already
detrimental for performance anyway. All I can say is that we needed a fixed
bandwidth and it was enough, before or after the change.

