
Memory Deduplication: The Curse that Keeps on Giving [video] - ianopolous
https://media.ccc.de/v/33c3-8022-memory_deduplication_the_curse_that_keeps_on_giving
======
Animats
OK, so "containers" were invented so every program could have its own special
set of operating system packages. This resulted in hugely bloated memory
consumption, of course. Then each package could be run in its own virtual
environment for isolation.

But most of the containers held the same operating system packages anyway. So
memory de-duplication was developed to reduce the bloat. Then flaws in
memory de-duplication broke the isolation.

There's something very wrong with this attempt to fix the problem by adding
more layers.

~~~
gumby
What happened to just using chroot(), namespaces, and copy-on-write?

~~~
eternalban
I am watching it now but he's talking about VMs ("KVM") and not "containers".

As to why we're in this mess, this rather amusing talk by B. Cantrill
discusses the history:
[https://youtu.be/hgN8pCMLI2U](https://youtu.be/hgN8pCMLI2U)

(Linux) Containers are not VMs:
[https://wiki.archlinux.org/index.php/Linux_Containers](https://wiki.archlinux.org/index.php/Linux_Containers)

~~~
antoniob
Thanks for the youtube link! Didn't see that one before.

Regarding the attacks and how they relate to containers: it's true that two of
the attacks were targeting KVM/KSM, i.e. VMs. But one attack was entirely
inside a process, and conceptually the problem also applies to containers.

[https://openvz.org/Comparison](https://openvz.org/Comparison)

I haven't yet looked at different container implementations but the row 'Page
sharing' suggests that there might be some form of memory deduplication.

You could probably use KSM under Linux across containers by registering all
memory to KSM through madvise() and MADV_MERGEABLE
([http://man7.org/linux/man-pages/man2/madvise.2.html](http://man7.org/linux/man-pages/man2/madvise.2.html)).

[https://openvz.org/KSM_(kernel_same-page_merging)](https://openvz.org/KSM_\(kernel_same-page_merging\))
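
For example, a process could opt its own anonymous memory into KSM scanning
with something like this (a minimal sketch; assumes a kernel built with
CONFIG_KSM and ksmd enabled via /sys/kernel/mm/ksm/run):

    /* Minimal sketch: register an anonymous mapping with KSM. */
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t len = 16 * 4096;   /* 16 pages */

        void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED) {
            perror("mmap");
            return 1;
        }

        /* Fill the pages with identical content so ksmd has something to merge. */
        memset(buf, 0xAB, len);

        /* Register the region with KSM; ksmd will scan it and merge identical pages. */
        if (madvise(buf, len, MADV_MERGEABLE) != 0) {
            perror("madvise(MADV_MERGEABLE)");
            return 1;
        }

        puts("region registered with KSM; check /sys/kernel/mm/ksm/pages_shared");
        return 0;
    }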

~~~
eternalban
> conceptually the problem also applies to containers.

Yep. c.f. BC's talk @ 34:21 "To improve storage efficiency"

------
cixin
The final conclusion the speaker comes to in this talk is that you should
disable de-duplication. He doesn't seem to suggest any other way of mitigating
it.

Essentially, the first method is a timing attack. You can tell whether a
crafted page has been deduplicated or not by the time it takes to modify the
page.

Modifying a deduplicated page will take longer, because a new copy has to be
created. The only way I can see around this is to introduce random delays into
all page writes. That might be feasible, I guess, if you only needed to delay
a small percentage of writes, but it's likely the performance penalty would be
unacceptable.
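
A rough illustration of that measurement (the threshold is hypothetical and
would need calibration; in a real attack the crafted page would first be left
untouched long enough for ksmd to merge it):

    /* Sketch of the timing side channel: time the first write to a page.
     * A write that triggers copy-on-write (because KSM merged the page)
     * takes noticeably longer than a write to an ordinary private page. */
    #define _GNU_SOURCE
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <time.h>

    /* Time a single write to the first byte of a page, in nanoseconds. */
    static uint64_t timed_write(volatile char *page)
    {
        struct timespec t0, t1;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        page[0] = 0x42;   /* first write to a merged page faults and copies */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        return (uint64_t)(t1.tv_sec - t0.tv_sec) * 1000000000ULL
             + (uint64_t)(t1.tv_nsec - t0.tv_nsec);
    }

    int main(void)
    {
        /* In the actual attack, the attacker fills this page with content it
         * guesses the victim also has, then waits for ksmd to merge it.
         * Here we only show the measurement itself. */
        volatile char *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (page == MAP_FAILED) {
            perror("mmap");
            return 1;
        }

        uint64_t ns = timed_write(page);
        /* Purely illustrative threshold; calibrate against known-private pages. */
        printf("write took %llu ns -> %s\n", (unsigned long long)ns,
               ns > 2000 ? "probably deduplicated (CoW)" : "probably private");
        return 0;
    }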

~~~
ris
I guess something that could help would be for the deduplicator to be slightly
more conservative. When doing a deduplication pass, if it finds a duplicate
page, rather than merging them straight off, mark one of the pages as
"reclaimable", but only _do_ that reclaim once the page is required for
recycling. In the intervening time, the different users are still pointing at
their private copies and no CoW has to take place if there is a modification
(it is simply un-marked as a deduplication candidate). "Lazy" deduplication.

Then the attacker would _also_ have to force the system into enough memory
pressure to be requesting to recycle these pages - something it may not be in
the position to do if it is a guest with capped resources. The pages would
also presumably be recycled in a less predictable order, making it harder to
come up with a simple "wait 10 minutes" rule to ensure the recycling has taken
place.

Now, I'm sure what I've described would not be particularly simple to
implement, but that's another thing.
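
Roughly, the bookkeeping could look something like this (a userspace toy model
of the idea, not kernel code; the structures and names are made up purely for
illustration):

    /* Toy model of "lazy" deduplication: duplicates are only marked as
     * reclaimable, and the actual merge is deferred until memory pressure. */
    #include <stdbool.h>
    #include <stdio.h>

    struct page {
        int          content_hash;   /* stand-in for a full page compare */
        bool         reclaimable;    /* duplicate found, but not yet merged */
        struct page *dup_of;         /* the page it could be merged into */
    };

    /* Dedup pass: record duplicates instead of merging immediately. */
    void lazy_dedup_pass(struct page *pages, int n)
    {
        for (int i = 0; i < n; i++)
            for (int j = 0; j < i; j++)
                if (pages[i].content_hash == pages[j].content_hash) {
                    pages[i].reclaimable = true;
                    pages[i].dup_of = &pages[j];
                    break;
                }
    }

    /* A write just cancels the candidate; no copy-on-write fault is needed,
     * so there is no timing difference for the writer to observe. */
    void on_write(struct page *p, int new_hash)
    {
        p->content_hash = new_hash;
        p->reclaimable = false;
        p->dup_of = NULL;
    }

    /* Only under memory pressure are marked pages actually reclaimed. */
    int reclaim_under_pressure(struct page *pages, int n, int needed)
    {
        int freed = 0;
        for (int i = 0; i < n && freed < needed; i++)
            if (pages[i].reclaimable) {
                /* here the real merge / CoW setup would happen */
                pages[i].reclaimable = false;
                freed++;
            }
        return freed;
    }

    int main(void)
    {
        struct page pages[4] = {
            { .content_hash = 1 }, { .content_hash = 1 },
            { .content_hash = 2 }, { .content_hash = 2 },
        };

        lazy_dedup_pass(pages, 4);
        on_write(&pages[1], 3);                 /* cheap: just clears the mark */
        printf("reclaimed %d page(s)\n", reclaim_under_pressure(pages, 4, 2));
        return 0;
    }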

~~~
DannyBee
This also has performance implications since now you can't use the
"reclaimable" memory for cache, etc.

~~~
ris
Well, I'd say if the need arises to use it for cache, reclaim it and perform
the dedup.

