
Keeping Memory Contents Secret - Tomte
https://lwn.net/SubscriberLink/804658/8eaf9fdc5477865e/
======
jstanley
> its purpose is to request a region of memory that is mapped only for the
> calling process and inaccessible to anybody else, including the kernel.

Easy fix: at the next context switch, the kernel overwrites the code of the
process with its own code that copies the contents of the "secret" memory to
somewhere the kernel can read it, and then executes its own code.

I guess I don't understand the threat model in which having data protected
from the kernel is helpful.

If the idea is that the kernel is malicious, then it's not going to work in
the first place (e.g. the kernel could just fail to honour the MAP_EXCLUSIVE
flag in the first place). If the idea is that the kernel is initially
trustworthy but gets compromised some time after the memory is made
"exclusive", then it can execute its own arbitrary code in that process's
context and get the contents that way.

~~~
MaulingMonkey
I think the threat model is a buggy kernel - or a kernel insufficiently hardened
against buggy processors with speculative execution side channels - with a
malicious process.

E.g. if syscall S reads kernel memory without properly checking
pointers/offsets (maybe GCC optimized them out), then process A can possibly
use that to steal process B's secrets. Or maybe the kernel _does_ check the
pointers/offsets, but isn't hardened against yet another speculative execution
side-channel attack your processor is vulnerable to, and so leaks some
information via a timing channel.

MAP_EXCLUSIVE breaks that exploit, and to unbreak it you need both syscall S
and another buggy syscall (or maybe a rowhammer attack) - to _write_
kernel memory - and to successfully leverage that into tricking the kernel
into remapping process B's pages.

No panacea, but I could see it raising the bar in exploit difficulty as part of
a defense in depth strategy.

~~~
mike_hock
It does seem weird to protect user space memory from the kernel but not the
memory of kernel subsystems from other kernel subsystems.

But at that point we're at a microkernel design that needs context switches
between kernel processes... Good, I'd welcome that. This feature just
implements a level of security that the whole kernel needs to catch up to
before it makes sense.

------
latchkey
AMD has this:
[https://github.com/AMDESE/AMDSEV](https://github.com/AMDESE/AMDSEV)

SEV is an extension to the AMD-V architecture which supports running encrypted
virtual machine (VMs) under the control of KVM. Encrypted VMs have their pages
(code and data) secured such that only the guest itself has access to the
unencrypted version. Each encrypted VM is associated with a unique encryption
key; if its data is accessed by a different entity using a different key, the
encrypted guest's data will be incorrectly decrypted, yielding unintelligible
data.

~~~
Nokinside
IBM z-mainframes have complete hw assisted total encryption in z14 and later.
Keys are not visible to the hypervisor, OS, or applications; the encryption
covers code and data and is authenticated. Databases, datasets and network
traffic can be transparently encrypted in hardware without altering
applications.

------
tzs
How about where you need some data that is available to all processes running
under the same UID, but not to other processes? How do people handle that
nowadays, especially on a generic server with no special cryptographic or
security hardware?

To make it more interesting, assume that (1) the processes do not have a
common ancestor running as that UID, and (2) if all processes with that UID
exit, the secret should still be available if any new processes are started
with that UID, and (3) the secret must be forgotten when the machine shuts
down or loses power.

Assume that the secret should be safe from attackers who can run arbitrary
code as other non-root users, who can force the machine to reboot, and who can
steal the machine itself. (Stealing the machine causes it to lose power--no
need to worry about a George Costanza/"Frogger" scenario).

For example, imagine a service implemented as a CGI that needs the keys for an
encrypted database it uses, or needs the password for a service that it needs
to access, using Apache using suEXEC to run this service's CGI under a
dedicated UID.

It would seem that you need some kind of persistent storage. The first thing
that comes to mind is a file. It can be owned by the UID of the CGI, and mode
0400. This doesn't meet the requirement of the secret being forgotten on
shutdown, though.

Would good old fashioned System V shared memory work for this? It has
ownership and access control similar to files, so a shared memory segment
owned by UID and mode 0600 should only be accessible to processes with the
right UID (and root). It goes away when the machine shuts down. Linux's System
V shared memory implementation also adds a flag (SHM_LOCK) you can set via
shmctl to keep the segment from being swapped out.

~~~
AgentME
It sounds like you want a file on a ram disk. Linux creates a ram disk by
default in /dev/shm. You could create a file or directory there only
accessible by a specific UID.
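As a sketch of that suggestion (the path and helper name are invented for illustration), a process could create the file like this. One caveat worth hedging: tmpfs pages can still be swapped out unless the machine has no swap or the pages are locked.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Create a secret file readable only by this UID.  O_EXCL makes the
 * call fail rather than reuse a file an attacker may have pre-created
 * at the same path (e.g. under /dev/shm). */
int create_secret_file(const char *path, const char *secret)
{
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, secret, strlen(secret));
    close(fd);
    return n == (ssize_t)strlen(secret) ? 0 : -1;
}
```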

------
m_eiman
It'd be interesting to have the kernel transparently encrypt the secret page
whenever the process is put to sleep or waiting for IO, so that the contents
are only readable when the process is active. Could be a nice, if not perfect,
layer of protection against having the secrets read from other processes. IIRC
there's a way to store crypto keys in CPU registers, so that they won't be
available in a RAM dump.

~~~
saagarjha
You'd have to find a way to pin the keys in the registers, lest they be put in
RAM during a context switch.

~~~
tom_mellior
That's OK if your threat model trusts the kernel and only wants to protect
against speculative execution attacks within the application itself. While the
application is running, the keys will only ever be accessed from registers --
context switches are transparent to the application.

------
newnewpdro
Since MAP_EXCLUSIVE pages are implicitly MAP_LOCKED & MAP_PRIVATE, how much
are they expected to be used in practice on a given system?

If it's an edge use case, couldn't they just use huge pages for such mappings
to avoid the whole page-splitting performance problem in the kernel's linear
mapping? Or only resort to splitting after some amount of huge pages have been
wasted on exclusive mappings?

------
mike_hock
> Somehow, Bottomley suggested, the kernel should make the best choice it can
> for how to protect secret memory

Security can't be a best-effort platform-dependent QoI issue.

What could work would be several tiers with certain minimum guarantees that
will either be fulfilled 100% or EOPNOTSUPP.

If a developer doesn't know what level of security they need or want, the
feature is not for them.
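A hypothetical sketch of such a tiered interface, with all names invented: each tier names a hard guarantee, and anything the platform cannot fully deliver fails with EOPNOTSUPP rather than degrading silently.

```c
#include <errno.h>

/* Invented tiers, each a minimum guarantee that is either met 100%
 * or refused outright -- never "best effort". */
enum secret_tier {
    SECRET_NEVER_SWAPPED,         /* locked in RAM, excluded from dumps */
    SECRET_UNMAPPED_FROM_KERNEL,  /* removed from the kernel linear map */
    SECRET_HW_ENCRYPTED,          /* hardware-backed, e.g. SEV/SGX */
};

/* Returns 0 only if the tier's guarantee can be fully honoured on
 * this platform; otherwise the caller gets a hard failure. */
int secret_map_supported(enum secret_tier tier)
{
    switch (tier) {
    case SECRET_NEVER_SWAPPED:
        return 0;                 /* mlock-style locking is universal */
    default:
        return -EOPNOTSUPP;       /* no partial fallback offered */
    }
}
```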

------
prostodata
Does it relate somehow to Intel SGX (Software Guard Extensions):
[https://en.wikipedia.org/wiki/Software_Guard_Extensions](https://en.wikipedia.org/wiki/Software_Guard_Extensions)

------
dsamarin
What would happen to the data on suspend or hibernate?

~~~
saagarjha
Without consideration, presumably whatever mlock would do to it. So either it
would prevent hibernation or those pages would be cleared?

------
baybal2
Anybody remember Intel MPX?

~~~
nullc
I can't figure out why there doesn't seem to be a commercial market for
something like CHERI (
[https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/](https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/)
) ... even if it were considerably slower than top-end CPUs, there are plenty
of applications that don't care (and find chips like an ARM Cortex-A8 or Atom
fast enough) but where high security is paramount.

Maybe Intel still remembers the pain of the iAPX 432, but that doesn't explain
the rest of the industry.

~~~
rst
ARM is doing CHERI-enhanced cores, at least as a research project:
[https://www.eetasia.com/news/article/CHERI-based-Prototype-to-Be-Developed-by-Arm](https://www.eetasia.com/news/article/CHERI-based-Prototype-to-Be-Developed-by-Arm)

There's also Dover Microsystems, which is trying to commercialize similar-in-
spirit processor extensions originally developed at Draper Labs.

