
Ginseng: Keeping secrets in registers when you distrust the operating system - tptacek
https://blog.acolyer.org/2019/04/05/ginseng:-keeping-secrets-in-registers-when-you-distrust-the-operating-system/
======
dmitrygr
This seems completely nonsensical! The data in registers hits memory at
context switch time and they do nothing to stop that. So this provides no
protection at all...

~~~
ajross
It's running in a TEE/SGX/TrustZone enclave, so this is inherently firewalled
from routine OS code already.

Whether this is worthwhile, or even works without holes, is sort of an open
question. I agree it sounds a little heavy on the serpent fat, but the
technical promise is definitely achievable.

~~~
dmitrygr
No. The code they claim to protect runs in userspace, not in the TEE.

------
amluto
This seems far more complicated than necessary. This design assumes that a
trusted execution environment is available, and it _also_ assumes that a
trusted hypervisor is available. If you have both of those, then why can’t you
just run your software in an enclave directly protected by the trusted
hypervisor?

~~~
amluto
I read it more carefully. Egads! They invented their own hypervisor that
essentially poorly imitates Xen’s PV mode. I wouldn’t trust this thing at all.

Here are some likely holes:

They prevent the kernel from mapping “sensitive” code into its own address
space, but they don’t seem to prevent the kernel from mapping it into a user
address space with write permission, which is just as bad. (Also, the kernel
already maps most memory writably in the direct map, and they don’t mention
what they do about this, so I would guess that they have a bug.)

They don’t mention protecting sensitive code from DMA.

The kernel can corrupt user code execution in many ways, such as by
corrupting non-“sensitive” registers. They don’t seem to have a rigorous
model to defend against this.

They protect the IDT, but I don’t see anything about protecting the SYSCALL
MSRs. The kernel could redirect SYSCALL to skip the magic hook. This might not
matter if there are no syscalls in sensitive regions.

------
jlrubin
I think that every design that relies on trusted hardware needs to be much
more up front about it (e.g., put it in the title) so the paper can be less
disappointing.

I don't understand academia's obsession with these security theater devices --
it would make sense to see it in industry as a hyped buzzword, but I think it
makes for weak scholarship.

edit: as some note below, it is a TEE, not a TPM or an SE. I don't think the
distinction should distract from the point, so I have amended above.

~~~
munchbunny
What makes TPMs security theater?

Also, where does it say TPM? The paper references Intel SGX and ARM TrustZone,
which to my knowledge are both on-CPU, whereas TPMs generally sit separately
on the motherboard.

~~~
jlrubin
I don't think the distinction matters that much in this context, so I amended
above.

I think they're theater because they do something complicated with ambiguous
security benefits. And even if they are used correctly, flaws in the designs
like spectre/meltdown/foreshadow/rowhammer/etc etc compromise these use cases.

~~~
Certhas
Because the secret is never in memory, wouldn't it be safe against exactly
the attacks you mention?

~~~
jlrubin
Not quite -- for example, the way that registers are implemented in modern
processors is super complex.

Here's a sketch of an attack against this:

In modern CPUs, the registers don't actually get overwritten; they get
renamed. That means the old data is still there, just no longer
architecturally accessible. It's possible the data is still sitting in the
physical register file under an old name. The predictors will be predicting
branches and other things based on those registers, so by issuing the right
instructions and measuring delay, you might be able to build an oracle that
tells you whether you guessed a byte of the stale data correctly.

I also don't think the secrets are actually out of memory; they are out of
_your_ memory, where "your" means the kernel and user space, but not
necessarily the TEE Secure World memory. That is, of course, the same RAM,
just protected by a page table.

""" To ensure code integrity the kernel page table is made read-only at boot
time. The kernel is modified to send a request to GService whenever it needs
to modify the page table, GService honours this request only when doing so
would not result in mapping the code pages of a sensitive function. The kernel
is also prevented from overwriting its page table base register so that it
can’t swap the table with a compromised one. """

Of course, this sounds like the perfect kind of attack for rowhammer to break.
Just flip bits in the page table by doing repeated reads, then overwrite a
sensitive function after it's been invoked once and approved, and now you can
leak secrets out that way.

etc

------
jayalpha
Reminds me of:
[https://en.wikipedia.org/wiki/TRESOR](https://en.wikipedia.org/wiki/TRESOR)

~~~
chalst
The approach is rather closer to Zircon.

[https://fuchsia.googlesource.com/zircon/](https://fuchsia.googlesource.com/zircon/)

------
benj111
So assuming you don't even trust the OS, how can you be sure you aren't
running in a virtual machine or something?

You have to trust the OS to set everything up, do I/O, etc. I don't see how
it's tenable to not trust the OS.

~~~
_underfl0w_
Very true, though this seems to plug at least one potential data leak.

------
WallWextra
Wouldn't you need, in addition to patching the interrupt vector, to make sure
there is no existing code running in the kernel when you restore the sensitive
registers? This makes the nginx use case seem kind of unrealistic.

------
rrdharan
This reminds me of the Overshadow paper which VMware published during my time
there. We never ended up shipping it but it was a neat proof of concept:

[https://www.cs.utexas.edu/~shmat/courses/cs380s/overshadow.p...](https://www.cs.utexas.edu/~shmat/courses/cs380s/overshadow.pdf)

EDIT: I see Overshadow[14] was indeed one of the cited references.

Also, direct link to the actual paper is here: [https://www.ndss-
symposium.org/wp-content/uploads/2019/02/nd...](https://www.ndss-
symposium.org/wp-content/uploads/2019/02/ndss2019_01A-2_Yun_paper.pdf)

------
sneakernets
> we minimize the unsafe part of GService to a small amount of assembly code

Good.

If you want as much security as you can get in an unsafe environment, you're
going to need to write the entire routine yourself, sometimes on a level down
to the bare metal if you have to.

~~~
blitmap
I have not worked in C in a long while. Would you mark the assembly section as
volatile to avoid the compiler & assembler doing anything to it? Is there any
guarantee that the assembler will not aggressively re-optimize assembly-
within-C?

~~~
aidenn0
Most C compilers vary from fairly to completely hands-off when it comes to
inline assembly.

You can also just write it in a separate assembler file and then the C
compiler does not see it.

That leaves just the linker, and most optimizing linkers will treat code
outside the purview of the compiler as a black box; otherwise you wouldn't
be able to link with code that makes system calls.

~~~
Avamander
How does link-time optimization affect a separate assembler file?

~~~
comex
In general, in the implementations I've seen, "link-time optimization" is a
bit of a misnomer. It's more like a glorified version of `gcc -combine`. The
"compiler" binary stuffs its half-finished results (IR) into fake object
files, and the "linker" binary calls back into the compiler (built as a
library) and sends it the IR from all the fake objects, which the compiler
combines and builds into one giant, real object file. That object file is sent
back to the linker, which goes on to do its normal job. (LLVM ThinLTO is a bit
more advanced than that in terms of scalability and incrementality, but it
maintains the same "hands-off" approach from the linker's perspective.) On the
other hand, if the linker sees an object file passed to it is a real object,
it doesn't send it to the compiler and handles it during the normal linking
phase instead. And if you build a .s file, you always get a real object file,
even if you passed -flto.

TL;DR: It doesn't affect it.

------
comex
Props to the blog post for being quite well written: easy to understand, yet
also thorough enough to explain exactly what Ginseng is and how it works. By
"easy to understand" I don't just mean the introduction, which goes over the
motivation at a high level, but also the lower-level explanation, with a handy
C and assembly comparison that helps explain what the transformation actually
does.

...On the other hand, the design itself seems like a pretty massive hack. The
goal is to turn parts of a userland process into the equivalent of a TEE
component, without having to manually separate the codebase into two pieces
and set up IPC between them. But although that kind of "automagic" approach is
easier to use, it also makes it really easy to write security flaws.

For instance, in the example code:

    
    
        void hmac_sha1(sensitive long key_top,
                       sensitive long key_bottom,
                       const uint8_t *data,
                       uint8_t *result) {
            sensitive long tmp_key_top, tmp_key_bottom;
            /* all other variables are insensitive */
    
            /* HMAC_SHA1 implementation */
        }
    

It's quite dangerous to say that all other variables are insensitive! It's
hard to say for sure without seeing the actual implementation, but SHA-1
requires first expanding the message into a state of 80 32-bit words, before
performing 80 rounds of hashing on them. If the state is treated as
insensitive, another core could read it out before it actually goes through
hashing, in which case the key could be easily recovered. This design _might_
be secure if the SHA-1 function is separate and itself marks all state as
sensitive, as long as the key never leaks into memory in between, but that's
not how SHA-1 implementations usually work, so I'm pretty suspicious.

I tried to find the actual code to determine whether it's actually vulnerable,
but failed: it's supposed to be released as open source [1], but the
instructions involve downloading from a GitHub repo [2] which is currently
marked as private, I guess by mistake.

...I don't really understand why Ginseng doesn't just mark all variables in a
sensitive function as sensitive; it's not like memory for the secure stack is
particularly scarce. That still leaves other attack vectors, though.

[1]
[http://www.ruf.rice.edu/~mobile/ginseng.html](http://www.ruf.rice.edu/~mobile/ginseng.html)

[2] [https://github.com/susienme/ndss2019_ginseng_arm-trusted-
fir...](https://github.com/susienme/ndss2019_ginseng_arm-trusted-firmware.git)

------
ryacko
Modern chips have over a hundred registers per logical core, but only 16 or
32 of them can be explicitly accessed.

~~~
MuffinFlavored
Do you think in the future more registers will be accessible, increasing
performance?

~~~
bluGill
Back when AMD designed x86_64, they asked that question. They concluded it
was better to expose only a few registers, because that was all compilers
really needed (compiler writers had learned a lot of tricks from the mess
that x86 was), and the smaller instruction encoding that fewer registers
allow made for greater performance.

~~~
MuffinFlavored
> from the mess that x86 was

Is ARM seen as a mess? Are x86's days numbered as ARM catches up and becomes
more widespread for personal computing devices?

~~~
bluGill
ARM is much better as an instruction set, but it isn't clear if that will ever
matter.

------
termie
The upcoming MKTME support in future Intel processors will hopefully make this
problem simpler to solve.

~~~
wahern
AMD calls this SEV and has shipped it for some time. It doesn't help much.
Here's one of the latest attack papers also with a summary of previous work:
[https://arxiv.org/pdf/1901.01759.pdf](https://arxiv.org/pdf/1901.01759.pdf)

------
toomuchequate
What trust do we need the OS to have?

Encryption, keystrokes?

The web has made desktop applications often unnecessary.

------
njacobs5074
It seems like this kinda' just pushes the issue around. For example, how do we
know that the Ginseng compiler is trustworthy? Of course, there could be a
trusted authority for it but I don't see how this is different from a trusted
authority for an OS.

~~~
aidenn0
Sufficiently complex operating systems will never be trustworthy, not because
of maliciousness, but because of bugs.

Pick any old version of Linux and browse the local privilege escalation
attacks. It seems likely that recent versions of Linux have as-yet
undiscovered attacks. The existence of even one means the OS is not
trustworthy in the presence of non-trustworthy applications.

As long as Ginseng is more amenable to verification than the Linux kernel,
this isn't just pushing the issue around, but rather reducing the work needed.

