Security‐conscious users should be using a gpg smartcard for authentication, in which case the private keys are never held in RAM and shouldn't be susceptible to this type of attack in the first place.
OpenSSH Key Shielding
18 December 2019
1. How key shielding works
On June 21, 2019, support for SSH key shielding was introduced
into the OpenBSD tree, from which the OpenSSH releases
are derived. SSH key shielding is a measure intended to
protect private keys in RAM against attacks that abuse bugs
in speculative execution that current CPUs exhibit. This
functionality has been part of OpenSSH since the 8.1 release.
SSH private keys are now being held in memory in a
shielded form; keys are only unshielded when they are used
and re‐shielded as soon as they are no longer in active use.
When a key is shielded, it is encrypted in memory with
AES‐256‐CTR; this is how it works:
1. A prekey is generated, which is 16 KiB of random bytes
obtained through arc4random_buf(3).
2. The prekey is then hashed using SHA‐512, of which the
first 32 bytes form the encryption key and the next
16 bytes form the IV (CTR).
3. The private key is serialized.
4. The serialized private key is padded to the cipher
block size (16 bytes).
5. The serialized private key is then encrypted using
AES‐256‐CTR with the parameters determined in steps 1
and 2.
6. The SSH key struct is replaced with one that only
contains the public key, the encrypted private key and the
prekey.
7. All secrets that were handled are zeroed: the cipher
context, the derived key, the derived IV, the old SSH
key structs and the serialized private key.
In short, 16 KiB of random data are hashed to derive an encryption key and IV, which are then used to encrypt the private key.
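The bookkeeping in steps 2 and 4 can be sketched in a few lines of C. This is a minimal illustration rather than OpenSSH's own code: the function names split_digest and pad_to_block are hypothetical, the SHA‐512 and AES‐256‐CTR calls are left to a crypto library, and the incrementing pad bytes (1, 2, 3, ...) are an assumption about the padding scheme.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DIGEST_LEN 64 /* SHA-512 output size in bytes */
#define KEY_LEN    32 /* AES-256 key size in bytes */
#define IV_LEN     16 /* IV size for CTR mode (one AES block) */
#define BLOCK_LEN  16 /* AES block size in bytes */

/* Step 2: the first 32 digest bytes become the key, the next 16 the IV. */
void
split_digest(const unsigned char digest[DIGEST_LEN],
    unsigned char key[KEY_LEN], unsigned char iv[IV_LEN])
{
	memcpy(key, digest, KEY_LEN);
	memcpy(iv, digest + KEY_LEN, IV_LEN);
}

/*
 * Step 4: pad buf (currently len bytes, capacity cap) up to a multiple
 * of BLOCK_LEN. The pad bytes 1, 2, 3, ... are an assumed scheme.
 * Returns the padded length.
 */
size_t
pad_to_block(unsigned char *buf, size_t len, size_t cap)
{
	unsigned char i = 0;

	while (len % BLOCK_LEN != 0) {
		assert(len < cap); /* caller must leave room for padding */
		buf[len++] = ++i;
	}
	return len;
}
```

A 13‐byte serialization would grow to 16 bytes here, while one that is already block‐aligned is left untouched.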
2. Thoughts on the prekey
Because cryptographic hash functions exhibit the avalanche
effect, getting one bit wrong will result in a completely
different hash. Every time the key is used, a new prekey is
generated, so any kind of progress on exfiltrating the
prekey is lost every time the key is actually used.
However, there is an attractive goal with significantly
less state than 16 KiB: the random number generator. The
arc4random_buf(3) random number generator operates largely
in userspace. It gets entropy either from OpenSSL (if
linked with OpenSSL) or from the operating system (the latter
is always the case on OpenBSD); external entropy to seed itself
is obtained on initialization and thereafter only every
1600000 bytes (1.6 MB). Its state consists of only 64 bytes
(namely, a ChaCha20 context; see
openbsd‐compat/arc4random.c and openbsd‐compat/chacha_private.h
in OpenSSH‐portable). Once that state is recovered, it becomes
fairly trivial to anticipate the prekey by generating all
possible start/end patterns of the generated random bytes until
decryption with the generated key and CTR succeeds.
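The 64‐byte figure follows from the ChaCha20 state being sixteen 32‐bit words. The sketch below mirrors that layout under a hypothetical struct name (chacha_state_sketch) and works out how many 16 KiB prekeys fit into one 1600000‐byte reseed interval, assuming that nothing else draws random bytes in between.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sixteen 32-bit words: 4 constant, 8 key, 2 block counter, 2 nonce. */
struct chacha_state_sketch {
	uint32_t input[16];
};

/* Compile-time check that the layout really occupies 64 bytes. */
static_assert(sizeof(struct chacha_state_sketch) == 64,
    "ChaCha20 state should be 64 bytes");

#define PREKEY_LEN      (16 * 1024) /* 16 KiB prekey, from the text */
#define RESEED_INTERVAL 1600000     /* bytes between reseeds, from the text */

/* Size of the generator state an attacker would have to recover. */
size_t
rng_state_size(void)
{
	return sizeof(struct chacha_state_sketch);
}

/* Whole prekeys that one seed can produce before the next reseed. */
unsigned long
prekeys_per_reseed(void)
{
	return RESEED_INTERVAL / PREKEY_LEN;
}
```

In this idealized case, at most 97 prekeys are derived from a single seed before fresh entropy is mixed in.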
I’m not sure if this is practical, however. While
64 bytes is significantly less data than 16 KiB, it’s still
a decent amount of data to be extracted with limited
verifiability: it is hard to locate in memory as it is
pseudo‐random, and checking the actual output of the random
state is likely to be difficult. Chances are that the ChaCha20
state has already changed by the time all the required bits
to reconstruct it have been obtained. And all of that assumes the side channel attacks do not require execution to
actually execute the code paths interacting with the state
more than once: All code paths that lead up to accessing the
ChaCha20 state are also destructive, so all data must be exfiltrated in one go to get all of the new state before it is
lost on the next invocation. Furthermore, a busy server
will likely have torn through the 1.6 MB of random data and
caused fresh data from the operating system to be retrieved.
There is also an in‐memory buffer of random bytes,
which consists of 1024 bytes. This is (several times) less
than the size of the prekey. Extracting it is useless unless all of it can be extracted several times in succession
while the prekey generation is taking place, which strikes
me as difficult. The random bytes in the buffer are also
replaced with zeroes after they are consumed.
(Disclaimer: I am not very well‐versed in the intricacies and practicability of exploitation of speculative execution vulnerabilities. Corrections would be greatly appreciated!)
3. Cryptographic notes
Notably, there is no authentication of the encrypted key;
I’d imagine that authentication is not necessary because
modification of memory is not part of the threat model (key
shielding tries to guard against key exfiltration through
limited side channels). They do, however, check the success
of the deserialization and for some reason the validity of
the padding as well.
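A padding check of this kind is cheap to write. The sketch below assumes that the pad bytes form the incrementing sequence 1, 2, 3, ... (an assumption about the scheme) and verifies that whatever trails the deserialized key matches it; check_padding is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns 1 if the padlen bytes at pad are exactly 1, 2, 3, ...,
 * and 0 otherwise. An empty pad is accepted.
 */
int
check_padding(const unsigned char *pad, size_t padlen)
{
	size_t i;

	assert(pad != NULL || padlen == 0);
	for (i = 0; i < padlen; i++) {
		if (pad[i] != (unsigned char)(i + 1))
			return 0;
	}
	return 1;
}
```

A check like this catches gross corruption of the decrypted buffer, but it is not a substitute for authenticating the ciphertext.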
Padding the serialized key is not necessary for
AES‐256‐CTR, as CTR mode effectively turns AES into a stream
cipher. That the serialized key is padded is likely because
the OpenSSH project may be planning to swap out the cipher
algorithm later down the road; this is suggested by a comment in the code:
#define SSHKEY_SHIELD_CIPHER "aes256-ctr" /* XXX want AES-EME */
I can only speculate why AES‐EME is not actually used.
Perhaps it proved to be too computationally expensive as it
requires two invocations of AES per block; perhaps the authors were simply unaware that the EME patent application
had been abandoned.
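The earlier point about padding can be seen in a toy counter‐mode construction: the keystream is produced block by block, but only as many keystream bytes as there is data get XORed in, so any length works. The block function below (toy_block) is a deliberately non‐cryptographic stand‐in for AES, chosen so that the example has no dependencies; every name in it is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_LEN 16

/* NOT AES: a toy mixing function standing in for the block cipher. */
static void
toy_block(uint64_t key, uint64_t counter, unsigned char out[BLOCK_LEN])
{
	uint64_t x = key ^ (counter * 0x9E3779B97F4A7C15ULL);
	int i;

	for (i = 0; i < BLOCK_LEN; i++) {
		x ^= x >> 30;
		x *= 0xBF58476D1CE4E5B9ULL;
		x ^= x >> 27;
		out[i] = (unsigned char)(x >> 56);
	}
}

/*
 * Counter mode: XOR data with keystream blocks for counter = 0, 1, 2, ...
 * The same call encrypts and decrypts, and len need not be a multiple
 * of BLOCK_LEN because unused keystream bytes are simply discarded.
 */
void
ctr_xor(uint64_t key, unsigned char *data, size_t len)
{
	unsigned char ks[BLOCK_LEN] = {0};
	size_t i;

	assert(data != NULL || len == 0);
	for (i = 0; i < len; i++) {
		if (i % BLOCK_LEN == 0)
			toy_block(key, i / BLOCK_LEN, ks);
		data[i] ^= ks[i % BLOCK_LEN];
	}
}
```

Calling ctr_xor twice with the same key on a 21‐byte buffer restores the original bytes, with no padding involved at any point.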
4. Leftover data and blind spots
While keys are mostly stored in encrypted memory, there is
still a window during which attacks using speculative
execution could take place, namely the brief periods of time
when the keys are unshielded to be actually used. I assume
that these windows will be significantly harder to exploit.
There may also be other leftovers of the key data in
other places, such as the CPU cache. While explicit_bzero(3) guarantees to clear the given block of memory by overwriting it with zeroes, compilers make no guarantee that there are no extraneous copies of data. Stronger
guarantees regarding clearing important data would be helpful in this area, both on a language standard level for C
and C++ as well as on a compiler level (e.g. in LLVM, for
other languages like Rust).
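Where explicit_bzero(3) is unavailable, a common fallback is to write through a volatile pointer so that the compiler cannot discard the stores as dead. The sketch below uses a hypothetical name (zero_secret); like explicit_bzero(3), it only clears the buffer it is given and does nothing about copies in registers, spilled stack slots or caches.

```c
#include <assert.h>
#include <stddef.h>

/* Overwrite len bytes at buf with zeroes via volatile stores. */
void
zero_secret(void *buf, size_t len)
{
	volatile unsigned char *p = buf;

	assert(buf != NULL || len == 0);
	while (len-- > 0)
		*p++ = 0;
}
```

This illustrates the gap described above: the function can promise that this one buffer is overwritten, but not that no other copy of the secret exists.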
5. Calls to action
I would personally suggest that all applications handling
important or critical key data shield their keys in a similar manner wherever feasible, despite possible shortcomings
of the method. This may be inhibited by performance (I
would imagine that a web server would be able to serve a
considerably smaller number of requests if it had to shield
and unshield the certificate private keys for every request)
or other resource constraints.
Finally, I strongly urge readers to consider hardware tokens
and hardware security modules for all non‐trivial key data
wherever possible. OpenSSH has been taking steps in the direction of allowing host keys and client keys to be backed
by security keys.