
Project Oak – Meaningful control of data in distributed systems - azhenley
https://github.com/project-oak/oak
======
spaceheeder
Reminds me of a general-purpose version of the secure enclave that Moxie
Marlinspike blogged about when he implemented secure contact look-up in
Signal. Very cool that Google released this as open source. Of course, it does
require you to be able to trust the security of the enclaves and (at least in
the Signal implementation) requires you to make some different performance
trade-offs in order to prevent information leakage.

Tangent: On a whim I Googled 'secure enclave risc-v' and, sure enough, there's
an extension in development called "Keystone."[0] RISC-V really has such a
promising future.

[0]: [https://keystone-enclave.org/](https://keystone-enclave.org/)

~~~
AgentME
Also reminds me of the Golem project, which wants to let everyone sell their
spare compute power and uses secure enclaves so that sellers can't spy on the
data of the submitted compute jobs. I
find the concept of secure enclaves super exciting and I hate that most
coverage of them ends up just being "secure enclaves are an evil tool for
corporations to make super-DRM on your computer".

~~~
solidasparagus
Golem was cool, but I don't understand their strategy now that secure enclaves
have been shown to be vulnerable to side-channel attacks.

Secure enclaves feel like bait to build systems that are overly reliant on
questionable hardware.

~~~
AgentME
I haven't followed Golem or processor news closely, but I assume a future
generation of processors will fix the side-channel attacks (either in general
or at least just for secure enclaves), so I think it still makes a lot of
sense to build something for secure enclaves for the future.

~~~
solidasparagus
I don't disagree, but side-channel attacks have shown that hardware security
isn't guaranteed and, maybe more importantly, that such flaws are unbelievably
painful to patch.

------
skywhopper
My immediate association was with Java, which was originally named "Oak".

Sounds like an interesting project, although the requirement for specialized
CPU hardware will presumably make it of limited use to most folks for the time
being. But maybe GCP will offer this sort of hardware to its customers.

~~~
bri3d
It leverages Google Asylo as the enclave abstraction layer which currently
supports Intel SGX. SGX is (to the dismay of HN readers everywhere) available
in Intel server/workstation CPUs since Skylake, so I don't think there's
really a "specialized" CPU hardware requirement.

~~~
betterunix2
Great, except for the part where SGX is completely broken:

[https://www.usenix.org/conference/usenixsecurity18/presentat...](https://www.usenix.org/conference/usenixsecurity18/presentation/bulck)

~~~
_nhynes
Right, except for the part where it's been patched by a microcode update [0].
You can always choose not to provision enclaves with secrets unless the CPU
has SMT disabled.

[0] [https://www.intel.com/content/www/us/en/security-center/advi...](https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00161.html)

~~~
teraflop
If SGX is broken, how can you trust the CPU not to lie about whether it has
applied the microcode update and/or disabled SMT?

~~~
tiziano88
SGX is "broken" in that it allows exfiltrating data via side channels; AFAICT
no one has broken the attestation process by which the CPU effectively signs a
statement of its internal state (including SMT) with a key fused into the CPU
by Intel at manufacturing time, and which is used as a building block for
remote attestation.

~~~
betterunix2
If you look at the Usenix paper on the FORESHADOW attack, they did break
attestation as well by applying their attack to the quoting enclave to extract
attestation keys.

------
jasonh3
This feels a lot like a more production-y version of Ryoan (OSDI '16).

[https://www.usenix.org/conference/osdi16/technical-sessions/...](https://www.usenix.org/conference/osdi16/technical-sessions/presentation/hunt)

------
azhenley
A paper was just presented at PLDI on Project Oak (PDF not yet available?):

[https://pldi19.sigplan.org/details/deepspec-2019-papers/11/P...](https://pldi19.sigplan.org/details/deepspec-2019-papers/11/Project-Oak-Control-Data-in-Distributed-Systems-Verify-All-The-Things)

------
erklik
This could be an unfortunate name if the rumoured Huawei OS is actually named
Oak.

------
Uptrenda
If any devs here are interested in learning about trusted computing 101 I can
recommend the Open Enclave SDK by Microsoft. They've done a really good job
making the technology as accessible to new developers as possible and it might
be a little easier to understand than starting with Project Oak.

Project Oak seems to have created something like a general-purpose enclave
that can run third-party modules inside the enclave via WebAssembly. Hence
everyone runs the same enclave and you don't need to write the modules in C++
(super useful.) But I think it might make it harder to understand what's
happening under the hood.

The enclave work flow looks something like this:

1\. You create an enclave signing key pair and publish the public key to a
known communication channel belonging to you.

2\. You create the initial code for the enclave (and ideally make it public.)
The enclave code is regular C++ code but you don't call it directly. Instead,
a tool called "oeedger8r" produces wrapper functions to call trusted functions
in the enclave and for the enclave to call untrusted functions on the host. It
reads .EDL files that whitelist the functions to wrap as 'trusted' or
'untrusted.'

3\. The code is compiled, producing an unsigned enclave binary. For this to
occur, the wrapper code + headers are provided to the compiler. Again, you
don't call code directly because of CPU vulnerabilities.

4\. The binary is signed, producing a signed binary associated with your
public key (oesign tool.)

5\. You publish a few details about the latest binary: the product version
number, the security version number (incremented on security-related bug
fixes), and most importantly -- the enclave's unique ID. The unique ID is like
a hash of all the important details pertaining to that enclave (not sure if it
includes the pub key or sig -- my guess is it doesn't and hopefully
shouldn't.) The unique ID is visible with: oesign dump enclave.signed > a.dmp;
grep mrenclave a.dmp
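The "unique ID" idea above can be sketched loosely in Python (this is a
stand-in, not real SGX measurement: actual MRENCLAVE also folds in page
layout, permissions, and load order, but the key property is the same):

```python
import hashlib

def measure_enclave(binary: bytes) -> str:
    """Stand-in for MRENCLAVE: a hash over the enclave's measured contents.

    The property that matters: the ID is deterministic for a given build,
    and any change to the code produces a different unique ID.
    """
    return hashlib.sha256(binary).hexdigest()

# Hypothetical enclave image bytes for two builds of the same enclave.
code_v1 = b"\x7fELF...enclave code v1"
code_v2 = b"\x7fELF...enclave code v2"  # one bug fix later

assert measure_enclave(code_v1) == measure_enclave(code_v1)  # deterministic
assert measure_enclave(code_v1) != measure_enclave(code_v2)  # any change -> new ID
```

This is why the optional reproducible-build check below works: if the client
builds the same source and gets the same unique ID, it knows which code the
binary contains.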

Here's how this mysterious 'attestation' thing works:

1\. Your customer downloads the signed enclave binary.

2\. Optional step: they build the enclave binary from the most recent source
and only proceed if the unique IDs match.

3\. The client can either prove to someone else they're running your code, or
you can prove to a client you're running the enclave binary code.

4\. We'll go with option 2 -- proving to a client. So, your remote enclave
produces a remote attestation report and the client downloads it. Protip:
nothing complicated needs to happen here -- serialize the report bytes to hex
and send them over a socket. For an MVP I used ZeroMQ, which has good socket
libraries for all languages.

5\. The client then starts up a verification client that must run inside an
enclave on their local computer for security reasons.

6\. The verification client checks that: the pub key matches your signing key,
the product version no and security version no are as expected, and so too is
the unique ID. The client would also check that the message came from Intel's
certificate chain - but you don't have to write code for this directly. The
'signature' in this context is really just something telling Intel that the
code is from an enclave and hence they should proceed to sign the report.
That's why the root of trust is Intel. If the hardware or key pairs are
compromised, the trust model breaks.

7\. Part of the protocol allows you to provide a random value to the remote
enclave to sign as part of the attestation report. This value is actually
incredibly useful because you can use it to prevent replay attacks (i.e. --
this report MUST be recent) AND bootstrap a secure communication channel
between a local and remote enclave at the same time.
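The verification checks in steps 6 and 7 can be sketched like this. Big
caveat: the report layout is hypothetical, and an HMAC key stands in for the
key Intel fuses into the CPU (in reality the client never holds that key and
instead verifies a signature via Intel's certificate chain):

```python
import hashlib
import hmac
import json
import os

# Stand-in for the CPU's fused attestation key (assumption for this sketch).
CPU_FUSED_KEY = os.urandom(32)

def make_report(mrenclave: str, security_version: int, challenge: bytes) -> dict:
    """What the remote enclave produces (hypothetical report layout)."""
    body = json.dumps({"mrenclave": mrenclave,
                       "svn": security_version,
                       "challenge": challenge.hex()}, sort_keys=True)
    sig = hmac.new(CPU_FUSED_KEY, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}

def verify_report(report: dict, expected_mrenclave: str,
                  min_svn: int, my_challenge: bytes) -> bool:
    """The checks from step 6: signature, unique ID, security version, freshness."""
    sig = hmac.new(CPU_FUSED_KEY, report["body"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, report["sig"]):
        return False                                   # not signed by the CPU key
    body = json.loads(report["body"])
    return (body["mrenclave"] == expected_mrenclave    # running the expected code
            and body["svn"] >= min_svn                 # no downgraded build
            and body["challenge"] == my_challenge.hex())  # fresh, not replayed

nonce = os.urandom(16)  # step 7: a fresh challenge prevents replay attacks
report = make_report("abc123", security_version=2, challenge=nonce)
assert verify_report(report, "abc123", min_svn=2, my_challenge=nonce)
# A replayed report fails because the nonce no longer matches:
assert not verify_report(report, "abc123", min_svn=2, my_challenge=os.urandom(16))
```

The `svn >= min_svn` comparison is the point of publishing the security
version number: clients can refuse reports from builds older than the last
security fix.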

Secure communication channels between enclaves 101:

1\. So, this might be confusing at first, but ideally all of the enclave's
code should be public so a client knows exactly what behaviors to expect. What
this means is that you can't put any secrets inside the enclave code directly.
Instead, you can have an initialization function running inside the enclave to
generate secrets the first time the enclave is run and return any public keys
to the outside, OR alternatively: deal them to the enclave (once) from
untrusted memory (not the best idea.) So what you do is this:

2\. Generate an RSA key pair inside the enclave and include the pub key hash
as the challenge in the remote attestation report.

3\. Remote enclave does the same and asks for an attestation report from your
local enclave containing its public key hash for the challenge value.

4\. You generate an attestation report proving that your public key was
generated from a code base that is fit for secure message exchange.

5\. Remote enclave verifies it. At this point, both sides are now sure that
they have received public keys from running code that they expect. Either side
can then generate an AES key and encrypt it with the opposite side's temporary
RSA public key (encrypted using the right secure padding scheme for RSA.)
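The key-binding trick in steps 2-4 can be sketched as follows. Random bytes
stand in for a real RSA public key (Python's stdlib has no RSA), and the
attestation signature itself is omitted; only the commit-to-the-key step is
shown:

```python
import hashlib
import os

def bind_key_to_report(pubkey: bytes) -> bytes:
    """Step 2: the attestation challenge commits to the enclave's fresh key."""
    return hashlib.sha256(pubkey).digest()

# Inside the remote enclave: generate an ephemeral key (random bytes stand in
# for a real RSA public key in this sketch).
remote_pubkey = os.urandom(32)
attested_challenge = bind_key_to_report(remote_pubkey)  # goes into the report

# On the client, after the attestation report itself checks out: confirm the
# key received over the untrusted wire matches the key the CPU attested to.
received_pubkey = remote_pubkey
assert bind_key_to_report(received_pubkey) == attested_challenge

# A swapped-in key fails the binding check, so a man-in-the-middle can't
# substitute their own key without breaking attestation.
assert bind_key_to_report(os.urandom(32)) != attested_challenge
```

This is why the challenge value is "incredibly useful": it ties the ephemeral
encryption key to a report that only attested code on genuine hardware could
have produced.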

It all seems very complicated at first, but a few days with the Open Enclave
SDK and you'll be writing software that can use these cool features. I'd
recommend checking out the samples at
[https://github.com/openenclave/openenclave/tree/master/sampl...](https://github.com/openenclave/openenclave/tree/master/samples/remote_attestation)
and
[https://github.com/openenclave/openenclave/tree/master/sampl...](https://github.com/openenclave/openenclave/tree/master/samples/attested_tls).
The TLS sample could be done more simply, but it will show you how to get the
unique enclave ID / "MRENCLAVE" value, which should be checked for a full
protocol.

It's really good to see Hacker News embracing this tech. Black-and-white
thinking and "fallacy fallacies" (finding one problem and then dismissing the
whole argument) get tiring. The tech isn't perfect, but I think it has a lot
of promise.

------
graphememes
Has anyone ever used one of these?

