Project Oak – Meaningful control of data in distributed systems (github.com/project-oak)
125 points by azhenley on June 24, 2019 | hide | past | favorite | 25 comments



Reminds me of a general-purpose version of the secure enclave that Moxie Marlinspike blogged about when he implemented secure contact look-up in Signal. Very cool that Google released this as open source. Of course, it does require you to be able to trust the security of the enclaves and (at least in the Signal implementation) requires you to make some different performance trade-offs in order to prevent information leakage.

Tangent: On a whim I Googled 'secure enclave risc-v' and, sure enough, there's an extension in development called "Keystone."[0] RISC-V really has such a promising future.

[0]: https://keystone-enclave.org/


Also reminds me of the Golem project, which wants to allow everyone to be able to sell their compute power and uses secure enclaves so the people who sell their compute power can't spy on the data of the submitted compute jobs. I find the concept of secure enclaves super exciting and I hate that most coverage of them ends up just being "secure enclaves are an evil tool for corporations to make super-DRM on your computer".


> wants to allow everyone to be able to sell their compute power

> corporations to make super-DRM on your computer

Well, who else are you going to sell it to? This is the sort of thing that I used to think was cool but now am merely weary of; combining the "everything is a leasable asset" view of Uber and AirBnB with the "don't look at the electricity costs" view of bitcoin. It's going to end up with idle televisions crunching the personal data of unrelated people in order to pay the TV manufacturer some penny shavings.


Golem was cool, but I don't understand their strategy now that secure enclaves have been shown to be vulnerable to side-channel attacks.

Secure enclaves feel like bait to build systems that are overly reliant on questionable hardware.


I haven't followed Golem or processor news closely, but I assume a future generation of processors will fix the side-channel attacks (either in general or at least just for secure enclaves), so I think it still makes a lot of sense to build something for secure enclaves for the future.


I don't disagree, but side-channel attacks have shown that hardware security isn't guaranteed and, maybe more importantly, they are unbelievably painful to patch.


But you do agree enclaves can be used for exactly this? I agree there are exciting use cases but we need to be careful because open, general purpose computing is already being attacked on several fronts.


Enclaves don't get direct IO access and have to interact with code running on the main processor. I'm okay with opaque code if it's running in a tight sandbox and its interactions with the outside world (including my filesystem) are inspectable.

Also it's important to me that regular people can benefit from secure enclaves by using them to protect their data while it's being processed on other people's machines. Secure enclaves aren't useful only to corporations who want to build things like DRM.


Good points. The aspect I'm most worried about is that we'll see a trend towards an increasing amount of code running in enclaves and then becoming a hard requirement for common software to function.


FWIW - Keystone is going to be one of the core components of the Oasis project [0].

[0] https://www.oasislabs.com/


Some more info on Keystone and the goals behind it / open source secure enclaves: https://medium.com/oasislabs/towards-an-open-source-secure-e...


My immediate association was with Java, which was originally named "Oak".

Sounds like an interesting project, although the requirement for specialized CPU hardware will presumably make it of limited use to most folks for the time being. But maybe GCP will offer this sort of hardware to its customers.


It leverages Google Asylo as the enclave abstraction layer which currently supports Intel SGX. SGX is (to the dismay of HN readers everywhere) available in Intel server/workstation CPUs since Skylake, so I don't think there's a really "specialized" CPU hardware requirement.


Linux will soon support SGX on FLC-enabled (Flexible Launch Control) hardware. Upstream SGX for non-FLC hardware is not on the roadmap. I have no idea which CPUs support FLC.

Also, I consider SGX to be only barely adequate, if at all, for this type of use case. It’s missing a critical feature to protect enclaves from instances of the same enclave.


But it's not widely available from cloud providers at the moment, particularly if you want meaningful remote attestation to work.


Great, except for the part where SGX is completely broken:

https://www.usenix.org/conference/usenixsecurity18/presentat...


Right, except for the part where it's been patched by a microcode update [0]. You can always choose not to provision enclaves with secrets unless the CPU has SMT disabled.

[0] https://www.intel.com/content/www/us/en/security-center/advi...


If SGX is broken, how can you trust the CPU not to lie about whether it has applied the microcode update and/or disabled SMT?


SGX is "broken" in that it allows exfiltrating data via side channels; AFAICT no one has broken the attestation process by which the CPU effectively signs a statement of its internal state (including SMT) with a key fused into the CPU by Intel at manufacturing time, and which is used as a building block for remote attestation.


If you look at the Usenix paper on the FORESHADOW attack, they did break attestation as well by applying their attack to the quoting enclave to extract attestation keys.


This feels a lot like a more production-y version of Ryoan (OSDI '16).

https://www.usenix.org/conference/osdi16/technical-sessions/...


A paper was just presented at PLDI on Project Oak (PDF not yet available?):

https://pldi19.sigplan.org/details/deepspec-2019-papers/11/P...


This could be an unfortunate name if the rumoured Huawei OS is actually named Oak.


If any devs here are interested in learning about trusted computing 101 I can recommend the Open Enclave SDK by Microsoft. They've done a really good job making the technology as accessible to new developers as possible and it might be a little easier to understand than starting with Project Oak.

Project Oak seems to have created something like a general-purpose enclave that can run third-party modules inside the enclave via WebAssembly. Hence everyone runs the same enclave and you don't need to write the modules in C++ (super useful). But I think it might make it harder to understand what's happening under the hood.

The enclave work flow looks something like this:

1. You create an enclave signing key pair and publish the public key to a known communication channel belonging to you.

2. You create the initial code for the enclave (and ideally make it public.) The enclave code is regular C++ code but you don't call it directly. Instead, a tool called "oeedger8r" produces wrapper functions to call trusted functions in the enclave and for the enclave to call untrusted functions on the host. It reads .EDL files that whitelist the functions to wrap as 'trusted' or 'untrusted.'

3. The code is compiled, producing an unsigned enclave binary. For this to occur, the wrapper code + headers are provided to the compiler. Again, you don't call code directly because of CPU vulnerabilities.

4. The binary is signed, producing a signed binary associated with your public key (oesign tool.)

5. You publish a few details about the latest binary: the product version number, the security version number (incremented on security-related bug fixes), and most importantly -- the enclave's unique ID. The unique ID is like a hash of all the important details pertaining to that enclave (not sure if it includes the pub key or sig -- my guess is it doesn't and hopefully shouldn't.) The unique ID is visible with: oesign dump -e enclave.signed | grep mrenclave
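To make step 2 concrete, the .EDL file that oeedger8r reads might look something like this (function names here are purely illustrative; trusted functions run inside the enclave and are callable from the host, untrusted ones run on the host and are callable from the enclave):

```
enclave {
    trusted {
        // Runs inside the enclave; the host calls it via the generated wrapper.
        public int enclave_process(
            [in, size=len] const uint8_t* data,
            size_t len);
    };

    untrusted {
        // Runs on the host; the enclave calls it via the generated wrapper.
        int host_log([in, string] const char* msg);
    };
};
```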

Here's how this mysterious 'attestation' thing works:

1. Your customer downloads the signed enclave binary.

2. Optional step: they build the enclave binary from the most recent source and only proceed if the unique IDs match.

3. The client can either prove to someone else they're running your code, or you can prove to a client you're running the enclave binary code.

4. We'll go with the second option -- prove to a client. So, your remote enclave produces a remote attestation report and the client downloads it. Protip: nothing complicated needs to happen here. Serialize the byte structure to hex, then use sockets between clients -- for an MVP I used ZeroMQ, which has good socket libraries for all languages.

5. The client then starts up a verification client that must run inside an enclave on their local computer for security reasons.

6. The verification client checks that: the pub key matches your signing key, the product version no and security version no are as expected, and so too is the unique ID. The client would also check that the message came from Intel's certificate chain - but you don't have to write code for this directly. The 'signature' in this context is really just something telling Intel that the code is from an enclave and hence they should proceed to sign the report. That's why the root of trust is Intel. If the hardware or key pairs are compromised, the trust model breaks.

7. Part of the protocol allows you to provide a random value to the remote enclave to sign as part of the attestation report. This value is actually incredibly useful because you can use it to prevent replay attacks (i.e. -- this report MUST be recent) AND bootstrap a secure communication channel between a local and remote enclave at the same time.
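Put together, the verification client's checks (steps 5-7) boil down to comparing a handful of report fields against known-good published values, plus a fresh nonce to rule out replays. A minimal sketch, assuming the report's Intel signature has already been verified and it's been parsed into a dict -- the field names and values below are illustrative, not the SDK's real structs:

```python
import hashlib
import os

# Known-good values the enclave author published (step 5 of the build flow).
EXPECTED = {
    "signer_pubkey_hash": "ab12",   # hash of the author's signing key (illustrative)
    "unique_id": "cd34",            # MRENCLAVE-style measurement (illustrative)
    "product_id": 1,
    "min_security_version": 3,
}

def make_challenge() -> bytes:
    """A fresh random nonce: forces the report to be recent (no replays)."""
    return os.urandom(32)

def verify_report(report: dict, challenge: bytes) -> bool:
    """Checks on an attestation report whose signature chain has already
    been verified back to Intel's root of trust."""
    return (
        report["signer_pubkey_hash"] == EXPECTED["signer_pubkey_hash"]
        and report["unique_id"] == EXPECTED["unique_id"]
        and report["product_id"] == EXPECTED["product_id"]
        # The security version may be newer than expected, but never older.
        and report["security_version"] >= EXPECTED["min_security_version"]
        # The enclave must have signed *our* nonce, tying the report to
        # this session and preventing replay of an old report.
        and report["report_data"] == hashlib.sha256(challenge).hexdigest()
    )
```

The nonce check is what makes the report double as a channel bootstrap: whatever the enclave binds into report_data (here just a hash of the challenge) is vouched for by the attestation.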

Secure communication channels between enclaves 101:

1. So, this might be confusing at first, but ideally all of the enclave's code should be public so a client knows exactly what behaviors to expect. What this means is that you can't put any secrets inside the enclave code directly. Instead, you can have an initialization function run inside the enclave to generate secrets the first time the enclave is run and return any public keys to the outside, OR alternatively deal them to the enclave (once) from untrusted memory (not the best idea.) So what you do is this:

2. Generate an RSA key pair inside the enclave and include the pub key hash as the challenge in the remote attestation report.

3. Remote enclave does the same and asks for an attestation report from your local enclave containing its public key hash for the challenge value.

4. You generate an attestation report proving that your public key was generated from a code base that is fit for secure message exchange.

5. Remote enclave verifies it. At this point, both sides are sure that they have received public keys from running code that they expect. Either side can then generate an AES key and encrypt it with the opposite side's temp RSA pub key (encrypted using the right secure padding scheme for RSA.)
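The channel-bootstrap steps above, as a message-flow sketch. A real implementation would use actual RSA key generation and hardware attestation reports; here os.urandom and SHA-256 stand in for those primitives so only the protocol shape is shown, and every name is illustrative:

```python
import hashlib
import os

class EnclaveSide:
    """One party in the exchange; this state would live inside an enclave."""

    def __init__(self):
        # Stand-in for the RSA key pair generated inside the enclave (step 1-2).
        # We only need a private half and a derived public half to show the flow.
        self._priv = os.urandom(32)
        self.pub = hashlib.sha256(self._priv).digest()

    def attestation_report(self) -> dict:
        # Steps 2-3: embed the public key's hash as the report's challenge
        # value, so a verified report vouches for this exact key.
        return {"pubkey_hash": hashlib.sha256(self.pub).hexdigest()}

def key_vouched_for(report: dict, claimed_pub: bytes) -> bool:
    """Steps 4-5: does the attested hash match the key we actually received?
    Assumes the report itself was already verified as in the previous section."""
    return report["pubkey_hash"] == hashlib.sha256(claimed_pub).hexdigest()

# Message flow: each side sends (report, public key) and cross-checks the other.
local, remote = EnclaveSide(), EnclaveSide()
if key_vouched_for(remote.attestation_report(), remote.pub):
    # Step 5, second half: pick a symmetric session key and send it to the
    # peer encrypted under its public key (RSA-OAEP in the real protocol).
    session_key = os.urandom(32)
```

The point of hashing the public key into the report rather than exchanging keys out of band is that the attestation then transitively covers the channel: if you trust the code measurement, you trust the key it emitted.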

It all seems very complicated at first, but a few days with the Open Enclave SDK and you'll be writing software that can use these cool features. I'd recommend checking out the samples at https://github.com/openenclave/openenclave/tree/master/sampl... and https://github.com/openenclave/openenclave/tree/master/sampl.... The TLS stuff could be done simpler, but it will show you how to get the unique enclave ID / "MRENCLAVE" value which should be checked for a full protocol.

It's really good to see hacker news embracing this tech. Black and white thinking, and "fallacy fallacies" (finding a problem and then dismissing the whole argument) gets tiring. The tech isn't perfect, but I think it has a lot of promise.


Has anyone ever used one of these?



