Countering Trusting Trust Through Diverse Double-Compiling (DDC)

jakobegger · on Feb 24, 2015

I'm having trouble understanding this bit:

> In the DDC technique, source code is compiled twice: once with a second (trusted) compiler (using the source code of the compiler’s parent), and then the compiler source code is compiled using the result of the first compilation. If the result is bit-for-bit identical with the untrusted executable, then the source code accurately represents the executable.

Can someone explain this step by step? Which compiler compiles which source?

peterwaller · on Feb 24, 2015

So you have your (possibly tainted at the binary level, see "Reflections on trusting trust") compiler, let's call it [A].

You take the source code of a compiler, [GCCsource], which for the purposes of this we'll say we can trust (because at least we can read the source, IOCCC notwithstanding).

The problem is that if you Compile [GCCsource] with [A] to make [GCCWithA], [GCCWithA] may be tainted if [A] is tainted.

The fix is to make GCC produce reproducible builds.

Then you compile [A]([GCCsource]) -> [GCCWithA] and the critical step is this:

[GCCWithA]([GCCsource]) -> [GCCWithAWithA].

If [GCCWithA] makes reproducible builds, then [GCCWithX], built by any other honest compiler [X], should produce bit-identical output to [GCCWithA] when each of them compiles [GCCsource]. If it doesn't, then you can tell that there are shenanigans going on.

So you do

[A]([GCCsource]) -> [GCCWithA]; [GCCWithA]([GCCsource]) -> [GCCWithAWithA]

And then

[B]([GCCsource]) -> [GCCWithB]; [GCCWithB]([GCCsource]) -> [GCCWithBWithB]

You compare [GCCWithAWithA] with [GCCWithBWithB].

Now you've upped the bar. Not only does an attacker have to put a backdoor in the GCC binary, but now they have to do it to Intel's compiler, and any other compiler capable of compiling [GCCsource]. If the attacker doesn't taint all of these compilers in exactly the same way, then it will result in a difference in the output.
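If it helps, here is a toy Python sketch of the comparison above. It does not run any real compiler. The `Binary` class, `compile_with`, `double_compile`, and the "vendor-a"/"vendor-b" names are all made up for illustration, and the backdoor is simplified to one that always copies itself into whatever it compiles.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Binary:
    codegen: str              # how this compiler emits code (fixed by the source it was built from)
    image: str                # stand-in for the executable's bytes
    backdoored: bool = False  # hypothetical self-propagating backdoor


def compile_with(compiler: Binary, source: str) -> Binary:
    """Toy model: output bytes depend on the compiler's codegen and the source."""
    image = hashlib.sha256(f"{compiler.codegen}|{source}".encode()).hexdigest()
    if compiler.backdoored:
        image += "+backdoor"  # a tainted compiler re-inserts the backdoor
    # The new compiler behaves as its source says, so its codegen is the source.
    return Binary(codegen=source, image=image, backdoored=compiler.backdoored)


def double_compile(bootstrap: Binary, source: str) -> Binary:
    """[X]([GCCsource]) -> [GCCWithX]; [GCCWithX]([GCCsource]) -> [GCCWithXWithX]"""
    stage1 = compile_with(bootstrap, source)
    return compile_with(stage1, source)


def ddc_matches(x: Binary, y: Binary, source: str) -> bool:
    """Compare the second-stage outputs bit for bit."""
    return double_compile(x, source).image == double_compile(y, source).image


GCC_SOURCE = "gcc-source"                                # placeholder for [GCCsource]
TAINTED_A = Binary("vendor-a", "a.bin", backdoored=True)
CLEAN_A = Binary("vendor-a", "a.bin")
B = Binary("vendor-b", "b.bin")

# First-stage outputs differ, because A and B generate code differently.
print(compile_with(CLEAN_A, GCC_SOURCE).image == compile_with(B, GCC_SOURCE).image)
# Second-stage outputs agree when both bootstrap compilers are honest.
print(ddc_matches(CLEAN_A, B, GCC_SOURCE))
# A backdoor in only one of them shows up as a mismatch.
print(ddc_matches(TAINTED_A, B, GCC_SOURCE))
```

The point the sketch tries to capture is that [GCCWithA] and [GCCWithB] are allowed to differ, since different compilers emit different machine code. Only the second-stage outputs have to match, because both were produced by programs that implement the same [GCCsource].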

jakobegger · on Feb 24, 2015

Thanks, I understand now.

peterwaller · on Feb 24, 2015

Reflections on Trusting Trust is a classic lecture by Ken Thompson, which is well worth a read.

http://www.ece.cmu.edu/~ganger/712.fall02/papers/p761-thomps...

I always thought that needing to trust a large binary blob (your bootstrapping compiler) was a fundamental limitation with computers that would be impossible to overcome. It is delightful to hear of a solution to it. It's one of those things which is almost obvious in hindsight but takes a leap of imagination to discover, I think.

fdik · on Feb 24, 2015

“source code is compiled twice: once with a second (trusted) compiler”

Here I stopped reading.

Isamu · on Feb 24, 2015

Well, if you are unsure if you can trust the second compiler (e.g. you didn't write it, audit it, bootstrap it yourself) you can just add a third (slightly more trustworthy) compiler.

After that the infinite stack of turtles will take over and make sure it is asymptotically trustworthy.