
Fully Countering Trusting Trust Through Diverse Double-Compiling (2009) - micaeked
http://www.dwheeler.com/trusting-trust/dissertation/html/wheeler-trusting-trust-ddc.html#0.Abstract
======
dwheeler
jacques_chester: You're absolutely right that recreating things bit-for-bit
identical can require some real elbow grease. I had to overcome bugs in the
Tiny C compiler (tcc) and gcc to make them work, as described in the paper.
Date/time stamps can create problems too.

But all these problems are quite solvable. Nobody claims that gcc is small, yet
I managed to get that working. Compiler makers can follow a few guidelines to
make it much easier, see: [http://www.dwheeler.com/trusting-
trust/dissertation/html/whe...](http://www.dwheeler.com/trusting-
trust/dissertation/html/wheeler-trusting-trust-
ddc.html#4.Guidelines%20for%20Compiler%20Suppliers) Check out the graph at the
Debian Reproducible Builds project at
[https://wiki.debian.org/ReproducibleBuilds](https://wiki.debian.org/ReproducibleBuilds)
- yes, some packages don't reproduce bit-for-bit, but they've made a
tremendous amount of progress and managed it for most packages.

You can see some related information at this page:
[http://www.dwheeler.com/trusting-trust/](http://www.dwheeler.com/trusting-
trust/) including a video of me discussing the paper.

~~~
jacques_chester
Do you have any tips on encouraging upstream projects to invest in
reproducible builds?

Our ideal world would be fully reproducible builds with a complete chain of
custody. We have some of it, but not the whole kit and kaboodle.

But we can't really do this so long as we rely on unreproducible upstream
build configurations.

~~~
dwheeler
I think at least part of the solution is convincing upstreams that the world
has changed. There are now many people and organizations who are actively
working to subvert software - and some of them have a _lot_ of resources and
incentive. They're working to break into a variety of things (including
repositories, build systems, and distribution processes) so that people run
subverted software. Frankly, the world changed decades ago, but only recently
have many developers started to realize it. I try to convince people by
pointing out past attacks, for example. One piece that might help you is:
[https://blog.torproject.org/blog/deterministic-builds-
part-o...](https://blog.torproject.org/blog/deterministic-builds-part-one-
cyberwar-and-global-compromise)

Of course, you then have to convince them what specifically to do. The
reproducible builds project has some nice documentation:
[https://reproducible-builds.org/docs/](https://reproducible-builds.org/docs/)
and I already mentioned my guidelines: [http://www.dwheeler.com/trusting-
trust/dissertation/html/whe...](http://www.dwheeler.com/trusting-
trust/dissertation/html/wheeler-trusting-trust-
ddc.html#4.Guidelines%20for%20Compiler%20Suppliers) . You can also look at
specific war stories, such as Tor's:
[https://blog.torproject.org/blog/deterministic-builds-
part-t...](https://blog.torproject.org/blog/deterministic-builds-part-two-
technical-details) or sbcl's:
[http://christophe.rhodes.io/notes/blog/posts/2014/reproducib...](http://christophe.rhodes.io/notes/blog/posts/2014/reproducible_builds_-
_a_month_ahead_of_schedule/)

We can also make it easier. One great thing is that the Debian reproducible
builds group has been modifying tools to make it easier to create reproducible
builds. That doesn't mean there's nothing left to do, but making it easier
makes it way more likely. The "containerization of everything" also has the
_potential_ to make life easier - it makes it easier to start from some fixed
point, and repeat a sequence of instructions from there.

~~~
jacques_chester
I have to admit that once you're on a project where the customers make you an
attractive vector for state actors, it becomes slightly nervy.

Followup question: how do we find independent folk to help us check our work?
Or do we nut it out ourselves?

~~~
dwheeler
> Followup question: how do we find independent folk to help us check our
> work?

I don't have a very good answer for proprietary software. If a company is
serious, I think they should pay people to independently review it. There's
great evidence that software inspections detect a lot of defects, but in many
circumstances detecting & fixing defects simply isn't valued as highly as the
cost of the reviewers. We need customers to demand independent analysis of
important software.

For open source software, the situation is often better. I think you should
work with the people who write/manage/run the relevant system or language
package management tools so that the packages are reproducible.

As far as the broader question of "checking our work", there are a lot of things
you can do to make it easier for people to collaborate. I strongly encourage
all OSS projects to try to get a CII best practices badge:
[https://bestpractices.coreinfrastructure.org/](https://bestpractices.coreinfrastructure.org/)
That has a list of basic things you should do to encourage collaboration and
be secure. (Full disclosure: I lead the CII best practices badge project. But
you should do it anyway :-) ).

~~~
jacques_chester
I'll send the link to my engineering director for a squiz. On a superficial
scan I think we hit many of these criteria, but not all. Of interest, we're
neighbours -- the Cloud Foundry Foundation is also managed by the Linux
Foundation.

The one I'm happiest to exceed is the 60 day CVE-fix window. Our policy is to
release updated versions of our buildpacks and rootfs within 48 hours of a
high-severity CVE being patched upstream -- usually within the same day,
actually. That's only possible because we have very extensive testing and build
automation.

For internal reviews and teaching, one idea that one of my colleagues floated
was having a red team with engineers rotated through that team. The idea being
that it's easiest to think like an attacker if you have, for some time, _been_
an attacker.

It would be difficult to find the right tempo, though. It'd take a few weeks
to get a grip on common attack types and then start hunting for flaws, and
we'd be struggling to find the balance between rotating as many engineers
through as possible vs maintaining ongoing feature work and maintenance.

------
rurban
Can we please add a (2009) to the title?

This is a classic paper on reproducible builds, which everybody has been working on since.
Better overview: [http://www.dwheeler.com/trusting-
trust/](http://www.dwheeler.com/trusting-trust/)

Older discussion, 7 years ago:

* [https://news.ycombinator.com/item?id=1104338](https://news.ycombinator.com/item?id=1104338)

------
jacques_chester
The hard part -- sometimes _really_ hard -- is the "bit-for-bit identical"
requirement.

Lots of builds are _recreatable_ but not _reproducible_ (there is probably a
better term of art here). You can go back to a point in time and build the
version of the software as it was, but you are not guaranteed to get a bit-
for-bit clone. (See [https://reproducible-builds.org](https://reproducible-
builds.org) for a thorough discussion)

The problem is that there are lots of uncontrolled inputs to a build that
aren't due to source code or compiler changes. Most famously there are
timestamps and random numbers, which mess up all sorts of hashing-based
approaches.

These can even be non-obvious. Just the other day I and a colleague were
investigating the (small but unsettling) possibility that an old buildpack had
been replaced maliciously. We compared the historical hash to the file:
different. We rebuilt the historical buildpack with trusted inputs: still
different.

Then we unzipped both versions and diff'd the directories: identical.

What had thrown our hashes off was that zipfiles, by default, include
timestamps. We have a build that is recreatable but not reproducible.
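
For anyone who runs into the same thing, here's a minimal sketch (in Python,
with hypothetical filenames) of the check we effectively ended up doing: hash
only the member names and bytes inside each archive, so the per-entry
timestamps that zip stores by default can't throw the comparison off.

```python
import hashlib
import zipfile

def content_digest(path):
    """Digest of a zip's member names and bytes, ignoring the per-entry
    timestamps and other metadata that make whole-file hashes differ."""
    h = hashlib.sha256()
    with zipfile.ZipFile(path) as zf:
        for name in sorted(zf.namelist()):
            h.update(name.encode("utf-8"))
            h.update(zf.read(name))
    return h.hexdigest()

# Whole-file hashes of the two archives differ (timestamps), but the content
# digests match: the build is recreatable even though the artifact isn't
# bit-for-bit reproducible.
print(content_digest("buildpack-historical.zip") ==
      content_digest("buildpack-rebuilt.zip"))
```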

Speaking of builds, we are able to reproducibly build some binaries but not
others. Off the top of my head our most high-profile non-reproducible build is
NodeJS. Some other binaries (Ruby and Python, in my not-at-all-complete
recollection) are fully reproducible.

This difficulty with fully reproducing makes it hard to provide a fully
trustworthy chain of custody. A company which uses Cloud Foundry has in fact
stood up an independent copy of our build pipelines inside their
own secure network, so that they can be completely autarkic for the build
steps leading to a complete buildpack. This doesn't defend against malicious
source, but it defends against malicious builds.

Disclosure: I work for Pivotal, the majority donor of engineering to Cloud
Foundry. As you've probably guessed, I'm currently a fulltime contributor on
the buildpacks team.

~~~
sly010
NixOS goes to great lengths to steer towards reproducibility. They run
builds in chroots, they set all file timestamps to just past epoch 0, they make
all directories except the output directory read-only, etc. But even then
compilers have all sorts of quirks, like running certain code paths multiple
times and deciding which one runs faster on _this_ CPU.
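
A minimal sketch of one such normalization step (the general idea, not Nix's
actual implementation): clamp every timestamp in the build output to a fixed
point just past the epoch, so archives and hashes built from it stop varying
run to run. The `./result` path is just a hypothetical output directory.

```python
import os

FIXED_EPOCH = 1  # one second past 1970-01-01; any fixed value works

def normalize_mtimes(root):
    """Set every file and directory under root to the same fixed mtime,
    removing one common source of non-reproducibility."""
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):  # leave symlinks alone for simplicity
                os.utime(path, (FIXED_EPOCH, FIXED_EPOCH))
        if not os.path.islink(dirpath):
            os.utime(dirpath, (FIXED_EPOCH, FIXED_EPOCH))

normalize_mtimes("./result")  # hypothetical build-output directory
```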

The biggest conceptual mistake we are making is that by default compilers
always build for _this_ machine, linking to these libraries. That means the
result inherently depends on the state of the machine it was built on (i.e.,
compiling is not a purely functional operation anymore). If I could go back in
time and change automake and glibc, cross-compiling and explicit dependency
handling would be the norm. (As an aside, containers would greatly benefit
too, as you wouldn't need to package an entire Linux distribution with every
binary.)

I am sometimes amazed, sometimes disappointed by this reproducibility
problem. Computers are supposed to be machines that can do the same thing again
and again without a mistake, but this is not the case anymore. We have so many
layers of complexity and everything is bolted together with duct tape. We
focus on developer convenience in the short term, but in the long term we
completely lose determinism. Sure, we can write more code faster than before,
but building software is more problematic than ever.

Yet, somehow, everything seems to be going in this direction; in fact, some
people celebrate it and compare it to biology or evolution. I just call it
"accidentally stochastic computing".

~~~
srtjstjsj
> some people celebrate it and compare it to biology or evolution.

Creating life is scientifically exhilarating, but incredibly dangerous.

------
dwheeler
I'm the author. Ask me anything about it.

~~~
nfoz
Huge fan of your work; thanks so much for what you've done on this topic, and
maintaining a clear website that I can refer back to easily :)

I've been wondering if perhaps there's a "trusting trust" problem where we
have 3d-printers that can print circuit-boards (eg CPUs)... and also print
3d-printers. The printer is sort of like a compiler in this case. How do you
know it won't produce printers that will produce malicious CPUs? It might not
be easy to do "bit-for-bit" comparisons between CPUs to make sure one is safe.

Since trusted compilers on untrusted hardware aren't trustworthy, I had hoped
that 3d-printing might eventually allow us to trust our own printed
hardware... but it might be turtles all the way down!

~~~
dwheeler
> I've been wondering if perhaps there's a "trusting trust" problem where we
> have 3d-printers that can print circuit-boards (eg CPUs)... and also print
> 3d-printers.

I can answer that question easily: "Yes, there's a problem :-)". DDC can be
used to help, but there are caveats.

I talk about applying DDC to hardware in section 8.12:
[http://www.dwheeler.com/trusting-
trust/dissertation/html/whe...](http://www.dwheeler.com/trusting-
trust/dissertation/html/wheeler-trusting-trust-ddc.html#8.12.Hardware)

This quote is probably especially apt: "Countering this attack may be
especially relevant for 3-D printers that can reproduce many of their own
parts. An example of such a 3-D printer is the Replicating Rapid-prototyper
(RepRap), a machine that can “print” many hardware items including many of the
parts required to build a copy of the RepRap [Gaudin2008]. The primary goal of
the RepRap project, according to its project website, is to “create and to
give away a makes-useful-stuff machine that, among other things, allows its
owner [to] cheaply and easily… make another such machine for someone else”
[RepRap2009]."

I also note that, "... DDC can be applied to ICs to detect a hardware-based
trusting trust attack. However, note that there are some important challenges
when applying DDC to ICs..." and there's the additional problem that when
you're done, you'll only have verified that specific printer - not a different
one. Determining if software is bit-for-bit identical is easy; determining if
two pieces of hardware are logically identical is not (in the general case)
easy. No two hardware items are perfectly identical, and it's tricky to
determine if something is an irrelevant difference or not.

If someone wants to write a follow-on paper focusing on hardware, I'd be
delighted to read it :-).

~~~
nickpsecurity
It's worse for trusted hardware than most people think. The framework I came
up with predicted a number of attacks including analog and material swapping
at fabs. So, that's an initial test. Here's the basic risk analysis:

[https://news.ycombinator.com/item?id=10468624](https://news.ycombinator.com/item?id=10468624)

The smartphone analysis I did also has general stuff in it:

[https://news.ycombinator.com/item?id=10906999](https://news.ycombinator.com/item?id=10906999)

I predicted the A2 paper on analog compromise happening, at a high level
rather than as a specific attack, largely due to our hardware guru on Schneier's
blog bragging about mixed-signal attacks years ago. He taught us they resisted
attempts to counterfeit or patent sue them by disguising key functions in
analog or RF components the digital tools couldn't even see. He said
competitors did, too, with him regularly having to carefully inspect the
lowest-level representations of 3rd-party components. I have a feeling they were
cloning them, too. ;) Anyway, the actual products were already subverted years
ago just for competitive advantage, counterfeiting, etc. He said some
counterfeiters were so good they cloned his company's products down to the
transistors. I said, "Holy shit!"

One more thing for you while I'm still on this: cost reduction via merged
designs. The mask and fab runs for prototyping cost tons of money. A well-
known way to reduce that is many companies sharing one run (eg shuttle run or
MPW) to get their test chips cheaper. A less-known trick, at least outside
embedded, is putting multiple products on one ASIC to do the same thing for
production runs, with a factory setting telling it what chip to look like. My
hardware guy gave an example of 3G or WiFi circuitry embedded in a microcontroller
used in your input devices... perfect for keylogging... that was only
incidentally there since the supplier offered both feature-phone SoCs and
peripheral chips and simply didn't want to manufacture two lines. Such extras
might be re-enabled, even remotely, depending on how they control access to
them. So, gotta watch out for them.

------
contingencies
It strikes me that the use of diverse systems to reinforce assumptions of
trust within a given subsystem is an architectural paradigm not limited to
compilers. The key problems are implementation feature-set or edge-case
differences and overhead (real time and maintenance/up-front development). In
fact, it would be ideal to have multiple client versions/implementations on any
service (particularly distributed or financial), and indeed I have done this in
the past. Not sure if this paradigm has a name... anyone? I suppose you could
just say consensus-based hedging.

~~~
nickpsecurity
It's in fact a long-established technique in high-assurance systems, going back
to (I think) aerospace or security-critical systems, where the triple-modular
redundancy trick was reapplied with separate teams building each one. Security through
diversity also re-emerged relatively recently as a very active sub-field of
INFOSEC/IT. If you're interested in that stuff, use these terms in various
combinations when you're Googling: "artificial diversity," "automated
obfuscation," "moving target software security," "security diversity
software." Also, including "pdf" helps given most good ones are papers. The
word "survey" will occasionally land you on a pile of gold with references all
in one spot. :)

Happy hunting!

------
lamby
Hi all, I do a lot of work on Reproducible Builds within Debian (AMA...!).

Just wanted to mention we are now having regular IRC meetings:

[https://lists.reproducible-builds.org/pipermail/rb-
general/2...](https://lists.reproducible-builds.org/pipermail/rb-
general/2016-October/000071.html)

------
bitwize
I still can't see this paper referenced without thinking of HISSATSU! DOUBLE
COMPILE!:
[https://www.youtube.com/watch?v=FHkFzRZdlV4](https://www.youtube.com/watch?v=FHkFzRZdlV4)

------
nickpsecurity
This again. A perfect example of solving the wrong problems in a clever way.
To his credit, Wheeler at least gives credit to the brilliant engineer
(Karger) who invented the attack, points out it took 10 years before that
knowledge reached anyone via Thompson (recurring problem in high-security),
and did the reference essays on the two solutions to the actual problem (high-
assurance FLOSS & SCM's). That's what you're better off reading.

Here's a quick enumeration of the problems in case people wonder why I gripe
about this and the reproducible-builds fad:

1. What the compiler does needs to be fully specified and correct to ensure
security.

2. The implementation of it in the language should conform to that spec or
simply be correct itself.

3. No backdoors are in the compiler, the compilation process, etc. This must
be easy to show.

4. The optimizations used don't break security/correctness.

5. The compiler can parse malicious input without code injection resulting.

6. The compilation of the compiler itself follows all of the above.

7. The resulting binary that everyone has is the same one matching the source,
with the same correct _or malicious_ function but no malicious stuff added
that's not in the source code already. This equivalence is what everyone in
the mainstream is focusing on. I already made an exception for Wheeler himself,
given he did this _and_ the root-cause work.

8. The resulting binary will then be used on systems developed without
mitigating the problems above, to compile other apps that don't mitigate them
either.

So, that's a big pile of problems. The Thompson attack, countering the
Thompson attack, or reproducible builds collectively address the tiniest
problem vs all the problems people actually encounter with compilers and
compiler distribution. There are teams working on the latter that have produced
nice solutions to a bunch of them. VLISP, FLINT, the assembly-to-LISP-to-HLL
project & CakeML-to-ASM come to mind. There are commercial products, like
CompCert, available as well. Very little comes from the mainstream, FOSS or proprietary.

The "easy" approach to solving most of the real problem is a certifying compiler
in a safe language, bootstrapped on a simple, local one, whose source is
distributed via secure SCM. In this case, you do not have a reproducible build
in the vast majority of cases, since you've verified the source itself and have a
verifying compiler to ASM. You'll even benefit from having no binary, since your
compiler can optimize the source for your machine or even add extra security
to it (a la Softbound+CETS). Alternatively, you can get the binary that
everyone can check via signatures on the secure SCM. You can even do
reproducible builds on top of my scheme for the added assurance you get in
reproducing bugs or the correctness of specific compilations. Core assurance...
80/20 rule... comes from doing a compiler that's correct-by-construction as much
as possible, easy for humans to review for backdoors, and hosted on a secure repo
& distribution system.

Meanwhile, the big problems are ignored and these little, tactical solutions
to smaller problems keep getting lots of attention. The same thing happened
in the Karger-to-Thompson time frame with Karger et al's other
recommendations for building secure systems. We saw where that went in terms
of the baseline of INFOSEC we had for decades. ;)

Note: I can provide links on request to definitive works on subversion, SCM,
compiler correctness, whatever. I think the summary in this comment should be
clear. Hopefully.

Note 2: Anyone that doubts I'm right can try an empirical approach of looking
at bugs, vulnerabilities and compromises published for both GCC and things
compiled with it. Look for the number of times they said, "We were owned by the
damned Thompson attack. If only we had countered it with diverse double-
compilation or reproducible builds." Compare that to failures in other areas
on my list. How unimportant this stuff is vs higher-priority criteria should
be self-evident at that point. And empirically proven.

~~~
zzzcpan
You are right, but reproducible builds are still very useful, not for high-
assurance though.

~~~
nickpsecurity
They're barely useful for low assurance. Just read the Csmith paper testing
compilers to see the scope of the problem. The solution to what they're
really worried about will require (a) a correct compiler, (b) written in
cleanly-separated passes that are human-inspectable (i.e., probably not in the
C language), (c) implemented with correctness checks to catch logical errors,
(d) implemented in a safe language to stop or at least catch language-level
errors, (e) stored in a build system hackers can't undetectably sabotage, (f)
trusted distribution to users, and (g) compiled initially with a toolchain
people trust, with an optional, second representation for that toolchain.

Following Wirth's Oberon and VLISP Scheme, the easiest route is to leverage
one of those in a layered process. Scheme, especially PreScheme, is easiest, but I
know imperative programmers _hate_ LISPs no matter how simple. So, I include
a simple, imperative option.

So, here's the LISP example. You build an initial interpreter or AOT compiler
with basic elements, macros, and assembly code. It's easy to verify by eye or by
testing. You build other features on top of it piece by piece, in isolated
chunks, using the original representation, until you get a _real language_. You
rewrite each chunk in the real language and integrate them. That's the first
real compiler, compiled with the one you built piece by piece, starting from a
root of trust that was a tiny, static LISP with matching ASM. You can use the
first real compiler for everything else.
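
To make the "tiny root of trust you can verify by eye" idea concrete, here's a
rough sketch (in Python, purely for illustration, not the real bootstrap) of
how small the bottom layer can be; macros, a real standard library, and the
self-hosted compiler would all be layered on top of something of roughly this
size.

```python
import operator

def tokenize(src):
    return src.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    tok = tokens.pop(0)
    if tok == "(":
        lst = []
        while tokens[0] != ")":
            lst.append(parse(tokens))
        tokens.pop(0)  # discard ")"
        return lst
    try:
        return int(tok)
    except ValueError:
        return tok  # a symbol

ENV = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(expr, env=ENV):
    if isinstance(expr, str):   # symbol lookup
        return env[expr]
    if isinstance(expr, int):   # literal
        return expr
    if expr[0] == "if":         # (if test then else)
        _, test, then, alt = expr
        return evaluate(then if evaluate(test, env) else alt, env)
    fn, *args = [evaluate(e, env) for e in expr]
    return fn(*args)

print(evaluate(parse(tokenize("(+ 1 (* 2 3))"))))  # prints 7
```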

Wirth did something similar out of necessity with P-code and Lilith. In the
P-code days, people needed compilers and standard libraries but couldn't write
them. They could write basic system code on their OSes. So, he devised an
idealized assembly that could be implemented by anyone in almost no code, with
just some OS hooks for I/O etc. Then, he modified his Pascal compiler to turn
everything into P-code. So, ports & bootstrapping just required implementing one
thing. It got ported to 70+ architectures/platforms in 2 years as a result.

The imperative strategy for anti-subversion is similar. Start with an idealized,
safe, abstract machine along the lines of P-code, with ASM implementations. The
initial language might be an Oberon subset with LISP or similar syntax just for
effortless parsing. The initial compiler is done in a high-level language for
human inspection, with the code side-by-side in the subset language for that
idealized ASM. The idealized ASM is designed to match the high-level language,
too. Create the initial compiler that way, then extend, check, compile, repeat,
just like the Scheme version.
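
As an illustration of the "idealized abstract machine" idea (a hypothetical toy,
far simpler than real P-code), the whole porting burden reduces to reimplementing
a loop like this on each target; the compiler sitting above it never changes.

```python
def run(program):
    """Interpret a tiny stack-machine program: the only piece a new
    platform would have to reimplement (ideally in its native assembly)."""
    stack, pc = [], 0
    while pc < len(program):
        op, *arg = program[pc]
        if op == "push":
            stack.append(arg[0])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "print":
            print(stack.pop())
        pc += 1

# (1 + 2) * 3, as a front end might emit it for the idealized machine
run([("push", 1), ("push", 2), ("add",), ("push", 3), ("mul",), ("print",)])
```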

The simple, easy code of the initial compilers, and the high-level language of
the final compilers, means anyone can knock them off in just about any language.
That will increase diversity across the board, as many languages, runtimes,
stdlibs, etc. are implemented quite differently. Reproducible-build techniques
can be used on the source code and the initial process of compilation if one
likes. The real security, though, will be that many people reviewed the
bootstrapping code, the ZIP file is hashed/signed, and users can check that the
source ZIP they acquired matches what was reviewed. Then they just compile and
install it.
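
That user-side check could be as simple as the following sketch (hypothetical
filenames; in practice the published digest would come from a signed release
announcement or the secure SCM itself):

```python
import hashlib
import sys

def sha256_of(path):
    """Hash the downloaded source archive in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

# usage: python verify.py bootstrap-src.zip <digest-published-by-reviewers>
path, published = sys.argv[1], sys.argv[2]
if sha256_of(path) == published.lower():
    print("source matches what was reviewed - safe to bootstrap")
else:
    print("MISMATCH - do not build from this archive")
```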

~~~
dwheeler
Excellent, it's easy! Why haven't you completed this yet? :-)

So, a few thoughts.

The CSmith paper "Finding and Understanding Bugs in C Compilers" is a fun
paper:
[https://www.cs.utah.edu/~chenyang/papers/pldi11-preprint.pdf](https://www.cs.utah.edu/~chenyang/papers/pldi11-preprint.pdf)
- however, let's delve further. They found defects in every compiler they
tried, proprietary and OSS. They even found defects in CompCert - because they
were defects in CompCert’s unverified front-end code. What's more, they
focused on "atypical combinations of C language features" - which are
important, but to far fewer users.

Yes, it'd be awesome to have compilers that are perfect and have absolutely no
defects. Let's work on that. But it will be many, many years before they are
widespread.

Besides, while no-defects would be awesome, many people are more interested in
a different and simpler requirement - they want to detect subversion of
binaries (where the binary and source do not correspond). Yes, provably
perfect compilers could do that, but you don't need to wait for them;
reproducible builds and DDC can provide that _now_, and you don't have to
wait for anything.
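
For readers who haven't gone through the dissertation, the DDC check itself is
mechanically simple. Here's a rough sketch (hypothetical paths and a generic
compiler command line, simplified from the paper's formal treatment): regenerate
the compiler-under-test from its source using an independent, trusted compiler,
let that result self-compile the same source, and compare the output bit-for-bit
with the binary everyone is actually running.

```python
import filecmp
import subprocess

def compile_with(compiler, source, output):
    # Generic "compiler source -o output" invocation; real builds are messier.
    subprocess.run([compiler, source, "-o", output], check=True)

# Stage 1: a diverse, trusted compiler builds the compiler-under-test's source.
compile_with("./trusted-diverse-cc", "compiler-under-test.c", "stage1")
# Stage 2: that result compiles the same source again (self-regeneration).
compile_with("./stage1", "compiler-under-test.c", "stage2")

if filecmp.cmp("stage2", "distributed-compiler-binary", shallow=False):
    print("bit-for-bit match: no trusting-trust trojan, under DDC's assumptions")
else:
    print("mismatch: could be non-determinism in the build, or subversion")
```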

So let's talk about VLISP. VLISP spawned an amazing number of papers, and was
interesting work. Where's the code? MITRE never released it to my knowledge.
To me, programs I can't run are in the "who cares?" category, and stuff people
can't reproduce & investigate isn't really science anyway. Besides, VLISP only
generated code for computers people generally don't use anymore. (You'll
notice that I posted the scripts demonstrating DDC so that others can
reproduce their execution.)

Sure, p-code was awesome in its time; I used it. But when newer hardware (the
IBM PC) came along, it got superseded. More importantly for our story, it got
superseded before there was time to develop any complex proofs. This is a more
general problem: As long as formal proofs demand a massive amount of time,
their results will become useless due to obsolescence. We _must_ improve the
tooling and models so that the proofs and layers can be done in a much faster
and cost-effective way. Proving obsolete stuff is not very helpful. I think
the ProVal approach (using Why3) is especially promising, but fundamentally,
we need to make it _not_ a massive research effort to write a high-assurance
compiler.

Oh, and a little out-of-scope: As someone who's written Scheme & Common Lisp
for decades, the problem with Lisp isn't its non-imperative nature; lots of people
like Elm and Haskell, which are also functional languages. The problem with
Lisps is their hideous syntax; Lisps don't even support infix by default,
something every grade schooler learns. One solution is here:
[http://readable.sourceforge.net/](http://readable.sourceforge.net/)

Anyway, what you've outlined is basically a program to build up from small
safe components into larger trustworthy components. It's a sound strategy, and
one that has been repeatedly advocated for decades by many people. But we also
need to admit that it's going to take a long time, because our tooling is only
_just_ becoming good enough, and even then only in certain cases. There are
serious limitations you're glossing over.

Don't ignore shorter-term smaller wins; they're valuable.

~~~
srtjstjsj
We all should have your ability to cheerfully and respectfully reply to
disdainful sneers. Thank you.

~~~
nickpsecurity
I'm disagreeing with a person I respect on a complicated topic rather than
offering "disdainful sneers." I do respect how he replies to the disagreement.
A scholar and a gentleman he is.

