

Deterministic, bit-identical and/or verifiable Linux builds - zz1
https://bugzilla.mozilla.org/show_bug.cgi?id=885777

======
indygreg2
I filed the linked bug and am the technical owner of Firefox's build system.

There were efforts made and discussions outside of the linked bug. To say
"nothing" was done is just not true.

It would be more accurate to say that we just can't justify working on this
right now because the timing isn't right and it's high cost for perceived low
reward. The time of everyone involved to implement this would be better spent
on improvements that benefit the general Firefox population. Some of those
improvements include overhauling Firefox's build automation to better support
things like building with Docker. That lays the groundwork for (easier)
deterministic builds in the future. Even then, I'm not sure if this will
happen. Brendan's post called on the larger community to make requests of
Mozilla. That front has been surprisingly quiet. If you really want this, I
would suggest making noise on the mozilla.org domain. Even better, contribute
some patches, like the Tor Project has done: I will happily review them!
#build on irc.mozilla.org.

~~~
zz1
Sorry if I wrote that nothing was done. What I really meant was that no change
has landed, and at first glance even the discussion in the bug doesn't make it
clear whether there is a path toward deterministic builds.

What would "making noise on the mozilla.org domain" look like? Must say,
though, I am kind of saddened to see that this discussion was upvoted 80 times
while the bug is still at 6.

~~~
indygreg2
The path towards deterministic builds is definitely not clear. As many in this
thread have pointed out, it's a difficult technical problem. The difficulties
are multiplied by a project at Firefox's scale.

Further complicating matters is our platform breakdown. The majority of
Firefox users are on Windows. Deterministic builds on Windows are very
painful. And that's before you figure PGO into the mix. Tor works around this
by compiling Firefox with an open source toolchain and doesn't use PGO. But
that's a non-starter for us because choosing an open source toolchain over
Microsoft's would result in performance degradations for our users. Believe
me, if we could ship a Windows and Mac Firefox built with 100% open source to
no detriment to our users, we would. There's work to get Firefox building with
Clang on Windows (but only for doing ASAN and static analysis, not for
shipping to users). That gets us one step closer.

All that being said, there has been exploratory talk lately of serving
segments of our user base with specialized Firefox builds. e.g. a build with
developer tools front and center that caters to the web development community.
If that ever happens, I imagine a deterministically-built Firefox with things
like Tor built in could be on the table. The way you can make that happen is
to aim that noise directly at the Mozilla community. Send a well-crafted email
to firefox-dev ([https://mail.mozilla.org/listinfo/firefox-
dev](https://mail.mozilla.org/listinfo/firefox-dev)) explaining your position.
Anticipate that people will likely reply by asking you to prioritize this
against existing goals, such as shipping 64-bit Firefox on Windows and
shipping multi-process Firefox. We don't have nearly unlimited resources like
some of the other browser vendors, so we can't just do everything. Again, I
implore people to directly contribute to Mozilla any way they can.
[https://www.mozilla.org/contribute/](https://www.mozilla.org/contribute/)

------
crshults
I'm in casino gaming. We have to send our source and tools to regulatory test
labs so they can (hopefully) independently generate the same binary as what we
are delivering. Given our tools (C++ and Windows), 'binary reproducibility'[1]
is impossible, but we've got a workaround. We do our release builds on a
VirtualBox that's all tooled up. When it comes time to deliver to the lab, we
export the entire box (with source already synced) as an .ova. Part of our
build pipeline is a tool that strips things like timestamps and paths from the
PE files. Some people don't go to all this trouble and instead use tools like
Zynamics BinDiff to explain away the diffs.

[1][https://www.google.com/?gws_rd=ssl#q=binary+reproducibility](https://www.google.com/?gws_rd=ssl#q=binary+reproducibility)
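The timestamp-stripping step described above can be sketched in a few lines.
This is a toy illustration, not our actual tool: the function name
`zero_pe_timestamp` is made up, and a real stripper also has to normalize the
debug directory, embedded PDB paths, resource timestamps, and so on.

```python
import struct

def zero_pe_timestamp(data: bytes) -> bytes:
    """Zero the TimeDateStamp field in a PE file's COFF header.

    `data` is the raw bytes of a PE executable. Returns a copy with the
    4-byte build timestamp (one common source of nondeterminism) zeroed.
    """
    buf = bytearray(data)
    # e_lfanew, at offset 0x3C of the DOS header, points at the "PE\0\0"
    # signature that starts the NT headers.
    pe_off = struct.unpack_from("<I", buf, 0x3C)[0]
    if buf[pe_off:pe_off + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file")
    # The COFF header follows the signature: Machine (2 bytes),
    # NumberOfSections (2 bytes), then TimeDateStamp (4 bytes) at +8.
    struct.pack_into("<I", buf, pe_off + 8, 0)
    return bytes(buf)
```

Run it over every PE file in the package as part of the build pipeline, before
hashing anything.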

~~~
chubot
What are the companies that provide this service (reproducing builds)? I
haven't heard of this, but sounds interesting.

Depending on how much effort you're willing to put in, even if you use C++ and
Windows, you can still write a program to parse the executable and zero out
timestamps and other non-deterministic data. That is actually being done in a
Bitcoin-related program for Windows, I believe.

How do you generate and verify the VirtualBox? If you send the image over to
the test lab, then the obvious thing to do is for someone to attack your
VirtualBox, and you have the same problem all over again, just at a different
level.

~~~
crshults
For jurisdictions that don't have their own state-run labs (so not NV, NJ, PA,
etc.) everybody uses one or a mix of GLI[1], BMM[2], and Eclipse[3]. Note: I'm
only familiar with US gaming.

We do have a tool to zero these parts of the executable files out, but in our
testing we still had unexplainable differences unless we were on the same
machine working from the same sync.

The VirtualBox was generated once (installed Windows, Visual Studio, .NET,
some others) and we just continue to use the same base .ova.

The package has to be sent to the lab on physical media where it gets loaded
onto an offline machine that we've supplied.

[1][http://www.gaminglabs.com/](http://www.gaminglabs.com/)
[2][http://www.bmm.com/](http://www.bmm.com/)
[3][http://www.eclipsetesting.com/](http://www.eclipsetesting.com/)

~~~
hobarrera
This works for your goal (being able to reproduce the binary build), but in
Mozilla's case it's slightly different. Being FLOSS software, Mozilla's goal
is that end-users can _completely_ reproduce the builds from source. This
includes dependencies, toolchains, AND the build environment. In this
scenario, accepting a pre-built binary VM would not be acceptable, since it
defeats the spirit of FLOSS.

------
zz1
In January 2014, Brendan Eich [1] called on organizations to build up a
system for verifying Firefox builds, to ensure the browser can't be used as an
attack vector by being distributed with some malicious feature added beyond
what's in the source code.

Six months later, nothing has been done, because Firefox builds are not
deterministic yet. If you think this is an important issue, please vote for
this bug.

Edit: [1] [https://brendaneich.com/2014/01/trust-but-
verify/](https://brendaneich.com/2014/01/trust-but-verify/)

~~~
taeric
I'm curious about the theoretical basis of this effort. I'm reminded of "On
Trusting Trust." Simply put, this is not at all an easy problem to tackle.

No, I'm not against trying. Just going from your summary, I'm not sure what is
being aimed at. Specifically, would a "deterministic" build really help much?

edit: I am perusing [https://blog.torproject.org/blog/deterministic-builds-
part-o...](https://blog.torproject.org/blog/deterministic-builds-part-one-
cyberwar-and-global-compromise) and
[https://blog.torproject.org/blog/deterministic-builds-
part-t...](https://blog.torproject.org/blog/deterministic-builds-part-two-
technical-details) Good reads so far.

~~~
raving-richard
Please have a look at David A. Wheeler’s page on Trusting trust [1], including
his 2009 PhD dissertation [2], where he clearly demonstrates that it is
possible to have trusted (not in the MS sense...) computers (I think).

You may also be interested in 'Countering "Trusting Trust"' on Schneier's
website [3], which discusses a 2006 paper, also by Wheeler.

[1] [http://www.dwheeler.com/trusting-
trust/](http://www.dwheeler.com/trusting-trust/) [2]
[http://www.dwheeler.com/trusting-
trust/dissertation/html/whe...](http://www.dwheeler.com/trusting-
trust/dissertation/html/wheeler-trusting-trust-ddc.html) [3]
[https://www.schneier.com/blog/archives/2006/01/countering_tr...](https://www.schneier.com/blog/archives/2006/01/countering_trus.html)

~~~
taeric
My memory of that was that it let you know whether you could trust your
compiler. I couldn't remember if it extended to the rest of the toolchain.
Nor did I remember if it really hinged on deterministic builds. I'll have to
reread it.

~~~
dllthomas
Strictly speaking, it lets you know that your compiler binary matches its
source. You can then read the source to decide if you trust the compiler (and
others can audit it, can audit binaries generated from it, &c, &c). At which
point, as raving-richard says, you can start to trust that your other
utilities match their source as well. Which source also should be audited, &c,
&c.

~~~
taeric
Right, my point is that deterministic builds of Firefox aren't even really
needed for this. If you trust your compiler, you trust your compiler. What
does it matter if you have a non-deterministic build of a utility? You trust
what is non-deterministically building it.

As this stands, if you deterministically build Firefox, you just know that if
your toolchain is corrupted, it is consistently corrupted. :)

Right?

~~~
esrauch
If you trust your compiler, you can verify that your build is safely based on
the source that you have. If the build is deterministic, then you can verify
that the binary being distributed to the masses isn't compromised by building
the same file yourself and seeing that it is the same.

~~~
taeric
Right, and my question is essentially whether this is "putting the cart before
the horse." Does Mozilla have efforts in place to establish trust in their
compilers? (I expanded on my response below. I really wish I knew the correct
way to "merge" conversation trees here. Is there a good protocol for that?)

------
raving-richard
Even if you hate Bitcoin for other reasons, this is one reason to appreciate
it. Gitian, the software used by Tor for deterministic builds (of their build
of Firefox especially), was originally written by Bitcoin developers. Which
makes sense: you want to make sure your money is secure.

Good things come out of things you might hate.

~~~
zaroth
Deterministic builds are pretty neat. I think the second, equally important
piece is a Web of Trust full of people willing to reproduce the build and sign
off on the hashes.

I was able to reproduce the sha512sum of Bitcoin back when 0.9.0 came out
without too much trouble, but it definitely took a couple hours to get it all
working.

I feel a bit bad I didn't take the next step and attach my digital signature
signifying that I could reproduce it. There are only a few people other than
Gavin who go to the trouble of signing off on the hashes.

I wonder if Docker could be used to speed up the overall process and make
builds more accessible. As I recall, the current scripts set up a single-core
KVM, which definitely slowed things down.

~~~
taeric
Docker _could_ speed up parts. But, unless I misunderstand what you mean it
wouldn't really help the trust aspect. You'd just be shifting your trust to
the docker pieces. (That is, then the goal shifts to "can anyone reproduce the
docker container?") Right?

------
jlebar
Before everyone gets up in arms about Mozilla not working on this: As I wrote
the last time this came up, deterministic builds are a nice thing, but they're
only a small piece of the puzzle of protecting users from state-sponsored
malicious actors. Indeed, it seems to me that messing with builds would be one
of the more difficult ways for the NSA to pwn Firefox users.

[https://news.ycombinator.com/item?id=7045605](https://news.ycombinator.com/item?id=7045605)

------
mrpdaemon
Or use Gentoo, that's what I do. You can verify hashes/signatures on the
Firefox source archive and audit the source code if necessary before
compiling.

That was only half serious - I know there are valid use cases for people who
prefer using binary distros. However, I think this particular issue is a good
example of why, IMO, even binary distros need to provide a convenient option
to locally build any package for security-conscious users.

~~~
taeric
That sounds tangential. The point is if two people build the same thing, they
_should_ be able to compare their builds to see if they are truly the same. If
not, the argument is that one of them has a "tampered" environment.

In other words, if you don't know your compiled binary is the same as the
distributed binary, you have no reason to think yours does not have a
vulnerability added by the toolchain.

Unless I'm the one that is misunderstanding, of course. :)

~~~
mrpdaemon
Well it's a solution to the same underlying problem - that by running binaries
compiled by a 3rd party you trust that they aren't adding in code to
compromise your privacy (voluntarily or not). If you compile the application
from source yourself you don't need that leap of faith - no need to compare
identical binaries or have deterministic builds (which is not trivial as the
bug report demonstrates).

~~~
taeric
I'm not sure your solution solves that. If Firefox has vulnerabilities in the
source right now, you do little to protect yourself by compiling on your own.
Even if you can verify that you and someone else produce the same binary, they
could just both be vulnerable.

In fact, if you compile it yourself, _unless_ you can verify the compile
against a "known good" one, then you can't even be sure that your local
toolchain hasn't been compromised. (I mean, sure, if you were a perfect
auditor of your entire toolchain, then you could have some confidence here.
You have to be perfect, though.)

Consider: you do a compile of Firefox and it is different from the one
available for download. Why? As things stand now, _you don't know._ And that
is the problem.

~~~
mpyne
> If Firefox has vulnerabilities in the source right now, you do little to
> protect yourself by compiling on your own.

You do more to protect yourself than taking the same vulnerable source and
compiling it with Mozilla's "reproducible build chain".

If the source itself is corrupt then having a verified build of malicious
source is completely useless.

With Gentoo you can verify the source itself matches the "trusted" upstream
source and then build it with your own trustworthy build chain.

And before you go "what if your build chain isn't trustworthy huh????" think
about it a little further... if your own local build chain can't be trusted
you're already screwed even before you download anything from mozilla.org,
just as you'd be if you downloaded a "bit verified" binary from mozilla to run
on your already-pwned local operating system.

~~~
taeric
No you don't. You do _nothing_ to protect yourself from vulnerabilities in the
code by compiling it yourself. Literally nothing.

You _do_ protect yourself from vulnerabilities in their toolchain. And this is
where the effort makes sense. If there are differences in the builds, then you
can at least suspect one of you has a tampered environment. Right now, you
have no way of knowing that one way or the other. You just have the joy of
having done your own build.

My main question is still just one of magnitude. Consider, I have not had a
wreck or other car mishap in 20 years. I could conclude that seatbelts, then,
have not increased my safety really. I am _not_ trying to make that claim, as
I feel it is false. So, my question here is essentially, how much safer would
this really make things? (Or trustworthy, if you'd rather that term.)

------
zobzu
deterministic doesn't really mean trust per se to me. it means reliable.

if you don't know what you're getting every time, it's hard to be reliable at
all.

of course, if you're reliable, it's more trustworthy.

~~~
equoid
Repeatable, "reliable" only in the sense that you can rely on the outcome.

~~~
zobzu
here's the difference:

\- make the build system reproducible, so that every build is exactly the same
binary, no matter who runs it. that's "easy". But you don't know _why_ you get
that exact binary

\- make the build system verifiable, or the resulting binary verifiable so
that you know exactly why you get that binary. This is hard.

The first one is repeatable, reliable.

The second one is trustworthy, verifiable.

