
How I compiled TrueCrypt 7.1a for Win32 and matched the official binaries - maqr
https://madiba.encs.concordia.ca/~x_decarn/truecrypt-binaries-analysis/
======
generalpf
That's amazing work. Well done to the author.

~~~
junto
Came here to say this too. It is this kind of in-depth investigative work,
which has genuinely taken time and smarts to achieve, that I come to HN to
enjoy reading. All credit to the author.

------
wai1234
This is a great first step but we're not done yet. It proves the binaries are
built from the published code, but only when the published code has been
thoroughly vetted can we conclude there is no backdoor.

~~~
edwintorok
Also the build is not deterministic yet, even on his own machine he got some
differences in truecrypt.sys between successive builds. By definition the
build is not (yet) fully deterministic then:

"Using the same source and same project directory results in the same pattern
of difference in the block starting at 0002CBAC, as the pattern shown between
my build from the correct project directory and the original file. This means
that this difference is a normal result of the compilation process, and can be
considered harmless from our point of view"

~~~
0x0
The disassembly for both versions of the file was a 100% match, though, which
is a pretty good indicator that the binary difference must be something
unimportant and not related to the actual code.

~~~
ynik
It's possible that the difference is in the actual code: x86 machine code
sometimes has multiple different encodings for the same assembler instruction.
These would show up as single-byte differences in the binary. See
[http://www.strchr.com/machine_code_redundancy](http://www.strchr.com/machine_code_redundancy)
for some examples. Compilers might use this to steganographically hide
additional data in the binary. Printer manufacturers already do something
similar
([https://en.wikipedia.org/wiki/Printer_steganography](https://en.wikipedia.org/wiki/Printer_steganography)).
I wouldn't be surprised if MS compilers embedded hidden data in binaries --
it could be useful to track down malware authors; or identify software created
with pirated MS tooling.
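
To make the redundancy concrete, here is a minimal sketch (not taken from the linked page) that decodes the two register-to-register forms of the 32-bit x86 `ADD` instruction. The helper name `decode_add` is made up for illustration, and the sketch assumes 32-bit operand size and register-direct operands only. Both byte sequences decode to the same assembler text, so a disassembly listing would not distinguish them.

```python
# Register numbering used by the ModRM byte in 32-bit mode.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_add(code: bytes) -> str:
    """Decode a two-byte register-to-register ADD (hypothetical helper)."""
    opcode, modrm = code[0], code[1]
    mod = modrm >> 6          # top two bits: addressing mode
    reg = (modrm >> 3) & 7    # middle three bits: "reg" operand
    rm = modrm & 7            # low three bits: "r/m" operand
    if mod != 3:
        raise ValueError("this sketch only handles register-direct operands")
    if opcode == 0x01:        # ADD r/m32, r32: destination is the r/m field
        dst, src = rm, reg
    elif opcode == 0x03:      # ADD r32, r/m32: destination is the reg field
        dst, src = reg, rm
    else:
        raise ValueError("not one of the two ADD forms handled here")
    return f"add {REGS[dst]}, {REGS[src]}"


first = decode_add(bytes([0x01, 0xD8]))
second = decode_add(bytes([0x03, 0xC3]))
print(first)
print(second)
```

A compiler that chose between the two forms on purpose could carry one bit of information per such instruction, which is the kind of channel described above.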

~~~
oakwhiz
Perhaps this could even be used by an organization to differentiate official
binaries of closed-source software from leaked binaries.

------
yeukhon
"TrueCrypt is a project that doesn't provide deterministic builds."

Why? What is the benefit of doing so when everyone wants a deterministic
build?

~~~
aroch
Deterministic builds are hard...really hard. The combined power of the Debian
community has trouble getting deterministic builds, as does the Tor project.

~~~
gngeal
_Deterministic builds are hard...really hard._

I'd have thought that deterministic builds are really simple unless your
toolkit ecosystem is FUBAR. After all, a compiler is a simple function from
input to output (unless the FUBAR ecosystem syndrome arises, as I said).

~~~
UnFleshedOne
Here is a not-in-depth look at what's involved in trying to generate
byte-for-byte identical binaries on msvc:

[http://stackoverflow.com/questions/1180852/deterministic-bui...](http://stackoverflow.com/questions/1180852/deterministic-builds-under-windows)

In short, if you ignore certain things when comparing binaries and make sure
you build things on absolute path of the same length(!), you can tell binaries
are functionally equivalent.

~~~
PeterisP
Well, the parent poster would say that if binaries depend on the path length
of build location, then that's evidence of a FUBAR toolchain :)

------
zokier
Just a slightly off-topic question, but WTF does TC require VC 1.52 for?

~~~
sliverstorm
Considering they document their build process very carefully, I can only
imagine they wound up standardizing on that library a long time ago and keep
using it to this day.

~~~
acqq
Yes, even if there is a recent open source alternative for building 16-bit
code, selecting 20-year-old compiler binaries
([http://en.wikipedia.org/wiki/Visual_C++](http://en.wikipedia.org/wiki/Visual_C++)),
for which it is easiest to be sure they haven't changed (there are still
enough CDs around that were produced at that time; I believe I still have one
too), is a damn good decision.

~~~
im3w1l
"Reflections on trusting trust" is 30 years old. 20 year old binaries should
still be more trustable than newer ones though, I suppose.

~~~
acqq
Truecrypt didn't exist in 1993.

------
bliker
I am just shooting in the dark, but wouldn't it be easier to compile it
twice and diff the outcomes to find out what parts are being changed, so
those can be ruled out?
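
A minimal sketch of that idea, assuming both builds are available as byte strings. The function name `diff_ranges` and the file names in the usage note are hypothetical and not from the article.

```python
def diff_ranges(a: bytes, b: bytes):
    """Return (start, end) offset pairs, end exclusive, where a and b differ."""
    common = min(len(a), len(b))
    ranges = []
    start = None
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i            # a new run of differing bytes begins
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(a) != len(b):
        # One file has extra trailing bytes, e.g. an appended certificate.
        ranges.append((common, max(len(a), len(b))))
    return ranges


print(diff_ranges(b"abcdef", b"abXXef"))
```

In practice one would read two successive builds, for example `open("build1/truecrypt.sys", "rb").read()` and the same for a second build, and treat the reported ranges as the regions that vary between builds.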

~~~
pionar
He did that further down in the article, when he noticed that the file path
was embedded in the TrueCrypt.sys file.

------
proctor
it seems to me that the relaxed gpg key verification that the author uses
doesn't give us any more assurances regarding the authenticity of the source
than a simple hash offered on the website would. i think in this situation, if
the author did not intend to attempt more rigorous verification of the
truecrypt pgp key, at least cross-checking that the key offered on the site
matches the key offered on a public key server pgp.mit.edu for example would
be prudent before signing the truecrypt key with your own.

    
    
      Import the .asc file in the keyring (File > Import certificates).
      Now you should mark the key as trusted: right click on the TrueCrypt Foundation public key 
      in the list under Imported Certificate tab > Change Owner Trust, and set it as I believe checks are casual.
      You should also generate your own key pair to sign this key in order to show you really trust 
      it and get a nice confirmation when verifying the binary.

~~~
acqq
You can do your own checks and compare with the author's! It's a very fast
thing to do for the checks you mention. The more people repeat it, the better.

~~~
proctor
i think there are two concerns here. one is that the source is not tainted by
a third party during or before the download. the second (arguably much more
important in this case) concern is that the compiled binary matches the
source. the second concern is addressed well by the author as far as i can
tell, but i think that there is room for improvement in their concerns about
the former. i assume they have thought about this and do have at least some
concerns because it is mentioned that

    
    
      The PGP signature of the binary can be downloaded 
      through the button PGP Signature, which makes you 
      download TrueCrypt Setup 7.1a.exe.sig over HTTPS 
      (*although with the NSA in the middle, it might not 
      mean much*).
    

[emphasis mine]

cross-referencing the pgp signature with at least one other (public) source
would go a long way toward allaying those concerns (that the HTTPS might not
mean much).

this criticism is in no way meant to detract from the rest of the work, and i
mean only to refer to pgp sig verification best practices here.

~~~
acqq
But you can easily do this too and make sure yourself! Do it yourself with
GPG, then calculate the SHA of the binaries and compare with his text. He
published the checksums he worked with at several points in his analysis.
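
For the checksum part, a minimal sketch using Python's standard `hashlib`. SHA-256 is an assumption here; use whichever SHA variant the article lists. The function names are made up for illustration.

```python
import hashlib


def sha256_of_stream(stream, chunk_size: int = 1 << 16) -> str:
    """Hash a binary stream in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


def sha256_of_file(path: str) -> str:
    """Convenience wrapper: hash the file at the given path."""
    with open(path, "rb") as handle:
        return sha256_of_stream(handle)
```

Calling `sha256_of_file` on the downloaded installer and comparing the hex string against the value printed in the analysis is the whole check.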

------
pointernil
I get the point reg. verifying the Windows-Compiling-Build, but wouldn't the
same verification on an open source platform allow for even easier (maybe even
automatic) verification?

How about a vmware/vbox image set up explicitly for that purpose? Not feasible
for windows due to licensing issues, i guess.

Also, huge kudos for the effort going into this work. Thanks!

------
CUViper
> TrueCrypt is not backdoored in a way that is not visible from the sources

... as long as you also trust the compiler not to introduce any backdoor...
(cf. Reflections on Trusting Trust)

~~~
dmix
Oh god, not this conversation again.

~~~
CUViper
Why not? If you're being so paranoid about the origin of a binary, you have to
at least acknowledge the fact that you're trusting the compiler in making this
comparison.

So let me throw out an idea that might help justify this trust too. Compile
the same TrueCrypt sources with a totally different compiler, then use both
binaries in a deterministic way and compare the raw encrypted result. (I'm
assuming here that the same encryption keys and data will give the same
result, but don't know for sure if that's true.)

~~~
jlgreco
Create a piss-poor barebones compiler (compiler A) by hand _(literally, if
possible. punchcards would allow you to hand-verify the contents of the
program in a way that is not susceptible to Thompson's attack (there is no
risk that your eyes were built with a compromised compiler))_ that suffices to
compile a compiler that you want to verify (compiler B). Compiler A should
run on hardware that you can trust (need not be x86).

    
    
      (= ((a-binary b-source) b-source) b-binary)
    

If you trust a-binary, and you trust b-source, then if that returns true you
should be able to trust anything created with b-binary. _(a-binary b-source)_
will not equal _b-binary_ , but _((a-binary b-source) b-source)_ should.

If the hardware is not executing binaries as described, then all bets are off.
Compiler A could, in theory, be built and run on a homebrew CPU
([http://www.homebrewcpu.com/](http://www.homebrewcpu.com/)), but if you are
putting that much effort into this, this better be your hobby...
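
A toy model of that check, as a sketch only. It assumes compilers are deterministic and reduces a binary to two labels: which compiler logic generated its code, and which source it faithfully implements. The names `Binary` and `run_compiler` are invented for this illustration and do not model any real toolchain.

```python
from typing import NamedTuple


class Binary(NamedTuple):
    built_with: str   # code-generation logic that produced this binary
    implements: str   # source this binary is a faithful translation of


def run_compiler(compiler: Binary, source: str) -> Binary:
    # A compiler binary generates code according to the source it implements,
    # regardless of which compiler originally built it.
    return Binary(built_with=compiler.implements, implements=source)


a_binary = Binary(built_with="by hand", implements="a-source")
b_source = "b-source"

stage1 = run_compiler(a_binary, b_source)   # (a-binary b-source)
stage2 = run_compiler(stage1, b_source)     # ((a-binary b-source) b-source)

honest_b = Binary(built_with="b-source", implements="b-source")
tampered_b = Binary(built_with="b-source", implements="b-source plus backdoor")

print(stage1 == honest_b)    # stage1 was generated by A's logic, so it differs
print(stage2 == honest_b)
print(stage2 == tampered_b)
```

In this model the first stage differs from the distributed binary only because A generates different code, while the second stage matches an honest b-binary and fails to match one that implements anything other than b-source.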

~~~
sliverstorm
Let's not forget you need to verify by hand the electrical characteristics &
functionality of every logic chip you use to build that homebrew CPU, because
the supply chain of basic logic chips is already known to be infected
(primarily a QA issue, but clearly a potential vector)

~~~
tedunangst
Don't forget the power company. They can cause "spurious" bit flips by
momentarily dropping the voltage at just the right instant. Best to use a
battery made from potatoes you've grown yourself.

~~~
sliverstorm
You also need to ensure either the device, or the location you operate the
device is rad-hardened. We cannot rule out the possibility that They can
control the phases of the sun and introduce errors to your circuit at-will
with solar flares.

~~~
kefka
I think it's time for the crowbar.

------
pamparosendo
I entered just to say it's an incredible work done by this guy... it's been
years since I analyzed a file in hex mode (from Norton Commander, hehe).

------
TheRealWatson
Please God, don't let the author be working for the NSA. These days I get
suspicious at every "it's all good" piece of news.

~~~
pilif
The good thing about his analysis: It has all the information you need to
reproduce it and form your own conclusion. So even if he was working for the
NSA, by following his steps, you will either come to the same result or not.

That's how science works.

------
xbeta
Coolest post I've read today! Good work!

------
smegel
Kudos for effort.

------
eterm
Tldr: Binaries didn't match, here's some handwaving at the differences.

~~~
e40
I think that's inaccurate. The disassembly of them matched perfectly.

~~~
PeterisP
Well then, what and why are the differences? I mean, if there's an arbitrary
data block somewhere, then the "matching disassembly" can have wildly
different behavior by simply copying & executing parts of that block.

~~~
voltagex_
It's explained in the article. Timestamps, file paths, certificates and
oddities of the PE format.
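
For the timestamp part, a minimal sketch of where that field lives. In a PE file the 32-bit value at offset 0x3C points to the `PE\0\0` signature, and the COFF header's `TimeDateStamp` sits 8 bytes after the start of that signature. The function names are hypothetical, and the sketch covers only this one field, not the paths or certificates.

```python
import struct


def pe_timestamp_offset(image: bytes) -> int:
    """Return the file offset of the COFF TimeDateStamp field."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ header")
    # e_lfanew: offset of the PE signature, stored at 0x3C in the DOS header.
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Signature (4 bytes) + Machine (2) + NumberOfSections (2) precede it.
    return e_lfanew + 8


def read_pe_timestamp(image: bytes) -> int:
    """Return the build timestamp as seconds since the Unix epoch."""
    (stamp,) = struct.unpack_from("<I", image, pe_timestamp_offset(image))
    return stamp
```

Two builds made at different times will differ in these four bytes even when everything else is identical, so a comparison can mask that offset before diffing.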

~~~
PeterisP
I got an impression from the article that disassembly was what he did to
explain the binary differences that remained _AFTER_ he corrected for
timestamps/certificates/etc.

~~~
Dylan16807
Your impression was wrong. He showed _all_ of the differences in the
screenshots. That's how few there were. Not a single bit of the code portions
was different. Only a handful of metadata bytes, plus appended certificate.

The disassembly was only there to be cute and emphasize the point.

