
So, you want to crypto - bqe
http://blog.existentialize.com/so-you-want-to-crypto.html
======
greenyoda
I think this quote from the article perfectly sums up the dangers of amateur
cryptography:

"_Cryptography isn't something you can iterate on until you get it right,
because you'll never know if you do._"

~~~
ReidZB
I feel like it's quoted from somewhere else, but the reference escapes me. At
any rate, it's a great quote.

People who try the iterated design approach are especially frustrating. It
ends up becoming a game of whack-a-mole with vulnerabilities, wherein an
experienced cryptanalyst will point out an issue, the designer will say "oh!
of course! let me apply a patch!", and then this continues to infinity. (This
scenario doesn't necessarily indicate a bad approach, but it's certainly a
symptom of iterated design.)

There's a particularly lovely story in Schneier's "Memo to the Amateur Cipher
Designer" [1]:

> A cryptographer friend tells the story of an amateur who kept bothering him
> with the cipher he invented. The cryptographer would break the cipher, the
> amateur would make a change to "fix" it, and the cryptographer would break
> it again. This exchange went on a few times until the cryptographer became
> fed up. When the amateur visited him to hear what the cryptographer thought,
> the cryptographer put three envelopes face down on the table. "In each of
> these envelopes is an attack against your cipher. Take one and read it.
> Don't come back until you've discovered the other two attacks." The amateur
> was never heard from again.

Part of what makes it so frustrating, though, is that usually we want to be
genuinely helpful. Building cryptosystems is _fun_ (dangerously so!), and it's
really crappy to end up saying "just scrap the whole thing" or what have you.
But if you want to keep your sanity...

[1] [https://www.schneier.com/crypto-gram-9810.html#cipherdesign](https://www.schneier.com/crypto-gram-9810.html#cipherdesign)

------
plg
The Matasano crypto challenges are a great place to start getting your feet
wet and your hands dirty.

[http://www.matasano.com/articles/crypto-challenges/](http://www.matasano.com/articles/crypto-challenges/)

Myself, I'm trying them in ANSI C.

~~~
alinajaf
This is a good choice. One of the challenges had me stuck for months because I
hadn't realised that you really need fine-grained control over bitshifting
that e.g. Ruby doesn't appear to give you. Taking twenty minutes to re-write
my solution to that challenge in C sorted it out straight away.

~~~
plg
I'm presently stuck on #6 in problem set 1...

~~~
alinajaf
Yeah, that ain't it! I'd warn you that it gets _much_ tougher in later
challenges. The beginning of set three resulted in many, many pages of ones
and zeros in my notebook. All good fun though!

------
phaus
>Do not let users use your product until it's been vetted.

It's OK to let them use it so you can have a large user-base to test with, you
just need to explain to them that it isn't proven secure. As in, explicitly
tell them that they are under no circumstances to use it with sensitive
information.

Playing around with cryptography is the only way to learn it, you just have to
remember to tell people that playing is exactly what you are doing.

~~~
tptacek
Two things.

First, while "playing around with cryptography" may be the only way to learn
it, _building_ cryptographic systems is just about the worst way to learn.
Professional cryptographers start by cryptanalyzing targets and use that
experience to inform their future designs. On the other hand, veteran
implementors who have never taken the time to learn how to break crypto turn
out protocols and designs that are repeatedly broken. You can see that right
now with TLS and the TLS working group, which still hasn't fixed MtE block
ciphersuites because veteran implementors can't get it through their heads
that MtE is a design flaw.

Don't learn by building. You need to learn by breaking.

Second, more than one project has done the tightrope walk of telling their
users "this isn't really safe" but then misleading (innocently or not) non-
savvy users into trusting them. One project made it clear that their system
wasn't "ready" to defend against nation-state adversaries... but then
suggested that maybe it would be good enough for journalists, and even
promoted it at an event for teaching journalists cryptography. It was later
comically broken.

Be honest with yourself. Crypto doesn't get beta-tested into resiliency.
Strong systems start out strong. If you're building something because it's your
dream to thwart the NSA, don't kid yourself into thinking that you'll get
there by first protecting people's Warcraft clans.

~~~
sillysaurus2
_Don't learn by building. You need to learn by breaking._

The best way to do this is to do the Matasano crypto challenges. The
challenges are designed to get progressively more difficult, and since the
goal of most of them is to break something, that means you'll learn how to
employ progressively more sophisticated attacks. They're also great for a
newbie, because in one challenge you'll implement something that seems
impervious to attack, then in the following challenge you'll attack and break
it, often via an unexpected attack vector. (The padding oracle attack comes to
mind.)

You'll come away with an understanding of AES in ECB, CBC, and CTR modes; HMAC;
timing attacks; attacks on RNGs; attacks on hashes; and a lot more. That's
just off the top of my head.

For me, there were a dozen "aha!" moments in the first 30 challenges. Each of
those moments now lives with me and informs my future decisions. They will also
make you much less confident in your ability to design secure cryptosystems,
which is good.

~~~
biscarch
Today you've convinced me to start on the Matasano challenges. I completed the
Stripe CTF from ~ a year ago and I've been looking for something else to try
my hand at that would improve my security knowledge.

------
milhous
I'm taking an Intro to Crypto course this spring. What's interesting is that
it's offered through the Math department, and I had assumed it was a CS class.

We'll be using this text:

[http://www.amazon.com/Introduction-Cryptography-Coding-
Theor...](http://www.amazon.com/Introduction-Cryptography-Coding-Theory-
Edition/dp/0131862391/ref=sr_1_1?ie=UTF8&qid=1387924295&sr=8-1&keywords=9780131862395)

Is this any good? Apparently a best seller in the "Software Coding Theory"
category on Amazon.

~~~
kbhomes
I had this same textbook for the Crypto course I just completed this semester.
It's a very good textbook, in my opinion, as the descriptions and examples are
really informative. Usually if I couldn't get the material through my
professor's lectures, it was sufficient to look it up in the book. However, we
did only briefly touch on cryptographic hashes and only a little on Legendre
and Jacobi symbols, and not at all on the elliptic curve and other special
topics towards the end of the text, so I can't comment on those.

The book does a very good job of talking about different algorithms and
concepts, often times with a very brief historical introduction, and includes
thorough descriptions of various popular/important attacks of those concepts.
In general it's a book I'd recommend for an introduction to cryptography. You
also learn a fair introductory bit of number theory which I really enjoyed.

I also met Dr. Washington, one of the co-authors of this book, who was a very
pleasant and energetic person who really enjoys the topic of cryptography.

By the way, where are you taking this course?

~~~
milhous
Thanks everyone for their reviews. Glad to hear this isn't a POS text.

I'm taking this at Millersville University as a once-a-week, 3 hour evening
course. I'm a Physics and CS major, and am taking it as an elective to get a
Math minor.

With all the NSA and crypto news these days, it sounds like a great time to
learn about the fundamentals of crypto. And I'm curious if there will be
actual programming involved because to my knowledge, there aren't any prereqs
for it, not even an intro to programming course.

------
andrewcooke
article mentions nothing-up-my-sleeve numbers, so a topical reminder that the
permutation for md2 (and rc2 apparently) is still unexplained (despite being
"derived from pi") - [http://crypto.stackexchange.com/questions/11935/how-is-
the-m...](http://crypto.stackexchange.com/questions/11935/how-is-the-md2-hash-
function-s-table-constructed-from-pi)

for all you conspiracists - this was designed by rivest, the r in rsa, now
famous for cooperating with nsa... (i don't really believe that the
permutation is a backdoor, but i would like to know how it's derived - rivest
is famous for elegant algorithms, and for the life of me i can't find a
simple, neat way to get those numbers from pi)
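
For contrast, here is a nothing-up-my-sleeve derivation that anyone can reproduce. The SHA-256 specification states that its eight initial hash values are the first 32 bits of the fractional parts of the square roots of the first eight primes. The sketch below recomputes them with exact integer arithmetic; `frac_sqrt_bits` is an illustrative helper name, not something from a library, and none of this says anything about how the MD2 table was built.

```python
from math import isqrt  # Python 3.8+

# Published SHA-256 initial hash values (FIPS 180-4).
SHA256_IV = [
    0x6A09E667, 0xBB67AE85, 0x3C6EF372, 0xA54FF53A,
    0x510E527F, 0x9B05688C, 0x1F83D9AB, 0x5BE0CD19,
]

FIRST_EIGHT_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19)


def frac_sqrt_bits(p: int, bits: int = 32) -> int:
    """Return the first `bits` bits of the fractional part of sqrt(p)."""
    # isqrt(p * 2**(2*bits)) equals floor(sqrt(p) * 2**bits), computed exactly.
    scaled = isqrt(p << (2 * bits))
    # Masking drops the integer part of sqrt(p) and keeps the fraction bits.
    return scaled & ((1 << bits) - 1)


derived = [frac_sqrt_bits(p) for p in FIRST_EIGHT_PRIMES]
print([f"{v:08x}" for v in derived])
print("matches published constants:", derived == SHA256_IV)
```

The point of the comparison is that a stated rule plus a few lines of code regenerates every constant, which is exactly what is missing for the MD2 permutation.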

~~~
tptacek
A world full of brilliant cryptographers outraged at the NSA, trying to get
the NSA off the IETF crypto review board, working on publishing results about
NSA-sponsored crypto... and you want to talk about the MD2 and RC2 constants?
What's the largest system that ever relied on MD2? Let's start there.

~~~
pbsd
There were certificates (including a root CA) using MD2 until recently. MD2
itself was only retired in 2011 [1].

[1] [https://www.rfc-editor.org/rfc/rfc6149.txt](https://www.rfc-editor.org/rfc/rfc6149.txt)

~~~
tptacek
You are obviously right. Now I feel dumb. I concede the importance of MD2.

Do you believe that the starting state for MD2 is a possible backdoor?

_Later: I'm batting .000 today on this stuff; it's not the starting state of
MD2 that he's talking about, of course, and the misapprehension that he was is
part of why I was dismissive. Go me._

~~~
pbsd
It seems your edit did all the work for me. Being in the core of the MD2
compression function puts the Sbox in a good place to be a backdoor.

However I strongly doubt this is one. The attacks that have broken MD2 do not
seem to hinge terribly on the Sbox (I may be wrong, it was only a cursory
look). It's more likely to me that the Sbox was generated using a hard-to-
replicate Knuth shuffle using the digits of Pi.
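
To make the "hard-to-replicate" part concrete, here is a minimal sketch of a Knuth (Fisher-Yates) shuffle driven by the decimal digits of pi. Everything specific in it is an assumption chosen for illustration: the 16-entry table, reading two digits per draw, and reducing with a plain modulo (which is slightly biased). It is not a reconstruction of the MD2 table.

```python
# First 50 decimal digits of pi, starting with the leading 3.
PI_DIGITS = "31415926535897932384626433832795028841971693993751"


def shuffle_from_digits(n: int, digits: str = PI_DIGITS) -> list:
    """Fisher-Yates shuffle of range(n) using a digit string as the random source."""
    if 2 * (n - 1) > len(digits):
        raise ValueError("not enough digits for a table of this size")
    stream = iter(int(ch) for ch in digits)
    table = list(range(n))
    for i in range(n - 1, 0, -1):
        # Assumed rule: two digits form a number 0..99, reduced mod (i + 1).
        draw = 10 * next(stream) + next(stream)
        j = draw % (i + 1)
        table[i], table[j] = table[j], table[i]
    return table


print(shuffle_from_digits(16))
```

Changing any of the assumed rules (digits per draw, how a draw is reduced, direction of the loop, where in pi you start) gives a different permutation, so knowing only "a shuffle from the digits of pi" leaves a lot of room to guess.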

------
haberman
I'm curious to hear people's thoughts about git. Git is "crypto" to some
extent, Linus does not appear to have tons of crypto expertise, and it uses
SHA1 as a MAC AFAICT (which according to tptacek's earlier comment is
invalid). And yet I've never heard about attacks on its crypto.

This was interesting for me to think about because it seems like a
counterpoint to the article, in that it is a very successful project that came
about in a very "quick and dirty" way as opposed to starting with formal
protocol design.

\--

I see that Linus disclaims the idea that SHA1 is about security: "Git uses
SHA-1 in a way which has nothing at all to do with security.... It's just the
best hash you can get.... It's about the ability to trust your data. I
guarantee you, if you put your data in Git, you can trust the fact that five
years later, after it was converted from a hard disk to a DVD to whatever new
technology and you copied it, five years later you can verify that the data
that you get back out is the exact same data you put in."

But it seems like avoiding attacks like this must also be a goal:
[http://lkml.indiana.edu/hypermail/linux/kernel/0311.0/0621.h...](http://lkml.indiana.edu/hypermail/linux/kernel/0311.0/0621.html)

~~~
alinajaf
AFAIK the only "crypto" in git is GPG used to sign tags.

The content addressable data store where all the objects are kept is basically
a filesystem where every filename is the SHA1 of its contents.

If you were to generate an object that was a SHA1 collision of an existing
object and inject it via a commit (without access to filesystem, otherwise the
point is sort of moot) then git won't overwrite the original object with that
SHA1[1].
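
A minimal sketch of that idea in Python: an in-memory store keyed by the SHA1 of each object, which keeps the first object it sees for a given id. The `blob <size>\0` header matches how git hashes blobs; the `ObjectStore` class and its method names are made up for illustration and are not git's implementation.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """SHA1 over git's blob header plus the content."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


class ObjectStore:
    """Toy content-addressable store: the key is the hash of the value."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, data: bytes) -> str:
        oid = git_blob_id(data)
        # setdefault keeps an existing object; a later object that hashed to
        # the same id would not replace the original.
        self._objects.setdefault(oid, data)
        return oid

    def get(self, oid: str) -> bytes:
        data = self._objects[oid]
        # Re-hash on read so local corruption is detected.
        if git_blob_id(data) != oid:
            raise ValueError("object does not match its id")
        return data


store = ObjectStore()
oid = store.put(b"hello world\n")
print(oid, store.get(oid))
```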

Maybe there's some other mechanism in Git that you're referring to that uses
SHA1 as a MAC that I'm perhaps unaware of?

[1]: [http://stackoverflow.com/questions/9392365/how-would-git-
han...](http://stackoverflow.com/questions/9392365/how-would-git-handle-a-
sha-1-collision-on-a-blob)

~~~
haberman
Git assumes that a matching SHA1 means that the content is equal to the
original content. Is that not crypto? For example, if you sign a tag, it
appears to sign the SHA1 of the associated content.

This is definitely outside of my expertise, so I'm sure that my understanding
is incomplete. The larger questions for me are:

\- if git's SHA1 content-addressable design is not crypto, how do you
distinguish crypto from software like git that uses cryptographic primitives
for useful purposes?

\- is a project like git a safe/sane thing for a non-cryptographer to design
and implement? If so, why do all the warnings in this article not apply?

~~~
alinajaf
I'm not a cryptography expert either, but I'll give this a crack...

In as much as SHA1 is a "cryptographic hash function", Linus isn't taking
advantage of a few of its cryptographic properties in his usage of it in git.
It would for example, make no difference to the workings of git if you could
reverse-engineer the contents of an object from its SHA1. In the same way, it
doesn't matter much to the operation of git that you can generate collisions
for SHA1, though if you were running into collisions all the time, it would
make everyday usage difficult.

> if git's SHA1 content-addressable design is not crypto, how do you
> distinguish crypto from software like git that uses cryptographic primitives
> for useful purposes?

If I'm understanding you correctly, software that "uses cryptographic
primitives for useful purposes" is usually trying to guarantee one or more of
the following:

* Confidentiality - Keeping data secret

* Integrity - Making sure data hasn't been tampered with

* Authentication - Making sure that the person you think sent the data is in fact the person who sent the data.

* Non-repudiation - Ensuring that the person who sent the data can't deny that they in fact sent the data.

Git makes guarantees about none of the above in its usage of SHA1. You could
argue that it makes a guarantee of integrity in its content addressable file
store, but it doesn't. If you can modify the files in the .git/ directory, you
can screw up the repository to your heart's content. There's no way to do so
remotely, i.e. by creating and pushing a Git commit with an existing SHA1. You
typically protect Git from local tampering by only allowing access via SSH,
which has plenty of crypto in it.

When it _does_ make guarantees about authentication (signing tags) it uses GPG
more or less off the shelf. In that case the SHA1 is a reference to a commit
object, and you're saying that "I, Najaf Ali, sign off on the commit with this
SHA1". It doesn't guarantee anything about the _contents_ of that commit if
the repository has been tampered with.
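
A rough sketch of what "a reference to a commit object" implies. In git a commit names a tree and a tree names blobs, each by hash; the object layout below is simplified and hypothetical (it is not git's real format), but it shows that the id a tag names is computed over the ids beneath it.

```python
import hashlib


def object_id(kind: str, body: bytes) -> str:
    """Hash a typed object; this header layout is illustrative only."""
    return hashlib.sha1(kind.encode() + b"\x00" + body).hexdigest()


def tree_id(files: dict) -> str:
    """A tree's id covers every file name and every blob id beneath it."""
    lines = [f"{name} {object_id('blob', data)}" for name, data in sorted(files.items())]
    return object_id("tree", "\n".join(lines).encode())


def commit_id(files: dict, message: str) -> str:
    """A commit's id covers the tree id, and through it the file contents."""
    body = f"tree {tree_id(files)}\n\n{message}".encode()
    return object_id("commit", body)


original = {"README": b"hello\n", "main.c": b"int main(void) { return 0; }\n"}
tampered = {**original, "main.c": b"int main(void) { return 1; }\n"}

signed_id = commit_id(original, "initial commit")  # the value a tag would name
print("unchanged content gives same id:", signed_id == commit_id(original, "initial commit"))
print("changed content gives same id:  ", signed_id == commit_id(tampered, "initial commit"))
```

Under this model the signature itself only ever covers the id string. Whether that amounts to a statement about the contents depends on how hard it is to produce different contents with the same id, and on whether the verifier re-hashes what it received.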

> is a project like git a safe/sane thing for a non-cryptographer to design
> and implement? If so, why do all the warnings in this article not apply?

See above on git not making the guarantees that software that "uses
cryptographic primitives for useful purposes" tends to make. Since Git makes
none of those guarantees, it's (I think) a safe/sane thing for a non-
cryptographer to design and implement. In practice, what Linus has done has
let other off-the-shelf crypto (SSH and GPG) make the required guarantees for
him.

~~~
haberman
Thanks for this, you answered my questions thoroughly. I'm not entirely
convinced by this though:

> [A signed commit] doesn't guarantee anything about the contents of that
> commit if the repository has been tampered with.

I think most people would intuitively expect the signed commit to guarantee
the contents of the tree being signed. The idea that you could "git pull" a
repo from a compromised machine, verify the signed commit, but not actually
have a guarantee about the tree matching the one that was signed would run
counter to most people's expectations, I suspect.

In other words, this to me seems like a "technically, we don't guarantee"
statement about something that is _de facto_ thought to be guaranteed.

~~~
alinajaf
I did a bit of quick reading on this and at first glance my description of how
git tagging works appears to be on point, i.e. all it guarantees is that a
particular user asserts that tag X points to commit with SHA1 Y.

I'm not sure that it says anywhere in the documentation that it guarantees
anything more than that, but I agree that a significant proportion of
developers would intuitively expect that the entire content of the tree to be
signed rather than just the SHA1.

~~~
haberman
> I'm not sure that it says anywhere in the documentation that it guarantees
> anything more than that, but I agree that a significant proportion of
> developers would intuitively expect that the entire content of the tree to
> be signed rather than just the SHA1.

Further evidence that they do assume that:
[https://news.ycombinator.com/item?id=7003900](https://news.ycombinator.com/item?id=7003900)

~~~
alinajaf
I agree that they do assume that, but fail to see what connection that has to
the actual workings of git. AFAIK the behaviour of software doesn't change in
accordance with how developers _think_ it works.

------
betterunix
If you want a more "theoretical" look at the theory, Introduction to Modern
Cryptography by Jon Katz and Yehuda Lindell is a great book. Also good (but my
copy had many printing errors) is Foundations of Cryptography by Oded
Goldreich.

~~~
ReidZB
Yes! This is exactly what I was going to post. The article's recommendation to
read _Applied Cryptography_ and the _HAC_ to "learn the theoretical
background" left me dejected, since neither is particularly great in the
area of theoretical underpinnings. (The _HAC_ is a reference book, for
Chrissake!) Both are great books in their own right, but they're _not_ what
I'd recommend for the theoretical background.

Katz and Lindell's _Introduction_ , on the other hand, is absolutely fantastic
for the task (this was its design goal...). It introduces theoretical
cryptography from the bottom-up and uses it to motivate the various primitives
and constructions from the applied realm. It's really a great mix. The book
has become my go-to recommendation for those who are serious about
cryptography but have had relatively little exposure to it. It also doesn't
assume the reader is an expert in all things computer science, which is nice.

Goldreich's _Foundations of Cryptography_ is more of a treatise on theoretical
cryptography... it goes much deeper and starts out assuming the reader is
pretty familiar with concepts from theoretical computer science and
probability theory. The optional sections of Katz and Lindell's work end up
being the opening chapters of the first volume --- and they're not optional.
Block ciphers aren't even treated until the second book. It's a seriously
theoretical series, which makes it great in its own right, but I would
postpone reading it until well after Katz and Lindell's book. (And a book on
computational complexity, at minimum, for those not familiar with it.)

------
rnicholson
>Both Applied Cryptography and the Handbook of Applied Cryptography are great
resources, although they're a little dated now. ... Step one is to read
Cryptography Engineering. This is not optional. Read it. It is a fantastic
book that details how to use cryptographic primitives.

It seems kinda superfluous to mention Applied Crypto when the real reco is to
read Cryptography Engineering. I'd almost wonder if it would be better to
direct people away from Applied Crypto...

Personally, I found Applied Cryptography to be so-so at best. Practical
Cryptography was a breath of fresh air in comparison.

~~~
helper
Yes. Recommending Applied Cryptography is usually a warning sign that the
person doesn't know what they are talking about. In this case the rest of the
advice is reasonably sound for an engineer that wants to start learning the
fundamentals of modern cryptography.

------
theboss
TL;DR - If you want to do crypto then learn crypto.

If you want to learn crypto and do crypto then certainly start with this.
Then, when doing crypto...practice. Build it and reach out and ask for help
and talk to people who know what they are doing and learn from them. Ask them
about problems you encountered and ask them about the best ways to solve
them...otherwise you will continue to make the same mistakes.

------
derefr
> And don't make your cryptography project sound like snake oil. Saying
> military grade encryption or N-bits of security makes you sound like you
> don't know what you're talking about.

Interesting to contrast this with patio11's statement from just a few days ago
([https://training.kalzumeus.com/newsletters/archive/sco_remin...](https://training.kalzumeus.com/newsletters/archive/sco_reminder)):

> People are better at remembering images than they are remembering claims or
> facts. "256-bit SSL encryption" is a true fact about your software product,
> but for most customers it goes in one ear and out the other. "Bank-grade
> encryption" is an image -- people can envision the vault -- and is vastly
> more likely to be recalled favorably when someone is worried about security.

~~~
alinajaf
For Joe Public, you say "Bank Level Security", for hackers you drill down into
the details. There's no reason why your marketing material can't give the
visual security imagery that people want and then walk through the exact
countermeasures you're taking on your security page.

N.B. Patrick also says to use your powers for good rather than for evil :)

------
Nursie
Ok so I do want to crypto and (to the best of my ability) I already do. I
follow best practices, read about the subject matter, did coursera's crypto 1
(and where the hell is pt2? 1 was awesome!). I use established algorithms and
I use well-audited implementations etc. etc. where available.

I have a question about MACs. We're using HMAC based on SHA256 with 32-byte
keys on our new system, but our security architect only wants us to send and
verify 4 or 8 bytes of the MAC output. Am I wrong to be suspicious of this? It
massively reduces the number of bits an attacker has to guess or calculate,
though at 8 bytes that's 64 bits so not exactly a quick brute-force...

~~~
arghnoname
It definitely does reduce the strength of the MAC, but it is okay if your
security requirements require it. Keep in mind that some generic birthday
attacks already reduce HMAC strength to n/2 bits (IIRC), so SHA-256 has you
down to 128 bits of security (with an online attack though).

Truncation is mentioned in RFC 2104. I quote:

5\. Truncated output

    
    
       A well-known practice with message authentication codes is to
       truncate the output of the MAC and output only part of the bits
       (e.g., [MM, ANSI]).  Preneel and van Oorschot [PV] show some
       analytical advantages of truncating the output of hash-based MAC
       functions. The results in this area are not absolute as for the
       overall security advantages of truncation. It has advantages (less
       information on the hash result available to an attacker) and
       disadvantages (less bits to predict for the attacker).
       Applications of HMAC can choose to truncate the output of HMAC by
       outputting the t leftmost bits of the HMAC computation for some
       parameter t (namely, the computation is carried in the normal way
       as defined in section 2 above but the end result is truncated to t
       bits). We recommend that the output length t be not less than half
       the length of the hash output (to match the birthday attack bound)
       and not less than 80 bits (a suitable lower bound on the number of
       bits that need to be predicted by an attacker).  We propose
       denoting a realization of HMAC that uses a hash function H with t
       bits of output as HMAC-H-t. For example, HMAC-SHA1-80 denotes HMAC
       computed using the SHA-1 function and with the output truncated to
       80 bits.  (If the parameter t is not specified, e.g. HMAC-MD5, then
       it is assumed that all the bits of the hash are output.)
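
A minimal Python sketch of the truncation described in that section, using the standard `hmac` and `hashlib` modules. The key, the message, and the function names are made up for illustration; the tag length is treated as a fixed protocol parameter rather than something read from the received tag.

```python
import hashlib
import hmac


def truncated_hmac(key: bytes, message: bytes, tag_len: int) -> bytes:
    """HMAC-SHA256 truncated to its leftmost `tag_len` bytes."""
    if not 1 <= tag_len <= hashlib.sha256().digest_size:
        raise ValueError("tag_len must be between 1 and 32 bytes")
    return hmac.new(key, message, hashlib.sha256).digest()[:tag_len]


def verify(key: bytes, message: bytes, tag: bytes, tag_len: int) -> bool:
    """Check a tag against a length fixed by the protocol, not by the sender."""
    expected = truncated_hmac(key, message, tag_len)
    # compare_digest avoids leaking the position of the first mismatch via timing.
    return hmac.compare_digest(expected, tag)


key = bytes(range(32))            # fixed demo key; use a random secret in practice
message = b"amount=100;to=alice"  # made-up payload

for tag_len in (4, 8, 16, 32):
    tag = truncated_hmac(key, message, tag_len)
    bits = 8 * tag_len
    print(f"{tag_len:2d} bytes = {bits:3d} bits, tag {tag.hex()}, "
          f"blind guess succeeds with probability 2^-{bits}")
```

By the quoted recommendation (at least half the hash output and at least 80 bits), HMAC-SHA256 would want a tag of 16 bytes or more, so both the 4-byte and 8-byte options fall short of it.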

~~~
Nursie
Thank you, that's extremely useful. Birthday attacks I had thought of, did not
know sha256 was effectively 128 bits. Will dig into the rfc and other stuff
and see if I can make a case for longer (maybe 16 byte) field.

I know some of the older MAC techniques (ANSI X9.19) turn out to actually aid
key recovery if you use shorter MACs, which is odd...

------
jiggy2011
Surely the correct answer is "just use keyczar"? At least 99% of the time.

~~~
tptacek
Most practitioners would recommend NaCl now.

~~~
jiggy2011
NaCl or libsodium?

~~~
hendzen
The primitives are the same, but libsodium is more portable (but slower).

------
berrypicker
In college cryptography was my main interest, but it was mostly theoretical
(math) and little programming, which meant I was in fact useless when it came
to practice because I had no experience in implementation and (I found) there
are so many unknowns that one of the most important things is experience in
implementing stuff in/on a specific language/platform.

I have signed up to the Coursera course and hope to brush up on basic topics
and start doing more advanced crypto.

------
cconger
I love this article. It takes a proactive, how-to-proceed attitude while
laying out the classic pitfalls that exist. This is the tone I wish
to have at all times instead of the cynical one that I undoubtedly adopt.

------
sidcool
The author seems quite pissed at the state of crypto in the world, and he's
definitely trying to help. I like the general language of the post. Good work
and keep it up!

------
lazyjones
This is a condescending blog post by someone with an (apparently) much weaker
crypto background than the telegram people he is ranting about. Of course it's
much easier to post something like that than it is to actually get a rock-
solid implementation at the first attempt - and we can safely assume that the
telegram people do not need such advice.

Would not read again.

------
ztnewman
>Don't listen to idiots who tell you otherwise.

Real mature.

~~~
tptacek
Did you have an actual opinion about what he was saying in the essay, or do
you just want to talk about how he chose to write it?

