Hacker News new | past | comments | ask | show | jobs | submit login
How Not to Learn Cryptography (2014) (brown.edu)
199 points by lucas-piske on June 1, 2020 | hide | past | favorite | 51 comments



I've been working through the Cryptopals Crypto challenges (https://cryptopals.com) over the last month-and-a-half (almost done with Set 6), and they've been extremely educational.

Attacks that I thought were just "theoretical" turned out to be very practical (sometimes even quite simple.) I've always known that one shouldn't roll their own crypto because there's so many ways to shoot yourself -- but holy hell, executing some of these attacks really drive the point through.

(Also, I've learned more number theory in the last few weeks than in my whole life!)

If anyone's interested, I've been solving them in Go (which turned out to be surprisingly convenient for many reasons), and my solutions (so far) are here: https://github.com/0xfe/cryptopals.


Cryptopals strikes me as a very good way of scaring people from not only inventing, but also implementing their own crypto. It seems it's primary effect is to make people confident enough to repeat the "don't roll your own crypto" mantra to anyone who would listen.

Some people however don't really have a choice. "Just Libsodium" doesn't work on anything smaller than a Raspberry-Pi, or pretty much any embedded system out there. There are alternatives out there (shameless plug: https://monocypher.org), but sometimes your only choice is to code and optimise it yourself.

Sometimes, your only reasonable choice is to implement your own crypto. And I can tell from experience, a few weeks of full time learning is enough to not shoot yourself in the foot. The trick is to learn the right things (not everybody can spend a few weeks with Dan Boneh), but if you limit yourself to simple primitives like Chacha20 and Curve25519, it's not that hard. (Fun fact: I did not spend a few weeks with Dan Boneh, and I did shoot myself in the foot once.)

Because let's be honest: you don't need to know all the attacks to protect yourself from them. What you need to know is the relevant classes of attacks, and how to void them. For instance, all timing attacks are stopped if your code runs in constant time (which in practice mostly means without secret-dependent branches and without secret-dependent indices).


Thanks, I agree with almost all of this. I didn't mean to imply that Cryptopals is how you become a cryptographer, just that it's a fantastic way to get down and dirty with cryptography, and learn a ton.

These challenges are not easy, and you need some programming sophistication to get through them, since most of them require you to actually implement well-known primitives and algorithms (like Diffie-Hellman, RSA, DSA, PRNGs, etc.)

And to do that correctly, you end up studying a _lot_ of side material.

You do come out of the other end learning about various classes of attacks, but not necessarily how to avoid them -- that depends entirely upon your intellectual curiosity. :-)


What's rough about implementing your own crypto is that sometimes you screw it up, as you well know. When you screw up an attack, nobody suffers. Not so for novel implementations of primitives that you provide to others.


Looking at Cryptopals in particular it's not ideal that you need to get through dozens of other exercises before you finally email off to the authors and start to run into problems concerning today's fan favourites like Elliptic curve DH.

I'm at home even more than usual now (for obvious reasons) but even then I suspect I will get distracted before I get as far as say, Challenge 42 which is - though still relevant and I get that it makes sense to drum that in - ancient history by today's standards, let alone Set 8.

As a result Cryptopals feels like it recapitulates history from the point where the authors got into the industry and so may unintentionally mislead. I remember your surprise when it turned out Microsoft hadn't done the necessary checks for a curve based signature algorithm and so they'd actually been shipping code that would accept bogus signatures. My instinct is that Cryptopals challenges would be more effective (but might make some people involved less comfortable) if they rearranged some of the 21st century attacks on those "fan favourite" algorithms into earlier sets instead of accumulating them in Set 8.

I dunno maybe this is like complaining that my school maths textbook (first written in the 1940s I swear I'm not that old) assumes you'll use log tables rather than a calculator. But the Cryptopals challenge site clearly doesn't seem to think it's a historic artefact, so it shouldn't act like one.


Set 8 questions can be found here: https://toadstyle.org/cryptopals/


Thank you, it's unlikely I'll get that far (yesterday I wanted to do six things, it is now 0240 and I am starting the first one...) but even if I don't end up using it somebody else might.


Sets 1-6 of Cryptopals were a blog post that I wrote in 2010. At the time, I was enmeshed in a series of Twitter arguments with a security industry personality who shall remain nameless, and was concerned that if I simply published the blog post, I'd be arming that person with a large series of buzzwords that could be deployed in Twitter slapfights without any real comprehension of how this stuff worked.

So, instead, I chunked the post out into sets of 8 challenges which we delivered to all comers via email, on the condition that you had to complete the previous set and return your code to us before receiving the next set. Obviously, my (at the time) Twitter nemesis wasn't going to put the time in to write the code to break repeating-key XOR, let alone BB'98.

The challenges were successful far beyond what I could have predicted. Maciej Cegłowski wrote a blog post about them† that hit the front page here; we got flooded with requests, which we kept up with with a Twitter scoreboard, and made donations to charity when people made it through all 6 sets. There are working professional crypto engineers who got their start with the challenges --- not our intent at all, but I'll take it. I'm most proud of the fact that we've amassed what I believe to be the world's largest collection of Bleichenbacher padding oracle exploits, in every conceivable programming language, including several people made up just to do them.

All this is to say that getting people to whatever you think is "relevant" cryptography by today's standards was never the point. In fact: that's pretty close to the opposite of the point; I wrote the blog post as internal training for our team at Matasano, who needed to know how to break cryptography as it exists in the real world, not just the cryptography people think is the best, most relevant in use today. And, suffice it to say, no matter how distractable you might be, a very clear understanding of how the BB'98 attack actually defeats RSA is important for a working crypto engineer; probably more important than a lot of modern stuff. Bleichenbacher aside, though, really what you're seeing is a snapshot of au courant crypto attacks from 2010. Away from the libsodium world we live in now, CBC and unauthenticated encryption were quite common, and you still had to convince developers to fix them.

No argument from me, though; Set 8 is the best of them. I had nothing to do with it; that was all Sean Devlin. He released them one at a time on Twitter as a fundraiser for the Great Slate back in the 2018 election cycle. Nobody's making you send your homework back to me to see Sean's modern crypto set; you can just skip to it and do those challenges; they're online already, just not on that site (we don't work at NCC anymore, haven't for years, and they own the website).

https://blog.pinboard.in/2013/04/the_matasano_crypto_challen...


There's another side to this coin: while users can be hurt by a buggy implementation, they can also be hurt by the absence of an implementation: they now have to either come up with their own, chose something less than ideal, or give up entirely.

For instance, Monocypher is 20 times faster than TweetNaCl at signature & signature verification. On a micro controller where you can't use Libsodium (say Cortex M0), that speed difference is significant. Erasing Monocypher from the face of the Internet would hurt users who could have used the extra speed.


What good is a signature verification library that's 20 times faster than TweetNacl --- which few use --- if its scalarmult is broken and it incorrectly verifies forged signatures? If correctness isn't a requirement, I can make signature verification take ε cycles.


Broken implementation are worth nothing, duh. Of course correctness is a requirement. That's is why I took a couple precautions since that little disaster 2 years ago: I stopped using maths I didn't perfectly understand. I imported the Wycheproof tests (which by the way revealed no additional bugs, contrary to what lvh jokingly suggested on Twitter at the time). And a number of other things: https://monocypher.org/quality-assurance/test-suite

Don't worry, I've learned my lessons, and I do take correctness seriously.

Now maybe Not many people use TweetNaCl on embedded chips, but the alternatives often aren't much shinier: https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8725488 I've compiled the numbers on my benchmark page, https://monocypher.org/speed but the gist of it is: Monocypher is often much faster than everything else. The only thing that beats it is hand written assembly!


Your comments on this thread read like a marketing pamphlet for your crypto codebase, rather than an actual attempt to engage with the discussion.

As a marketing message, hearing that you believe you “perfectly understand” the math involved doesn’t fill me with confidence.


I won't apologise for mentioning Monocypher here and there. I work on it on my free time, I pay for the servers, I make no money from it, and I genuinely believe it fills a legitimate niche.

One way I make sure I perfectly understand the maths I'm messing with, is by not messing with the maths I don't understand. It's a simple rule, which I have broken once. Big mistake[1], won't do it again. Also, I've been at it for over 3 years, I've learned a few things[2,3].

Finally, I do believe I was engaging with the conversation. My first point was that as great as Cryptopals might be, it's not the only way to learn crypto. On the contrary, it would seem the takeway is that nope, you definitely should not roll your own crypto. I get the point about safety, but this feels somewhat counter productive. That and the stark contrast with other domain. I don't hear "don't roll your own compiler", "don't roll your own file format"… even though they're often as safety critical as cryptography itself.

Thomas' reaction is true, but it's not useful. Yet another way of saying "don't roll your own crypto", because well, it can hurt your users if you screw up. No kidding. But he seems to ignore the opportunity cost, so I pointed that out.

Then Thomas subtly suggested that my work is worth nothing. Not what I'd call "engaging with the conversation". That rubbed me the wrong way.

A bit of context may be warranted: the scalarmult error he was referring to I quickly patched and publicly disclosed back in 2018[1]. The reactions[4,5] it elicited where a bit more hostile than is usually tolerated here. It's not just the mockery (I did screw up), it's the willingness to asses something they hardly even looked at.

[1] https://monocypher.org/quality-assurance/disclosures

[2] http://loup-vaillant.fr/tutorials/fast-scalarmult

[3] http://loup-vaillant.fr/tutorials/cofactor

[4] https://twitter.com/tqbf/status/1011219783755468801

[5] https://twitter.com/lvh/status/1011359079376318464


> "Just Libsodium" doesn't work on anything smaller than a Raspberry-Pi, or pretty much any embedded system out there. There are alternatives out there

And the same author provides a solution for those systems as well with libhydrogen.

Though that doesn't look so super hot anymore given the recent advances on Gimli[1,2,3], but it'll likely still hold up in practice.

> but sometimes your only choice is to code and optimise it yourself

That is a critical shortcoming of the ecosystem. If you reach that point, you should be hiring a cryptographer/experienced implementer. Your follow-up question might be “Where do the cryptographers come from, then?” and the answer to that is: “PhD programs at universities, ideally”. Curiously, however, many (most?) cryptography libraries that are used in practice appear to be written by people with barely any academic background. We should be working to rectify that one way or another (send the implementers to university or pull more people from theory into implementation practice).

> you don't need to know all the attacks to protect yourself from them. What you need to know is the relevant classes of attacks, and how to void them

Some attacks, however, can be quite surprising or virtually impossible to mitigate without deep knowledge of the specific problem domain. Are we sure how to mitigate software implementations of EC scalar multiplication against differential power analysis yet?

And that's before you get to protocol design, where there are new, mysterious ways to shoot yourself in the foot (use TLS, use TLS, use Noise).

[1] https://eprint.iacr.org/2020/561.pdf

[2] https://eprint.iacr.org/2020/591.pdf

[3] https://eurocrypt.iacr.org/2020/rump/ec2020rump-paper23-slid...


Its oddly worse than you say. Most crypto code I have seen is of below median code quality. It seems to be written by people with neither a strong academic crypto background nor a solid practical coding background.

Compare e.g. OpenSSL, the most widely used library, with SQLite, the most widely used embedded database. OpenSSL is a mess under the hood with a hideous API on the surface, while SQLite is beautiful code with an API that is a joy to use.

Newer trendier stuff like libsodium are better but they still feel unpolished to me.


> Some attacks, however, can be quite surprising or virtually impossible to mitigate without deep knowledge of the specific problem domain. Are we sure how to mitigate software implementations of EC scalar multiplication against differential power analysis yet?

This particular attack is not really about ECC, and more about the processor & program you use. One "easy" way to immunize yourself against power analysis is to have your implementation be "constant power". That is, no information must flow from secrets to power.

The first step is coming up with hardware with constant-power operations. Then the cryptographic engineer must make sure the power-variable instruction never have secret inputs. That may require compiler support, writing assembly code by hand, or formal analysis.

That can be wickedly hard indeed, but it also has little to do with ECC specifically. The deep knowledge required there is about hardware, compilation techniques, formal methods… The only ECC specific part is identifying the secrets, and that's comparatively trivial.


I want to point out just so people understand that NO popular platforms meet these criteria for any algorithm whatsoever. Any CPU with any form of branch prediction or speculative execution can't be made truly totally secure against these attacks, and that is all high performance CPUs.

I suspect these sorts of issues are why black box hardware units like NSA type I encryptors are used for the most sensitive communication.

What you can do with popular platforms is try to mitigate the most egregious and easily exploitable variants by (a) not branching on secret data and (b) avoiding lookup tables or other memory access patterns that depend on secrets. The latter much harder to achieve and is why AES sucks on chips without hardware AES support. (But on chips with that support, it becomes better than alternatives!)

There are other competing practical concerns though like power use, speed, and standards compliance.

This stuff is hard.

Edit: these sorts of attacks do matter and are worth trying to mitigate if your code will ever run in the cloud. In that case it will run on multi tenant machines where there may be malicious neighbors who can analyze timing on shared hardware.


Cryptopals strikes me as a very good way of scaring people from not only inventing, but also implementing their own crypto.

Now that's very useful.

The real way to learn cryptography is to go through the National Cryptologic School at the National Security Agency in Maryland. They used to start everyone out with paper and pencil cryptanalysis. They classically emphasized that most cyphers are broken because someone made an error in key management. (See Venona). That's probably still the case.

"No new cypher is worth looking at unless it comes from someone who has already broken a very hard one" - Friedman.


> Cryptopals strikes me as a very good way of scaring people from not only inventing, but also implementing their own crypto

Paradox alert: the person who knows enough about crypto to fear implementing it is the person I want implementing the crypto I am using.

So if the effect of such education is to discourage you, that means the crypto is left to the clueless to implement.


Could you give a couple examples of attacks that you thought was just theoretical but turned out to be very practical? Very curious about this topic.


Sure:

- The timing attacks were the most surprising to me -- particularly because how easy they were to exploit once you collected enough statistics.

- Finding the seed / predicting upcoming random numbers from popular PRNGs by reconstructing internal state. (Interesting side note: I had to implement the Mersenne Twister from wikipedia, and my code literally looks like line noise: https://github.com/0xfe/cryptopals/blob/master/prng.go. Reversing the tempering code made me lose a lot of hair.)

- Modifying encrypted data reliably on AES-CBC and AES-CTR by flipping bits!

- If you use the same values for IV and key, you can recover the key with a little math.

- Determining block/key sizes from ciphertext using statstical analysis worked surprisingly well.

- Number-theoretic algorithms like DH, RSA, DSA, are sensitive to their parameters (like choice of primes, generators, etc.)

- The padding oracle blew my mind: https://cryptopals.com/sets/3/challenges/17


They are, however, a little bit out of date in 2020, and the promised follow-ups never materialized for about 6 years now. Some things never changed from 2014 though, so it's a good start


Can you be more specific on what stuff is no longer relevant? I started the challenges and put them on hold but plan to return at some point.


It’s not that specific thing is not relevant or even wrong, just that it misses new attacks


Thanks for sharing. I will definitely check this out.


I don't think you need even an undergraduate education in mathematics to get into cryptography; for many people it's the other way around. Cryptography can be a great motivator to learn the underlying math. Concepts that may have seemed too abstract to be useful suddenly become practical tools that help you build something that solves a real problem.

I love my colleagues who have come from cryptography research programs; they are awesome! But the majority of my colleagues working on cryptography in practice come from software and security engineering backgrounds. You might think that is just a dividing line between folks who design and analyze cryptography, and those who implement it, but I don't think so. I've seen folks from both backgrounds be able to work on cryptography design and implementation.


IANAC but would the lattice stuff or the other post-quantum stuff require it? "Not having a solid math education" might be a condition with an expiration date if so.


I'm saying that working on Cryptography can also be a solid math education. You can start with very basic high-school math; primes and finite fields, extend to algebra, formal logic, information theory, codes, probability, elliptic curves, isogenies, lattices. They can build on one another, but by actually building some things. It's a great way to learn.


Could you offer some tips/advice/resource for a SWE looking to get involved in applied cryptography?

It seems like a very dense subject, with different areas being of different levels of importance depending on how deeply you’re involved (eg, researching vs writing application code)


One of my favorites is to build a DRBG; download one of the NIST DRBG specifications and try to implement it and have your implementation's output match the test vectors.

You'll be building cryptography without having to design or build the underlying foundational layers. It's actually a lot harder to compose these things then it first seem. Then as you go, try to understand where every single limit in the DRBG specifications come from. Those will teach you a lot about how security factors are worked out.

Plus it's hard; and it gets you into that phase of making big dark numbers match other big dark numbers. It's a very unforgiving form of programming.


This seems to boil down to:

> If you can, just get a Ph.D. at a place with a good crypto group (remember that Ph.D.'s in computer science are effectively free)

Which I'm not sure is great advice except for someone who wants a full time career in cryptography. And if you want a career in cryptography it's fairly obvious that the most well trodden root is via a PhD (which is definitely not free when you compare it to how much you could be earning in industry with a compsci or math degree)


Learning Cryptography in any way is never going to be free, when you could have spent that time building a SaaS or doing LeetCode exercises to land a Google job. In fact most people's life choices are very expensive through this lens.


Can anyone elaborate on how CS Ph.D.'s are 'free'? Does this apply to UK universities?

I'm aware of scholarships but I'm not from an under-represented background nor am I a genius. I suppose the student loan would cover the costs but I don't consider that free.


I've dropped out of a PhD programme in the UK (many years ago) so perhaps I can help.

It is usual for PhD students to have tuition and a small stipend for their living expenses paid for by government. The government knows perfectly well it needs some supply of researchers, the old ones eventually die so you need to train new ones and a PhD course is how you do that.

A particular university Computer Science department might have say, six slots this year for PhD students. It needs not only the money (which you could if you're wealthy just pay for yourself) but also experienced staff willing to supervise these students which may be a more limiting factor. If it has eight capable applicants versus six slots then yes, you might get refused, but it's rare for there to be a huge imbalance between the number of applicants and available slots such that you'd need to be a "genius".

If you have a good (say first or upper second honours) undergraduate degree in the same or closely related area, and you can find a PhD topic and a doctoral supervisor, in Computer Science they can probably make the money happen.

You will not be wealthy (the stipend is small), but you should be too busy to notice and you're hanging out with other students who haven't got any money anyway.


But it beats on the fun part if you are into it.

Cryptography is one of those areas that I believe a PHD actually helps. And going through it automatically proves that you have what is needed for future jobs.


tl;dr:

To learn about some cryptography fundamentals:

* Read Katz & Lindell's Introduction to Modern Cryptography

* If you're missing math background, read Timothy Gowers' blog

* Supplement: Take Jonathan Katz' and Dan Boneh's MOOCS (links in post)

To get to the point where you can "design/analyze crypto protocols" (and maybe know what you're doing):

* Read Oded Goldreich's Foundations of Cryptography (Volumes 1&2)

* Try to have quasi-intelligent conversations with real cryptographers (advice for this in blog post)

* Get a PhD somewhere with a good crypto program

edit: formatting


"Applied Cryptography" is the best-written book on this topic that I know of. Author is a brilliant communicator. Esp. the section on cryptographic protocols should be required reading for any computer scientist. It's not overly rigorous or mathematical, and has a lot of informality and humor, so it's a fairly light read. You don't need a lot of mathematical maturity to read it (and reading things like this helps develop mathematical maturity).

Unfortunately, the 2nd edition adds "50% more words, 7 more chapters, and over 1600 new references." I thought the first edition was better in length. It was novel-length, and reads as well as a novel. Going from long-ish novel to short-ish trilogy makes this somewhat less readable. But c'est la vie.


well that page is not even on https


Glad to see that I'm not the only one triggered by the irony. :p


I was really amused by this. Also if you try to visit over HTTPS, it serves up an expired certificate.


I can't figure out who wrote this article, and the site doesn't really make it clear if it's one person's blog or not. Anyone know?

Edit: someone else answered already https://news.ycombinator.com/item?id=23386815


Who is this by? There doesn't seem to be an author listed at either the beginning or the end.


Hi kragen. It was written by one of the Lab's directors. Here is a link to his page: http://cs.brown.edu/~seny


Thank you!


> (1) developing mathematical maturity;

This has stopped me from learning so many things at this point it's not even funny.


Math maturity is probably unavoidable for some of the topics out there. But there are so many more in CS that don't require a ton of it.


Best courses for mathematical maturity besides the standard CS ones? I'm thinking Algebra, or maybe Real Analysis.


Yeah both are good. Basically anything that forces you to prove a lot of theorems that look trivial (but are not easy to prove) is good because they build up a system from small theorems. Back in school we use Rudin for Real Analysis and it's a good textbook. But I'd recommend taking a class if you are still in university because Math is a bit tricky for self study IMO. If you already graduated maybe follow some MOOC.

I'd actually recommend Number theory if you want more fun or if you are interested in Cryptography which is related to CS. Elementary Number Theory throws you tons of questions that even a toddler can understand but you might hit bam head on walls for nights to prove them.


Thanks, I'm doing a Masters now and I was planning on taking Number Theory and Abstract Algebra next term. I took mathematical cryptography this term and it got me interested in more related math.


That's pretty much enough for anyone who wants to have Math Maturity. Good luck!


Speaking the language of reality a bit better pays off.

Try to frame learning more useful bits of math like that.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: