Everything you need to know about cryptography in 1 hour (bsdcan.org)
254 points by cperciva on May 14, 2010 | 130 comments

I think I can sum up the fundamental difference between my take on crypto (run away!) and Colin's ("use OAEP padding instead of PKCS!").

When Colin thinks of crypto he thinks of things like Tarsnap, which was built and designed to embrace cryptography; cryptography is part of the reason it exists. It is not an afterthought.

When I think of crypto, it's about all the places crypto pops up in normal non-secure CRUD-webapp situations. Like the guy who uses crypto to effect single-signon without sharing databases between apps. Or the team that encrypts URLs instead of implementing access control. Or the team that uses AES to "secure" a viewstate cookie.

If your goal is to produce a system like Tarsnap, which from day one is going to receive attention from security people because it is a security tool, I have less of a problem with Colin's recommendations.

But if you're trying to build client-side database-free single signon (ab)using crypto as a tool, this slide deck is terrible. Follow every piece of advice in it and you will very likely still produce a system that is easier to break than if you hadn't used crypto.

On Thursday afternoon I gave my talk 'Everything you need to know about cryptography in 1 hour' at BSDCan'10. Two HNers (as far as I know -- there might be more?) came to hear me speak, but several other people here have said that they would like to see the talk but couldn't attend, so here are my talk slides.

I understand that a recording will go online somewhere at some point, but I'm not entirely sure of the details about that.

Since I've only the slides, I have a question about the following point:

> DON’T: Put FreeBSD-8.0-RELEASE-amd64-disc1.iso and CHECKSUM.SHA256 onto the same FTP server and think that you’ve done something useful.

I take it the FTP got compromised and people simply regenerated the checksum for a modified image?

Wouldn't signing the checksum file solve the problem? Using different FTP servers for distributing the image and the checksums would probably make mirroring difficult.

> I take it the FTP got compromised and people simply regenerated the checksum for a modified image?

I'm not aware of any problems with FreeBSD FTP mirrors being compromised recently. I wanted an example of data-adjacent-to-hash and most of the audience was FreeBSD people, so I figured that I'd go with an example close to home.

> Wouldn't signing the checksum file solve the problem?

Yes. Or just relying on the FreeBSD release announcements, which contain SHA256 hashes and are signed.

Thanks Colin. Very informative.

Why did you choose to publish your slides in PDF when PDF documents are a main tool for security attacks?

Update: If PDFs are a major source of security attacks, and author cares about security, and author publishes document in PDF form, then why would you downvote this question?

Because it's a completely nonsensical question.

If balaclavas are used in bank robberies, and you care about the safety of your money, why would you wear a balaclava on a mountain climbing expedition?

PDFs are not a tool for security attacks because they are PDFs, they are a tool for security attacks because of vulnerabilities in Adobe Reader. Any given PDF is not a danger, only PDFs with exploits are dangerous.

And including a buffer overflow exploit, rootkit and phone-home code isn't something you're going to do accidentally while publishing your talk slides, is it?

I do not have the power to downvote, and I hate when people downvote because of disagreement but do not explain their stance -- I learn nothing from it; I can give you my perspective on your comment. (It would be nice to have a personal downvote message feature or something; the anonymity helps us avoid embarrassment, I guess.)

First, your point is irrelevant. Whatever the medium, this discussion is about cryptography. There is not a real case of hypocrisy here. Cryptography is certainly related to security, but as far as I can tell, that is not the thrust of this discussion.

Second, while the technology can be abused, as you point out, that is a far cry from this particular author abusing it and using it as a security attack. The author's PDF is fine.

I am not agreeing or disagreeing with you, btw.

In other words, I can see that this comment was downvoted because it did not meaningfully add to the discussion. I usually see downvotes on posts that include offtopic personal axe grinding. I may receive the same, myself, for this post, but I just wanted to help you avoid it in the future.

The repeated advice in this talk not to use block cipher modes that provide both encryption and authentication (which also follows from the advice not to use CBC-MAC) is contrary to a lot of professional best practice advice, very much including NIST, and represents Colin's opinion more than it does a consensus on the issue.

It is easier to get something wrong when it has more moving parts. You are less likely to screw up EAX-mode AES --- presuming your library "just does it" for you --- than you are to screw up CTR-mode AES and HMAC-SHA256.

The rest of the talk looks great, although I flinched when I saw the CTR recommendation, because CTR is also really easy to screw up.

Obviously, the comments about SSL are crazy. But you expect that going in to a Colin talk.

> CTR is also really easy to screw up

That recommendation wasn't about avoiding 'easy to screw up'. That one was about avoiding 'almost impossible to avoid having your key stolen via a side channel attack'.

> Obviously, the comments about SSL are crazy.

Do you disagree with either of the following two pieces of advice?

  DO: Use SSL to secure your website, email, and
  other public standard Internet-facing servers.
  DO: Think very carefully about which certificate
  authorities you want to trust.

I disagree with the assertion that SSL is a bad protocol. It's an ugly protocol, but that's because things that are elegant are rarely secure.

And, of course, I disagree with your reasoning about authenticated encryption. You're saying, "you're exposing your block cipher to direct attack by overloading it to both encryption and authentication". But it's probably exposed to side-channel attacks either way. Meanwhile, the implementation flaws that people end up with in naive HMAC systems break far more systems.

I suspect this is another example of the impedance mismatch between your academic approach to this and my "having to slog through people's terrible crypto on a monthly basis" approach.

> It's an ugly protocol, but that's because things that are elegant are rarely secure.

Funny, my experience has been exactly the opposite.

> But [your block cipher is] probably exposed to side-channel attacks either way.

Exposed, yes. Exposed to attackers who don't hold the MAC key, no. Exposed to chosen-ciphertext side channel attacks, no. These distinctions matter.

> having to slog through people's terrible crypto on a monthly basis

I do this too. But I try to educate people so that they write slightly less crypto.

Are you consulting now?


Good to hear it. I think you'll come around to my way of thinking soon enough. =)

SSL is ugly because of backwards compatibility with SSLv2. Version 2 and previous versions did not have a secure way to negotiate version, so Paul stuck the version in the RSA padding field. It was the best option under the circumstances, but definitely not preferable.

The other things that make it ugly are the huge number of cipher suites and the reliance on certificates (and thus usually centralized CAs). The cipher suite growth came about because of export controls and then Internet-standards groups not seeing a problem with adding vanity modes. The reliance on CAs can be worked around by using your own cert store or using TLS-SRP for authentication, which is sorely underused.

As somebody who's not already a crypto expert, I found those slides next to worthless by themselves. I suspect that the speaker put a lot of extra explanation into them during the talk to explain why TLA1 is better than TLA2, but it doesn't come across at all in the text.

As an example, I'd read an article about timing attacks, so I knew what the author was getting at in the slide about rewriting that loop. If I saw the slide without any background, I'd have been left scratching my head since it doesn't make any attempt to explain the actual issue.

Let us know when the audio is available. I'll keep the slides around until then.

> I suspect that the speaker put a lot of extra explanation into them during the talk to explain why TLA1 is better than TLA2, but it doesn't come across at all in the text.

Yep. There's a practical limit to how much text you can put into slides and have them still be useful; I decided that it was more useful to have the recommendations in the slides and the reasoning orally than vice versa.

Is the audio of the talk available online? A cursory googling of bsdcan sources didn't yield anything.

The audio was recorded, but I don't think it's available yet.

Great slides, some comments though.

I feel they focus too much on "use process X and algorithm Y".

The difficulty is not to choose the algorithm or the process, but to understand what kind of problem you're dealing with in the first place. In other words: threat modeling. There isn't a single slide about threat modeling, unless I'm mistaken.

If your error boils down to using SHA-1 instead of SHA-256, that's trivial to fix and you won't be exposed to real world attacks anytime soon.

Talking about side channel attacks on a hyperthreaded core is just going to confuse the audience. This is a very sophisticated attack that requires the attacker to be able to run code on the target machine. If that's your only vulnerability, consider yourself happy.

> The difficulty is not to choose the algorithm or the process, but to understand what kind of problem you're dealing with in the first place.

I agree, up to a point. But there's no way you can teach that sort of understanding in 1 hour. :-)

The purpose of this talk was to provide a checklist for developers of what they should and should not be doing.

But, for instance, you recommended CTR-mode AES-256, but didn't discuss:

* How to set the counters so that counter/IV can't collide and destroy your security

* What metadata your messages need to include to make them not replayable

* How to canonicalize your messages so that the process of packing, authenticating, encrypting, decrypting, and unpacking doesn't change the intent of a message.

I think that may be what the grandparent comment is getting at.

Excellent points. I would much rather have seen a talk on the hard issues such as these. Choosing an algorithm is simple if you already handle the above issues -- you've already heard a lot about AES by then. On the other hand, choosing AES-CTR as Colin recommends in the talk without handling the above issues is exactly where most developers are today.

Put another way: find me one crypto library or application implementing CTR mode that got all the above issues right but used a poor block cipher (not AES or 3DES).

That's where we disagree. I think checklists are not the way one should approach a problem.

You don't want developers to follow a checklist, you want them to use their intelligence.

One hour is enough to make developers realize they know nothing about cryptography. Once they reach that point, they will be on the right path (i.e., really learn about the topic or ask someone who knows).

The most common error isn't improperly used algorithms or techniques, it's improperly used cryptography.

Example: securing a file with AES-CTR and having the password hardcoded in the binary.

One point I think crypto-people need to hammer home to non-crypto audiences is that they should not encrypt and transmit anything that will remain sensitive for more than 5-10 years via any form of encryption relying on factoring, one-way hashes, etc. (Pretty much everything classical short of one-time-pads exchanged in person.) Credit card transactions? Fine. Affairs? State secrets? Not fine. If you're interesting enough, your coded messages will be archived and cracked within a decade or two (or less).

A lot of crypto algorithms are theoretically secure for thousands of years (or longer) assuming an eavesdropper has access to existing algorithms and classical hardware with Moore's law scaling. Unfortunately, it's currently non-existing attack algorithms you really need to worry about. Advances are relentless. Then there are things on the horizon like quantum computing...

Quantum computing is overrated. So far all it has told us is that 15 = 3 * 5 with high probability.

While Shor's algorithm demonstrates that integer factorization is in BQP, there are very good reasons to believe that many other problems are not; a large enough quantum computer may break RSA, but it won't break everything.

My point was that you don't need Quantum computing to break all existing crypto algorithms within a couple decades. QC just makes matters that much worse.

(Downplaying Quantum Computing ostrich-style is very popular in classical crypto circles for some reason, but that's a topic for another day!)

Wrong. We have made them do very fast Fourier transforms now. See http://physics.aps.org/pdf/10.1103/PhysRevLett.104.180501.pd... for details.

I once wrote a Crypto plug-in for mIRC that relied on one time pads distributed by floppy disk. (Someone created the masters and mailed them out.)

A floppy disk worth of IRC chat is actually quite a lot but it also had a 'degraded mode' which used the last chunk of the random data as a symmetric cypher key. That would keep things going until the next disk arrived in the post.

This just goes to show that if you are stupid enough to tell a geek what the best answer is then there is a pretty good chance that they will confuse it with the right answer.

My only hope is that our inane chats about movies are, even now, causing cycles to be burned on a NSA supercomputer somewhere :-)

If you were interesting to them, they would just have copied the disks while they were in the mail system. An OTP transmitted over a vulnerable channel like the postal service is not that strong, you have to exchange it in person.

You can just exchange a few floppy disks worth of random bits over various channels, e.g. some postal, some by email, some over the phone, some Diffie-Hellman key exchanges in public and so on. At the end you just XOR them all together, to get the real key. That way all of your channels have to be compromised for your encryption to break.
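A minimal sketch in C of the XOR-combining step described above. The name `xor_combine` and its parameters are hypothetical, and the sketch assumes every share has the same length; it says nothing about how the shares are generated or delivered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Combine nshares key shares of len bytes each into key by XORing them
 * together. An eavesdropper who is missing even one uniformly random share
 * learns nothing about the result.
 */
void xor_combine(unsigned char *key, const unsigned char *const shares[],
                 size_t nshares, size_t len)
{
    size_t s, i;

    assert(key != NULL && shares != NULL);
    memset(key, 0, len);                 /* start from the all-zero string */
    for (s = 0; s < nshares; s++)
        for (i = 0; i < len; i++)
            key[i] ^= shares[s][i];      /* fold in one share at a time */
}
```

The property the comment relies on only holds if the shares are independent and random; a share derived from another share adds nothing.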

> * SSL requires that you decide which certificate authorities you want to trust.

One subtle point is that governments can force seemingly great certificate authorities to give them a certificate so that they can do man-in-the-middle attacks and still have a reasonable "trust chain" that you can look at in your browser. More details of this "compelled certificate creation attack" at http://cryptome.org/ssl-mitm.pdf

If I had to bet, I'd say most serious attackers aren't bothering to break crypto math or even weak implementations but are rather focusing on this approach by getting a CA to give them a good cert to spoof another site and it'd go practically undetected.

Ultimately with crypto (or anything else) you have to implicitly trust somebody (e.g. Mozilla Foundation) or something (your processor to correctly execute a program or your quantum crypto box to not behave maliciously). As long as this is the case, you can't have perfect security.

A great paper on this is "Reflections on Trusting Trust": http://cm.bell-labs.com/who/ken/trust.html

>Don't even dream about storing your user's passwords on your server. No, not even if they are encrypted.

While I always follow this when it comes to ssh logins, how do you suggest doing this for webapp logins? force everyone to use client certificates? mod_auth_kerberos? (hell, for that matter, kerberos keeps user passwords on the keyserver)

I'm working on bringing my web bits up to snuff, and I'm seriously considering forcing the use of client certificates. That, or just saying 'fuck it' and writing a ncurses billing app that you ssh into. no web interface at all.

Both of those solutions might work okay for me because I have a highly technical userbase, and my marketing is strong enough (or rather, I have other bottlenecks that are bad enough) that I can turn away some customers, but I don't see them working for most web applications.

I think he's saying don't store the users' actual passwords, store salted hashes instead :)

Well, more explicitly PBKDF2 or scrypt. Some people might take "salted hash" as MD5(password+salt) or something.

so the point here is just something that is slow to calculate? I mean, once the attacker has the hash, he can start trying random passwords against it. Sure, with a completely random password, that takes a while. But with a password a human can remember?

So PBKDF2 or scrypt just slows down the process of checking a user's password against a hash, right? That does make sense in that it does slow down an attacker (and combined with a strong user password it seems like it could be secure) but it doesn't solve the essential problem that humans are not good at remembering random things, and there is a limit to how slow your 'check password' function can be in production. Your attacker can almost always be assumed to have more computing resources than you do.

An attacker has to brute force the whole universe of human-friendly passwords. So 1000x the computation is a lot more painful for the attacker than for a legitimate user who only uses the operation occasionally (create/change password and login).

Additionally, salts protect against attacks that use pre-calculated tables.

I haven't read Colin's scrypt paper, so I'm not sure if its strategy differs.

Right, but if your password is an English word concatenated with a number, you might really be looking at the attacker getting it in the first 10,000 guesses, which isn't much if the password can be checked on your webapp box in less than half a second, (in that completely made up case, even if the attacker only has the same resources as you, the password would last an hour and a half) considering 1. the attacker probably has more compute resources dedicated to the cracking than you have dedicated to logging in, and 2. the attacker doesn't mind waiting all day for a login. (I mean, I'm assuming the attacker has other compromised hosts to use in the cracking effort.) I think it's a losing battle.

I'm not saying that slowing down the checking of a password has no value; clearly it has much value. Without it, even a strong password would be brute forced quickly. I'm just saying, I am missing how it solves the problem of weak passwords.

The real problem here is that I know I can't remember a sufficient degree of entropy. Sure, there are all sorts of ways to fake it; take the first letter of every word in a sentence, but that's not exactly random, either. I know a guy who, for a pass phrase, would use a 24 line block of text from some Project Gutenberg title. I doubt that was any more secure than 4 or 5 random characters.

Yeah, I don't think a KDF makes arbitrarily bad passwords OK. They just move the needle and thus increase the cost of the attack. In many cases, the derived key can be cached by the legitimate user, so you can throw a few seconds at it on the user's machine.

10000 * 1s = ~2 hours cpu time, but that's a terrible password -- common words plus 3 digits means 1000 * 10000. 26^6 * 1s = ~10 years CPU time -- a few thousand bucks on EC2 -- that's getting better for what is still a pretty weak password. In any case, straight MD5s are super-fast. At 1e5-1e6 MD5s/second or more on a modern CPU, paying the compute cost for a 1 cpu-second KDF can be several orders of magnitude stronger than MD5(password+salt).
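The back-of-the-envelope figures above can be reproduced with a few lines of C. This is only a sketch of the arithmetic: the helper names `keyspace` and `worst_case_hours` are hypothetical, and the 1-second cost per guess is the assumption taken from the comment, not a measured value.

```c
#include <assert.h>

/* Number of candidate passwords of a fixed length over a given alphabet. */
unsigned long long keyspace(unsigned alphabet, unsigned length)
{
    unsigned long long n = 1;
    unsigned i;

    for (i = 0; i < length; i++)
        n *= alphabet;
    return n;
}

/* CPU hours needed to try every candidate at a fixed cost per guess. */
double worst_case_hours(unsigned long long candidates, double seconds_per_guess)
{
    assert(seconds_per_guess > 0.0);
    return (double)candidates * seconds_per_guess / 3600.0;
}
```

With a 1-second KDF, `worst_case_hours(10000, 1.0)` is a little under 3 hours, and `worst_case_hours(keyspace(26, 6), 1.0) / (24 * 365)` comes out just under 10 years, in line with the rough numbers quoted above.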

Yeah. But really, I was hoping someone had come up with a way to make something like client certs usable. a six character completely random password (even if it's all lower-case without punctuation) is a damn sight better than your average password.

Can you tell me more about allowing the user to cache the derived key within the context of web applications? that would mitigate the limitation I was describing (where how slow you made the password check was limited by what delay a user would tolerate when logging in.)

I don't think this is really suitable for web applications. Javascript really sucks at crypto (due to the lack of suitable data types), and requiring Java wouldn't fly with too many people (Flash is sort of in-between, I guess?). There is a very large constant factor here, and since the whole point of this key derivation stuff is to make an attacker work harder...

scrypt is like PBKDF2 but with careful analysis and a focus on using up lots of memory per-password in addition to CPU. Both seem fine to me.

I don't suppose you're planning to tie these recommendations up into an open source C99 library with an NaCl-like crypto_box interface? Oh, and can I have a pony too? :)

Better, add AES and SHA-256 support to NaCl. With your insight into preventing side channel attacks, it would be a great combo.

I could be wrong, but I don't think NaCl is portable anywhere but x86 or x86_64. ARM needs good crypto too. Though I suppose we could ask for ARM support in NaCl too since we're asking for ponies :)

Also, does NaCl support signing? I think crypto_box offers "authentication" which is a slightly different promise.

No signatures yet. See this summary I wrote a while back for more info.


> DO: Use a 2048-bit RSA key, a public exponent of 65537, SHA256, and MGF1-SHA256.

How can I set the public exponent when generating a key? man gpg wasn't helpful.

And why this particular number? I can see that it's 2^16+1, but I don't understand the advantage of this number compared to others. (Although I've been taught that people prefer small values for e now.)

That was explained in the audio part of the talk. :-)

Short answer: A long list of attacks in the past have been much harder with large public exponents; using 2^16+1 instead of 3 is a bit slower but is likely to make you safer if someone else gets smart in the future.

Thanks for answering my curious questions so far. =)

Since I messed up formatting earlier, my first question about generating keys with this particular exponent was hidden:

Do you know how to generate such a key pair using gpg (or another tool)? Or is this done automatically?

In OpenSSL you can pass your public exponent to RSA_generate_key. I'm not sure about gpg... it has been a while since I last generated a key.

Maybe you are looking for this article (generate RSA/RSA GPG key):


For OpenSSL command line:

  openssl genrsa -f4 2048

Thanks, OpenSSL actually uses 0x10001 by default for e. I'll use it instead of GPG from now on to generate keys.

How is this good code?

x |= MAC_computed[i] − MAC_computed[i];

Shouldn't one side or the other (of the minus operation) be MAC_received[i]?

Yes, that's a typo. Sorry.

(Lesson learned: LaTeX is not the best language to write code in.)

DO: Do what Bruce Eckel does, and have all code in a publication run through automated tests before releasing it. ;-). And... you're welcome?


  x |= MAC_received[i] - MAC_computed[i];

is making me twitchy. I mean, I'm sure we're on two's complement machines and there is no C compiler in the world where those two could be unequal but still subtract to zero*, but still, how about one of

  x |= (MAC_received[i] != MAC_computed[i]);
  x |= (MAC_received[i] ^ MAC_computed[i]);

Otherwise, good stuff!

* Possible break: If we have a separate sign bit, then we might have 0 and -0, which would subtract to give 0. Don't know of any machine/compiler that does that though.

I always use ^ when I write this code, but that made LaTeX unhappy, so I switched to subtraction instead for the talk.

Did you try to escape your ^ like \^?

I think so. It was something like 4 AM at the time, so it's possible that I didn't escape it correctly.

Your != proposed change likely results in a branch, which is what we're trying to avoid. The ^ version is fine, but the original subtraction is too as long as they're unsigned.

Are you sure? All architectures I am aware of have mechanisms to do this in a branchless way. On my machine:

  int main(int argc, char **argv) { int x = 0; x |= (argc != 1); return x; }

Relevant portion of assembly (no optimizations):

  cmpl $1, -20(%rbp)
  setne %al
  movzbl %al, %eax
  orl %eax, -4(%rbp)

Most other architectures have one of: 1) condition codes that you can use to predicate an "or $r1, $r1, 1" (e.g., ARM) 2) compare instructions that put a 1 or a 0 in a result register, which can be directly ORed in. (e.g., MIPS)

Of course, that code only works if you are either not writing in C89 or if your compiler and ISA conspire to always return exactly one from true boolean expressions. It is my understanding that the C89 standard only requires "nonzero," so you might need "a |= (b != c) ? 1 : 0;" instead.

YMMV depending on compiler and architecture, but the 2 or 3 platforms I tried without passing any opti flags were branchless.

Test program:

I wrote about this mistake before. Developers see a single-line RHS expression and think it evaluates without branching.


And why is the original way of not completing the for loop for some keys wrong? Is the difference between checking the whole array and checking only, let's say, the first byte measurable? Especially when considering that this would probably be done over a network?

The current state of the art in remote measurement (which is in its infancy; the first papers to apply basic signal processing on measurement over IP networks are just coming out) suggests that the timeable thresholds are in the tens of microseconds over a WAN.

But that doesn't matter, because over switched "local" networks (read: from one end of a data center to another), the thresholds are in nanoseconds; everything is timeable. Attackers will spend the $30 to get a machine in the same hosting center as you as soon as the other attacks get less trivial.

Yes, it is measurable. Remote timing attacks are scarily practical -- even a single cycle difference can be a problem, because an attacker can make many, many attempts and compute the average time in order to kill the noise.

Timing attacks like this have been successfully used against a few gaming consoles.

In particular, a memcmp()-style timing attack was successful against the Xbox 360 CPU. This is a multi-GHz processor.

Could you expand a bit on your comments about Poly1305 ("too difficult to implement" - but Bernstein has a library, right?) and Blowfish ("DON'T use")?

Yes, Bernstein has written code for Poly1305; but bus factors are bad. If you decide to use Poly1305, at some point you're going to want to use it on a new CPU or in a different language, and you'll find that DJB's code won't work for you.

You should avoid blowfish because it has a 64-bit block size.

Okay, I can see that. Most of Bernstein's crypto work falls in that category, probably.

Forgive my ignorance, but why is having a short block size especially bad? You don't want to subject 64 bits to a birthday attack, but something like CTR mode should work, right? (I still wouldn't recommend Blowfish - AES is used more widely and works perfectly well - I'd just like to understand your reasoning).

With a 128-bit block, you can send 2^64 messages of up to 2^64 blocks each; that's not going to cause problems. With a 64-bit block, you're limited to 2^32 messages of up to 2^32 blocks before you overflow.

Poly1305 is a fundamentally different class of authenticator than HMAC-SHA256: the addition of a nonce into the function makes it unsuitable for some environments. But given Intel's new instructions, GCM is looking like a more attractive polynomial MAC these days.

Why would you take GCM over OCB (apart from patents)? Or, for that matter, OMAC/EAX?

What does "bus factors are bad" mean?

And as long as I'm asking questions, what about bcrypt for password hashing?

"Bus factors" refers to how many people need to get hit by busses before there's nobody left who can work on something. In particular, a bus factor of 1 is a very bad thing to have.

Although it's mitigated somewhat by the fact that Bernstein is passionate and has been a reliable maintainer in the past. "Bus factor" is typically a euphemism for "someone is going to pay this employee more than we do".

I'd call that a cacophonism, though, unless you truly prefer that they get hit by a bus.

AFAIK, bcrypt for password hashing is fine (although Percival has his own scheme that he believes to be better - I can't comment either way).

Last time I asked him, one of the authors of bcrypt considered scrypt to be better, too. :-)

That's a pretty good endorsement :-)

I'm a little confused though. S-crypt seems to be a symmetric encryption algorithm and to store passwords you need a one-way hash. I'm obviously missing something. What is it?

scrypt is a key derivation function. As a demonstration of how to use it, I wrote a simple file encryption utility.

I see. Is there some documentation for the key derivation function that I missed? Or are we supposed to reverse-engineer the file encryption code to figure out how to use the KDF? If the latter, that seems a little odd in light of the fact that you're actively promoting scrypt as a replacement for bcrypt and the concern you've expressed about people using crypto code incorrectly. I do not mean to imply that you are under any sort of obligation to provide documentation. But it seems to me that your stated purposes would be better served if the interface to the KDF were documented.

The scrypt key derivation function, defined: http://www.tarsnap.com/scrypt/scrypt.pdf

Oh come on, give me a little bit of credit here. Where is the scrypt key function defined? What is its signature? Are there any initialization functions that need to be called before calling scrypt?

Yes, I can figure all this out by reverse engineering the code. But is that really what you intend people to do? Even just a couple of sentences of documentation would save a lot of people a lot of head-scratching, e.g.:

"The scrypt key generation function itself is called crypto_scrypt. Its signature is in lib/crypto/crypto_scrypt.h. The reference implementation is in crypto_scrypt-ref.c. There are two optimized implementations, one for CPUs that have SSE support, and another for ones that don't. You don't have to call any initialization functions before calling crypto_scrypt."

Or something like that. If your goal is really for people to use this I think you'd see a pretty large ROI from the effort it would take you to write a paragraph like that and stick it in a README file in the distro.

Just a suggestion.

Oh, you mean "where is the interface to this implementation of the scrypt KDF?"

I thought you were asking where the function itself was defined -- which, as I said, is in the paper.

Let me try to be excruciatingly precise here: you seem to be advocating scrypt for use as a password hash, but your implementation only provides file encryption functionality out of the box. To convert the code you provide to produce and verify password hashes requires some uncommon expertise because there are, as you yourself so often take pains to point out, many ways to get it wrong. My question is: is there some documentation that you have provided to make this task easier that I have overlooked? And if the answer is no, why not? It seems very odd for you to 1) advocate scrypt for use as a password hash, 2) caution people against writing their own crypto code, and then 3) provide neither the code nor the guidance necessary for a software engineer who is not already an expert in cryptography to actually use scrypt as a password hash.

The manifestation of this crisp thinking is Tarsnap, the best Linux backup product I have ever seen.


I'll be interested to hear the thoughts behind the "Probably Avoid" recommendation on DSA and EC signatures. They're both attractive in some situations because they produce much shorter signatures than RSA.

The recommendation for AES-256 over AES-128 runs contrary to Schneier's advice. It makes it hard for the non-experts when the experts can't agree...

DSA and EC both have benefits in some rare circumstances. That's the 1% of the time when you ought to consult a cryptographer.

I'm aware that my advice concerning AES does not match Schneier's advice. I don't think he takes side channel attacks seriously enough.

He also thought "cube cryptanalysis" was going to break most modern ciphers. Do you think he's a serious practitioner? I think he's a pundit.

I think Schneier does less cryptography now than he used to, but I would still say that he's much more than just a pundit.

I'm interested in why you think that (I'm in listen mode here, not argue mode).

Hmm. I'm having trouble putting my finger on anything really concrete right now. More than anything else I'd say that it's just a general impression from how well in touch he seems to be with current events.

He was part of the SHA-3 team that proposed Skein. I think he also coauthored a paper on some advances in attacks on hashes with Kelsey recently.


I'm curious why one would use "x |= MAC_computed[i] - MAC_received[i];" instead of the block-wise comparison. Wouldn't the x |= approach be less secure, because two (or more) differences may cancel each other out? (I'm assuming that |= is either like ||= in Ruby, or a bitwise OR.)

No, not possible for differences to cancel. The bitwise OR of a bunch of things is zero only if all of the individual things are zero. The difference between two things is zero only if they're equal. Therefore, the final value of x is zero only if all the elements of MAC_computed and MAC_received match.

(And -- the main point here -- how long it takes to do the calculation should be totally independent of what's actually in the arrays: no early termination, no operations that take different amounts of time depending on how many elements match, etc.)
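The comparison described above can be sketched as follows. This is an illustration only: the function name `constant_time_equal` is hypothetical, and it accumulates differences with XOR rather than the subtraction shown on the slide (both are zero only when the two bytes are equal). Python does not guarantee constant-time execution, so real code would more likely call the standard library's `hmac.compare_digest`.

```python
def constant_time_equal(mac_computed, mac_received):
    """Compare two byte strings without stopping at the first mismatch."""
    if len(mac_computed) != len(mac_received):
        return False
    x = 0
    for a, b in zip(mac_computed, mac_received):
        # OR can only set bits, never clear them, so differences cannot cancel.
        x |= a ^ b
    return x == 0
```

Every byte pair is visited regardless of where the first difference occurs, which is the property that matters here.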

ah, bitwise or, was thinking incorrectly for some reason. And early termination makes sense, thanks! Didn't think of that.

What's wrong with elliptic curves?

Seriously, are there some known attacks against ECDSA? And what of DSA2?

From the audio part of the talk: Elliptic curves are considerably harder to implement correctly than RSA. DSA has expensive signature verification, and you're far more likely to create one signature and verify it many times than vice versa.

EDIT: Also, elliptic curves are heavily patented. I mentioned this in my talk (the audio portion, not the slides) but forgot when I woke up and started answering questions here this morning. :-)

I'm neither Colin nor a cryptographer, but I think elliptic curve cryptography (1) involves more complicated ideas and code (which might therefore be easier to get wrong), and (2) is subject to more patent claims, than its rivals, without any obvious big benefit to outweigh those facts. [EDIT: oh, and because of 1 and 2 ECC is used less, and therefore your implementation mightn't have had its bugs and design flaws so thoroughly shaken out.]

Certicom has ~130 ECC-related patents. You probably don't want to have to crawl through them all making sure that your code, or someone else's code that you're using, doesn't infringe (and won't be found to infringe by a not-necessarily-infallible jury if you get sued anyway).

[EDIT again: actually, ECC does have a claimed big benefit, namely not needing such long keys. But one might reasonably be worried by the possibility that an algorithmic breakthrough some day might suddenly make longer keys necessary...]

ECC is extremely well studied and implemented almost by default in embedded settings. There are providers of it other than Certicom.

The reality is, you need to consult a cryptographer before you implement any public key system. RSA and vanilla DSA are also spectacularly easy to screw up.

In an embedded environment I would probably use ECC. I believe that I mentioned this in my talk; certainly I mentioned that my advice was for the context of software on general-purpose CPUs.

Unusual environments need special treatment.

ECC vs. RSA is a place where you certainly know better than me; I have that list of things Kaliski said not to do wrong with ECC and very little else. I also think we probably don't disagree very much about ECC.

Where I know we do disagree: public key crypto is a threshold over which I would not be OK implementing a custom cryptosystem. If I need public key, it's GPG or nothing. I am terrified of public key crypto.

I didn't intend for my audience to be implementing everything themselves. Knowing that OAEP is better than PKCS padding is useful even if you don't write a single line of code, because it helps you select the right library to use.

Overall, I think this talk best applies as a starting point for learning about crypto. Take Colin's recommendations as signposts to underlying research. Why would he dislike DSA? Hint: it requires a secure RNG at signing time while RSA only requires one at key generation time.
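To make the hint concrete, here is a toy sketch of what goes wrong when DSA's per-signature random value (the nonce k) repeats: two signatures are enough to recover the private key. The group parameters are deliberately tiny and all the numbers are made up for illustration; `pow(a, -1, q)` for modular inverses assumes Python 3.8 or later.

```python
# Toy DSA group, far too small for real use: g has order q modulo p.
p, q, g = 23, 11, 4


def sign(x, k, h):
    """Return a DSA signature (r, s) on hash value h with key x and nonce k."""
    r = pow(g, k, p) % q
    s = (pow(k, -1, q) * (h + x * r)) % q
    return r, s


def recover_key(h1, sig1, h2, sig2):
    """Recover the private key from two signatures made with the same nonce."""
    r, s1 = sig1
    _, s2 = sig2
    # s1 - s2 = k^-1 * (h1 - h2), so the shared nonce falls out first.
    k = ((h1 - h2) * pow(s1 - s2, -1, q)) % q
    # Then s1 = k^-1 * (h1 + x*r) can be solved for x.
    return ((s1 * k - h1) * pow(r, -1, q)) % q


x = 7          # private key (hypothetical)
k = 3          # nonce, wrongly reused for both messages
h1, h2 = 5, 9  # stand-ins for the hashes of two different messages
recovered = recover_key(h1, sign(x, k, h1), h2, sign(x, k, h2))
```

Here `recovered` equals `x`. RSA signing has no per-signature secret of this kind, which is the asymmetry the hint points at.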


However, I don't think he achieved the stated goal of teaching you enough to become a crypto implementer in 1 hour. I disagree with a lot of his conclusions. But you'd have to learn more than the slides hint at to have a discussion of the details of why.

Some specific comments on the slides.

### Common failures and their causes (2):

I find it interesting that he characterized a timing attack in Keyczar as "stupidity" while a cache-based timing attack with HTT was "unusual environment". Same thing with SSL renegotiation. I think all of these were subtle flaws and easy to overlook. The SSL bug was around for at least 10 years. HMAC timing attacks were not as widely published as they are now, but instead were part of common lore to cryptographers.

The WEP vulnerability was almost certainly not "unusual environment". Sure, there were power and cost constraints in the WEP design, but what was unusual about it was that the crypto was obviously designed by a group with no crypto experience and no public review. Using a varying key with RC4, something it was known to be weak to, was an example of "using a tool wrong" and more similar to the AWS v1 signature bug you found.

The only "unusual environment" was that the specialized and closed field of IEEE wifi design led to non-cryptographers designing crypto. There was nothing unique to the environment that a good crypto person would have had an inherent problem with.

I don't think this list is a good categorization of types of crypto bugs.

### Authentication/Signing (5)

Confusing terminology. I use "integrity protection" or "authenticator" for MAC and "signature" only for public-key signatures. They are two very different processes, even though they share some goals.

### Hashes (8)

It was funny to see SHA-1 in the "don't use" list next to MD4, for example. I agree it's best not to use it for new designs, but there's a world of difference between SHA-1 and MD4, especially if you have to use SHA-1 for legacy compatibility.

### Symmetric authenticators (10)

I think CBC-MAC should be on a "don't use" list, not "avoid". If you put SHA1 on the "don't" list, it's even more important that CBC-MAC be put there. Which will be broken sooner: HMAC-SHA1 or an arbitrary CBC-MAC implementation or use?

### CTR mode (17)

Based on his previous comments, I think Colin advocates CTR mode due to its supposed resistance to side channel attacks. However, it is not a panacea. See this paper by Josh Jaffe on how to use side channel attacks against CTR mode. It's quite interesting how AES's structure makes it more vulnerable to his particular attack.


By citing CBC-MAC as "DONT USE", do you include CCM mode (which is popular)?

I see clear practical reasons to avoid CBC-MAC (there's an incredibly easy and devastating keying mistake to be made with it). But you have more insight into the theoretical problems than I do.

The slides look great, and so does the presentation. May I ask what you used to prepare them?

LaTeX Beamer, with the Warsaw theme.

Seems to have become the standard in mathematical circles.

Thanks, will give it a go.

There is a typo "which which" on the 2nd "Side channel attacks" slide.

Slide 5, line 2: "The ciphertext is the data we evil people get to see." [emphasis added]

(An admission that there is something to the beef with the old BSD logo? ;-)

Oops. Freudian slip, I guess. :-)


Thank you for the guide - the dos/don'ts, from someone who knows, are exactly what's first needed in this minefield.

(And Muphry struck me too - it's in page 6. I double checked the number and then typed the wrong one ...)

Thanks! That slide was written somewhere around 5 AM...

I found some of the information in this presentation to actually be wrong. The guy repeats himself, and at two points he even contradicts himself. I also didn't like that a very large portion of it was presented with a very fervent zeal, replacing actual knowledge of why and why not with useless favoritism and indoctrination; the guy is a zealot, and that's a very, very bad ingredient in any serving of important information.

I would not recommend this particular presentation to anyone.

add.: Thanks for the downvote, Colin :D Oh boy...

Hackermom, you sound confused about the downvotes. To provide a bit of background: Colin Percival has a doctorate in computer science from Oxford, has won the Putnam, has been FreeBSD's security officer for years, and discovered the hyperthreading leak noted in this presentation. Most people who've been on Hacker News for a while trust his judgement.

I'd like to think we would all--cperciva included--welcome specific criticisms or errors, but by claiming they exist without naming them, you're asking us to take your word over his. Your score reflects the fact that you haven't earned that.

Careful. Colin Percival isn't David Wagner or Dan Boneh or even Daniel Bernstein. He wrote a paper on a side channel attack during a window of time when everyone was writing papers on microarchitectural side channel attacks, wrote a pretty cool paper on a password derivation function, and, like the authors of SSH, Tor, and DESLogin (none of them professional cryptographers), built a bunch of cryptography into a well-understood Unix networking problem.

I very much respect Colin and fully expect him to kick my ass in arguments, but, in particular, I find a good portion of his advice does not square with my professional experience; in other words: he's saying things that lay developers have already heard, and, empirically, lay developers produce very bad crypto.

It's funny you'd say that, because you're one of three-ish people on HN I weight equally with cperciva on security issues. But my explanation doesn't depend on his being one of the top 10 security guys in the world; just on his being orders of magnitude more reliable than an average Hacker News reader, like the one who I responded to.

Even that is more than required by the fundamental message I'd hoped to convey, which is "making something and sharing it can be useful; giving non-specific criticisms from a noncommitted, unassailable position is not useful." Reputation is simply Bayesian evidence with which one can weight an otherwise-unsupported claim.

Separately, though, I agree that you've earned your more cynical view of the average developer. You believe Colin commits something similar to the Planning Fallacy[1] by looking at the details of a secure system and his own ability to convey the details of cryptography, instead of looking at the reference class[2] of "developers who have been taught about crypto," which in your experience still gets worse results than developers who just use SSL and GPG.

[1] http://en.wikipedia.org/wiki/Planning_fallacy

[2] http://en.wikipedia.org/wiki/Reference_class_forecasting

at two points he even contradicts himself

Where are those?

replacing actual knowledge of why and why not with useless favoritism and indoctrination

I provided some explanation of why and why not in the talk. Slides != transcript.

EDIT: And no, the downvote didn't come from me.

Please remember that this wasn't a book chapter or a blog posting, but presentation slides.

As presentations are normally held, the slides are just key excerpts from the one hour talk he was holding while the slides whizzed by.

i kinda found this presentation to be chock full of "i am right, you are wrong" "i know better than you" et cetera. it's as if he's trying to coerce people into using what -he- thinks is best and what -he- prefers, but not really caring for sorting out the facts or explaining the details to give people a big enough picture to make educated choices of their own. "just go with this and that, trust me it's the best". no question actual knowledge is in there, the guy knows cryptology, but this seems to be a slide for indoctrinating people into sharing his preferences in crypto, not for teaching, and i personally find it to be the completely wrong approach.

i know better than you

Well, I do know better than my audience. That's why they came to my talk. :-)

Given a one-hour overview, though, there has to be some of this. There's no time for the history of cryptography and the pros and cons of all the methods employed. It seems he wanted the audience to take away some key points that, even without any other knowledge, should prevent the majority of attacks.


there ain't anything called learn it in x hrs/days
