Worse still is the huge amount of conflicting advice and methods available on the internet. If you had asked me today if AES-256 was the best there was, I would have said yes, because that's what I've read, and I'm not a crypto expert; but now this post makes it seem like I'm wrong. The truth is that now I have no idea.
Crypto is hard, but it's so important today that it shouldn't be presented in such an obtuse way. There should be one resource that, in big, bold headings and simple sentences, outlines the best available crypto methods for each particular use case. Hashing a password? Bcrypt. Asymmetrically encrypting plaintext? Algo XXX. Etc. Bonus points for step-by-step instructions on how to use those methods in common programs like GPG.
Without a human-readable, trusted, official resource, it's just going to be crypto scientists shoving acronyms down our throats till we don't care any more. Please simplify it and eliminate the bad choices for us, because you know better! I want to be as secure as possible, but it's outrageously hard to do with the way things are presented today.
But the crypto community has told us time and again that we shouldn't be trying to learn the fine details of crypto. If we are, it's a warning sign we're wading into the dangerous world of actually implementing crypto. We're just supposed to use pre-built, peer-reviewed crypto systems and understand at a very high level how they work: e.g. knowing what a hash is vs. symmetric keys vs. asymmetric keys, and knowing which algorithms are still acceptable (Bcrypt) and which aren't (MD5). And this, I think, is the right message for developers.
But the OP has a valid point. If it's true that many of the high-level, crypto-layman-friendly libraries do bad things like using all-zero IVs, then we have a real problem. But here's where I disagree with the OP: It should not fall on the average developer to understand this stuff and override a crypto library's insecure defaults.
Instead, authors of crypto libraries need to step up and implement secure defaults, as well as provide solid documentation on the do's and don'ts of their libraries. (Assuming of course the OP is correct about the flawed defaults in existing libraries.)
In any case, the solution is not to urge normal developers to learn crypto at the level the OP describes. I'm a smart guy, but I know better than to trust myself with things like initialization vectors. This is, after all, what the broader crypto community has been telling people.
But there are so many different crypto libraries with so many varied defaults that, in the end, we need some basic education on what is secure and what isn't. This particular blog post states in its 3rd sentence that developers using libraries are already making a grave mistake. How can we know which libraries to trust? If we can't trust libraries, what can we trust? Which magic combination of alphabet-soup abbreviations is the most secure, and how can I, a layman, be sure? For example: for asymmetric encryption, GPG on Ubuntu 12.04 defaults to the CAST5 algorithm. Is this better or worse than AES-256? I hear about AES-256 all the time, but I've never heard of CAST5. How can I, a layman, know it's the best default choice, considering that I only hear about more popular algorithms?
I feel like there should be a single best answer for most possible use cases. For example, today the best answer for storing password hashes would be to use the bcrypt library for your language of choice. But there are just so many options and so little clarity and consensus on a lot of these things (for regular developers, at least) that it's nigh-impossible to be confident about what you're doing. People are still MD5'ing passwords to this day--and it's not because they're idiots; even knowing about MD5 means they had the initiative to Google "how to hash a password".
If I've misread the intention of this blog post, then perhaps I'm arguing the wrong point. If I am, I submit that this blog post might be better located in "crypto-researcher-news", not "hacker-news".
NACL is sound.
Using straight-up TLS with OpenSSL or your platform default is fine as long as you test that you're validating certificates.
PGP/GPG is fine.
Everything else is extremely suspect.
I really disagree with this, and there are now good free courses available online.
Implementing as anything other than a toy is dangerous, but learning should never be discouraged. Besides which it's fascinating.
Programmers have to use crypto libraries all the time. The 3rd sentence in this post is "Chances are whenever you have tried to use a cryptographic library you made some sort of catastrophic mistake." Crypto is hard, but humans have to use it--this post is ridiculously complex and utterly unhelpful for anyone who's not already an expert, and that's not uncommon in crypto documentation, I think.
That's not how it should be. Something so important and widespread should be written about and explained in a human manner. My point is we need some resource to de-obfuscate the technobabble for those of us who need security but have day jobs that aren't developing the latest and greatest hash algorithm.
Answer: Use PGP, Keyczar, or NACL.
Question: What if I want to use EAX? GCM? CCM? AES-CTR? CMAC? OMAC1? AEAD? NIST? CBC? CFB? CTR? ECB?
Answer: You will perish in flames.
Smart cryptographers already grappled with the problem you're talking about here, concluded that non-crypto-engineers were never going to get these details right (professional crypto engineers don't even get it right, and there's a whole academic field dedicated to why), and designed high-level libraries that don't expose primitives like cipher cores, block modes, and MACs. You need to be using those high-level libraries. You need to start treating things like "CTR" and "OMAC" like plutonium, instead of like AA batteries.
There's a disconnect here. Perhaps to you it seems like you're pounding the same simple point over and over again, trying every which way to explain it, and we always keep bringing up things that we really shouldn't be concerning ourselves with because we'll just screw them up.
But the community needs a better starting point. A lot of us know that there's a universe of stuff that we don't know about crypto, and we don't blithely imagine that we're secure because we used XCZ or LSA-j14(3). What we know is that Bob said to use XLQ and everyone says Bob is an expert, so we're gonna use XLQ. But we often come to this information in the middle of a hacker news thread, or on a website that looks like it was designed in 1993. There's no good general starting point that gives us a way to make good security decisions without knowing what we don't know. Does that make sense?
(There are actually a lot of resources that try to be starting points, but without the tools to do a meta-evaluation of which of these is expert and trustworthy, we're back to the same problem.)
To me it sounds like a bad excuse for security guys to stroke their own egos. 'Puny mortal, you are nothing. Repent your ways of sin and obey your gods.'
Honestly, fuck that.
Meanwhile, tptacek makes the very good point that if you ignore this advice because you're offended by it, you're going to end up building insecure systems that will endanger other people's data and possibly worse. But it's an impatient answer, and it actually does come off as pretty egotistical. Isn't there some room between "secure" and "incompetent"?
Sorry for being all third-persony. I think you both make valid points, despite the negative tone.
I responded to dignify it because I thought that in spite of the invective, there's a valid point about whether the article is helpful to the people it's meant to reach. The title is needlessly insulting to the reader. The tl;dr is pretty useless. You don't learn to do things right by cargo culting a mantra that you don't understand. The content of the tl;dr should be the block quote starting under "That said, what modes should you be using?"
Yes, you're right, this is actually pretty irrelevant to the content of the article and the question of whether a particular system is secure. I think it connects to a larger issue about security education that's lurking out there, though, and the article is clearly meant to educate.
Pretty much any software being built that touches the internet involves cryptography. So pretty much any software. It's an important subject for any software engineer, and there's a lot of good available to be done by helping educate engineers at every level of experience.
Most of the time, it is easier to dash off a short comment that says something wrong, like, "pretty much any software" is going to involve grappling with cryptosystems, than it is to write a comment that thoroughly refutes that wrongness.
Also, the space of possible wrong things you could write, like, "there's a lot of good available to be done by helping educate engineers" about how to write bespoke custom cryptosystems, is much larger than the space of things you can write that are even strictly speaking correct. So I'm at a double disadvantage.
Ultimately, while I am happy to hear that you find my other comments helpful, I just do not care that you find my condescending, combative, or overly prolific on this thread. Deciding what to say based on what might or might not make random anonymous HN users happy is simply no way to be.
I'm a professional software engineer. I work on a system that occasionally passes secrets through untrusted contexts, encrypted with AES-256-CBC. Is that a good idea? Could we improve it? Would it be worth the effort? I'm open to learning, but this article isn't teaching, it's browbeating. So are your comments.
1. How do the basics and intricacies of crypto work? acabal can hit up Coursera for that, and the crypto guys here will probably help him with his homework for free.
2. How do I implement a user authentication mechanism from scratch? Well, that's the one you cannot learn; "don't try" seems to be the advice.
(in fact the simplest advice is on here too - authenticate once, hand over a random number, look it up at the back end.)
I assume that includes Time it out, count it out and then go read about CSRF protection.
The point of this is you shouldn't even be using algorithms directly. If you need to touch crypto at all you need to learn about it, and the more you learn about it the more you'll realise just how damned hard it is.
Absolutely use common programs like GPG and common libraries like OpenSSL. In fact don't use anything else, and don't even think about implementing this stuff yourself.
I agree some sort of authoritative set of info would be good. Very hard to do though.
- from Kurt Vonnegut's novel Cat's Cradle
- Albert Einstein
- Abraham Lincoln
Also I'm pretty sure that's Diablo Cody's quote, not Einstein's.
Take OpenSSL for example... it's almost easier to learn the crypto than to figure out the API. Here's a good one: an SSL read/write operation can fail with more than one error, and if you don't clear or loop through all the errors then the next operation will fail because of the previous errors -- even if it succeeded. Or just try getting it to work with non-blocking sockets, you finally believe it is working and surprise it fails only once the network gets saturated. Or hours later when it renegotiates the crypto.
And you still have to know all the crypto terms to use it. What's a PEM? A BIO? PKCS? DHparams? What's "ASCII armor"? X509? Did SSL_library_init() add weak algorithms? Why do I have to know this just to create a secure connection?
Most of the blame for crypto problems belongs to the libraries, not the developers using them.
Instead, use NACL or Keyczar. Those are high-level interfaces designed to help generalist developers not make mistakes.
You are always better off not working directly with crypto at all. For that, consider using PGP/GPG for data-at-rest (if you're in a JVM language, you can use Bouncycastle to get access to PGP) and SSL/TLS for data-in-motion.
In our practice, password reset tokens and encrypted session cookies continue to be the top source of exploitable crypto vulnerabilities in web applications. You don't need encryption to build either of these features; send 128 bit random numbers that key a database row instead.
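A minimal sketch of that approach in Python, using only the standard library (the function name and row layout are my own illustration, not something from the thread):

```python
import secrets

def new_reset_token():
    """Return an unguessable 128-bit token to key a password-reset row.

    The token carries no user information itself; the server stores it in
    a database row alongside the user ID and an expiry timestamp, and
    looks the row up when the token comes back.
    """
    return secrets.token_urlsafe(16)  # 16 random bytes = 128 bits
```

Because the token is just a database key, there is nothing to decrypt and nothing an attacker can tamper with: an invalid token simply matches no row.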
If you use Go, instead of hacking together things with AES from the standard library to encrypt something, just `go get code.google.com/p/go.crypto/nacl/secretbox` and call secretbox.Seal with unique [randomly generated is okay] nonce and your key to get the encrypted and authenticated "box". secretbox.Open with the same key and nonce will open it.
Bonus: on amd64 it's extremely fast (portions are written in assembly, and even pure Go versions for other platforms are hand-optimized), faster than what you can get from any non-hardware AES implementation (let alone authenticated with HMAC), and provides better security margin.
"In our practice, password reset tokens and encrypted session cookies continue to be the top source of exploitable crypto vulnerabilities in web applications. You don't need encryption to build either of these features; send 128 bit random numbers that key a database row instead."
In that way, both sender and receiver need only generate the same cipher bits and apply XOR to encrypt and decrypt (meaning encryption and decryption are actually identical operations!). A side effect of XOR is that a single bit flip in the ciphertext corresponds exactly to a single bit flip in the cleartext. An attacker with knowledge of your cleartext can therefore modify it without ever needing to know the cipher parameters.
Imagine a session cookie that contains a single 32-bit integer, the user ID. The attacker knows his own user ID, so he merely needs to XOR the cookie with his ID, then XOR it again with his desired ID, and voila: admin privileges. Wrapping the cookie in a MAC prevents this kind of manipulation.
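The attack described above fits in a few lines of Python with a toy stream cipher (the random 4-byte keystream stands in for RC4/CTR output; all names here are illustrative):

```python
import os

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

keystream = os.urandom(4)              # what the stream cipher would generate

my_id = (1234).to_bytes(4, "big")      # attacker's own user ID: known plaintext
admin_id = (1).to_bytes(4, "big")      # the user ID the attacker wants

cookie = xor_bytes(my_id, keystream)   # the server's "encrypted" session cookie

# The attacker never touches the key: XOR out the known ID, XOR in the target.
forged = xor_bytes(cookie, xor_bytes(my_id, admin_id))

# The server decrypts the forged cookie and sees user 1.
assert xor_bytes(forged, keystream) == admin_id
```

Note that the forgery works without recovering the keystream at all, which is exactly why a MAC over the cookie is required.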
But there are even more problems than that with unauthenticated encryption. If you don't authenticate there is a good chance attackers will be able to decrypt your messages wholesale.
Eek, that sounds fun :) Tell us more?
I highly recommend Dan Boneh's crypto 101 on coursera for anyone that has the time.
Again, even if you get this part right, there are other things that go wrong. TLS is authenticated, and it fell to two adaptive chosen plaintext attacks because of two different implementation details they messed up. And no public cryptosystem in the world has been as thoroughly tested and analyzed as TLS.
For some mind-boggling reason, the designers of the XML Encryption standard decided to make authentication optional, so an attacker can simply avoid sending an incorrect MAC.
OTOH, if the receiver must decrypt before checking the MAC, any information leaked to the attacker (success or failure, timing, etc) is very likely to give the attacker a systematic method to decrypt some or all of your secret plaintext.
It turns out for the vast majority of use cases, even broken crypto is Good Enough. That has to do with two things.
First, for you to be worrying about initialization vectors and block modes, you need to have a reason to be working at that level--and the answer is "no" for most webapp use cases.
Secondly, and ignored by security fearmongers: you need to have something of value commensurate with the efforts you've put toward security.
This is the mindset of a cryptography professional: true, chances are what you're using is broken. But whether that's a problem depends on who you are. If you're the NSA, you might be concerned with crazy shit like China building quantum computers and factoring all your primes (I'm making this up). Whereas if you're the latest social.ly startup, your users have no privacy anyway, and security is more of an image concern (it looks bad to be on the front page of HN with a security bug) than a real one.
It turns out that for the vast majority of cases, broken crypto schemes don't have enough users for the tiny minority of software security people who look at crypto to bother beating up.
Once one of those systems becomes popular, crypto pentesters finally get around to poking at them, and, lo and behold, thousands of users are discovered to have been communicating effectively in the clear for years.
It's one of the more pernicious evil memes in our field that "if you're not the NSA, everything you do is broken, so just try the best you can". No. The attacks we're talking about take 50-200 lines of Ruby code and less than 30 seconds to run. There's no excuse for being exposed to them. We mercilessly mock people who screw up SQL queries, but then act like it took a network of intelligence service supercomputers to break our amateurish cryptosystems. We'd all be better off if fewer people tried their hand at building these systems in the first place. Use PGP, or Keyczar, or NACL. If you're typing the letters "A-E-S" into your code, you're doing it wrong.
I just wanted to point out the more general notion that, from a business perspective, even a colossal failure like Dropbox's is more of a PR disaster than anything else. And until that equation changes, security will be at the bottom of the totem pole.
The MAC example is a good one. You're at the point where you're trying to protect your app against a chosen ciphertext attack (pretty far up the sophistication scale), and trying to decide the MAC mode. The text in the article makes you sound like a total idiot for not knowing what the "right" choice is and making you worry that you might not get it right.
And then you get hacked because one of your admins had an SSH key on her phone and it got stolen.
Too much security analysis is missing Big Picture issues...
Obviously it's because the cryptography as a field is technically interesting and "security best practices" isn't. And there's nothing wrong with that from the perspective of someone looking for interesting links on HN. But honestly I feel like that kind of tunnel vision has reached the point where it's actually hurting security practice rather than helping.
It doesn't bother you that, if you do it wrong, by watching a bit of traffic and sending a few thousand page requests I might be able to impersonate any user on your system?
The linked post isn't even an interesting or exciting thing about crypto, it's not even news, it's just reiterating the usual thing - you shouldn't be doing this yourself.
In fact the linked blog post is exactly about best practices.
Obviously it "bothers me" that crypto is easy to get wrong. My point was that other things bother me more, and I don't think this genre of blog post (or your very typical reaction to criticism thereof) is helpful to improving security. See my other post -- are you one of those little BOfH monsters enabled by a little crypto knowledge? Are you sure?
And I'm not sure it's "helpful", really. It's more musing on whether or not this kind of advice is hurting more than helping. What you are saying amounts to "Crypto is really hard, so use expert-authored solutions." But in my experience what people hear is "Crypto is Hugely Important and I'm using an expert-authored solution and using the same jargon, so you need to listen to me about all that security stuff and do what I say even though it's totally impractical."
Broadly, I guess I'm thinking that this creates little BOfH monsters, where a more nuanced, "big picture" frame might engender more thought about costs and tradeoffs.
Generally: I'm comfortable writing about crypto when the subject is "how you would practically break a system that makes mistake X or mistake Y". I'm not comfortable about posts with prescriptive content.
I am also not comfortable with posts that condone building cryptosystems out of primitives, even when they limit the solution space to well-regarded tuples of those primitives. If I had to write a prescriptive post about crypto, it would state clearly: you cannot DIY this, and you must use a vetted cryptosystem; your choices include NACL, Keyczar, and PGP.
The people who already know this information don't need this blog post, and the ones who don't get no information that helps them understand the problem. I think stating what can go wrong with, for example, ECB would make it much easier for people to remember which mode is good and which is bad. As it is at the moment, one would have to remember a few cryptic acronyms as "good" and others as "bad", without any context. (Or go look it up somewhere, but the people looking these things up are not the ones in need of being alerted...)
Crypto is a place where a little knowledge is often harmful. You tell someone not to use ECB, so they use CBC instead. Now they're vulnerable to a bunch of new practical attacks. You list those off, and they switch to CTR, and their stuff is as trivially decryptable as simple XOR encryption.
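The CTR failure alluded to here is keystream reuse: encrypt two messages under the same key and nonce, and the keystream cancels out of the XOR of the two ciphertexts. A toy Python sketch (random bytes stand in for the real AES-CTR keystream; this is an illustration, not an implementation):

```python
import os

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

keystream = os.urandom(16)   # stands in for AES-CTR output under a reused nonce

p1 = b"known plaintext!"     # 16 bytes the attacker already knows
p2 = b"secret plaintext"     # 16 bytes the attacker wants

c1 = xor_bytes(p1, keystream)
c2 = xor_bytes(p2, keystream)

# XORing the two ciphertexts cancels the keystream completely...
assert xor_bytes(c1, c2) == xor_bytes(p1, p2)
# ...so the known plaintext recovers the secret one; the key is never needed.
assert xor_bytes(xor_bytes(c1, c2), p1) == p2
```

This is why "switch to CTR" without nonce discipline leaves you no better off than simple XOR encryption.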
What's the point? It's better to give people the (honest, accurate) impression that they don't know anything resembling enough to build a cryptosystem so they won't try, fail, and screw over their users.
The point is that when people read this blog on the front-page of HN, they expect to learn something. 99.99% of us aren't writing our own crypto. But this is a way of learning. I've been told from basically everyone in the industry since a young age that you shouldn't write your own crypto. As a result, I educated myself to understand how I might get it wrong so I can understand when things are wrong, and so that I could understand crypto news. Similarly, as a web dev, I educated myself heavily on common vulnerabilities. The end result is that no, I can't say that everything I make is 100% secure, but it's certainly a lot better than if I didn't do those things. At least now I can actually see when myself and others might be causing a vulnerability.
You seem to be hand-waving things and telling people "STAY AWAY." Maybe, just maybe, instead of treating people like they're all idiots, enlightening them might work a little better. You've put the warning on the label; the blog post is NOT a guide on how to do crypto, but a guide to how you'll probably get it wrong. Now proceed with the juicy details. Maybe I don't know why I shouldn't be using ECB because I don't know what it is, because I don't do crypto. Maybe I'm not implementing my own block cipher, or maybe I'm just curious about the different modes of block ciphers and which are good or bad and why. Maybe with a better understanding I can even explain it to other people, rather than just telling them "you shouldn't do your own crypto" and looking like an idiot when I have nothing to say when they ask "why?"
Applying your reasoning to computer science -- some people end up as negative productive engineers. They create more bugs and problems than they solve. Obviously, we should just STOP TEACHING CS to anyone because such people exist, right?
I am interested in giving people the right advice to build systems that won't blow their hands off in production a year from now. I am at the moment less interested in handholding them through a guided tour of the last 10 years of crypto vulnerabilities.
You'd see the same phenomenon occurring in message board threads about DIY nuclear reactors. Except that it's hard to build a DIY nuclear reactor, and very few people do it, so we don't need those threads very often, and we don't need to sort through the "LET'S JUST NOBODY LEARN ABOUT NUCLEAR POWER THEN, HUH?" comments.
Unfortunately, it is very easy to build new crypto applications, and people do it, and then a year or two later we see threads from dissident groups in South America where people have been interrogated by police organizations that have full decrypts from those tools, and then a few more weeks later we find out that the tools were doing comically stupid things with their crypto building blocks.
For example, for ECB it's trivial to explain, that it allows reordering of encrypted blocks, an attack that someone not familiar with cryptography has probably never thought about. Of course they won't gain a full understanding of cryptography in this way, but they will hopefully start to appreciate how difficult it is to get right, and how many attacks there are that they have never thought about. This hopefully includes an understanding that possibly other modes that they think would be safe might also have problems that they have never considered, and that they should really use a full package solution.
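As a concrete sketch of that ECB point, here is a toy in Python; the XOR "block cipher" is a stand-in for AES and is not remotely secure, but ECB's structural flaws (equal blocks leak, blocks can be reordered) are identical with a real cipher:

```python
import hashlib

KEY = b"toy-key"
BLOCK = 8
PAD = hashlib.sha256(KEY).digest()[:BLOCK]

def enc_block(b):          # toy "block cipher": XOR with a key-derived pad
    return bytes(x ^ y for x, y in zip(b, PAD))

dec_block = enc_block      # XOR is its own inverse (real ciphers have a separate inverse)

def ecb_encrypt(pt):
    return b"".join(enc_block(pt[i:i + BLOCK]) for i in range(0, len(pt), BLOCK))

def ecb_decrypt(ct):
    return b"".join(dec_block(ct[i:i + BLOCK]) for i in range(0, len(ct), BLOCK))

msg = b"PAY=0100" + b"TO=alice" + b"PAY=0100"   # three 8-byte blocks
ct = ecb_encrypt(msg)

# 1. Equal plaintext blocks encrypt to equal ciphertext blocks: patterns leak.
assert ct[0:8] == ct[16:24]

# 2. An attacker can reorder ciphertext blocks; the result still decrypts cleanly.
swapped = ct[8:16] + ct[0:8] + ct[16:24]
assert ecb_decrypt(swapped) == b"TO=alice" + b"PAY=0100" + b"PAY=0100"
```

Neither of these attacks requires knowing the key, which is the kind of surprise that hopefully nudges people toward a full package solution.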
I think this will drive the point home better than, say, listing three opaque possibilities for how to use a MAC and then saying "No. 1 is the right one", without any explanation whatsoever.
[Fake Edit:] Hooray, WaybackMachine to the rescue:
what they seem to be saying is: feeding someone a little extra information that is entertaining somehow helps make the recommendation to use a 3rd party library more convincing.
and i think there's something to that argument, even though i've presented it poorly above. people don't like to be simply told to do something. they like to be indulged a little.
i realise that goes somewhat against your personal style, but it's not obviously wrong...
I get what you are saying but the idea that people should not try to learn is ridiculous. If someone is truly interested they need to be told to go ahead. Encourage caution, but not ignorance. When someone asks why ECB is bad then tell them, or tell them to go read up on it.
Authentication and crypto are areas where I know I'm nowhere near good enough to tell if what I'm doing works or not, other than trying to find out what the best practices happen to be and using what seems to have a good reputation. But the only constant in this field seems to be that everything I thought I knew was fatally, horribly flawed from the start, and that I might as well have just stuck all of my passwords in a plaintext file called "passwords.txt", because the only thing keeping my site from getting rooted was having a terrible site nobody ever goes to.
Still this should be an informative thread.
Crushingly depressing and informative.
That is an engineering problem: there are the pieces but there is no real engine.
But this is my biased and humble opinion after using OpenSSL.
I am not advocating a 'just works one way' API. I am asking for a real engineering effort to create a 'dumb people can use this safely as long as they follow the simple instructions' API.
If you (developer) need to know why encrypt+mac != mac+encrypt, then the security engineers have not done their job. If you need to know the difference between ECB and CBC, the security engineers have not done their job. If you need to know about the IV in AES, then ... (just repeat).
All the code you've ever written is broken.
With most code the consequences are not all that bad, with cryptographic code the consequences can be disastrous.
Is the guy going for hits on his first blog entry or somesuch?
And while it shouldn't be news to anyone using crypto, I still feel like it's a poorly-known subject.
I learned a lot by doing the coursera course on it earlier this year including an intro to a whole new area of mathematics. Some sort of intro to crypto techniques and uses really ought to be mandatory for a lot of devs, certainly anyone tempted to use anything other than a well-coded TLS API or pre-provided GCM interface.
In fact it's also very important that even when you do know what you're doing (to a greater extent than the entirely uninitiated), you don't implement this stuff yourself, because there are more attacks than you can possibly imagine.
It's so hard that even when you try to "leave it to the experts" you often do that wrong as well. There seems to be no other place in programming where best practices are more important to study and observe.
So, why should I use authenticated encryption, and how can I use it on Node.js/Rails/PHP/etc.?
This is one of those situations where if you have to ask why you'd use authenticated encryption or how you'd do it, you simply shouldn't be using cryptography primitives directly at all. Use PGP/GPG instead.
If that's what you think this post is about, stay away from crypto. I don't mean that offensively. Most developers should stay the hell away from crypto. They all know intuitively that they shouldn't be hacking together, say, kernel modules. But since crypto comes in library form, they assume it's safe to build with it. It isn't.
Tony isn't complaining that developers implement the AES cipher core themselves and screw it up, or even that they're implementing their own block cipher mode code. He's saying that even when you use third-party libraries, if you haven't chosen a library that is specifically designed to avoid mistakes, you are boned. And he's absolutely right about that.
Also, from experience: virtually nobody uses the good high-level libraries. What library did you use last time you built a feature, like a password reset token, that relied on crypto?
What library _should_ I be using for sending out a password reset token?
(I realize that by asking I am proving your point. I would still like to know the answer.)
I expect there are other reasons than backwards compatibility or standards compliance.
The library designer hasn't done anything "wrong" by providing a CBC module for use with cipher cores, but if you just pick up CBC mode and AES and go use them, you're likely to build a vulnerable cryptosystem, because the CBC IV must be unpredictable, it must not be possible to introduce controlled padding errors to the ciphertext, and because CBC ciphertexts can be bit-flip rewritten.
The problem is not with the CBC code; the problem is with exposing crypto at the "CBC mode" level of abstraction.
But what's the author of a low-level library to do? There isn't one simple solution to these problems. Some cryptosystems HMAC-SHA2 the CBC ciphertext. Some use cryptographically random IVs and tack them to the beginning of messages. It all depends on the system; the library designer can't predict all the different ways the code will be used, even though most of the ways the code is used will end up being insecure.
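For illustration, here is what the HMAC-over-ciphertext variant looks like with Python's standard library. The ciphertext is an opaque placeholder (the CBC step itself is omitted), since the point being illustrated is the verify-before-decrypt ordering, not the cipher:

```python
import hashlib
import hmac
import os

MAC_KEY = os.urandom(32)   # must be a separate key from the encryption key

def seal(iv, ciphertext):
    """Encrypt-then-MAC: authenticate the IV and ciphertext together."""
    tag = hmac.new(MAC_KEY, iv + ciphertext, hashlib.sha256).digest()
    return iv + ciphertext + tag

def open_sealed(blob):
    iv, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(MAC_KEY, iv + ciphertext, hashlib.sha256).digest()
    # Constant-time compare, and verify BEFORE attempting any decryption,
    # so attacker-controlled padding errors never reach the cipher layer.
    if not hmac.compare_digest(tag, expected):
        return None
    return ciphertext   # hand off to the (omitted) CBC decryption step

iv, ct = os.urandom(16), b"opaque CBC ciphertext bytes"
blob = seal(iv, ct)
assert open_sealed(blob) == ct

tampered = blob[:-1] + bytes([blob[-1] ^ 1])   # flip one bit anywhere
assert open_sealed(tampered) is None
```

Even this small sketch has to get several details right (independent keys, MAC over the IV too, constant-time comparison), which is the whole argument for reaching for NACL or Keyczar instead.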
This is why developers should use high-level libraries like Keyczar and NACL; by sacrificing low-level control and compatibility with legacy cryptosystems, they allow generalist developers to not have to worry about all the implementation attacks.
So why is a proper MAC better than appending a checksum or hash to the plaintext and then encrypting? Or maybe I am misunderstanding something?