OK, I get it. Encryption is hard. But just look at that post--there are so many acronyms and so much tech-talk that my head is spinning and I'm not halfway through. How can you expect anyone to do encryption properly when it's presented like that? In fact I'd say this post is literally useless FUD for anyone without a crypto background. It's unreadable unless you're an enthusiast.
Worse still is the huge amount of conflicting advice and methods available on the internet. If you had asked me today if AES-256 was the best there was, I would have said yes, because that's what I've read, and I'm not a crypto expert; but now this post makes it seem like I'm wrong. The truth is that now I have no idea.
Crypto is hard, but it's so important today that it shouldn't be presented in such an obtuse way. There should be one resource that in big, bold headings and simple sentences outlines the best available crypto methods for each particular use case. Hashing a password? Bcrypt. Asymmetrically encrypting plaintext? Algo XXX. Etc. Bonus points for step-by-step instructions on how to use those methods in common programs like GPG.
Without a human-readable, trusted, official resource, it's just going to be crypto scientists shoving acronyms down our throats till we don't care any more. Please simplify it and eliminate the bad choices for us, because you know better! I want to be as secure as possible, but it's outrageously hard to do with the way things are presented today.
Quite a few of the replies to acabal's comment amount to "yes, it's hard, suck it up and learn it."
But the crypto community has told us time and again that we shouldn't be trying to learn the fine details of crypto. If we are, it's a warning sign we're wading into the dangerous world of actually implementing crypto. We're just supposed to use pre-built, peer-reviewed crypto systems and understand at a very high level how they work. E.g. knowing the difference between a hash, a symmetric key, and an asymmetric key, and knowing which algorithms are still acceptable (Bcrypt) and which aren't (MD5). And this, I think, is the right message for developers.
But the OP has a valid point. If it's true that many of the high-level, crypto-layman-friendly libraries do bad things like using all-zero IVs, then we have a real problem. But here's where I disagree with the OP: It should not fall on the average developer to understand this stuff and override a crypto library's insecure defaults.
Instead, authors of crypto libraries need to step up and implement secure defaults, as well as provide solid documentation on the do's and don'ts of their libraries. (Assuming of course the OP is correct about the flawed defaults in existing libraries.)
In any case, the solution is not to urge normal developers to learn crypto at the level the OP describes. I'm a smart guy, but I know better than to trust myself with things like initialization vectors. This is, after all, what the broader crypto community has been telling people.
I agree with you completely, and perhaps to clarify my comment (the one you've replied to): As you say, developers shouldn't have to worry about the fine details of crypto. What I mean by that is: Common tasks like hashing a password, creating a secure http connection, encrypting plaintext asymmetrically, etc., should be obvious, clear, and as secure as technology allows.
But, there are so many different crypto libraries with so many varied defaults that, in the end, we need some basic education on what is secure and what isn't. This particular blog post states in the 3rd sentence that developers using libraries are already making a grave mistake. How can we know which libraries to trust? If we can't trust libraries, what can we trust? Which magic combination of alphabet-soup abbreviations is the most secure, and how can I, a layman, be sure? For example: For asymmetric encryption, GPG on Ubuntu 12.04 defaults to the CAST5 algorithm. Is this better or worse than AES-256? I hear about AES-256 all the time, but I've never heard of CAST5. How can I, a layman, know it's the best default choice, considering that I only hear about more popular algorithms?
I feel like there should be a single best answer for most possible use cases. For example, today the best answer for storing password hashes would be to use the bcrypt library for your language of choice. But there are just so many options and so little clarity and consensus on a lot of these things (for regular developers, at least) that it's nigh-impossible to be confident about what you're doing. People are still MD5'ing passwords to this day--and it's not because they're idiots, because even knowing about MD5 means they had the initiative to Google "how to hash a password".
If I've misread the intention of this blog post, then perhaps I'm arguing the wrong point. If I am, I submit that this blog post might be better located in "crypto-researcher-news", not "hacker-news".
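For what it's worth, the "single best answer" for passwords is small enough to sketch. Python's standard library has no bcrypt, so this stand-in sketch uses hashlib.scrypt (another deliberately slow, salted KDF) just to show the shape of the pattern--per-user random salt, tunable cost, constant-time comparison. The function names are mine; in a real app you'd reach for your language's bcrypt package.

```python
import hashlib
import hmac
import secrets

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Hash a password with a fresh random salt and a deliberately slow KDF."""
    salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return salt, digest

def check_password(password: str, salt: bytes, digest: bytes) -> bool:
    # Re-derive with the stored salt and compare in constant time.
    candidate = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return hmac.compare_digest(candidate, digest)

salt, digest = hash_password("hunter2")
```

The cost parameters (n, r, p) are illustrative; the point is that the salt and the slowness come built in, which is exactly what MD5'ing a password lacks.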
Does that mean Bcrypt is extremely suspect? If so, then we have a real communication problem, because lots of people who claim to know crypto sing Bcrypt's praises.
While I think that people should learn the cryptographic primitives as much as possible, the point of my post was to use a well-reputed open source authenticated encryption library. In addition to NaCl, there's also Keyczar: http://www.keyczar.org/
That's the point. The post is hard to read because the subject is hard. If that post --- which is pretty breezy --- is too hard for you to grok, the message you should be taking away from it is "I shouldn't be designing systems with crypto". The headline is good enough for you.
Yeah but how do I use systems with crypto? What if I want to encrypt some text? Do I worry about a MAC? Does GPG do MACs? How do I know? Do I worry about EAX? GCM? CCM? AES-CTR? CMAC? OMAC1? AEAD? NIST? CBC? CFB? CTR? ECB?
Programmers have to use crypto libraries all the time. The 3rd sentence in this post is "Chances are whenever you have tried to use a cryptographic library you made some sort of catastrophic mistake." Crypto is hard, but humans have to use it--this post is ridiculously complex and utterly unhelpful for anyone who's not already an expert, and that's not uncommon in crypto documentation, I think.
That's not how it should be. Something so important and widespread should be written about and explained in a human manner. My point is we need some resource to de-obfuscate the technobabble for those of us who need security but have day jobs that aren't developing the latest and greatest hash algorithm.
Question: What if I want to use EAX? GCM? CCM? AES-CTR? CMAC? OMAC1? AEAD? NIST? CBC? CFB? CTR? ECB?
Answer: You will perish in flames.
Smart cryptographers already grappled with the problem you're talking about here, concluded that non-crypto-engineers were never going to get these details right (professional crypto engineers don't even get it right, and there's a whole academic field dedicated to why), and designed high-level libraries that don't expose primitives like cipher cores, block modes, and MACs. You need to be using those high-level libraries. You need to start treating things like "CTR" and "OMAC" like plutonium, instead of like AA batteries.
Ok, we're getting closer, but I think acabal's point was that it's hard for us to tell on a general basis which acronyms matter and which don't. i.e. the question isn't really "what do I use?" it's "how do I know what to use when I don't have the knowledge to evaluate the different options? or even to tell which options matter?"
There's a disconnect here. Perhaps to you it seems like you're pounding the same simple point over and over again, trying every which way to explain it, and we always keep bringing up things that we really shouldn't be concerning ourselves with because we'll just screw them up.
But the community needs a better starting point. A lot of us know that there's a universe of stuff that we don't know about crypto, and we don't blithely imagine that we're secure because we used XCZ or LSA-j14(3). What we know is that Bob said to use XLQ and everyone says Bob is an expert, so we're gonna use XLQ. But we often come to this information in the middle of a hacker news thread, or on a website that looks like it was designed in 1993. There's no good general starting point that gives us a way to make good security decisions without knowing what we don't know. Does that make sense?
(There are actually a lot of resources that try to be starting points, but without the tools to do a meta-evaluation of which of these is expert and trustworthy, we're back to the same problem.)
And with that attitude nobody ever learns anything.
To me it sounds like a bad excuse for security guys to stroke their own egos: 'Puny mortal, you are nothing. Repent your ways of sin and obey your gods.'
Both of these comments are wrong and right. I think tomjen3's comment accurately conveys how some people feel about an article that starts off with "All the crypto code you’ve ever written is probably broken." "YOU did something WRONG" is a crappy way to educate people. But on the other hand, as engineers/developers we have to learn to separate the tone from the soundness of the advice, because we work with other engineers.
Meanwhile, tptacek makes the very good point that if you ignore this advice because you're offended by it, you're going to end up building insecure systems that will endanger other people's data and possibly worse. But it's an impatient answer and it actually does come off as pretty egotistical. Isn't there some room between "secure" and "incompetent?"
Sorry for being all third-persony. I think you both make valid points, despite the negative tone.
I'm not sure why we should even dignify questions about egotism or how we're discouraging developers from learning. Those issues just aren't relevant. You either built a system that resists attacks or you don't. As Daniel J. Bernstein once said, that may sound harsh, but that's engineering.
They're relevant because security is social as well as technical. If you want the systems that your friends or relatives use to be more secure, then you can't just dismiss someone who may be implementing those systems. Okay, fine, if they're just insulting you, keep moving.
I responded to dignify it because I thought that in spite of the invective, there's a valid point about whether the article is helpful to the people it's meant to reach. The title is needlessly insulting to the reader. The tl;dr is pretty useless. You don't learn to do things right by cargo culting a mantra that you don't understand. The content of the tl;dr should be the block quote starting under "That said, what modes should you be using?"
Yes, you're right, this is actually pretty irrelevant to the content of the article and the question of whether a particular system is secure. I think it connects to a larger issue about security education that's lurking out there, though, and the article is clearly meant to educate.
So tptacek, you've currently got 37 comments on this story. Why? What are you trying to accomplish here today? If you have it in mind that you're educating people or improving the community, maybe take a step back. Your tone is condescending and combative. It reads like you're just arguing on the internet, and it's bringing down the general level of discourse. "tptacek" is normally a name that I associate with thoughtful comments so I take it that you're having a bad day.
Pretty much any software being built that touches the internet involves cryptography. So pretty much any software. It's an important subject for any software engineer, and there's a lot of good available to be done by helping educate engineers at every level of experience.
People write comments that say things like "pretty much any software being built that touches the internet involves cryptography". In the context of the thread, that statement is not just wrong, but dangerously wrong. So I write a comment saying why.
Most of the time, it is easier to dash off a short comment that says something wrong, like, "pretty much any software" is going to involve grappling with cryptosystems, than it is to write a comment that thoroughly refutes that wrongness.
Also, the space of possible wrong things you could write, like, "there's a lot of good available to be done by helping educate engineers" about how to write bespoke custom cryptosystems, is much larger than the space of things you can write that are even strictly speaking correct. So I'm at a double disadvantage.
Ultimately, while I am happy to hear that you find my other comments helpful, I just do not care that you find my condescending, combative, or overly prolific on this thread. Deciding what to say based on what might or might not make random anonymous HN users happy is simply no way to be.
I don't know why you're talking if you don't care how it is received.
I'm a professional software engineer. I work on a system that occasionally passes secrets through untrusted contexts, encrypted with AES-256-CBC. Is that a good idea? Could we improve it? Would it be worth the effort? I'm open to learning, but this article isn't teaching, it's browbeating. So are your comments.
I don't think anybody is saying that you shouldn't strive to understand crypto. I think the overall message is don't build systems with crypto unless you understand it, or have someone in your employ who does. If you don't understand crypto, you will likely implement it incorrectly. It's as simple as that.
That's unfair - there are two things to learn here:
1. How do the basics and intricacies of crypto work? acabal can hit up Coursera for that, and the crypto guys here will probably help him with his homework for free.
2. How do I implement a user authentication mechanism from scratch? Well, that's the one you can't learn your way into. "Don't try" seems to be the advice.
(in fact the simplest advice is on here too - authenticate once, hand over a random number, look it up at the back end.)
I assume that includes: time it out, count it out, and then go read about CSRF protection.
The point of this is you shouldn't even be using algorithms directly. If you need to touch crypto at all you need to learn about it, and the more you learn about it the more you'll realise just how damned hard it is.
Absolutely use common programs like GPG and common libraries like OpenSSL. In fact don't use anything else, and don't even think about implementing this stuff yourself.
I agree some sort of authoritative set of info would be good. Very hard to do though.
It does not then follow that when someone refuses to explain something to you like you are their grandmother that they do not really understand what they're talking about.
Also I'm pretty sure that's Diablo Cody's quote, not Einstein's.
There's a difference between explaining to an eight-year-old what you're doing and instructing that eight-year-old in how to do the same thing safely without spending a long time on his education.
The real problem is the software libraries are just as hard.
Take OpenSSL for example... it's almost easier to learn the crypto than to figure out the API. Here's a good one: an SSL read/write operation can fail with more than one error, and if you don't clear or loop through all the errors, then the next operation will appear to fail because of the leftover errors -- even if it actually succeeded. Or just try getting it to work with non-blocking sockets: you finally believe it's working, and surprise, it fails only once the network gets saturated. Or hours later, when it renegotiates the crypto.
And you still have to know all the crypto terms to use it. What's a PEM? A BIO? PKCS? DHparams? What's "ASCII armor"? X509? Did SSL_library_init() add weak algorithms? Why do I have to know this just to create a secure connection?
Most of the blame for crypto problems belongs to the libraries, not the developers using them.
NACL and Keyczar are two viable alternatives to DIY crypto. I'd use NACL before I considered trying to DIY an AEAD secure transport; there are more things to get wrong than properly authenticating your data!
You are always better off not working directly with crypto at all. For that, consider using PGP/GPG for data-at-rest (if you're in a JVM language, you can use Bouncycastle to get access to PGP) and SSL/TLS for data-in-motion.
In our practice, password reset tokens and encrypted session cookies continue to be the top source of exploitable crypto vulnerabilities in web applications. You don't need encryption to build either of these features; send 128 bit random numbers that key a database row instead.
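The advice above needs no cryptography at all, which is the point. A minimal sketch, with an in-memory dict standing in for the database table and the function names my own invention:

```python
import secrets

# Stand-in for a database table mapping token -> user id.
reset_tokens = {}

def issue_reset_token(user_id: int) -> str:
    # 128 bits from the OS CSPRNG; there is nothing here to decrypt or forge.
    token = secrets.token_hex(16)
    reset_tokens[token] = user_id
    return token

def redeem_reset_token(token: str):
    # The token carries no data; it only keys a server-side row.
    # Pop it so each token is single-use.
    return reset_tokens.pop(token, None)
```

A real implementation would also expire the row after some interval, but notice that there is no key management, no IV, and no mode to get wrong.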
If you use Go, instead of hacking together things with AES from the standard library to encrypt something, just `go get code.google.com/p/go.crypto/nacl/secretbox` and call secretbox.Seal with unique [randomly generated is okay] nonce and your key to get the encrypted and authenticated "box". secretbox.Open with the same key and nonce will open it.
Bonus: on amd64 it's extremely fast (portions are written in assembly, and even the pure Go versions for other platforms are hand-optimized), faster than what you can get from any non-hardware AES implementation (let alone one authenticated with HMAC), and it provides a better security margin.
We ( http://bu.mp/company/ ) used NACL client and server side when we built our new app ( https://theflockapp.com/ ). It's a pretty great library--a shame it's not more widely used.
> In our practice, password reset tokens and encrypted session cookies continue to be the top source of exploitable crypto vulnerabilities in web applications. You don't need encryption to build either of these features; send 128 bit random numbers that key a database row instead.
If I'm not mistaken (and I haven't read up on this stuff in years, so I probably am), the majority of encryption modes rely on XORing the stream of bits from the cipher with your plaintext.
In that way, both sender and receiver need only generate the same cipher bits and apply XOR to encrypt and decrypt (meaning encryption and decryption are actually identical operations!). A side effect of XOR is that a single bit flip in the ciphertext corresponds exactly to a single bit flip in the cleartext. An attacker with knowledge of your cleartext can therefore modify it without ever needing to know the cipher parameters.
Imagine a session cookie that contains a single 32-bit integer, the user ID. The attacker knows his own user ID, so he merely needs to XOR the cookie with his ID, then XOR it again with his desired ID, and voila: admin privileges. Wrapping the cookie in a MAC prevents this kind of manipulation.
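A toy model of the attack just described, with a random keystream standing in for the cipher's output (no real cipher here, but the XOR algebra is identical):

```python
import secrets

def xor(data: bytes, other: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, other))

# Toy stand-in for a stream cipher: a keystream known only to the server.
keystream = secrets.token_bytes(4)

# Server "encrypts" a cookie holding the attacker's user id, 1234.
cookie = xor((1234).to_bytes(4, "big"), keystream)

# The attacker never sees the keystream. He XORs out the plaintext he knows
# (his own id) and XORs in the one he wants (id 1); the keystream cancels.
forged = xor(cookie, xor((1234).to_bytes(4, "big"), (1).to_bytes(4, "big")))

# The server decrypts the forged cookie and sees user id 1.
recovered_id = int.from_bytes(xor(forged, keystream), "big")
```

Any MAC over the cookie would have rejected `forged` before the server ever looked at the id.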
I'm not sure I totally follow this (you seem to be talking about an attack on CBC mode, but the mode you're describing sounds more like CTR mode), but a good rule of thumb is, without explicit authentication, attackers can alter and often rewrite messages even though all they can see is ciphertext.
But there are even more problems than that with unauthenticated encryption. If you don't authenticate there is a good chance attackers will be able to decrypt your messages wholesale.
If they can get your system to tell them if a message is valid somehow, perhaps by making thousands of attempts to pass a message and noting where it says 'login failed' or '404' instead of invalid message (for instance) then there are all sorts of things that can be done to recover messages and keys.
I highly recommend Dan Boneh's crypto 101 on Coursera for anyone who has the time.
The CBC padding oracle is one such attack. There are a bunch of similar ones. They're "chosen ciphertext" attacks.
Again, even if you get this part right, there are other things that go wrong. TLS is authenticated, and it fell to two adaptive chosen plaintext attacks because of two different implementation details they messed up. And no public cryptosystem in the world has been as thoroughly tested and analyzed as TLS.
For some mind-boggling reason, the designers of the XML Encryption standard decided to make authentication optional, so an attacker can simply avoid sending an incorrect MAC.
The simplest way to think about it is during the receiving process: checking the MAC before decrypting catches and rejects evil messages earlier, so there are fewer things that can (and will) go wrong. Even if you tell the world that the message was invalid, you don't reveal any new information to the attacker.
OTOH, if the receiver must decrypt before checking the MAC, any information leaked to the attacker (success or failure, timing, etc) is very likely to give the attacker a systematic method to decrypt some or all of your secret plaintext.
Missing from this blog post, but made clear here (http://blog.cryptographyengineering.com/2012/05/how-to-choos...) is that you need to USE TWO DIFFERENT KEYS, one for the cipher, the other for the MAC. This makes it practically impossible to forge a valid message, since even if someone forged an apparently valid cipher section they'd be unable to generate the corresponding MAC (unless they already had your keys, in which case you're already toast).
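As a sketch only -- this is exactly the kind of construction the thread says to leave to NaCl or Keyczar -- here is the shape of encrypt-then-MAC with two separate keys, checking the MAC before any decryption happens. The "cipher" below is a toy hash-counter keystream I invented for illustration, not a real one; only the order of operations is the point.

```python
import hashlib
import hmac
import secrets

enc_key = secrets.token_bytes(32)  # one key for the cipher...
mac_key = secrets.token_bytes(32)  # ...and a separate key for the MAC

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy hash-counter keystream standing in for a real stream cipher.
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def seal(plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    ct = bytes(a ^ b for a, b in zip(plaintext, keystream(enc_key, nonce, len(plaintext))))
    # The MAC covers everything the receiver will act on: nonce + ciphertext.
    tag = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def open_sealed(blob: bytes) -> bytes:
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    # Constant-time MAC check *before* touching the ciphertext.
    if not hmac.compare_digest(expected, tag):
        raise ValueError("invalid message")
    return bytes(a ^ b for a, b in zip(ct, keystream(enc_key, nonce, len(ct))))
```

Note that a forged or bit-flipped message is rejected before a single byte is decrypted, so there is no decryption behavior for an attacker to probe.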
Not to mention: if you're doing it right, you really don't have to think about whether you have two keys; you just naturally kind of do. It's the systems where you consciously have to handle two different keys that you have to worry about, because they're probably not keying themselves safely to begin with.
Security is a series of tradeoffs. Remember the Debian SSL fiasco? Turns out even when every Debian/Ubuntu user's SSH keys were essentially useless, the world did not explode. Or consider the state of the internet in the '90s and early 2000s - security was basically a joke, and the article's author would've burst an artery raving about all the ways YOU ARE PWNED RIGHT NOW. Oh, and it looks like people are okay with Dropbox even though it occasionally fails horribly and lets you see other people's files.
It turns out for the vast majority of use cases, even broken crypto is Good Enough. That has to do with two things.
For you to be worrying about initialization vectors and block modes you need to first have a reason to be working at that level - answer "no" for most webapp use cases.
Secondly, and ignored by security fearmongers: you need to have something of value commensurate with the effort you've put toward security.
This is the mindset of a cryptography professional: true, chances are what you're using is broken. But whether that's a problem depends on who you are. If you're the NSA, you might be concerned with crazy shit like China building quantum computers and factoring all your primes (I'm making this up). Whereas if you're the latest social.ly startup, your users have no privacy anyway, and security is more of an image concern (it looks bad to be on the front page of HN with a security bug) than a real one.
No, it does not turn out that for the vast majority of cases, broken crypto is good enough.
It turns out that for the vast majority of cases, broken crypto schemes don't have enough users for the tiny minority of software security people who look at crypto to bother beating up.
Once one of those systems becomes popular, crypto pentesters finally get around to poking at them, and, lo and behold, thousands of users are discovered to have been communicating effectively in the clear for years.
It's one of the more pernicious evil memes in our field that "if you're not the NSA, everything you do is broken, so just try the best you can". No. The attacks we're talking about take 50-200 lines of Ruby code and less than 30 seconds to run. There's no excuse for being exposed to them. We mercilessly mock people who screw up SQL queries, but then act like it took a network of intelligence service supercomputers to break our amateurish cryptosystems. We'd all be better off if fewer people tried their hand at building these systems in the first place. Use PGP, or Keyczar, or NACL. If you're typing the letters "A-E-S" into your code, you're doing it wrong.
Well, we're in agreement there - and I'd go further to say if you find yourself concatenating strings and feeding them to a hash algorithm you're probably doing it wrong too. It's insecure and it's more work, and it's stupid.
I just wanted to point out the more general notion that, from a business perspective, even a colossal failure like Dropbox's is more of a PR disaster than anything else. And until that equation changes, security will be at the bottom of the totem pole.
Right. To paraphrase: it turns out that for the vast majority of cases the "security holes" in the system aren't due to broken crypto. So what's the practical difference?
I tend to view things like this the same way. Yes, all that advice is good. But all the effects that the listed minutiae deal with are way, way down the priority list of any real-world application. Specifically they're well below the threshold where a compromise of the user's computer via other means and/or a compromise of the human communication processes involved is a bigger threat.
The MAC example is a good one. You're at the point where you're trying to protect your app against a chosen ciphertext attack (pretty far up the sophistication scale), and trying to decide the MAC mode. The text in the article makes you sound like a total idiot for not knowing what the "right" choice is and making you worry that you might not get it right.
And then you get hacked because one of your admins had a ssh key on her phone and it got stolen.
Too much security analysis is missing Big Picture issues...
That's kind of the point though. If you're trying to decide your MAC mode you're already doing it wrong and should be looking at something higher level that does it for you.
And yet it still doesn't protect you from the stolen phone (or whatever -- bad password or password in email, SQL injection on the backend, bribed employee, etc...), which, I argue, is an immensely greater practical threat. So why do we keep reading blog posts about cryptographic minutiae instead of the real threat?
Obviously it's because the cryptography as a field is technically interesting and "security best practices" isn't. And there's nothing wrong with that from the perspective of someone looking for interesting links on HN. But honestly I feel like that kind of tunnel vision has reached the point where it's actually hurting security practice rather than helping.
Seriously?
I'm not sure I know how to argue with this.
It doesn't bother you that, if you do it wrong, by watching a bit of traffic and sending a few thousand page requests I might be able to impersonate any user on your system?
The linked post isn't even an interesting or exciting thing about crypto, it's not even news, it's just reiterating the usual thing - you shouldn't be doing this yourself.
In fact the linked blog post is exactly about best practices.
If you don't know how to argue with a point, maybe it's because you're not in an argument. :)
Obviously it "bothers me" that crypto is easy to get wrong. My point was that other things bother me more, and I don't think this genre of blog post (or your very typical reaction to criticism thereof) is helpful to improving security. See my other post -- are you one of those little BOfH monsters enabled by a little crypto knowledge? Are you sure?
How is this helpful? How does it help to point out that so many of the people who are deploying crypto are so boned that their systems are compromised before the crypto even comes into play? Help me understand this. If you can't get the basic systems programming concerns of your software right, what business does your system have telling users that it's "cryptographically secured"?
First: I agree with all of your practical advice. I'm making a meta point to security people like you and the author of the linked post.
And I'm not sure it's "helpful", really. It's more musing on whether or not this kind of advice is hurting more than helping. What you are saying amounts to "Crypto is really hard, so use expert-authored solutions." But in my experience what people hear is "Crypto is Hugely Important and I'm using an expert-authored solution and using the same jargon, so you need to listen to me about all that security stuff and do what I say even though it's totally impractical."
Broadly, I guess I'm thinking that this creates little BOfH monsters, where a more nuanced, "big picture" frame might engender more thought about costs and tradeoffs.
I would not have written the same post Tony wrote.
Generally: I'm comfortable writing about crypto when the subject is "how you would practically break a system that makes mistake X or mistake Y". I'm not comfortable about posts with prescriptive content.
I am also not comfortable with posts that condone building cryptosystems out of primitives, even when they limit the solution space to well-regarded tuples of those primitives. If I had to write a prescriptive post about crypto, it would state clearly: you cannot DIY this, and you must use a vetted cryptosystem; your choices include NACL, Keyczar, and PGP.
It's very good that the author is trying to get the word out, but it would help tremendously if he tried to at least give some indication (or links to further material) of why certain modes are insecure.
The people who already know this information don't need this blog post, and the ones who don't get no information that helps them understand the problem. I think stating what can go wrong with, for example, ECB would make it much easier for people to remember which mode is good and which is bad. As it is at the moment, one would have to remember a few cryptic acronyms as "good" and others as "bad", without any context. (Or go look it up somewhere, but the people looking these things up are not the ones in need of being alerted...)
If you have to explain to someone why they shouldn't be using ECB, it's hopeless. Instead, you need to be educating them as to why they should be using someone else's whole cryptosystem, like NACL or Keyczar or PGP.
Crypto is a place where a little knowledge is often harmful. You tell someone not to use ECB, so they use CBC instead. Now they're vulnerable to a bunch of new practical attacks. You list those off, and they switch to CTR, and their stuff is as trivially decryptable as simple XOR encryption.
What's the point? It's better to give people the (honest, accurate) impression that they don't know anything resembling enough to build a cryptosystem so they won't try, fail, and screw over their users.
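To make the "as trivially decryptable as simple XOR" claim above concrete: CTR mode with a repeated nonce reuses its keystream, and an attacker can cancel that keystream without ever seeing it. Here a random byte string models the reused CTR output:

```python
import secrets

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Reusing a nonce in CTR mode means reusing the keystream; model that directly.
ks = secrets.token_bytes(16)
c1 = xor(b"send $100 to bob", ks)
c2 = xor(b"meet me at noon.", ks)

# The attacker never sees ks, but XORing the two ciphertexts cancels it,
# leaving the XOR of the two plaintexts. With one known plaintext (a "crib"),
# the other message falls out immediately.
leaked = xor(c1, c2)
recovered = xor(leaked, b"send $100 to bob")
```

This is the classic two-time pad: one repeated nonce and the cipher contributes nothing at all.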
You're interpreting these posts all wrong. There are plenty of things within the field of CS that are sufficiently complex and difficult to get right that your average Joe Coder shouldn't be attempting to write them in production code. There isn't some shroud of difficulty around crypto in particular. For instance, in just about any project, getting ANY security right, not even considering crypto, is typically an overwhelming task.
The point is that when people read this blog on the front-page of HN, they expect to learn something. 99.99% of us aren't writing our own crypto. But this is a way of learning. I've been told from basically everyone in the industry since a young age that you shouldn't write your own crypto. As a result, I educated myself to understand how I might get it wrong so I can understand when things are wrong, and so that I could understand crypto news. Similarly, as a web dev, I educated myself heavily on common vulnerabilities. The end result is that no, I can't say that everything I make is 100% secure, but it's certainly a lot better than if I didn't do those things. At least now I can actually see when myself and others might be causing a vulnerability.
You seem to be hand-waving things and telling people "STAY AWAY." Maybe, just maybe, instead of treating people like they're all idiots, enlightening them might work a little better. You've put a warning on the label, the blog post is NOT a guide on how to make crypto, but a guide in how you'll probably get it wrong. Now proceed with the juicy details. Maybe I don't know why I shouldn't be using ECB because I don't know what it is because I don't do crypto. Maybe I'm not implementing my own block cipher, or maybe I'm just curious about the different modes of block ciphers and which are good/bad and why. Maybe with a better understanding I can even explain it to other people rather than just telling them "you shouldn't do your own crypto," and look like an idiot when I have nothing to say when they ask "why?"
Applying your reasoning to computer science generally: some people end up as net-negative engineers who create more bugs and problems than they solve. Obviously, we should just STOP TEACHING CS to anyone because such people exist, right?
Tony's post is prescriptive. "Do this, not that". It's not "let's explore all the intricacies of this problem".
I am interested in giving people the right advice to build systems that won't blow their hands off in production a year from now. I am at the moment less interested in handholding them through a guided tour of the last 10 years of crypto vulnerabilities.
You'd see the same phenomenon occurring in message board threads about DIY nuclear reactors. Except that it's hard to build a DIY nuclear reactor, and very few people do it, so we don't need those threads very often, and we don't need to sort through the "LET'S JUST NOBODY LEARN ABOUT NUCLEAR POWER THEN HUH?" comments.
Unfortunately, it is very easy to build new crypto applications, and people do it, and then a year or two later we see threads from dissident groups in South America where people have been interrogated by police organizations that have full decrypts from those tools, and then a few more weeks later we find out that the tools were doing comically stupid things with their crypto building blocks.
I think educating people about some pitfalls and then recommending them a "full package" solution will stick better.
For example, for ECB it's trivial to explain that it allows reordering of encrypted blocks -- an attack that someone not familiar with cryptography has probably never thought about. Of course they won't gain a full understanding of cryptography this way, but they will hopefully start to appreciate how difficult it is to get right, and how many attacks exist that they have never thought about. This hopefully includes an understanding that other modes they think would be safe might also have problems they have never considered, and that they should really use a full-package solution.
I think this will drive the point home better than, say, listing three opaque possibilities on how to use a MAC and then saying "no 1 is the right one", without any explanation whatsoever.
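The block-reordering property of ECB described above is easy to demonstrate concretely. The sketch below uses a toy 4-round Feistel "block cipher" built from SHA-256 -- this is NOT a real cipher and only stands in for AES so the example needs nothing outside the standard library. The point is the mode, not the cipher: in ECB, each block is encrypted independently, so an attacker with no key can reorder ciphertext blocks and the result still decrypts cleanly.

```python
import hashlib

# Toy 16-byte block cipher: a 4-round Feistel network built from SHA-256.
# NOT a real cipher -- a stand-in for AES so the ECB property is visible
# without external libraries.
BLOCK, HALF = 16, 8

def _round(key: bytes, r: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([r]) + half).digest()[:HALF]

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def enc_block(key: bytes, block: bytes) -> bytes:
    L, R = block[:HALF], block[HALF:]
    for r in range(4):
        L, R = R, _xor(L, _round(key, r, R))
    return L + R

def dec_block(key: bytes, block: bytes) -> bytes:
    L, R = block[:HALF], block[HALF:]
    for r in reversed(range(4)):
        L, R = _xor(R, _round(key, r, L)), L
    return L + R

def ecb_encrypt(key: bytes, msg: bytes) -> bytes:
    # ECB: every block encrypted independently -- this is the flaw.
    return b"".join(enc_block(key, msg[i:i + BLOCK])
                    for i in range(0, len(msg), BLOCK))

def ecb_decrypt(key: bytes, ct: bytes) -> bytes:
    return b"".join(dec_block(key, ct[i:i + BLOCK])
                    for i in range(0, len(ct), BLOCK))

key = b"secret"
msg = b"TO: alice@bank  " + b"AMOUNT: $100.00 "  # two 16-byte blocks
ct = ecb_encrypt(key, msg)

# Attacker swaps the two ciphertext blocks -- no key required:
swapped = ct[BLOCK:] + ct[:BLOCK]
assert ecb_decrypt(key, swapped) == b"AMOUNT: $100.00 TO: alice@bank  "

# And identical plaintext blocks always yield identical ciphertext blocks:
ct2 = ecb_encrypt(key, b"A" * 32)
assert ct2[:BLOCK] == ct2[BLOCK:]
```

The second assertion is the other classic ECB failure: repeated plaintext structure shines straight through the ciphertext (the famous "ECB penguin" image).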
The old "interview screenplay"[1] post on the Matasano blog had a particularly compelling image of why ECB sucks. Sadly, it doesn't seem to have survived the 3 years of blog herd migrations.
The argument you're replying to doesn't require that the explanation be correct, does it?
What they seem to be saying is: feeding someone a little entertaining extra information somehow helps make the recommendation to use a 3rd-party library more convincing.
And I think there's something to that argument, even though I've presented it poorly above. People don't like to simply be told to do something; they like to be indulged a little.
I realise that goes somewhat against your personal style, but it's not obviously wrong...
I get what you are saying, but the idea that people should not try to learn is ridiculous. If someone is truly interested, they need to be told to go ahead. Encourage caution, but not ignorance. When someone asks why ECB is bad, tell them, or tell them to go read up on it.
No. I've never seen anyone criticize someone for working on, and asking questions about, attacks on crypto. It's only when people "Show HN" their latest tool for protecting dissidents from foreign governments, or protecting financial transactions, that people catch flak for going the DIY route.
Authentication and crypto are areas where I know I'm nowhere near good enough to tell whether what I'm doing works, other than trying to find out what the best practices are and using what seems to have a good reputation. But the only constant in this field seems to be that everything I thought I knew was fatally, horribly flawed from the start, and that I might as well have stuck all of my passwords in a plaintext file called "passwords.txt", because the only thing keeping my site from getting rooted was having a terrible site nobody ever visits.
The real problem is the lack of a REAL crypto API: one that does not suck as an interface, and with which, <b>using default parameters</b>, you get good security. That is the problem. All the ones I know are too low-level.
That is an engineering problem: there are the pieces but there is no real engine.
But this is my biased and humble opinion after using OpenSSL.
I am not advocating a 'just works one way' API. I am asking for a real engineering effort to create an API that 'dumb people can use safely as long as they follow the simple instructions'.
If you (developer) need to know why encrypt+mac != mac+encrypt, then the security engineers have not done their job. If you need to know the difference between ECB and CBC, the security engineers have not done their job. If you need to know about the IV in AES, then ... (just repeat).
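For what it's worth, the encrypt-then-MAC composition the comment alludes to is small enough to sketch. The keystream below is a toy (SHA-256 in counter mode, NOT a real cipher) standing in for something like AES-CTR so the example is stdlib-only; the composition pattern -- separate keys, HMAC over nonce plus ciphertext, constant-time verification BEFORE any decryption -- is the part a good high-level API would bake in for you.

```python
import hashlib
import hmac
import secrets

def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # Toy keystream (NOT a real cipher): SHA-256 in counter mode stands in
    # for AES-CTR so this sketch has no dependencies outside the stdlib.
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def seal(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    # Encrypt-then-MAC: the tag covers the nonce AND the ciphertext.
    nonce = secrets.token_bytes(16)
    ct = _xor(plaintext, _keystream(enc_key, nonce, len(plaintext)))
    tag = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def unseal(enc_key: bytes, mac_key: bytes, sealed: bytes) -> bytes:
    nonce, ct, tag = sealed[:16], sealed[16:-32], sealed[-32:]
    expect = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    # Verify BEFORE decrypting, with a constant-time comparison:
    if not hmac.compare_digest(tag, expect):
        raise ValueError("authentication failed; message rejected")
    return _xor(ct, _keystream(enc_key, nonce, len(ct)))

ek, mk = secrets.token_bytes(32), secrets.token_bytes(32)
box = seal(ek, mk, b"hello")
assert unseal(ek, mk, box) == b"hello"

# Any tampering is rejected before decryption even starts:
tampered = bytearray(box)
tampered[16] ^= 0x01
try:
    unseal(ek, mk, bytes(tampered))
    assert False, "should have been rejected"
except ValueError:
    pass
```

Note the separate encryption and MAC keys: reusing one key for both roles is exactly the kind of subtlety the comment says developers shouldn't have to know about.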
You could make a much simpler and more powerful statement:
All the code you've ever written is broken.
Since cryptographic code is harder than most code, and getting even simple code right is hard, there is an extremely small chance that you'd get crypto code 100% right.
With most code the consequences are not all that bad, with cryptographic code the consequences can be disastrous.
In my experience it's news to something like 90% of all developers, who can generally be counted on to say things like "we used AES so not even the US military can break our encryption".
I learned a lot by doing the Coursera course on it earlier this year, including an intro to a whole new area of mathematics. Some sort of intro to crypto techniques and uses really ought to be mandatory for a lot of devs, certainly anyone tempted to use anything other than a well-coded TLS API or pre-provided GCM interface.
--edit--
In fact it's also very important that even when you do know what you're doing (to a greater extent than the entirely uninitiated), you don't implement this stuff yourself, because there are more attacks than you can possibly imagine.
This post does not go at all into the "why" of authenticated encryption. If you're interested in more of the reasons for authenticating, including actual attacks it protects against, I found http://blog.cryptographyengineering.com/2012/05/how-to-choos... to be a good explanation of this.
Crypto: If you don't know exactly what you're doing, you're not doing it at all, no matter how hard you try. It's very hard to know if you know.
It's so hard that even when you try to "leave it to the experts" you often do that wrong as well. There seems to be no other place in programming where best practices are more important to study and observe.
Because if your ciphertext isn't authenticated attackers can modify it to launch adaptive attacks against you. Also, obviously, they can potentially rewrite messages.
This is one of those situations where if you have to ask why you'd use authenticated encryption or how you'd do it, you simply shouldn't be using cryptography primitives directly at all. Use PGP/GPG instead.
If that's what you think this post is about, stay away from crypto. I don't mean that offensively. Most developers should stay the hell away from crypto. They all know intuitively that they shouldn't be hacking together, say, kernel modules. But since crypto comes in library form, they assume it's safe to build with it. It isn't.
I wouldn't touch cryptography code in production with a long stick. Thing is, the headline claim that all the crypto code you've written is broken is sensationalist, and doesn't apply to the 90% of developers who "write crypto code" using third-party libraries.
No. The "third party libraries" you're talking about are, from my experience (my whole job is looking at random applications deployed in production for problems like this) things like OpenSSL and crypto++ and CryptoAPI and Common Crypto.
Tony isn't complaining that developers implement the AES cipher core themselves and screw it up, or even that they're implementing their own block cipher mode code. He's saying that even when you use third-party libraries, if you haven't chosen a library that is specifically designed to avoid mistakes, you are boned. And he's absolutely right about that.
Also, from experience: virtually nobody uses the good high-level libraries. What library did you use last time you built a feature, like a password reset token, that relied on crypto?
What library _should_ I be using for sending out a password reset token?
(I realize that by asking I am proving your point. I would still like to know the answer.)
You shouldn't use crypto at all (for instance, for reset tokens, you can just use long random numbers as keys), but if you have to use a library, use PGP, NACL, or Keyczar.
Take CBC mode for instance. The CBC block cipher mode is a fine choice for bulk encryption. Low-level crypto libraries must provide it, because CBC is the most common block cipher mode used by preexisting cryptosystems (like TLS). And a generic implementation of the CBC block cipher mode must expose the CBC IV.
The library designer hasn't done anything "wrong" by providing a CBC module for use with cipher cores, but if you just pick up CBC mode and AES and go use them, you're likely to build a vulnerable cryptosystem, because the CBC IV must be unpredictable, it must not be possible to introduce controlled padding errors to the ciphertext, and because CBC ciphertexts can be bit-flip rewritten.
The problem is not with the CBC code; the problem is with exposing crypto at the "CBC mode" level of abstraction.
But what's the author of a low-level library to do? There isn't one simple solution to these problems. Some cryptosystems HMAC-SHA2 the CBC ciphertext. Some use cryptographically random IVs and tack them to the beginning of messages. It all depends on the system; the library designer can't predict all the different ways the code will be used, even though most of the ways the code is used will end up being insecure.
This is why developers should use high-level libraries like Keyczar and NACL; by sacrificing low-level control and compatibility with legacy cryptosystems, they allow generalist developers to not have to worry about all the implementation attacks.
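The "CBC ciphertexts can be bit-flip rewritten" claim above can be made concrete. The sketch below reuses a toy 4-round Feistel network as the block cipher (NOT a real cipher; a stdlib-only stand-in for AES) and implements CBC by hand: since each plaintext block is XORed with the previous ciphertext block (the IV for the first block), an attacker who knows the plaintext layout can XOR a chosen delta into the IV and rewrite the first block without the key.

```python
import hashlib
import secrets

# Toy 16-byte Feistel block cipher -- NOT a real cipher, just a stand-in
# for AES so CBC malleability can be shown with only the stdlib.
BLOCK, HALF = 16, 8

def _round(key, r, half):
    return hashlib.sha256(key + bytes([r]) + half).digest()[:HALF]

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def enc_block(key, block):
    L, R = block[:HALF], block[HALF:]
    for r in range(4):
        L, R = R, _xor(L, _round(key, r, R))
    return L + R

def dec_block(key, block):
    L, R = block[:HALF], block[HALF:]
    for r in reversed(range(4)):
        L, R = _xor(R, _round(key, r, L)), L
    return L + R

def cbc_encrypt(key, iv, msg):
    ct, prev = b"", iv
    for i in range(0, len(msg), BLOCK):
        prev = enc_block(key, _xor(msg[i:i + BLOCK], prev))
        ct += prev
    return ct

def cbc_decrypt(key, iv, ct):
    msg, prev = b"", iv
    for i in range(0, len(ct), BLOCK):
        msg += _xor(dec_block(key, ct[i:i + BLOCK]), prev)
        prev = ct[i:i + BLOCK]
    return msg

key, iv = b"secret", secrets.token_bytes(BLOCK)
ct = cbc_encrypt(key, iv, b"role=user;      ")

# Attacker knows the plaintext layout but not the key. XORing a delta
# into the IV XORs the same delta into the first decrypted block:
delta = b"\x00" * 5 + _xor(b"user", b"root") + b"\x00" * 7
evil_iv = _xor(iv, delta)
assert cbc_decrypt(key, evil_iv, ct) == b"role=root;      "
```

The same trick works on later blocks by flipping bits in the preceding ciphertext block (at the cost of garbling that block), which is exactly why the ciphertext must be authenticated, and why high-level libraries do it for you.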
Not that it will prevent errors on the part of the implementor, but if you do a thorough reading of the classic "Applied Cryptography" by Schneier, he pretty much covers all of these problems: ECB vs. CBC, IVs, HMACs, etc. It's the definitive book on the topic, and anyone who is interested in cryptography should buy it.
No, he famously and openly admits to not doing this. _Applied_ is not a definitive book, and _Practical Cryptography_ (now _Cryptography Engineering_), which he co-authored, starts out with a disclaimer to this effect.
Interesting, I'm admittedly a few years out of date, the last time I used the book heavily was about 10 years ago... I'll have to pick up this new(er) book.
The tone is not very constructive. I would say, at the very least you should read "Applied Cryptography" by Bruce Schneier and the most recent NIST recommendations.
But if someone chooses a ciphertext (without knowing the key), how could they get the checksum right (which is also encrypted)? If the checksum was wrong, the system would refuse to decrypt the corrupted message.
No. There's no such thing as "refusing to decrypt a corrupted message"; cipher cores either produce an expected plaintext or, if the message is corrupted, an unexpected plaintext. Attackers can use changes in behavior based on the differences between different unexpected resulting plaintexts to infer the original plaintext.
My question was: why is a proper MAC better than appending a checksum to the plaintext and then encrypting? With an appended and encrypted checksum, the system could easily reject corrupted messages; that's the whole point of a checksum in the first place.
So why is a proper MAC better than appending a checksum or hash to the plaintext and then encrypting? Or maybe I am misunderstanding something?
I've answered that already. The MAC allows the cryptosystem to reject messages with corrupted ciphertexts. If you don't do that, it can be possible to use controlled corrupted ciphertexts to learn things about the plaintexts of messages, and not just via CBC padding oracles.
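There's also a sharper answer to the checksum question: a checksum inside the encryption is not a MAC, because common checksums are linear. CRC-32 satisfies crc(m ^ d) == crc(m) ^ crc(d) ^ crc(zeros), so under any stream cipher an attacker who knows part of the plaintext can XOR a chosen edit into the ciphertext and patch the encrypted checksum to match -- without the key and without ever decrypting. This is essentially the WEP flaw. A stdlib-only sketch (the SHA-256 counter keystream is a toy standing in for any stream cipher):

```python
import hashlib
import zlib

def _keystream(key: bytes, n: int) -> bytes:
    # Toy keystream (NOT a real cipher), standing in for any stream cipher.
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt_with_crc(key, msg):
    # The "encrypted checksum" design under discussion: CRC appended to
    # the plaintext, then everything XORed with the keystream.
    body = msg + zlib.crc32(msg).to_bytes(4, "big")
    return _xor(body, _keystream(key, len(body)))

def decrypt_checked(key, ct):
    body = _xor(ct, _keystream(key, len(ct)))
    msg, tag = body[:-4], body[-4:]
    if zlib.crc32(msg).to_bytes(4, "big") != tag:
        raise ValueError("checksum mismatch")
    return msg

key = b"secret"
ct = encrypt_with_crc(key, b"pay alice $100")

# Attacker (no key) flips "$100" to "$900". CRC-32 is linear over XOR:
# crc(m ^ d) == crc(m) ^ crc(d) ^ crc(0...0), so the encrypted checksum
# can be patched to stay valid.
delta = bytearray(14)
delta[11] = ord("1") ^ ord("9")
delta = bytes(delta)
tag_fix = (zlib.crc32(delta) ^ zlib.crc32(b"\x00" * 14)).to_bytes(4, "big")

forged = _xor(ct, delta + tag_fix)
assert decrypt_checked(key, forged) == b"pay alice $900"
```

An HMAC over the ciphertext has no such linear structure, which is why the forgery above is impossible against a properly MACed message.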
Congratulations on your first blog post! It looks pretty good, very attention grabbing. Please follow up with some more focus on the positives. I will be very interested to read more information on the correct way to implement crypto in all its subtle contextual glory.