Previous discussion here: https://news.ycombinator.com/item?id=1346711
And here: https://news.ycombinator.com/item?id=2253336
The first thing I did after the Snowden leaks was read through the entire thing, and after doing so I really wished I had done it years earlier. There are very few books that I think should be required reading across the board for software engineers, but this is one I do think everyone writing code should read every page of.
In general, the authors seem to subscribe to the "crypto is black magic" school of thought, which doesn't make for good pedagogy.
I found the course pretty hard as a programmer with a strong interest in crypto but no formal CS/maths background. The coding pieces were fairly straightforward, but the maths hurt.
It is in my "summer reading" pile.
x |= MAC_computed[i] - MAC_received[i];
x |= MAC_computed[i] ^ MAC_received[i];
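Wrapped in a full routine, the accumulate-with-OR idiom those lines sketch looks like this (a minimal sketch; the function name is mine):

```c
#include <stddef.h>
#include <stdint.h>

/* Constant-time MAC comparison: OR the byte differences together so
 * the loop always runs over every byte, regardless of where a
 * mismatch occurs.  An early-exit memcmp() would leak the position
 * of the first differing byte through timing. */
static int mac_equal(const uint8_t *computed, const uint8_t *received,
                     size_t len)
{
    uint8_t x = 0;

    for (size_t i = 0; i < len; i++)
        x |= computed[i] ^ received[i];

    return x == 0; /* 1 if every byte matched, 0 otherwise */
}
```

The running time depends only on len, never on where (or whether) the two MACs differ.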
For example, is the suggested key size still the same as in 2010?
Considering this, what do you think has changed in the past 5 years? E.g., the slides mention SHA3 as a future option; now it is finalized and, afaik, usable.
There is also one interesting thing about SHA3 which I learned recently, namely that it can apparently be easily used as a MAC too:
> Unlike SHA-1 and SHA-2, Keccak does not have the length-extension weakness, hence does not need the HMAC nested construction. Instead, MAC computation can be performed by simply prepending the message with the key
In this light, do you think it would be reasonable to add an exception to the "DON’T: Try to use a hash function as a symmetric signature." rule?
Another thing is that ECC seems to be on the rise (or is it just my perception?). Do you think a revised slide set would include something more about elliptic curves?
Not much has changed. I still think SHA3 is worth considering 5-10 years into the future; five years ago I was concerned about the implications of MD5 and SHA1 breaks on SHA2, but the lack of recent progress makes me happier with staying on SHA2 while SHA3 gets more analysis.
do you think it would be reasonable to add an exception to the "DON’T: Try to use a hash function as a symmetric signature." rule?
I mentioned this in the talk; even with SHA3 you need to be careful, since a simple "append and hash" would result in MAC("key", "data") == MAC("keyd", "ata"), which breaks the MAC assumptions. Yes, you can use SHA3 as a MAC, but make sure you know what you're doing.
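The MAC("key", "data") == MAC("keyd", "ata") collision is a property of concatenation alone, so any hash exhibits it. Here is a sketch with FNV-1a standing in for SHA3 (FNV is of course not a cryptographic hash; it just keeps the example self-contained, and the function names are mine):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* FNV-1a, standing in for SHA3: the collision below comes purely from
 * byte concatenation, so it would hold for any hash function. */
static uint64_t fnv1a(const uint8_t *p, size_t n)
{
    uint64_t h = 1469598103934665603ULL;

    while (n--)
        h = (h ^ *p++) * 1099511628211ULL;
    return h;
}

/* Naive "append and hash" MAC: H(key || data), with no length field
 * or delimiter separating the two inputs. */
static uint64_t naive_mac(const char *key, const char *data)
{
    uint8_t buf[128]; /* demo limit: klen + dlen <= 128 */
    size_t klen = strlen(key), dlen = strlen(data);

    memcpy(buf, key, klen);
    memcpy(buf + klen, data, dlen);
    return fnv1a(buf, klen + dlen);
}
```

Because "key" || "data" and "keyd" || "ata" are the same byte string, the two MACs are identical; fixing the key length (or prefixing a length field) removes the ambiguity.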
ECC seems to be on the rise (or is it just my perception?). Do you think a revised slide set would include something more about elliptic curves?
ECC is getting more popular; not necessarily for the right reasons, though. (The big drivers seem to be "bitcoin uses this" and "the most common way of using this provides perfect forward secrecy".) That said, as mathematicians continue to attack ECC systems, I am gradually becoming more comfortable; the 2025 version of this talk might recommend using them, but for now I don't think it makes sense to change my recommendation (except for situations like bitcoin which specifically need ECC's advantages).
$ youtube-dl http://blip.tv/blahblah
$ ffmpeg -i file.flv file.ogv
Actually, this is a bit wrong - it's just that this is not cryptography- or security-related and has nothing to do with authentication. It's the same reason some fansubbers still include a CRC32 in the filename - basic unauthenticated integrity checking. Totally insecure, but still better than nothing.
I had a case where a downloaded file was broken - for some weird reason either software, storage, or the network had failed, TCP/IP checksumming didn't catch it, and the fact was, I had the wrong bits on my hard drive. I didn't have broadband connectivity at home, and that cryptographically-pointless CHECKSUM.SHA256 (well, actually it was MD5) helped me find that out before I left the library.
And recently, the same concept helped me validate files from a faulty cloud storage service that managed to lose some chunks and silently replace them with zeroes when I tried to download the data back.
Obviously, a signed file stored on another server would always be a better idea.
As far as I can tell the gotcha with Poly1305 is that you absolutely cannot reuse the same key ever. His construction involves using your keyed cipher to create 32 random bytes by encrypting 32 zero bytes before encrypting the actual message. These random bytes are then used as the one-time-use Poly1305 key. In NaCl he does this with either Salsa20 or AES-CTR, both of which are stream ciphers. (CTR converts a block cipher into a stream cipher.)
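That structure can be sketched with a toy keystream in place of Salsa20/AES-CTR (xorshift64 here is a stand-in and is in no way a secure cipher; all names are mine, and only the layout matters):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy keystream standing in for Salsa20 or AES-CTR keyed with the
 * long-term key and nonce.  xorshift64 is NOT secure; it just makes
 * the sketch self-contained and deterministic. */
static void keystream(uint64_t key_and_nonce, uint8_t *out, size_t n)
{
    uint64_t s = key_and_nonce ? key_and_nonce : 1; /* avoid zero state */

    for (size_t i = 0; i < n; i++) {
        s ^= s << 13;
        s ^= s >> 7;
        s ^= s << 17;
        out[i] = (uint8_t)s;
    }
}

/* NaCl-style layout: the first 32 keystream bytes become the one-time
 * Poly1305 key (fresh for every nonce), and the rest of the stream
 * encrypts the message. */
static void secretbox_sketch(uint64_t key_and_nonce, const uint8_t *msg,
                             size_t mlen, uint8_t otk[32], uint8_t *ct)
{
    uint8_t stream[32 + 256]; /* demo limit: mlen <= 256 */

    keystream(key_and_nonce, stream, 32 + mlen);
    memcpy(otk, stream, 32);             /* one-time Poly1305 key */
    for (size_t i = 0; i < mlen; i++)
        ct[i] = msg[i] ^ stream[32 + i]; /* encrypt with the remainder */
}
```

Because the Poly1305 key is drawn from the keystream, a fresh nonce yields a fresh MAC key automatically, which is exactly the "never reuse the key" property Poly1305 demands.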
> The purpose of cryptography is to force the US government to torture you.
Proper cryptography can keep them from learning that it's you they'll need to kidnap to get the secret, or even keep them from learning that there is a secret they might care about in the first place.
Also, there are plenty of bad guys in the world who can't kidnap and torture you, and it's still quite worthwhile to keep your secrets from them.
Then the only thing you need to worry about is that person dying, in which case the investigation continues. So similar to upping the number of rounds on PBKDF2 every year, you need a new scapegoat every year, or however long it takes them to break either the crypto or the scapegoat.
This is subject to the caveat that ECC offers benefits under certain specific conditions (e.g., you need small signatures or a small ASIC die area); but in those situations you want to talk to a cryptographer anyway. My talk was providing guidelines for software developers who are writing code for general-purpose PC hardware.
First, dedicated AEAD cipher modes are superior to manually composing AES-CTR and HMAC-SHAx. AEAD modes provide both authentication and encryption in a single construction. AES-CTR+HMAC-SHAx involves joining two constructions to do the same thing.
Colin points to AES-GCM, the most popular AEAD mode, as an example of how AEAD modes can be more susceptible to side-channel attacks than AES-CTR+HMAC-SHAx. I don't think this is a strong argument for a couple reasons:
* AES-GCM is, so far as I can tell, no practical cryptographer's preferred AEAD mode; "whipping boy" might be a better role to ascribe to it.
* AES-GCM's side-channel issues are due principally to a hardware support concern (GMAC, AES-GCM's answer to HMAC, requires modular binary polynomial multiplication, which in pure software is table-driven, which creates a cache timing channel). That hardware support concern might be receding, and platforms that don't have AES-GCM might prefer a different AEAD mode anyways.
* Modern AEAD modes, of the sort being entered into CAESAR, have resilience against side channels as a design goal.
* But the most important rejoinder to Colin's preference for HMAC-SHAx over AEAD modes is this: HMAC-SHAx is responsible for more side-channel flaws than any other construction. Practically every piece of software that has implemented HMAC-SHAx has managed to introduce the obvious timing channel in comparing the candidate to the true MAC for a message. What's worse is, using AES-CTR+HMAC-SHAx in practice means that the developer has to code the HMAC validation themselves; it's practically an engraved invitation to introduce a new timing leak. Almost nobody codes their own AES-GCM.
I also think Colin is wrong about ECC, particularly versus RSA.
RSA might be the most error-prone cryptographic primitive available to developers. There are two big reasons for this:
* RSA easily supports a direct encryption transform, which is widely used in practice. ECC software as a general rule does not synthesize an encryption transform from the ECDLP; instead, ECC software uses the ECDLP to derive a key, and then uses something like AES to actually encrypt data. You might be thinking, "that's how RSA software works too". But RSA in practice is subtly different: you can, for instance, easily encrypt a credit card number directly using RSA, or build a crappy key exchange protocol where the client encrypts (doesn't derive or sign, but encrypts) a key for the server. RSA and conventional DLP crypto therefore offer (encrypt, key-exchange, sign); ECC in practice offers just (key-exchange, sign). That's one less dimension of things developers have to get right in ECC cryptosystems.
Colin is smarter than me and much better trained in cryptography than I am. All I do is find and exploit crypto bugs in real software. I am usually as nervous as any of you are to challenge Colin about crypto, but I think I can win these two arguments. (Spoiler alert: Colin doesn't think so).
Both CTR+HMAC and combined AEAD modes, correctly implemented and correctly used, will keep you safe against existing published attacks. Thomas takes the view that "correctly implemented and correctly used" is a problem, and I'll accept that he's right to recommend AEAD modes in that case (as long as you have a good cryptographic library available to you).
I take the view that if you can't take CTR and HMAC and put them together correctly, there's no way the rest of your code is ever going to be secure, so you've already lost; so I focus on the "against existing published attacks" side of things, look at the places where novel cryptographic attacks tend to be found, and opt for combining two very simple and well-understood constructions.
The same story plays out for RSA vs. ECC: Thomas is worried about the fact that people have made dumb mistakes when using RSA, while I figure that if you're going to make those dumb mistakes (especially after my talk) you're going to write code which is otherwise insecure anyway, so I focus on the places where I think it is more likely that attacks will be found in the future.
If you're a high school student with two years of Python experience and you want to add some cryptography to your cat photo sharing startup, listen to Thomas. If you're a senior developer with 20 years of experience writing C code for internet-facing daemons, and the code you write is going to be used by democracy activists in China, listen to me.
 I'm not convinced that such a thing exists right now.
 Or, alternatively, where attacks may have already been found, but not published.
Also, if you asked me who was more likely to get crypto right, the Django web guy or the C daemon guy, I'd bet on the Django guy every time. Betting against crypto implemented in C is like betting when you've made a full house on the flop: all in.
I have to agree with Colin's point.
The standard libraries for the languages I see in assessments most often just don't include AEAD constructions. And when public libraries exist, they haven't been properly assessed.
* OpenSSL supports AEAD through CCM and GCM (and OpenSSL's GCM uses PCLMUL and shouldn't have the obvious cache leak)
* Botan supports AEAD through OCB, GCM, CCM, EAX, and SIV(!).
* Java JCE with the Bouncycastle provider (extremely popular) does AEAD with GCM, CCM, and OCB
* .NET stack languages get AEAD through CCM or GCM.
* Crypto++ supports AEAD with GCM, CCM, and EAX.
* Golang supports AEAD through GCM in the standard library (I did implement a crappy OCB myself, but the go.crypto package probably has a better one).
What bases aren't covered here? Bear in mind that most languages get their crypto through bindings to OpenSSL.
But .NET probably represents the largest block of applications I see, followed by (shudder) CF. I don't really have any hope for CF, but crypto is hardly its biggest problem.
As far as .NET goes, my first point was that the standard library does not support it; true in this case. I'm aware of CLR and Bouncy Castle as external libraries but neither of them inspires me with confidence.
Supposedly, CLR was released by Microsoft but why didn't they include it in the standard library or release any associated security assessment reports? Were there any?
I've heard the name Bouncy Castle thrown around quite a bit, but that's about it. When I dig through their websites, it leaves me with a feeling not unlike trying to find information about TrueCrypt. Granted, I haven't followed their project(s) very closely. But because of that feeling, I honestly trust OpenSSL more, because people are scared of it.
So, maybe I'm just missing something but this is where I've arrived. Please correct me if I'm way off base.
Elliptic curves were proposed, and have since been studied in the context of cryptography, in 1985. They're 30 years old! For comparison, finite-field discrete-log Diffie-Hellman is 38 years old, and RSA is 37. The latter have been severely beaten down in the decades since, whereas elliptic curves have stayed (modulo special cases, but those also exist for DH/RSA) resistant to every non-generic attack so far. I would say ECC has a better track record, and could be considered the conservative choice.
It could be argued that the underlying problem, integer factorization and FF discrete log, has been studied for much longer than the ECDLP. Maybe. Some basic algorithms go centuries back, but I would argue that the field has only seen real progress starting in the mid-1970s, with CFRAC and Pollard's algorithms. It can be counter-argued that elliptic curves as a subject have only existed for around 100 years, so they are still underdeveloped. Again, maybe.
There is indeed a new wave of interest in elliptic curves, and it is still fueling many publications every year. But these are mostly performance engineering at this point: Edwards curves are not fundamentally different from what Miller proposed in 1985, as far as the ECDLP is concerned. I would not recommend the wonkier stuff like GLS/GLV curves, though, nor curves over extension fields or of higher genus: that would be too neophiliac, even for me.
(I realize that I'm not gonna change your (or probably anyone's) mind, but you make it sound like elliptic curves are much more of a novelty than they really are. I don't disagree too much about the AEAD vs CTR+HMAC issue.)
Somewhat disappointing, since this slide is the only one in the presentation containing anything cryptographically "meaty".
True, I oversimplified a bit. I was referring to situations where you don't know k' and x', e.g., x' = x and k' = k ^ \epsilon for some value \epsilon.
(I know that ideal ciphers are defined correctly elsewhere, and agree that their definition makes sense.)
A direct implementation of AES from the specification can be attacked via a cache-timing side channel. ARX ciphers are much easier to implement in software; they also run in constant time, and are therefore resistant to timing attacks.
What is your opinion of the ARX ciphers ChaCha20 (from Daniel J. Bernstein) and Threefish, and the Skein hash (Bruce Schneier, Niels Ferguson)?
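For reference, ChaCha's quarter-round shows concretely what "ARX" means: nothing but 32-bit addition, rotation, and XOR, so there are no secret-dependent table lookups or branches (a sketch following the description in RFC 7539):

```c
#include <stdint.h>

#define ROTL32(v, n) (((v) << (n)) | ((v) >> (32 - (n))))

/* The ChaCha quarter-round: additions, rotations, and XORs only.
 * With no data-dependent memory accesses or branches, it runs in
 * constant time on typical hardware. */
static void quarter_round(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
    *a += *b; *d ^= *a; *d = ROTL32(*d, 16);
    *c += *d; *b ^= *c; *b = ROTL32(*b, 12);
    *a += *b; *d ^= *a; *d = ROTL32(*d, 8);
    *c += *d; *b ^= *c; *b = ROTL32(*b, 7);
}
```

Contrast this with table-driven AES, where S-box lookups at secret-dependent indices are what opens the cache-timing channel.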
A question I have, though: given that those of us who don't collectively identify as security rocket-surgeons are always advised to stick with the high-level stuff and stay away from things like raw AES (ref. slide 16), which nobody does anyway, why is no mention ever made of established and well-known protocols? Examples like
- Needham-Schroeder (fixed version), http://en.wikipedia.org/wiki/Needham-Schroeder_protocol
- Otway-Rees, http://en.wikipedia.org/wiki/Otway%E2%80%93Rees_protocol
have proven to be really useful. Yes, there's a metric ton of work involved in implementing something like this, but going through it once is amazing practice for getting it right in future (that, at least, is my experience).
Why is it that we talk about symmetric and asymmetric encryption, but never go as far as the protocols that provide real context for their uses?
Is that solely an SSL/TLS concern, or is it generally applicable to cryptosystems?
The problem with the polynomial AE modes is they are trickier to implement (AES is actually small, as is SHA).
It also means that the implementation of the algorithm does the work of both authentication and privacy, whereas with HMAC-SHA256 (and HMAC-SHA256 + AES-CBC/CTR) you will see developers hand-rolling more of it, leaving more room for them to do it wrong.
I also prefer that AES-GCM is a mode itself, while encrypt-then-mac generally requires a developer to do research into the appropriate encryption mode and other details required to "get this right".
"Use SSL for transport"
Honest question here: suppose I have a webapp with multiple servers, being load-balanced through Amazon's ELB. This sounds about as standard as it can get. Question is: how does one handle client migration between servers, and client authentication, without writing cryptographic code or knowing anything about cryptography?
(apologies if this is trivial)
A quick googling found this guide to setting up SSL on ELB: http://docs.aws.amazon.com/ElasticLoadBalancing/latest/Devel...
And leaving this specific example aside for a moment, I was trying to make a larger point: people write crypto code for a reason. Some people will write it for no reason at all, yes. A lot of other devs would be dying to save themselves the time it takes to write any kind of code, especially when it deals with something they aren't experts in, but they just can't. Now, it seems likely you and I can hash out the details of an acceptable solution in this discussion (in fact, I think once you take steps to make sure the random source isn't predictable, you're good, but even that is slightly crypto-related), but how would the average developer know this? How would he go from "don't write crypto code" to this? In my humble opinion, simply saying "don't write crypto code" isn't a solution, and there are good reasons why it hasn't worked so far.
Usually this sort of code is already present and well tested in whatever web framework you're using and there's just no good reason to write it again yourself. Best case is you wasted your time, worst case it's buggy and insecure.
On the other hand, why is the author using "i++" in the preamble of a for loop?
That style is in many cases inefficient.
/* summation of successive integers */
/* don't worry, this should not cause overflow in a 32-bit signed integer type */
volatile int i = 0;
int sum = 0;
for ( ; i < 65536; i++) /* ++i is better suited to this scenario */
    sum += i;
On the other hand, it would make sense to use postfix in an expression where the pre-increment value is actually consumed.
His lack of precision in his writing lowers confidence in the quality of what he has written.
There are better ways to ask someone to define their symbols.
Look, guys, 'n' just does not abbreviate 'integer', and 'k' does not abbreviate 'key'. Yes, yes, yes, I know; I know; and we should all know that while too often people write, say, O(ln(n)) saying nothing about either 'n' or 'O', it's darned bad mathematical writing. If in some context n is a positive integer, then in that context it is just necessary to say so.
Read math from calculus through Halmos, Rudin, Dunford and Schwartz, Spivak, and Lang to Bourbaki, and you will find that symbols are always defined; good authors just do not leave undefined symbols hanging around. It's just not done.
Now you have learned something important. Sorry some people didn't know.
And remember, dogma is always wrong.
And there are many more symbols undefined than just n; just read the document. E.g., there is k. Now what is k? "Blindingly obvious"? Nope.
The author actually did say a little about his x; not saying what k was was poor mathematical writing by any standard.
And there were many more symbols.
I'm talking about rock solidly, standard good mathematical writing and not "dogma".
You are refusing to acknowledge or learn a simple and elementary but important lesson in mathematical writing, and your excuses are not serious responses. You are just angry and fighting. As much as it irritates you, I'm fully correct, and my remarks are fully appropriate.
Yes, computer science and practical computing have a lot of difficulty with such writing lessons; as fields, the quality of writing in computer science and practical computing is way, way below that of, say, math or physics, although the document of the OP is not nearly the worst example. For the worst example, the competition is severe.
"And remember, dogma is always wrong." sounds pretty dogmatic.
Your response deliberately attempts to be insulting, is not really responsive to anything I submitted, and is not serious, constructive, or appropriate, and I will not respond to you further.