
Everything you need to know about cryptography in 1 hour (2010) [pdf] - epsylon
http://www.daemonology.net/papers/crypto1hr.pdf
======
techpeace
Video link:
[http://blip.tv/fosslc/everything-you-need-to-know-about-cryp...](http://blip.tv/fosslc/everything-you-need-to-know-about-cryptography-in-1-hour-3646795)

Previous discussion here:
[https://news.ycombinator.com/item?id=1346711](https://news.ycombinator.com/item?id=1346711)

And here:
[https://news.ycombinator.com/item?id=2253336](https://news.ycombinator.com/item?id=2253336)

------
cperciva
The reason this is being (re)posted now is that I gave this talk at a Polyglot
Vancouver meetup last night. Freed from the constraint of a conference
schedule I actually took about 90 minutes to go through this talk this time
(followed by another 30 minutes of questions).

~~~
zatkin
As someone who is completely new to cryptography and knows very little, where
would you recommend I start? I've recently been reading about bitwise
operations to become familiar with how to (somewhat) interpret what a
cryptographic algorithm is doing in a program, since bitwise operations seem
to be popular in almost all crypto algorithms.

~~~
epsylon
Dan Boneh's free "Crypto 1" on coursera. A new session will be starting on the
30th of June. I've taken it myself and this is hands down one of the best
MOOCs (and classes overall) I've ever taken.

~~~
uulbiy
I agree, "Crypto 1" was excellent! On Coursera's website, in the upcoming
section, it says that "Crypto 2" starts in 3 months. I hope that's true!

~~~
Tomte
It hasn't been the last, oh... four times or so?

------
sotte
Here is the video of the talk:
[http://www.fosslc.org/drupal/content/everything-you-need-kno...](http://www.fosslc.org/drupal/content/everything-you-need-know-about-cryptography-1-hour)

~~~
afandian
The comments say that there's an Ogg download but I can't find it. Any
pointers?

~~~
dfc
Someone posted the bliptv link above:
[http://blip.tv/fosslc/everything-you-need-to-know-about-cryp...](http://blip.tv/fosslc/everything-you-need-to-know-about-cryptography-in-1-hour-3646795)

    
    
      $ youtube-dl http://blip.tv/blahblah
      $ ffmpeg -i file.flv file.ogv

~~~
GHFigs
youtube-dl also has a --recode-video option that takes ogg or webm.

------
drdaeman
> DON’T: Put FreeBSD-8.0-RELEASE-amd64-disc1.iso and CHECKSUM.SHA256 onto the
> same FTP server and think that you’ve done something useful.

Actually, this is a bit wrong. It's just that this is not cryptography- or
security-related and has nothing to do with authentication. It's the same
reason some fansubbers still include a CRC32 in the filename: basic
_unauthenticated_ integrity checking. Totally insecure, but still better than
nothing.

I had a case where a downloaded file was broken. For some weird reason the
software, the storage, or the network had failed, TCP/IP checksumming didn't
help, and I ended up with wrong bits on the hard drive. I didn't have
broadband connectivity at home, and that cryptographically-pointless
CHECKSUM.SHA256 (well, actually it was MD5) helped me find that out before I
left the library.

And recently, the same concept helped me validate files from a faulty cloud
storage service that managed to lose some chunks and silently replaced them
with zeroes when I tried to download the data back.

Obviously, a signed file stored on another server would always be a better
idea.

Just nitpicking.
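
A minimal Python sketch of the kind of unauthenticated integrity check described above. It assumes the caller has already pulled the hex digest out of the published checksum file; the function names are made up for illustration. It catches accidental corruption only, and says nothing about who produced the file.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large ISO images don't need to fit in RAM.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """True if the file's digest equals the published hex digest.

    Tolerates surrounding whitespace and upper-case hex in `expected_hex`.
    """
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A plain `==` is fine here because the digest is public; there is no secret to leak through timing.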

------
api
I used Poly1305, but _exactly_ in the way Daniel J. Bernstein used it in his
NaCl library. I also used his ECC signature scheme, also in exactly the same
way. :)

As far as I can tell the gotcha with Poly1305 is that you absolutely cannot
reuse the same key _ever_. His construction involves using your keyed cipher
to create 32 random bytes by encrypting 32 zero bytes before encrypting the
actual message. These random bytes are then used as the one-time-use Poly1305
key. In NaCl he does this with either Salsa20 or AES-CTR, both of which are
stream ciphers. (CTR converts a block cipher into a stream cipher.)
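
A rough pure-Python sketch of the Poly1305 step itself, to make the "one-time key" point concrete. It assumes the 32-byte `one_time_key` has already been taken from the cipher's keystream as described above (that step is not shown), it is not constant-time, and it is not NaCl's code.

```python
P1305 = (1 << 130) - 5  # the prime 2^130 - 5 that gives Poly1305 its name
R_CLAMP = 0x0ffffffc0ffffffc0ffffffc0fffffff  # clears the bits r must not use


def poly1305_tag(one_time_key: bytes, message: bytes) -> bytes:
    """Compute a 16-byte Poly1305 tag. `one_time_key` must never be reused."""
    if len(one_time_key) != 32:
        raise ValueError("Poly1305 needs a 32-byte one-time key")
    # The key splits into r (clamped multiplier) and s (final additive mask).
    r = int.from_bytes(one_time_key[:16], "little") & R_CLAMP
    s = int.from_bytes(one_time_key[16:], "little")
    acc = 0
    for i in range(0, len(message), 16):
        # Each chunk (up to 16 bytes) gets a 0x01 byte appended before being
        # read as a little-endian number.
        block = int.from_bytes(message[i:i + 16] + b"\x01", "little")
        acc = (acc + block) * r % P1305
    # Adding s and truncating to 128 bits yields the tag.
    return ((acc + s) % (1 << 128)).to_bytes(16, "little")
```

The tag is a polynomial in `r` evaluated over the message blocks, which is why tags for two different messages under the same key give an attacker equations to solve for `r` and `s`.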

------
asciimo
Here's a provocative gem:

> The purpose of cryptography is to force the US government to torture you.

~~~
cperciva
It's pretty simple -- if the US government really really wants your secrets,
they can kidnap you and torture you until you tell them what they want to
know. Cryptography can protect data, but it doesn't protect humans; all it can
do is make sure that humans are the only remaining point of attack.

~~~
mseebach
That's not entirely true.

Proper cryptography can keep them from learning that it's you they'll need to
kidnap to get the secret, or even keep them from learning that there is a
secret they might care about in the first place.

Also, there are plenty of bad guys in the world that can't kidnap and torture
you that it's still quite worthwhile to keep your secrets from.

~~~
cperciva
You're over-thinking this. The point is simply that no matter how good the
cryptography in a system is, if there are humans involved then you need to
worry about human factors as well.

~~~
peterwwillis
The easiest way to avoid the human factor is to get a scapegoat. You make it
seem like someone else is responsible for, or knows about, the crypto or its
data payload. They will then torture that individual indefinitely until they
confess to something. It's better if they don't know you or anything about
your scheme as that way it'll look like they're holding out a really long time
on important information.

Then the only thing you need to worry about is that person dying, in which
case the investigation continues. So similar to upping the number of rounds on
PBKDF2 every year, you need a new scapegoat every year, or however long it
takes them to break either the crypto or the scapegoat.

------
drewcrawford
Would be interested in hearing more detail about your objections to both
Poly1305 and ECC.

~~~
cperciva
Too many ways to screw things up, not enough decades of cryptographic
analysis, and there are simpler tools available which have been around for
longer.

This is subject to the caveat that ECC offers benefits under certain specific
conditions (e.g., you need small signatures or a small ASIC die area); but in
those situations you want to talk to a cryptographer anyway. My talk was
providing guidelines for software developers who are writing code for general-
purpose PC hardware.

~~~
robryk
What are your thoughts on using NaCl? On one hand it gives harder-to-misuse
primitives, on the other it uses elliptic curve crypto.

------
lisper
> PROBABLY AVOID: Elliptic Curve signature schemes.

Including Ed25519?

~~~
cperciva
Probably. As I said elsewhere, this is subject to the caveat that sometimes
you need the performance characteristics of ECC; I'm providing advice for
general-purpose computing environments which do not have any such constraints.

~~~
lisper
Not that I really feel comfortable challenging you on anything crypto related,
but it seems to me that Ed25519 is superior to RSA in every conceivable way
even for general-purpose computing environments. It's more performant, easier
to generate keys, the signatures are shorter, and there are fewer ways to
shoot yourself in the foot (e.g. no padding issues). Is there a reason you
don't actively advocate it other than the fact that it's not widely used?

~~~
cperciva
Elliptic curves have had fewer decades of cryptographic analysis. There's a
lot of structure still being explored and I'm not so confident that nobody
will ever find a way to exploit it as I am with integer factorization.

~~~
lisper
Thanks!

------
tptacek
I disagree with two big points in this talk, but I'm sorry to say they're the
same two things I disagreed with last time, so this is going to be a boring
comment.

First, dedicated AEAD cipher modes are superior to manually composing AES-CTR
and HMAC-SHAx. AEAD modes provide both authentication and encryption in a
single construction. AES-CTR+HMAC-SHAx involves joining two constructions to
do the same thing.

Colin points to AES-GCM, the most popular AEAD mode, as an example of how AEAD
modes can be more susceptible to side-channel attacks than AES-CTR+HMAC-SHAx.
I don't think this is a strong argument for a couple reasons:

* AES-GCM is so far as I can tell no practical cryptographer's preferred AEAD mode; "whipping boy" might be a better role to ascribe to it.

* AES-GCM's side-channel issues are due principally to a hardware support concern (GMAC, AES-GCM's answer to HMAC, requires modular binary polynomial multiplication, which in pure software is table-driven, which creates a cache timing channel). That hardware support concern might be receding, and platforms that don't have AES-GCM might prefer a different AEAD mode anyways.

* Modern AEAD modes, of the sort being entered into CAESAR, have resilience against side channels as a design goal.

* But the most important rejoinder to Colin's preference for HMAC-SHAx over AEAD modes is this: HMAC-SHAx is responsible for more side-channel flaws than any other construction. Practically every piece of software that has implemented HMAC-SHAx has managed to introduce the obvious timing channel in comparing the candidate to the true MAC for a message. What's worse is, using AES-CTR+HMAC-SHAx in practice means that the developer has to code the HMAC validation themselves; it's practically an engraved invitation to introduce a new timing leak. Almost nobody codes their own AES-GCM.
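
A short Python sketch of the MAC check that last bullet is about, assuming HMAC-SHA256 and hypothetical helper names. The point is the comparison: a plain `==` on byte strings can return as soon as it hits the first mismatching byte, while `hmac.compare_digest` from the standard library is designed to take time independent of where the mismatch is.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over `message`."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, candidate_tag: bytes) -> bool:
    """Check a candidate tag without the early-exit comparison."""
    expected = make_tag(key, message)
    # Not `expected == candidate_tag`: that is the obvious timing channel.
    return hmac.compare_digest(expected, candidate_tag)
```

In an encrypt-then-MAC design the verifier would run this over the ciphertext and reject before decrypting anything.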

I also think Colin is wrong about ECC, particularly versus RSA.

RSA might be the most error prone cryptographic primitive available to
developers. There are two big reasons for this:

* RSA has fussy padding requirements. The current best practices for RSA encryption and signing are OAEP and PSS (huge credit to Colin for getting this right in 2010). If you want to form an opinion about RSA, go read up on OAEP and ask yourself (honestly) if you would come up with anything resembling OAEP if you were called on to encrypt with RSA. Developers as a rule do not use OAEP; instead, most new software _still_ uses PKCS1v15 padding, which introduces susceptibility to terrible vulnerabilities that can often be easily exploited in the wild. Even better are the Javascript developers who build RSA from first principles and skip the padding altogether (the fact that RSA padding is a requirement and not an annoyance to be avoided is why Nate Lawson says we should call it "RSA armoring" and not "RSA padding").

* RSA easily supports a direct encryption transform, which is widely used in practice. ECC software as a general rule does not synthesize an encryption transform from the ECDLP; instead, ECC software uses the ECDLP to derive a key, and then uses something like AES to actually encrypt data. You might be thinking, "that's how RSA software works too". But RSA in practice is subtly different: you can, for instance, easily encrypt a credit card number directly using RSA, or build a crappy key exchange protocol where the client encrypts (doesn't derive or sign, but encrypts) a key for the server. RSA and conventional DLP crypto therefore offer (encrypt, key-exchange, sign); ECC in practice offers just (key-exchange, sign). That's one less dimension of things developers have to get right in ECC cryptosystems.
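
A toy Python illustration of why skipping the padding is a problem, using the small textbook parameters p=61, q=53 (chosen here purely for illustration and far too small for real use). Unpadded RSA is multiplicative: anyone can combine two ciphertexts into a valid ciphertext of the product of the plaintexts without knowing the private key.

```python
# Toy parameters, for illustration only.
N = 61 * 53   # modulus, 3233
E = 17        # public exponent
D = 2753      # private exponent: E * D leaves remainder 1 mod lcm(60, 52)


def raw_encrypt(m: int) -> int:
    """Textbook RSA with no padding: c = m^E mod N."""
    return pow(m, E, N)


def raw_decrypt(c: int) -> int:
    """Textbook RSA with no padding: m = c^D mod N."""
    return pow(c, D, N)


c1 = raw_encrypt(7)
c2 = raw_encrypt(3)
# The attacker multiplies ciphertexts; no private key is involved.
forged = (c1 * c2) % N
print(raw_decrypt(forged))  # 21, i.e. 7 * 3
```

Raw RSA is also deterministic, so equal plaintexts give equal ciphertexts. Schemes like OAEP exist to remove exactly this kind of structure.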

Colin is smarter than me and much better trained in cryptography than I am.
All I do is find and exploit crypto bugs in real software. I am usually as
nervous as any of you are to challenge Colin about crypto, but I think I can
win these two arguments. (Spoiler alert: Colin doesn't think so).

~~~
cperciva
You are entitled to your opinions, no matter how wrong they are. ;-)

~~~
M4v3R
This lets me down a little bit. If two guys that I consider to be _very_
knowledgeable on this topic disagree on some key things, how am I supposed to
get most of this right?

~~~
cperciva
I was skipping the details because Thomas and I have had this argument here at
least a dozen times, and I figured that people were getting sick of it by now.
To summarize the arguments:

Both CTR+HMAC and combined AEAD modes, correctly implemented and correctly
used, will keep you safe against existing published attacks. Thomas takes the
view that "correctly implemented and correctly used" is a problem, and I'll
accept that he's right to recommend AEAD modes in that case (as long as you
have a good cryptographic library[1] available to you).

I take the view that if you can't take CTR and HMAC and put them together
correctly, there's no way the _rest_ of your code is ever going to be secure,
so you've already lost; so I focus on the "against existing published attacks"
side of things, look at the places where novel cryptographic attacks tend to
be found, and opt for combining two very simple and well-understood
constructions.

The same story plays out for RSA vs. ECC: Thomas is worried about the fact
that people have made dumb mistakes when using RSA, while I figure that if
you're going to make those dumb mistakes (especially after my talk) you're
going to write code which is otherwise insecure anyway, so I focus on the
places where I think it is more likely that attacks will be found in the
future[2].

If you're a high school student with two years of Python experience and you
want to add some cryptography to your cat photo sharing startup, listen to
Thomas. If you're a senior developer with 20 years of experience writing C
code for internet-facing daemons, and the code you write is going to be used
by democracy activists in China, listen to me.

[1] I'm not convinced that such a thing exists right now.

[2] Or, alternatively, where attacks may have already been found, but not
published.

~~~
tptacek
I love your summary about cat-sharing startups vs. C code for internet-facing
daemons, but the cat-sharing people can't usually afford our rates for crypto
work; most of my experience comes from code built by people with lots of
experience.

Also, if you asked me who was more likely to get crypto right, the Django web
guy or the C daemon guy, I'd bet on the Django guy _every time_. Betting
against crypto implemented in C is like betting when you've made a full house
on the flop: all in.

~~~
archwisp
First off, do not interpret my comments as trying to provide any sort of
advice to anyone. I am merely expressing frustration.

I have to agree with Colin's [1] point.

The standard libraries for the languages I see in assessments most often just
don't include AEAD constructions. And when public libraries exist, they
haven't been properly assessed.

~~~
tptacek
Can you be more specific? Because this assertion didn't ring true to me based
on my own experience, and less than 5 minutes of Googling appears to refute
it:

* OpenSSL supports AEAD through CCM and GCM (and OpenSSL's GCM uses PCLMUL and shouldn't have the obvious cache leak)

* Botan supports AEAD through OCB, GCM, CCM, EAX, and SIV(!).

* Java JCE with the Bouncycastle provider (extremely popular) does AEAD with GCM, CCM, and OCB

* .NET stack languages get AEAD through CCM or GCM.

* Crypto++ supports AEAD with GCM, CCM, and EAX.

* Golang supports AEAD through GCM in the standard library (I did implement a crappy OCB myself, but the go.crypto package probably has a better one).

What bases aren't covered here? Bear in mind that most languages get their
crypto through bindings to OpenSSL.

~~~
archwisp
Sorry for the slow reply. I haven't looked into setting up notifications for
replies. (if that exists on HN?)

But .NET probably represents the largest block of applications I see. Seconded
by ( _shudder_ ) CF. I don't really have any hope for CF but crypto is hardly
its biggest problem.

As far as .NET goes, my first point was that the standard library does not
support it; true in this case. I'm aware of CLR and Bouncy Castle as external
libraries but neither of them inspires me with confidence.

Supposedly, CLR was released by Microsoft but why didn't they include it in
the standard library or release any associated security assessment reports?
Were there any?

I've heard the name Bouncy Castle thrown around quite a bit but that's about
it. When I dig through their websites, it leaves me with a feeling not unlike
trying to find information about Truecrypt. Granted, I haven't followed their
project(s) very closely. But because of that feeling, I honestly trust OpenSSL
more because people are scared about it.

So, maybe I'm just missing something but this is where I've arrived. Please
correct me if I'm way off base.

~~~
cpach
Notifications are possible via this third party site:
[http://hnnotify.com](http://hnnotify.com) – works great!

------
Spearchucker
Thanks for sharing the slides (and techpeace for the links to the video).

A question I have though is given that we (those of us who collectively don't
identify ourselves as security rocket-surgeons) are always advised to stick
with the high-level stuff and to stay away from stuff like AES (ref. slide
16), which nobody does anyway, then why is no mention ever made of established
and known protocols? Examples like

- Needham-Schroeder (fixed version),
[http://en.wikipedia.org/wiki/Needham-Schroeder_protocol](http://en.wikipedia.org/wiki/Needham-Schroeder_protocol)

- Otway-Rees,
[http://en.wikipedia.org/wiki/Otway%E2%80%93Rees_protocol](http://en.wikipedia.org/wiki/Otway%E2%80%93Rees_protocol)

have proven to be really useful. Yes, there's a metric ton of work involved in
implementing something like this, but going through it once is amazing
practice for getting it right in future (that, at least, is my experience).

Why is it that we talk about symmetric and asymmetric encryption, but never go
as far as the protocols that provide _real_ context for their uses?

------
scott_karana
There weren't any "Don'ts" about compression.

Is that solely an SSL/TLS concern, or is it generally applicable to
cryptosystems?

------
mmastrac
He recommends HMAC-SHA256 in this paper, but I think that AES-GCM is a better
construction, as long as you understand the requirements for IV uniqueness. It
offers significant improvements on top of the standard HMAC (privacy and
additional out-of-band data) without adding much in terms of size.

~~~
dchest
"And I still maintain that recommendation. CTR+HMAC is far more robust against
side channel attacks than any AE mode." --
[https://twitter.com/cperciva/status/475360367191674881](https://twitter.com/cperciva/status/475360367191674881)

~~~
tptacek
I think Colin has a hard row to hoe with this argument, given how often HMAC
implementations cough up timing vulnerabilities. Even if you're stuck with
GCM, GCM has the advantage of having mercifully few implementations. Everyone
writes their own HMAC verifiers (badly).

------
maxtaco
Good suggestion to avoid PKCS #1 v1.5; sadly PGP requires it:
[http://tools.ietf.org/html/rfc4880#section-13.1](http://tools.ietf.org/html/rfc4880#section-13.1)

------
Nimi
"""Conventional wisdom: Don't write cryptographic code!

Use SSL for transport"""

Honest question here: suppose I have a webapp with multiple servers, being
load-balanced through Amazon's ELB. This sounds about as standard as it can
get. Question is: how does one handle client migration between servers, and
client authentication, without writing cryptographic code or knowing anything
about cryptography?

(apologies if this is trivial)

~~~
aianus
SSL usually terminates at the load balancer.

A quick googling found this guide to setting up SSL on ELB:
[http://docs.aws.amazon.com/ElasticLoadBalancing/latest/Devel...](http://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/ssl-server-cert.html)

~~~
Nimi
Yes, but what happens to the client authentication when it migrates between
servers? Do I write a cookie to the client saying "authenticated as user X"?
Encrypt the cookie? Sign the cookie? It sounds like one would need to write
cryptographic code to do that...

~~~
aianus
You write a cookie to the client containing some random key and associate that
key to the user on your backend in the shared db. Since it's just random data,
you don't need to sign or encrypt the cookie and SSL takes care of protecting
it in transport.
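
A minimal Python sketch of that scheme. The in-memory dict stands in for the shared database, and the function names are hypothetical. The token comes from the standard library's `secrets` module, which draws from the operating system's cryptographically secure random source, so the value is unpredictable.

```python
import secrets

# Stand-in for the shared backend database: token -> user id.
SESSIONS = {}


def create_session(user_id: str) -> str:
    """Create an opaque session token and remember which user it belongs to."""
    token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
    SESSIONS[token] = user_id
    return token  # send this as the cookie value, over SSL only


def lookup_session(token: str):
    """Return the user id for a token, or None if it is unknown."""
    return SESSIONS.get(token)
```

Because the cookie carries no meaning of its own, any server behind the load balancer can resolve it with a lookup, and there is nothing in it to sign or encrypt.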

~~~
Nimi
Does this random value need to be unpredictable?

And leaving this specific example aside for a moment, I was trying to prove a
larger point: people write crypto code for a reason. Some people will write it
for no reason at all, yes. A lot of other devs will be dying to save
themselves the time it will take to write _any_ type of code, especially if it
has to deal with something they aren't experts in, but they just can't. Now,
it seems likely you and I can hash out the details of an acceptable solution
in this discussion (in fact, I think once you take the steps to make sure the
random source isn't predictable, you're good, but even this is slightly
crypto-related), but how would the average developer know this? How would he
go from "don't write crypto code" to this? In my humble opinion, simply saying
"don't write crypto code" isn't a solution, and there are good reasons why
this hasn't worked so far.

~~~
derefr
Here's a principle from which you could derive "use unique tokens to identify
the user": effectively, TLS already _has_ a mechanism of fingerprinting users,
called "client certificates." Their current browser UI sucks (much like HTTP
Basic Auth) so nobody uses them (much like HTTP Basic Auth), and instead
provides poor reimplementations of them (much like HTTP Basic Auth.) If you
think about what you need to simulate having a client cert--a persistable
shared secret generated by the host and securely sent to the client--then "a
long random key inside a cookie" is what you'll be naturally led to.

------
NAFV_P
Disclaimer: I know sweet f-a about cryptography, and was expecting 100% of
this pdf to go straight over my head.

On the other hand, why is the author using "i++" in a preamble of a for loop?

~~~
bazzargh
That's the style in C-like languages.

[http://en.wikipedia.org/wiki/For_loop#C.2FC.2B.2B](http://en.wikipedia.org/wiki/For_loop#C.2FC.2B.2B)

... also used in Java, Javascript, Perl, Php, Bash (and many more I'm sure).

~~~
NAFV_P
> _That's the style in C-like languages._

That style is in many cases inefficient.

    
    
      /* summation of successive integers */
      /* don't worry, this should not cause overflow in a 32-bit signed integer type */
      volatile int i=0;
      int j=0;
      for( ; i<65536; i++) /* ++i is better suited to this scenario */
        j+=i;
    

Being declared as volatile, the compiler will not attempt to optimise the for
loop and change the code that increments i in the preamble from postfix to
prefix. I will admit this is an extreme example, but I often see examples on
the internet of simple for loops written in this manner when prefix is more
suitable and simpler for the cpu to execute. Postfix increments of integer
types involve making a local copy (y) of the variable to increment (x), then
incrementing x and returning y. Prefix increments merely involve incrementing
x. The former method is rather expensive.

On the other hand, it would make sense to use postfix on something like [0].

[0]
[http://en.wikipedia.org/wiki/Duff's_device#Original_version](http://en.wikipedia.org/wiki/Duff's_device#Original_version)

~~~
pbsd
Have you actually tried your example? Even at -O0, it results in the exact
same code: [http://goo.gl/kDEFnU](http://goo.gl/kDEFnU)

------
graycat
He doesn't know how to write even simple mathematics. E.g., he uses symbols
_n_ and _k_ without definition. Without careful definitions right there in the
context, such symbols mean nothing. He is not clear enough about what he means
by _x_.

His lack of precision in his writing lowers confidence in the quality of what
he has written.

~~~
cperciva
_He doesn't know how to write even simple mathematics._

Umm...

~~~
graycat
If you know how to write math, then do so. In particular, define your symbols.
Nearly none of your symbols are defined.

~~~
DanBC
You know that this is a presentation, right? And that the symbols will be
defined in the talk?

There are better ways to ask someone to define their symbols.

~~~
graycat
It's simple: Instead of just _n_ , write "For a positive integer _n_ ".
Instead of just _k_ , write "For a key _k_ ".

It's simple.

Look, guys, 'n' just does not abbreviate 'integer', and 'k' does not
abbreviate 'key'. Yes, yes, yes, I know; I know; and we should all know, that
while too often people write, say, O(ln(n)) saying nothing about either 'n' or
'O', it's darned bad mathematical writing. If in some context _n_ is a
positive integer, then in that context it is just necessary to say so.

Read math from calculus to Halmos, Rudin, Dunford and Schwartz, Spivak, Lang,
to Bourbaki, and you will find that symbols are always defined; good authors just
do not have undefined symbols hanging around. It's just not done.

Now you have learned something important. Sorry some people didn't know.

~~~
Robin_Message
It's simple: it's a slide, so it should never, ever have excess verbiage on
it. Otherwise, you don't know how to use simple English properly. Good authors
never say more than necessary. I wouldn't trust someone who's presentation
skills were so bad they wrote "n (a natural number)" when it was blinding
obvious from the context.

And remember, dogma is always wrong.

~~~
graycat
It's not "blindingly obvious": is that for one _n_, for some, or for all?

And there are many more symbols undefined than just _n_ ; just read the
document. E.g., there is _k_. Now what is _k_? "Blindingly obvious"? Nope.

The author actually did say a little about his _x_ ; not saying what _k_ was
was poor mathematical writing in any sense.

And there were many more symbols.

I'm talking about rock solidly, standard good mathematical writing and not
"dogma".

You are refusing to acknowledge or learn a simple and elementary but important
lesson in mathematical writing, and your excuses are not serious responses.
You are just angry and fighting. As much as it irritates you, I'm fully
correct, and my remarks are fully appropriate.

Yes, computer science and practical computing have a lot of difficulty in such
writing lessons; as fields, in writing, computer science and practical
computing have quality way, way below that of, say, math or physics although
the document of the OP is not nearly the worst example. For the worst example,
the competition is severe.

As for your "And remember, dogma is always wrong.": that sounds pretty
dogmatic.

Your response deliberately attempts to be insulting, is not really responsive
to anything I submitted, and is not serious, constructive, or appropriate, and
I will not respond to you further.

