I wish I knew more about encryption. Unfortunately it isn't a very accessible ar...

jacques_chester · on Dec 17, 2012

You need to distinguish between:

1. "Rolling your own crypto" meaning "inventing your own algorithms".

If you invent your own algo for creating one-way hashes, performing symmetric encryption etc and are not an active, up-to-date cryptographer ... you will make a mistake.

This is a world for academics and researchers; for open publication of algorithms followed by the open publication of attacks on those algorithms.

2. "Rolling your own" meaning "Assembling known cryptographic primitives into a larger system".

If you assemble well-studied primitives (SHA-2, AES, HMAC) into larger systems ("I've invented a simple messaging system, guys!"), you are probably still mistaken.

But at least you won't be as much of a professional embarrassment (unless you're keeping passwords in plaintext or some such thing).

At this level, you're between academia and industry. Design at this level is mostly about using certain rules of thumb, exercising defence-in-depth and taking inspiration from existing designs that are battle-hardened. For example, when using multiple kinds of cryptographic operation, have different keys for each operation to mitigate potential bit-leaking attacks.

The good news is that you can get professional reviews done. Seek out respected security experts and ask if they are prepared to exchange labour for currency.

mseebach · on Dec 17, 2012

There's also "1.5": Implementing well-known algorithms yourself, which, given the reference to poor crypto-lib, I suspect is what the GP referred to. I'm not an authority on the subject, but I believe that if you do this, you risk stuff like leaving keys (or parts of keys) in memory, being susceptible to timing attacks or not using PRNGs properly.

zobzu · on Dec 17, 2012

Most of the implementations i regularly see keep plaintext in memory and generally say its not an issue.

Then they hash passwords because plain text passwords is dumb.

We're doomed, i'm telling you, doomed. I sometimes want to just yell: ALL YOUR PLAINTEXT PASSWORDS ARE.. PLAIN TEXT IN MEMORY RIGHT NOW DUMBASS.

But obviously, indeed, I just explain it nicely (and generally still get ignored, mind you)

Oh and hi python developers, you can't mlock memory let alone wipe it properly (yeah, i know, you can erase it with some ctype binding and _hope_ it didn't hit swap)

RyanZAG · on Dec 17, 2012

If you reach the point where in-memory values are an issue, you have already lost.

If I can pull the in-memory password value out of your program, I also have access to inserting my own code into your process to prevent any hashing attempts you may make on that memory. Once memory safety comes into play, all bets are off, and all security is irrelevant, as all your security is simply data in memory as well.

The reason we don't store plain text passwords to disk is because stuff on disk tends to be backed up and placed in positions where it has a chance to be read by others. Storing plain text passwords is not actually a security problem in itself if you can guarantee safety of those plain text passwords - but so far, such guarantees tend to be very false.

amalcon · on Dec 17, 2012

The danger zobzu referred to was that the pages containing sensitive information might be paged out to disk. If that should happen, you're pretty much out of luck: they're just as vulnerable as anything else on disk.

The only defense against this is to make sure that the sensitive information is never paged out, using either mlock() or just not having any swap on your system.

There's also the class of bug whereby the software starts spitting out stuff from somewhere in memory that was never meant to be displayed, a la the Minus World in Super Mario Bros. Those are much less common in these days of memory protection and high-level languages, but it is still a concern.

CrLf · on Dec 17, 2012

You can have encrypted swap.

amalcon · on Dec 17, 2012

That's correct, that is a viable third option.

tomkinstinch · on Dec 18, 2012

Even if you keep things exclusively in memory, the information is still vulnerable via cold boot attacks:

https://citp.princeton.edu/research/memory/

zobzu · on Dec 19, 2012

That's not entirely correct, in fact, far enough from it to warrant a reply.

Wiping off memory ensure that the attacker doesn't get the past 29329847 logins, only the ones from the moment he compromised the system. Thus, it's quite a win.

There's no easy "all bets are off" escape from properly coding apps so that sensitive data is kept only when needed, for the time needed, never more.

geofft · on Dec 17, 2012

"If you roll your own crypto you're an idiot" is technically correct, but discouraging in all the wrong ways and worse for everyone in the long term. I've been a teaching assistant for MIT's introductory graduate crypto class, and I still don't consider myself qualified to implement core crypto algorithms. But I've seen -- and used -- entirely too many crypto implementations that are obviously poorly written in my eyes.

I would love to see a solution to the fearmongering that still keeps people aware of their mistakes. Perhaps public development is the answer -- many eyes make all bugs shallow, so all we need is more people who feel like their eyes are qualified. How many people have actually read the OpenSSL or gnutls or GPG source with an eye to its security?

marcosdumay · on Dec 17, 2012

Probably the enitire community of security researchers did review OpenSSL, gnutls or GPG, maybe only once, but they did. (Didn't you?)

Anyway, pulic development was the anwer that I'd give here. Keep in mind that even the researchers often make mistakes when they implement the algorithms, and the main reason that their implementation is better than the GP's or mine is not because they know more (altough they do know more), but because there are knowledgeable people doing QA on it.

Anyway, you are idiot only if you use your own imlementation for anything serious. There is no problem in rolling your own crypto if you only do it for fun.

DanBC · on Dec 17, 2012

People should really be saying "If you roll your own crypto, and then use it, expecting to stay secure against GCHQ and NSA, then you're an idiot".

But rolling your own crypto, then attacking it, is probably a good learning exercise. Having a group of people rolling their own crypto, and then attacking each other's attempts is probably better.

Somewhere there's a list of common mistakes people make, and it's surprising how often those mistakes are found in commercial "MILITARY GRADE ENCRYPTION!"

Tuna-Fish · on Dec 17, 2012

The answer is necessary.

Doing crypto right is really, really, really hard. Not only do you get to fight against advanced math, but if you get even the tiniest detail wrong somewhere, leak even a little bit of state, even if what you leak doesn't even look like state, someone will hang you with it.

Such as, does the length of time taken to de/encrypt ever vary based on the internal state? And in your answer, do consider all the interesting ways in which the microarchitecture details like branch prediction or TLBs of the CPU you run the code on affect your code.

Time and time again, organizations with lots of money have decided to roll their own crypto to solve a problem they have. And it has almost inevitably failed.

fragsworth · on Dec 17, 2012

> Time and time again, organizations with lots of money have decided to roll their own crypto to solve a problem they have. And it has almost inevitably failed.

Your last sentence is kind of hard to believe. I'm sure nobody ever even attempted to crack the vast majority of stuff that's encrypted out there.

duaneb · on Dec 18, 2012

I would consider a weak encryption, even if it's never attempted to be broken, to be a failure of a crypto system.

epsylon · on Dec 17, 2012

I took Dan Boneh's "Cryptography I" course this year on coursera and I learnt a lot. The mathematical basics are covered (with references to books and free resources if necessary) and the concepts are very well explained. I would definitely recommend taking it. There isn't a new offering planned yet (Crypto II will start in January), but I think it will be proposed again soon considering that they had two runs of the course (spring and summer) in 2012.

You can also find many useful resources on this nice blog : http://blog.cryptographyengineering.com/p/useful-cryptograph...

thisone · on Dec 17, 2012

Crypto 1 just "ended". The final exam came out today.

It's been good, as in I never could have followed this thread before it, but I don't know about the first run, but there has been little to no professor interaction with this run of the course.

Which is a bit unfortunate. I don't have any education in modular arithmetic and I can only get half of those questions correct. Being completely lost as to why the other half are wrong, and being unable to get help or worked out answers (to be able to figure out where I'm going wrong before the final) really isn't the best.

jessaustin · on Dec 17, 2012

Unfortunately I think this may be what we can expect from Coursera. If you're ready for it then it's great. (I've put Boneh's Crypto 1 in the top ten of courses I ever took.) But if you're not familiar with modular arithmetic then you're definitely not familiar with abstract algebra, in which case yeah you need to study that first.

thisone · on Dec 18, 2012

Yep, I've had math through differential equations (so long ago it doesn't matter anymore), but never touched on abstract algebra that I can remember.

Thanks for the info.

epsylon · on Dec 18, 2012

(I didn't realize there was a fall run!) As for modular arithmetic did you check out the free book by Victor Shoup (it's mentioned on the course site) ? http://shoup.net/ntb/

As for professor interaction, it is a byproduct of the MOOC formula : don't expect too much from the professor, he is already teaching to thousands of students ! You'll have to manage your way through the discussion forums (who unfortunately have a lot of noise). During all the courses I took, I always managed to find answers like that.

thisone · on Dec 19, 2012

Thanks for the link, for some reason I thought that book was on a different topic. I've found a series of lectures from Harvard Extension using the Artin Algebra book. I'm going to use that (and the new edition of the book) to pick up on the concepts I'm missing (over the course of the next few months)

The previous course I took with CourseRA had a decent amount of Prof and TA interaction, and the forums were well tended. But, I guess that could be down to the fact that the course was being used by the professor to instruct his actual University students at the same time.

ars · on Dec 17, 2012

By far the simplest encryption to understand is the factorization problem.

If you have two prime number it's trivial to multiply them and get the result. Or if you have the result and one prime number, it's also trivial to divide and get the second prime number. But if you have that result, figuring out which two prime numbers divide it requires you to check every single one of them - which takes much longer.

Now imagine your password is one of the prime numbers, and your secret message is converted to a different prime number. Multiply your password with the secret message and you have your result. That's the encrypted message.

Without knowing what the password was, there is no feasible way to figure out what the message was - you would have to divide by every prime number looking for another prime number that evenly divides into the encrypted message. (Technically once you do that you don't know if that prime number was the message or the password.)

This is the core of encryption - the trouble is the side-details can mess you up, and they are really not obvious. For example: How do you securely convert a message into a prime number? Or a password into a prime number?

You can't use the entire message as one enormous prime number, that's not feasible, so you have to break it up. How do you do that properly, without leaking information?

There are other issues with the math (I'm not familiar enough with encryption to even know what they are, but they are things like certain primes have patterns that make themself obvious in the result). See http://en.wikipedia.org/wiki/Semiprime for some primes patterns to avoid.

Then there are issues with the implementation - if you are on a multi user system (or a remote system) you want to make sure people can't figure out the password by watching which system resources you use, or how long things take you.

And then issues with the algorithm: If you encrypt two documents with the same password it can be possible to compare them and figure out what you used. So you want to add some randomization to the password to make that impossible. But getting good random numbers isn't always easy. And you have to store that random number in the message so it can be reversed - but watch out that you didn't do that in a way that negates the advantage of the random number.

gjm11 · on Dec 17, 2012

This system is (almost) just a variant of the one-time pad.

One-time pad: before Alice and Bob need to send secret messages to one another, they generate a large blob of random bits and each take one copy of it. They keep these bits very secret. When one needs to send an N-bit message to the other, s/he takes the next N bits of the blob and XORs them with the message. This is trivial to do and provably impossible to break -- except that making and sharing and keeping all that key data (the big blob of bits) is really inconvenient and easy to get wrong.

So, anyway, your system has the same weaknesses as any other one-time pad: you need new key data for every message you send, which means you have to make a lot of key data, share it with your partner in crime, and remember it for a long time, all without leaking any of it.

It's not quite the same as a one-time pad. You can get by with fewer bits of key than bits of message, provided there's still "enough" key. On the other hand, "enough" is quite a lot, much more than the key length of a typical symmetric cryptosystem like AES.

Anyway, I don't think this counts as "by far the simplest encryption to understand". A standard one-time pad is simpler and stronger, and the extra complexity of multiplying large prime numbers doesn't really buy you anything much to compensate.

ars · on Dec 17, 2012

I don't intend for anyone to actually implement this. It's just to understand how encryption can work. i.e. how it's possible for an operation in one direction to be quick and easy, but the reverse operation is hard and slow. (And why it's hard to implement properly.)

But you are right that as described it is a lot like a one time pad. (Especially since if you encrypt a known message you can instantly find the password.)

Someone · on Dec 17, 2012

"But if you have that result, figuring out which two prime numbers divide it requires you to check every single one of them"

It does not. There are many factorization algorithms that are significantly faster than trial division for large numbers (http://mathworld.wolfram.com/PrimeFactorizationAlgorithms.ht...)

And your description makes little sense to me. I guess it is based on a bad understanding of RSA (http://en.wikipedia.org/wiki/RSA_%28algorithm%29). That does not convert your message to a prime.

ars · on Dec 17, 2012

Not for numbers made of only two primes (i.e. a semiprime). Those algorithms are for numbers with more than two prime divisors.

Read the description again then, RSA is a public key algorithm, and is not what I am describing.

alexkus · on Dec 17, 2012

Many, but not all, of those algorithms will work for numbers which are a product of two roughly even sized primes (depending on size). Especially the Quadratic Sieve and Number Field Sieve class of algorithms (NFS, GNFS, SNFS, etc).

Size is key though. A rough guide of the GNFS times for a quad core PC would give:-

    c100 - c120: hours
    c120 - c140: days
    c140 - c160: weeks
    c160 - c180: months
    c180 - c200: years 
    
    source: http://www.mersenneforum.org/showthread.php?t=16311

(c100 means a 100 digit composite number, doesn't matter if it is just the sum of just two evenly sized primes).

For comparative sizing:-

    2^256 =~ c80
    2^512 =~ c160
    2^1024 =~ c320

So a 512-bit composite can be quite reliably factorised in a few months by a commodity PC. Parallelisation is a partial help, the first stage of the algorithm can be distributed, but the final stage(s) (~25% of the work) still requires a single machine to do the work.

Part of the hard work in choosing random primes to use within such encryption algorithms is avoiding numbers which are susceptible to each of those algorithms, e.g. avoiding primes with simple p-1 factorisations to avoid being susceptible to Pollard's P-1 algorithm.

The wrong choice of prime and the factorisation becomes (relatively) trivial.

jlgreco · on Dec 17, 2012

512 bit keys taking weeks was the case a few years ago on old hardware, but even then you could do it in a few hours/days with a modest BOINC setup.

I'm not sure how a modern CPU would do, but I suspect on the low end of "weeks".

ars · on Dec 17, 2012

A 512 bit composite is not a lot of entropy - it's about 45 bits of entropy, which isn't much. sqrt(512) * 2 (since each prime is approx the size of the sqrt of the number, and there are two of them).

A few months to crack a 45 bit encryption doesn't sound quick to me. Even a 64 bit encryption would take many centuries at that rate.

dfox · on Dec 17, 2012

Each prime is approximately sqrt of the product, not of size that is sqrt of size of product, the size of primes is approximately half of the size of their product. As they are primes their entropy is slightly lower than their length, but not too much (there are approximately x/ln(x) primes smaller than x, which would lead to entropy of 256bit prime being 248.5bits)

rwmj · on Dec 17, 2012

Schneier's Applied Cryptography is a very readable introduction.

If you want to get into the actual mathematics of it, then it is considerably more difficult. Luckily for me I took courses in pure math, RSA and some hashing functions at university, so I had to learn it :-)

As others have said, crypto is properly hard and it's very easy to make cock-ups. So unless you're prepared to do the years of work becoming a proper researcher, don't try and design any algorithms yourself (for fun is ok, but not for real world use).

tptacek · on Dec 17, 2012

Applied Cryptography is readable but not particularly instructive and occasionally misleading. It's not a good book on crypto. I'm not the only one who says this; Schneier himself alludes to it.

A better book is _Practical Cryptography_, now called _Cryptography Engineering_, which Schneier also co-authored.

dagw · on Dec 17, 2012

"If you roll your own crypto you're an idiot."

Rolling your own version of a popular crypto algorithm using only the spec and then seeing how many of the well known attacks your implementation is vulnerable to is quite educational.

thibaut_barrere · on Dec 17, 2012

Is there something like CodeSchool or similar with crypto and/or web-security oriented courses?

Interactive tutorials on both areas is something many people I know would pay for.

dagw · on Dec 17, 2012

There are a couple of crypto courses coming up on coursera.org that might be worth a look

thibaut_barrere · on Dec 21, 2012

thank you!

tptacek · on Dec 17, 2012

Start by learning how to break bad crypto implementations. Nobody calls breakers "idiots".

alexkus · on Dec 17, 2012

> I wish I knew more about encryption. Unfortunately it isn't a very accessible area, even if you have a degree in CS you aren't really qualified to even touch the stuff, you need a maths degree at the very least.

A good Number Theory course will cover much of the basics of the common asymmetric encryption systems. You don't need to do a whole Maths degree on top of a CompSci degree (although I've done exactly that).

But before that, I'd definitely recommend Schneier's Applied Cryptography.

YZF · on Dec 17, 2012

I'm sure many CS graduate programs offer very good courses in cryptography. There are also books and online resources. Good ciphers/cipher schemes have withstood the test of time by large groups of experts and even under these circumstances some weaknesses are occasionally discovered. That's why a single person should not roll their own no matter what their qualifications are. The difference is those with more knowledge understand that.

geofft · on Dec 17, 2012

In my experience, "don't roll your own crypto" primarily means "do not reimplement a well-known algorithm, because you are likely to get it wrong and likely to leak side channels out the wazoo". I think most people have realized by now that inventing their own algorithm is a terrible idea. (Although certainly there are people who haven't realized that!)

dfox · on Dec 17, 2012

I think that most people mean inventing own primitives by "don't roll your own crypto". Side channel attacks on software implementations on software implementations of cryptography are certainly practical, but often not significant for overall security of the whole system (as either there are simpler attack vectors or the system has to be already partially compromised for the attack to be possible).

Amounts of home-grown algorithms (pseudo-"asymmetric" algorithms optimized for simple hardware or small messages), misused algorithms (RC4 without IV, mostly), otherwise secure protocols implemented with parties transposed around (many proprietary RFID and other authentication token protocols authenticate reader against token, not other way around) and plain weirdness in many even newly designed commercial cryptosystems today is still more significant that any kind of sidechannels caused by careless implementation.

manish_gill · on Dec 17, 2012

Implementing a well known algorithm can be a very good learning experience! I implemented AES-256 for my college project this semester. Learned a whole lot about Cryptography that I never know before. :)

B-Con · on Dec 17, 2012

> I'm sure many CS graduate programs offer very good courses in cryptography.

On the topic: I have an interest in crypto and for a while I was looking at grad schools for cryptography. The number that even offer a course is less than impressive, and the number that offer more than one class are shockingly few.