Plus every time anyone brings up the subject the answer is always the same: "If you roll your own crypto you're an idiot." Which is not very encouraging. Likely why crypto' libraries are so antiquated and terrible.
1. "Rolling your own crypto" meaning "inventing your own algorithms".
If you invent your own algo for creating one-way hashes, performing symmetric encryption etc and are not an active, up-to-date cryptographer ... you will make a mistake.
This is a world for academics and researchers; for open publication of algorithms followed by the open publication of attacks on those algorithms.
2. "Rolling your own" meaning "Assembling known cryptographic primitives into a larger system".
If you assemble well-studied primitives (SHA-2, AES, HMAC) into larger systems ("I've invented a simple messaging system, guys!"), you are probably still mistaken.
But at least you won't be as much of a professional embarrassment (unless you're keeping passwords in plaintext or some such thing).
At this level, you're between academia and industry. Design at this level is mostly about using certain rules of thumb, exercising defence-in-depth and taking inspiration from existing designs that are battle-hardened. For example, when using multiple kinds of cryptographic operation, have different keys for each operation to mitigate potential bit-leaking attacks.
The good news is that you can get professional reviews done. Seek out respected security experts and ask if they are prepared to exchange labour for currency.
Then they hash passwords because plain text passwords is dumb.
We're doomed, i'm telling you, doomed. I sometimes want to just yell: ALL YOUR PLAINTEXT PASSWORDS ARE.. PLAIN TEXT IN MEMORY RIGHT NOW DUMBASS.
But obviously, indeed, I just explain it nicely (and generally still get ignored, mind you)
Oh and hi python developers, you can't mlock memory let alone wipe it properly (yeah, i know, you can erase it with some ctype binding and _hope_ it didn't hit swap)
If I can pull the in-memory password value out of your program, I also have access to inserting my own code into your process to prevent any hashing attempts you may make on that memory. Once memory safety comes into play, all bets are off, and all security is irrelevant, as all your security is simply data in memory as well.
The reason we don't store plain text passwords to disk is because stuff on disk tends to be backed up and placed in positions where it has a chance to be read by others. Storing plain text passwords is not actually a security problem in itself if you can guarantee safety of those plain text passwords - but so far, such guarantees tend to be very false.
The only defense against this is to make sure that the sensitive information is never paged out, using either mlock() or just not having any swap on your system.
There's also the class of bug whereby the software starts spitting out stuff from somewhere in memory that was never meant to be displayed, a la the Minus World in Super Mario Bros. Those are much less common in these days of memory protection and high-level languages, but it is still a concern.
Wiping off memory ensure that the attacker doesn't get the past 29329847 logins, only the ones from the moment he compromised the system. Thus, it's quite a win.
There's no easy "all bets are off" escape from properly coding apps so that sensitive data is kept only when needed, for the time needed, never more.
I would love to see a solution to the fearmongering that still keeps people aware of their mistakes. Perhaps public development is the answer -- many eyes make all bugs shallow, so all we need is more people who feel like their eyes are qualified. How many people have actually read the OpenSSL or gnutls or GPG source with an eye to its security?
Anyway, pulic development was the anwer that I'd give here. Keep in mind that even the researchers often make mistakes when they implement the algorithms, and the main reason that their implementation is better than the GP's or mine is not because they know more (altough they do know more), but because there are knowledgeable people doing QA on it.
Anyway, you are idiot only if you use your own imlementation for anything serious. There is no problem in rolling your own crypto if you only do it for fun.
But rolling your own crypto, then attacking it, is probably a good learning exercise. Having a group of people rolling their own crypto, and then attacking each other's attempts is probably better.
Somewhere there's a list of common mistakes people make, and it's surprising how often those mistakes are found in commercial "MILITARY GRADE ENCRYPTION!"
Doing crypto right is really, really, really hard. Not only do you get to fight against advanced math, but if you get even the tiniest detail wrong somewhere, leak even a little bit of state, even if what you leak doesn't even look like state, someone will hang you with it.
Such as, does the length of time taken to de/encrypt ever vary based on the internal state? And in your answer, do consider all the interesting ways in which the microarchitecture details like branch prediction or TLBs of the CPU you run the code on affect your code.
Time and time again, organizations with lots of money have decided to roll their own crypto to solve a problem they have. And it has almost inevitably failed.
Your last sentence is kind of hard to believe. I'm sure nobody ever even attempted to crack the vast majority of stuff that's encrypted out there.
You can also find many useful resources on this nice blog :
It's been good, as in I never could have followed this thread before it, but I don't know about the first run, but there has been little to no professor interaction with this run of the course.
Which is a bit unfortunate. I don't have any education in modular arithmetic and I can only get half of those questions correct. Being completely lost as to why the other half are wrong, and being unable to get help or worked out answers (to be able to figure out where I'm going wrong before the final) really isn't the best.
Thanks for the info.
As for professor interaction, it is a byproduct of the MOOC formula : don't expect too much from the professor, he is already teaching to thousands of students ! You'll have to manage your way through the discussion forums (who unfortunately have a lot of noise). During all the courses I took, I always managed to find answers like that.
The previous course I took with CourseRA had a decent amount of Prof and TA interaction, and the forums were well tended. But, I guess that could be down to the fact that the course was being used by the professor to instruct his actual University students at the same time.
If you have two prime number it's trivial to multiply them and get the result. Or if you have the result and one prime number, it's also trivial to divide and get the second prime number. But if you have that result, figuring out which two prime numbers divide it requires you to check every single one of them - which takes much longer.
Now imagine your password is one of the prime numbers, and your secret message is converted to a different prime number. Multiply your password with the secret message and you have your result. That's the encrypted message.
Without knowing what the password was, there is no feasible way to figure out what the message was - you would have to divide by every prime number looking for another prime number that evenly divides into the encrypted message. (Technically once you do that you don't know if that prime number was the message or the password.)
This is the core of encryption - the trouble is the side-details can mess you up, and they are really not obvious. For example: How do you securely convert a message into a prime number? Or a password into a prime number?
You can't use the entire message as one enormous prime number, that's not feasible, so you have to break it up. How do you do that properly, without leaking information?
There are other issues with the math (I'm not familiar enough with encryption to even know what they are, but they are things like certain primes have patterns that make themself obvious in the result). See http://en.wikipedia.org/wiki/Semiprime for some primes patterns to avoid.
Then there are issues with the implementation - if you are on a multi user system (or a remote system) you want to make sure people can't figure out the password by watching which system resources you use, or how long things take you.
And then issues with the algorithm: If you encrypt two documents with the same password it can be possible to compare them and figure out what you used. So you want to add some randomization to the password to make that impossible. But getting good random numbers isn't always easy. And you have to store that random number in the message so it can be reversed - but watch out that you didn't do that in a way that negates the advantage of the random number.
One-time pad: before Alice and Bob need to send secret messages to one another, they generate a large blob of random bits and each take one copy of it. They keep these bits very secret. When one needs to send an N-bit message to the other, s/he takes the next N bits of the blob and XORs them with the message. This is trivial to do and provably impossible to break -- except that making and sharing and keeping all that key data (the big blob of bits) is really inconvenient and easy to get wrong.
So, anyway, your system has the same weaknesses as any other one-time pad: you need new key data for every message you send, which means you have to make a lot of key data, share it with your partner in crime, and remember it for a long time, all without leaking any of it.
It's not quite the same as a one-time pad. You can get by with fewer bits of key than bits of message, provided there's still "enough" key. On the other hand, "enough" is quite a lot, much more than the key length of a typical symmetric cryptosystem like AES.
Anyway, I don't think this counts as "by far the simplest encryption to understand". A standard one-time pad is simpler and stronger, and the extra complexity of multiplying large prime numbers doesn't really buy you anything much to compensate.
But you are right that as described it is a lot like a one time pad. (Especially since if you encrypt a known message you can instantly find the password.)
It does not. There are many factorization algorithms that are significantly faster than trial division for large numbers (http://mathworld.wolfram.com/PrimeFactorizationAlgorithms.ht...)
And your description makes little sense to me. I guess it is based on a bad understanding of RSA (http://en.wikipedia.org/wiki/RSA_%28algorithm%29). That does not convert your message to a prime.
Read the description again then, RSA is a public key algorithm, and is not what I am describing.
Size is key though. A rough guide of the GNFS times for a quad core PC would give:-
c100 - c120: hours
c120 - c140: days
c140 - c160: weeks
c160 - c180: months
c180 - c200: years
For comparative sizing:-
2^256 =~ c80
2^512 =~ c160
2^1024 =~ c320
Part of the hard work in choosing random primes to use within such encryption algorithms is avoiding numbers which are susceptible to each of those algorithms, e.g. avoiding primes with simple p-1 factorisations to avoid being susceptible to Pollard's P-1 algorithm.
The wrong choice of prime and the factorisation becomes (relatively) trivial.
I'm not sure how a modern CPU would do, but I suspect on the low end of "weeks".
A few months to crack a 45 bit encryption doesn't sound quick to me. Even a 64 bit encryption would take many centuries at that rate.
If you want to get into the actual mathematics of it, then it is considerably more difficult. Luckily for me I took courses in pure math, RSA and some hashing functions at university, so I had to learn it :-)
As others have said, crypto is properly hard and it's very easy to make cock-ups. So unless you're prepared to do the years of work becoming a proper researcher, don't try and design any algorithms yourself (for fun is ok, but not for real world use).
A better book is _Practical Cryptography_, now called _Cryptography Engineering_, which Schneier also co-authored.
Rolling your own version of a popular crypto algorithm using only the spec and then seeing how many of the well known attacks your implementation is vulnerable to is quite educational.
Interactive tutorials on both areas is something many people I know would pay for.
A good Number Theory course will cover much of the basics of the common asymmetric encryption systems. You don't need to do a whole Maths degree on top of a CompSci degree (although I've done exactly that).
But before that, I'd definitely recommend Schneier's Applied Cryptography.
Amounts of home-grown algorithms (pseudo-"asymmetric" algorithms optimized for simple hardware or small messages), misused algorithms (RC4 without IV, mostly), otherwise secure protocols implemented with parties transposed around (many proprietary RFID and other authentication token protocols authenticate reader against token, not other way around) and plain weirdness in many even newly designed commercial cryptosystems today is still more significant that any kind of sidechannels caused by careless implementation.
On the topic: I have an interest in crypto and for a while I was looking at grad schools for cryptography. The number that even offer a course is less than impressive, and the number that offer more than one class are shockingly few.