Hacker News new | comments | show | ask | jobs | submit login

By far the simplest encryption to understand is the factorization problem.

If you have two prime number it's trivial to multiply them and get the result. Or if you have the result and one prime number, it's also trivial to divide and get the second prime number. But if you have that result, figuring out which two prime numbers divide it requires you to check every single one of them - which takes much longer.

Now imagine your password is one of the prime numbers, and your secret message is converted to a different prime number. Multiply your password with the secret message and you have your result. That's the encrypted message.

Without knowing what the password was, there is no feasible way to figure out what the message was - you would have to divide by every prime number looking for another prime number that evenly divides into the encrypted message. (Technically once you do that you don't know if that prime number was the message or the password.)

This is the core of encryption - the trouble is the side-details can mess you up, and they are really not obvious. For example: How do you securely convert a message into a prime number? Or a password into a prime number?

You can't use the entire message as one enormous prime number, that's not feasible, so you have to break it up. How do you do that properly, without leaking information?

There are other issues with the math (I'm not familiar enough with encryption to even know what they are, but they are things like certain primes have patterns that make themself obvious in the result). See http://en.wikipedia.org/wiki/Semiprime for some primes patterns to avoid.

Then there are issues with the implementation - if you are on a multi user system (or a remote system) you want to make sure people can't figure out the password by watching which system resources you use, or how long things take you.

And then issues with the algorithm: If you encrypt two documents with the same password it can be possible to compare them and figure out what you used. So you want to add some randomization to the password to make that impossible. But getting good random numbers isn't always easy. And you have to store that random number in the message so it can be reversed - but watch out that you didn't do that in a way that negates the advantage of the random number.




This system is (almost) just a variant of the one-time pad.

One-time pad: before Alice and Bob need to send secret messages to one another, they generate a large blob of random bits and each take one copy of it. They keep these bits very secret. When one needs to send an N-bit message to the other, s/he takes the next N bits of the blob and XORs them with the message. This is trivial to do and provably impossible to break -- except that making and sharing and keeping all that key data (the big blob of bits) is really inconvenient and easy to get wrong.

So, anyway, your system has the same weaknesses as any other one-time pad: you need new key data for every message you send, which means you have to make a lot of key data, share it with your partner in crime, and remember it for a long time, all without leaking any of it.

It's not quite the same as a one-time pad. You can get by with fewer bits of key than bits of message, provided there's still "enough" key. On the other hand, "enough" is quite a lot, much more than the key length of a typical symmetric cryptosystem like AES.

Anyway, I don't think this counts as "by far the simplest encryption to understand". A standard one-time pad is simpler and stronger, and the extra complexity of multiplying large prime numbers doesn't really buy you anything much to compensate.


I don't intend for anyone to actually implement this. It's just to understand how encryption can work. i.e. how it's possible for an operation in one direction to be quick and easy, but the reverse operation is hard and slow. (And why it's hard to implement properly.)

But you are right that as described it is a lot like a one time pad. (Especially since if you encrypt a known message you can instantly find the password.)


"But if you have that result, figuring out which two prime numbers divide it requires you to check every single one of them"

It does not. There are many factorization algorithms that are significantly faster than trial division for large numbers (http://mathworld.wolfram.com/PrimeFactorizationAlgorithms.ht...)

And your description makes little sense to me. I guess it is based on a bad understanding of RSA (http://en.wikipedia.org/wiki/RSA_%28algorithm%29). That does not convert your message to a prime.


Not for numbers made of only two primes (i.e. a semiprime). Those algorithms are for numbers with more than two prime divisors.

Read the description again then, RSA is a public key algorithm, and is not what I am describing.


Many, but not all, of those algorithms will work for numbers which are a product of two roughly even sized primes (depending on size). Especially the Quadratic Sieve and Number Field Sieve class of algorithms (NFS, GNFS, SNFS, etc).

Size is key though. A rough guide of the GNFS times for a quad core PC would give:-

    c100 - c120: hours
    c120 - c140: days
    c140 - c160: weeks
    c160 - c180: months
    c180 - c200: years 
    
    source: http://www.mersenneforum.org/showthread.php?t=16311
(c100 means a 100 digit composite number, doesn't matter if it is just the sum of just two evenly sized primes).

For comparative sizing:-

    2^256 =~ c80
    2^512 =~ c160
    2^1024 =~ c320
So a 512-bit composite can be quite reliably factorised in a few months by a commodity PC. Parallelisation is a partial help, the first stage of the algorithm can be distributed, but the final stage(s) (~25% of the work) still requires a single machine to do the work.

Part of the hard work in choosing random primes to use within such encryption algorithms is avoiding numbers which are susceptible to each of those algorithms, e.g. avoiding primes with simple p-1 factorisations to avoid being susceptible to Pollard's P-1 algorithm.

The wrong choice of prime and the factorisation becomes (relatively) trivial.


512 bit keys taking weeks was the case a few years ago on old hardware, but even then you could do it in a few hours/days with a modest BOINC setup.

I'm not sure how a modern CPU would do, but I suspect on the low end of "weeks".


A 512 bit composite is not a lot of entropy - it's about 45 bits of entropy, which isn't much. sqrt(512) * 2 (since each prime is approx the size of the sqrt of the number, and there are two of them).

A few months to crack a 45 bit encryption doesn't sound quick to me. Even a 64 bit encryption would take many centuries at that rate.


Each prime is approximately sqrt of the product, not of size that is sqrt of size of product, the size of primes is approximately half of the size of their product. As they are primes their entropy is slightly lower than their length, but not too much (there are approximately x/ln(x) primes smaller than x, which would lead to entropy of 256bit prime being 248.5bits)




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | DMCA | Apply to YC | Contact

Search: