
Show HN: Write (almost) perfect encryption in an hour - singularity2001
https://github.com/pannous/xipher
======
tptacek
The author implies (without saying it) that this is a one-time pad. There are
two issues with that:

(a) This isn't a one-time pad at all, because the keystream comes from
/dev/random. This is, instead, a simple stream cipher keyed from the
/dev/random CSPRNG. Since /dev/random is difficult in practice to seed and run
deterministically, the only mode you can run it in is "one time pad mode". How
close to an actual OTP it is depends on your environment.

(b) One-time pads don't work well in practice. Enthusiasts fixate on the
"unbreakable" point without having a full mental model of how a cryptosystem
works. That robs them of the insight that (i) only a small part of the
cryptosystem is "theoretically unbreakable" with an OTP, and (ii) that part
is, in the history of cryptanalysis, _the least risky component of the
system_. In exchange for hardening what was already the strongest component of
any cryptosystem, OTP crypto makes grave sacrifices in other parts of the
system that are routinely and profitably exploited. It's a terrible tradeoff.

Seconding the (pragmatic) skeptical comments in this thread. Don't use
anything like this to encrypt real data. You are safer with any modern block
or stream cipher than you are with a ad-hoc stream cipher, which is what this
is in practice.

As a learning experience, though: keep at it! This isn't a bad start at all.

If you like/grok how this cipher works, a great next step towards practical
cryptography is to learn Salsa20, which is a very simple algorithmic keystream
generator that is fast and highly regarded by cryptographers.

But I think practical crypto engineering is a bad next step. The better next
step is to take this knowledge and apply it to practical _cryptanalysis_.
Trust me: if you're just getting started with crypto and you're a programmer,
you will get 1000x more insight from learning to break systems than you will
from building random systems. Here's a bunch of exercises that start in a
place that is very compatible with this XOR code:

[http://cryptopals.com/](http://cryptopals.com/)

~~~
dsacco
Your second point is very important for engineers to take to heart if they're
looking to develop cryptosystems.

One-time pads are fetishized because they are the only theoretical silver
bullet to cryptanalysis, but they have a bunch of problems that prevent them
from being serious solutions to real cryptography needs.

1\. They are just about the worst compromise between security and utility. The
one-time pad digest must be at least as long as the message, and any
communication is going to be more or less stateless because you need to
generate a new one-time pad for every message (by design) to prevent
cryptanalysis. This is great for an ivory tower exercise, not so great for
typical use cases.

2\. One-time pads have a terrible ideal to implementation ratio: they look
great _theoretically_ and are almost never implemented securely. Among other
things, it is very difficult to seed them properly.

3\. They do not solve the problem of key exchange whatsoever, which is
arguably a much larger vulnerability in practice. A one-time pad is alluring
for its standalone strength, but in practice it won't help you elsewhere.

Practically speaking, one-time pads don't solve any real problem in
cryptography in a meaningful way, and using them practically tends to attract
the sort of programmer that goes into cryptography with a dangerously
inaccurate belief from the outset: that you can really develop a silver-bullet
to cryptanalysis in real-world cryptosystems.

Other than that, completely agree that it's good practice, but there needs to
be a README disclosing that this is unsafe crypto (like just about every other
repository anywhere).

------
lawl
Oh god. Please don't use that code for anything remotely serious. It's fine as
a learning project, but please, PLEASE make that very clear in the readme.md
then.

from random.c

    
    
        srand(time(NULL));// seed
    

Here goes your security. All one needs to know is when you generated your
keyfile. srand() and rand() _ARE NOT_ secure for crypto applications.

~~~
TheJH_
Aaaactually, if you read the code, you'd see that he XORs the bytes from
rand() with bytes from /dev/random. This project is horrible, but that's not
one of the completely wrong parts.

~~~
lawl
Oops. You're right. I stand corrected.

------
mholt
This post may draw up a lot of criticism, but if nothing else, kudos to the
author for implementing a crypto scheme to understand how it works and for
encouraging others to examine and learn.

Now, that isn't to say that we should actually use this library (as-is). You
know, don't roll your own crypto for real use. With this particular algorithm,
you're putting a lot of faith in the RNG, and it's easy to miss that you need
a huge key (or lots of small ones).

Finally, the claim that it's "unbreakable, even in theory" (which comes from
the Wikipedia article) to me seems a bit misleading -- isn't the theory
usually stronger than any particular implementation? (The word "even" throws
me off.)

~~~
dllthomas
I think the idea is that systems with a finite unicity distance can be
"unbreakable, in practice" because brute force would take too long, but in
theory if you had enough ciphertext and _did_ exhaust the keyspace you could
show that only one possible key produces realistic output. When the key has as
many degrees of freedom as the plaintext, every possible valid output has a
possible key.

------
swehner
For one-time-pad encryption, both sides have a copy of the same secret key.

The name says "one-time," because if a key is used to encrypt two different
plaintexts, information will leak.

In this implementation, the results of two encryptions may be based on the
same bytes from the key file; not for all bytes, but for some. This is because
it chooses bytes from the file (function generate_random_offsets() at
[https://github.com/pannous/xipher/blob/master/encrypt.c](https://github.com/pannous/xipher/blob/master/encrypt.c)),
then writes both the address of those bytes and the XOR'd value to the result
of the encryption.

There is no provision to avoid picking the same offsets/key-bytes from one
encryption-run to the next, except that they are chosen at random. Because the
format of the encrypted messages is very simple, it can be easily determined
for two encrypted messages, which key-bytes they share. Remedies: store
between runs, remove 00 from file,define chunks in the file?

It's not too difficult to avoid sharing the secrets between two encryption
runs; it is not even necessary to chose the key-bytes from the file at random.
One could just use the key-bytes in sequence, send the address of the first
key-byte, and store the next available key-byte for the next run. When the
shared key-file doesn't have enough random bytes, because it has been used for
encrypting too many message, it can error out -- which it should, but this
implementation will never do that.

------
fletom
> What if you don't trust the random key generator on your computer? Just xor
> the generated key with some other files.

> NOTE: You can increase security significantly if you xor/encrypt zipped
> files, as they already contain very little structure!

> If your key is too small or if you are using it too often, you may be
> reducing security.

I hope it goes without saying that this thing should not be taken at all
seriously.

------
binwiederhier
In regards to this comment:
[https://github.com/pannous/xipher/blob/b62a3b864d435a1e61066...](https://github.com/pannous/xipher/blob/b62a3b864d435a1e61066fd6c5fd392e11cd1d96/random.c#L8):
/dev/urandom isn't insecure. Both /dev/urandom and /dev/random use CSPRNG.
There is a wonderful artical (with diagrams and all) at
[http://www.2uo.de/myths-about-urandom/](http://www.2uo.de/myths-about-
urandom/)

------
sz4kerto
"if(num_cypher_bytes<100000)// or num_bytes"

If num_bytes << num_cypher_bytes (and even more if we know the key length)
then the encryption leaks information.
[https://en.wikipedia.org/wiki/XOR_cipher](https://en.wikipedia.org/wiki/XOR_cipher)
Otherwise this is a stream cipher and it is really secure but it's
inconvenient :)

But it is a good starting point for anybody who's interested in crypto.

------
striking
Unlike a modern cipher like AES, XOR is vulnerable to a known-plaintext
attack. If your keys are truly random then you're fine, but otherwise you risk
exposing enough random output from your RNG that a seed can be found.

Also, weak RNG. Use arc4random. (And /dev/urandom isn't insecure, either.)

~~~
bluedino
It's kind of a joke post. "Use a one-time pad! Proven unbreakable! And
exchange the huge key on a physical CD with your intended destination!"

~~~
nawitus
Tinfoil Chat[1] is a semi-practical application of the one-time pad
encryption.

The paper mentions the inconvenience of sending the keys:

>Another common argument is that unlike symmetric PSKs, OTP keys also
eventually run out and users need to exchange another keyfile. The messages of
TFC only consume 420 bytes of key material. The drop in the price of storage
space has made the issue negligible: A $10 microSD card can store 4GB of
keyfiles to encrypt more than 9 million messages, which is in most cases
enough for several years or decades.

1\.
[https://www.cs.helsinki.fi/u/oottela/tfc.pdf](https://www.cs.helsinki.fi/u/oottela/tfc.pdf)

------
laurent123456
Maybe I don't get it, but that seems completely unsecure, isn't it? In fact I
wasn't sure if this was some kind of parody, especially XORing strings and
calling that "almost perfect encryption".

~~~
sz4kerto
> especially XORing strings and calling that "almost perfect encryption".

If the key size is at least as large as the plaintext AND the key is truly
random then it actually is a perfect encryption, because every single bit in
the output is going to be random. It is called 'one time pad' \--
[https://en.wikipedia.org/wiki/One-
time_pad](https://en.wikipedia.org/wiki/One-time_pad)

~~~
laurent123456
If the key must the same length as the message, and must be different for
every message, I guess we can't really call that encryption. May as well
communicate the complete message by which ever way the secret key was
communicated.

~~~
sz4kerto
No, think of the following scenario: you and your friend know that you want to
communicate in the future about something, maybe the outcome of some future
event. You generate a long random key, and somewhere later in the future, you
send the information using the pre-shared key.

It's a real encryption, much-much better than basically anything before
because it's really unbreakable.

