
Why is encryption so hard? - retube
So I see a lot of posts on HN that allude to difficulties when developing apps with encrypted comms. Do most languages not already have encrypt/decrypt libraries to leverage? E.g. I would expect to be able to find a public-private (RSA) implementation in most languages, where I could do something like:

String encryptedMessage = RSAEncryptor.encrypt(publicKey, message)

Is this not the case? Do such libs have bugs?
======
jgrahamc
You can certainly find libraries with interfaces like that. For example,
OpenSSL has extensive libraries for all sorts of cryptographic primitives and
protocols.

If you take a narrow focus on a particular cryptographic event (such as your
encryption of a string with an RSA public key) then you miss the greater story
about encryption: it's not just the individual cryptographic primitive that
needs to be implemented correctly, it's everything else.

An RSA encryption like that does not stand alone. Keys must be generated,
secured and distributed. The RSA library itself must be validated to ensure
that it works correctly. The actual primitive must be used correctly (in the
case of RSA don't use a stupid exponent as some have done). And the
environment within which the encryption is used must be understood and secured
(just look at the CRIME and BREACH attacks against TLS to see how something
'secure' can be broken because of something apparently irrelevant, in this
case, compression).
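
A toy sketch of the idea behind the compression attacks, in case it helps. The cookie value, the request layout and the function name are all invented for illustration, and a real attack recovers the secret a byte at a time against compressed-then-encrypted traffic. The point is only that encryption hides the content but not the length, and the length depends on how well the attacker's input matches the secret.

```python
import zlib

# Hypothetical secret that the victim's browser attaches to every request.
SECRET_COOKIE = "sessionid=7f3a9c1e5b2d8e4f6a0b1c2d3e4f5a6b"


def observed_length(attacker_controlled: str) -> int:
    """Size of the compressed request. An eavesdropper can still see this
    size when the compressed bytes are encrypted afterwards."""
    request = (
        f"GET /{attacker_controlled} HTTP/1.1\r\n"
        f"Cookie: {SECRET_COOKIE}\r\n"
    )
    return len(zlib.compress(request.encode()))


# The attacker controls the path and tries two guesses for the cookie.
right = observed_length("sessionid=7f3a9c1e5b2d8e4f6a0b1c2d3e4f5a6b")
wrong = observed_length("sessionid=c4e81b07d92f6a35e0b7c19d48f2a6e3")

# The correct guess duplicates the cookie, so the request compresses better.
print(right, wrong, right < wrong)
```

Nothing in the cipher is broken here; the leak comes entirely from the step before it.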

The overriding reason that encryption is 'hard' is that secure computer
systems have enemies and those enemies (attackers) will do _anything_ to
attack the system. They will attack it based on timing, compression problems,
flaws in the protocol, freezing the RAM to extract a private key, etc. etc.
There's really no end to the variety of things you can try to attack a
cryptosystem.

So, building a secure system may have encryption as a necessary condition, but
it's not sufficient. So much else can and will go horribly wrong.

If you are interested in this hit the books and understand the history of
cryptography. For example, look at how Vigenere was broken by Babbage, or the
Venona ciphers, or Lorenz. These 'old' ciphers can tell you a lot about how
people actually attack things. Then read about modern ciphers and attacks on
them. Wikipedia has much. Read about TEMPEST and imagine other attacks
possible in that way.

~~~
forgottenpaswrd
"They will attack it based on timing, compression problems, flaws in the
protocol, freezing the RAM to extract a private key, etc. etc. There's really
no end to the variety of things you can try to attack a cryptosystem."

I will say that in my experience 95% of attacks are going to be social
engineering rather than anything sophisticated, because the social route is
way easier.

~~~
podperson
The problem is that while probably FEWER than 5% of attacks are technical,
they generate industrial scale issues -- e.g. tens of thousands of stolen
identities.

Even so, while there have been high profile examples of encryption algorithms
being shown to be flawed (e.g. RSA had to reissue all its dongles a couple
years back because they were using a flawed RNG) I do not know of any actual
successful black-hat attacks along those lines (of course they may have
occurred undetected or not been disclosed or I may simply be ill-informed).

High profile security breaches are generally a result of poor or no
cryptographic practices, negligence (e.g. IEEE keeping its member records in a
plain text file on an FTP server), or (as you say) social engineering.

In short, while really good cryptography may be hard, halfway decent is not
hard, so it becomes a case of "assumed hard and left untried" rather than
"tried and found hard".

Finally: there's also the problem of security theater, such as forcing people
to change their passwords at a ridiculous rate.

------
e12e
Lots of good answers here. NaCl (salt) is one (relatively) recent effort to be
just such a library, see eg under the sub-heading "High Level Primitives" on
the features page:

    
    
      http://nacl.cr.yp.to/features.html
     
      High-level primitives
    
      A typical cryptographic library requires several
      steps to authenticate and encrypt a message.
      Consider, for example, the following typical
      combination of RSA, AES, etc.:
    
      * Generate a random AES key.
      * Use the AES key to encrypt the message.
      * Hash the encrypted message using SHA-256.
      * Read the sender's RSA secret key from
        "wire format."
      * Use the sender's RSA secret key to sign the
        hash.
      * Read the recipient's RSA public key from wire format.
      * Use the recipient's public key to encrypt the
        AES key, hash, and signature.
      * Convert the encrypted key, hash, and signature to wire
        format.
      * Concatenate with the encrypted message. 
    
      Sometimes even more steps are required for storage
      allocation, error handling, etc.
    
      NaCl provides a simple crypto_box function that
      does everything in one step. The function takes the
      sender's secret key, the recipient's public
      key, and a message, and produces an authenticated
      ciphertext. All objects are represented in wire
      format, as sequences of bytes suitable for
      transmission; the crypto_box function
      automatically handles all necessary conversions,
      initializations, etc.
    
    

Of course, "such libs have bugs" -- it is software after all. But bugs can
(and will) be fixed.

Somewhat unique to security and cryptography is the number of subtle bugs
possible. There are both problems of actual "normal" bugs (like the Debian
entropy bug) and system level design errors (like CRIME).
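
To show why an entropy bug is so much worse than an ordinary one, here is a rough sketch in the spirit of the Debian problem, where the process ID ended up as more or less the only input to key generation. `weak_keygen` and `brute_force` are hypothetical names, and Python's `random` module stands in for the broken generator; this is not the actual OpenSSL code path.

```python
import random
import secrets


def weak_keygen(pid: int) -> bytes:
    """Hypothetical broken key generator: the only 'entropy' is a process ID."""
    rng = random.Random(pid)  # predictable PRNG seeded from about 15 bits
    return rng.getrandbits(128).to_bytes(16, "big")


def brute_force(target_key: bytes, max_pid: int = 32768):
    """Regenerate every possible key until one matches the target."""
    for pid in range(max_pid):
        if weak_keygen(pid) == target_key:
            return pid
    return None


victim_key = weak_keygen(12345)      # looks like 128 random bits
recovered = brute_force(victim_key)  # found after at most 32768 tries
print(recovered)

# What the generator should have drawn from instead: the OS CSPRNG.
good_key = secrets.token_bytes(16)
```

The key is still 128 bits long and still looks random, which is why nothing flags the problem until someone enumerates the seeds.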

NaCl/Salt tries to reduce the number of errors possible by using the library
wrong (as opposed to eg: openssl that has a very (some say too) rich
interface). But you could still end up writing the secret key to swap. Or
doing something silly with the plain text. Or expose yourself to a buffer
overflow in the part of the code that renders those cute avatar-images for
your chat application.

edit: formatting

~~~
ippisl
Encryption is basically a tool against wiretappers and men in the middle. But
what if you don't fully trust your users, and fear they'll try to add exploits
to your software ?

Another high-level effort in security is Meredith Patterson's "language
theoretic security" [1], which can help you code secure protocols to fight
against this problem.

There's also a tool to help implement it called hammer [2]. Not sure if it's
fully developed yet.

[1][http://boingboing.net/2011/12/28/linguistics-turing-complete...](http://boingboing.net/2011/12/28/linguistics-turing-completene.html)

[2][https://github.com/abiggerhammer/hammer](https://github.com/abiggerhammer/hammer)

------
preinheimer
Simply invoking some sort of "encrypt" library is easy, it's everything else
that's hard, and you have to get it perfect.

- Simply encrypting your message as indicated will not protect you from
replay attacks. Someone could record your message and re-transmit it.

- Simply encrypting your message will not assure that the contents haven't
been modified; someone could patiently sit in the middle poking bits to see
what happens.

- Most encryption schemes will require you to choose a block cipher, and doing
so requires some knowledge of the options and the data you're sending. Some
handle large amounts of data poorly, others fail when you send identical
messages.

- Most encryption schemes will require you to initialize them with truly
random data. Both an early version of Netscape and Debian messed something up
and provided far less entropy than it appeared. Relying on /dev/urandom on a
machine that's just booted, or on otherwise faulty entropy providers, is fatal.

- Attackers can record your data and play with it forever, so even if a
mistake or attack isn't revealed for years, they can still go back and decrypt
your data. I believe the NSA broke the Russians' use of a one-time pad because
pages were re-used years later.

- Simply encrypting data doesn't provide assurances that you're communicating
with the system you think you are; the initial contact is still tricky.

So there's more to it than a single function call.
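
As a rough illustration of the first two points, here is a minimal sketch using only the Python standard library. It authenticates messages with HMAC-SHA256 and rejects replays with a message counter. The names `seal` and `Receiver`, the 8-byte counter and the message layout are my own assumptions, the payload is left unencrypted to keep the sketch short, and how both sides come to share `KEY` is exactly the "initial contact" problem from the last point.

```python
import hashlib
import hmac
import secrets

KEY = secrets.token_bytes(32)  # shared MAC key; distributing it is out of scope
TAG_LEN = 32                   # HMAC-SHA256 output size in bytes


def seal(counter: int, payload: bytes) -> bytes:
    """Prefix an 8-byte message counter and append an HMAC-SHA256 tag."""
    body = counter.to_bytes(8, "big") + payload
    return body + hmac.new(KEY, body, hashlib.sha256).digest()


class Receiver:
    def __init__(self) -> None:
        self.last_counter = -1  # highest counter accepted so far

    def open(self, message: bytes) -> bytes:
        if len(message) < 8 + TAG_LEN:
            raise ValueError("message too short")
        body, tag = message[:-TAG_LEN], message[-TAG_LEN:]
        expected = hmac.new(KEY, body, hashlib.sha256).digest()
        # Check the tag first, and in constant time.
        if not hmac.compare_digest(tag, expected):
            raise ValueError("message was modified")
        counter = int.from_bytes(body[:8], "big")
        if counter <= self.last_counter:
            raise ValueError("replayed or out-of-order message")
        self.last_counter = counter
        return body[8:]


receiver = Receiver()
message = seal(0, b"pay alice 10")
print(receiver.open(message))  # accepted the first time
# Calling receiver.open(message) again raises ValueError (replay), and so
# does opening any copy of a message with a single bit flipped.
```

Even this small piece has decisions that are easy to get wrong: the counter has to be covered by the tag, and the tag has to be checked before anything else in the message is trusted.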

------
sdevlin
A few reasons.

The overarching problem is that you don't really get any feedback about
whether what you're doing is right or wrong. For example, no cryptographer
would use RSA like that, but that's not obvious just from studying the wiki
article. Or from looking at the function output - it does turn ASCII into
gibberish, as advertised, and that's where most developers will call it a day.

The moving parts are also treacherous. You're not _just_ going to encrypt a
string - someone is meant to decrypt it. Have you authenticated the
ciphertext? Are you exposing a padding oracle? Or timing attacks? Are messages
susceptible to replay? In crypto systems, these things are equivalent to
locking the front door and leaving the window wide open.

In practice, most insecure crypto constructions aren't due to bugs in the
implementation of RSA or AES. They're because of developers choosing
inappropriate primitives, gluing them together incorrectly, or inadvertently
exposing dangerous side channels.

Fortunately, there are libraries that can help. As mentioned elsewhere,
NaCl/Sodium and KeyCzar provide higher-level interfaces that can abstract away
many of these issues.

------
andrewcooke
To answer the "why is it hard?" question, I tried to collect my own
experiences at
[http://www.acooke.org/cute/WhyandHowW0.html](http://www.acooke.org/cute/WhyandHowW0.html)
- not sure I did a good job, but the main conclusion was that you
underestimate how important experience is in avoiding errors.

To repeat what others have said in answer to your more general question -
solutions to "real world" problems include more than a single call to a
primitive. So you need to find libraries that provide a higher level API, like
parts of NaCl [http://nacl.cr.yp.to/](http://nacl.cr.yp.to/), Google's keyczar
[http://www.keyczar.org/](http://www.keyczar.org/), etc.

Even for simply encrypting a string with a password -
[https://pypi.python.org/pypi/simple-crypt](https://pypi.python.org/pypi/simple-crypt)
which is what I talk about in the first link - I needed three things: key
strengthening, the encryption itself, and an HMAC. Making those work well
together was harder than I expected (at least 5 bugs harder...)
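
For a feel of how those pieces fit together, here is a minimal standard-library sketch of the key strengthening and HMAC parts. It is not how simple-crypt is implemented: the iteration count, the labels and the function names are assumptions, and the cipher itself is left as a placeholder because the Python standard library does not ship one.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor, not a recommendation


def derive_keys(password: str, salt: bytes):
    """Stretch the password once, then split the result into two keys so
    the cipher and the MAC never share key material."""
    master = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    enc_key = hmac.new(master, b"encryption", hashlib.sha256).digest()
    mac_key = hmac.new(master, b"authentication", hashlib.sha256).digest()
    return enc_key, mac_key


def authenticate(mac_key: bytes, salt: bytes, ciphertext: bytes) -> bytes:
    """Tag everything the receiver will rely on: the salt and the ciphertext."""
    return hmac.new(mac_key, salt + ciphertext, hashlib.sha256).digest()


salt = secrets.token_bytes(16)  # random per message, stored with the output
enc_key, mac_key = derive_keys("correct horse battery staple", salt)
ciphertext = b"<output of a real cipher keyed with enc_key>"  # placeholder
tag = authenticate(mac_key, salt, ciphertext)
```

The receiver would re-derive both keys from the password and the stored salt, recompute the tag, and compare it with `hmac.compare_digest` before attempting to decrypt anything.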

------
gcv
Lots of good answers on this thread. I think the fundamental underlying reason
is that programming is difficult and so poorly understood.

A given: all software has bugs. Usually, that doesn't matter — a CRUD app will
eventually get debugged enough to the point of usability. (Sometimes even
maintainability.) We do not understand enough about programming to guarantee
perfect execution in all cases, but no one gains any value by causing an
obscure input case to cause a null pointer exception.

Whenever we use crypto, however, we inherently have code which protects
something valuable: from forum passwords to credit card numbers to state
secrets. This means that all the subtleties which break in ordinary code, but
no one cares about, suddenly become important. Every interaction of input to
memory to processing to storage (to network) must be scrutinized for places
where a crucial piece of data may leak an encryption key, or perhaps just
enough known plaintext or known ciphertext to mount an attack.

------
sarbogast
Also, encryption is all about maths, so there are hundreds of ways to do just
about anything, different parameters, different algorithms with different
tradeoffs about speed, performance, resistance to attacks, data bandwidth,
etc. etc. So I don't think a library with the kind of interface you describe
would be very useful. But I do think it would be great to have a library that
allows us to configure encryption based on requirements instead of
technicalities.

------
Zigurd
There are two facts about crypto that often get mixed up in these discussions:

1. For a high value target like Edward Snowden, there is a broad spectrum of
attacks, and any operational weakness is fatal. There are many examples of
these attacks described on this thread. Unless you know what Snowden knows,
odds are you will not get it right.

2. BUT, if everyone had easy encrypted email and real time communication, the
mass surveillance machine would be blinded, because the kinds of attacks that
are used against high value targets do not scale up well.

------
pothibo
I guess the main problem is that encryption is foreign to most of us (myself
included). It's hard to understand what is safe from what isn't.

It's also very hard to figure out if your encryption is bugged or not. I guess
that for most of us, once your method returns a hash, you expect that
everything is secure.

On a side note, I wonder how many people on HN would claim to know encryption
inside out. (Not the difference between SHA1/MD5/bcrypt but the actual math
behind derivations and how they work)

------
nilved
Encryption being painfully and needlessly difficult is one reason why it isn't
widespread on both the business end and the consumer end. GPG, which
_everybody_ should use for email, has one of the most terrible interfaces
conceived. It is absolutely no surprise that people would rather be spied on
than spend a week getting that POS working.

There's a massive market for easy-to-use encryption. Easy-to-use does not
imply insecure in any way at all.

~~~
bostik
There are two orthogonal aspects, and you touch both of them.

1) All software has bugs. The problem with crypto software is that bugs are
far more dangerous, even when they may appear to be insignificant. To use an
utterly broken analogy: a faulty seal in a pressure cooker does not cause all
the locks and hinges in your house to magically unscrew themselves. But even a
slightly broken crypto implementation in software _can_ cause a complete
breach. (No pun intended.)

2) Developing good and intuitive UIs is hard. When the UI has to hide the
complexity of secure key management, it's even harder. Humans are by nature
lazy and inventive; if the UI allows any way to achieve convenience over
security, convenience _will_ be what most of the users choose.

In my mind, there is one particular implementation where adding security
allowed more convenience. The humble ssh-agent. When used properly, you don't
_need_ to know any of the passwords for remote systems. And unfortunately, the
most convenient way to achieve this is, of course, to leave the private key
either unencrypted or protected with an empty passphrase...

------
Shish2k
AFAIK there's no "encryption for humans" library (at least no widely known,
widely used, widely tested one) - they all rely on the developer to specify
the right parameters into the function, with no sanity checking asserts.

The results of this is things like the developer who used "1" as the
multiplication factor, so to decrypt the data, you need to divide each block
by 1...

------
weavejester
Sometimes crypto libraries have bugs, but it's also easy to use them
incorrectly, especially if you don't have a good understanding of
cryptography.

For example, a common mistake is to assume that by encrypting something,
attackers can no longer change it. Or perhaps you'll use your standard
equality operation to check whether a decrypted string matches some value,
without thinking about timing attacks. Or maybe you'll just use AES in ECB
mode.
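
A small sketch of the equality point, using only the Python standard library. `naive_equal` is a hypothetical stand-in for what an ordinary comparison does internally; `hmac.compare_digest` is the real standard-library function meant for comparing secrets.

```python
import hmac


def naive_equal(a: bytes, b: bytes) -> bool:
    """Roughly what an ordinary comparison does: stop at the first
    mismatching byte, so the running time hints at how many leading
    bytes of a guess were correct."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # early exit is the timing side channel
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Comparison whose running time does not depend on where the
    inputs differ."""
    return hmac.compare_digest(a, b)


expected_tag = bytes.fromhex("8f14e45fceea167a5a36dedd4bea2543")
guess = bytes.fromhex("8f14e45fceea167a5a36dedd4bea2500")
print(naive_equal(expected_tag, guess), constant_time_equal(expected_tag, guess))
```

Both functions return the same answers, which is the problem: no functional test will ever tell you that you picked the wrong one.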

------
reaperhulk
Just in your example there's already a problem. If you don't use something
like OAEP padding (PKCS1 v1.5 padding has been proven to have issues) then
you're vulnerable to attack (see: Bleichenbacher
[http://www.bell-labs.com/user/bleichen/papers/pkcs.ps](http://www.bell-labs.com/user/bleichen/papers/pkcs.ps)).
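
To see why padding matters at all, here is a toy sketch of unpadded "textbook" RSA with deliberately tiny primes. It is for illustration only and is not the Bleichenbacher attack itself. It shows two properties that a randomized padding scheme such as OAEP is meant to remove: the same message always encrypts to the same ciphertext, and ciphertexts can be combined without knowing the key.

```python
# Toy textbook RSA; the primes are far too small to be secure.
p, q = 61, 53
n = p * q                          # public modulus, 3233
e = 17                             # public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (needs Python 3.8+)


def encrypt(m: int) -> int:
    return pow(m, e, n)


def decrypt(c: int) -> int:
    return pow(c, d, n)


# Deterministic: an attacker who can guess the message can confirm the guess.
print(encrypt(42) == encrypt(42))  # True

# Malleable: multiplying ciphertexts multiplies the hidden plaintexts.
combined = (encrypt(6) * encrypt(7)) % n
print(decrypt(combined))  # 42
```

A call like `RSAEncryptor.encrypt(publicKey, message)` gives no hint about which of these behaviours you are getting; that depends on the padding the library chose for you.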

------
geoffsanders
For anyone interested, we have encryption examples in Python (PyCrypto &
M2Crypto library), Ruby (OpenSSL), PHP (phpseclib), Java (Spongy Castle) and
Objective-C (CommonCrypto) here:
[https://launchkey.com/docs/api/encryption](https://launchkey.com/docs/api/encryption)

disclosure: I'm a co-founder of LaunchKey

------
skrowl
I know it's cool to hate Microsoft and .NET here on [Y], but .NET framework
actually comes with a ton of encryption classes & methods -
[http://msdn.microsoft.com/en-us/library/system.security.cryp...](http://msdn.microsoft.com/en-us/library/system.security.cryptography.aspx)

------
joshka
There is a crypto challenge that explains many of the flaws in crypto done not
exactly right, by giving real examples / puzzles on how to break the crypto.
See
[http://www.matasano.com/articles/crypto-challenges/](http://www.matasano.com/articles/crypto-challenges/)

------
gamachemarkr
This is exactly what the NSA wants you to think! Encryption is only a tiny
part of the problem space, and yet still gets broken in fun ways (padding
oracles, bad RNGs, etc). The more difficult part is key management and
distribution. This is where crypto rubs up against the human. Humans suck.

------
NegativeK
As a different, less technical response: crypto is so hard because it's
natural to assume that cryptanalysts are so persistent.

A good crypto library should keep your data safe for decades. We don't make
the same demands (no bugs, due to no updates possible) of other software that
often.

------
falsedan
Encryption is easy, security is hard. Every time you increase the security of
a product, you decrease usability.

e.g. easiest to use :: SSH with password <<>> SSH with passphraseless keys
<<>> SSH with passphrase-protected keys :: most secure

~~~
tbrownaw
This is not entirely correct.

For your SSH example, pubkey authentication has a bit of a learning curve _to
set up_ but then makes _ongoing_ use far easier.

For networked file systems, at work we have a common shared drive (H:) and a
private shared drive (J:). The private drive is clearly more secure as people
who shouldn't have access can't even _see_ the files, but it's also much
easier to use as you don't have to sort thru tons of other people's old crap.

------
bradleyjg
[https://code.google.com/p/keyczar/](https://code.google.com/p/keyczar/)

From their homepage:

Crypter crypter = new Crypter("/path/to/your/keys");

String ciphertext = crypter.encrypt("Secret message");

------
ra
Encryption is hard because computers are constrained by (but exceptionally
good at) discrete maths. All encryption does is slow cryptographic attacks
down (a lot).

Also, proper key management is out of reach for most of us.

------
mangeletti
More simply put, and to answer the main (title) question, the reason
encryption is so hard is because it's usually economically rewarding to break.

------
a3n
To step back a bit from the tech problems -- encryption is hard because not
everybody uses it.

I think what we need, for email at least, is a completely new protocol that's
end to end secure (as hard as that is). The problem though is that I don't
think something like this can be done anymore, without "interested"
corporations co-opting or talking it to death. The golden age of the internet
is gone.

~~~
remosi
So, reading many of the replies in this thread, they all cover good points,
but I have a slightly different point: The current encryption libraries make
the "easy" stuff hard. Assume that all the hard work of actually doing things
is taken care of, and just look at the API of the command line tools (the C
APIs are generally worse).

The NSS command line tools require you to provide an entropy file of
"sufficient size". "Oh!" you think, "I'm on linux, I can just use
/dev/random". No, it tries to read the entire file, and thus hangs infinitely
after using up all your entropy. "Ok, I'll do something like: dd
if=/dev/random of=- bs=1024 count=1 | .... --nonce-file /proc/self/fd/0",
nope, doesn't work, because it tries to read from this file multiple times.
(Also how big is "sufficient"? 8192 bits seems like an overkill amount of
entropy that I just threw down the toilet, but the opposite case is even
worse.)

The openssl command line tool has a simple "ca" command, which appears to be
the "State of the art", but doesn't support revoking keys or even concurrent
access. The openssl command line in general is confusing and difficult to
remember.

PKCS#11 is a nice C api, but there's no standardised config to tell the
machine to use PKCS#11 for everything, so you have to specify it on every
command line. Forget it just once and you've possibly just created a key that's
not in your HSM and you might not notice that.

A little bit of UI TLC would make the world much more secure. SSH sees high
levels of uptake because while it has strong crypto, it's relatively easy to
use.

Exploits with people using TLS and then not properly verifying the far end's
cert are usually down to the fact that the library they're using doesn't
provide a "just verify the cert for me" option but expects every developer to
implement it themselves. Most libraries will default to ridiculously stupid
ciphers, even if the other end will let you negotiate stronger ciphers, again
requiring every application developer to make sure that they have overridden
the defaults to something sane, and perhaps just offloading the problem onto
the system administrator.
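
For comparison, this is roughly what a "just verify the cert for me" default can look like, sketched with Python's standard `ssl` module. `make_client_context` is a hypothetical helper name and the TLS 1.2 floor is my own assumption about what counts as sane; `ssl.create_default_context()` itself is the real call, and it turns on certificate and hostname verification without the developer asking.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client-side TLS settings with verification left at its secure default."""
    ctx = ssl.create_default_context()  # loads the system's trusted CA certs
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # assumed floor, adjust as needed
    return ctx


ctx = make_client_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # certificate chain is checked
print(ctx.check_hostname)                    # hostname must match the cert
```

A connection would then be wrapped with something like `ctx.wrap_socket(sock, server_hostname="example.org")`, where the hostname is a placeholder. The developer has to go out of their way to switch verification off rather than to switch it on.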

djb's NaCl has a nice API, but it's incompatible with everything, so you can't
use it to interoperate with the rest of the world. (Also djb has a history of
abandonware, and licenses that prevent other people from continuing to
maintain the software as the realities around it evolve, so it's probably not
useful for something that's going to last >3-5 years.).

Crypto is notoriously difficult to get right, but the API of the current tools
(both the language bindings and CLI tools) makes it far harder than it needs
to be, inviting a slew of additional errors.

