Hacker News new | past | comments | ask | show | jobs | submit login
How I Learned Symmetric-Key Cryptanalysis (akircanski.github.io)
156 points by aleks224 3 months ago | hide | past | favorite | 31 comments

This is a great post. I'm no expert, but for the nuts and bolts of differential and linear cryptanalysis (as the author points out, some modern attacks derive from these strategies), I got a lot out of the Howard Heys tutorial:


In particular, it's got good worked examples you can follow along with.

Cracking DES as set 9 of cryptopals [0] :) ? Awesome challenges in general, of course, but iirc no actually breaking a symmetric key cipher ("actually" doing a lot of work here, I admit, since there's all kinds of oracle attacks which are awesome!).

[0] For the uninitiated: https://cryptopals.com, which is of the parent's and collaborators' creation!

Ninja edit to add: This is all in good fun, recognizing that cryptopals focuses on real-world crypto that actually is used today!

The reassuring thing about DES is that DES is actually broken only for the reasons people knew about when DES was standardised in the 1970s.

The DES key size is too small (56 bits) and the DES block size is too small (64 bits).

Practical attacks on DES (as opposed to stuff like oracles that isn't a block cipher problem per se) all attack these known weaknesses of DES, theoretically it's still fine, within the bounds of those two fatal limitations.

That's reassuring because it means we're probably done. AES is faster, and it fixes the two things that are wrong with DES by having the longer keys (128-bit or 256-bit) and the larger blocks (128-bit) and so if DES is any indication there won't be a need to replace AES in the foreseeable future.

But I'm pretty sure it makes this hypothetical Cryptopals set silly. On specialist hardware DES cracking via these two obvious flaws is practical, though not exactly cheap, but "Pay somebody some Buttcoins to crack the key" isn't much of a Cryptopals exercise, and "Build your own DES cracker" is more hardcore electronics project than crypto introduction.

No need to replace AES except for implementation concerns, because constant-time AES on a CPU with caches but without AES in hardware is an absolute nightmare. I mean, have you seen how bitslicing works? It’s awful. Brilliantly clever, but still awful.

I believe better performance without hardware specializations is why the chacha cipher was invented and added to TLS. I even think most big websites prefer it, for better mobile experience.

> I even think most big websites prefer it, for better mobile experience.

Actually the clients get to present a list of ciphers in descending order of preference, a server can (and most will) choose the first from the list that they're willing to use.

So from most heavier devices AES will be chosen because they have a hardware AES accelerator and so they put AES suites at the top of the list, while devices that don't are likely to put a suite with ChaCha20 at the top.

AES is Mandatory To Implement for modern TLS, even if your client can't do it efficiently you will need to support it in case your peer doesn't want to agree anything else.

The point, of course, is not cracking DES, but instead understanding cipher design, which has not ended with AES.

I liked Heys so much that I thought about putting together a block cipher cryptanalysis Set 9, but I'd much rather do someone else's Set 9 and learn from it. Maybe I can troll Aleks and Thomas Pornin into doing it.

For anyone who wants Cryptopals set 8 (not linked on the website unfortunately):


Thats what I was aiming for a while ago but never got to finish https://davidwong.fr/blockbreakers/

It's kinda surprising that symmetric-key algorithms have been so resistant to attacks. I'm primarily an applications developer, and I feel like I've used tons of asymmetric key algorithms. Where is symmetric key cryptography used for nowadays in normal applications programming?

Are there places where it would be well-suited, but we don't really use it because the default is reaching for our public keys?

> Where is symmetric key cryptography used for nowadays in normal applications programming?

Practically every time you use asymmetric keys what they really encrypt with them is a symmetric key that encrypts the underlying data. Thus symmetric key cryptography is everywhere, just not directly exposed.

Asymmetric key algos are kinda slow, so in practice when you're using asymmetric crypto on larger amounts of data (like with TLS or SSH), you're actually just using RSA to encrypt a quick handshake where both sides agree on symmetric keys (for ciphers like AES or Chacha) that they'll use to encrypt the actual payload

You can look out for the "AES" standard which is one of the symmetric algos that's used everywhere, often in TLS or on-disk encryption of ofcourse. ChaCha20 another often used one (by djb).

> It's kinda surprising that symmetric-key algorithms have been so resistant to attacks.

Not really. RSA was subject to attacks because it relies on arithmetic, which has a lot of structure and you can exploit that structure. Thus a number-theoretic advance in factoring becomes an attack on RSA. The reason why ECC is so much stronger on a per-bit basis is because there is a lot less (known) structure to group multiplication on a curve than to integer multiplication. But with the primitives used in symmetric cryptography -- basically substitution or in math terms a permutation -- there is almost no mathematical structure to exploit from general permutations of elements.

Incidentally, one of the concerns that many have about AES is the agressive design and reliance on inversion in the S-boxes which does have some arithmetic structure. E.g. in AES, the permutations are obtained from taking the reciprocal of field elements, so there is some math there. The Soviets famously went in a different direction, generating their s-boxes randomly so there would be no structure whatsoever. Those heavy-duty ciphers are built like tanks and stood up well over time, their only weakness was small block sizes for early versions of the cipher.

On the contrary, I find it amazing that bit shift and XOR can be combined in such a way to provide strong security at all. Maybe it just goes to show, if you have 128 or 256 bits of key, that’s actually a heck of a lot.

Strictly speaking, shift and xor alone will not net you a secure cipher, seeing that those are both linear GF(2) operations. You would need something else in the mix to generate some nonlinearity, like integer addition, multiplication, etc.

But on the subject of extremely simple constructions, you can compose many random permutations of a particular form---state[i] = state[i] xor F(state[j], state[k]), where state[i] is the ith bit of the state and F is a randomly chosen boolean function--- and obtain something asymptotically secure [1].

[1] https://arxiv.org/abs/math/0411098

> You would need something else in the mix to generate some nonlinearity, like integer addition, multiplication, etc.

Worth noting that bitwise AND also works (indeed, you can build addition from it), if you don't want to implement anything that operates on bits in non-parallel for some reason. (Which is to say/emphasize that it's linear GF(2) operations that are insufficient, not GF(2) operations in general.)

Yeah that is true. Also you could still stick with xor and shift/rotate, but make the shifts data-dependent. That would make it nonlinear (technically a multiplication), but analysis is generally more difficult.

> but make the shifts data-dependent

Ah, yes; cryptography usually assumes only fixed-amount shifts since some early shifters were non-contant-time, but variable-amount shifts are a thing you can do (and don't necessarily have side-channel problems on modern hardware).

I'm a little over my skis on this (I'm not a cryptographer) but for the underlying insight on why this works, I think you can start with the idea of iterated ciphers --- the design argument that a very simple function that is repeated (in "rounds") with an a key that has been expanded to fit those rounds (the "key schedule") "outperforms" --- gains adequate security with less computation --- a sufficiently complicated function. Once you have that idea, it seems natural to find the simplest and most efficient round function.

Past that: I spent a long time blowing off block cipher cryptanalysis (I'm pretty much exclusively interested in crypto bugs that will actually let me own up a system, and you are essentially never going to cryptanalyze a block cipher for a vulnerability research project) but a lot of block cipher design clicked for me once I took the time to work through linear and differential cryptanalysis.

So I guess my point here would be: if you're blown away that you can get such durably effective security --- so much so that 3DES is still in some ways resilient in 2021 --- well, (1) you should be and (2) Aleks' post here is I think probably a really good way to understand why this stuff works.

Also, 128 bit numbers are just very big. :)

Just xor-ing a secret key that's as long as a message is guaranteed to be 100% safe, given that you don't reuse the key.

You can take a look at Shannon's perfect secrecy theorem and one time pads. Symmetric encryption is easy, safe and fast, but requires a pre-shared secret between the participants. And that's not an easy requirement: it can't be done over a public channel (because it can be intercepted), the secret should be very random (so that it can't be guessed), and we want forward secrecy (if someone manages to intercept and understand one secret message, it we don't want them to know everything discussed before).

Asymmetric crypto and dozens of other schemes are used to fix those shared secret issues, so that symmetric crypto can be used, since it's so good and fast! In fact, symmetric crypto is so good that even the most dangerous futuristic quantum computer for cracking RSA codes wouldn't be able to do anything.

Much military crypto is still symmetric. Before the ship leaves port, or the plane takes off, or the unit leaves base, some officer goes to a key distribution point and picks up something containing the necessary keys.

Over the years, the "something" has been books, paper tapes, punched cards, and various devices. Today, for US/NATO military, it's usually something called the "Simple Key Loader"[1], a rather bulky hand-held device which runs, of all things, Windows CE 6.0.

Just xor-ing a secret key that's as long as a message is guaranteed to be 100% safe, given that you don't reuse the key.

A truly random secret key. Any non-randomness in a one time key system is exploitable. See Venona.

[1] https://www.sncorp.com/media/2400/eis_cns_an-pyq-10-skl-v31-...

> Asymmetric crypto and dozens of other schemes are used to fix those issues.

I wouldn't say "fix" them, they replace them with easier to follow protocols which have their own issues. But the security of all crypto boils down to following protocols. When someone introduces new cryptographic techniques, the advantages of those techniques (if any) are that easier protocols need to be followed. But there is always some protocol that must be followed, the crypto isn't going to let you levitate on air and secure a message just because of an algorithm. So deep scrutiny must be paid to the new protocols required to make the new tech secure. Just saying "problem X is solved" without figuring out what the replacement problem is will lead to disaster.

E.g. Suppose you have two people that need to exchange messages.

A) Without any crypto, they need to make sure no one intercepts any of the messages. So your protocol relies on "make sure no one intercepts a message and you authenticate the sender".

B) With a symmetric key, protocol A is replaced with "make sure no one intercepts the first message and you authenticate the sender".

C) With a public key, say using Diffie-Hellman, protocol B is replaced with "make sure no alters the first message and you authenticate the sender"

D) With a certificate authority, protocol C is replaced with "Make sure you authenticate that the right certificate authority signed every message and you check the hostname"

E) With a web of trust, protocol D is replaced by "make sure there is a quorum of signers you already trust that authenticate the identity of the sender".

So a new tech is adopted when the protocols it follows are easier to enforce than the protocols followed by the previous tech. But it never just "solves" the problem of needing to follow the protocols of the earlier tech.

It is lack of awareness of this that led to many people being surprised about what happens when a rogue CA signs a message. Because they were too busy popping champagne corks to celebrate retiring protocol C that they forgot to worry about correctly enforcing protocol D.

> It's kinda surprising that symmetric-key algorithms have been so resistant to attacks.

One reason is public-key cryptography is tremendously hard to get right. PGP as a case-in-point. Signal and Keybase tried to replace and extend PGP respectively with some success, so it could be done. Just that, it demands a lot of discipline.

> Where is symmetric key cryptography used for nowadays in normal applications programming?

As two examples: TLS, after the PKI-dependant key exchange, goes symmetrical (the same holds true for Signal and Keybase but not for PGP). Next, most server-side data encryption schemes are reliant on symmetric cryptography, as well.

The algorithms have been resistant, the implementations not so much. We discovered pretty quickly that side channel free implementations are very hard, and thus most attacks have been on implementations and not the algorithms themselves.

And for practical attacks on real world systems, key lifecycle is usually what you want to look at first.

I definitely think we use AES much more than we use RSA, etc.

Symmetric key is used all over the place (file/disk encryption is a big one that comes to mind). The main exception being when you're communicating with someone you never have before, in which case a combo is used.

Generally symmetric is a lot faster and simpler than public/private key, so people usually use it when pub/priv is not needed.

99.9% of communications done by browsers is symmetric. Asymmetric is used initially only to encrypt the symmetric key. After that exchange is done, everything else is followed by symmetric - with AES being overwhelmingly used.

> Asymmetric is used initially only to encrypt the symmetric key.

Not even that really. For a TLS 1.3 handshake (and the details for most TLS 1.2 sites visited in most browsers are morally equivalent but technically complicated) it goes like this (but the random values are much, much larger):

Client: I want to talk to www.example.com, I suggest we try x25519 key agreement and here are some symmetric algorithms I know, I have chosen a random value and as a result I now say 41389

Server: Hello, I can do x25519 key agreement and I've picked 128-bit AES GCM from your list. I also chose a random value, and as a result I say 15402.

After this point everything is encrypted with 128-bit AES GCM using a key that you'd only be able to work out if you know one of the two secret random values (because you're the Client or the Server) and the number the other party announced. Since the server spoke last, this next message might actually appear right after the last one in the same data packet...

Server: Cool, now that everything is encrypted, here is a copy of a certificate for www.example.com. That certificate has an RSA public key inside it, so now here is my RSA signature over the entire transcript of our conversation so far, proving that I, the owner of that key, was in fact the Server all along.

Client: Cool. I want to GET /index.html or whatever

Notice that the symmetric key wasn't technically "encrypted" by anybody, but instead the two parties somehow agreed on a random key despite it never being chosen by anybody or sent over the wire. This key agreement protocol is technically asymmetric cryptography, but it isn't really encryption per se. The signature used by the server to prove its identity also isn't encryption, although since an RSA key was used it is mechanically more similar to encryption than it would be in modern schemes.

That is what i meant by "in which case a combo is used"

The correct address of Khovratovich's “Methods of Symmetric Cryptanalysis” is https://www.microsoft.com/en-us/research/wp-content/uploads/...

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact