GCHQ Cracks Frank Sidebottom's Codes

dlgeek · on April 14, 2019

"I'm embarrassed to say, on the very next day Chris's very own code grid was found in the back of his address book. It was almost like Chris Sievey was going, 'There you go, now we've all had our fun, there's the explanation.'"

And they say the universe doesn't have a sense of irony...

paulgb · on April 14, 2019

If Frank Sidebottom is an unfamiliar name, I recommend the short audiobook Frank by Jon Ronson as an entertaining story. Jon played keyboards in his band and is an entertaining writer.

ozzmotik · on April 14, 2019

oh wow thanks for sharing that! I had never heard of frank but i definitely have heard of jon ronson from his research and look into psychopathy. oh and the men who stare at goats, of course. anyway i must now add yet another line to my ledger of things i really should be looking into

stevenwoo · on April 14, 2019

The movie Frank is a fictionalized version of the character (written by Ronson), and there was a documentary about the guy as well: http://www.beingfrankmovie.com/info.html I watched the movie and was shocked to learn the movie was based on a real-life phenomenon.

Joeboy · on April 14, 2019

I don't think the film has all that much to do with Frank Sidebottom, beyond the name and the head. It's more a take on more troubled outsider geniuses like Captain Beefheart and Daniel Johnston. As far as I can tell Chris Sievey was maybe quite eccentric and probably a bit frustrating, but Sidebottom was a piece of art he was doing and not an artifact of mental illness like the film's Frank.

mcguire · on April 14, 2019

I believe the documentary was the origin of this article.

gadders · on April 14, 2019

Lucky enough to see him at Reading Festival. Absolutely hilarious.

mellosouls · on April 14, 2019

An example of Frank's original worldview.

Anybody who's only seen the fictional film representation may find it surprising.

https://youtu.be/yrM6sLx_DXo

bcherny · on April 14, 2019

Any more details on how the encoding worked? Was this just a substitution cypher with some gibberish sprinkled in?

LeoPanthera · on April 14, 2019

Since the article mentions finding duplicate symbols which corresponded to duplicate letters in words, it's probably a simple substitution cipher. Such codes can usually be broken with a combination of frequency analysis and guesswork.

This one was apparently made more difficult by the fact that every other symbol was random. (And apparently using some symbols that did not otherwise appear in the code.)
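For anyone who hasn't seen frequency analysis in action, here is a minimal sketch. The key, the sample sentence, and the function names are all made up for illustration; nothing here reflects the actual symbols or grid Sievey used. It enciphers a sentence with a simple substitution and ranks the ciphertext symbols by count, which is the first step a solver would take before the guesswork begins.

```python
from collections import Counter
import string

def encipher(plaintext, key):
    """Simple substitution: map a..z onto the 26 letters of `key`."""
    table = str.maketrans(string.ascii_lowercase, key)
    return plaintext.lower().translate(table)

def symbol_frequencies(ciphertext):
    """Return (symbol, count) pairs, most frequent first."""
    counts = Counter(c for c in ciphertext if c.isalpha())
    return counts.most_common()

# Hypothetical key and sample text, chosen only for the demonstration.
KEY = "qwertyuiopasdfghjklzxcvbnm"
sample = "the secret message needs several repeated letters"

ranked = symbol_frequencies(encipher(sample, KEY))
top_symbol, top_count = ranked[0]
print(top_symbol, top_count)  # the top symbol is a good first guess for 'e'
```

With a longer ciphertext the ranking gets more reliable; with short messages the guesswork part does most of the heavy lifting.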

rocqua · on April 14, 2019

It feels like they could've figured that out by simply looking at the distribution of triangles on the inside as compared to the outside.

gmueckl · on April 14, 2019

For that you simply need to have the sudden inspiration of how the scheme works. Snark aside, I don't think that discarding half of the source material as random junk is such an obvious thing to do.

rocqua · on April 14, 2019

You could easily think, 'hey, perhaps the outside triangles are different from the inside ones, let's make a histogram of the symbols in both.'

From there, you'd certainly notice if one side had a lot more symbols than the other side. Trying to analyze both separately is a decent next step, and we are well along to solving this.
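A rough sketch of that comparison, assuming the inner and outer rows have already been transcribed into sequences of symbols. The two rows below are invented stand-ins, not real Sidebottom data. The index of coincidence is one common way to put a number on "how lumpy is this histogram": text enciphered with a simple substitution keeps the lumpy distribution of the language, while random filler tends towards flat.

```python
from collections import Counter

def index_of_coincidence(symbols):
    """Probability that two symbols drawn without replacement are equal."""
    counts = Counter(symbols)
    n = len(symbols)
    if n < 2:
        return 0.0
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

# Hypothetical transcriptions: one row carrying a message, one row of filler.
inner = "bobbinsbobbinsbobbins"
outer = "qzjxvkwpyfgmhcdultrae"

print(Counter(inner).most_common(3))
print(index_of_coincidence(inner), index_of_coincidence(outer))
```

A clear gap between the two values would be a hint that the rows were produced differently and deserve separate analysis.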

unparagoned · on April 14, 2019

That's not something a normal human would easily think of. I doubt an experienced code breaker would have easily thought of your suggestion. But don't worry, go on telling yourself it's something you would have easily thought of.

Zenst · on April 14, 2019

Certainly seems that way. That's how I read "Noticing some repeated pairs of symbols - which represented letters - the first word cracked by GCHQ boffins was Sidebottom's favourite word, "bobbins"."

Though two triangles are used per letter - you can check that with one of the examples which has a message "Why does my nose hurt after concerts?" - 37 characters in total (including spaces), then count the triangles - 74. Hence two inside triangles are used per character.

But it certainly highlights how adding noise to any encryption has its upsides.
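The character count in the comment above is easy to check. This only verifies the arithmetic on the message; the figure of 74 triangles is the commenter's own count and is taken as given.

```python
message = "Why does my nose hurt after concerts?"

characters = len(message)          # letters, spaces and the question mark
triangles_expected = 2 * characters  # two triangles per character

print(characters, triangles_expected)
```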

otwo3 · on April 14, 2019

> Adding noise to any encryption has it's upsides

That's assuming that the secret is the encryption algorithm itself rather than the key. Modern symmetric encryption does not work that way - the algorithm is public and well known while the key is the actual secret required for encryption/decryption.

I don't see how adding noise in modern encryption can help other than increase the size of the output.

Some modes of operation make use of random noise (IV in CBC, nonce in CTR, etc) because it's a convenient way to get a unique number but it's not for obscurity, it's because it's needed to prevent attacks on these modes.
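To illustrate why the nonce matters in counter-style modes, here is a toy sketch. It does not use AES: the keystream is built from SHA-256 over key, nonce and counter purely so the example runs with the standard library, and the function names are made up. It should not be used to protect anything real. The point it shows is that reusing a nonce under the same key makes the XOR of two ciphertexts equal the XOR of the two plaintexts, with no key needed.

```python
import hashlib
import secrets

def keystream(key, nonce, length):
    """Toy counter-mode keystream: SHA-256(key || nonce || counter) blocks."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(key, nonce, plaintext):
    """XOR the plaintext with the keystream; decryption is the same operation."""
    return xor(plaintext, keystream(key, nonce, len(plaintext)))

key = secrets.token_bytes(32)
p1 = b"attack at dawn"
p2 = b"retreat at six"

# Nonce reuse: the keystream cancels out and plaintext structure leaks.
reused = secrets.token_bytes(12)
c1 = encrypt(key, reused, p1)
c2 = encrypt(key, reused, p2)
leak = xor(c1, c2)

# Fresh nonces: the same plaintext encrypts differently each time.
c3 = encrypt(key, secrets.token_bytes(12), p1)
c4 = encrypt(key, secrets.token_bytes(12), p1)
```

So the randomness here is not obscurity layered on top; it is what keeps the keystream from ever being used twice.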

moomin · on April 14, 2019

Look up “confounders”: random noise can be extremely useful if you encrypt it as well. This significantly increases the work required to decrypt (since you’ve got to decrypt random noise as well as signal), makes it much harder to tell if you’ve actually decrypted something successfully (depending on how well you can test the plaintext, obviously) and frustrates correlation attacks because every message has a different payload even if the logical payload is the same.

brad0 · on April 14, 2019

Is gibberish called nulls or nonces? I forget.

lainga · on April 14, 2019

Those are nulls. Nonces are random numbers for one-time-only use

kosievdmerwe · on April 14, 2019

Nonce contains "once", which makes it easier to remember.

cortesoft · on April 14, 2019

Nonce is short for "number used only once"

dtf · on April 14, 2019

Once is from the Middle English "anes", meaning "one".

Nonce is from "then anes", meaning "the one".

The "-n" got smooshed into the latter word, to become "nonce", rather like "an ewt" became "newt".

willyt · on April 14, 2019

It’s also UK slang for paedophile, so be careful that everyone understands you are talking about cryptography, not sex offenders, if you use the word in public, as there could be a nasty misunderstanding!

vidarh · on April 14, 2019

There's a lot of vocabulary like that. I've had conversations about ensuring a daemon reaped zombie children in public before we realized what it sounded like.

Then there's a lot of master/slave terminology.

rocqua · on April 14, 2019

Is the term that old? I was told it literally meant N(umber)once.

dtf · on April 14, 2019

Yes, the construction "for the nonce" goes back to the Middle Ages. The concept of a "nonce word" goes back to 1884, according to this article.

https://www.dailywritingtips.com/nonce-words-for-the-nonce-a...

OJFord · on April 14, 2019

I'm not necessarily arguing there's shared etymology, but consider 'for the nonce' - it's not merely a made up word used exclusively in cryptography.

mattlondon · on April 14, 2019

It is also the word for a paedophile which is pretty unfortunate.

https://en.oxforddictionaries.com/definition/nonce

buu700 · on April 14, 2019

That was confusing for me at first when David Cross was sent to the "nonce wing" in The Increasingly Poor Decisions of Todd Margaret. It seemed like an odd situation to start chatting about crypto.

lainga · on April 14, 2019

Cue the old protest that "root" means something vulgar in Australia, and so the Unix superuser should be named something else.

https://en.wiktionary.org/wiki/root#Etymology_2

herodotus · on April 14, 2019

A "nonce word" is a word that is created to be used once. For example, if we were surfing off the coast near the Kruger National Park, we might send this tweet: "We are on Surfari!" The Wikipedia entry for Cryptographic nonce says: "It is similar in spirit to a nonce word, hence the name."

cortesoft · on April 14, 2019

Yeah, I should have said "this is the mnemonic I use to remember what it is"

lisper · on April 14, 2019

Also known as a hapax legomenon.

TeMPOraL · on April 14, 2019

I... don't understand how it works. Doesn't the Wikipedia page listing these words essentially invalidate their "hapax legomenon" status?

Also, GP's "Surfari" doesn't sound like a word meant to be used once, but like a word meant to be funny and with a high probability of becoming a piece of jargon between a band of friends. My wife & I invent words like these all the time (half of them being born from misspellings or moments of confusion). Are they "nonces" too, even though we keep using them?

jholman · on April 14, 2019

A hapax legomenon is a hapax legomenon with reference to a particular corpus. In this context, a "corpus" is a set of words, or more generally a set of works under consideration.

Any corpus of one word is, by construction, composed entirely of hapax legomena. I think the Wikipedia page is fairly clear on the subject, honestly. In general, they're a phenomenon which is fairly obvious and uninteresting.

Where it becomes slightly more interesting is when, in some long text, an author uses a word that no one knows, doesn't bother to explain it, and never uses it again. It becomes particularly interesting when trying to translate important ancient texts... what the devil did this word really mean?
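The "relative to a corpus" point can be shown in a few lines. This is a minimal sketch with a deliberately crude tokenizer (lowercase runs of letters and apostrophes), which is an assumption of the example rather than how corpus linguists would do it.

```python
from collections import Counter
import re

def hapax_legomena(corpus):
    """Words that occur exactly once in the given corpus."""
    words = re.findall(r"[a-z']+", corpus.lower())
    counts = Counter(words)
    return sorted(w for w, n in counts.items() if n == 1)

print(hapax_legomena("the cat saw the dog"))
print(hapax_legomena("We are on Surfari!"))
print(hapax_legomena("Surfari today, surfari tomorrow"))
```

The same word can be a hapax in one corpus and not in another, which is why a list of them on a web page doesn't change their status within the original text.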

mcguire · on April 14, 2019

Which is also a fish creature from the web comic Narbonic.

thaumasiotes · on April 14, 2019

Nonce isn't short for anything. It's a word, like "number" or "short".

Hendrikto · on April 14, 2019

> I spent a while just looking at them going, 'What could he be saying, what could this mean?'

> But it was impossible to crack them […]

Confirmed uncrackable. He looked at it Jim, what else was he supposed to do??

ape4 · on April 14, 2019

I didn't know til now that the GCHQ headquarters is a flying saucer like Apple's

keithpeter · on April 14, 2019

A couple of quotes from OA with a personal 'translation'

"GCHQ told Sullivan that Sidebottom "had a small but dedicated following" among its staff."

Couple of people do Sidebottom dialogues as an in-joke to the extent that it begins to annoy co-workers.

"[After random outer triangles explained] 'Right, we've cracked it during a light-hearted training exercise.'"

Took a couple of minutes as a starter in a session.

PS: I use a Playfair style grid to jumble up my pass phrases to try to make them less susceptible to rainbow table attack. Am I wasting my time?

sjclemmy · on April 14, 2019

Forgive me but;

You know you are, you really are.

laurencei · on April 14, 2019

"The country's top codebreakers too seemed flummoxed until Sievey's son Stirling recalled how his dad would get the children to fill an outer row with random symbols, while Sievey would insert real code into the inner row."

Is it really that hard to fool some of the world's top code breakers, simply by including some random digits?

So a code where every {x} symbol is random, and suddenly you've got an uncrackable code? Surely it can't be that simple?

jjeaff · on April 14, 2019

I doubt (and hope) that they didn't dedicate a ton of time to the problem. The hint probably just got them there faster.

rjf72 · on April 14, 2019

Every {x} could be cracked by normal methods if it was simply added as a rule. So let's assume something just barely more sophisticated and we use a nonrepeating function instead of a fixed {x}. For instance, maybe each successive digit d of pi means the next d mod 3 characters are random. How would you even begin to crack this? So imagine "thecodeisx". The first digits of pi are 141592653. So we get a noise pattern of: 1, 1, 1, 2, 0, 2, 0, 2, 0. So the code in deciphered format, but with our noise added, could look something like:

tlhpebcokqodengisxf

And keep in mind that is the code before it's enciphered. Even better, the random characters could be not entirely random but rather weighted to try to bring most characters in the message to a roughly similar frequency. So far as I know the primary tool of code cracking is just plain old frequency analysis. Curious if anybody has any proposals on how this would even be possible to crack.
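A sketch of the pi-based noise scheme described above, under some assumptions the comment leaves open: noise is inserted after each real letter, the digits used are those after the decimal point, and the digit string wraps around for long messages. The function names are invented. Because the filler is random, the output will not match the sample string in the comment.

```python
import random
import string

# Digits of pi after the decimal point (3.14159265358979323846...).
PI_DIGITS = "14159265358979323846"

def noise_counts(n):
    """Noise letters to insert after each of the first n real letters."""
    return [int(PI_DIGITS[i % len(PI_DIGITS)]) % 3 for i in range(n)]

def add_noise(plaintext, rng):
    """Insert random lowercase letters after each real letter."""
    out = []
    for ch, k in zip(plaintext, noise_counts(len(plaintext))):
        out.append(ch)
        out.extend(rng.choice(string.ascii_lowercase) for _ in range(k))
    return "".join(out)

def strip_noise(noisy):
    """Recover the real letters by skipping the same noise pattern."""
    out, pos, i = [], 0, 0
    while pos < len(noisy):
        out.append(noisy[pos])
        pos += 1 + int(PI_DIGITS[i % len(PI_DIGITS)]) % 3
        i += 1
    return "".join(out)

rng = random.Random(2019)  # seeded only so the demo is repeatable
noisy = add_noise("thecodeisx", rng)
print(noisy, strip_noise(noisy))
```

Anyone who knows the rule strips the noise trivially, so the rule itself is acting as the key here.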

krackers · on April 14, 2019

Does code cracking still work in the era of modern cryptography? I thought that cryptosystems like AES and others were essentially impossible to crack if implemented right. What role do codebreakers play these days?

jdietrich · on April 14, 2019

The very complex modern approaches to cryptanalysis still borrow from the oldest attacks.

A simple substitution cipher is easily broken by frequency analysis - find the most common letter in the ciphertext and it'll probably be E in the plaintext. Nothing so simple would work today, but we often see vulnerabilities in cryptosystems due to pseudorandom number generators with inadequate entropy. It's the same basic principle (exploiting a lack of randomness to identify patterns in the ciphertext), albeit with vastly more mathematical sophistication. The NSA allegedly took advantage of this principle to deliberately weaken cryptosystems by promoting an intentionally weak PRNG.

https://en.wikipedia.org/wiki/Dual_EC_DRBG

manquer · on April 14, 2019

Frequency analysis works easily if you know the source language, i.e. English in this case. While the entropy of the message is critical to cracking, complex approaches are not the only ones which are immune to attacks: for example, a simple cipher like a one-time pad is mathematically impossible to crack.
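A minimal one-time pad sketch, assuming a truly random pad as long as the message that is never reused. The helper names are made up. The last few lines show why a ciphertext alone tells an attacker nothing: for any candidate plaintext of the same length there is a pad that "decrypts" to it.

```python
import secrets

def otp_encrypt(plaintext):
    """Return (pad, ciphertext); the pad is random and message-length."""
    pad = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, pad))
    return pad, ciphertext

def otp_decrypt(ciphertext, pad):
    return bytes(c ^ k for c, k in zip(ciphertext, pad))

pad, ciphertext = otp_encrypt(b"bobbins")

# Any same-length guess is consistent with the ciphertext under some pad.
guess = b"nothing"
pad_for_guess = bytes(c ^ g for c, g in zip(ciphertext, guess))
print(otp_decrypt(ciphertext, pad), otp_decrypt(ciphertext, pad_for_guess))
```

The guarantee holds only while the pad is unpredictable and used once, which is where the practical difficulty lives.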

rocqua · on April 14, 2019

The power of a one time pad relies on the inherent entropy of the pad though. So attacks against a real one time pad still need to deal with entropy.

pcnix · on April 14, 2019

Things like the decipherment of Linear B are probably what come close to classic code breaking.

With AES etc, though, building a system that uses them effectively is the core principle of modern security and crypto.

DanBC · on April 14, 2019

> if implemented right

This step is harder than people think.

benj111 · on April 14, 2019

There's still room for users to 'mess up' though.

I believe allied code cracking in WW2 was helped by one wireless operator habitually ending their transmission "Heil Hitler" or something.

ColinWright · on April 14, 2019

Also ...

German Enigma operators in WW2 were told always to send a certain number of messages per day to make it harder to perform traffic analysis. One bored operator sent a message composed entirely of "W" repeated 4000 times (or so).

One on-the-ball analyst noticed a message that had no "W"s in it, and deduced what had been sent[0]. That allowed the daily settings to be cracked, and thus all messages for that day.

[0] Enigma has a weakness in that no letter can be encrypted as itself[1].

[1] Enigma is effectively a "one-time-pad" where the pad is a pseudo-random sequence determined by the daily settings.
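The "no letter encrypts as itself" weakness in footnote [0] lends itself to a small sketch. This is not an Enigma simulator; it only shows the exclusion test an analyst could apply to a ciphertext, and the function names and sample strings are invented for the example.

```python
import string

def absent_letters(ciphertext):
    """Letters of A-Z that never occur in the ciphertext."""
    return sorted(set(string.ascii_uppercase) - set(ciphertext.upper()))

def possible_crib_offsets(ciphertext, crib):
    """Offsets where the guessed plaintext could sit.

    An offset is ruled out if any ciphertext letter equals the crib
    letter at the same position, since no letter maps to itself.
    """
    offsets = []
    for start in range(len(ciphertext) - len(crib) + 1):
        window = ciphertext[start:start + len(crib)]
        if all(c != p for c, p in zip(window, crib)):
            offsets.append(start)
    return offsets

# A long message with one letter missing entirely is suspicious.
sample = string.ascii_uppercase.replace("W", "") * 3
print(absent_letters(sample))
print(possible_crib_offsets("ABCDE", "AB"))
```

A plaintext made of one repeated letter can never produce that letter in the ciphertext, which is the giveaway in the story above.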

slededit · on April 14, 2019

I imagine they weren't trying all that hard.

mmjaa · on April 14, 2019

[flagged]

F_r_k · on April 14, 2019

What's pharma's implication in gchq?

shazeubaa · on April 14, 2019

It's a shame F_r_k's reply here got flagged. It's actually quite insightful.

mmjaa · on April 14, 2019

[flagged]

sctb · on April 14, 2019

We've asked you several times to please not post political or ideological rants here. Could you please review the guidelines and stop? We'd rather not ban the account.

> Please don't use Hacker News primarily for political or ideological battle. This destroys intellectual curiosity, and we ban accounts that do it.

> Eschew flamebait. Don't introduce flamewar topics unless you have something genuinely new to say. Avoid unrelated controversies and generic tangents.

https://news.ycombinator.com/newsguidelines.html