Fake Linus Torvalds' Key Found in the Wild, No More Short-IDs (lkml.org)
701 points by demiol on Aug 16, 2016 | 127 comments

Hi, Eric here, co-creator of evil32. I posted a brief note on our site about this, but here's a little more detail.

I found an old (local) backup of the private keys and used it to generate revocation certificates for each key. Fortunately, there is no way for anyone else to access or regenerate the private keys for this particular clone of the strong set, and I have been very careful with my copy - it is only available on my personal machine, and I have only used it to generate the revocation certificates. I will not use these keys to generate any fake signatures nor to decrypt any messages intended for the original recipients.

We wanted to bring awareness to the dangers of using short key IDs in the 21st century, since that ID is very easy to fake, and most of the contents of the key body are not covered by the signature, so they can be changed at will. However, we feel that the keys uploaded to the public keyserver are, on balance, more harmful to the usability of the GPG ecosystem than they are helpful in highlighting security flaws.

It's important to realize that anyone could repeat our work pretty easily. While we did not release the scripts that automated cloning the web of trust, the whole process took me less than a week. Cloning a single key is even easier - it could be done with only a few minutes of effort by someone familiar with GPG. The GPG ecosystem needs to develop better defenses to this attack.

Our original talk (and previous work) seems to have convinced people to stop using 32-bit IDs in documentation or on their business cards. However, there is another common and harmful pattern: users who want to email someone discover their key by searching the keyserver for that email, then taking the newest key. This is akin to trust-on-first-use, and opts out completely from the web of trust or any kind of external verification.

Proof of identity: https://keybase.io/aftbit

> users who want to email someone discover their key by searching the keyserver for that email, then taking the newest key. This is akin to trust-on-first-use, and opts out completely from the web of trust or any kind of external verification

Well, yes? What is the alternative, if I want to email someone who exists only in the form of a pseudonymous online identity?

Most of the time there's some at least semi-trusted communication channel. If they have a website, ask them to publish the key or the full fingerprint on their website. If they frequent some IRC channel, ask them on IRC for their key's fingerprint. If they regularly sign their emails you can check mailing lists they participate on to confirm they use the same key there.

If the key is just for their pseudonym, I usually offer to sign the key if they can send me the key through one service of my choice (where their username is public knowledge) and the fingerprint through another (meaning an attacker would have to compromise both accounts I chose). The offer to sign their key often makes people much more willing to jump through hoops, and I get to improve the web of trust.

But for some people I just don't care enough and just add the first best key.

I guess you're talking, here, about identities that are at least in some way connected to the "public" social network. Identities that publish things on public websites, etc.

But if this isn't true—if, for example, you are someone who wants to get in contact with a terrorist group (maybe for an interview, maybe because you want to join them, etc.) then there's not much to do but to trust-on-first-use some channel that seems to be them, no? No public channel can possibly be vouched for as being "the real them", or that channel would have been chased up by the CIA. Which means that any/every channel might just be a honeypot from the CIA or whoever else, trying to either frustrate your efforts, or convert you into a double-agent.

The bigger terrorist groups all have websites and/or a social media presence.

As you say, any one of those channels could be a CIA operation; that's why asking for verification from two independent channels (i.e. asking for the keyfile on one channel, for the fingerprint on another) is preferable. A terrorist group that actually uses pgp might even entertain you if you ask on more than two channels for the fingerprint. The more channels you choose, the less likely it is that a single attacker controls all of them.

Another factor is that any public channel that is a front is likely to be called out sooner or later as a non-official channel. Most people and organizations are wary of the dangers of impersonation.

Of course there will always be situations where it's impossible to establish trust, like a leak by a group who tries to stay anonymous to the point of not associating with any previously used pseudonyms. Here you can't do anything but trust the first communication. But I think those cases are extremely infrequent: most groups and individuals try to establish a reputation, which nearly always gives you more points to anchor trust.

Find the fingerprint on their website / business card / Twitter / Keybase, download the full key from the keyserver.

Anyone can upload any key to a keyserver with any name.

Deprecate searching keyservers by name or email address, and only allow searching by fingerprint. Still not a complete fix (the source of the fingerprint may have been compromised) but better than before.

> Well, yes? What is the alternative, if I want to email someone who exists only in the form of a pseudonymous online identity?

If you want to communicate with someone specific then presumably there's something that distinguishes that person from other people. Find a way to connect that something with a key fingerprint. E.g. if the point is that that someone is a journalist for the New York Times (as when Snowden was first looking to leak), they should publish their fingerprint in the NYT, or at least something NYT-official (like their website).

There are use cases where trust-on-first-use is adequate, sure. But there are use cases where it isn't.

I think Keybase.io is a pretty good solution to the problem of key ownership. You can confirm the identity of anyone's Keybase key by comparing the fingerprint to one listed in any one of several "public" sources: Twitter, Github, Reddit, and even Hacker News.

Also, I have several invites for Keybase if anyone wants one.

> I think Keybase.io is a pretty good solution to the problem of key ownership. You can confirm the identity of anyone's Keybase key by comparing the fingerprint to one listed in any one of several "public" sources: Twitter, Github, Reddit, and even Hacker News.

Doesn't that undermine the whole decentralized web of trust concept? All those services are operated by US companies - or what if someone simply compromised Keybase itself?

An important part of keybase is that all proofs are publicly verifiable. When I prove I own a github account, I have to post a public gist. When you get my key from keybase, your client automatically looks at that link and verifies that the gist and the text within (which is signed by my key) are valid.

Keybase is just the place that connects all the proofs. The actual client verifies that they are correct. As such, if keybase was ever compromised they would only be able to change the link to the gist, which wouldn't do them much good without access to my github account.

> Doesn't that undermine the whole decentralized web of trust concept? All those services are operated by US companies - or what if someone simply compromised Keybase itself?

Ideally: Nothing. Keybase refers to other sources, i.e. a page on GitHub with username and key fingerprint. So if Keybase is compromised, those links are missing and there is no proof.

In reality some people store their private keys there ('encrypted' according to them), so they can be stolen if Keybase is compromised.

You can always use Namecoin.

Sorry, whose private keys did you find a backup of?

It sounds like they have issued revocation certs for the keys associated with the fake accounts to stop people from sending any more messages to the fake accounts. They are also promising not to decrypt any of the messages that were meant for the attack victims, but encrypted for the fake accounts' public keys and sent to the attacker during the attack.

The ones in the Evil 32 tarball (https://evil32.com/).

All the fake keys that I've seen mentioned are from the data set at https://evil32.com.

It appears a couple of researchers decided, back in 2014, to demonstrate this issue by cloning the entire strong set of the PGP web of trust (not just Linus' key, but basically everyone who uses PGP/GPG for Free Software development - myself included).

It would appear that sometime quite recently, someone decided it would be fun to upload all of those keys (there's ~24,000 in their tarball) to the keyservers...

One would hope that the researchers behind evil32.com are ethical enough and sensible enough to have permanently destroyed the secret keys - but obviously, anyone could mount this attack quite trivially with modern hardware.

So, check your fingerprints!

If you haven't seen it yet, a co-creator of evil32 says (https://news.ycombinator.com/item?id=12298230) that they found an old local backup of the secret keys and generated revocation certificates for all of the keys, and further promises that the secret keys never left that machine and will never be used for anything beyond generating the revocation certificates.

At this point this is INSANE that GnuPG still defaults to short IDs...

It doesn't anymore.

With GnuPG 2.1 listing of keys shows the fingerprint.

But for server installations (auto-signing, checking, etc.) you are often directed to GnuPG 1 because "less dependencies". Also "apt install gnupg" / "dnf install gnupg" both give you version 1 on the most recent Ubuntu/Fedora.

For desktop usage many prefer GnuPG 2.0, because they fear compatibility issues that the new 2.1 key storage format could have with 3rd party software, and you can't go back (at least this is the reason why the Homebrew maintainers still default to 2.0, but at least not to version 1 anymore since a few days).

So you have a mess of 3 stable versions, all used by many at the same time.

I don't know where people would run three versions at the same time. The information from the GnuPG project is clear: You can run GnuPG 1.x and GnuPG 2.x at the same time, but you should not have GnuPG 2.0 and GnuPG 2.1 installed at the same time, as bad things might happen.

What 3rd party software is using the keystorage mechanisms directly? Do you mean how information is output from GnuPG?

It sounds like the situation you are describing is the keystore, which has changed formats. GnuPG 2.1, as far as I can remember, will use the older versions' keystore, but you are correct: once you have a 2.1 keystore, it can't be used by GnuPG 2.0 or 1.x.

It's a tough call for the GnuPG developers and something distributions should help with. On one hand there is immense pressure to improve GnuPG, on the other hand, you have many actors who kick GnuPG around when it makes any deviation.

I would say defaulting to GnuPG 1.x is a bug and new releases of Linux, Homebrew, etc., should use GnuPG 2.0 at the very least, but better yet, use GnuPG 2.1 which has many of the things that people complain about fixed or in process of being fixed.

> I don't know where people would run three versions at the same time.

Not on the same machine, but server/automatic -> GnuPG 1, desktop 2.0 or 2.1. Also different people may run different versions, even GnuPG 1 on desktop because they are used to it. This compatibility mess, which seems likely to persist for a while, is what I meant: it is why people prefer to use the lowest common denominator in their sigs/cards/slides -> evil32.

> What 3rd party software is using the keystorage mechanisms directly?

Aren't there any? Good, then I misunderstood that argument in the Homebrew debates I read, sorry. That leaves only the fear of automatic upgrading and the inability to downgrade again as a blocker.

In practice you can use all three versions of GnuPG on three different devices without a particular difference. One problem you might see is if you are using the newer experimental curve-based algorithms on a computer running GnuPG 1.4 and you get blocked, but you really ought not to do that anyway.

As for the downgrading issue:

It used to be you could just copy your .gnupg directory from computer to computer to computer and that's what constituted migrating your PGP keys.

This was also true for moving from GnuPG 1.4 to GnuPG 2.x. If you are starting with a new GnuPG keystore from 2.1, you can't just copy .gnupg and use it in a GnuPG 1.4 system; you have to export your public keys, your private keys, and your trustdb (although I am iffy on what this does) and import them on the systems where you are running GnuPG 1.4 or 2.0.

I am unaware of any 3rd party software directly accessing the GnuPG keystore, but that doesn't say much.

Debian is currently switched to using gpg2 by default.

Debian is in the process of switching to using gpg2 by default.

Here's this week's LWN article about it:


Note that it's a subscriber-generated link (articles <1wk old are paywalled). Please consider funding LWN's excellent work by becoming a subscriber: https://lwn.net/subscribe/

> For desktop usage many prefer GnuPG 2.0, because they fear compatibility issues that the new 2.1 key storage format could have with 3rd party software

Any idea what this will mean for Yubikey users? I've been wanting to set up my Yubikey 4 for GPG for a while, and have been... daunted.

It's insane that it ever defaulted to short IDs, the insecurity of which was not just known, but proven by demonstration, years before GnuPG was released.

If you use gnupg 2.0.x, you can set "keyid-format long" in your gpg.conf to get 64-bit key IDs (instead of the classic 32-bit ones). Not really good enough, but a reasonably effective short-term stop-gap measure. IMHO.
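To make the 32-bit versus 64-bit distinction concrete, here is a minimal Python sketch. It assumes a v4 OpenPGP key, where the long key ID is the last 64 bits of the 160-bit fingerprint and the short ID is the last 32 bits. The helper name `key_ids` is made up for illustration, and the two fingerprints are the hex forms of the Base64 strings quoted elsewhere in this thread for Linus's real and fake keys.

```python
HEX_UPPER = set("0123456789ABCDEF")

def key_ids(fingerprint_hex):
    """Return (short_id, long_id) for a 40-hex-digit v4 fingerprint."""
    fp = "".join(fingerprint_hex.split()).upper()
    if len(fp) != 40 or any(ch not in HEX_UPPER for ch in fp):
        raise ValueError("expected a 160-bit (40 hex digit) fingerprint")
    # Short ID: low 32 bits. Long ID: low 64 bits.
    return fp[-8:], fp[-16:]

REAL = "ABAF 11C6 5A29 70B1 30AB E3C4 79BE 3E43 0041 1886"
FAKE = "0F6A 1465 32D8 69AE E438 F74B 6211 AA3B 0041 1886"

for label, fp in (("real", REAL), ("fake", FAKE)):
    short_id, long_id = key_ids(fp)
    print(label, "short:", short_id, "long:", long_id)
```

The two keys share a short ID while their long IDs differ, which is why the long format helps as a stop-gap; the full fingerprint is still the safer thing to compare.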

Worth noting for any Mac users, the version of gpg2 available by default in Homebrew is 2.0.30. If you want 2.1 you can install it from Homebrew/homebrew-versions/gpg21 at the risk of possibly breaking other formulae that expect 2.0.

If you think that's crazy, don't look at the KDF it uses to generate a symmetric key from your passphrase. :D

I remember reading that they updated it to be full key a few versions ago. I can't seem to find the actual link for that..

This bug report makes GPG default to the full fingerprint


I've attended security conferences in the past two years wherein representatives of companies that claim to see 80+ percent of all Internet traffic in their threat intel offerings who were presenting about Android malware reverse engineering used short IDs in their slide deck and business cards.

When I mentioned Evil32 to them, they looked at a loss.


If this is news to you, don't feel bad. Many infosec conference speakers don't do this right either, and those are the sorts of people you'd expect to get this right.

I'm just some guy, but I've got a stack of cards with my short ID on them - what do I do now? Toss them?

Destroy them carefully and use the opportunity to create new ones with QR codes on them with your full public key (or just the full fingerprint).

I've written software to help me create QR codes for my business card, including PGP, address, web site etc. There are many gotchas, some apps ignore some VCard fields, hence it puts copies into generic comment fields, too. Seems to work OK now. Maybe I should turn this into a web service.


This has also been discovered with the Debian project, as I submitted a while ago [1].

The really scary part is the follow-up [2]:

  > € gpg --search-key samuel.thibault@gnu.org
  > ...
  > (1) Samuel Thibault <samuel.thibault@gnu.org>
  > 4096 bit RSA key 7D069EE6, created: 2014-06-16

  And it has 55 signatures from 55 colliding keys...

Edit: even the 64-bit key ID is probably insufficient, see [3].

[1] https://lists.debian.org/debian-devel/2016/08/msg00143.html

[2] https://lists.debian.org/debian-devel/2016/08/msg00144.html

[3] https://lists.debian.org/debian-devel/2016/08/msg00215.html

Why is this scary?

Because it removes many obvious tells of a deliberate key collision targeting a specific key, and thus is harder to detect.

For example, pgp.mit.edu and Enigmail would currently output information for both keys that would be almost identical per 2014-08-05, the day evil32 apparently generated the keys. I say "almost" only because they didn't set the correct timestamps, and apparently did not duplicate all UIDs -- but they easily could have.

The diligent PGP user will of course not fall into such a trap, but an inexperienced user easily might, and there are many of them.

The whole point of this research was to underscore that PGP key acquisition is commonly broken. You could choose to blame PGP software, users, documentation, or the web-of-trust model itself, but in any case what a significant number of people commonly do is unsafe.

Seems some people are playing havoc with key ids.

I got a mail earlier today I couldn't decrypt for unclear reasons. Now I understand why: It seems it was encrypted with a copy of my public key that is on the keyserver colliding with the keyid of my real key.

Right now there is a revoked copy of my key there: https://pgp.mit.edu/pks/lookup?search=hanno%40hboeck&op=inde...

What exactly is going on here? Other commenters indicate that someone uploaded keys from the evil32 page to the keyservers. Have the authors of evil32 now used their private keys to revoke them?

Anyway, the conclusion seems obvious: Keyids are dead, use full fingerprints. Latest gpg 2.1 versions already show full fingerprints by default.

I still had a short keyid on my webpage, will change that now.

From evil32.com:

> I saw that your clone of the strong set is revoked?

> Someone downloaded our copy of the strong set and uploaded all of the keys to the SKS keyserver network. :( While we took on this project to help prompt GPG to build a more secure ecosystem, this mass clone made the keyservers harder for everyone to use. Of course anyone could use our tools to regenerate their own strong set clone and do this again, but we'd rather our keys not be used that way.

I take that to mean that yes, they continued to be in possession of the private keys.

Seems it is not merely the strong set.

Mine is not in the strong set and it had a collision uploaded. It has the same upload (creation?) date -- 2014-06-16 as many others.

Yes, they've been revoked. See this comment https://news.ycombinator.com/item?id=12298230

I don't know why more folks don't display keys and fingerprints as Base64; it seems to me that "q68RxlopcLEwq+PEeb4+QwBBGIY=" (Linus's real key) and "D2oUZTLYaa7kOPdLYhGqOwBBGIY=" (Linus's fake key) are pretty easily distinguishable, and not terribly verbose.
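For anyone who wants to reproduce those strings, here is a small Python sketch using only the standard `base64` module. The function names are made up for illustration; the two Base64 strings are the ones quoted above.

```python
import base64

def fingerprint_to_base64(fingerprint_hex):
    """Encode a hex fingerprint (spaces allowed) as standard Base64."""
    raw = bytes.fromhex("".join(fingerprint_hex.split()))
    return base64.b64encode(raw).decode("ascii")

def base64_to_fingerprint(text):
    """Decode a Base64 fingerprint back to upper-case hex."""
    return base64.b64decode(text, validate=True).hex().upper()

REAL_B64 = "q68RxlopcLEwq+PEeb4+QwBBGIY="
FAKE_B64 = "D2oUZTLYaa7kOPdLYhGqOwBBGIY="

print(base64_to_fingerprint(REAL_B64))
print(base64_to_fingerprint(FAKE_B64))
```

Note that both strings end in the same characters ("wBBGIY="), because the last 32 bits of the two fingerprints collide; the difference is all in the front of the string.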

You can do what PGPfone did and encode the fingerprint as a series of dictionary words:

Real gregkh: 647F28654894E3BD457199BE38DBBDC8 = style tactful newcomers file gallows adored insist flags athletics

Fake gregkh: 497C48CE16B926E93F49630127365DEA = jukebox governor fashionable mahogany prepares gobble surprised martha apostles

There's even an Internet standard for this: http://tools.ietf.org/html/rfc1751 , although its dictionary isn't very large or interesting. Here's another implementation intended for BitTorrent magnet hashes: http://pythonsweetness.tumblr.com/post/56715292510/cheatcode...

SSH keygen has a mode where the digest is printed as ASCII art, I imagine by using the key as input into something like a fractal function. I can't find the option for it, but I'm sure you've seen it on the console at some point

Regarding the SSH randomart, it's enabled by adding `VisualHostKey=yes` to your ssh config or adding the flag `-o VisualHostKey=yes` on the command line. It was announced with OpenSSH 5.1 (http://lists.mindrot.org/pipermail/openssh-unix-dev/2008-Jul...) and there's also a paper on it (http://www.dirk-loss.de/sshvis/drunken_bishop.pdf). You can find the code/comments in `key.c` under `key_fingerprint_randomart()`. (http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.bin/ssh/key...)
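It is not a fractal: as described in the Drunken Bishop paper linked above, the picture is the trail of a walk over a 17x9 grid, steered two bits at a time by the fingerprint bytes. Below is a rough Python sketch of that walk. It omits the frame and header that OpenSSH prints and is written from the paper's description rather than copied from `key.c`, so treat it as an approximation and do not expect byte-for-byte identical output.

```python
import hashlib

FIELD_W, FIELD_H = 17, 9
SYMBOLS = " .o+=*BOX@%&#/^SE"   # visit counts 0..14, then start 'S' and end 'E'

def randomart(digest):
    """Return the 9 rows of the walk for the given digest bytes."""
    field = [[0] * FIELD_W for _ in range(FIELD_H)]
    x, y = FIELD_W // 2, FIELD_H // 2
    cap = len(SYMBOLS) - 3          # highest ordinary visit symbol ('^')
    for byte in digest:
        for _ in range(4):          # four 2-bit moves per byte, low bits first
            x += 1 if byte & 0x1 else -1
            y += 1 if byte & 0x2 else -1
            x = min(max(x, 0), FIELD_W - 1)   # the bishop cannot leave the board
            y = min(max(y, 0), FIELD_H - 1)
            if field[y][x] < cap:
                field[y][x] += 1
            byte >>= 2
    field[FIELD_H // 2][FIELD_W // 2] = len(SYMBOLS) - 2   # start marker 'S'
    field[y][x] = len(SYMBOLS) - 1                         # end marker 'E'
    return ["".join(SYMBOLS[v] for v in row) for row in field]

# Placeholder input: any digest bytes will do for a demonstration.
for row in randomart(hashlib.sha256(b"hypothetical host key").digest()):
    print("|" + row + "|")
```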

Sure, but what is the point of that? It's not as if I would remember all the different images. And it checks with the previous one, so what's the purpose (genuinely asking, as it looks pretty cool).

It's for people to recognise differences in art, between runs.

The idea is that you may not "remember" it but you'd subconsciously notice if the image was different. I'm not sure it's any better than a hex key in that regard - either way people usually just go "yes" the first time they connect - but I can imagine it might be easier for some people to notice.

This is just turning one hash into another, so would be just as easy to brute force.

The point is to turn one fairly big hash into a representation where humans can easily spot the differences.

Pretty sure most of the time people only read the first few and last words and e.g. don't check whether the words in the middle are in the correct order. Not sure about the size of that dictionary, but it would seem to greatly diminish the entropy.

It doesn't. The dictionary is exactly the same size as the input data. It's lossless.

It's asking for a solution with more gestalt impact. Turn the key into a painting or some sort of visualization.

Whether it is vision or words, the point is to enlist some of our primal, automatic brain machinery. Both the random word lists and randomart are a good start, but far from perfect.

The word lists ignore, and even foil, our grammatical machinery. And I at least have never been able to remember what my own randomart SSH key fingerprint looked like. Adding colour might be a good start.

It has to be a visualization in which changing a few pixels makes it look significantly different. Otherwise we can still make a 'low distance' brute force attack.

I don't understand.

What you need is a picture that makes the visually salient information tot up to about 160 bits.

That's tough, but since the human visual system is so powerful, it's not hopeless. But we would need real psychologists to help design the art generators, backing the results with experiments.

It depends on how valuable the identity is. I check some characters, for additional security I check some in the middle until I am satisfied with security. The downside is security creep, but verified identities generally grow more secure the older they are (does this grow faster?).

The upside to showing a larger hash is that humans are very good at roughly comparing two things. The difference in casing is probably enough to trigger a conscious check. A visual hash is still better.

The correct way to compare hashes is to let the computer do it.

i.e. Ctrl + C, Ctrl + F, Ctrl + V

"oh look it didn't find it ... they don't match", versus, "yup all 64/128/2048/n digits match, the hashes are the same".

We still need to define a hash format. Typical hex/base64 would work, but imagine someone tries to be smart and invent a dictionary word encoding with Unicode characters, and then someone brute forces another key that's actually different but will match a search with smart Unicode collation algorithms.
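A minimal Python sketch of "let the computer do it" with a strict format: it accepts only ASCII hex digits (plus spaces and colons as separators) and rejects anything else, so look-alike Unicode characters cannot sneak through a lenient comparison. The function names and the 40-digit length (a 160-bit fingerprint) are assumptions for illustration.

```python
import string

HEX_DIGITS = set(string.hexdigits)   # ASCII 0-9, a-f, A-F only
SEPARATORS = set(" :\t\n")

def normalize_fingerprint(text):
    """Drop separators, require exactly 40 ASCII hex digits, upper-case."""
    cleaned = "".join(ch for ch in text if ch not in SEPARATORS)
    if len(cleaned) != 40 or any(ch not in HEX_DIGITS for ch in cleaned):
        raise ValueError("not a 40-digit hexadecimal fingerprint")
    return cleaned.upper()

def same_fingerprint(a, b):
    """Compare two fingerprints after strict normalization."""
    return normalize_fingerprint(a) == normalize_fingerprint(b)

print(same_fingerprint(
    "ABAF 11C6 5A29 70B1 30AB E3C4 79BE 3E43 0041 1886",
    "abaf11c65a2970b130abe3c479be3e4300411886",
))
```

Rejecting malformed input outright, rather than trying to interpret it, is the design choice here: a mismatch and a parse error should both stop the user.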

When is it sensible for a human to inspect the fingerprint anyway?

Other than the business cards people keep mentioning, PGPfone wasn't a typo but rather a voice communications package, where it's very reasonable to think that you might bootstrap a secure conversation by reading someone your key fingerprint over a phone. Prior to the NSA reveal, the threat model I most heard for this was someone doing business in China or Russia where there have perennially been allegations that the intelligence agencies help large businesses, where blocking a passive wiretap is a success.

I've got to say, I think Satoshi picking the Base58[0] subset instead of B64 was a very smart choice for Bitcoin and it would make sense to adopt it elsewhere as well (as far as anywhere that the string needs to be parsed by a human-eye).

[0] https://en.wikipedia.org/wiki/Base58

Something I hadn't realised before now is that a PGP fingerprint has the same format as an IPv6 address, so if you just substitute colons for spaces you can display it as a hipku


    > Hipku.encode('0F6A:1465:32D8:69AE:E438:F74B:6211:AA3B');

    > Bold grouse and brass ghosts
    > clamp strict lean sane tart dry whales.
    > Fresh geese blur rust dice.
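One caveat: a v4 fingerprint is 160 bits (ten groups of four hex digits) while an IPv6 address is 128 bits (eight groups), so only the first eight groups fit. A short Python sketch with the standard `ipaddress` module shows what is kept and what is dropped; the function name is made up, and the fingerprint is the hex form of the fake-key Base64 string quoted elsewhere in this thread.

```python
import ipaddress

def fingerprint_as_ipv6(fingerprint_hex):
    """Return (IPv6Address built from the first 128 bits, leftover bytes)."""
    raw = bytes.fromhex("".join(fingerprint_hex.split()))
    if len(raw) != 20:
        raise ValueError("expected a 160-bit fingerprint")
    return ipaddress.IPv6Address(raw[:16]), raw[16:]

FAKE = "0F6A 1465 32D8 69AE E438 F74B 6211 AA3B 0041 1886"
address, leftover = fingerprint_as_ipv6(FAKE)
print(address.exploded.upper())
print("dropped:", leftover.hex().upper())
```

With this truncation the dropped 32 bits are exactly the short key ID, so the hipku above covers the part of the fingerprint that differs between the real and fake keys, but it is not a rendering of the whole fingerprint.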

I hadn't heard of hipku before, it's quite clever. And my current IP produces a nice one.

  The clever dark lamb
  prowls in the ancient jungle.
  Tiger lillies dance.

While it's certainly easier to tell the difference between two random keys represented in base64, it's probably not that hard to create a similar brute force algorithm that makes the base64 representation look similar.

Odds are 1 in 64 to get the first character as lowercase q, 10 in 64 to get digits in the second spot, once more for the third spot, one in 16 to get a + sign close to where the plus sign is right now (so 1/64 for an exact match, 1/32 for one position off to either side, 1/16 for one position off to the left or right), another 1/16 for the second plus, and finally let's match the last three characters so another 1/(64^3). The padding should always be the same I think.

After, on average, 27.5 billion attempts you'd have a matching base64 output. It's not as great as a 32-bit integer (2 billion attempts on average) but it's in the same ballpark.
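As a rough check of that arithmetic, the sketch below multiplies the odds exactly as listed, using Python's `fractions` module. The variable names are illustrative. Counting the digit constraint for both the second and third characters gives roughly 176 billion expected attempts; counting it only once reproduces the 27.5 billion figure. For scale, a 32-bit ID has 2^32 (about 4.3 billion) possible values.

```python
from fractions import Fraction

first_char = Fraction(1, 64)       # exact 'q' in the first position
digit_slot = Fraction(10, 64)      # any digit in a given position
plus_near = Fraction(1, 16)        # '+' close to its current position
last_three = Fraction(1, 64 ** 3)  # last three characters match exactly

# Digit constraint applied to both the second and third characters.
p_two_digits = first_char * digit_slot ** 2 * plus_near ** 2 * last_three
# Digit constraint applied once only.
p_one_digit = first_char * digit_slot * plus_near ** 2 * last_three

attempts_two_digits = 1 / p_two_digits   # mean of a geometric distribution
attempts_one_digit = 1 / p_one_digit

print("two digit slots: %.1f billion" % (float(attempts_two_digits) / 1e9))
print("one digit slot:  %.1f billion" % (float(attempts_one_digit) / 1e9))
print("2**32:           %.1f billion" % (2 ** 32 / 1e9))
```

Either way the conclusion in the comment holds: a "looks similar at a glance" Base64 string is within reach of the same kind of brute force.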

This is different from the 32 bit key id because the key id would be an exact match and the "visually similar" base64 string is only similar on first glance, but if you check the base64 carefully you might as well check the fingerprint carefully.

This is why it's recommended to check a few random positions when matching cryptographic hashes by the way.

It's software, but it seems like some sort of digit-coloring scheme for hashes could make fakes easy to spot by eye. The simplest thing would just be a different color for every character (and pick very distinct colors for similarly-shaped characters). Or maybe color groups of 3 digits or something.

If you're using software to display both key ids, why not just use software to compare them?

They may be on different devices, for example. I agree we should use software to compare them whenever possible, but sometimes it is practical to be able to eyeball differences.

If everyone did that, couldn't attackers just target collisions that "look like" the target key in Base64? i.e. the same first 5 or 6 chars and the punctuation in the right places.

There was a study posted here recently on this very topic:



From the abstract:

"The highest attack detection rate and best usability perception is achieved with a sentence-based encoding. If language-based representations are not acceptable, a simple numeric approach still outperforms the hexadecimal representation."

I believe miniLock also does that. Any reason why miniLock can't be used as an alternative to PGP? (for email/file encryption)


Because this isn't just about that. It's about an independently implemented web of trust. PGP has buy in from people like Linus, it has independent implementations, it is distributed, and has a strong web of trust.

Is there a reason a hacker couldn't just create a key like "q68RxlopoLEwq+PEeb4+QwBBGIY=" to create a similar problem?

The string is longer so collision is harder to find. btw that's a way .onion addresses are generated, it takes first 80 bytes from the key and outputs base32 from the input.

Rather, it takes the first 80 bits of a SHA1 hash of a 1024-bit RSA public key, then converts it to base32.
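In Python that derivation is only a few lines. This is a sketch for the older 16-character onion format described above, assuming the input is the DER-encoded RSA public key; the function name is made up and the key bytes below are a placeholder, not a real key.

```python
import base64
import hashlib

def onion_address(public_key_der):
    """First 80 bits of SHA-1 over the encoded public key, in base32."""
    digest = hashlib.sha1(public_key_der).digest()
    label = base64.b32encode(digest[:10]).decode("ascii").lower()
    return label + ".onion"

# Placeholder bytes standing in for a DER-encoded 1024-bit RSA public key.
placeholder_key = b"not a real DER-encoded key"
print(onion_address(placeholder_key))
```

Ten bytes encode to exactly sixteen base32 characters with no padding, so an attacker targeting a look-alike address is searching an 80-bit space, or far less if only a prefix has to match.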

Obviously, collisions aren't obscenely expensive.

Linus' fake key [1], as well as all the others from the random sample I took, have been revoked as of today.

[1] http://pgp.mit.edu/pks/lookup?op=vindex&search=0xEA185A5E76E....

So the evil32 people kept the private keys. That's exciting.

Not necessarily, they might have just generated revocation certs, which are separate (so that they can be used in case your private key is lost). Keeping a revocation cert would be a responsible thing to do, just in case something like this happens.

They had an old backup that contained the private keys. See this comment https://news.ycombinator.com/item?id=12298230

Personally I like the keybase approach for this of tying GPG keys to a set of accounts.

Generally when I'm talking to someone online I know someone via other sites (e.g. twitter, github), so being able to say "the person who controls the account xxx on github, uses this key" lets me establish a level of initial trust.

Obviously for high trust applications that's not enough, but better than nothing...

Just to point out, this attack goes back at least 22 years. According to [this thread](https://groups.google.com/forum/#!topic/sci.crypt/JSSM6Nbfwe...), credit is due to Paul Leyland.

Take a leaf out of urbit's book, and convert hex strings into readable nonsense syllables that are a lot easier for humans to compare.

https://en.wikipedia.org/wiki/PGP_word_list is the original version of this, and was created for exactly this - conveying the full fingerprints of openpgp keys.

Unicode 9.0 contains 1085 emoji; that should be even easier to compare than random symbols.

I think emoji would be hard to compare. A lot of very similar little faces. Is that a wink or a blink or a frown?

An example of urbit's rendering of a 128-bit number into textual form is "racmus-mollen-fallyt-linpex--watres-sibbur-modlux-rinmex". While it might be gibberish, it's gibberish that even a screen-reader program could take a swing at, and humans can easily read.

A similar design is Proquint, which IPFS uses:


Proquint (5 letters per 16 bits) is tighter than Urbit's `@p` (6 letters per 16 bits). The Urbit form was designed for synthetic names and restricts itself to phonemes that sound comfortable and natural to English speakers. (Not to say that English should be the universal language, it's actually a terrible language to make everyone learn, just that it is.)

Word lists work reasonably well, but they're quite bulky and they don't take advantage of the human hardware accelerator for learning new words. When you have a GPU, use it. These kinds of synthetic strings also make great passwords, BTW.

(Disclaimer: Urbit guy here.)

If you want to make a more universal phoneme-generator, the basic contours of a nearly-universal [1] phonotactics is as follows:

* Strict CV syllable scheme.

* Atonal

* Consonants distinguished only by voiced/voiceless (Chinese, e.g., doesn't do a voicing distinction, but switching to an aspiration distinction would suffice for them)

* 5 vowels: a, e, i, o, u (actual vowel quality may vary; every language that has at least 5 vowels has these 5 vowels) (some languages, particularly indigenous languages in North America, have 3 or 4 vowels, but the intersection yields too few vowels).

* Consonants are harder to inventory. /p/, /t/, /k/, /m/, /n/ are nearly universal, and /b/, /g/, /d/, /s/, /z/ are also quite common. The IPA /j/ (that's the 'y' in 'ya' for English speakers), /w/ (pronounced as you'd think in English) are pretty common semi-vowels. Maybe /l/, /ʃ/, /ʒ/ as well, should you need more consonants.

That gives you 25-75 plausible syllables, depending on how many consonants you go with.

[1] If you go by least common denominator, you end up with maybe 1 vowel and no consonants (there's no consonant phoneme present in every language IIRC).
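The 25-75 range can be checked, and turned into an encoder, with a short Python sketch. The grouping of consonants into tiers follows the list above; the function names, the hyphen separator, and the choice of a 160-bit target are assumptions for illustration.

```python
from itertools import product

VOWELS = ["a", "e", "i", "o", "u"]
CORE = ["p", "t", "k", "m", "n"]       # nearly universal
COMMON = ["b", "g", "d", "s", "z"]     # also quite common
SEMIVOWELS = ["j", "w"]
EXTRA = ["l", "ʃ", "ʒ"]                # only if more consonants are needed

def syllables(consonants):
    """All strict consonant-vowel syllables for the given consonants."""
    return [c + v for c, v in product(consonants, VOWELS)]

def syllables_needed(table_size, bits=160):
    """Smallest count k such that table_size**k covers 2**bits values."""
    k, capacity = 0, 1
    while capacity < 2 ** bits:
        capacity *= table_size
        k += 1
    return k

def encode(value, table, length):
    """Write a non-negative integer as a fixed number of syllables."""
    out = []
    for _ in range(length):
        value, rem = divmod(value, len(table))
        out.append(table[rem])
    if value:
        raise ValueError("value does not fit in the requested length")
    return "-".join(reversed(out))

def decode(text, table):
    index = {s: i for i, s in enumerate(table)}
    value = 0
    for syl in text.split("-"):
        value = value * len(table) + index[syl]
    return value

for tier in (CORE, CORE + COMMON, CORE + COMMON + SEMIVOWELS + EXTRA):
    table = syllables(tier)
    print(len(table), "syllables ->", syllables_needed(len(table)), "per 160 bits")
```

Even with the largest inventory, a full 160-bit fingerprint needs a couple of dozen syllables, which gives a feel for the trade-off between universality and length.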

Doesn't Lojban try to have pretty easy phonotactics or something? They do have consonant clusters, but I thought they did some kind of study and chose their phonemes and some rules on the basis of things that most languages wouldn't find too difficult.

Edit: not suggesting that Lojban's solution is somehow preferable to your advice, just trying to remember what they did about this issue.

Lojban uses the consonants I gave (sans /j/ and /w/, although these are counted as diphthongs instead), plus /f/, /v/, /x/, /ʔ/, /h/, and /r/, as well as /ə/ for a sixth vowel. The syllable scheme seems to be largely C(C)VC(C), with largely only mixed voiced/unvoiced and geminate consonant clusters prohibited. That said, they do allow for "buffer" vowels in pronunciation to aid speakers who have trouble with consonants (and yet they have a /ə/?).

From what I can tell, CV(C) (with the second consonant usually having some restrictions) is fairly widespread. However, in my personal (purely anecdotal) experience, pronouncing foreign consonant clusters or unfamiliar final consonants is much harder than pronouncing unfamiliar initial consonants or vowels, so I'd be slightly wary of letting the final consonant go too unrestricted.

Proquint example for comparison: "pokak-fijus-zavaz-posuf-bizar-luhuf-kulor-marak".
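For reference, a minimal sketch of how a proquint like the one above is built, assuming I recall the Proquint proposal's alphabet correctly (16 consonants carrying 4 bits each, 4 vowels carrying 2 bits each, laid out consonant-vowel-consonant-vowel-consonant per 16-bit word). The function names are my own.

```python
CONSONANTS = "bdfghjklmnprstvz"  # 16 symbols, 4 bits each
VOWELS = "aiou"                  # 4 symbols, 2 bits each


def quint(word: int) -> str:
    """Encode one 16-bit integer as a five-letter CVCVC string."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("word must fit in 16 bits")
    return (
        CONSONANTS[(word >> 12) & 0xF]
        + VOWELS[(word >> 10) & 0x3]
        + CONSONANTS[(word >> 6) & 0xF]
        + VOWELS[(word >> 4) & 0x3]
        + CONSONANTS[word & 0xF]
    )


def to_proquint(data: bytes) -> str:
    """Encode an even-length byte string, one quint per 16 bits."""
    if len(data) % 2:
        raise ValueError("need an even number of bytes")
    words = (int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2))
    return "-".join(quint(w) for w in words)


print(to_proquint(bytes([127, 0, 0, 1])))
```

The eight quints in the example above would then correspond to 128 bits of input.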

If you don't have a font that contains those emoji, you'll get either blocks of ??????? or little square blocks containing difficult-to-see numbers in them.

Do Debian or any of the *BSDs yet default to including a font that supports emoji?

Other possible alternatives (somewhat more limited in number so longer strings) would be box drawing characters[0] or game blocks (mahjong[1], tiles[2] or cards[3]).

[0] https://en.wikipedia.org/wiki/Box_Drawing

[1] https://en.wikipedia.org/wiki/Mahjong_Tiles_(Unicode_block)

[2] https://en.wikipedia.org/wiki/Domino_Tiles

[3] https://en.wikipedia.org/wiki/Playing_cards_in_Unicode#Playi...

I'm not sure if you're sincere but I don't think emoji would be easier to compare. Might be less friendly to screen reader users?

> I'm not sure if you're sincere but I don't think emoji would be easier to compare.

The point is to leverage human pattern matching so you want short-ish figures with large differences.

Each hex digit is 4 bits, but each emoji is 10 bits, so a 128-bit key is 13 emoji, which is significantly more eyeballable than 32 hex digits, and chances are you'll notice EGGPLANT being replaced by CAMERA in 13 pictures more easily than you'd notice B being replaced by 8 in 32 characters.
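The arithmetic, as a quick sketch. The only assumption is that every symbol in the alphabet is equally usable, so a symbol carries log2(alphabet size) bits; the function name is made up.

```python
from math import ceil, log2


def symbols_needed(key_bits: int, alphabet_size: int) -> int:
    """Symbols required to display key_bits using the given alphabet."""
    return ceil(key_bits / log2(alphabet_size))


for label, size in [("hex digits", 16), ("emoji", 1024), ("words", 65536)]:
    print(label, symbols_needed(128, size), symbols_needed(160, size))
```

For a 128-bit value this gives 32 hex digits versus 13 emoji; the second column is the same calculation for a 160-bit fingerprint.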

> Each hex digit is 4 bits, but each emoji is 10 bits, so a 128-bit key is 13 emoji, which is significantly more eyeballable than 32 hex digits, and chances are you'll notice EGGPLANT being replaced by CAMERA in 13 pictures more easily than you'd notice B being replaced by 8 in 32 characters.

Compare like with like - what are the two most similar emoji from that set of 2^10 ?

I hate to say it, but comparing two, long hash values via a screen reader doesn't seem viable for humans, regardless of emoji.

Maybe an auralizer to turn the hash into a short piece of music?

See my above example from urbit. They are something a screen reader would clearly read differently for different values, if not comprehensibly.

32 bit seems so obviously bad I'm not sure why we still have it..

The 90s were a different time, that's the only explanation I can come up with. Short key IDs are yet another 90s-crypto wart.

Even in the 90s it seems odd to design a system trivially vulnerable to birthday attacks (with maybe a simplification, but that's how primitives are evaluated: if it fails in a slightly simplified version or environment, it's unfit for service). Now birthday attacks are not exactly applicable to the "how to forge a key with the same short id as that exact person" problem, but with a high number of keys available there may be other ways to ease your attempts to generate collisions, even if that means relaxing your target requirements somehow. Even if there are not, you must be extremely paranoid when designing anything security related: here barely doubling the size of the short id would have mitigated today's problem, but we see that now 64 bits is not even enough. So there has been a fuck-up (with, I agree, a (lack of) security climate in the 90s that might have contributed a lot to that fuck-up), and it has enormous consequences right now (given that some distros still default their gpg packages to an obsolete version, it seems). We must take what happened and happens right now into account, and learn our lesson: trivial conveniences that decrease security against abstract holes shall be considered an absolute no-go, and something to fix ASAP with absolute priority.

Sadly I do not expect the security approach to change before at least one more generation, and then to be honest I'm not even sure it will ever change at all if we consider the mean global situation: the approach of far too many people is still "we don't give a fuck, we don't know anything about that, actually we don't even know that we should know something about that, this will just not happen to us, this is only a cost we can skip". Unless they are personally fucked, I don't expect half of that kind of people to change their minds. And then there is now so much software everywhere that I expect the vast majority is so full of holes that it is not even funny, and I expect that the ratio of insecure software will actually increase unless some kind of regulation is put in place -- but then I don't expect regulation to actually be sane and mandate real security, given that politicians want back doors at least every 4 years.

To optimistic people, please consider the following: even in a mainstream IT field, on one of the most used kinds of device today, handling personal data all day, the market leader designed an ecosystem where the OS that most people are actually using is most of the time not patched during most of the lifetime of said devices. If Google can get away with having such an insane and shameful approach, why would you expect a random car vendor to have any real security in its embedded software? Obviously it is even worse for gadgets that VCs currently think should/will be installed everywhere.

We are heading for a security nightmare unless each of you who thinks security is important wakes up and pushes as hard as they can to improve the situation. Relentlessly.

We have 8-digit PGP short IDs for the same reason we have abbreviated 7-digit hashes for git commits (`git rev-parse --short`): it's short enough to keep in one's head (see https://en.wikipedia.org/wiki/Seven_plus_or_minus_two).

But you usually don't have to worry about adversarial input when you check out a local git branch.

As mentioned on the debian-devel list, add this to your ~/.gnupg/gpg.conf:

    keyid-format long

as a stopgap measure; it will show long (64-bit) IDs. Obviously, comparing the full fingerprint (`--with-fingerprint`) is best.
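To make the relationship concrete: for v4 OpenPGP keys the long ID is the low-order 64 bits of the 160-bit fingerprint, and the short ID is the low-order 32 bits. A small sketch of that truncation, using the two contrived fingerprints quoted elsewhere in this thread as sample input (the function name is made up):

```python
HEX = set("0123456789ABCDEF")


def key_ids(fingerprint: str) -> dict:
    """Derive the long and short key IDs from a 40-hex-digit v4 fingerprint."""
    fp = "".join(fingerprint.split()).upper()
    if len(fp) != 40 or not set(fp) <= HEX:
        raise ValueError("expected 40 hex digits")
    return {"fingerprint": fp, "long": fp[-16:], "short": fp[-8:]}


# Contrived sample values from this thread, not verified key material.
real = key_ids("ABAF 11C6 5A29 70B1 30AB E3C4 79BE 3E43 0041 1886")
fake = key_ids("ABAF 11C6 32D8 69AE E438 F74B 6211 AA3B 0041 1886")

print(real["short"] == fake["short"])  # the short IDs collide
print(real["long"] == fake["long"])    # the long IDs do not
```

With `keyid-format long` you would see the 16-digit form, which tells these two apart; the 8-digit form does not.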

Someone claims (https://groups.google.com/forum/#!topic/sci.crypt/JSSM6Nbfwe...) that it's just as easy to spoof 64 bit as 32 bit keys. I have no idea whether to believe him, but even if he's full of it, it's prudent to assume that if the 32 bit attack was practical 20 years ago, 64 bits is within reach today.

That's a post from 1996. I'm not sure how "easy" was calculated. Possibly both were hard but tractable, but 64 bits should have been much harder.

Today it is significantly easier to collide a 32-bit key ID than a 64-bit one, but both are pretty easy. A 32-bit key ID can be collided in 4 seconds on a GPU, according to https://evil32.com. It can certainly be done on normal desktop hardware in hours. For a 64-bit key ID, JoshTriplett calculated (last week) that a collision would take 15 days if someone built hashing hardware of comparable quality to a commercial Bitcoin miner.


(Incidentally, I am very pleased with Bitcoin having created a liquid market between cryptographic computational speed and money, so we can answer these sorts of questions precisely.)
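A back-of-the-envelope version of that scaling, under loudly stated assumptions: the 4-second figure from evil32.com is taken at face value for one GPU, and the work is assumed to grow in proportion to the size of the ID space (2^bits). Neither the function name nor the linear-scaling model comes from the quoted calculation.

```python
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def scaled_seconds(known_bits: int, known_seconds: float, target_bits: int) -> float:
    """Scale a measured search time to a larger ID, assuming cost ~ 2**bits."""
    return known_seconds * 2 ** (target_bits - known_bits)


one_gpu = scaled_seconds(32, 4, 64)
print("single GPU, 64-bit ID: about", round(one_gpu / SECONDS_PER_YEAR), "years")

# How much faster than that one GPU the hardware must be to finish in 15 days.
speedup = one_gpu / (15 * SECONDS_PER_DAY)
print("speedup needed for 15 days: about", round(speedup), "x")
```

So on these assumptions a single GPU is out of reach by centuries, but hardware some four orders of magnitude faster gets you to the 15-day figure.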

This means state actors like three letter agencies already have this capability.


I've mentioned an idea in Phoronix forums[1] yesterday: Since most users won't bother comparing the entire signature (also applies to comparing md5/sha*/etc. hashes), it might be a good idea to map & display blocks of the sig as English words to the user.

Care must be taken to avoid similar-looking and similar-sounding (homophonic) words, but since there are >150k words in the Oxford English Dictionary, it should be possible to get 65k usable ones.

In fact, someone pointed out something like this already exists to a degree: [2]

A contrived visual example (suppose someone matches first two and last two segments!):

  Fake Linus Torvalds: ABAF 11C6 32D8 69AE E438 F74B 6211 AA3B 0041 1886
  Real Linus Torvalds: ABAF 11C6 5A29 70B1 30AB E3C4 79BE 3E43 0041 1886

With a word salad approach:

  Fake Linus Torvalds: lopsided crate threatening hydrant peep bumpy art work earth spurious
  Real Linus Torvalds: lopsided crate symptomatic equal kaput chunky kettle include earth spurious

Even with an even-spaced font, it's hard to confuse the two.

(random words from [3])

[1] https://www.phoronix.com/forums/forum/phoronix/latest-phoron...

[2] https://github.com/bitcoin/bips/blob/master/bip-0039.mediawi...

[3] https://www.randomlists.com/random-words
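A minimal sketch of the mapping, assuming a curated list of exactly 65536 words exists so that each 16-bit block picks one word. No such vetted list is included here: `PLACEHOLDER_WORDS` is a stand-in of numbered tokens, and the function name is hypothetical.

```python
def fingerprint_to_words(fingerprint: str, wordlist: list) -> list:
    """Map each 16-bit block (4 hex digits) of a fingerprint to one word."""
    fp = "".join(fingerprint.split())
    if len(wordlist) != 1 << 16:
        raise ValueError("wordlist must contain exactly 65536 entries")
    if len(fp) % 4:
        raise ValueError("fingerprint must be a whole number of 16-bit blocks")
    return [wordlist[int(fp[i:i + 4], 16)] for i in range(0, len(fp), 4)]


# Stand-in for a real, curated word list (assumption for illustration only).
PLACEHOLDER_WORDS = [f"word{i:05d}" for i in range(1 << 16)]

fake = fingerprint_to_words(
    "ABAF 11C6 32D8 69AE E438 F74B 6211 AA3B 0041 1886", PLACEHOLDER_WORDS)
real = fingerprint_to_words(
    "ABAF 11C6 5A29 70B1 30AB E3C4 79BE 3E43 0041 1886", PLACEHOLDER_WORDS)

# Positions where the two word sequences differ.
print([i for i, (a, b) in enumerate(zip(fake, real)) if a != b])
```

A 160-bit fingerprint comes out as ten words, and any block that differs shows up as a different word in that position.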

The pgp word list was created for exactly this purpose. https://en.wikipedia.org/wiki/PGP_word_list

Just make the keys 256 characters and add an ASCII art penis in the front, just to make it very clear that humans should not be seeing these things.

wait, this looks interesting. Can someone comment on this? Are you supposed to know the fingerprint after you input name and stuff or before? Or is this just modification of data like how you can add email IDs?

You're supposed to confirm the fingerprint with the person. At the time the recommendation was a phone call (if you knew their voice) using the PGP word list - it was felt to be computationally implausible to fake that up in realtime. Or people publish fingerprints on their site etc.

Obv. the fingerprint only matters if you want to be sure you are talking to someone specific, in which case you usually have a way to know who they are or why you care. For some use cases trust-on-first-use is adequate.

Thanks for taking the time, but it seems like you didn't click on the link. When you look at the name of the keyholder of the fingerprint (B9E39278), the User Name of the keyholder is the same as the fingerprint of the key. I was asking if the User Name is set after the key is generated or before. Or if one can change the user name after key generation. And I think you can, so there's nothing interesting here :)

method 1: Keep generating keys until you have a collision.

method 2: What you said, modify the details.
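Method 1 is just a brute-force search over a truncated hash. A toy sketch of the idea follows; it is not real key generation. It hashes arbitrary byte strings with SHA-1 and matches only the low 16 bits so that it finishes quickly, whereas a real attack would vary actual key material until the fingerprint's low 32 bits match. All names are made up.

```python
import hashlib
from itertools import count


def low_bits(data: bytes, bits: int) -> int:
    """Low-order `bits` bits of the SHA-1 digest of data."""
    digest = hashlib.sha1(data).digest()
    return int.from_bytes(digest, "big") & ((1 << bits) - 1)


def find_collision(target: bytes, prefix: bytes, bits: int):
    """Try prefix+counter until its truncated hash matches the target's."""
    wanted = low_bits(target, bits)
    for n in count():
        candidate = prefix + str(n).encode()
        if candidate != target and low_bits(candidate, bits) == wanted:
            return candidate, n + 1  # matching input and number of tries


candidate, tries = find_collision(b"victim key", b"attacker key ", 16)
print(candidate, "found after", tries, "tries")
```

Each extra bit in the truncated ID doubles the expected number of tries, which is why 32 bits is within easy reach and longer IDs cost more.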

A fun way to play/explore all things GPG is to use a javascript library and Chrome's javascript debugger (e.g. https://openpgpjs.org/)

Yeah I was wondering if method 2 exists. I remember reading about something similar but wasn't sure. Thanks for the link!

The e-mail includes a couple of links. The linked site at http://gwolf.org/node/4070 returns "page not found?" for me currently. Here is an archived version of that page which works: http://archive.is/sSXvX

So my solution to this was to sign a ton of publicly archived email with my full fingerprint. Then I stopped signing email and not a single person noticed/cared, so yeah. Meh. CVE assignments, no big deal I guess.

The apt package manager uses short IDs when a repository key is not recognised, as well as when you use apt-key to add the key to your keychain.

When using 3rd party repositories, you will often see something like the following, which also uses short-ids:

> apt-key adv --keyserver keyserver.ubuntu.com --recv 7F0CEB10

So it makes me wonder if any commonly used repositories (or PPAs) have fake duplicate keys with the same key-id.
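If you script this sort of thing, one cheap guard is to refuse anything shorter than a full fingerprint before passing it to a keyserver fetch. A sketch, assuming IDs are given as hex with an optional 0x prefix; the function names and the 160-bit threshold are my own choices, not anything apt does.

```python
HEX_DIGITS = set("0123456789ABCDEF")


def key_id_bits(key_id: str) -> int:
    """Number of bits expressed by a hex key ID or fingerprint string."""
    digits = "".join(key_id.split()).upper()
    if digits.startswith("0X"):
        digits = digits[2:]
    if not digits or not set(digits) <= HEX_DIGITS:
        raise ValueError("not a hex key ID")
    return 4 * len(digits)


def is_full_fingerprint(key_id: str) -> bool:
    """True only for a 160-bit (40 hex digit) v4 fingerprint."""
    return key_id_bits(key_id) == 160


print(key_id_bits("7F0CEB10"), is_full_fingerprint("7F0CEB10"))
```

The ID in the command above is only 32 bits, so this check would reject it.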

These keys have always seemed remarkably short.

Roughly how many CPU-hours does it take to find a collision?

"It takes 4 seconds to generate a colliding 32bit key id on a GPU"

source: https://evil32.com/

I'm using ABCDABCD, and from memory it took a couple of hours. The hardest part was picking a 'vanity' short id that wasn't already taken.
