Hipku – encode any IP address as a haiku (gabrielmartin.net)
143 points by gabemart on Dec 15, 2014 | 35 comments

Two notable addresses:

    Ace ants and ace ants
    aid ace ace ace ace ace ants.
    Ace ants aid ace apes.

    The hungry white ape
    aches in the ancient canyon.
    Autumn colors crunch.
We can see that IPv6 adds ants to the traditionally monkey-themed localhost...

Reminds me of http://en.wikipedia.org/wiki/PGP_word_list , which is less poetic, but designed to be unambiguous when read over the phone.

The idea behind that list is good, but I think they went overboard with their machine learning approach to finding words that are as far apart as possible in pronunciation. Since such a list will presumably be used in international contexts, basing it on Basic English[1] or something similar would probably have been a better idea. How many non-native speakers (who aren't fans of Dave Sim's "Cerebus") will know the word "Aardvark"?

[1] http://en.wikipedia.org/wiki/Basic_English

I love it. The practical side of me, however, would recommend increasing the name space a little bit and adding checksums; human memory is not perfect, and having built-in error correction would help immensely.

Great idea. Thanks. I need to find some extra space for fitting in a version number anyway, to preserve backward-compatibility.
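The checksum suggestion above can be sketched in Python. This does not use Hipku's real encoder: the one-byte sum-mod-256 checksum and the helper names `with_checksum` and `is_valid` are assumptions made up for illustration. A checksum like this only detects some mistakes (it cannot correct them, and it misses swapped bytes), so real error correction would need a stronger code.

```python
import ipaddress


def with_checksum(address: str) -> bytes:
    """Return the packed address bytes followed by one checksum byte."""
    packed = ipaddress.ip_address(address).packed  # 4 bytes (IPv4) or 16 (IPv6)
    checksum = sum(packed) % 256  # hypothetical checksum, not part of Hipku
    return packed + bytes([checksum])


def is_valid(payload: bytes) -> bool:
    """Check that the last byte matches the sum of the preceding bytes."""
    body, checksum = payload[:-1], payload[-1]
    return sum(body) % 256 == checksum


payload = with_checksum("127.0.0.1")
print(payload.hex())      # address bytes plus the checksum byte
print(is_valid(payload))  # True

# Simulate misremembering one word, i.e. one wrong byte.
corrupted = bytes([payload[0] ^ 0x01]) + payload[1:]
print(is_valid(corrupted))  # False
```

The extra byte would cost one more word in the poem, which is the "increase the name space a little" trade-off the comment describes.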

I've been thinking about something similar to this for secure passwords, and your haikus neatly demonstrate both the challenge and the viability. I was thinking of encoding as a phrase, but keeping the first two letters of each word in a 256-word set unique, so that a phrase could be used for memorization, but you'd "only" have to type the first two letters of every word. But 16 characters for 64 bits of entropy is still pretty long (for password entry) -- and ideally you'd want 96 or even 128 bits, especially for passwords/passphrases that are used to derive/protect encryption keys (the goal being to avoid having the entropy of the password as the weakest link; there are still other challenges with passwords, of course, e.g. key loggers).

I'd also thought about having a generic library, that could be used for things like representing hashes/fingerprints etc (think: an alternative to the ssh key ascii art).

While I'd considered poetry (especially using rhyming dictionaries) -- my main idea so far is to just construct grammatically well-formed sentences. With a limited vocabulary, it should be easy enough...

And as touched upon in this post, it might make a nice alternative for verbally communicating 64-128 bits of random data -- say reading out a password while someone across the server room types it into a console.

[edit: eg, using a random haiku:

    Hoarse dunes and slim germs
    gulp pure ripe foul dead bland sole.
    Firm trees bleed short heads.
This would become: "hoduslgegupurifodeblsofitrblshhe" (ignoring the "and"). If anyone wonders why you'd not just type in the whole thing: it's hard, even for a skilled typist, to get something so long exactly right when typing blind. And obviously even the short form here strains typing ability. But then again, how many can say with confidence that their passwords hold 128 bits' worth of entropy? With a one-to-one mapping between word lists (or, equivalently, sets of two leading characters) and a random N-bit integer, and assuming the integer is as random as your other secret/session keys, it's trivial to demonstrate a lower bound for the entropy of such a password.]

[edit2: I should add that I considered adding a standard transform to the final password: capitalize the first letter, and terminate with a punctuation mark. It would add nothing to the entropy of the password, but might allow an "insecure" lowercase-only password to pass inane "strength" requirements, by mixing three character classes.]
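The abbreviation and the "standard transform" from the two edits above can be sketched in Python. The function names, the `SKIPPED_WORDS` set, and the choice of a trailing full stop are assumptions for illustration, not part of any published scheme.

```python
import re

SKIPPED_WORDS = {"and"}  # filler words that carry no entropy in this sketch


def abbreviate(phrase: str, prefix_len: int = 2) -> str:
    """Keep the first `prefix_len` letters of every non-filler word."""
    words = re.findall(r"[a-z]+", phrase.lower())
    return "".join(w[:prefix_len] for w in words if w not in SKIPPED_WORDS)


def policy_friendly(password: str) -> str:
    """Capitalize the first letter and append punctuation; adds no entropy."""
    return password[:1].upper() + password[1:] + "."


haiku = (
    "Hoarse dunes and slim germs "
    "gulp pure ripe foul dead bland sole. "
    "Firm trees bleed short heads."
)
short = abbreviate(haiku)
print(short)                   # hoduslgegupurifodeblsofitrblshhe
print(policy_friendly(short))  # Hoduslgegupurifodeblsofitrblshhe.
```

With 16 words drawn from a 256-word list, the 32 typed characters stand for 16 x 8 = 128 bits, provided the words were chosen uniformly at random.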

I made this a while back. :) http://rmmh.github.io/abbrase/

An interesting fork: http://bcaller.github.io/abbrase/

Some comments: https://news.ycombinator.com/item?id=8059210

Letting users pick their passphrases from a list sacrifices a few bits of entropy in theory, and in practice gives much more usable mnemonics.

I'm not sure grammatical models are a definite improvement over Abbrase's naive bigram method -- part-of-speech constraints can make sentences more awkward than freeform associations.

Having 128 bit passwords feels like overkill. I'm not sure on the precise threat model for sites, but the worst case of someone getting your password hash should be ameliorated by not reusing passwords between sites and the password being stretched properly.

If you're actually deriving encryption keys directly from passwords, you'll probably be okay with 60 bits of password entropy and 30 bits of memory-hard stretching.
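As a rough sketch of memory-hard stretching, Python's standard `hashlib.scrypt` can derive a key from a passphrase. The cost parameters below are placeholders chosen so the example runs quickly; they are not tuned to the "30 bits" figure in the comment, and `derive_key` is a name made up for this example.

```python
import hashlib
import os


def derive_key(passphrase: str, salt: bytes) -> bytes:
    """Stretch a passphrase into a 32-byte key with scrypt (memory-hard)."""
    return hashlib.scrypt(
        passphrase.encode("utf-8"),
        salt=salt,
        n=2**14,  # CPU/memory cost; illustrative only, not tuned to any target
        r=8,
        p=1,
        maxmem=64 * 1024 * 1024,  # allow up to 64 MiB of working memory
        dklen=32,
    )


salt = os.urandom(16)  # stored alongside the ciphertext; it is not secret
key = derive_key("correct horse battery staple", salt)
print(len(key))  # 32
```

Raising `n` increases both the time and the memory an attacker needs per guess, which is what lets a shorter passphrase stand in for a longer one.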


I should add that my idea was to publish an RFC-style standard that worked both ways (from the password to the number and back again) -- which is why mapping two letters to 8 bits (one of 256 words) was nice: it makes for short word lists. It might even, with some care, allow for localizing the actual word list to at least some other European languages (we're really only concerned with the first two letters).

> Having 128 bit passwords feels like overkill.

I agree. However, as a user can't be sure sites properly stretch passwords, I'd say 64 bits feels more comfortable.

For use cases where one is protecting keys, it'd be good if the password weren't the weakest brute-force target by too large a margin. Obviously, if you can manage to get 64 bits into a password suitable for everyday use, chaining two together for the (presumably rarer) case of master passwords should be viable.

But even 64 bits is a lot of entropy to be typing in, no matter how you encode it :-/

Interesting concept. Somehow, humans need to be able to quickly and clearly say/pronounce an IPv6 address over the phone when talking with Tech Support. Not everyone knows ITU phonetics (two charlie eight foxtrot...). I bet there are other, even simpler ways to human encode IPv6 addresses (outside of DNS names of course).

Your link does not work. Did you mean S/KEY? http://en.wikipedia.org/wiki/S/KEY

Should work now.

This is fun! I'd have liked to have seen my IP address auto-detected and presented to me as a Hipku so I didn't have to go figure it out and come back to see the results.

I thought this was a fun idea so I did this: http://hipku.gabrielmartin.net

Related: proquints (PRO-nouncable QUIN-tuplets) http://arxiv.org/html/0901.4016

    lusab-babad gutih-tugad gutuk-bisog mudof-sakat haguz-biram
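For comparison with Hipku, here is a minimal Python sketch of the proquint encoding as I understand it from the linked proposal: each 16-bit word becomes five letters, with consonants carrying 4 bits and vowels carrying 2 bits. The function names are made up for this example, and the alphabets should be checked against the proposal before relying on them.

```python
import ipaddress

CONSONANTS = "bdfghjklmnprstvz"  # 16 symbols, 4 bits each
VOWELS = "aiou"                  # 4 symbols, 2 bits each


def word_to_quint(value: int) -> str:
    """Encode one 16-bit value as consonant-vowel-consonant-vowel-consonant."""
    return (
        CONSONANTS[(value >> 12) & 0xF]
        + VOWELS[(value >> 10) & 0x3]
        + CONSONANTS[(value >> 6) & 0xF]
        + VOWELS[(value >> 4) & 0x3]
        + CONSONANTS[value & 0xF]
    )


def ip_to_proquint(address: str) -> str:
    """Split the packed address into 16-bit words and join the quints."""
    packed = ipaddress.ip_address(address).packed
    words = [int.from_bytes(packed[i:i + 2], "big") for i in range(0, len(packed), 2)]
    return "-".join(word_to_quint(w) for w in words)


print(ip_to_proquint("127.0.0.1"))  # lusab-babad
```

An IPv4 address needs two quints; an IPv6 address would need eight, which is compact but far less memorable than a poem.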

I'm looking for flaws in using this as a password-remembering scheme.

Basically you remember the haiku, but when it comes to typing your password you convert it into an IP address using Hipku.

I guess an IPv6 address makes a really strong password, not breakable using a dictionary.

Any thoughts on the validity of this use case?

You might as well use the full length haiku - you wouldn't be exposing it to an external service, you wouldn't need internet access or a program installed, and you can probably type it faster than the IPv6 address. Using dictionary words for a passphrase is actually fine as long as the phrase is long enough. Figure out how much security you need (128 bits if you're going with the IPv6 example).

The problem with using the whole haiku is that you must type quite a lot, typically blindly, into a password entry field -- and not make a single mistake.

I think the more interesting question is: if you manage to memorize the haiku -- will you be able to retain it longer than the IPv6 address? After typing in the address a few times a day, you'd have it memorized (at least in muscle memory). But what if this was something you used only rarely (a passphrase for restoring backups, for example)? Would you remember the haiku even after you'd forgotten the IPv6 address?

If you memorize poems at all you have to remember the exact words; IMO they're easier than memorizing numbers. (In fact, the way I can remember the first 50-or-so digits of pi is that someone wrote a poem where the word-lengths correspond to the digits).

It's not that it's hard to remember, it's hard to type in. At least that's my experience (I use a few 16+ character passwords/passphrases). It might be easier if it's all lower-case, no numbers etc -- I'm not sure -- I've not tested myself.

But in writing the above, I had to hit backspace at least once -- something that's a bit hard to catch when you're typing blind into a password entry field, like when typing in the pass-phrase for unlocking a LUKS partition, or logging into a console session. Or even typing in a login password in a graphical login manager, like the windows login prompt, or gdm/ldm/xdm etc.

And it also takes time. Especially if you only get it right on your third attempt.

Shrug; I find it much easier to write English words than numbers and the like, but I guess YMMV. If you don't realize when you've made a mistake, it's well worth spending a bit of time learning to type properly, IMO; I spent a weekend practising and while I still occasionally typo, I know when I have without having to check. E.g. I wrote this sentence with a couple of letter transpositions, backspaced and corrected them, all without looking at the screen, just to check that it was possible.

Maybe. Then there is typing in the passphrase on Android [ed: i.e. using an on-screen keyboard] to unlock the FDE etc. Fwiw I had a year of touch typing in junior high, so I do generally type pretty well - but I still end up having to type in my pw to unlock my computer a couple of times on average. I'm not sure which parts I miss - quite possibly I'd be better off with a slightly longer, all-lowercase pw.

If you use full disk encryption with Truecrypt, the maximum length of your password is 64 characters.

If you use a passphrase, it will be easier to crack with a dictionary attack, since at best you will fit about 15 words into 64 characters.

How long's the actual Truecrypt key? Ordinary conversational English has about 3 bits of entropy per character, so 64 characters will have an equivalent strength to a 192-bit key. If Truecrypt uses 128-bit keys that's plenty; if they use 256-bit keys, they ought to allow a longer passphrase.

Of course if you choose truly random characters you have 7 bits of entropy per character, so you would have the same strength with a 28-character password. But which is going to be easier to remember, 28 random characters or 64 characters of ordinary English?
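The arithmetic in the two paragraphs above can be checked with a few lines of Python. The per-character figures (3 bits for English, 7 bits for random characters) are the comment's own estimates, taken here as given; 7 bits assumes 128 equally likely characters. The helper names are made up for this example.

```python
import math


def bits_for_text(length: int, bits_per_char: float) -> float:
    """Total entropy if every character contributes `bits_per_char` bits."""
    return length * bits_per_char


def chars_needed(target_bits: float, bits_per_char: float) -> int:
    """Smallest number of characters that reaches `target_bits`."""
    return math.ceil(target_bits / bits_per_char)


english_bits = bits_for_text(64, 3)           # 64 characters of English
random_chars = chars_needed(english_bits, 7)  # same strength, random characters
print(english_bits)  # 192
print(random_chars)  # 28
```

A lower estimate for English text would shrink the first number proportionally, so the conclusion is sensitive to the assumed bits per character.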

I think you'd be better off designing a scheme for generating secure, easy-to-remember passwords in the first place rather than using hipku. I had to make a lot of compromises to make the entire IPv6 space fit into a haiku. For example, hipku uses many one-syllable words which are relatively obscure, and it uses some words that are very similar to other words it uses.

Without the constraint of fitting an IPv6 address into a haiku, you wouldn't have to accept these compromises. I think it would be better to generate passwords that were a little longer (while still covering a 128 bit space) but much less ambiguous and easier to remember.

It would be pretty easy to do.

That's an interesting use case, so I thought I'd compare it against something like diceware[0] which is dictionary based but generally accepted to have strong levels of entropy. N.b.: I'm using a Shannon calculator[1].

Let's use the two examples closest to hand: the IPv6 address from this article's demo, A = 29A1:A600:F19B:B703:7080:5387:3685:A2AF, and the diceware example from xkcd/936, B = correct horse battery staple[2]. Using the Shannon entropy calculator, we find that A has entropy H(X) = 3.55397. The same analysis of B returns H(X) = 3.49468.

Now, that's Shannon entropy, which measures the entropy of an outcome in relation to itself (its own character distribution), not the entropy of an outcome drawn from a set of potential outcomes. To get the latter, we might start by comparing the number of potential values an IPv6 address can take (about 340 undecillion[3]) with the number of words on our diceware list (7,776 English words[0]) raised to the power of the number of words in our diceware passphrase.
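The difference between the two measures can be made concrete with a short Python sketch. The first function computes the per-character Shannon entropy of a single string, which appears to be what the calculator reports; the second computes the bits needed to pick one outcome uniformly from a keyspace. Both function names are made up for this example.

```python
import math
from collections import Counter


def shannon_entropy_per_symbol(text: str) -> float:
    """Entropy in bits per character of the string's own character frequencies."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def keyspace_bits(choices: int, picks: int = 1) -> float:
    """Bits of entropy when making `picks` uniform choices among `choices`."""
    return picks * math.log2(choices)


print(round(shannon_entropy_per_symbol("correct horse battery staple"), 5))  # 3.49468
print(round(keyspace_bits(7776, 4), 1))  # 51.7 bits for four diceware words
print(keyspace_bits(2**128))             # 128.0 bits for a random IPv6 address
```

The per-character figure says little about guessing resistance; the keyspace figure is the one that matters when the attacker knows the generation scheme.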

In the case of IPv6, you have a 'finite' number of combinations, albeit of fixed length, whereas the unfixedness of diceware means that, theoretically, the scheme scales upward into infinity. That's probably not practical, and one could simply add additional sets to an IPv6 address in order to remove that advantage. So, where does that leave us?

Well, I'm not sure that I'd like to say, but at least I can examine things this way: using an IPv6 address is probably secure, but is the added overhead of a translating agent between your memorization utility and password input worth it? At least, when compared against something like a four word diceware passphrase, it seems the entropic gains perhaps aren't worth the additional computational overhead.

[0] http://world.std.com/~reinhold/diceware.html

[1] http://www.shannonentropy.netmark.pl

[2] http://xkcd.com/936/

[3] http://en.wikipedia.org/wiki/IPv6

The length of IPv6 addresses is a minor usability problem from a devops perspective. I don't feel like this is a solution, but something to help ease the pain of this would be nice.

Another issue is that nearly all terminal emulators are (so far) too stupid to auto-highlight an IPv6 address on double-click. They all break on : -- highlighting only 16 bits of the address. Annoying.

That's often configurable. For example, in xterm:

> -cc characterclassrange:value[,...]

> This sets classes indicated by the given ranges for using in selecting by words. See the section specifying character classes. and discussion of the charClass resource.

(I haven't tried using this myself.)

The hungry white ape

aches in the ancient canyon.

Autumn colors crunch.

The agile green ape

jumps in the ancient mountains.

Autumn colors rest.

The weary white wolf

yawns in the wind-swept wetlands.

Autumn colors blow.

Hey now, who downvotes

haikus? Even if random?

Must be Hacker News.

(not my actual IP)

With IPv6, you can give each site a separate address and encode it as a haiku. Poetry to replace DNS.

And there is no central Internet Corporation for Assigned Haiku. This is what namecoin was meant to be.

Chilled apes and fat sprats

aid bleak brave prone ace ace ants.

Ace ants aid sharp beaks.

> And there is no central Internet Corporation for Assigned Haiku.

Um, yes there is. It's an address; you get your address space from your upstream AS (probably your ISP), they get it from their RIR, and they get their addresses from IANA.

I don't find this particular app very useful, but the very idea of encoding something hard to read into a redundant but easily comprehensible format is simply great.
