How I Stole a User's Siacoin (mtlynch.io)
292 points by mtlynch on June 16, 2017 | 71 comments

I often wish that password entry for things fully under your control (i.e. when there are no retry limits aside from brute computational power) would come with limited brute forcing support.

Such password dialogs could just let you type your best effort, and they could use the things you type to inform the guessing process; you could fat-finger a character or two, and it would just take a moment longer to log in as it uses the accurate parts of the data to make educated guesses of the password. For old encrypted files, for example, I often don't remember which password or which combination of passwords I might have used, but I can provide all the important bits and a smart program could easily guess the right combination.

Some people (e.g. Facebook) do this already. And it turns out that this doesn't really impact security all that much!

Here's a recent research paper on the topic: pASSWORD tYPOS and How to Correct them Securely - http://www.ieee-security.org/TC/SP2016/papers/0824a799.pdf

Does Facebook do anything beyond capitalization correction?

Think they remove ending space.

IIRC Facebook does something like this, where they actually hash multiple variants of your password when you set it and will accept any of them when you log in.
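
For illustration, here is a minimal sketch of the typo-tolerant check discussed in this subthread. It assumes the server stores a single salted hash and, at login, also tries a few cheap corrections of the submitted attempt (caps lock inverted, first character's case flipped, one extra trailing character dropped). The function names, the choice of corrections, and the PBKDF2 parameters are assumptions made for the example, not a description of Facebook's actual implementation.

```python
import hashlib
import hmac


def hash_password(password, salt, iterations=10_000):
    """Salted PBKDF2 hash. The iteration count is kept low so the sketch runs quickly."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def typo_variants(attempt):
    """Return the attempt plus a few likely corrections, without duplicates."""
    candidates = [attempt, attempt.swapcase()]  # as typed, then with caps lock undone
    if attempt:
        candidates.append(attempt[0].swapcase() + attempt[1:])  # first-letter case flipped
        candidates.append(attempt[:-1])  # one stray trailing character removed
    unique = []
    for candidate in candidates:
        if candidate not in unique:
            unique.append(candidate)
    return unique


def check_password(attempt, salt, stored_hash):
    """Accept the attempt if it, or one of its corrections, matches the stored hash."""
    return any(
        hmac.compare_digest(hash_password(variant, salt), stored_hash)
        for variant in typo_variants(attempt)
    )
```

Each accepted correction gives an attacker a few extra guesses per attempt, which is the trade-off the linked paper examines.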

It's a protocol, there's no way to rate-limit beyond the mathematics behind the protocol. The author of the blog post could have written their own software to try passphrases if existing software is rate limited.

(offtopic - misunderstood comment. Let's assume the parent comment said 'we should prevent bruteforcing')

As the other commenter said, there's nothing you can do to prevent brute forcing. What you can do, is have a very expensive KDF. So for every password you enter the wallet will take a very long time to 'unlock', which is basically the process of deriving the key from the input.

'Expensive KDF' sounds cryptic, but often just having some memory/CPU requirements for an instance of the KDF should suffice.
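
As a rough sketch of what an "expensive KDF" means in practice, the example below uses PBKDF2 from Python's standard library and times one derivation at different iteration counts. PBKDF2 only adds CPU cost; memory-hard functions such as scrypt or Argon2 add a memory requirement as well. The passphrase, salt size, and iteration counts here are arbitrary assumptions for the demonstration.

```python
import hashlib
import os
import time


def derive_key(passphrase, salt, iterations):
    """Derive a 32-byte key; more iterations means more work per guess."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


def seconds_per_guess(iterations):
    """Measure how long a single derivation (one password guess) takes."""
    salt = os.urandom(16)
    start = time.perf_counter()
    derive_key("correct horse battery staple", salt, iterations)
    return time.perf_counter() - start


if __name__ == "__main__":
    for iterations in (1_000, 100_000):
        print(f"{iterations:>7} iterations: {seconds_per_guess(iterations):.4f} s per guess")
```

The legitimate user pays this cost once per unlock, while someone brute forcing pays it for every candidate password.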

Fun fact: There is a reason why, when you enter a wrong password for `su` or `sudo`, it seems to take longer to reject the password than it takes to log you in with the right one. That's because PAM (Pluggable Authentication Modules) artificially delays the failure response to slow down brute forcing. You can change the delay if you want. (I would discourage it.)

Of course, this won't stop everyone. One can just put the hard drive in a different computer to get the hash of the password and crack it within a day with proper resources. This is the problem with trying to 'stop' brute forcing at the input level. You must already assume the attacker has the hash, and then the difficulty is determined by how expensive the hash is to compute. That's the point of people arguing about password hashes (for fun, of course)

The post you responded to was asking for an easier way to support multiple attempts without having to sit there typing all the nearby variants of a mostly-known passphrase. I'm not sure what part of this post addressed the parent post.

>I often wish that password entry for things fully under your control (i.e. when there are no retry limits aside from brute computational power) would come with limited brute forcing support.

This part. I misunderstood the comment. I thought by 'limited brute forcing support' they meant to limit the brute forcing process. The sibling comment also thought the same thing so I didn't doubt it.

What do you mean you can't prevent brute forcing? You can count login attempts server side and throw a recaptcha every 5th try say.

I could be thinking of the wrong space, though; I don't know what Siacoin or a KDF is.

edit: after reading...

>If you’re not familiar with Siacoin, it’s a cryptocurrency that allows you to rent out your spare hard disk space or buy space from others.

Interesting, not sure I'd do it. What scale do you need for this to be worth something as a lessee (lessor?)

That was pretty cool that distance between keys... ahh automation me like, parse out the function/tasks write it out, then let the computer do its magic. Batch processing yeahhhhhhh

This example with Siacoin is more similar to the situation that you have a file encrypted with a password on your computer than it is similar to a network service that users send their password to in order to log in.

For anyone interested in a little more detail: Bitcoin and every decentralized cryptocurrency operates on the concept of a blockchain, which is synchronized to every user (and regularly appended to). Most cryptocurrency blockchains effectively contain a list of (hashes of) public keys known as addresses and currency amounts. You own some currency if you know the private key corresponding to an amount listed in the blockchain, and using that private key you can sign a transaction to send the amount associated with it to another address. (That's the easy half of how Bitcoin and friends work. The other half is the innovative part about getting everyone to agree on the same blockchain even when people create conflicting transactions attempting to double-spend the same money. This involves continuous proof-of-work mining generally. It's not super relevant to anything in OP's post though.)

The user in the OP post who had their Siacoin stolen was using a system (often called a "brain wallet") where their private key was generated from a 29-word phrase. Anyone who knew the 29-word phrase could use that to generate the private key and then create a transaction to steal the currency associated with it. If you almost know the 29-word phrase, then you could brute-force it by repeatedly modifying the phrase, generating the private key, and then looking at your copy of the blockchain to see if that private key had any currency associated with it. (Well, actually the brain wallet system uses a checksum like credit card numbers do, so most invalid 29-word phrases just fail the checksum check and don't need to bother checking the blockchain itself, but that doesn't really impact anything about this process.)
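
The brute-force loop described above can be sketched as follows. Everything here is a toy stand-in: the eight-word dictionary, the `checksum_word` scheme (a word chosen from a SHA-256 hash of the other words), and all function names are hypothetical and are not Sia's actual seed format. The sketch only shows the shape of the search: swap one word at a time and keep the candidates whose checksum passes.

```python
import hashlib

# Toy dictionary; the real one discussed in the thread has 1626 entries.
WORDS = ["adapt", "adept", "adopt", "ionic", "sonic", "tonic", "topic", "toxic"]


def checksum_word(words):
    """Hypothetical checksum: pick a dictionary word from a hash of the phrase."""
    digest = hashlib.sha256(" ".join(words).encode("utf-8")).digest()
    return WORDS[digest[0] % len(WORDS)]


def make_phrase(words):
    """Append the checksum word to the payload words."""
    return list(words) + [checksum_word(words)]


def is_valid(phrase):
    """A phrase is valid when its last word matches the checksum of the rest."""
    return len(phrase) >= 2 and phrase[-1] == checksum_word(phrase[:-1])


def single_word_candidates(phrase):
    """Yield every phrase that differs from the input in exactly one position."""
    for position, original in enumerate(phrase):
        for word in WORDS:
            if word != original:
                yield phrase[:position] + [word] + phrase[position + 1:]


def recover(phrase):
    """Return all one-word corrections that pass the checksum."""
    return [candidate for candidate in single_word_candidates(phrase) if is_valid(candidate)]
```

With a checksum this small, several wrong candidates will usually pass as well, which is where the second step from the comment comes in: checking each surviving candidate against the blockchain.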

I see, thanks for the clarification.

Fun side effect of the su/sudo delay: typing in a wrong password feels punishingly interruptive while getting it correct feels satisfyingly brisk, triggering a reward that makes you learn to type it better and better. A bit like sl and gti commands.

For UX this would be nice. However, many applications that encrypt locally (like KeePass, etc.) perform key stretching, which would make this take quite a long time.

I used to mine Bitcoin back in 2011 and I lost my wallet.dat file (through several stupid moves on my part). It's got approx 103 BTC in it, anyone is welcome to it, I've given up trying.


Wow. That's like, "I know I bought a house a few years ago. It was a decent house, I think, but I lost the address and can't seem to remember where it was or how to find it again." (except that with Bitcoin, it's like, "the house is somewhere in this galaxy, but I can't remember which solar system")

It's more like: "I bought a painting for 20$ at flea market. 6 years later I learned it was from Van Gogh but sadly I don't know where it is now."

My dad actually bought a painting for $25 at a flea market in Amsterdam and it turned out to be a Paul Citroen, we ate pretty good for a couple of years because of that. Not quite a Van Gogh but still, a good catch.

What was the market value of the painting? I fail to find any result corresponding with the comment.

It fetched ~50K Guilders in an auction by Mak van Waay, which in the 1970's was a lot of money.

My brother accidentally deleted his wallet.dat from Dropbox a few years ago - he had given it a random filename and encrypted it with GPG so it was unrecognizable to hackers (and apparently him as well).

It had 1,000 BTC in it! He had received them from a generous Bitcoin contributor in the early days who said "here you go, hold on to it, it will be worth something someday."

I still give him a hard time about his $3 million USD mistake...

Did he try Dropbox support to see if they have a backup?

Oh yeah, completely exhausted that route - it was over a year before he realized it and way past their retention period...

"Hi Dropbox support! You get $1m USD if you figure this out."

The generous Bitcoin contributor may be thinking that he made the wrong choice in giving it away.

He probably has several times that amount...

As in trying to guess the private key given the hash of a public key?

If that's the game, there are higher-value addresses to try.

It's not possible with only the address. If you have the encrypted wallet I could give it a try though.

Dave the wallet recovery guy might be able to help if you have any sort of hints as to what your key might be. Obviously do some research to verify I'm not scamming you. Good luck. https://walletrecoveryservices.com

Any hints on how you generated your key/passphrase? Did you use Satoshi's original Bitcoin client? Will reward information which results in successful discovery :)

Wondering the same thing. Was a passphrase used to generate the private key?

Do you have any spinning disks around that may once have had the unencrypted keys on them? I could modify the commercial data recovery software I wrote (https://macosxfilerecovery.com/) to scan your drive for the private keys

Thanks for the offer, I kept the drive I stored the TrueCrypt container on. I think it was like a bad drive arm or PCB. I'd considered paying for a service to move the platters to a good drive, but at the time the service cost more than the Bitcoin was worth and there was no guarantee of recovery. I ended up throwing the drive away (another dumb move). Never in my wildest dreams did I think it would increase in value like it did. I think I would have sold a lot sooner if I had access to the wallet file for the whole time.

That's alright. I see a lot of these posts saying "the bitcoin I lost would be worth $X Million dollars now", but the thing that people overlook is that it would not be worth that now, because they would have sold it a lot earlier if they had access to it. Many people would sell the moment it's all worth about $20k depending on a few factors. In almost no cases is it truly a million or multi million dollar loss.

Do you have the hard drives wallet.dat was stored on at some point?

I was going to ask the same thing, you can search the bytes in hex for 01 03 6B 65 79 41 04

I have tried that, I've been scouring old hard drives for the last few weeks. I also had the file "encrypted" on a TrueCrypt container in Dropbox, but I deleted the container in 2012 before the file was lost. (One of the stupid moves I made.) Dropbox will not or cannot help me recover the file.
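
A minimal sketch of that search, assuming you have a raw disk image or a large file to scan: it reads the file in chunks and reports every offset where the byte sequence from the comment appears. The middle bytes 6B 65 79 spell "key" in ASCII; the rest of the pattern is taken from the comment as given. The function name and chunk size are choices made for this example.

```python
# Byte sequence quoted in the comment above.
PATTERN = bytes.fromhex("01 03 6B 65 79 41 04")


def find_pattern(path, pattern=PATTERN, chunk_size=1 << 20):
    """Return the file offsets of every occurrence of `pattern` in the file at `path`."""
    offsets = []
    overlap = len(pattern) - 1  # bytes carried over so matches spanning chunks are found
    tail = b""
    position = 0  # file offset of the first byte in the current buffer
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            buffer = tail + chunk
            start = 0
            while True:
                index = buffer.find(pattern, start)
                if index == -1:
                    break
                offsets.append(position + index)
                start = index + 1
            tail = buffer[-overlap:] if overlap else b""
            position += len(buffer) - len(tail)
    return offsets
```

A hit only marks a place worth inspecting by hand or with a recovery tool; it does not by itself prove that an intact key follows.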

So why are both "ionic" and "tonic" in the same dictionary for a human-readable entropy library?

EFF has a new wordlist which can be used for things like this. It focuses on phonetic and spelling differences across each word so that this doesn't happen, plus it prevents words from "duplicating" when you combine them (ie the two words `in put` and `input` being the same).


This is awesome. I had no idea my EFF money was doing great things like this in addition to fighting for net freedom :)

Nice! As I started reading about it, I idly wondered if they talked to Marc Brysbaert, who is very serious about researching which words people know. And the answer is yes, they did.

That's fantastic. My only complaint is that none of the lists' lengths are a power of two, but that's easily fixed by truncating one.

Good question. I'm the author of that library and I can say I just borrowed the word list from another project.

Luckily, that library only cares that you get the first three letters of each word correct, so we can update the word to 'tonsil' or 'tongue' without breaking compatibility.
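
To illustrate why such a swap keeps old seeds readable, here is a small sketch of prefix-based decoding. It assumes, as the comment says, that only the first three letters of each word matter; the function names and the tiny word lists are made up for the example and are not the library's real API.

```python
def build_prefix_index(words, prefix_len=3):
    """Map each word's prefix to its position, rejecting lists with clashing prefixes."""
    index = {}
    for position, word in enumerate(words):
        prefix = word[:prefix_len]
        if prefix in index:
            raise ValueError(f"prefix {prefix!r} is shared by more than one word")
        index[prefix] = position
    return index


def decode_word(word, index, prefix_len=3):
    """Look up a word by its prefix only, ignoring the remaining letters."""
    return index[word[:prefix_len]]
```

Under this scheme a seed written down with "tonic" still decodes against a list that now contains "tonsil", because both start with "ton".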

Using ngram or part-of-speech data could create better mnemonics: https://rmmh.github.io/abbrase/

Filtering the list to require a minimum edit distance of 2 or 3 would be quite easy, too!

Ya, it seems like trying to choose words that are hard to mistake for each other (high lev distance, and perhaps also considering letter similarity) would be a very good idea for selecting a dictionary like this.
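
A quick sketch of the edit-distance filter suggested in these comments: a standard dynamic-programming Levenshtein distance plus a greedy pass that keeps a word only if it is far enough from every word already kept. The greedy order-dependent strategy and the function names are assumptions for the example; a real word list would likely also weigh word familiarity and letter similarity.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,  # delete char_a
                    current[j - 1] + 1,  # insert char_b
                    previous[j - 1] + (char_a != char_b),  # substitute or match
                )
            )
        previous = current
    return previous[-1]


def filter_by_distance(words, min_distance=2):
    """Greedily keep words that are at least `min_distance` edits from all kept words."""
    kept = []
    for word in words:
        if all(levenshtein(word, other) >= min_distance for other in kept):
            kept.append(word)
    return kept
```

With `min_distance=2`, "ionic" and "tonic" cannot both survive, since they are one substitution apart.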

I wonder what the original process was?

I've used PGP Words - https://en.wikipedia.org/wiki/PGP_word_list - "which are carefully chosen for their phonetic distinctiveness". There is also a "parity" feature to catch simple errors.

My application was pass phrases to be read over the phone, but it seems generally useful whenever you want to avoid confusing two words in the set.

    A wise wife tagged and jagged and nagged
    Her aptitude had altitude to push the lush
    He bore the brunt with a grunt and tonic
    His music was ionic sonic
    Their topic too toxic to adapt
    And they, too adept to adopt

This was an amazing story, but there are a LOT more take-aways here!!!

First of all, let's look at something: the burden of memorizing 29 words was SO great, that despite carefully writing it down and double-checking it, the user failed to memorize it or even come close: after trying 500 times, they could not tell that ionic was a different word from tonic. No doubt they had looked at each handwritten word very carefully during the 500 attempts, but just could not do it. By the way, if you write the word ionic down in your own handwriting, you could easily see that it might look exactly like your own handwritten tonic.

There is something else about these 29 words. You can find the number of bits of entropy in a dictionary you'd pick one word from at random by taking the log2 of the number of entries. (In a pinch you do log 2 by taking the log and dividing by the log of 2). That shows that 1626 words (the number of entries in the dictionary) have a little over 10 bits of entropy each (about 10.67).[1]

So by making the user "remember" (write down) 29 such words, you are making them memorize (write down) at least 290 bits of entropy.

2^290 is 1.9892929e+87. There are about 10^80 atoms in the ENTIRE universe (a hundred billion galaxies with a hundred billion stars each). You'd have to get every atom in our entire universe -- every planet's every atom, every sun's, every black hole's, every one of the atoms anywhere in the world, to try 10,000,000 operations each, before you got an answer.

That is WAY too much.

But despite having such an incredible amount of extra information in there (base-64 encoding 290 bits would take about 49 characters at six bits per character), it does not contain enough of a checksum to correct against a single transcription error.

So this is a great example of a solution that is very user-hostile: so long that the user is forced to write it down, but despite its length so fragile that it does not contain any help against any amount of corruption. And very clearly, the longer it is, the greater the possibility of user error: could you hand-write an entire Dickens novel without a single error anywhere for example? What about a 12-character alphanumeric password? So the latter is stronger than the former! The latter is a better password.

I am not sure what kind of passwords would have redundancy built-in (so that a slightly wrong version would be corrected and accepted) but this would be a good time to find out.

One last thing. Does anyone know how long it takes to try a combination? I'm surprised that the blog poster went through the trouble of finding Levenshtein distance, since I would think from a coding standpoint it would be faster to code trying all 1625 other possibilities for the 1st word (leaving the rest unchanged), trying the other 1625 possibilities for the 2nd word, and so forth. Since there are 29 words this is just 47125 possibilities in total which doesn't seem like it's that many. (Then again, some 'treasure hunter' the blog poster was "competing with" might have had that script running already when the blog poster got there first!)

[1] https://www.google.com/search?q=(log+1626)+%2F+(log+2)
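
The arithmetic in this comment can be checked with a few lines of Python. The constants come from the thread (1626 dictionary entries, 29 words); the function names are just for this sketch. Note that the exact per-word figure is closer to 10.67 bits than 10, so 290 bits is a lower bound for the whole phrase if every word were independent.

```python
import math

DICTIONARY_SIZE = 1626  # entries in the word list, per the thread
WORDS_IN_SEED = 29


def bits_per_word(dictionary_size=DICTIONARY_SIZE):
    """Entropy of one word picked uniformly at random from the dictionary."""
    return math.log2(dictionary_size)


def phrase_bits(word_count=WORDS_IN_SEED, dictionary_size=DICTIONARY_SIZE):
    """Upper bound on the information a phrase of `word_count` words can carry."""
    return word_count * bits_per_word(dictionary_size)


def single_word_substitutions(word_count=WORDS_IN_SEED, dictionary_size=DICTIONARY_SIZE):
    """Number of phrases that differ from a given phrase in exactly one word."""
    return word_count * (dictionary_size - 1)
```

Here `single_word_substitutions()` returns 47125, matching the count given in the comment for trying every other word in each of the 29 positions.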

The 29 words come from a legacy mistake. It's 32 bytes of entropy plus 6 bytes of checksum. We had originally only wanted 16 bytes of entropy and 6 bytes of checksum, but the types we had were all 32 bytes. We implemented it initially with 32 bytes and shipped before rewriting it.

It's been like that for almost 2 years now, and while 29 is a lot, you aren't going to memorize 17 words either.

The checksum is 6 bytes, and a laptop can verify maybe 100,000 tries per second. So checking for a mistake of 1 word out of 29 will take you maybe 0.5 seconds. 2 words will take 6.5 hours.

If you find one that matches the checksum, it would take maybe 30 minutes on an SSD to scan the blockchain and realize that it's the wrong seed even though the checksum is correct.

But at 6 bytes of checksum, you'd be unlikely to bump into an incorrect but valid seed having just 2 words incorrect.

These seeds are used precisely because they are easier to distinguish than alphanumeric randomness. In this case, 'ionic' and 'tonic' ended up being an unlucky word pair, but we will swap out the word 'tonic' for 'tonsil' I think and that should fix the confusion. (The library only reads the first three characters, so there will be no compatibility issues with this change)
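
The timing estimates in this comment can be roughly reproduced as below. The inputs (29 words, 1626 dictionary entries, about 100,000 checksum verifications per second, a 6-byte checksum) are the figures quoted in the thread; the function names and the assumption that each set of wrong positions is searched exactly once are mine.

```python
import math

WORD_COUNT = 29
DICTIONARY_SIZE = 1626
TRIES_PER_SECOND = 100_000  # laptop estimate quoted in the thread
CHECKSUM_BITS = 6 * 8  # 6-byte checksum


def candidate_count(wrong_words):
    """Phrases that differ from the written phrase in exactly `wrong_words` positions."""
    positions = math.comb(WORD_COUNT, wrong_words)
    replacements = (DICTIONARY_SIZE - 1) ** wrong_words
    return positions * replacements


def seconds_to_search(wrong_words):
    """Time to verify the checksum of every candidate at the assumed rate."""
    return candidate_count(wrong_words) / TRIES_PER_SECOND


def expected_false_positives(wrong_words):
    """Expected number of wrong candidates that pass a random-looking checksum by chance."""
    return candidate_count(wrong_words) / 2 ** CHECKSUM_BITS
```

Under these assumptions one wrong word means 47,125 candidates and about half a second, while two wrong words means roughly 1.07 billion candidates and about three hours. A simpler nested loop that visits each pair of positions twice would take about twice as long, which is closer to the 6.5 hours quoted. The expected number of accidental checksum matches in the two-word search is a few in a million, consistent with the comment that a false match is unlikely.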

The need to embed some form of error-correcting code into readable keys like these is a really good point.

While not the same, I'm reminded of the issue with Ethereum addresses where they've (after initially having no extra checking) started using mixed case to provide a checksum to detect incorrect entries. Otherwise, it's really easy to send coins to a very slightly different address due to a typo.

With Siacoin, it seems like just adding 1 more word could allow correction of mistakes like these. (And you'd end up with a round 30 words :).

It does have a checksum. That means error-correcting works just like in the article, by picking the nearest valid code. This could be built-in to the software.

It would take about 0.5 seconds of brute forcing for the library to figure out if you had gotten a word wrong, so that's actually reasonable.

Rather than a checksum, I think we're talking about something like a recovery record[1] or some kind of error correcting code, which is slightly different.

[1] like these - https://en.wikipedia.org/wiki/Comparison_of_archive_formats#...

> (In a pinch you do log 2 by taking the log and dividing by the log of 2)

In a pinch it's easier to reason that since 2^10 = 1024, 2^11 = 2048, and 1626 lies between those two numbers, log_2 1626 is a bit more than 10.

Memorizing powers of two is useful for lots of quick mental estimation!

12 words from a 1626-word dictionary is almost exactly 128 bits, which is what is typically accepted as cryptographically secure.

We added a checksum though and then grabbed 256 bits instead of 128, so the numerical alignment no longer applies sadly.

I'd pay to discover the error I made :)

So I can overlook the misdemeanor pocketing of a few bucks with the intent of giving it back, but you basically admit and brag about breaking the Computer Fraud and Abuse Act as some kind of exercise of how clever you are for doing a dictionary attack against a weak and exposed key?

Good luck sir.

Would CFAA really apply here? He's not accessing any computer illegitimately, the blockchain is public record, the key was posted to a public website. He's accessing the public siacoin network, posting transactions that anyone has permission to.

I'm not convinced that "posting to a public network" gets you out of the sketch. Sections 5-7 seem at least partially relevant, but then I'm neither in the US nor am I a lawyer anywhere else.

Posting data to a remote system, with the clear intent of taking a thing of value from another person without permission. Perhaps it falls between the cracks, but I'd be reasonably surprised if it doesn't come under this or another similar act.


He didn't just post the passphrase, he also posted this:

"If someone figures it out, I will send you free sias"

I'd call that a clear invitation/authorization for anyone to try to crack his passphrase.

Reasonable, but not an invitation to transfer the entire amount and then set up an automated process to transfer any remaining amount to your own address.

I don't know, I think transferring it was the reasonable thing to do, rather than leaving it in the compromised wallet.

It's like if you find someone's (real) wallet, and you pick it up and contact the owner to ask where to drop it off. Rather than leaving it there and just telling the owner what street corner it's on.

I recently heard about a guy who was arrested and charged for exactly that. He took a phone home that he found in a carpark, the owner texted it saying "please return to Subway" He went there the next day to hand it in and the police were waiting! Apparently he was supposed to have turned it in to any nearby business at the time he found it. Not tried to find out who owned it the next day.

Any nearby business? I thought lost property was to be turned in to the police station. Otherwise the owner could call the cops on the business employees "I was never at Dunkin Donuts yesterday."

Makes sense! Like when your neighbor leaves their door unlocked and you steal all their stuff just in case.

Well, have a look at United States v. Kane, it basically indicates that if you don't exceed authorized access you're in the clear. It's hard to say that posting a cryptographic signature to a network designed to accept them from anyone exceeds authorized access if pushing buttons to trigger an exploit on a poker machine doesn't.

I'm sure you could have a pretty good argument in court if they went after you using CFAA. Other theft and fraud laws might cover it without issue though, just saying CFAA might not be the right choice here.

Perhaps, again I'm not a lawyer. However, one of the things brought up is that they didn't do something with a computer “which is used in or affecting interstate or foreign commerce or communication”.

I don't know about the additional "unauthorised access" but I'd be surprised if someone can't make a case from cracking a password to do something on a network you know shouldn't be possible unless you were the person who owned the address.

This comment is unhelpful and negative for no reason. Somebody helping somebody else should not be dismissed but instead rewarded.

Imagine you were the person who decides whether or not to bring charges against Mr. Lynch. Would you?

Alternatively: imagine you were selected to sit as a juror in such a case. Would you nullify?

The alternative is leaving their money sitting in a compromised address. The options are take it and give it back, take it and keep it, or let some random person take it.
