Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Anatomy of a hack: even your 'complicated' password is easy to crack (wired.co.uk)
44 points by dsr12 on Aug 26, 2013 | hide | past | favorite | 60 comments


First published in Ars Technica:

http://arstechnica.com/security/2013/05/how-crackers-make-mi...

and discussed here:

https://news.ycombinator.com/item?id=5777719

The main point to remember was that the story is to expose the tactics and thinking process of attackers, not to be a perfect representation of true life. They deliberately relaxed some constraints (giving attackers unsalted MD5s, for example) but gave a fixed deadline to impose pressure.


They used GPU crackers, not rainbow tables. Salting would not have made a difference.


Actually, salting makes a HUGE difference, even without rainbow tables.

Salting does nothing to the difficulty of cracking any individual password, but it makes you crack each person's password INDIVIDUALLY. So you have to crack 123456 + %randomString1%, 123456 + %randomString2%, etc instead of cracking 123456 once for all of the entries on the list. So if it's a list of 1,000,000 password hashes + salt, that will be ~1,000,000 times harder to crack all of them than a simple unsalted list.


You don't understand how GPU crackers work, or, for that matter, how Alec Muffet's "Crack" works, from the early 1990s, the original password cracker that everyone used to use to work on Unix password files. Suffice it to say: no, salting does not make a huge difference.


Ok, I may be confused. I thought that with an unsalted list, you could try a password (generated by your algorithm of choice, whether that be dictionary based or brute force) and compare it to the entire list of hashes, because Alice's "123456" would hash to the same thing as Bob's "123456". But if the list was salted, Alice and Bob would have different stored hashes - and so you would not be able to parallelise cracking across users.

Yes, if you only want to crack Alice's password, the salt will not slow you down. But if your goal is to crack a million, wouldn't the salt slow you down?

If I'm missing something, I would really like to know. Always good to learn something new!


Read the Ars Technica article.

The hole in your logic is: there is no point in guarding against the case where, if two users share the same password, cracking one will crack the other, because the only time that two users share a password is when that password is really weak.

What are the odds that two users share the same password? If passwords were totally random and 24 characters long the answer would be "pretty low". And the odds of cracking one of these passwords by brute force would also be pretty low. If only we lived in that world.

In the real world the odds of two users sharing a password are frighteningly high, but that's because most passwords are awful: they are things like "pizzapizza" or even "password". The cracking programs can crack one of these in a millisecond, because they are equipped with lists of thousands of real-world weak passwords harvested from real websites. And then, if another user shares this awful weak password, that user will also get cracked, a few milliseconds later. There is no point in "parallelising cracking across users" because it's so easy to just guess all the weak passwords for every user.

Another way to think about it: A salt could only help in cases where (a) two users share the same password, but (b) that password doesn't appear in the list of several million other real-world passwords that is built into the password cracker. And that just isn't worth worrying about.


There actually is a point to parallelizing cracking across users. Consider: You're testing a specific password, say, "2398fje#f", and want to see if anyone is using that password. If there are no salts, you can hash this, then look it up in a hash table of password hash to user (or, to get more fancy, you can do things like performing a bloom filter lookup on the GPU, etc). If there are salts, however, you must hash it multiple times - potentially once for each user. This applies even if there is only one user using the password in question.

So the difficulty of the search is no longer proportional to the number of passwords tried, but rather proportional to the product of the number of users and number of passwords - a rather large increase!


But if there is a database of 1,000 unsalted password hashes with no people having the same password. Your password cracking regime does 1 trillion guesses. So you hash each guess then try it against each of the 1,000 password hashes. That is a total of 1 trillion hash operations.

But if the database has individually salted hashes then you need to do each hash operation individually for each user. That means you have to do 1 quadrillion hashes. That will take 1,000 times longer.


A factor of 1000 isn't that important.

Really. First, stare at this chart from the Ars Technica article:

http://cdn.arstechnica.net/wp-content/uploads/2012/08/expone...

This graph shows, roughly, that a short password can be brute forced in seconds using modern hardware. If it is just one character longer, it suddenly takes a week to brute-force. This is the miracle of exponential functions.

Now multiply the y axis by 1000. The long password now takes 1000 weeks. But the shorter one still only takes a few thousand seconds. If you are running your cracking program for a day, you'll get the same results you got before. The weak passwords are still weak enough to crack. The strong ones are still strong.

(Note, by the way, that it doesn't take 1000 times longer to guess 1000 salted passwords, because some of those passwords are super weak and will fall in seconds, after which one no longer needs to guess them. So, for example, once half the passwords have been broken, the new multiple is only 500. The Ars article explains this, but I didn't understand its wording at first.)

It is true that salt makes some difference. But it is not a meaningful difference. The difference between giving up 60% of your passwords and giving up 73% of your passwords is probably moot.

EDIT: And my example number of 73% is too low! That is bad news for MD5! Again, the article: they cracked 82% of a 16k-password file in one hour using only a single commodity GPU.


> lists of thousands of real-world weak passwords

They are compiled in a different way but serve the very same purpose as a rainbow table. A list of precompiled hashes.

Thus, they are just a kind of rainbow table.

Therefore, rainbow tables do matter.


Your theory of salting-per-user is correct.

Rainbow tables are useful if the time it takes to compute is big enough that storing a large list is useful.

The GPU crackers are so fast nowadays that that isn't the case. If you can try 500M per second, that's about 10 GB of SHA1 generated! Imagine storing a day's worth of work in a rainbow table :)

It's better to use a proper password hash, which will be slow, and will likely have per-user salting built in, anyway.


If you have a database of 100 unsalted hashed passwords and a dictionary of 10,000 words you could hash the 10,000 words and compare each resulting hash to each hash in the database. So that would take 10,000 hash operations.

But if the passwords were individually salted you would have to individually hash each dictionary item for each hash in the database. So that would be 1,000,000 hash operations.


Could you explain why? The post below explains about how two users really shouldn't share the same password, and if they do, it's easily crackable - that makes sense. However, if you had a list of 100,000 passwords and did, say, a brute force attack of all passwords with letters, numbers, and special characters to a length of 8, without salted hashes you would be able to run that brute force once and grab every password matching the criteria. With salts, you would have to run it once per salt. Am I off on that?


You're not wrong, but:

1. If you're using salt, it implies you chose to roll your own key function using hashes. That's a bad idea, because:

2. GPUs can produce so many combinations per second that the difference between salted and unsalted is basically indistinguishable for smart attackers.


Moral of the story? Just use a password manager for your accounts if your biggest threat vector is someone getting a hold of hashes on a compromised server. You can make them as long and random as you like. And unless a major breakthrough in crypto occurs the attacker can waste eons trying to break it. Of course, securing the key to the password manager becomes that much more important. But if compromised hashes are your biggest concern, Ars rightly notes that password managers are by and large the best solution.


Yes, I agree that using a password manager that auto-generates random password is the way to go for protecting against the attack of a compromised hashed password database table.

Here is the thing I'm wondering though: in the case when brute force attacks is not practical, is using human generated password/passphrases really that bad when compared against randomly generated password? Most decent and important sites would throttle the number login attempts one can try before at least throwing up a captcha or outright blocking.

This implies, to me at least, that in such a scenario using something like 12 character human generated password would give about the same effective security as a longer randomly generated one (e.g. 24 character random password). Yes, the randomly generated password is objectively better from a theoretical standpoint but I am thinking the effective difference in reality is negligible in the case when a brute force attack is infeasible (due to throttling of attempts). Thoughts?


> Most decent and important sites would throttle the number login attempts one can try before at least throwing up a captcha or outright blocking.

The types of attacks discussed in the article are not feasible when the only way to try a guess is to send an HTTP request. Look at how many guesses per second those GPUs are doing. You won't get anywhere near that sending HTTP requests.

Passwords are far, far more likely to be cracked if the database of hashed passwords is compromised. That's what you should really be worried about, and it's the main reason to use strong passwords.

Also be sure not to reuse passwords, even strong ones. A strong password can still be compromised, because there are many types of attacks that have nothing to do with cracking hashes.


I use a password manager but so many important sites still require me to use <10 characters in my password and then restrict the character space. It's crazy really. My two bank online accounts force me to use weaker passwords than any blogging service or email account I have.


In case anyone hasn't already seen it: http://codahale.com/how-to-safely-store-a-password/

tl;dr

Use bcrypt.


In case anyone hasn't already seen it: http://www.tarsnap.com/scrypt/scrypt.pdf

tl;dr

Use scrypt.


Use scrypt if you can. But don't get too hung up on this. All the modern KDFs are fine; use PBKDF2, bcrypt, scrypt, whatever you have available. Just stop "salting hashes". Attackers laugh at salted hashes.


I'm sorry if I'm misunderstanding, but are you saying that application developers should not use salts at all?


I'd interpret the GP as "Application developers should not create Key Derivation Functions (KDFs)." They should choose an existing, well reviewed KDF and read up to understand its relevant best practices.

For example, PBKDF2 does require a salt (as does scrypt, which relies on PBKDF2 for its implementation). It also comes with specific recommendations on the salt's minimum length. Salting an MD5 hash is pointless in the face of modern attack methods -- rice paper against a tiger.


Wouldn't individually salting MD5 still help with large databases? With a database of 1,000 users the individual salt should slow down attackers by a factor of around 1,000.

If they are targeting a single user it doesn't help though.


It falls in line with running you ssh on an obscure port or putting your password database in .hidden/. Most likely it is just a false sense of security and security though obscurity. You are doing X,Y,Z and W and in the end you could have just used a KDF.

If anything the false sense of security plays tricks on you psychologically "oh look we have put our database in a .hidden directory. Nobody we'll find it here" and that makes you not pay attention at the weakest vulnerability -- a weak algorithm or parameters of the encryption.


Yes, salting has a role to play. That is not the point. The point is that use of a weak underlying Key Derivation Function makes the benefits of salting nearly moot.

To fully spell it out: MD5 is a very weak KDF.

I would recommend looking into the KDFs mentioned in the comments here as alternatives: PBKDF2, bcrypt, scrypt.


Not enough to matter. Attackers haven't been dependent on rainbow tables for a while now. As discussed in the article, they're using GPUs to hash guesses individually for each account.

More to the point, using MD5 for password hashes isn't acceptable, at all. Not even with any extra layers of security. Not with salts, not with extra rounds of MD5, not when combined with SHA1, etc.. With reasonable options (like bcrypt) available in every major programming language, there's no reason to use something provably ineffective like MD5.


I understand MD5 should never be used. But I'm not talking about rainbow tables. The 1000x benefit comes even when crackers are using GPUs.


Sure, salting definitely helps. A little.

It's like telling someone being shot at to stand sideways, because their profile is smaller that way. The right thing to tell them is to get the hell off the firing range.

The problem with salting is that people feel they're safe, and stop thinking about security there.


No they should stand head on so the bullet will take a shorter path through their body should they be hit.


This table of hash functions should show you why MD5 is never to be used when privacy/security is a concern...

http://valerieaurora.org/hash.html

MD5 first started coming under pressure in 1994 and was cracked in 2004.


He's saying that rolling your own with functions designed to run as fast as possible, with or without salt, is not going to give you much security.

What you want is functions that run slowly, thus increasing attack cost.


Moreover, unless you are Schneier or tptacek don't create your own obscure slow function, use a well known one like scrypt.


unless you are Schneier or tptacek

ahem...


/cough cough ... or cperciva


He's saying that if you can't use bcrypt or something like it, you should store passwords in plain text.

Like a broken record people like him/her chant "no security at all is better than security by obscurity".


I do not believe that this is what he is trying to say. He does not mean "use plaintext" when he says "don't salt hashes." He means "use a key derivation function."


> if you can't use bcrypt or something like it, you should store passwords in plain text

No, he's saying that if you can't use an acceptable hashing function, you shouldn't store passwords at all.

But, why would you be unable to use at least one of the suggested hashing functions, anyway? It's hard for me to imagine a language or platform where none of those functions is available, excluding very simple, special-purpose systems like PLCs.


Have you heard of Google App Engine?

You can't use any python module that runs C, which rules out bcrypt and its ilk.


Or even better dont't even store them at all.

If its an internal app use LDAP, Active Directory, or whatever other centralized ID system your company has. If it's a public app then consider using a federated approach like OpenID. It doesn't make sense for everything but if you can avoid storing passwords entirely then it's one less thing that can go wrong[1].

Course if you do store them then yes:

scrypt > bcrypt > PBKDF2 > ... If you get to here then you've got a problem!

Just make sure you choose a same number of rounds. PBDKF2 is fine for most folks if you have a large enough number of rounds. The old recommended default of 1K is not large enough. Ditto for bcrypt with a work factor less than 10 (or better yet 12). Your bet bet is to either use scrypt (who's defaults are paranoid enough) or choose a work factor for bcrypt/PBKDF2 that's has a decent CPU time (say .5 to 1s).

[1]: Though you do have to worry about the risk of compromise of the party your delegating too. For apps where this makes sense (say Google+ login to a Grouppn knockoff) that's a fair enough trade off for you an your users.


Not sure why you're downvoted, but the idea of actually not storing passwords seemed intriguing to me, if it was actually possible.

I did a little bit of research and I found the Secure Remote Password protocol [1]. Interestingly, this protocol does appear to protect against the case of a stolen password database. If true, that would mean that when site X loses control of the password database, that would be OK as this is designed to be secure against that attack.

I wonder why it's not been implemented anywhere widely. Anyone more knowledgeable about the security field care to comment.

[1] - http://en.wikipedia.org/wiki/Secure_Remote_Password_protocol


Yeah I'm not sure either about the down vote. By not storing passwords I meant delegating to an external authority for authentication services. Whether that's OpenID, Persona, Facebook login, direct OAuth integration with a limited number of parties (ex: GitHub and Google Plus) can be decided on a per app basis. The important thing is the if your app use case allows you to delegate out authentication to an external party (again it's a non-trivial "if" to decide this) then you don't have to store or deal with passwords at all (and by extension don't need to worry about handling password DB leaks or hashing algos).


OpenID is a horrible idea. I have like a bunch of them, and I never remember if I signed up with Yahoo for this site or Google or whatever.

With Persona at least I know which email I use to sign up for random websites. I hope it succeeds.


Is there any reason why you wouldn't use bcrypt + random, individual salt? Am I right in assuming that bcrypt would protect against brute-force attacks and salt protects against pre-computed bcrypt rainbow tables? Or is the salt basically useless?


bcrypt is already randomized, as is every other modern KDF. There is no such thing as a bcrypt rainbow table. Rainbow tables have never really mattered. Stop thinking about rainbow tables.

You need to be using real KDFs to store passwords. Salted hashes are not real KDFs.


Thanks very much for that explanation. I'm not a computer expert, but I'm endlessly fascinated by passwords and cracking.

When it comes to picking passwords that humans can remember, what's your opinion on Diceware? Do five or six word passwords still stand up with the increases in computational power? http://world.std.com/~reinhold/diceware.html


I think it's important to pick a password that isn't in a list, or likely to be 1-2 transformations away from being in a list, and it's important to use a longer password, but apart from that it shouldn't matter as long as you use a different password for each service, and as long as the apps you use use bcrypt or some other real KDF.


Would it be useful to check password hashes against well-known lists of passwords? If so, it sounds like a service would be doing pretty good if they:

1. Required >7 character passwords

2. That don't appear on (constantly updating) lists

3. Using a reasonable KDF (b/scrypt)

Sound right?


That sounds fine to me.


$300/hr also buys a lot of $5 wrenches.


60 $5 wrenches doesn't help the attacker if I'm not physically on the same continent. Attacker needs to know where you are and have physical access... and as we all know, once you have physical access, it's game over :)


Remember: you'll be hard pressed to buy a quality wrench for 5 dollars, and - perhaps more importantly - why are you breaking them in the process?!

I suspect your process should use the wrench on lower wear items than computers, for example the desk (if plastic or wood) and things sitting around it.


LastPass has a secure password generator that lets you specify character set, length, etc. It has extensions for all major browsers, support for two factor auth, form fillers, one time passwords, great iOS and Android apps.

http://www.lastpass.com

Big fan here. Just thought I'd throw in a little discussion from the user side of things, as I see some great points about being a responsible maker.


The threats to passwords are generally two-fold: 1) SQLi leaking the contents of your database to the public. 2) Malicious insiders with access.

Blocking SQLi should be every web developer's first priority, not debating the efficacy of password hash algorithms, salting, and so forth.

Preventing malicious insiders is more complicated and requires other defenses besides the underlying choice of password algorithm.

scrypt, bcrypt, pbdkf are all fine in preventing the ill effects of #1 as it pertains to passwords, but prevent #1 at all costs nonetheless. Not only are your password hashes at risk, but your entire serving infrastructure and everything you consider sacred behind your firewall. Game over.


> "On the corporate side, its so different," radix said. "When I'm doing a password audit for a firm to make sure password policies are properly enforced, it's madness. You could go three days finding absolutely nothing."

I don't see how this could be true. Most of the harder ones that were exposed would pass corporate requirements - typical corp thing is like 12 digits and all 4 character classes, and the article included a few of those.


+1: I thought the same thing... Maybe the problem would be finding the entry to a "rabbit hole" (as he puts it). If there are no weak passwords it might be more difficult to find the patterns in use... but then again, the patterns from other corporations would still help.


Maybe the real takeaway from this article is that password strength meters should do a better job, not only assessing raw password complexity, but also comparing the proposed password to lists of common passwords, etc.


Or that password strength meters can't do a good job.


I wouldn't assume most readers of HN would be using these kinds of passwords for anything important, no?

I've been using SuperGenPass with one of the XKCD 4 random word style passwords for years and it works great. I don't store it anywhere, just type it in to the bookmarklet.


[deleted]


If you use the printable representation of the hash, the dictionary of characters that the hash uses is only digits 0-9 and a-f. While they likely still won't guess this, it's better to use 40 characters of the printable output of /dev/urandom directly, e.g.:

  dd if=/dev/urandom bs=512 count=1 | strings -n1 | tr -d '\n'
Edit: the original question, before it was deleted, was whether using a hash of data from /dev/urandom would result in a good password.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: