
How crackers ransack passwords like “qeadzcwrsfxv1331” - co_pl_te
http://arstechnica.com/security/2013/05/how-crackers-make-minced-meat-out-of-your-passwords/
======
tomku
Enjoyable read, but I question the bit near the end claiming that salts
wouldn't help much against this kind of attack.

From my understanding, per-user salting does substantially slow down this kind
of attack because it forces you to calculate a different hash for each
user/plaintext combination rather than hashing a suspected plaintext once and
comparing the hash against the whole list. What it doesn't slow down is the
brute-force cracking of a single targeted hash.

Am I missing something there, or is the article wrong?

~~~
chm
Where/how do you store the salts?

~~~
tomku
Right next to the hashed passwords. The point of salting isn't to add an
additional level of secrecy, it's just to prevent the reuse of hashing work
for attacking other users.

~~~
chm
But doesn't this render the process useless? If an attacker gets access to the
hashes, he also gets access to the salts.

If both hashes and salts were isolated, I suppose it would be much more
secure, although maybe too slow.

~~~
tomku
Absolutely not. Like I said, the point of the salt is not to provide a "second
password" that needs to be independently stolen. It's to make the results from
a cracking attempt on one account useless on the password of another.

If you've read the article, you have a pretty good idea of how it works
without salting. You have a list of N password hashes, and you come up with
candidate passwords. You run each candidate password through the hash
function, and compare the resulting hash to your list. If it matches any of
them, you've got the plaintext for those user accounts. The key thing here is
that you only have to hash each candidate password once, and you can compare
that hash to as many accounts as you'd like.

Now, let's add in a salt. It's stored next to the password, so after stealing
the password db you've got a list of password hashes, each with its own salt.
In order to check a candidate password against a particular user account, you
have to append that user account's salt to the candidate, then hash it and
compare against the stored hash. That part isn't any slower. The problem is
when you try to take that hash and compare it to other accounts. Even if two
users have the same plaintext password, their password hashes will be
different because they have different salts. The end result is that you have
to hash each candidate password fresh for every user.
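
A rough sketch of the difference, assuming MD5 and a salt appended to the candidate as described above. The account names, passwords, and helper functions are made up for illustration; the only thing it tries to show is how many hash computations each approach needs.

```python
import hashlib
import os

def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

# Hypothetical leaked accounts; alice and bob share a password.
PLAINTEXTS = {"alice": b"letmein", "bob": b"letmein", "carol": b"qwerty"}

# Unsalted store: user -> md5(password)
unsalted = {user: md5_hex(pw) for user, pw in PLAINTEXTS.items()}

# Salted store: user -> (salt, md5(password + salt)), salt kept next to the hash
salted = {}
for user, pw in PLAINTEXTS.items():
    salt = os.urandom(16)
    salted[user] = (salt, md5_hex(pw + salt))

def crack_unsalted(store, candidates):
    """Hash each candidate once, then compare it against every account."""
    hashes_computed, found = 0, {}
    for cand in candidates:
        digest = md5_hex(cand)
        hashes_computed += 1
        for user, stored in store.items():
            if stored == digest:
                found[user] = cand
    return found, hashes_computed

def crack_salted(store, candidates):
    """Every (candidate, account) pair needs its own hash computation."""
    hashes_computed, found = 0, {}
    for cand in candidates:
        for user, (salt, stored) in store.items():
            hashes_computed += 1
            if md5_hex(cand + salt) == stored:
                found[user] = cand
    return found, hashes_computed

CANDIDATES = [b"123456", b"letmein", b"qwerty"]
found_u, work_u = crack_unsalted(unsalted, CANDIDATES)
found_s, work_s = crack_salted(salted, CANDIDATES)
print("unsalted hashes computed:", work_u)
print("salted hashes computed:  ", work_s)
```

Both runs recover the same passwords; the salted run just does one hash per candidate per account instead of one hash per candidate.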

~~~
chm
That's what I understood from the article, and why I suggested the salts and
hashes should be handled separately.

If you give the attacker the salts, difficulty scales linearly. For two
identical plaintexts, all that differs is the salt, but it is given to him.

If you store the hashes and salts separately (the technical details of which I
know nothing about), then you augment the keyspace exponentially. For N-byte
salts, using P characters, you augment it by P^N. Furthermore, who forces you
to append a salt? You could prepend. Or n-pend.

I'm just throwing ideas around.

~~~
nknighthb
The server must have access to both the salt and the hash to verify a
password. Therefore, upon compromise of the server, the attacker automatically
has access to both the salt and the hash. There is no way around this problem
that isn't simply obfuscation.

~~~
chongli
Not necessarily. With something like a smartcard or TPM chip, one could move
the hashes and salts off the server. Both would still be stored together (on
the device) but they'd be separate from the server!

Edit: Or one could move just the salts into the device and store an index into
them in the password file. Hashing would be carried out by the device but an
attacker would only gain access to the indices. Without stealing the device
itself it'd be impossible to properly salt the passwords for hashing.

~~~
nknighthb
If the contents of the device can be dumped, the problem remains. If not, you
have no copies of your database, so when (not if) the device or the server
it's connected to fails, your site is down, and once you've brought it back
up, all of your users must go through a password reset process.

That one server/device is now a single point of failure and a bottleneck in
processing logins.

You're also still relying on the security of a computer, just a special-
purpose one.

~~~
chongli
I didn't say anything about reliability or single points of failure. I merely
pointed out that it was possible to separate the salt from the hashes and gain
security that way. Whether this is practical or not depends on how important
security is to you.

And yes, it would not be possible to dump the contents of the devices.

~~~
nknighthb
The proposal is fundamentally impractical, and thus not a "security gain" in
any meaningful sense. It's the equivalent of preventing cipher algorithm
breaks by using nothing but one-time pads.

It's also theoretically impure in any case, as you've done nothing but add an
additional peripheral to the computer. You're seeking obfuscation, not real
cryptographic integrity.

~~~
chongli
>It's also theoretically impure in any case, as you've done nothing but add an
additional peripheral to the computer. You're seeking obfuscation, not real
cryptographic integrity.

It's not obfuscation. The peripheral has a far smaller attack surface than a
server. This is real security, even if it comes at a cost of reliability
(though one can envision ways of fixing this, too).

~~~
nknighthb
It's operational/system security. That's not the same thing as cryptographic
security.

Stop trying to patch a hole that isn't there. Salt is not secret data. If you
want to protect the hash with secret data, take A1kmm's advice and use the
smart card to encrypt it. But don't call that a salt, because it fundamentally
is not one.

------
jd007
The article explains how all the passwords were revealed except for
"qeadzcwrsfxv1331". Maybe I missed it but could anybody point out to me which
method revealed this password? The letters seem to not form any words that I
know, and although it is appended with a number and consists of only lowercase
letters, it's still 12 letters long, longer than any brute-forced in the
article. Edit: They tried all the keyboard typing patterns I suppose, I just
realized the pattern of the string when typed on the keyboard.

~~~
kenbot
Try looking at the pattern the letters form on the keyboard...

~~~
simias
Good catch, but it could also simply be that the password had been collected
during a previous hack.

The article mentions it at the end but I think they should have insisted more
on this point: if you have a very strong password that you reuse everywhere
and it gets leaked at some point it has a high probability to end up in
rainbow tables everywhere and might not be more secure than "h4x0r1234".

So using hard to guess passwords is the easy part, the hard part is using
different hard to guess passwords everywhere.

~~~
qu4z-2
Could you explain the rainbow tables comment?

I would've thought it'd end up in a dictionary, not a rainbow table. (although
I have to admit I've forgotten the details of how rainbow tables work)

------
jacques_chester
The punchline:

> _The list contained 16,449 passwords converted into hashes using the MD5
> cryptographic hash function._

Edit: I'm removing all my snarky nitpicking. This is a good article. Yes, the
MD5 case they present is a poor case, but it's really about demonstrating the
tactics of attack selection, rather than teaching someone how to make crack-
resistant password schemes.

~~~
grecy
> _the MD5 case they present is a poor case_

If guys using vanilla hardware get that kind of success in _1 hour_ with MD5,
you only need to increase hardware and the time required to see it's still
completely doable for other hash functions.

~~~
eksith
The issue is practicality. Which is why pbkdf2 (and increasing the rounds each
year) + bcrypt or scrypt is still a better option.

~~~
MichaelAza
Noob question, but why would you use both pbkdf2 and bcrypt/scrypt? Aren't
they all basically doing the same thing?

~~~
JonnieCache
Bcrypt is an algorithm which bundles up something which pbkdf2 achieves by
iterating other algorithms. They're basically the same, but you should use
bcrypt. You shouldn't use both of them, I'm guessing the poster above meant /
rather than +. If you want more security, increase your (b|s)crypt work
factor.

If for some reason you can't bring the bcrypt code into your project, you can
implement pbkdf2 using basically a loop and your stdlib's hash functions.
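
For what it's worth, the "basically a loop" version might look like the sketch below. It assumes HMAC-SHA256 as the underlying function and only produces a single 32-byte output block; the function name, salt, and iteration count are illustrative. Python already ships `hashlib.pbkdf2_hmac`, which is used here only to check the hand-rolled loop.

```python
import hashlib
import hmac

def pbkdf2_sha256_one_block(password: bytes, salt: bytes, iterations: int) -> bytes:
    """PBKDF2 with HMAC-SHA256, limited to one 32-byte output block."""
    # U_1 = HMAC(password, salt || 32-bit big-endian block index 1)
    u = hmac.new(password, salt + (1).to_bytes(4, "big"), hashlib.sha256).digest()
    result = int.from_bytes(u, "big")
    # U_i = HMAC(password, U_{i-1}); the derived key is the XOR of all U_i
    for _ in range(iterations - 1):
        u = hmac.new(password, u, hashlib.sha256).digest()
        result ^= int.from_bytes(u, "big")
    return result.to_bytes(32, "big")

password = b"correct horse battery staple"
salt = b"example-salt"      # illustrative; use a random per-user salt in practice
iterations = 10_000         # arbitrary value for the example

derived = pbkdf2_sha256_one_block(password, salt, iterations)
reference = hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
print(derived == reference)
```

In real code you'd call the library function directly; the loop is only there to show there's no magic inside.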

Scrypt takes the whole concept of placing extra demands on the computer and
applies it to the RAM rather than the CPU (perhaps as well as the CPU?), the idea
being that RAM is harder to accumulate in obscene quantities than CPU power.

------
Corrado
My big takeaway from this article is that passwords, in almost any form, are a
bad way to secure your information. The only acceptable way to use a password
nowadays is to use a password manager to build huge passwords that a human
could never remember or type in reliably. Even then, as machines get faster
and crackers get smarter, these behemoth passwords will fall.

I've been using 2-factor authentication (Google Auth) lately and I'm fairly
impressed with it. I only have to whip out my cell phone a couple of times a
day/month, so it's fairly convenient, and it seems very secure. Then again, I
thought my password was fairly secure, but after reading this story and the HN
comments, I can say that I'm just like everyone else when it comes to
passwords. Ignorant. :/

~~~
gradstudent
The "exponential wall" means the longer your password is the less likely it is
to fall. A 10 letter password is in a 26^10 keyspace. Add one more letter and
it takes _10 times_ longer to crack -- assuming of course your password is not
part of some combination of short, common dictionary words.

What I find really interesting is that the same kind of attack vector
(combinations of common words) is being used as the basis for some really
sophisticated search techniques in Artificial Intelligence. I remember reading
an abstract a few years ago from a student of Rich Korf @ UCLA. In it the
authors use this kind of approach to attack the 24-Sliding-Tile Puzzle.

Here's a link; it's pretty cool!
[http://www.aaai.org/ocs/index.php/AAAI/AAAI11/paper/view/349...](http://www.aaai.org/ocs/index.php/AAAI/AAAI11/paper/view/3496/4146)

~~~
rdl
not 10x longer, 26x longer (and that's assuming lowercase a-z only)
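
A quick check of the arithmetic, assuming a pure brute-force search where every character is drawn independently from the alphabet. The `keyspace` helper and the alphabet labels are just for illustration.

```python
def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of the given length over the alphabet."""
    return alphabet_size ** length

ALPHABETS = [("a-z", 26), ("a-zA-Z", 52), ("a-zA-Z0-9", 62), ("printable ASCII", 95)]

for name, size in ALPHABETS:
    # Going from 10 to 11 characters multiplies the search space by the alphabet size.
    growth = keyspace(size, 11) // keyspace(size, 10)
    print(f"{name}: one extra character multiplies the keyspace by {growth}")
```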

------
tibbon
That's pretty interesting. It makes me even happier that I've switched over to
using passwords like .!a7&Xn2\\#ZRS]!:`\\.H2j;{7 for everything (unique per-
site, password manager). I'm guessing that such a password, when properly
generated, isn't going to be cracked without a quantum computer...

But hey, maybe someone can prove me wrong. Anyone care to take a crack (pun
intended) at 1a323a3185b1dbee3a0ba1f4c3c9f674

I'll send an entire shiny bitcoin to anyone who can get it right.

------
mrgoldenbrown
I don't understand the statement that salts get less effective after you've
broken other passwords:

"But the thing about salting is this: it slows down cracking only by a
multiple of the number of unique salts in a given list. That means the benefit
of salting diminishes with each cracked hash."

A proper salt for user Joe's password does not have any relation to any other
user's salt. Cracking Bob's password should not help you crack Joe's. Am I
missing a technique that exploits one salt to attack another? Or are they
assuming crappy salting methods? As in, if you have 2 bits of salt, then after
the attacker has hashed your entire passwd file with those 4 salts, you might
as well not have salted anything.

~~~
jameshart
I think they misphrased that a little, unless as you say there really are
people commonly using non-unique salts. The crux of the argument is that each
cracked hash removes one hash from the list of hashes you need to check. So,
say you've got 16000 uniquely salted MD5 passwords. 8000 of them are, though,
salted MD5s of 'password'. If you start off by checking weak passwords, your
first pass requires you to calculate 16000 hashes, but if your first candidate
password is 'password', then it yields 8000 passwords. Your next pass only has
to calculate 8000 hashes for each candidate password. So in that case,
cracking Bob's weak password does help you crack Joe's strong one.
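
Scaled down to 16 accounts, the shrinking workload could be sketched like this. It assumes uniquely salted MD5 as in the example above; the passwords, candidate list, and helper name are invented for illustration.

```python
import hashlib
import os

def salted_md5(password: bytes, salt: bytes) -> str:
    return hashlib.md5(salt + password).hexdigest()

# Hypothetical leak: 16 uniquely salted accounts, half of them using 'password'.
plaintexts = [b"password"] * 8 + [b"strong-%d" % i for i in range(8)]
remaining = []
for pw in plaintexts:
    salt = os.urandom(8)
    remaining.append((salt, salted_md5(pw, salt)))

work_per_pass = []
for candidate in [b"password", b"123456", b"letmein"]:
    # One hash per account that is still uncracked.
    work_per_pass.append(len(remaining))
    # Drop every account this candidate cracks; later passes skip them.
    remaining = [
        (salt, digest)
        for salt, digest in remaining
        if salted_md5(candidate, salt) != digest
    ]

print(work_per_pass)
```

The first pass costs one hash per account; once the weak half is gone, every later candidate costs half as much.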

------
knightsamar
How about passwords that are not English words but are written using the Latin
alphabet? Like mer@s@nket!k$habd (Hindi for "my password")?

Bet that would be harder to crack and still easier to remember for a multi-
lingual person.

In fact, if non-Latin alphabets were widely supported for entering passwords,
it would make them a bit more secure, I feel.

~~~
xioxox
It's dangerous to choose anything someone else might choose. My favourite
technique for good passwords is to think of a memorable phrase (preferably
with numbers) and take the initial letter of each word.

"I can sleep at Nandos, providing I pay $100 for that"

Ics@N,pIp$100ft

------
deathanatos
Anyone know where the article's author got the "MD5.txt" file (the list of
hashes he ended up cracking)? I'd like to take a crack at this (pun intended)
myself.

~~~
PwdRsch
Here's what it says in the "How I became a password cracker" article
([http://arstechnica.com/security/2013/03/how-i-became-a-passw...](http://arstechnica.com/security/2013/03/how-i-became-a-password-cracker/))
on Ars from March:

"Dan suggested that, in the interest of helping me get up to speed with
password cracking, I start with one particular easy-to-use forum and that I
begin with "unsalted" MD5-hashed passwords, which are straightforward to
crack. And then he left me to my own devices. I picked a 15,000-password file
called MD5.txt, downloaded it, and moved on to picking a password cracker."

My guess is that the forum he mentions was the insidepro.com web site, since
it is a popular password hash sharing site. A quick search there found several
attached files named MD5.txt, but none seemed to be the right size or have the
right hashes. However, a search for hashes mentioned in the article found this
file
([http://forum.insidepro.com/download.php?id=12783&sid=160...](http://forum.insidepro.com/download.php?id=12783&sid=1607495891f54e9022aff385d920d07c)),
which contains 16,880 hashes (instead of the 16,449 listed in the Ars article)
and at least 3 of the MD5 hashes specifically mentioned in the article.

There's a good chance this is the same file or close to it, but you'd have to
try matching it with more of the hashes from the article to know for sure.

------
oskarth
Cue a test-how-strong-your-password-is service where security conscious
individuals can test how their particular password stands up against these new
attacks.

~~~
anonymouz
And have their password added to a word-list.

~~~
roryokane
There is a site that pretends to do just that, though I can’t find its URL. It
asks you to enter your password for strength checking, then takes you to a
page saying “estimated strength: 0. Because you have typed the password into
an untrusted web page, you must assume it is compromised.”

~~~
JDGM
Has anyone found this? It sounds great, but I couldn't get a URL from googling
the obvious stuff. Please post if you know it!

~~~
croikle
<http://www.inutile.ens.fr/estatis/password-security-checker>

The Terms and Conditions are worthwhile, too.

EDIT: Also, see <http://www.ismytwitterpasswordsecure.com/> (needs Javascript)

~~~
JDGM
Superb. Thanks!

------
vinhboy
I think my favorite part of this is learning that after I finish with my
bitcoin mining rig, I can use it to crack passwords. Awesome...

~~~
davedx
If it's a specialised rig then it'll only be good for MD5 I think. Most
sensible websites don't use MD5 anymore.

~~~
gizzlon
_Most sensible websites don't use MD5 anymore_

What makes you say that? Do you mean new sites or all sites? Got anything to
back it up?

I would _guess_ most sites in existence use MD5, SHA1 or similar with 1 round,
because it was (is?) very popular.

~~~
smartwater
Bitcoin rigs are only good for certain mathematical functions. They are not
general computing devices.

------
yashg
How about this - User's password is a single letter "a". I MD5 it and get a 32
char string, now I append a GUID as salt - another 32 chars - and I MD5 that
64 char string again and store it. How difficult would it be to crack it? If
the cracker knew my process he might crack it, but what if the process is not
known to a cracker? Also I can store the hashed string and the salt in a way
that a cracker, merely by looking at the string, can't figure out it's actually
salted.

Say my final salted hash is 1234 and the salt I used was ABCD. I can splice
the two and store it as 1A2B3C4D. That would throw off a cracker who does not
know how the value was generated, wouldn't it?
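
To make the scheme concrete, here is one way it could be written down. This is only my reading of the description: MD5 of the password, a 32-character GUID appended as salt, MD5 again, then hash and salt interleaved character by character. All function names are made up. Note that splitting the stored value back apart is a single slice once the layout is known.

```python
import hashlib
import uuid

def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def store(password: str, salt_hex: str) -> str:
    inner = md5_hex(password)           # 32 hex characters
    final = md5_hex(inner + salt_hex)   # MD5 of the 64-character string
    # Interleave hash and salt: "1234" + "ABCD" -> "1A2B3C4D"
    return "".join(h + s for h, s in zip(final, salt_hex))

def split(stored: str):
    """Undo the interleaving: even positions are the hash, odd ones the salt."""
    return stored[0::2], stored[1::2]

def verify(password: str, stored: str) -> bool:
    final, salt_hex = split(stored)
    return md5_hex(md5_hex(password) + salt_hex) == final

salt = uuid.uuid4().hex                 # 32 hex characters, one per user
record = store("a", salt)
print(len(record), verify("a", record))
```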

Unless the cracker also gets access to my code, it will be impossible for him
to make head or tail of such hashes.

~~~
spiffytech
Well, you're hashing and salting, which is at least a good start. But you
shouldn't use MD5: if the cracker _does_ find your secret formula, you're
hosed compared to pbkdf2/bcrypt/scrypt. Not only for the speed, but because
MD5 has exploits that make it easier to produce a specific desired hash than
the other options, so your attacker doesn't have to make as many guesses.

You should use a per-user salt, though, instead of a static salt (the 32-char
nonce). The static nonce means the attacker can generate a rainbow table
during their attack of all possible passwords. The hash of the password does
not count as a suitable salt, because if User A uses the password 'a', and so
does User B, both users' hashes come out the same. Finding User A's password
means you can compare its hash against the rest of the compromised hashes and
note that you grabbed User B at the same time. With a per-user salt, the users
will have different hashes despite using the same password, so the attacker
has to attack each hash separately, with no "quick wins". A randomly-generated
salt will do, and I expect simply prepending the username or something to the
password before hashing would do as well, since it's still unique per user.

~~~
yashg
Yes of course I meant per user salt. And if I store a 64 char string it will
be very difficult for an attacker to figure out MD5 was used.

And since the string that I MD5ed is 64 chars in length (32 of original MD5 +
32 of salt) it should make it near impossible to crack.

~~~
spiffytech
Relying on a terrible hashing mechanism just because you hope nobody will
realize you're using it is a very bad idea. There's no technical reason to use
MD5 over at least looping over a good SHA variant, and if your code has access
to bcrypt/scrypt, it's hard to come up with a reason to choose anything
(except maybe pbkdf2) over them, either.

Further, a string that long might be effectively impossible to crack /by brute
force/, but MD5 has been found to have flaws that allow the attacker to find
alternate inputs that will produce the same hash in a very short amount of
time (minutes to hours, according to Wikipedia[0]). That means if the attacker
has your hash, they don't need to figure out your nonce or salt to come up
with something they can use to trick your system into authenticating them as
an arbitrary user. These attacks don't reveal the user's original password,
but they do still allow someone to impersonate an arbitrary user on your site,
for e.g., purchasing services, accessing financial details from your billing
system, or performing social engineering attacks.

[0] <http://en.wikipedia.org/wiki/Md5#Security>

------
TwoBit
Well the bonehead hashcat program is limited to 15 character passwords, so it
can't possibly crack mine.

The hashcat software in general is powerful, though pretty crappy from
a design standpoint. The current version crashes if any input is not what it
expects, and it has just about zero help for what it expects. Actually I can't
even get it to work, as executing hashcat-lite.exe with arguments causes it to
just immediately exit without doing anything nor printing any output.

~~~
reeses

> _Well the bonehead hashcat program is limited to 15 character passwords, so it
> can't possibly crack mine._

One of the main themes of the article is that the sites are the weak point,
not necessarily the users. The first boo-boo is storing passwords in an easily
discoverable format (plaintext, md5, sha1, etc.).

Another massive annoyance is when a site has an insipid constraint on
passwords. No longer than 8 chars because that's what they put in their db
last century or based it on passwd, minimal symbols, and basically just alpha-
numerics.

That's why your password manager has the controls on length, number of
symbols, digits, etc. It's not to "smarten up" your passwords but to dumb them
down for sites that give no priority to security.

------
wyck
The best part of the article is the last part about Intel password strength
"guessing".

The other day some devs on reddit were using Google to find password strength
checkers and posted stuff like "my password would take 1 billion years to
crack", without realizing these sites have garbage algorithms for measuring
strength. The Microsoft one is a joke.

If developers are falling for these strength indicators, I'm certain average
users are getting a very false sense of security.

------
jpalomaki
In addition to per user salt, it would be good to also add "per application"
salt. A random, long string, that is stored somewhere in the application code
and not in the database.

Obviously this does not help if the whole site is compromised, but it may help
against SQL injection attacks where the attacker gains access to just data
from database.
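
A minimal sketch of that idea, assuming the application-wide secret is mixed in with HMAC before a per-user salted PBKDF2. The secret value, function names, and iteration count are all placeholders, not a recommendation.

```python
import hashlib
import hmac
import os

# Hypothetical application-wide secret: lives in code or config, never in the DB.
APP_SECRET = b"hypothetical-application-secret"

def hash_password(password: str, salt: bytes, app_secret: bytes = APP_SECRET) -> bytes:
    # Key the password with the application secret first...
    keyed = hmac.new(app_secret, password.encode("utf-8"), hashlib.sha256).digest()
    # ...then stretch it with the per-user salt stored in the database.
    return hashlib.pbkdf2_hmac("sha256", keyed, salt, 10_000)

def verify(password: str, salt: bytes, stored: bytes, app_secret: bytes = APP_SECRET) -> bool:
    return hmac.compare_digest(hash_password(password, salt, app_secret), stored)

user_salt = os.urandom(16)
db_row = (user_salt, hash_password("hunter2", user_salt))   # what the database holds

print(verify("hunter2", *db_row))                            # app has the secret
print(verify("hunter2", *db_row, app_secret=b"wrong guess")) # DB-only attacker does not
```

Someone who dumps only the database can't even test the correct password without also knowing the application secret.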

------
jokoon
Maybe we should tell users to use passphrases instead.

The 8 char limit is very low now. The limit should be raised to 30
characters.

~~~
PwdRsch
Telling people to use passphrases is a great recommendation, but you still
have to spend some time teaching them how to use passphrases effectively. In
the article they list several passphrases that were cracked, such as
"sleepingwithsirens", "gonewiththewind1", and "momof3g8kids". So if a user
chooses "ijustbluemyself" thinking that it's a great choice they are likely to
be disappointed if a skilled cracker gets access to their password hash.

My basic suggestions are to choose 4+ words that are not a common phrase, song
name, quote, etc. And make sure you still use both lowercase and uppercase
letters, plus throw in some symbols in non obvious places (e.g., don't convert
your a's into @ signs). It doesn't need to be as random as a shorter password,
but it still shouldn't look like a normal sentence.

~~~
jokoon
well it's either that or security cannot really be guaranteed.

Maybe one day passwords will just be 1MB files kept on people's computers
instead of short strings to remember.

------
eksith
'Taekwondo1983' was the password of a fellow admin of an old forum I used to
run. Wonder if it was still him.

~~~
yk
Probably not, 'Taekwondo1983' sounds like a password some percentage of people
who practice a certain sport and are born in a certain year would choose.

------
chasb
Combinator attacks guess "the square of the number of words in the dict" when
the password is just the combination of two words, correct? So a 3-word pass
would require dict^3? This all seems to come back to length.
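
A toy version of the counting, assuming the attack simply concatenates every ordered choice of k dictionary words. The four-word dictionary and the `combinator` helper are hypothetical; real tools work from far larger lists.

```python
from itertools import product

WORDS = ["correct", "horse", "battery", "staple"]  # hypothetical tiny dictionary

def combinator(words, k):
    """Yield every concatenation of k words, repeats allowed, order significant."""
    for combo in product(words, repeat=k):
        yield "".join(combo)

two_word = list(combinator(WORDS, 2))
three_word = list(combinator(WORDS, 3))

print(len(two_word), "two-word candidates")
print(len(three_word), "three-word candidates")
```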

~~~
Millennium
Pretty much, yeah. This is really just a generalization of the principle we
see in the length of a binary password. The complexity is s^l, where s is the
number of symbols you can possibly draw from, and l is the number of symbols
in the password.

Consider a binary string. There are two symbols in binary (1 and 0) and the
number of symbols you use is the number of bits in the string. This gives us
the familiar password length calculation: the complexity is 2^l.

When you start using words as the basic unit, you gain many more symbols: if
you use the full Oxford English Dictionary, you have about 600,000 of them.
But when you do this, your symbol is no longer a single byte: it's a word.
That word can be represented as a string of bytes, yes, but it's guaranteed
that this string will be one of the 600,000 byte strings that represent
English words, and you can eliminate all of the others. So you can't measure
your password in bytes anymore; you have to measure it in words instead. A
one-word password is, therefore, of complexity (600,000^1).

Six hundred thousand possibilities sounds like a lot, but computers can crack
that in seconds. Even if you restrict yourself to just printable ASCII, you
can do better than that in only three characters. I don't recommend actually
doing a three-character printable-ASCII password, because you'd have just
under 900,000 possible passwords: still well within crackable range.

But because the complexity increases so fast as you increase the length of the
password, that's how you can gain complexity back. If you use four words
instead of one, you have (600,000^4) possibilities: some 130 sextillion. You'd
need 77 random bits, or 12 printable ASCII characters, to do better.
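
The numbers can be checked with a few lines, taking the 600,000-word dictionary figure from above and 95 printable ASCII characters (0x20 through 0x7E) as assumptions. Variable names are just for illustration.

```python
PRINTABLE_ASCII = 95          # characters 0x20..0x7E
DICTIONARY_WORDS = 600_000    # dictionary size quoted in the comment above

three_chars = PRINTABLE_ASCII ** 3
one_word = DICTIONARY_WORDS ** 1
four_words = DICTIONARY_WORDS ** 4

# Smallest number of random bits whose keyspace exceeds four dictionary words.
bits_needed = four_words.bit_length()

# Smallest number of printable-ASCII characters whose keyspace exceeds it.
chars_needed = 1
while PRINTABLE_ASCII ** chars_needed <= four_words:
    chars_needed += 1

print("3 printable chars:", three_chars)
print("1 word:           ", one_word)
print("4 words:          ", four_words)
print("bits to beat 4 words: ", bits_needed)
print("chars to beat 4 words:", chars_needed)
```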

And that, my friends, is why "k[0" is a better password than
"antidisestablishmentarianism". But it's also why "correct horse battery
staple" is better than either one (or would be, if xkcd hadn't made it so
famous).

------
stitchintime
Related: Stitcher (still) stores passwords in plain text
<https://news.ycombinator.com/item?id=5778367>

------
talloaktrees
I don't know too much about crypto. Would using something like salsa20 stop
this?

~~~
jacques_chester
A password-derivation function has two requirements:

1. Obscure the password in a one-way fashion.

2. Be fast enough for humans, but very slow in computer time.

Hash functions can be used to produce an obscured copy of data, but they are
also by design very fast.

If you wish to protect passwords, don't use a naked hash algorithm. Choosing a
_different_ hash algorithm doesn't fix the problem.

Instead, use a password-derivation function. These are designed both to
obscure _and_ to take significant time. Bcrypt, Scrypt and PBKDF2 are the
standards.

~~~
lisper
And of these three, scrypt is by far the best choice because it's the hardest
one to speed up with specialized hardware.

~~~
jacques_chester
Right. Colin Percival did a clever thing and made it _memory_-hard, not just
computationally hard.
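
For a feel of what memory-hard means in practice, here is a small sketch using `hashlib.scrypt` from the Python standard library (it needs a Python build linked against OpenSSL 1.1 or later). The parameters are deliberately small illustrative values, not a recommendation; the working set grows roughly as 128 * n * r bytes.

```python
import hashlib
import os

# Illustrative cost parameters; real deployments tune these much higher.
N, R, P = 2**12, 8, 1

# Rough size of the large internal buffer scrypt has to keep in memory.
approx_memory_bytes = 128 * N * R

salt = os.urandom(16)
key = hashlib.scrypt(b"hunter2", salt=salt, n=N, r=R, p=P, dklen=32)

print(f"~{approx_memory_bytes // (1024 * 1024)} MiB per guess, key = {key.hex()}")
```

Doubling `n` doubles both the time and the memory each guess needs, which is the part that is awkward to replicate cheaply in custom hardware.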

