

Is my developer's home-brew password security right or wrong, and why? - eranation
http://security.stackexchange.com/questions/25585/is-my-developers-home-brew-password-security-right-or-wrong-and-why

======
gizmo
I expected daily-WTF like idiocy, but Dave's code isn't insane. It's a pretty
standard combination of a (sufficiently random) salt and a SHA1 based hash.
SHA1 was until recently regarded "strong enough". Dave's code is unnecessarily
obfuscated, but whatever. I don't see how this can be considered "homebrew
password security" given that all the heavy lifting is done by SHA1. Anyway,
Dave's solution is as good as a SHA1 based solution gets. That's already
better than 95% of the websites out there (citation needed), but of course
worse than the bcrypt approach suggested by the OP.

The OP pretends to innocently ask "Is my developer's home-brew password
security right or wrong, and why?". I'm going to call BS on this. To me it's
pretty clear the OP is trying to passive-aggressively bully/ridicule/shame
Dave in order to get his way. And sure enough, the top-ranked StackOverflow
answer has somebody going through the code line-by-line going WTF at every
opportunity. The OP isn't trying to figure out whether he or Dave is right,
he's clearly already convinced he's right and wants the StackOverflow
community to collectively pat him on the back. Kind of childish.

~~~
Lagged2Death
_To me it's pretty clear the OP is trying to passive-aggressively
bully/ridicule/shame Dave..._

I had a different interpretation; OP _is_ "Dave." It's sort of a time-honored
tradition, on-line and off, to seek advice "for my friend."

~~~
gizmo
Nearly impossible. Every single sentence in the post provides evidence against
that theory. (Dave " _Insists_ on using a home-brew script" / use of
"Homebrew" vs "Standard solution" / Dave's "primary hang-up" became clear
after he "immediately balked" / Ridicule of Dave's code is "sincerely
appreciated", etc). It's so over the top that the entire post may just be an
elaborate troll.

~~~
Lagged2Death
_Nearly impossible. Every single sentence in the post provides evidence
against that theory._

On the internet, nobody knows you're a Dave.

------
marshray

         $time = date('mdYHis'); // Probably in the database somewhere too
         $rand = mt_rand().'\n'; // Said to be a 31 bit number
         $crypt = crypt($user.$time.$rand); // DES crypt is 64-bits output
    

Since "$crypt" is time-varying, he's got to be storing $crypt in the database
as the seed along with the password hash. This system is no better than a
weakly-generated random 64-bit seed.

It's reasonable to assume the attacker knows the username. If the attacker
knows the time the user was created, he can easily brute force the 31 bits of
$rand. PHP mt_rand is not seeded with more than 32 bits of entropy, so with
knowledge of just two (username, time, seed) entries the attacker learns the
values of all other mt_rand numbers (past and future) generated by this PHP
process. This may have significant implications for the security of other
parts of the system.

    
    
        function hash_it($string1, $string2) {
            // Equivalent complexity to:
            return sha1($string2 . md5($string1));
        }
    
        $hash = hash_it($password, $crypt);
    

So the resulting hash is 160 bits long, by interposing MD5 it is guaranteed to
not contain more than 128 bits of entropy. If nothing else, this is a waste of
space.
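
A rough way to see the size mismatch, as a sketch only: the function below is a simplified Python stand-in for the scheme (SHA-1 over a salt plus the MD5 of the password), not Dave's actual code, and the salt value is made up for illustration.

```python
import hashlib

def hash_it(password: str, salt: str) -> str:
    """Simplified stand-in: SHA-1 over salt + MD5(password)."""
    # The password only ever reaches SHA-1 through this 128-bit digest.
    inner = hashlib.md5(password.encode("utf-8")).hexdigest()
    return hashlib.sha1((salt + inner).encode("utf-8")).hexdigest()

salt = "example-salt"  # hypothetical value, for illustration only
digest = hash_it("correct horse battery staple", salt)

inner_bits = hashlib.md5().digest_size * 8    # width of the inner digest
outer_bits = hashlib.sha1().digest_size * 8   # width of the stored digest
print(inner_bits, outer_bits, len(digest))
```

Whatever the password, the outer hash only ever sees one of at most 2^128 inner values, so the extra 32 bits of output carry nothing new about the password.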

In my estimation, this scheme is about 0.5 bits stronger than plain SHA-1 like
Linkedin was using. 90% of the Linkedin passwords have been cracked.
[http://arstechnica.com/security/2012/12/25-gpu-cluster-crack...](http://arstechnica.com/security/2012/12/25-gpu-cluster-cracks-every-standard-windows-password-in-6-hours/)

~~~
jgeralnik
This is as strong as SHA-1 plus a salt, which is much stronger than SHA-1
without a salt(but still not a good way to store passwords). The salt does not
need to be random, merely unique. Since salts are stored in plaintext in the
database next to the hashed password, the assumption is that an attacker has
those as well.

~~~
marshray
> This is as strong as SHA-1 plus a salt, which is much stronger than SHA-1
> without a salt

No, not really. Salts are used for two main reasons:

1\. So if two users have the same password, it's not obvious in the password
file.

2\. To thwart precomputation attacks, e.g., "rainbow tables".

Modern password cracking has all-but obsoleted both of these reasons. For (1),
if two users can "choose" the same password, then it's almost certainly a weak
password that the attacker can guess as well. For (2), GPUs have gotten so
fast now that password crackers don't even bother much with precomputed
rainbow tables.

I'm not arguing against salting, it's still good practice. I'm saying it
doesn't, in reality, add much time at all to the time needed to crack a
database of passwords. It's just that at the end of the day, it _all_ comes
down to the entropy content of the user's password as experienced by the
attacker.
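
To make the two reasons concrete, here is a minimal sketch using plain SHA-1 from the Python standard library. The user names, the password, and the three-word list are invented for the example; it illustrates the argument and is not a recommendation to store passwords this way.

```python
import hashlib
import secrets

def unsalted(password: str) -> str:
    return hashlib.sha1(password.encode("utf-8")).hexdigest()

def salted(password: str, salt: bytes) -> str:
    return hashlib.sha1(salt + password.encode("utf-8")).hexdigest()

# Two hypothetical users who picked the same password.
alice_salt = secrets.token_bytes(16)
bob_salt = secrets.token_bytes(16)

# Reason 1: without a salt the two stored hashes are identical.
same_unsalted = unsalted("hunter2") == unsalted("hunter2")
# With per-user salts the stored hashes differ.
same_salted = salted("hunter2", alice_salt) == salted("hunter2", bob_salt)

# The salt sits next to the hash, so guessing a weak password still works.
wordlist = ["123456", "password", "hunter2"]
target = salted("hunter2", alice_salt)
cracked = [w for w in wordlist if salted(w, alice_salt) == target]
print(same_unsalted, same_salted, cracked)
```

The salt hides the duplicate and defeats a precomputed table, but a guess made with the stolen salt succeeds just as quickly as before.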

~~~
jgeralnik
I have a database of 1 million users. Regardless of what hashing algorithm I
am using, if there is no salt I just need to hash the entire password space
once, where the password space could be a dictionary attack or a brute force
of passwords up to length 10 of letters, digits, and symbols. If there is a
salt, it means that after hashing the entire key space I only have the
password for 1 user rather than all of them. It means attackers must use
targeted attacks and slows them down by a factor of # of users. Rather than
having all my passwords in a day, an attacker gets access to one user's
password per day (over the course of a million days). (Still not a good thing
to happen)

~~~
marshray
> I have a database of 1 million users.

Similar to Linkedin with 6.5m, we have a real-world case study.

> Regardless of what hashing algorithm I am using, if there is no salt I just
> need to hash the entire password space once, where the password space could
> be a dictionary attack or a brute force of passwords up to length 10 of
> letters, digits, and symbols.

When you say "entire password space" you're implying that there's a hard upper
limit on the complexity of a user's password. Many sites do indeed restrict
passwords to 8 or 10 characters, but that's horrifically broken.

Any not-completely-broken system has a password space so much larger than an
attacker could search that it's effectively infinite. Note that even the silly
system proposed here uses MD5, which accepts passwords longer than would even
fit into available memory.

> If there is a salt, it means that after hashing the entire key space I only
> have the password for 1 user rather than all of them.

As described, the "key space" is effectively infinite, but luckily for the
attacker some parts of it are far richer to mine than others. This is
where the easy-to-guess passwords lie.

So the attacker won't do anything as dumb as "hashing the entire key space".
The attacker will search the most rewarding parts of the keyspace first.

> It means attackers must use targeted attacks and slows them down by a factor
> of # of users. Rather than having all my passwords in a day, an attacker
> gets access to one user's password per day (over the course of a million
> days). (Still not a good thing to happen)

For example, recent data breaches suggest that 50% of users will choose one of
the 10,000 most-common passwords. So the attacker tries this set on all 1M of
your users. That's 10,000,000,000 hash operations. A typical rate for SHA-1 is
3.8 _billion_ per second, per GPU. <http://hashcat.net/oclhashcat-lite/>

So our expectation is that the easiest 500,000 of your 1M accounts will be
cracked in the first 5.2 seconds by a single GPU attacker. And this is
approximately what happened with Linkedin, ISTR 60% of the passwords were
reported cracked that first day.
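
The arithmetic can be checked in a few lines. The user count, dictionary size, and SHA-1 rate are the figures quoted above; the variable names are just for the sketch.

```python
users = 1_000_000          # accounts in the database
candidates = 10_000        # most-common passwords to try
sha1_per_second = 3.8e9    # single-GPU SHA-1 rate quoted above

# With per-user salts every candidate must be hashed once per user.
salted_hashes = users * candidates
salted_seconds = salted_hashes / sha1_per_second

# Without salts each candidate is hashed once and compared to every user.
unsalted_hashes = candidates
unsalted_seconds = unsalted_hashes / sha1_per_second

print(salted_hashes, round(salted_seconds, 1), unsalted_seconds)
```

At the quoted rate the salted case comes out near 2.6 seconds on one GPU. The exact figure matters less than the scale: seconds, not days.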

But the rate of success slows down for the attacker and the remaining
passwords begin to take exponentially longer. As of today, 90% of the Linkedin
database is reported to have been cracked:
[http://arstechnica.com/security/2012/12/25-gpu-cluster-crack...](http://arstechnica.com/security/2012/12/25-gpu-cluster-cracks-every-standard-windows-password-in-6-hours/)
So there are really only 650,000 hashes posing any difficulty at this point.

When a fast hash function is mis-used for password hashing, salt or no salt,
there's not a case where "an attacker gets access to one user's password per
day". The attacker cracks the majority of passwords in the first few seconds
yet a small minority will never be cracked.

Salting doesn't save the database as a whole. Salting doesn't help the users
with weak passwords. Salting doesn't even help the few users with very strong
passwords, they wouldn't need salt in the first place. But for the users in
the 60 - 90 percentile, it might buy them a few hours or days.

~~~
jgeralnik
> Similar to Linkedin with 6.5m, we have a real-world case study.

Except that Linkedin's passwords weren't salted.

> When you say "entire password space" you're implying that there's a hard
> upper limit on the complexity of a user's password

That's why I said "where the password space could be a dictionary attack or a
brute force of passwords up to length 10 of letters, digits, and symbols". I
don't actually mean the whole key space, just whatever the attacker chooses to
target.

> So our expectation is that the easiest 500,000 accounts will be cracked in
> the first 5.2 seconds by an attacker with a single GPU

OK, I concede the point that in this case the salt is not helping so much, at
least for users using the 10000 most common passwords. For the few hundred
thousand users whose password lies in a larger key space (but still in
the range of bruteforce), the salt still protects them.

Getting back to your original point

> Modern password cracking has all-but obsoleted both of these reasons [for
> having a salt]

Let's look at a better hash function - bcrypt. In this case hashing passwords
is still slow enough (by design - you can slow it down to whatever you want)
that rainbow tables would again be useful - to compute the hash of 10000
passwords with a difficulty such that each hash takes 1 cpu-second to compute
will take almost 3 hours. Adding salts makes this a per user operation instead
of a global one.
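
As a back-of-the-envelope check, assuming the 1 cpu-second cost and 10000-word dictionary from this comment and the 1 million users from earlier in the thread (single core, no parallelism):

```python
seconds_per_hash = 1.0      # cost of one slow hash, as assumed above
dictionary_size = 10_000    # candidate passwords
users = 1_000_000           # database size used earlier in the thread

# Unsalted: one pass over the dictionary covers every user at once.
one_pass_hours = dictionary_size * seconds_per_hash / 3600

# Salted: the same pass has to be repeated for each user.
all_users_years = one_pass_hours * users / 24 / 365

print(round(one_pass_hours, 2), round(all_users_years))
```

That is roughly 2.8 hours for one pass against about 317 CPU-years for every user, which is the per-user versus global difference in numbers. An attacker with many cores divides both figures by the same factor.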

~~~
marshray
> Except that Linkedin's passwords weren't salted.

My point is that salting a fast hash function is like issuing lifevests to the
passengers of the Titanic. It sounds better than nothing, but in practice it
only just keeps a handful of them from drowning so they can die of hypothermia
a few minutes later.

> That's why I said "where the password space could be a dictionary attack or
> a brute force of passwords up to length 10 of letters, digits, and symbols".
> I don't actually mean the whole key space, just whatever the attacker
> chooses to target.

Sounds like you think you get to decide what the attacker will target first
and when he will give up.

> Let's look at a better hash function - bcrypt.

Agreed, bcrypt is categorically better. But note that bcrypt includes salting
as an intrinsic part of its implementation. So it's not something any web
programmer could say "I will/will not salt my Bcrypt thusly".

As tptacek says, "if you're typing the letters A E S or 'salting your hashes',
you're doing it wrong."

------
DanBC
A meme post, by a StackExchange mod, got upvotes?

(<http://security.stackexchange.com/a/25637/3773>)

~~~
lost-theory
Yes, and by the same mod: "Deleted long discussion on preimage attacks,
collisions, hashing strength etc – Rory Alsop ♦ 8 hours ago"

~~~
roryalsop
Aye - we delete ALL long winded discussion in comments, as Stack Exchange
isn't a discussion forum.

~~~
lost-theory
That may be the rule, but it seems perverse to delete useful comments, which
add to the conversation, while adding meme pics, which add absolutely nothing
useful to the conversation.

P.S.: Meme pictures originated on discussion forums.

------
noonespecial
Bcrypt _is_ published. That's its strength, not its weakness. That's how
cryptography is. The more people try to break it the better it gets.

Dave's primary problem is that his scheme is based on MD5 which is still well
known, but can be brute forced quickly. His additional lightweight obfuscation
will provide nearly zero extra protection.

Use Bcrypt with salt. End of story.

~~~
Nursie
'xactly.

Published crypto is stronger because it's had many eyes on it, not weaker.
This is pre-101 level crypto knowledge, it's like the first line in the
idiot's guide. Or it should be!

------
mrb
I write GPU-based password bruteforcers. The main flaw of this algorithm is
that it is not iterated: it makes a single call to the MD5 then SHA1
compression function. As a result, I estimate it can be bruteforced way too
fast, around 1 to 1.5 billion passwords/second on an AMD Radeon HD 7970 ($400).

(On this GPU, MD5 alone can be bruteforced at 8 billion pw/sec, and SHA1 alone
at 2 billion pw/sec.)

This one flaw, by itself, makes the algorithm very bad. The developer should
have used bcrypt, or scrypt, or SHA2-based Linux crypt. These are iterated by
making multiple calls to a compression function to significantly slow down
bruteforcing, from billions/sec down to thousands/sec.
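
For illustration only, here is what an iterated hash looks like in code. This sketch swaps in PBKDF2-HMAC-SHA256 because it ships with the Python standard library; it is not one of the three schemes named above, though it rests on the same idea of repeating the underlying function many times. The iteration count and helper names are assumptions for the example, not tuned recommendations.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

ITERATIONS = 200_000  # illustrative work factor; tune for your own hardware

def hash_password(password: str, iterations: int = ITERATIONS) -> Tuple[bytes, int, bytes]:
    """Return (salt, iterations, derived_key) for storage."""
    salt = secrets.token_bytes(16)  # per-user salt from the OS CSPRNG
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, derived

def verify_password(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, expected)

salt, iterations, stored = hash_password("correct horse battery staple")
ok = verify_password("correct horse battery staple", salt, iterations, stored)
bad = verify_password("Tr0ub4dor&3", salt, iterations, stored)
print(ok, bad)
```

Every guess now costs the attacker the full iteration count, and storing the count next to the hash lets it be raised later without breaking old records.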

------
thaumaturgy
The main problem here IMO is that this thing will be used to store passwords
for a longer period of time than anyone will spend bothering to break it now.
So you store your entire password database this way, then two years from now
you get breached, and both your database and your code get exposed.

Then, you _might_ benefit from nobody wanting to bother to spend the energy to
reverse-engineer your one-off GoofHash(). But, then again, maybe not. Maybe
the fact that all of the passwords have been effectively hashed using single
iterations of fast trivial algorithms like SHA1 and MD5 will mean that a ton
of your users' passwords could be calculated in a short period of time.

In that scenario, you're gambling instead of relying on battle-tested
approaches. Maybe the gamble will pay off, probably it won't, but when the
battle-tested approaches have been examined by a lot of really smart people
and found to be solid, why not use them instead?

Or, put another way: you're going to have to play a game of dice against
Death. You have a choice between playing with a set of dice that thousands of
people have rolled over and over again and found to be fair so far, or a set
of dice that some dude just rolled for the first time a minute ago and
declared to be fair. Which do you choose?

edit: I am an idiot. If someone gets the code and DB, they won't have to
reverse-engineer anything; this is just a fast MD5/SHA1 combo, a bunch of
passwords could be brute-forced in a short period of time.

------
philbarr
Does anyone know of a decent introductory, authoritative text on cryptography?
I haven't needed to know this stuff before beyond "use bcrypt because it's
good, MD5 is rubbish", but have recently been given the task of generating
license keys for our product. I would like to be able to test that what I've
created isn't trivially crackable.

~~~
MichaelGG
You're going to have to define "trivially crackable". Sure, if you use a public
key they can't create a key generator. After that, crypto has little use to
you.

A user can just open your binary and flip the jump that checks if the license
is good or not. Do you really want to start wasting time making your software
vastly more complicated, in order to try to lock out crackers? If the software
is remotely desirable or common, it'll get cracked.

~~~
philbarr
For me, trivially crackable here refers to having someone who has read a basic
text picking the license key out of the registry and going "hmm, that looks
like a such-and-such key, I wonder what happens when I run it through this
program here. Oh look it works, I think I'll just extend the date of the
license to forever and tell all my friends."

If someone wants to go and disassemble the app and read all that machine code
to find the bit to flip then there's not much I can do about that I think.

~~~
MichaelGG
"Read all that machine code" - I think you highly overestimate the difficulty
of that.

Take a simple checking function. Check the code, if it is wrong, show a
message box saying "wrong code". Well, in that case, all you gotta do is set a
breakpoint and go back. Programs like OllyDbg or IDA Pro make this kind of
reverse engineering absolutely trivial, and something you can accomplish after
"reading a basic text".

Most likely if someone's going to "run a program", they can run a crack, too.
If you can get away with a license file, then you can do RSA properly.
Otherwise, you end up with short RSA and hope no one really looks.

It's almost certainly the case that you're far better off improving some other
aspect of your application than investing any time trying to defend against
pirates.

~~~
philbarr
Thanks for the pointers to OllyDbg and IDA Pro - I'd never heard of them.
Embarrassingly, it's been quite a few years since I tried _actually_ hacking a
program; especially since I'm posting on " _Hacker_ News".

~~~
MichaelGG
Here's a tutorial I wrote in 2004 which is still a decent start these days. If
anything, it's easier these days, because IDA Pro has HexRays which can
decompile code into C and makes it very easy to navigate a compiled module.

[http://www.atrevido.net/blog/PermaLink.aspx?guid=ec99e239-89...](http://www.atrevido.net/blog/PermaLink.aspx?guid=ec99e239-8917-48e3-bd4f-af866b730150)

------
webreac
<https://www.coursera.org/course/crypto> is excellent

~~~
Nursie
I did this as well, awesome course. The drop-out rate was astounding from the
feb-may run though, from 70K signups there were only 1-2K that completed IIRC.

~~~
marshray
Wow. I'm even happier to have completed it then!

Agreed, it was a great course. Looking forward to the sequel in early April.
<https://www.coursera.org/course/crypto2>

~~~
Nursie
Thanks for the reminder, must get involved in that, it's fascinating stuff.

------
jws
When your machine is breached and the list of hashed passwords is stolen along
with the code the attackers will be able to brute force at very high speed all
of your customers' passwords, then use these to get into the customers'
gmail/yahoo/hotmail/paypal accounts (because they used the same or similar
password) and from there do real financial damage to your customers. And it
will all be your fault.

The point of bcrypt is to make it so they can't brute force at high speed. (If
you choose your number well enough.)

------
raheel
The published algorithms are prone to dictionary attacks. Even a not-so-good
crypto algorithm, if its logic is kept truly secret, leaves you much more
secure.

Let's say a hacker somehow got into some non-secure service with a good number
of users, and found that the usernames/passwords were not encrypted or hashed
at all. He can use that list as a dictionary. He can also have the MD5 and
SHA1 hashes precalculated. The same hacker then gets into your secured
service, already holding a dictionary to try. He can get a password from MD5,
SHA1, or any known hash in O(1) using a hash-table lookup. Normally, hackers
nowadays have the dictionary to try; it is just the salt and hash algorithm
that are unknown to them. So, the security measures can be:

1\. Protect password hashes from theft.

2\. Protect the salt from theft.

3\. Protect the hash function from theft.

Try to protect them all. All of them can be stolen, but the probability of
three thefts is less than that of two. I agree with Dave, but only if he has a
plan to keep the hash function secret. It should not be shipped inside
JavaScript, where a hacker would not need to make any effort.

~~~
msbarnett
This is more-or-less totally incorrect. The only thing you got right was 'use
salt to help against precomputed tables'.

Cooking up goofy hashes doesn't give you any extra security in the worst case
scenario (the hacker gets access to your web server and has the source code
for your goofy hash). A goofy sha1/md5 based algorithm with salt can still
have millions upon millions of attacks generated per second by anyone with a
half-decent GPU. Your only hope once you're in this scenario is to use a big
slow hash that the attacker can't run very fast -- y'know, something like
bcrypt.

And if you're right that they're more likely to get the db without the source
(very dubious -- web servers are a big attack surface)? Then the attacker is
still completely screwed trying to break your too slow to attack bcrypt+salt
passwords. You've got much better safety in _every_ scenario, not just the
"well they didn't manage to see how my goofy hash worked" scenario.

(And that's assuming you, as a non cryptographer, didn't so thoroughly screw
up your implementation of goofy hash that it ended up with less entropy than
even md5. But how would you know when you've kept it super-secret for l33t
security?)

------
brunnsbe
For a short password, does it help to first calculate a hash using MD5 or SHA1
and then run it through Bcrypt? As far as I have understood it's better that
the password to encrypt is long enough which the first simple hashing takes
care of, or am I wrong?

~~~
MikeKusold
My gut tells me no. If your password process is to take a string, get the MD5,
then call bcrypt on the MD5, then you are essentially doing the same as
calling bcrypt on the string.

If the USER takes a short string, and MD5's it then sets the MD5 as their
password then it is more secure, but that is only because it is a longer
password to begin with.

If I'm wrong I'd be interested in knowing why.

------
n0on3
Re-implementing crypto functions is generally a terrible idea, unless you
really know what you are doing (and most likely you don't). There are so many
different relevant factors to consider that single individuals or
organizations are really unlikely to be able to cover them anywhere close to
completely. That's why standards and "well known" algorithms are a good
choice.

Also, "security through obscurity" is well known as a worst practice for
crypto (the probability that the algorithm will be reverse engineered or
disclosed is anything but negligible), so it is definitely not a good reason
to implement any new algorithm.

------
arethuza
This is actually quite tame - I've encountered developers who justified
writing their own web server (in VB6!) because it would be more secure than
"standard" web servers - this was for quite sensitive commercial data...

------
nanoscopic
The md5 and shuffling of the md5 hash before doing a sha1 is entirely
pointless.

The salt generation looks acceptable though.

The op stated that he already created a system using bcrypt. The fact that op
doesn't know what is wrong with Dave's code, and needs others to explain why
Dave's code is bad, makes me lean towards op stfu and allow Dave to continue.

Op is unlikely to be any more competent compared to the somewhat incompetent
Dave.

I agree that it is likely that op and Dave are one and the same, and think
this mess of code is some sort of genius.

------
Millennium
The only instance in which homebrew password security isn't wrong is if you
are actively involved in research and experimentation, destined for publishing
and peer review, into better security methods.

We don't publish security-related algorithms to make them stronger. We do it
to weed out the weak, and it is remarkably effective at doing that. If you're
not confident that your algorithm could survive in the open, then chances are
it can't.

------
jakejake
Dave's mentality about having hashing that is unpublished is also known as
"security through obscurity". Relying on the fact that nobody can see your
source code. That's generally known to be a poor strategy.

Ironically, if he would just substitute bcrypt for sha1 in his function, it
would be a great crypt function!

------
oconnore
This is silly, true, but "you are not a cryptographer, use bcrypt" is not
helpful. You could design an equally silly password hashing scheme using
pbkdf2 and bcrypt, the recommended alternatives:

    
    
        sillysalt(user):=user+date+mersenne_twister()
        sillykdf(user,password):=pbkdf2(sillysalt($user)+
               shuffle(bcrypt($password)))
        
        persist( sillysalt($user)+":"+$outer_pbkdf2_salt+":"+
            $inner_bcrypt_salt+":"+sillykdf($user,$password))
    

A scheme like this is designed to thwart offline attacks on the password
verifier (for example, an attacker acquires a dump of your database).

His user+date+mersenne salt accomplishes the goal of (mildly) increasing the
cost of precomputed attacks. He should use a CSPRNG instead.

He uses existing cryptographic hash algorithms, SHA1 and MD5.

His shuffling does not add or detract from the security of the scheme.

He does not perform any significant key strengthening. Weak (most) passwords
will be very cheap to brute force.
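
On the salt point, a small sketch of the difference. Python's `random` module is also a Mersenne Twister, so it stands in here for PHP's `mt_rand`; the seed value is made up, and "the attacker has recovered the seed" is an assumption of the example.

```python
import random
import secrets

# Mersenne Twister is deterministic: whoever knows (or brute-forces) the seed
# can replay every value the generator has produced or will produce.
seed = 12345  # hypothetical seed the attacker has recovered
victim = random.Random(seed)
attacker = random.Random(seed)

observed = [victim.getrandbits(31) for _ in range(3)]
predicted = [attacker.getrandbits(31) for _ in range(3)]
predictable = observed == predicted

# A salt drawn from the operating system's CSPRNG has no seed to recover.
salt = secrets.token_bytes(16)
print(predictable, len(salt))
```

None of this helps with the missing key strengthening, which is the larger problem in the assessment above.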

~~~
marshray
> His user+date+mersenne salt accomplishes the goal of (mildly) increasing the
> cost of precomputed attacks.

Most password crackers these days are using GPUs and don't even bother with
precomputed rainbow tables. So the salt is not increasing their cost at all.

~~~
oconnore
You didn't actually disagree with what I wrote. What you wrote is also true.

------
eranation
tl;dr - don't break the first rule of cryptography
(<http://meta.security.stackexchange.com/a/915/12776>)

> "Anyone can invent an encryption algorithm they themselves can't break; it's
> much harder to invent one that no one else can break" - Bruce Schneier

------
dinkumthinkum
A primary tenet of cryptography is to assume that the mechanism of an
encryption is known. This is a case of someone, honestly, not being educated
about the topic and therefore I would definitely not want someone like that
making security decisions.

