
How Dropbox securely stores your passwords - samber
https://blogs.dropbox.com/tech/2016/09/how-dropbox-securely-stores-your-passwords/
======
Someone1234
I cannot see any obvious weaknesses in this scheme.

It seems to address a known pain point in bcrypt (max length), implements a
pepper in a secure way (which cannot inadvertently degrade security), and is
otherwise doing things which are best practices (high work factor, per user
salt, etc).

I know peppers remain controversial (some people claim they're pointless, and
make a good argument). But ultimately nothing Dropbox is doing with peppers in
this article makes your password easier to break, only harder.

I'd call this scheme 10/10.

~~~
dsacco
It's a good system, especially compared with the current best practice of
simply hashing passwords with bcrypt and calling it a day.

I can't recall it off the top of my head, but Facebook has a similarly
impressive system with more secret sauce involved for performance at scale. I
believe what they do is the following:

1\. Hash the password with MD5(password).

2\. Generate a 20-byte (160-bit) random salt (this is well over the 64 bits
you'd need to defend against birthday attack collisions).

3\. Hash with hmac_sha1(hash, salt).

4\. Send this value to a separate server for further operations (mitigates
offline brute-forcing).

5\. Hash in a secret key with hmac_256(hash, secret). Note this operation is
on a separate server. The secret key might be colloquially termed a "pepper".

6\. Hash with scrypt(hash, salt) to make local computation slower.

7\. Shrink the final value with hmac_256(hash, salt) for efficient database
storage.

If any Facebook engineers are around, please correct me if I've missed or
misinterpreted any part of that.
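
For readers who want to see the shape of that chain, here is a minimal Python sketch of the seven steps as recollected above. It is not Facebook's code. The function names, the scrypt parameters, the choice of the salt as the HMAC key, and the reading of `hmac_256` as HMAC-SHA256 are all assumptions, and the remote call of steps 4-5 is collapsed into a local function.

```python
import hashlib
import hmac
import os


def remote_hmac_service(value: bytes, secret: bytes) -> bytes:
    """Stand-in for steps 4-5; in the described design this runs on a separate server."""
    return hmac.new(secret, value, hashlib.sha256).digest()


def onion_hash(password: bytes, salt: bytes, secret: bytes) -> bytes:
    h = hashlib.md5(password).digest()              # step 1: legacy MD5 layer
    h = hmac.new(salt, h, hashlib.sha1).digest()    # step 3: salt used as HMAC key (assumption)
    h = remote_hmac_service(h, secret)              # steps 4-5: keyed hash with the secret
    # step 6: memory-hard hash; parameters are illustrative and deliberately small
    h = hashlib.scrypt(h, salt=salt, n=2**12, r=8, p=1, dklen=32)
    return hmac.new(salt, h, hashlib.sha256).digest()  # step 7: fixed-size value to store


salt = os.urandom(20)      # step 2: 20-byte random salt
secret = os.urandom(32)    # hypothetical service-side secret ("pepper")
stored = onion_hash(b"correct horse", salt, secret)
```

Verification recomputes `onion_hash` with the stored salt and compares the result against `stored`.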

~~~
tptacek
The current best practice of simply hashing passwords with bcrypt is fine, and
anything past that which doesn't involve (probably per-password) use
of an HSM adds only marginal value.

I wouldn't want anyone to read this and get the impression that "secret
peppers" and multiple hashing rounds and HMAC were important components of a
password storage system. They are not: they're things that message board nerds
come up with.

If you want to be a step better than just storing passwords with bcrypt, your
next step is to create an authentication service that runs on separate
hardware with an "is this password valid" and "enroll this password" API and
nothing else. The stuff people do instead of this is basically cosmetic.
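
A minimal sketch of what such a two-call API could look like, in Python. The class and method names are hypothetical, the service is shown in-process rather than on separate hardware, and scrypt from the standard library is swapped in for bcrypt purely so the example is self-contained.

```python
import hashlib
import hmac
import os


class AuthService:
    """Hypothetical stand-in for a service that exposes only enroll and is_valid."""

    def __init__(self):
        self._records = {}  # user -> (salt, password hash); never leaves the service

    @staticmethod
    def _hash(password: str, salt: bytes) -> bytes:
        # scrypt parameters are illustrative, not a recommendation
        return hashlib.scrypt(password.encode("utf-8"), salt=salt,
                              n=2**12, r=8, p=1, dklen=32)

    def enroll(self, user: str, password: str) -> None:
        """'Enroll this password': store a fresh salt and the resulting hash."""
        salt = os.urandom(16)
        self._records[user] = (salt, self._hash(password, salt))

    def is_valid(self, user: str, password: str) -> bool:
        """'Is this password valid': answer yes or no, nothing else."""
        record = self._records.get(user)
        if record is None:
            return False
        salt, expected = record
        return hmac.compare_digest(self._hash(password, salt), expected)
```

The point of the shape is that callers never see salts or hashes, so a compromise of an application server does not hand over the password database.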

~~~
alecmuffett
Yep, Tom, that's the FB solution - or rather, halfway up the stack of hashing
is a callout to a service where HMACs are involved. These bring their own
challenges:
[https://video.adm.ntnu.no/pres/54b660049af94](https://video.adm.ntnu.no/pres/54b660049af94)

~~~
tptacek
No, according to your talk, Facebook's solution is that password validation is
pushed out to the front-end servers, who use back-end KMS services to do some
(but not all) of the crypto.

I think this is a suboptimal approach. Can you tell me what the benefit of
your layered approach is over simply adding to your KMS servers the APIs for
validate(user, password) and change(user, oldpassword, newpassword)?

If your KMS service did password validation directly, you wouldn't need any of
the layers in this architecture. HMAC would add nothing (it would be
tautological, since anyone who could directly attack the hashes must have also
owned up the KMS service). I still don't totally understand the MD5 step. You
could just use scrypt and nothing else, and you could probably ratchet the
work factor up because you wouldn't be billing the front end servers for those
cycles.

I generally like, have recommended, and have built a few times the "software
HSM" KMS approach you're describing here --- but only for "seal/unseal" and
"sign/verify" APIs.

~~~
alecmuffett
>No, according to your talk, Facebook's solution is that password validation
is pushed out to the front-end servers,

Yes, although I don't work for Facebook any more, that was and probably still
is the case.

>who use back-end KMS services to do some (but not all) of the crypto.

In fact, the backend service does a tiny amount of the crypto.

>I think this is a suboptimal approach.

Of course you do, it's not like I've not argued with you before, Thomas. :-)

>Can you tell me what the benefit of your layered approach

...Facebook's layered approach...

>is over simply adding to your KMS servers the APIs for validate(user,
password) and change(user, oldpassword, newpassword)?

If you want your hash to be ... hashed, rather than lodged in some
questionable silicon, not putting a fat crypto load onto the backend avoids
the "thundering herd" problem when some fraction of 1.7 billion people want to
log in.

>If your KMS service did password validation directly, you wouldn't need any
of the layers in this architecture.

Quite, and if everyone flew instead of drove, we wouldn't need cars with
layers of bumpers, crumplezones, seatbelts and airbags; but it's merely
shifting the problem for 1.7 billion people.

Have I mentioned "scale"? I should mention scale. Scale is a thing.

>HMAC would add nothing (it would be tautological, since anyone who could
directly attack the hashes must have also owned up the KMS service).

Yes. Fucking huge KMS service. HUGE. Trumpiness levels of -HUGE- and eating
lots of power and redundancy.

1.7 billion people. That's a lot. Like 0.01% of it is 170,000 people. All
logging in together. All over the world.

>I still don't totally understand the MD5 step.

Yeah, but I'm not betting that Mark's/whomever's coding at the time was
focused on the future of password authentication.

>You could just use scrypt and nothing else

Yes, but where's the fun in that?

>and you could probably ratchet the work factor up because you wouldn't be
billing the front end servers for those cycles.

The frontend servers are approximately precisely where you want the
cost/chokepoint. There's a metric fucktonne of them and they are closest to
the request, so by definition they are scaled to the load.

>I generally like, have recommended, and have built a few times the "software
HSM" KMS approach you're describing here --- but only for "seal/unseal" and
"sign/verify" APIs.

Cool. What's your biggest deployment?

~~~
tptacek
Last question first: large, but not Facebook large (I spent 10 years
consulting on this kind of thing).†

You comment early in the talk that the MD5 step is somehow helpful for
password dumps. That was the bit I didn't follow. If it's there because that's
how password hashing worked before your team got to it, that makes a lot more
sense. But then: it's not a "layer" of the onion so much as a sheen of dirt
that needs to be washed off the onion. :)

I get that your auth problem is huge. Yuge. So big you wouldn't believe it. I
totally believe you. No, wait, I don't believe you, that's how big I know your
authN problem to be: unbelievably huge.

But here's the thing: you're already scrypting passwords. We're not debating
whether you can use expensive password hashes. You already use expensive
password hashes. I'm saying: the model where the KMS does a small bit of the
password hash step and defers the heavy lifting to front-end servers seems
like a suboptimal way to structure this:

* You have to bill cycles from the front end to do it

* You can't change password hashing without updating all the front-end servers

* It's harder to track usage because it's spread across a zillion machines

* You're more constrained in how you scale it (for instance, if you wanted to double or triple the work factor) because whatever your new scheme is, it has to fit with the existing front-end resources.

I'm not saying "wow, it's dumb that you built it this way". I'm saying, if
other people are reading this thread thinking about how to do it:

* DO split authentication out into its own service

* DON'T have that authentication service be "HMAC as a service" and then do scrypt on your front-end service

YOUR MOVE, ALEC MUFFETT. I keep going until you unfriend me on Facebook so I
can't see you wincing about these posts.

† _I've_ assessed _Facebook-large variants of this, though._

~~~
alecmuffett
>Last question first: large, but not Facebook large (I spent 10 years
consulting on this kind of thing).

I just spent 3 years living it for 50h/week. Hence why I am taking a vacation.

>If it's there because that's how password hashing worked before your team got
to it, that makes a lot more sense.

That. It wasn't even me, it was done before I arrived, but it was done by a
team of geeks with a tremendous nose for making the best of the database that
they had available to them without pulling the old password-migration "log in
with one password, parallel-encrypt with a new algorithm, and save the new
hashes" \- thing, because some of those billion people might never log in
again for years. You would never stop migrating people.

I remember internal password algorithm migrations at Sun; at least there you
could force the matter for 10,000..40,000 people.

But you can't force everyone to migrate at FB scale.

>But then: it's not a "layer" of the onion so much as a sheen of dirt that
needs to be washed off the onion. :)

You can take that approach, but - again - when will you finish the task?
Whereas wrapping one algorithm in the next is a finite task which is
completable in a reasonable amount of time.

>I get that your

...Facebook's...

>auth problem is huge. Yuge. So big you wouldn't believe it. I totally believe
you. No, wait, I don't believe you, that's how big I know your authN problem
to be: unbelievably huge.

Well channeled. :-)

>But here's the thing: you're already scrypting passwords. We're not debating
whether you can use expensive password hashes. You already use expensive
password hashes. I'm saying: the model where the KMS does a small bit of the
password hash step and defers the heavy lifting to front-end servers seems
like a suboptimal way to structure this:

>* You have to bill cycles from the front end to do it

Yes. 0.1% of frontend cycles. <blank expression> And?

>* You can't change password hashing without updating all the front-end
servers

...which happens three times a day, weekdays, and is moving to moreso.

>* It's harder to track usage because it's spread across a zillion machines

"Facebook has tools for that sort of thing":

[http://conferences.oreilly.com/strata/stratany2012/public/sc...](http://conferences.oreilly.com/strata/stratany2012/public/schedule/detail/25540)

...and to be honest it's not hard to do anyway.

>* You're more constrained in how you scale it (for instance, if you wanted to
double or triple the work factor) because whatever your new scheme is, it has
to fit with the existing front-end resources.

Yes. For a site with wildly heterogeneous architectures in front-end
deployments, I can see how that might be a concern; but even AWS leads people
to standardise on having approximately-the-same-kinds-of-hardware-doing-
approximately-the-same-things.

>I'm not saying "wow, it's dumb that you

...Facebook...

>built it this way". I'm saying, if other people are reading this thread
thinking about how to do it:

>* DO split authentication out into its own service

...or some component of it...

>* DON'T have that authentication service be "HMAC as a service" and then do
scrypt on your front-end service

Why not?

>YOUR MOVE, ALEC MUFFETT. I keep going until you unfriend me on Facebook so I
can't see you wincing about these posts.

Wince?

> I've assessed Facebook-large variants of this, though.

Did they buy it?

~~~
tptacek
Sorry for the delay. A Jazzercise class suddenly appeared in the coworking
space I work out of, so I fled, and then I had to give a talk about
Starfighter.

Responses: _(when I say "your" let's just stipulate I mean Facebook)_

* Your password validation overhead is .1% of _current_ front end resources, but could be ratcheted up, and would be easier to ratchet up if they weren't shared by other things.

* I totally understand why you keep the old MD5 cruft around --- but would add that it's cruft that would be even less obvious if it lived behind an authentication server.

* I think it's safer, simpler, cleaner, probably easier to scale, and definitely easier to change authentication if it lives in its own service rather than being implemented (in part) on a generic application server. As usage shifts from HTML front-end to all API, you might even be able to keep app servers from even _seeing_ passwords.

* By "assessed", I mean, worked on other people's systems at this scale.

So I guess I'd wrap up with a question: if you had this to do over again, from
scratch, the way you wanted to, would you have app servers do a password hash
and then entangle it somehow with an HMAC operation from a crypto service, or
would you have the whole password hash done on the crypto service directly?

~~~
alecmuffett
tl;dr for me:

Putting the authentication service into a nice tidy centralised box does not
actually achieve much, and may have architectural downsides.

Not the least of which is: if it's wholly in a service, then you have to
authenticate the service; that's not such a big step from "if the hashed
passwords are stored in a directory, then you have to authenticate the
directory" of course - but if we were to equate the two systems because of the
need to authenticate the {directory, service} then the service-based solution
still has the downside of being a CPU hotspot and a potential single point of
failure.

We're much better at distributing directories of data which is self-protected
/ needs no special treatment, than we are at building humongous scalable
"secure" services with an enormous TCB and a physically enormous attack
surface / footprint.

Yes. If I was doing this, sure, I would do this again. Curiously I am a big
fan of password hashing rather than all-singing, all-dancing authentication
services.

General notes: [http://dropsafe.crypticide.com/muffett-passwords](http://dropsafe.crypticide.com/muffett-passwords)

~~~
tptacek
Not sure I understand. If you have to authenticate the password verifying
service, you also have to authenticate the HMAC-providing service.

I like password hashing too. That's all my password validating service would
do.

~~~
alecmuffett
> you also have to authenticate the HMAC-providing service

Yes.

But, to look at the Facebook approach, what is the risk surface presented by
the HMAC service?

Done properly in the FB approach the password is irreversibly hashed before it
arrives at the HMAC component, and cheaply HMAC'ed and returned, where the
onion of hashing is completed.

It's good to bidirectionally authenticate access to the HMAC service, but in
terms of protocol it strikes me as less critical than in your scenario.

Either the HMAC is done properly (in which case the eventual hashes will
verify for legitimate users) or - if someone inserts a "fake" hashing service
- the HMAC'ed results will not validate, and a bunch of legitimate users will
experience login failure.

( edit: there's a risk of exfiltrating the input to the service, but it's
meant to be a shitload of work to achieve any evil with that input anyway,
which also can be shorn of user-metadata and other clues thereby making it a
bit less valuable )

Maybe I have missed something but to my mind this threat scenario fails (by
dint of fake services, exfiltration, etc) in a "safe" manner.

=== Now === consider your "authentication service" approach.

Plaintext goes into... what?

The real service?

A fake service that returns "true" in all circumstances?

A MITM that exfiltrates the plaintext?

Where do you put the root of the trust chain to this service? In an SSL
Certificate? Pinned? From which CA?

Simply: I feel that in centralised password authentication services there are
a lot more potential shenanigans to defend against.

( ps/edit: and - I know you would not, but - don't get me started on "three
strikes":
[http://www.crypticide.com/article/42](http://www.crypticide.com/article/42) )

------
kijin
A note about combining SHA512 with bcrypt: Don't feed the raw binary output of
SHA512 into bcrypt. Use the hexadecimal or base64-encoded form instead.
(Dropbox probably does this already, since they mention base64 in passing.)

bcrypt is known to choke on null bytes. Each SHA512 hash has roughly a 22% chance
(1 - (255/256)^64) of containing a null byte if you use the raw binary format.

Using hex or base64, of course, decreases the amount of entropy that you can
fit into bcrypt's 72-byte limit. But you can still fit 288 to 432 bits of
entropy in that space, which is more than enough for the foreseeable future.
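
A quick way to check these figures, as a Python sketch. It assumes the digest bytes are uniformly random and that bcrypt reads at most 72 bytes of input; the sample password is arbitrary.

```python
import base64
import hashlib

DIGEST_BYTES = 64   # SHA-512 output size in bytes
BCRYPT_LIMIT = 72   # bytes of input bcrypt is assumed to read

# Chance that at least one of 64 uniformly random bytes is 0x00.
p_null = 1 - (255 / 256) ** DIGEST_BYTES

digest = hashlib.sha512(b"example password").digest()
hex_form = digest.hex().encode("ascii")   # 128 characters, 4 bits each
b64_form = base64.b64encode(digest)       # 88 characters, 6 bits each

# Bits of the digest that survive truncation to the bcrypt limit.
hex_bits = min(len(hex_form), BCRYPT_LIMIT) * 4
b64_bits = min(len(b64_form), BCRYPT_LIMIT) * 6

print(f"P(null byte in raw digest) = {p_null:.3f}")
print(f"hex:    {len(hex_form)} chars, {hex_bits} bits kept")
print(f"base64: {len(b64_form)} chars, {b64_bits} bits kept")
```

The null-byte probability comes out somewhat under one in four, and neither encoded form can contain a null byte at all.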

~~~
benmanns
Thanks for the reminder. You could encode with a "base255" algorithm that just
excludes the null byte to retain more entropy within 72 bytes.

~~~
KMag
Here's something much faster than all of those multiplications that invertibly
maps 64 bytes to 65 bytes without nulls: replace the first null with 255.
Replace every later null with the index of the previous null (counting from 1,
so that an index is never itself a null). Make the final byte the index of the
last null (or 255 if no nulls were replaced). In this way, you've replaced the
nulls with a linked list of the locations where nulls used to be. To invert
the transformation, just start at the final byte and walk the linked list
backward until you hit a 255. (You'd never do the inversion in practice, but
the existence of the inversion algorithm proves that no entropy was
discarded.)

Use the time saved by not doing base 255 conversion to increase your iteration
count.
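
The linked-list trick described above can be sketched in Python as follows. The function names are made up for illustration, and indexes are counted from 1 (an assumption, so that a stored index can never be a null byte itself).

```python
import hashlib

END = 255  # marks the end of the chain of null positions


def strip_nulls(data: bytes) -> bytes:
    """Map n bytes (n <= 254) to n + 1 bytes containing no 0x00."""
    if len(data) > 254:
        raise ValueError("1-based indexes must stay below 255")
    out = bytearray(data)
    prev = END
    for i, byte in enumerate(data):
        if byte == 0:
            out[i] = prev   # point back at the previous null (or END for the first)
            prev = i + 1    # 1-based position of this null
    out.append(prev)        # final byte: position of the last null, or END
    return bytes(out)


def restore_nulls(encoded: bytes) -> bytes:
    """Inverse of strip_nulls: walk the chain backward and put the nulls back."""
    out = bytearray(encoded[:-1])
    pointer = encoded[-1]
    while pointer != END:
        position = pointer - 1
        pointer = out[position]
        out[position] = 0
    return bytes(out)


digest = hashlib.sha512(b"example password").digest()
encoded = strip_nulls(digest)   # 65 bytes, safe to hand to a null-terminated API
```

Bytes that were already 255 in the input cause no ambiguity, because the decoder only follows positions reached from the final byte.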

~~~
kijin
Any time you spend or save on these kinds of transformations is going to pale
in comparison to the bcrypt step. It might not even amount to a single
iteration.

------
0x0
It's nice to store passwords securely, but it's also important to remember to,
you know, actually verify them afterwards ;)

[https://techcrunch.com/2011/06/20/dropbox-security-bug-made-...](https://techcrunch.com/2011/06/20/dropbox-security-bug-made-passwords-optional-for-four-hours/)

~~~
eropple
That is literally five years old at this point and is, at best, a cheap shot.
Let's be better.

~~~
seanieb
Also, Dropbox has more people working on security today than the total number
of employees Dropbox had in 2011.

~~~
zuck9
Also, they have written a unit test to make sure this cannot happen ever
again.

------
borplk
As someone who exclusively uses a password manager with random unique
passwords for each service it always amuses me to see posts like this.

Years ago I relieved myself from the stress by using a password manager. Now
for all I care they could be storing it in plaintext and it wouldn't make a
damn difference to me. Problem solved.

~~~
Kratisto
Just curious: which one do you use?

~~~
FabHK
I (not OP) use pwsafe on macOS and iOS, a port of Password Safe [1] designed
by Bruce Schneier.

It syncs the encrypted blob to iCloud, so I can pull up passwords on the
iPhone if necessary. It is fairly simple, not much browser/OS integration -
you just open the app, choose a safe/blob, enter the master password, and can
then browse/edit your list of passwords, and in particular copy a password.

Simple, not too much functionality - not too much that can go wrong, I hope.
(Often it has been the browser integration that led to exploits in
LastPass/1Pass, if I'm not mistaken)

[1]
[https://www.pwsafe.org/relatedprojects.shtml](https://www.pwsafe.org/relatedprojects.shtml)

------
cperciva
_We considered using scrypt, but we had more experience using bcrypt._

Ok, fair enough...

 _The debate over which algorithm is better is still open, and most security
experts agree that scrypt and bcrypt provide similar protections._

... wait, what?

~~~
tptacek
They're badly expressing the idea that there may be only marginal benefits to
optimizing the genus of password hashes used, so long as you're using a
serious construction designed for storing (or generating keys from) passwords.

The debate over whether scrypt is better than bcrypt is not really still open.
The debate over whether the difference matters that much in practice might be.

For what it's worth: for new systems, I use scrypt. But if someone asked, and
they didn't have a very specialized application, I'd tell them that switching
to scrypt from bcrypt, or even PBKDF2, would be a waste of money.

~~~
cperciva
Right, if they had stopped at "we used bcrypt because we're familiar with it
and we think it's good enough for our purposes", I wouldn't have said
anything. But the second sentence, claiming that there's an open debate about
which provides more protection...

~~~
tptacek
Yeah, sorry, I should have led with "I agree with you".

No part of this post inspired confidence in me. I use Dropbox to store and
share screenshots on Twitter, and try not to use it for anything else.

~~~
cperciva
_I use Dropbox to store and share screenshots on Twitter_

Wildly off-topic, but I just upload screenshots to twitter directly and let
them figure out the hosting. Is your usage of dropbox for this simply a legacy
of when twitter didn't have support for uploading images?

~~~
tptacek
Probably, yes.

------
red_admiral
Here's how facebook does it:
[http://chunk.io/f/72f9c680ac2a4777b6dbf33c532e1d3c.jpg](http://chunk.io/f/72f9c680ac2a4777b6dbf33c532e1d3c.jpg)
(Alec Muffett talking at RealWorldCrypto)

Seems like the combination of strong hash + encryption on a HSM is the way to
go these days. Dropbox's scheme looks good to me.

------
joepie91_
One concern I have here is that people are going to perceive this post as
"this is what you should do and it's easy!", because the post doesn't really
address the complexities of implementing this kind of thing.

As a result, we're probably going to have a bunch more issues like this one:
[http://blog.ircmaxell.com/2015/03/security-issue-combining-b...](http://blog.ircmaxell.com/2015/03/security-issue-combining-bcrypt-with.html)

I'm not looking forward to having to talk people off that particular ledge for
the next several months...

------
aomix
Cool approach, you need to compromise two separate servers just to have a
usable password database you could run tools against. A key compromise can be
fixed quickly and a password compromise is useless without the key.

~~~
dsl
Of the last 10 or so security engagements I have done, I can only recall one
where I wasn't able to compromise _all_ servers. Once you get the first few,
the incremental work to get everything is relatively small.

When breaking in, your end goal isn't the database server... it's the domain
controller or the configuration management server.

------
sandGorgon
Does anyone know what is a good practice to create a "vault" \- the kind that
is used for the Pepper in this case?

I have heard of it being a separate, ip restricted server with daily changing
ip address, etc. A simpler use case would be to store oauth2 tokens or some
kind of PII

~~~
shawabawa3
I've heard before of it just being stored in the codebase. Doesn't add much
security but it does mean both the database server and at least 1 of the app
servers or the codebase have to be breached

~~~
koolba
> I've heard before of it just being stored in the codebase.

Yuck! At the very least put it in an environment variable. Best case is loaded
once at server boot from an HSM, kept only in memory, and rotated on a regular
basis.

Having it in the code, like all other config, is a terrible idea.
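
As a small illustration of the environment-variable option, here is a Python sketch. The variable name `PASSWORD_PEPPER_HEX` and the 32-byte minimum are assumptions made for the example, not anything Dropbox describes.

```python
import os

PEPPER_ENV_VAR = "PASSWORD_PEPPER_HEX"  # hypothetical variable name


def load_pepper() -> bytes:
    """Read the pepper once at startup and refuse to run without it."""
    value = os.environ.get(PEPPER_ENV_VAR)
    if not value:
        raise RuntimeError(f"{PEPPER_ENV_VAR} is not set")
    pepper = bytes.fromhex(value)
    if len(pepper) < 32:
        raise RuntimeError("pepper must be at least 32 bytes")
    return pepper
```

Calling `load_pepper()` once at boot and keeping the result only in memory matches the "loaded once, kept only in memory" part of the advice; where the value comes from (an HSM, a secrets manager) is a separate decision.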

~~~
sandGorgon
If we don't have an HSM? Apparently, neither does Dropbox. So how do you
engineer one?

~~~
morgante
Use something like Vault:
[https://www.vaultproject.io/](https://www.vaultproject.io/)

------
evunveot
> Some implementations of bcrypt truncate the input to 72 bytes, which reduces
> the entropy of the passwords.... By applying [SHA512], we can quickly
> convert really long passwords into a fixed length 512 bit value, solving
> [that problem].

This part confused me. How can truncating to 72 bytes be a more severe
reduction in entropy than generating a 64-byte hash?

~~~
grenoire
Password lengths are variable. With passwords longer than 72 ASCII characters,
you will lose entropy after that.

Let A be a 72 character long string, and B be A + X. Regardless of what X is,
when bcrypted the result for A and B will be the same.
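
The A versus A + X point can be demonstrated with a stand-in. The Python sketch below does not call bcrypt: `truncating_hash` is a hypothetical function that mimics only the 72-byte truncation, which is the one property under discussion.

```python
import hashlib

LIMIT = 72  # bytes of input a truncating implementation would read


def truncating_hash(password: bytes) -> str:
    """NOT bcrypt: a stand-in that only mimics truncating input to 72 bytes."""
    return hashlib.sha256(password[:LIMIT]).hexdigest()


def prehashed(password: bytes) -> str:
    """Same stand-in, fed the hex SHA-512 digest so every input byte matters."""
    digest_hex = hashlib.sha512(password).hexdigest().encode("ascii")
    return truncating_hash(digest_hex)


A = b"a" * 72
B = A + b"X"

collides_without_prehash = truncating_hash(A) == truncating_hash(B)
collides_with_prehash = prehashed(A) == prehashed(B)
```

Without the pre-hash, A and B are indistinguishable; with it, the extra character changes the digest and therefore the result.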

~~~
crpatino
A random X does not reduce the entropy of bcrypt(B), just fails to add any
additional entropy beyond bcrypt(A)'s.

Assuming C, where len(C) < 72, I don't know if it is possible at all to choose
some value of Y such that:

D:= C+Y

Entropy(bcrypt(D))< Entropy(bcrypt(C))

~~~
Dylan16807
"Lose" meaning it is thrown away. Not "lose" meaning it subtracts.

------
OskarS
_If we use the global pepper for hashing, we can’t easily rotate it._

I don't get this point. Why is it harder to rotate pepper for a hash compared
to an encryption key?

~~~
jxcl
Because encryption can be decrypted with the correct key. A hash can't be
reversed once you've hashed something with the pepper.
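
To make the difference concrete, here is a Python sketch under loud assumptions: `toy_wrap` is a toy reversible cipher (XOR with an HMAC-derived pad), used only because the standard library has no AES, and scrypt stands in for bcrypt. Neither is what Dropbox uses.

```python
import hashlib
import hmac
import os


def slow_hash(password: bytes, salt: bytes) -> bytes:
    # scrypt as a stand-in for bcrypt; parameters are illustrative
    return hashlib.scrypt(password, salt=salt, n=2**12, r=8, p=1, dklen=32)


def toy_wrap(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy reversible cipher for data up to 32 bytes. NOT AES; illustration only."""
    pad = hmac.new(key, nonce, hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(data, pad))


salt, nonce = os.urandom(16), os.urandom(16)
old_key, new_key = os.urandom(32), os.urandom(32)
password_hash = slow_hash(b"hunter2", salt)

# Pepper as an encryption key: unwrap with the old key, rewrap with the new one.
# No password is needed, so every row can be rotated offline.
stored = toy_wrap(old_key, nonce, password_hash)
rotated = toy_wrap(new_key, nonce, toy_wrap(old_key, nonce, stored))

# Pepper mixed in by hashing: the stored value cannot be turned back into
# password_hash, so re-keying has to wait until the user logs in again.
stored_keyed_hash = hmac.new(old_key, password_hash, hashlib.sha256).digest()
```

In the encrypted case the old key can be retired as soon as the rewrap finishes; in the hashed case it has to stay around for every user who has not logged in since.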

~~~
OskarS
Hmm. Ok. That's true. Thanks!

------
martinko
A bit of an overkill, no? Doesn't bcrypt suffice?

~~~
OskarS
Assume their salt+hash database leaks. While salting the passwords
and using bcrypt would prevent mass cracking of the database (i.e. you would
have to crack each password individually, not the entire database at once), it
would still be feasible to crack a single user's password if it was weak
enough (which would be worth it, if the user is, say,
president@whitehouse.gov).

Using pepper prevents that from happening, and storing it separately from your
database makes it much harder to get both.

~~~
Achshar
> it would still be feasible to crack a single user's password if it was weak
> enough

I'm sorry, but how? You'd need data centers worth of effort for many years.

~~~
OskarS
They say that the bcrypt hashing takes 100 ms on their servers. If we take
that as our limitation, it means we can try 10 passwords per second. So if
you had a dictionary of common passwords, plus the salt, you could try 36000
passwords per hour. If the password is "password12345" (hence "weak enough"),
you could feasibly crack that.

(I should say I'm not a professional security engineer so I'm probably wrong
about everything and would appreciate corrections. )
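
The arithmetic above, as a small Python sketch. The `cores` parameter and the one-million-entry wordlist are assumptions for illustration, and an attacker's hardware may be faster or slower than the servers the 100 ms figure was measured on.

```python
def guesses_per_hour(ms_per_hash: float, cores: int = 1) -> float:
    """Guesses per hour if each guess costs ms_per_hash on each core."""
    return (1000 / ms_per_hash) * cores * 3600


def hours_to_exhaust(wordlist_size: int, ms_per_hash: float, cores: int = 1) -> float:
    """Worst-case time to try every entry in a wordlist against one hash."""
    return wordlist_size / guesses_per_hour(ms_per_hash, cores)


single_core = guesses_per_hour(100)        # 10 guesses/s, 36,000 per hour
wordlist = 1_000_000                       # hypothetical dictionary size
hours = hours_to_exhaust(wordlist, 100)    # a bit under 28 hours on one core
```

A password that sits near the top of a common-password list falls in seconds at this rate; a long random one never appears in the list at all.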

~~~
Someone1234
Great analysis (that in my experience is largely accurate).

Just want to tack on one footnote: 100 ms is based on a single CPU core. Most
CPUs have 4-8 cores. So instead of 10 passwords a second you could argue 40-80
passwords a second is believable with concurrent operations (which all popular
hash cracking software supports).

It is definitely viable to break a single user's hashed password no matter
what the scheme or work factor. Strong hashing algorithms with high work
factors just stop you breaking multiple users' passwords quickly (and give
you 1-3 days of delay). It is a stopgap, not an impenetrable defence like some
believe.

All technical people need to take a day and learn how to break password
hashes. Not just the theory, but go download the software and actually do it.

~~~
ZoFreX
> Not just the theory, but go download the software and actually do it.

And now that hashcat and oclhashcat have merged[1] and are open-source[2] it's
never been easier to install hashcat and start cracking passwords!

[1]
[https://hashcat.net/forum/thread-5559.html](https://hashcat.net/forum/thread-5559.html)
[2]
[https://hashcat.net/forum/thread-4880.html](https://hashcat.net/forum/thread-4880.html)

------
ppierald
I would be interested in the details of the storage mechanism of the global
pepper. Is this in an HSM? For AWS customers, something like KMS? There are
then huge operational and redundancy issues to think about. Failovers for your
HSM. Handling the possibility that AWS might not be available or corrupt the
key, other cases. These things are easy to whiteboard, but when the rubber
hits the road and you need to think about all the operational edge cases,
things get hard quick.

~~~
dsacco
It's not in an HSM. Dropbox states towards the end of the article that they're
exploring HSM applications for pepper storage, which I think is a great idea.
If I recall correctly, Facebook is also exploring (or has already implemented)
an HSM for password database secret key storage.

You raise good points though. This system is significantly safer than best
practices (bcrypt(password, 10)), but it has significantly more overhead.
There are also diminishing returns here. For a company of Dropbox's size - sure,
invest in this. For a company that came out of YC S16, no, don't bother. Just
properly bcrypt/PBKDF2/scrypt/argon2 the thing and revisit much later.

I love it, but I would not recommend this system to my clients for password
storage unless they had a very mature operations/reliability team.

------
yladiz
Realistically, how much better is this than the standard bcrypt
recommendation? I don't mean for a company the size of Dropbox/Facebook/etc.,
I mean in general, will this really be much more useful than just using
bcrypt? Using an encryption key means that if the database is compromised, as
long as the OS isn't (or wherever the key is being stored), the passwords are
encrypted in a way that's effectively impossible to decrypt, which is nice.
However, are they sure that hashing the password first before hashing it in
bcrypt won't cause issues?

Unless Dropbox employs or contracted someone to verify that this is okay (not
an engineer, a mathematician/cryptographer who can understand the math behind
the algorithms) I'd be hesitant about it. Same goes for other companies that
do some complex sequences of hashing e.g. Facebook. Implementing the idea is
engineering related, but verifying it is not, and I don't trust engineers
(including myself) to verify that a specific algorithm or sequence of
algorithms is valid.

------
faragon
From the diagram, Dropbox stores no passwords: it stores an encrypted hash
(hashing in two steps, SHA512 and then "bcrypt") of the password. I.e. stored =
AES256(bcrypt(SHA512(password), per_user_salt, 10), global_key).

I would like to know if "salted-bcrypt"+SHA512 hashing is really safer than
using just SHA512 (e.g. because of the risk of making locating hash collisions
easier, etc.).

~~~
cstrat
I posted a separate comment, but will post here again...

Dropbox do store one password AFAIK. It appears that they store OSX users'
administrator passwords... I am keen to see if they address this somewhere.

See discussion:
[https://news.ycombinator.com/item?id=12457067](https://news.ycombinator.com/item?id=12457067)

~~~
angry_octet
No, they use the admin access they obtain from you to modify the system so
they can obtain admin access again in the future. They do not ever have your
admin password.

------
CiPHPerCoder
Their solution is very similar to the mode prescribed by [1] and implemented
in [2].

There are actually two problems with bcrypt:

    
    
      - It truncates after 72 bytes
      - It truncates after a NUL byte
    

If anyone is dead set on following Dropbox's example, make sure you aren't
passing raw binary to bcrypt. You're playing with fire.

Additionally, if you're going to use AES-256, don't implement it yourself. Use
a well tested library that either uses AEAD or an Encrypt then MAC
construction.

[1]: [https://paragonie.com/blog/2016/02/how-safely-store-password...](https://paragonie.com/blog/2016/02/how-safely-store-password-in-2016#why-bcrypt)

[2]:
[https://github.com/paragonie/password_lock](https://github.com/paragonie/password_lock)

~~~
arielb1
nitpick: encrypt-then-MAC _is_ an AEAD construction.

~~~
CiPHPerCoder
Nitpick to counter your nitpick: Not necessarily.

It's an AE construction, but not necessarily an AEAD construction. You can
have IND-CCA3 security without additional data.

The converse is also not necessarily true (i.e. an AEAD scheme could be based
on MAC-then-Encrypt and IIRC there are some that _are_ built that way), but
that's a less useful counterpoint. The recommended AEAD constructions (AES-GCM
and ChaCha20-Poly1305) _are_ EtM.

------
figers
How dropbox "NOW" securely stores your passwords

------
oDot
While this is very impressive, it feels like trying to solve the wrong
problem. The real problem is getting rid of passwords (Persona, anyone?).

Don't get me wrong, what's described there is super-important to secure the
authentication of today, but what about a word for the authentication of
tomorrow?

There already are various solutions. Passwordless[0] is a familiar one for
nodejs, and I recently bumped into the promising Portier[1], which is,
according to its authors, a "spiritual successor to Mozilla Persona".

[0] [https://passwordless.net/](https://passwordless.net/)

[1] [https://portier.github.io/](https://portier.github.io/)

~~~
kraftman
So I have to login to my email every time I want to login to something using
passwordless? Doesn't this just move the problem and create a few others?

~~~
jacobsenscott
For most companies offloading your password management onto an email provider
is the right way to go. Suddenly, for free, you get MFA, a dedicated security
team, and you'll never need to do one of those "Our password database has been
hacked. Here's what we're doing..." press releases.

You've eliminated one point of failure (your company), and haven't added any
because you are already doing email based password resets.

You can delete all your password related stories from trello or whatever.

You eliminate all the bike shedding around how to store passwords.

You've improved your initial user experience by an order of magnitude.
Everyone dreads setting up yet another account password. Don't underestimate
the joy a user feels when the signup form is just "click one of these buttons
or fill in the email field". (The buttons are 'Connect with Facebook' and
'Connect with Twitter').

Users would much rather flip over to email (which is always logged in anyway)
and click a link (especially on a mobile device) than enter a login/password.

------
Jahava
The blog mentions, "We’re considering argon2 for our next upgrade". I suppose
they could do in-line upgrades: as users are signing in, the SHA512 is piped
through the old pipeline for verification and through the new pipeline for
migration. As far as I can tell, there's no way for them to swap bcrypt out
for argon2 using just their cold store.
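
A rough sketch of what such an in-line upgrade could look like. Everything
here is hypothetical: the record layout, the function names, and the two
PBKDF2 configurations, which merely stand in for "old scheme" (bcrypt) and
"new scheme" (argon2) so the example runs on the standard library alone.

```python
import hashlib
import hmac
import os

def old_hash(password: str, salt: bytes) -> bytes:
    # Stand-in for the existing scheme (bcrypt in the article).
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 10_000)

def new_hash(password: str, salt: bytes) -> bytes:
    # Stand-in for the future scheme (argon2 in the article).
    return hashlib.pbkdf2_hmac("sha512", password.encode(), salt, 50_000)

SCHEMES = {"old": old_hash, "new": new_hash}

def make_record(password: str, scheme: str) -> dict:
    salt = os.urandom(16)
    return {"scheme": scheme, "salt": salt, "hash": SCHEMES[scheme](password, salt)}

def login(record: dict, password: str) -> bool:
    """Verify against the stored scheme; on success, migrate old records."""
    expected = SCHEMES[record["scheme"]](password, record["salt"])
    if not hmac.compare_digest(expected, record["hash"]):
        return False
    if record["scheme"] != "new":
        # The plaintext is only available during login, so rehash now.
        record.update(make_record(password, "new"))
    return True
```

Records for users who never log in again stay on the old scheme, which is the
limitation the comment above points at.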

------
Freaky
> Some implementations of bcrypt truncate the input to 72 bytes, which reduces
> the entropy of the passwords. Other implementations don’t truncate the input
> and are therefore vulnerable to DoS attacks because they allow the input of
> arbitrarily long passwords.

Huh? BCrypt works by stuffing the password into a 72 byte Blowfish key and
using it to recursively encrypt a 24 byte payload. Either it's truncating, or
it's pre-hashing the password to fit much like they are.

The link they use to justify it is funny:
[http://arstechnica.com/security/2013/09/long-passwords-
are-g...](http://arstechnica.com/security/2013/09/long-passwords-are-good-but-
too-much-length-can-be-bad-for-security/)

That's just a naive PBKDF2 implementation that's pointlessly reinitializing
the HMAC context each iteration instead of just doing it once at the start.
The difference between storing a 1 byte and a 1MB password with PBKDF2 should
be on the order of a couple of milliseconds.
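
As a hedged illustration of "doing it once at the start": the sketch below is
a plain-Python PBKDF2-HMAC-SHA256 that processes the password into an HMAC
context a single time and then copies that context for every iteration. It is
written for clarity, not speed, and the function name is made up; in practice
`hashlib.pbkdf2_hmac` should be used instead.

```python
import hashlib
import hmac
import struct

def pbkdf2_hmac_sha256(password: bytes, salt: bytes, iterations: int,
                       dklen: int = 32) -> bytes:
    # The (possibly very long) password is absorbed into the HMAC key
    # state exactly once, here.
    base = hmac.new(password, digestmod=hashlib.sha256)

    out = b""
    block_index = 1
    while len(out) < dklen:
        # U_1 = HMAC(password, salt || INT(block_index))
        mac = base.copy()
        mac.update(salt + struct.pack(">I", block_index))
        u = mac.digest()
        t = int.from_bytes(u, "big")
        # U_j = HMAC(password, U_{j-1}); T = U_1 xor U_2 xor ... xor U_c
        for _ in range(iterations - 1):
            mac = base.copy()  # cheap copy, no re-keying with the password
            mac.update(u)
            u = mac.digest()
            t ^= int.from_bytes(u, "big")
        out += t.to_bytes(32, "big")
        block_index += 1
    return out[:dklen]
```

With this structure the password length only affects the one-time setup; each
iteration hashes a fixed 32-byte value regardless of how long the password
was.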

~~~
arielb1
Having the SHA-512 hash at the beginning simplifies the implementation because
the "security" code only needs to handle 64-byte random strings (which are
truncated to 54-byte strings for `bcrypt`, but still...). That removes all
sorts of stupid edge cases that come with variable-length strings.

~~~
Freaky
> Having the SHA-512 hash at the beginning simplifies the implementation

The hash is there to ensure very long passwords contribute entropy to the
final hash instead of being truncated. It also ensures the entropy is evenly
distributed - every bit of the password affects every bit of the hash.

> the "security" code only needs to handle 64-byte random strings

You can't feed any typical BCrypt implementation a raw SHA-512 hash because
it's not binary safe - it truncates at the first NULL byte.

Well, you can, and it'll appear to work, but it'll be laughably easy to break.
It's a pretty stupid sharp edge IMO.

> which are truncated to 54-byte strings for `bcrypt`

72 bytes, because that's the size of the key array. _56_ bytes is just where
extra entropy helps less, because the last 16 bytes don't affect every bit of
the output.

> That removes all sorts of stupid edge cases that come with variable-length
> strings.

It's just treated as a circular buffer. And what are we doing here,
implementing our own version of BCrypt? Yeah, that certainly simplifies things
:P
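
For readers unfamiliar with the "circular buffer" remark, a small sketch of
just that one step: the Blowfish key schedule mixes the key into 18 32-bit
subkeys (18 x 4 = 72 bytes), cycling back to the start of the key when it
runs out. This is not bcrypt or Blowfish, only the key cycling, and details
such as how real bcrypt implementations terminate the password are left out.
The function name is hypothetical.

```python
def blowfish_key_words(key: bytes, words: int = 18) -> list:
    """Cycle over `key` to build the 32-bit words mixed into the subkeys."""
    if not key:
        raise ValueError("key must not be empty")
    out = []
    position = 0
    for _word in range(words):
        value = 0
        for _byte in range(4):
            # Wrap around to the start of the key when it is exhausted.
            value = (value << 8) | key[position % len(key)]
            position += 1
        out.append(value)
    return out
```

A 4-byte key such as `b"abcd"` simply repeats 18 times, while any bytes past
the 72nd are never read, which is where the 72-byte limit comes from.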

------
nodesocket
So just to reiterate, taking the sha256 of the password before running bcrypt
on it is recommended? Funny, this is the first I've heard of this. You'd think
bcrypt would have just implemented the sha256 step into the algorithm?

------
allstate
Really surprised to see they are not using an HSM yet for the global pepper.
What kind of physical controls are in place for the global pepper currently?

------
jgalt212
pepper = sha256(global_randnum + user_start_dt + username + global_randnum)

the above creates a per user pepper, and largely obviates the need for pepper
rotation.

if only password db is stolen, all users are impervious to dictionary style
attacks.

if only codebase is stolen, then they only know how to calculate each pepper
per user.

if both are stolen, you are in no worse shape than just using a salted
bcrypt, scrypt, or PBKDF2.
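
The formula above, written out as a short sketch. The types and encodings
(bytes for the global secret, UTF-8 strings for the date and username) are
assumptions, since the comment does not specify them.

```python
import hashlib

def per_user_pepper(global_randnum: bytes, user_start_dt: str, username: str) -> bytes:
    """pepper = sha256(global_randnum + user_start_dt + username + global_randnum)"""
    material = (
        global_randnum
        + user_start_dt.encode("utf-8")
        + username.encode("utf-8")
        + global_randnum
    )
    return hashlib.sha256(material).digest()
```

Plain concatenation is kept to match the formula as posted. If the field
boundaries could be ambiguous, a delimiter or length prefix between the fields
would be needed.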

------
cstrat
I am wondering how they store OSX users' administrator passwords, since they
aren't being hashed - they actually store the password somewhere... it would
be nice if that were addressed somewhere.

See discussion:
[https://news.ycombinator.com/item?id=12457067](https://news.ycombinator.com/item?id=12457067)

~~~
gruez
I thought they created a setuid executable on first run (when they asked for
your password), and then use that later?

~~~
cstrat
Admittedly I don't know what that is and I don't know a lot about how
applications on OSX get permissions. From reading that thread I posted I was
under the impression that Dropbox stored the password because it was able to
reinstate itself as an accessibility service as many times as it liked without
having to ask for the admin password.

From reading, that wasn't supposed to be allowed. The only way that could work
would be if Dropbox kept your password on file. In effect meaning that the
dialogue you entered your admin password for wasn't a system modal - but
rather a dropbox modal imitating the system one.

~~~
danieldk
_I was under the impression that Dropbox stored the password because it was
able to reinstate itself as an accessibility service as many times as it liked
without having to ask for the admin password_

That is not necessary. A SUID binary owned by root runs with root's
privileges. So, they only need the administrator's password once to install
the SUID binaries. Afterwards, they have their own 'backdoor' to reinstate the
accessibility settings, without needing an administrator password.

So, when they deny storing your password, it's probably true, they don't need
it.

(If you are not convinced, write a small C program that executes a shell,
compile it, make root the owner, set the SUID bit. You can be in a root shell
without ever typing a password. This is why it is a good practice to have as
few root-owned SUID binaries as possible.)
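
Not the C experiment itself, but a hedged Python sketch for the closing
advice: a way to list root-owned SUID files in a single directory so they can
be reviewed. The function names are hypothetical, and the scan is
deliberately non-recursive.

```python
import os
import stat

def is_root_suid(st_mode: int, st_uid: int) -> bool:
    """True for a regular file owned by root (uid 0) with the setuid bit set."""
    return stat.S_ISREG(st_mode) and st_uid == 0 and bool(st_mode & stat.S_ISUID)

def find_root_suid(directory: str) -> list:
    """Return root-owned setuid files directly inside `directory`."""
    found = []
    with os.scandir(directory) as entries:
        for entry in entries:
            try:
                info = entry.stat(follow_symlinks=False)
            except OSError:
                continue  # entry vanished or is unreadable
            if is_root_suid(info.st_mode, info.st_uid):
                found.append(entry.path)
    return sorted(found)
```

Calling `find_root_suid("/usr/bin")` on a typical Unix system would list
programs of the kind described above; the exact result depends on the machine.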

------
Dowwie
I wonder why Dropbox didn't mention its robust support for 2nd factor
authentication?

------
aRationalMoose
Now if only their customer service had been 'quietly' improved over the years.

------
ashitlerferad
In 2016, shouldn't we be using public/private key pairs instead of passwords?

~~~
lighthazard
The problem is, in my opinion, that the average person will have no concept of
how to manage a public/private key setup.

------
tadelle
What is wrong with bcrypt with cost 13? It makes 2^13 = 8192 iterations on the
CPU...

------
coherentpony
May I ask a potentially dumb question? Why store my password at all?

~~~
adevine
How else are they supposed to verify your password when you attempt to log in?

~~~
username3
Email a link with token to log in.
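
A minimal sketch of how such a login token could work, under stated
assumptions: an in-memory dict stands in for a real datastore, the 15-minute
lifetime is arbitrary, sending the email is left out, and all names are
hypothetical.

```python
import hashlib
import secrets
import time

TOKEN_TTL_SECONDS = 15 * 60  # assumed lifetime

# Hypothetical store: sha256(token) -> (email, expiry timestamp).
pending = {}

def issue_token(email: str, now: float = None) -> str:
    """Create a single-use token; the caller would email it inside a link."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    key = hashlib.sha256(token.encode()).hexdigest()
    pending[key] = (email, now + TOKEN_TTL_SECONDS)
    return token

def redeem_token(token: str, now: float = None):
    """Return the email for a valid, unexpired token, else None."""
    now = time.time() if now is None else now
    key = hashlib.sha256(token.encode()).hexdigest()
    entry = pending.pop(key, None)  # pop makes the token single-use
    if entry is None:
        return None
    email, expires_at = entry
    return email if now <= expires_at else None
```

Only a hash of the token is stored, so a leaked copy of `pending` does not
hand out usable login links.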

~~~
Skunkleton
That is a good way to get endless token links spammed at your inbox...

~~~
username3
Gmail tokens tab.

------
eddd
Too little, too late. Once you lose trust, you never gain it again.

------
mtgx
So when is Dropbox going to allow users to encrypt files client-side before
getting synced to its servers? It should be relatively trivial from both a
technical point of view and a UX one.

------
awt
This "promise" model of security needs to end. I "promise" I'll encrypt your
password, honest.

~~~
vinylkey
Forgive my ignorance, but are there any companies that provide proof that they
encrypt users' passwords?

~~~
awt
No. This is impossible. The whole model is broken. A less broken model is the
use of asymmetric keys for authentication that doesn't require the service
provider to "promise" they'll keep your password secret.

~~~
Lucretiel
This model, of course, is broken in its own way, in that if a user loses their
private key, all their data is lost; there's no possible recourse or password
reset. It's also broken in that, if the company believes that a user's private
key was compromised by a third party, they can't completely destroy that key
until the user manually logs in and changes it.

~~~
awt
Well, one could always leave one's private key with a trusted third party. So
the public key model can accommodate those who are uncomfortable with the
responsibility of keeping their private key secure. The password model however
does not allow for allodial title to a secret.

------
ashitlerferad
Why won't people stop using passwords? It is 2016!

------
davedx
Is this before, or after, Condoleezza Rice uploads them to the NSA?

------
cypherpunks01
Isn't this the company that authenticated all production users without
checking passwords for a few hours, a couple years back?

