Why we are still using PBKDF2-SHA256 despite being aware of its limitations (github.com/bitwarden)
85 points by pldpld 2 days ago | hide | past | favorite | 94 comments

I think this misses the point: we should be doing everything we can to deprecate PBKDF2 because of the big differences between what a specialized attacker can do vs. the defender.

As a rough estimate: a $2k bitcoin miner can do 2^45 SHA-256 hashes/sec whereas your $2k laptop can do 2^16 hashes/sec; the attacker has ~a billion x advantage over you that can be multiplied based on their funding. At that point, doing even 10,000 PBKDF2 hashes may not make much of a difference.

Argon2, scrypt, and other memory-hard password hashing algorithms reduce the attacker's orders-of-magnitude advantage by requiring RAM. Attackers might be able to purchase RAM cheaper than the defender, but nothing close to a billion times cheaper.
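To make the RAM requirement concrete, here is a minimal sketch using scrypt from Python's standard library (Argon2 needs a third-party package, so scrypt stands in for it here). The password and the parameter choices are illustrative assumptions, not recommendations from the thread.

```python
import hashlib
import os

password = b"correct horse battery staple"  # illustrative only
salt = os.urandom(16)

# scrypt's working set is roughly 128 * n * r bytes.
# With n=2**14 and r=8 that is 16 MiB of RAM for every single guess.
n, r, p = 2**14, 8, 1
memory_bytes = 128 * n * r

key = hashlib.scrypt(
    password, salt=salt, n=n, r=r, p=p,
    maxmem=64 * 1024 * 1024,  # allow enough headroom for the working set
    dklen=32,
)
print(f"{memory_bytes // 2**20} MiB per guess, {len(key)}-byte key")
```

An attacker running many guesses in parallel has to provide that working set for each of them, which is the cost that a pure SHA-256 loop never imposes.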

Addressing concern #3 (want to have a password set on a laptop that decodes in a reasonable amount of time on a low-end smartphone), you could restrict the RAM to some small amount (like 256MB) if you anticipate needing to use a low-end device. This will still be a vast reduction in the attacker's advantage over PBKDF2.

I think the important point is that Bitcoin mining hardware is much more specialized than that. The really crazy fast ASICs are hardwired to do SHA-256 hashes of Bitcoin block headers only and just increment the Bitcoin nonce before repeating. They can't be repurposed to crack password hashes.

They may not be repurposable per se, but they are a great benchmark for what may be realistically achievable for cracking password hashes utilizing similar base algorithms.

One thing to keep in mind: a lot of the die real estate in modern processors is spent on doing tasks other than calculating hashes (a lot of it is caches https://www.servethehome.com/amd-epyc-7000-series-architectu... ).

Someone using an FPGA or ASIC can dedicate almost all of the die to creating lots of SHA-256 units.

AFAIK, the hardware in question here is custom-fabricated ASICs. If your adversary can custom-fab ASICs to crack your PW manager's PBKDF2-SHA256 hashes, there's no reason they couldn't make one for whatever harder algorithm you could come up with.

ASICs don't work well when a lot of RAM is required.

A $2k computer can do billions of hashes a second. 2^30

You're off by about 20 orders of magnitude (the joy of binary exponents).

Thanks for pointing out that I was off in my original estimate (a hasty web search showed the wrong cryptocurrency; almost nobody mines bitcoin on CPUs anymore :).

From what I can see, a more realistic estimate for a single core on a desktop is in the 2^26 range. Keep in mind that the PBKDF2 defender is single-threaded by design, whereas the attacker is not.

This still represents ~a half million x advantage for the attacker per $2k they spend. For $1m, they can guess passwords ~200 million times faster than you.

If we assume memory is the limiting factor for argon2, then even if a specialized attacker can use it at scale for 1/20th the cost, a 20x advantage is much better for the defender than a 500,000x advantage.
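A back-of-the-envelope check of those figures, as a sketch. It takes the thread's rough rates at face value (2^45 hashes/sec for a $2k miner, 2^26 for one desktop core); the constant and function names are made up for illustration.

```python
import math

# Rough figures quoted in the thread, not measurements.
ASIC_HASHES_PER_SEC = 2**45      # claimed rate for a ~$2k Bitcoin miner
CPU_CORE_HASHES_PER_SEC = 2**26  # revised single-core desktop estimate
RIG_COST_USD = 2_000

def attacker_advantage(budget_usd: int) -> float:
    """How many times faster than one defender core a given budget can guess."""
    rigs = budget_usd / RIG_COST_USD
    return rigs * ASIC_HASHES_PER_SEC / CPU_CORE_HASHES_PER_SEC

per_rig = attacker_advantage(2_000)
million = attacker_advantage(1_000_000)
print(f"{per_rig:,.0f}x per $2k rig (2^{math.log2(per_rig):.0f})")
print(f"{million:,.0f}x for $1m")
```

With these inputs the per-rig factor is 2^19 (about half a million) and the $1m factor lands in the low hundreds of millions, in line with the rough numbers above.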

I'm confused, 2^14 = 16384 is slightly more than 4 orders of magnitude, where did you get 20 from?

An "order of magnitude" has no precise numerical meaning, it depends on what base you're working in. If the log of the number in base b increases by 1, you've increased 1 order of magnitude with respect to that base. (I don't know why the parent said "20", my instinct is to say 14 orders of magnitude here.)

I've usually heard it being used to describe powers of 10. I'd be willing to accept 14 orders of magnitude since we're talking about powers of 2, but I still can't figure out where they got 20 from. Hence my question.
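For what it's worth, the gap being argued about (2^16 vs. 2^30 hashes/sec) can be checked in both bases. A tiny sketch; the variable names are just for illustration.

```python
import math

original_estimate = 2**16   # hashes/sec first claimed for a $2k laptop
corrected_estimate = 2**30  # "billions of hashes a second"

ratio = corrected_estimate // original_estimate
print(f"ratio: {ratio}")
print(f"orders of magnitude, base 10: {math.log10(ratio):.2f}")
print(f"orders of magnitude, base 2:  {math.log2(ratio):.0f}")
```

So the correction is a bit over 4 decimal orders of magnitude, or 14 binary ones.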

One billion is 14 orders. We're talking billionS, that's a fair bit more (pun intended).

Maybe they forgot that Windows calculator doesn't respect order of operations...

This is interesting because the OP claims that this is exactly the reason they use PBKDF, because there aren't "highly optimized implementations [of other crypto algorithms] for all platforms":

> What is crucial here is that we don't want to have the defender stuck with a slow implementation of something that the attacker will have a highly optimized implementation for.

After a programmer changed PBKDF2 to use SHA512 instead of SHA-1, I asked whether it made sense to use SHA256 instead for this very reason. With SHA extensions on Intel (pretty rare, incidentally), and the cryptography extension on ARM, I wondered if SHA256 (with more iterations) might put defenders at less of a disadvantage relative to attackers than SHA512.

My threat model isn't a directed attack, it's DB dumps with unhashed or unsalted passwords from random websites. I want to use a unique password on every site, and password managers provide a convenient way of doing that.

Even if every BW vault leaked, if it takes half a day to run through 8 a-zA-Z0-9, it's not practical to do that for every vault. On the other hand, if I'm being targeted, even increasing that to a month wouldn't really matter.

Every "critical" site I use also supports u2f 2fa, which I've turned on. So even if they got my passwords, there's the 2nd factor they don't have.

tl;dr: Just use a damn password manager. Even one that has arguable issues such as this improves the average person's security by orders of magnitude.

> Every "critical" site I use also supports u2f 2fa, which I've turned on.

What US bank do you use that supports U2F, or do you not include banking in "critical"?

Banks might not support U2F, but they pretty universally support some form of 2FA (I think certain things like PCI require it, but I'm not sure on the exact regs).

It might not be quite as good, but email 2FA behind U2F-protected email gets you pretty close.

Most support only SMS as security theater, which is worse than useless because it fosters a false sense of security.

My good bank (First Direct) gave me a physical OTP code generator, so I need a PIN and the physical device to get codes which are needed to log in. Once upon a time read-only activities like "Check balance" didn't need a code, but they got rid of that functionality because it's presumably a security risk with little benefit.

The same physical device constructs confirmation codes (proving I know the PIN) for specific inputs, like if I want to send money somewhere I've never sent it before or a much greater amount of money than usual.

However, unlike U2F or its modern successor WebAuthn, that's still in principle vulnerable to phishing: if thirstdirect.example pretends to be firstdirect.example and I don't notice, the codes I give to the wrong site work on the real one.

I don't know about the US but Barclays in the UK has had multi-factor authentication for years now. Is that not the case in the US?

If it is TOTP/HOTP based rather than U2F (6-digit codes), it is vulnerable to real-time spoofing.

U2F is a specific type of 2FA/MFA.

They are not congruent.

U2F is (as its full name "Universal 2nd Factor" would suggest) specifically only a second factor; it doesn't make sense as your first or only factor.

WebAuthn can replace the entire authentication, because it can perform multi-factor authentication locally and then send a claim to have done so, optionally backed by attestation from a vendor saying they promise the multi-factor authentication is done by their product. For example an iPhone can have one press sign-in to web sites or apps using this technology.

What do you use for 2FA tokens?

YubiKeys (4 I think? It's been a while since I bought them)

If I understood correctly, the points are:

- using longer passwords (or salts) is better than increasing the number of rounds

- having the same database on different devices (top-end CPU vs. older cellphone) impacts performance for the user but not for the attacker (who will use powerful hardware)

Seems fair, for the average user. And the top user will prefer a longer password anyway.

First sentence is really the problem in the modern era.

The best most people can remember as a password is some variation on common words and their date/place of birth.

Hence it doesn't matter what algorithm a database is using: a computer will crack most passwords very effectively, given common words and minimal rules.

The only solution to secure against cracking is to have way more complicated passwords (very long), but people can't remember them.

A lot of the sub-threads seem to have decided to talk about using PBKDF2-SHA256 as a password hash like crypt(), which actually isn't what the linked report is about. PBKDF2, as its name suggests, is a key derivation function. We can use these as password hashes, and people do, but we can also use them to turn any human memorable password (like "stonks", "S&SMBMBbc&wem" or "fzg76@PRU385!") into a nice 128-bit or 256-bit symmetric encryption key and that's what they're doing in tools like Bitwarden or 1Password.
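As a minimal sketch of that key-derivation role, here is PBKDF2-HMAC-SHA256 from Python's standard library turning a password into a 256-bit key. The function name, salt handling and iteration count are assumptions for illustration, not what Bitwarden or 1Password actually ship.

```python
import hashlib
import os

def derive_key(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Stretch a human-memorable password into a 32-byte symmetric key."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )

salt = os.urandom(16)  # stored alongside the ciphertext; it is not secret
key = derive_key("fzg76@PRU385!", salt)
print(len(key) * 8, "bit key:", key.hex())
```

The same password and salt always give the same key, so nothing but the salt and the iteration count needs to be stored.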

In this role, you very much have a practical option to just pick a decent password. "stonks" is not a good password, the second one is a Rihanna lyric ("... but chains and whips excite me") but the third one is a pretty obscure reference and it's neither likely that your adversary would "guess" it nor that the sort of brute force attacks envisioned would hit this random looking 13 character password.

Anyway, even as a password hash I've made the argument previously that stronger hashes only marginally improve things, far too many of your users will pick "stonks" or if you insist on eight characters maybe "stonks!!" and even if you have Argon2 tuned way up the attacker can reverse that because it's too obvious. If your users picked unique random passwords (as they might with tools like Bitwarden or 1Password, even though I personally use zx2c4's pass) then it doesn't matter if you use a terrible hash like some turn of the century PHP forum using MD5, because that's still safe with such passwords.

What's wrong with PBKDF2-SHA256? I use PBKDF2-HMAC-SHA256 for an almost-stateless password manager.

It's not memory-hard. That makes it much, much faster on GPUs than CPUs, which tends to mean it's much faster for attackers than legitimate users. It's also likely vulnerable to side-channel attacks, since nothing in its design tries to resist those. It's not broken by any means, and still vastly better than just a salted cryptographic hash, but it's not as good as Argon2id.

If I may ask a question: how would I use a memory-hard algorithm on a server if the server's RAM can't scale infinitely? It seems to me that a few concurrent user logins would effectively DoS the server for any reasonable configuration of argon2.

You don't actually need it to use that much RAM. It's more about memory bandwidth. An attacker can increase the memory bandwidth by using block RAM like in an FPGA or on-die RAM in an ASIC, but both of those are far more expensive than discrete DDR DRAM modules. So with appropriate memory usage (enough to saturate the cache lines) it can be cheap enough for the server to run yet still more expensive to build an ASIC platform for. Of course an ASIC can use DDR as well, but then it's stuck with the same memory bandwidth as the DDR module and gains no particular advantage.

Argon2id with t=3, m=92MiB, p=1 should be slower than bcrypt with cost 12 on most modern GPUs. That's not actually all that much memory, you can handle quite a few concurrent logins on a server with those settings. 10 concurrent logins per GiB of RAM dedicated to the task. And it should only DOS the login process, so it might make taking out your auth process (or server) easier but won't necessarily harm any other part of the site.
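The concurrency figure is just division of the RAM budget by the per-hash memory cost. A small sketch, with a hypothetical helper name:

```python
MIB = 2**20
GIB = 2**30

def max_concurrent_hashes(ram_budget_bytes: int, memory_per_hash_bytes: int) -> int:
    """How many memory-hard hashes fit in the RAM set aside for logins."""
    return ram_budget_bytes // memory_per_hash_bytes

argon2_memory = 92 * MIB  # the m=92MiB setting quoted above
for budget_gib in (1, 4, 16):
    slots = max_concurrent_hashes(budget_gib * GIB, argon2_memory)
    print(f"{budget_gib} GiB -> {slots} concurrent logins")
```

For 1 GiB this gives 11 slots, so "10 per GiB" is a reasonable round figure once you leave a little headroom.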

Thank you. I find a GiB for 10 concurrent logins still a bit too much for some use cases but you have given me a better understanding of the trade-off involved.

The advantage of algorithms like Argon2 is that they are way harder to implement scalably in hardware than SHA256 is. In other words, it's easier to build an ASIC that's 1000x as powerful at computing SHA256 hashes as a recent Intel CPU than to build something that's 1000x as powerful at computing Argon2 hashes. You might still be able to, but then its power usage is hopefully closer to 1000x that of the Intel CPU than the SHA256 ASIC's is.

Read the opening comment on the issue

It's reasonable to expect an attacker to have a SHA256 ASIC given Bitcoin making those a commodity

Potentially this can be offset if your server has https://en.wikipedia.org/wiki/Intel_SHA_extensions but if you're running on cell phones that's not there

See also https://bitcoin.stackexchange.com/questions/36253/what-minin...

SHA and MD are fast hashes. You want slow hashes for password security, like bcrypt and scrypt.

PBKDF2 is a slow algorithm specifically created for password hashing. It's in the same family as bcrypt and scrypt.

> PBKDF2 is a slow algorithm specifically created for password hashing.

Yeah, over 20 years ago. It's woefully out of date by modern standards.

PBKDF2 doesn't even attempt memory hardness, so there are whole classes of attacks on later generation slow hashing algorithms that don't even apply to PBKDF2 because of how old it is. Argon2 is extremely resistant to Time-Memory-Trade-Off (TMTO) attacks, which older algorithms like bcrypt and scrypt are vulnerable to.

PBKDF2 is essentially a linear slowdown, which is effectively pointless these days.

Math doesn't age. Cheers.

bcrypt and scrypt, the successors to PBKDF2, are both more than a decade old.

RSA is half a century old and it's still up to date by modern standards. In fact nobody has come up with anything better.

Edit: Actually, bcrypt might date back as far as 1999, possibly older than PBKDF2.

> RSA is half a century old and it's still up to date by modern standards. In fact nobody has came up with anything better.

RSA is currently a minefield of gotchas and few security companies even get it right. Just generating a good key is actually a very difficult task. It is also very computationally slow and has many practical issues for the level of security it provides.

There are many superior replacements in both pqc and elliptic spaces. I would take EdDSA over rsa-pss any day of the week.

RSA is extremely simple, it's just multiplications and powers. It can be reasonably explained to high school students. The tooling is mature and keys are trivial to generate safely with an openssl command.

EdDSA is another level entirely. I don't know how you can recommend elliptic curve cryptography with a straight face if you think RSA is hard.

P.S. It's a myth that EdDSA is faster. This depends on operation (signing vs verification) and key size.

Elliptic curves are also quite simple. Computing a public key boils down to point addition with a modulus.

The private key is a byte string and its quality depends only on the random generator. It’s trivially fast to generate 32 bytes of decent quality random numbers these days. There are many insecure RSA generation methods with weak criteria. Too many are fossilized in libraries and crypto cores. RSA also has half a dozen padding schemes and most are now considered weak or vulnerable.

EC is generally considered much stronger for a much smaller key size.

Note that we are talking about a single RSA operation. So attacks that require repeated trials are not relevant here.

> essentially a linear slowdown

Why does it have to be linear? Just use 1M iterations today, 2M iterations next year, 4M iterations the year after that, and you'll have an exponential version of it.

You can even take your 1M-iteration hashes from this year and run an additional 1M iterations on them to update them to 2M iterations when you want to upgrade.
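A sketch of that in-place upgrade, with a caveat: it works as described for a plain hash chain, but PBKDF2 itself XORs every intermediate HMAC output together and keys each round with the password, so a stored PBKDF2 output cannot simply be "continued" without the password. The helper name and the small round counts below are illustrative.

```python
import hashlib

def iterate_sha256(data: bytes, rounds: int) -> bytes:
    """Plain hash chain: apply SHA-256 `rounds` times."""
    for _ in range(rounds):
        data = hashlib.sha256(data).digest()
    return data

seed = b"per-user-salt" + b"password"
stored = iterate_sha256(seed, 1000)      # what the database holds today
upgraded = iterate_sha256(stored, 1000)  # upgrade without knowing the password
print(upgraded == iterate_sha256(seed, 2000))  # chain extends cleanly

# The same trick applied naively to PBKDF2 output does not reproduce
# a higher iteration count.
p1000 = hashlib.pbkdf2_hmac("sha256", b"password", b"per-user-salt", 1000)
p2000 = hashlib.pbkdf2_hmac("sha256", b"password", b"per-user-salt", 2000)
naive = hashlib.pbkdf2_hmac("sha256", p1000, b"per-user-salt", 1000)
print(naive == p2000)
```

For PBKDF2, the usual options are to rehash at the next successful login or to wrap the old output in a second KDF pass and record both parameter sets.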

Doesn't PBKDF2 fix that as long as you use a very large number of iterations?

In theory, absolutely.

In practice, bitwarden's server limits the max iteration count on a user account to something that remains insecure. They refuse to fix it.


What do you think about this reply on that thread? https://github.com/bitwarden/server/issues/589#issuecomment-...

For me it's not a big deal. If your password is Hunter22 then you have bigger problems than PBKDF. It doesn't matter if they PBKDF it 30 million times. Longer passwords are harder to crack, and I still don't understand why people aren't using passphrases as their passwords.

Having 2FA enabled on all my other accounts helps me sleep better in case BW or any other password manager gets compromised one day.

Use a longer and more complex master password. You're welcome.

You're the sixth person to reply to me with this "advice". My own password is 30 characters and I self-host bitwarden_rs, patched to permit a higher KDF iteration count.

This has nothing to do with my usage.

Is sharing your password length wise? Knowing the number of characters reduces the number of guesses needed to complete a brute-force attack.

255! vs 255! / (255 - 30)!

My math could be off though, I haven't worked with factorials since I was at university.

I don't think you want a factorial involved.

With unknown size, cracking 30 characters takes time proportional to n^30 + n^29 + n^28 etc.

Cracking just 30 is proportional to n^30.

The difference is negligible. A percent or two.
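The "percent or two" can be checked directly. A sketch with hypothetical function names, using n = 62 (a-zA-Z0-9) and n = 95 (printable ASCII) as example alphabets:

```python
def guesses_exact_length(n: int, length: int) -> int:
    """Candidates when the attacker knows the password is exactly `length` long."""
    return n ** length

def guesses_up_to_length(n: int, length: int) -> int:
    """Candidates when the attacker tries every length from 1 to `length`."""
    return sum(n ** k for k in range(1, length + 1))

for n in (62, 95):
    ratio = guesses_up_to_length(n, 30) / guesses_exact_length(n, 30)
    print(f"alphabet {n}: unknown length costs {100 * (ratio - 1):.2f}% more")
```

The ratio is about n / (n - 1), because the shorter lengths form a geometric series dominated by its last term.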

My bad, I was thinking of permutations, but those do not allow repeated entries. It makes sense now; like you said, the difference is negligible.

It's a trick. OP's password is actually 29 chars long, but the attacker will now start at 30 characters, and never brute force the actual password. Nicely played.

Evidently there is no problem then.

The model of hashing user plaintext passwords on the server side is itself deeply flawed. The user's password and master keys derived from it should NEVER leave the user-controlled computer. For authentication, password-authenticated key agreement protocols [0] should be used; anything else means that the service does not treat user security as a high enough priority.

[0]: https://en.wikipedia.org/wiki/Password-authenticated_key_agr...

PAKE still requires storing a verifier on the server, which is basically a password hash. There's no significant difference in security between PAKE and the home-grown implementation that Bitwarden uses with respect to password hash leaks from the server database. See my comment here: https://news.ycombinator.com/item?id=25522361

As for Bitwarden's implementation: it doesn't send the password to the server, it sends, basically, a PBKDF2 hash, which is different from the one used for encryption. The leaked hash can't be used to decrypt the database unless it's bruteforced. However, the protocol is not ideal, there's a weakness that I wrote about here: https://dchest.com/2020/05/25/improving-storage-of-password-...

AFAIK, 1Password uses SRP with PBKDF2 for verifier.

>PAKE still require storing verifier on the server, which is basically a password hash.

No, there is an important difference: leaking this verifier does not let an attacker impersonate the user at will. See my other message [0].

Is my understanding correct that you derive two keys from user password, one used for authentication and one for decrypting encrypted content which does not leave the user's computer? In that case, yes, it's somewhat better than the typical scenario, though I personally would still prefer if a proper PAKE was used for authentication. It may not apply to your service, but leaking encrypted data can still result in exposing certain meta-information, which may be important, so it's better to be extra-safe in such matters.

[0]: https://news.ycombinator.com/item?id=26230200

(Not my service, I'm just a random commenter)

The same happens with PAKE during registration, where the user will need to provide the verifier. Since everything happens over TLS anyway, and hopefully, with pinning (in apps), this is not a huge concern.

I haven't checked Bitwarden, but even though 1Password uses SRP, the initial registration happens in a browser. (PAKE inside a browser with JavaScript is even more useless.) Their protocol though uses a strong key in addition to password, making password bruteforcing from the verifier impossible.

I'm not against PAKE — the biggest benefit is that you don't have to create your own protocol and make mistakes. What I'm saying is that for such use cases its security benefits compared to other protocols are negligible.

>The same happens with PAKE during registration, where the user will need to provide the verifier.

In a typical PAKE, the generated challenge depends on random values generated by both the server and the user, so if at least one of them is not controlled by an attacker, the generated challenge will be different each time. So leaking the verifier or eavesdropping on previous logins does not help an attacker impersonate the user in any way.

Right, the verifier can't be used to impersonate the user, but can be used to verify password guesses offline.

Encryption and decryption are done locally. The server has no clue what your master-key is; all it knows is: here's a blob of data you need to store, and here's my "ID" (the secondary key, which is different from the encryption key and also relies on the master-key and other parameters).

When retrieving the vault, the program generates the secondary key and provides it to the server (you can add 2FA, btw). The server sees that it indeed matches one of the IDs and sends you back the encrypted vault. Decryption is done locally.

I don't see any issue with this process. Maybe you'd care to point out the attack vector or scenario?

Leaking the secondary key will allow an attacker to impersonate the user at any time and to continuously retrieve the encrypted vault. As I've said, it may not be important for password storage, but in more sophisticated applications continuous access to an encrypted vault may leak important meta-information (e.g. number of files, their size, time of creation, etc.).

As I've mentioned, when retrieving the vault other parameters need to match like Account name, Email etc.

You can add 2FA to the authentication process.

All metadata is obviously encrypted; the data is just a single blob. You could maybe try to guess the number of entries based on size, but that's too dynamic as well.

These products have been attacked before, and decrypting the vault wasn't how they were attacked.

It was through bugs and vulnerabilities in the browser extensions that led to data leaks. Something else entirely.

I think that's the problem here: Bitwarden hashes the passwords on the client side but it runs on various client devices, some more powerful than others, and others not able to run efficient Argon2 implementations.

Ah, I thought it's a JS library used on a server side. But am I correct that they still pass the derived master-key to server side in a plain form?

Citing the OP:

>Sure, you might tolerate a longer unlock time, but is the security gain really worth the cost to your battery?

I think the battery concern is overblown. How often do you log in to a service? I think that for typical use cases, the amortized battery cost of a login is negligible. And for other use cases you can let users choose.

> But am I correct that they still pass the derived master-key to server side in a plain form?

No. Master key doesn't leave client machine, only its hash is transmitted over the network. See dchest's link above.

These cloud based solutions perform hashing+salting+KDFing locally and then add additional compute cost on the cloud level.

I don't see any issues with that.

The issue is that if an attacker can eavesdrop on a compromised authentication server, they can record the user's master-key and thus impersonate the user without issues until the password gets changed (i.e. the derived master-password actually plays the role of a password now). With PAKE this issue simply does not exist.

No, no. The master-key never leaves the device.

From the master-key, the program derives (locally) the key to encrypt the data and a different secondary key for authentication against the server. Without knowing the master-key you can't decrypt the vault, even if you were able to trick the server into sending you the vault.

The vault is decrypted locally
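The two-key split described here can be sketched roughly as follows. This is only loosely modeled on what the thread describes; the choice of salt (email), the single extra round, and the function name are assumptions, not a statement of Bitwarden's exact protocol.

```python
import hashlib

def derive_keys(master_password: str, email: str, iterations: int = 100_000):
    """Return (encryption_key, auth_hash); only auth_hash is sent to the server."""
    pw = master_password.encode("utf-8")
    # Slow step: stretch the password into the key that encrypts the vault.
    encryption_key = hashlib.pbkdf2_hmac(
        "sha256", pw, email.lower().encode("utf-8"), iterations, dklen=32
    )
    # Cheap one-way step: a distinct value for authentication. The server
    # cannot walk back from it to encryption_key without guessing the password.
    auth_hash = hashlib.pbkdf2_hmac("sha256", encryption_key, pw, 1, dklen=32)
    return encryption_key, auth_hash

enc_key, auth = derive_keys("fzg76@PRU385!", "user@example.com")
print("stays on device:", enc_key.hex())
print("sent to server: ", auth.hex())
```

As the replies point out, whoever captures the value sent to the server can still replay it to fetch the encrypted vault, which is the gap a PAKE closes.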

Deriving two separate keys does indeed improve the situation, but it's still not ideal, see: https://news.ycombinator.com/item?id=26230259

> some more powerful than others, and others not able to run efficient Argon2 implementations.

Like what? It looks like personally my phone is 4x slower than my desktop. So if I calibrate for one or two seconds on my phone, the security should be pretty acceptable. Do I need to worry about devices much slower than that?
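Calibrating on the slowest device you care about could look something like this sketch. It uses PBKDF2 from the standard library as a stand-in (the comment is about Argon2, where you would tune time and memory cost instead); the function name, probe size and target are assumptions.

```python
import hashlib
import time

def calibrate_iterations(target_seconds: float, probe_iterations: int = 50_000) -> int:
    """Estimate an iteration count that takes about target_seconds on this device."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"benchmark", b"0123456789abcdef", probe_iterations)
    elapsed = time.perf_counter() - start
    # PBKDF2 cost is linear in the iteration count, so scale the probe result.
    scaled = int(probe_iterations * target_seconds / elapsed)
    return max(probe_iterations, scaled)  # never go below the probe value

print("iterations for ~1s on this device:", calibrate_iterations(1.0))
```

Run on the phone, the result is the count every other device has to use too, so a desktop that is 4x faster simply unlocks in about a quarter of the time.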

Try srspass.com which runs argon2id in the browser. Takes about 3 seconds on my ARM phones for the unlock, which uses quite heavy argon2id parameters, above the recommended memory and iterations.

I don't think a 3s wait per session, in exchange for greater security and on an unoptimized device at that, is going to break the UX.

Then make it a user's choice?

How? Seriously interested. This is cross-platform software. How do you change that from PBKDF2-SHA256 to Argon2 in a way that still lets you use your desktop PC as well as your 4-year-old budget Android device? And then support user configuration on top of that? Same goes for raising iterations.

The submitted github comment also makes that point, this is actually hard to do.

Offer the user the choice for other solutions with a performance penalty?

Choice is always better. People who care or worry can change to something more resistant to cracking.

As someone who worked in this field: offering "options" when the vast majority of the user base doesn't even understand that their data is encrypted is often a poor approach to take.

Users will even forget their Master Passwords, and because they forgot them they will believe they've been "hacked" and blame you.

Users on Hacker News and similar sites where users actually understand the underlying technology to some degree are the exception, not the norm. Adding options does not help a vast majority of the user base and it complicates your codebase further. Imagine making that change and less than 1% of your users actually use that feature?

It's a poor approach for multiple reasons. Like being the origin of downgrade attacks.

If the default is the downgrade, what have you lost if there's no upgrade path in the first place? Nothing!

There's no issue with that. Lay users won't modify these settings, while advanced users will be warned.

You're thinking wrong.

Agreed, defaults are the norm and then the advanced settings are layered. Even so, explaining and showing the differences visually or with video is possible. I don't agree with treating people as babies when it comes to tech.

Yes. Part of the value prop of these services is they have experts who are better qualified to make these decisions than I am.

I completely agree on server-side derivation being flawed, which is why I made SrsPass. It derives child passwords for you client-side that you can use across accounts, and ensures that even if you end up making a password on a shitty site that stores your credentials in plaintext, it won't compromise your master key, as it's 128-bit salted.

PBKDF2 itself is not the best, for the same reasons SHA256 was listed. I do agree that Argon2 is better.

PBKDF2 and SHA256 are fine for all use cases and have libraries available in all languages.

argon2 has nothing better to offer. In practice there are 3 Argon2 variants to choose from and they all require careful configuration. It's pretty hard to get started with, assuming you can find libraries for it in the first place; last I checked it wasn't commonly supported.

It's a perfect example of theory versus practice. Argon is a researcher's wet dream, ideal by some algorithmic definitions, yet it has no benefits in practice.

Are you serious? Even something that would be considered "bad" argon2 set-up is far better than anything that is based on SHA256.

Modern GPUs and ASICs can perform millions of SHA operations per second, even with a poorly configured Argon2, you reduce that massively.

You can't compare plain SHA256 with PBKDF2. PBKDF2 can take a million SHA operations to hash one password, if you configure it to (the default is somewhere between 10k and 1M).

If you were to leak your company database with 1 million customers and hashed passwords, there are some theoretical considerations to be made about resistance to GPU and ASIC cracking, but practically you're in a pretty bad place whichever algorithm was used. ^^

P.S. Cryptography would have more weight if half the passwords weren't a variation of password2021 and hunter22.

> You can't compare plain SHA256 with PBKDF2.

But you can. It’s literally just N times the hash. Typically the number of iterations is chosen to be somewhat slow on the server that derives it. But a specially designed rig can execute this with extreme parallelism and speed.
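To see how thin the construction is, here is a sketch of one PBKDF2-HMAC-SHA256 output block written by hand and checked against the standard library. Strictly, each iteration is one HMAC (a couple of SHA-256 invocations) rather than one bare hash; the function name is made up for illustration.

```python
import hashlib
import hmac
import struct

def pbkdf2_sha256_block(password: bytes, salt: bytes, iterations: int) -> bytes:
    """First 32-byte output block of PBKDF2-HMAC-SHA256, written out by hand."""
    # U_1 = HMAC(password, salt || INT(1)); the block index is a big-endian uint32.
    u = hmac.new(password, salt + struct.pack(">I", 1), hashlib.sha256).digest()
    acc = int.from_bytes(u, "big")
    # U_j = HMAC(password, U_{j-1}); the output is the XOR of every U_j.
    for _ in range(iterations - 1):
        u = hmac.new(password, u, hashlib.sha256).digest()
        acc ^= int.from_bytes(u, "big")
    return acc.to_bytes(32, "big")

mine = pbkdf2_sha256_block(b"password", b"salt", 1000)
reference = hashlib.pbkdf2_hmac("sha256", b"password", b"salt", 1000)
print(mine == reference)
```

Nothing in the loop needs more than a few hundred bytes of state, which is why it maps so well onto massively parallel hardware.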

User-defined passwords are a bad choice for an encryption KDF. I'm not disagreeing with sibling xxpor's tl;dr that worse is better here; getting everyone on password managers would be a huge net win for personal security. But debating the choice of KDF is missing the elephant in the room: that user passwords simply lack enough entropy for use as secure key material. It's like debating which kind of paper you should use to repair a huge hole in your bunker with papier-mâché.

That said, asking the user to manually type full-size generated keys between devices is simply a nonstarter.

But what if the user stored their passwords in a private Matrix room? Matrix's solution to sharing encryption keys between devices is to be the communication channel by which users approve new devices from existing devices; upon approval the room's encryption key is sent to the new device encrypted using the new device's public key. That is, the room key can only ever be seen by the devices themselves. (I think this is a reasonable summary of encryption in Matrix, please correct me if I'm mistaken.) This is basically using a Matrix room as a general distributed, encrypted data store. Thoughts?

If it's not broke, don't fix it. The underlying hasher is most important anyway. Crappy passwords will always be susceptible to rainbow attacks.

No, not with salt and a high enough iteration count, they won't be.

I think gp is a bit off - weak passwords will be vulnerable to a dictionary attack - even if reasonably salted. But won't be vulnerable to rainbow attacks in any meaningful sense (assuming sensible salt/hash/iterations).
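A sketch of that distinction: the per-user salt makes precomputed (rainbow) tables useless, but an attacker holding the leaked salt and hash can still try a wordlist directly. The wordlist, iteration count and function names here are illustrative assumptions.

```python
import hashlib
import hmac
import os
from typing import Iterable, Optional

ITERATIONS = 1_000  # kept low so the demo runs quickly

def hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

def dictionary_attack(target: bytes, salt: bytes, wordlist: Iterable[str]) -> Optional[str]:
    """Try each candidate with the leaked salt; return the match, if any."""
    for candidate in wordlist:
        if hmac.compare_digest(hash_password(candidate, salt), target):
            return candidate
    return None

wordlist = ["123456", "password2021", "hunter22", "stonks", "letmein"]

weak_salt, strong_salt = os.urandom(16), os.urandom(16)
weak_hash = hash_password("hunter22", weak_salt)
strong_hash = hash_password("fzg76@PRU385!", strong_salt)

print(dictionary_attack(weak_hash, weak_salt, wordlist))      # found in the list
print(dictionary_attack(strong_hash, strong_salt, wordlist))  # not in the list
```

The salt and iteration count only raise the cost per guess; they do nothing for a password that sits near the top of every wordlist.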
