tl;dr: Hardcode a second salt in your application code or in an environment variable. Then a database dump alone is no longer enough to mount any kind of brute-force attack.
It's simple, free and you can retroactively apply it.
EDIT: I addressed some of the points raised in this thread here https://blog.filippo.io/salt-and-pepper/#editedtoaddanoteonr...
Your pepper will be a long, random key that is known to your app server but not your database server. If you store passwords as:
bcrypt(bcrypt(password, salt), pepper)
If you store passwords instead as:
encrypt(bcrypt(password, salt), pepper)
The rest of it is more of a qualitative question -- what's the risk that someone gains access to your DB but not your app server, vs. the risk that in implementing the pepper, you somehow screw up and store something easily crackable?
And, no, I am not being facetious by saying nobody knows how to do that. I am being quite literal. Have you ever done that? Do you know how? Do you even know what you would google to figure out how?
I've yet to see my library of choice document how to use an HSM. How do you do that in, e.g., PHP with MySQL? MVC with MS SQL? Java with Oracle?
Edit: There's even a project to literally do this directly from PHP
You don't get your own HSM but it's MUCH cheaper ($1/key/month) and more scalable and available than an HSM.
I have not used either myself, but I would imagine the documentation is quite good.
The question is irrelevant anyway, because if someone gets access to your app server, they get the secret in both cases (pepper and encryption). So the first risk there is present in both cases, which just makes this a static "what's the risk that you screw up in implementing the pepper?"
So yes, S+P protection won't save you if the app server is completely compromised, but it does protect you in the case where only the DB is.
And the ability to S+P seems pretty simple to implement and document. Why is everyone panicking that it's hard to do?
$2a means it's using bcrypt version 2a
$12 means a cost factor of 12 (i.e. 2^12 key-expansion iterations)
GhvMmNVjRW29ulnudl.Lbu is the salt
AnUtN/LRfe1JsBm1Xu6LE3059z5Tr8m is the checksum
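That breakdown can be checked mechanically. A small sketch (the `parse_bcrypt` helper is just for illustration; the field widths are fixed by the bcrypt modular-crypt format: 22 base64 characters of salt, 31 of checksum):

```python
def parse_bcrypt(h: str):
    """Split a bcrypt string: $2a$12$<22-char salt><31-char checksum>."""
    _, version, cost, rest = h.split("$")
    return version, int(cost), rest[:22], rest[22:]

sample = "$2a$12$GhvMmNVjRW29ulnudl.LbuAnUtN/LRfe1JsBm1Xu6LE3059z5Tr8m"
version, cost, salt, checksum = parse_bcrypt(sample)
# version == "2a", cost == 12 (2^12 iterations)
# salt == "GhvMmNVjRW29ulnudl.Lbu"
# checksum == "AnUtN/LRfe1JsBm1Xu6LE3059z5Tr8m"
```

Note that the salt sits in cleartext right inside the stored string, which is by design: bcrypt treats it as public.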
But upon reflection I believe you were instead using the term "pepper" to include the encryption approach as well and merely trying to question whether the added security of requiring an app server compromise is worth the risk that you still screw it up somehow. And to that I'd say that it's not difficult to apply an existing block cipher algorithm when storing/retrieving password hashes so I think the risk there is low.
> Why is everyone panicking that it's hard to do?
Everybody's panicking because the originally proposed pepper implementation is a really bad idea. That approach has not been researched for security implications, and there are many reasons to believe that composing hash operations without using a specially-defined operation like HMAC is bad, and adding static bits to the salt or password (e.g. if you use 'pepper.salt' for the scrypt salt or 'pepper.password' for the password) is also bad.
However, I believe the approach of using a block cipher to encrypt your hashes with an app-wide password is reasonable. It's not composing operations badly (encrypting a 256-bit string, or whatever scrypt emits, is perfectly reasonable given a secure key) or otherwise providing an attack vector on the hash algorithm.
The biggest risk I can see with this approach is you have to make sure the pepper is stored securely on your app server, never visible to the database server (and never accidentally committed to an open-source repo, if you're using an open-source server). Not just that, but you have to make sure that you don't accidentally lose it either (or you'll have instantly lost all your accounts). But this is a solvable problem.
1) You should not invent your own algorithm. That's a given. That's why you use bcrypt/scrypt.
2) It's not abusing the algorithm, it's using a longer salt (in the concatenation case).
3) There's nothing wrong with nesting algorithms (just remember to use hex/base64 encodings, not binary). For example Facebook passes passwords through half a dozen algorithms. They call it the "onion". And it includes a pepper.
4) As for being effective, I think the SQL injection case speaks for itself.
5) As for rotation - just don't do it. Your pepper gets compromised? Who cares, add a new one on top of the old one.
Also, I'm confused at how the proposed alternative would be harder to get wrong:
> Encrypt The Output Hash Prior To Storage
salt = urandom(16)
pepper = "oFMLjbFr2Bb3XR)aKKst@kBF}tHD9q"
# or, getenv('PEPPER')
hashed_password = scrypt(password, salt + pepper)
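For reference, the quoted scheme is directly runnable with Python's stdlib `hashlib.scrypt`. The scrypt parameters and the inline pepper here are illustrative, not recommendations (in practice you'd pull the pepper from the environment):

```python
import hashlib
import os

salt = os.urandom(16)
pepper = b"oFMLjbFr2Bb3XR)aKKst@kBF}tHD9q"  # in practice: os.environ["PEPPER"]

# scrypt's salt parameter simply receives the concatenation
hashed_password = hashlib.scrypt(b"correct horse", salt=salt + pepper,
                                 n=2**14, r=8, p=1)
```

To verify a login you recompute with the stored salt plus the app-side pepper and compare.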
The idea of not using key rotation alone is insane, but let's just focus on your last point:
> Also, I'm confused at how the proposed alternative would be harder to get wrong
Outside of peer review, what reason would anyone have to use the pepper scheme? As others have posted, there are several community members whose opinions do matter, due to their extensive research and body of work.
For password hashing purposes, salt doesn't need to be uniformly random, the only requirement for salt is to be unique and unpredictable to the attacker (see http://crypto.stackexchange.com/questions/6119/hashing-passw...). Most password hashing functions use a cryptographic hash on salt.
This particular function, scrypt, uses one-round PBKDF2-HMAC-SHA256 to mix password and salt:
PBKDF2 feeds salt, basically, into SHA256:
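Concretely (a stdlib sketch): with one iteration and a single output block, PBKDF2-HMAC-SHA256 reduces by definition to one HMAC-SHA256 call over the salt plus a 4-byte big-endian block counter:

```python
import hashlib
import hmac

password, salt = b"hunter2", b"some salt"

one_round = hashlib.pbkdf2_hmac("sha256", password, salt, iterations=1)
# Equivalent by PBKDF2's definition: HMAC(password, salt || INT_32_BE(1))
manual = hmac.new(password, salt + (1).to_bytes(4, "big"), hashlib.sha256).digest()
assert one_round == manual
```

So whatever bytes you put in the salt position get mixed through HMAC-SHA256 before scrypt's memory-hard stage ever sees them.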
But PBKDFs like bcrypt and scrypt are not designed to keep the salt parameter secret; in fact they assume the attacker knows the salt. And so if they happen to reveal the salt to the attacker, this is not considered a bug in the algorithm and won't have been flagged or fixed by cryptographers.
XOR could also result in NULL bytes anywhere in the hash input, which could drastically weaken passwords. For example, bcrypt ignores any password characters after the first NULL byte. This is especially bad if the attacker can supply their own passwords, doubly so if they can observe the output, since they can then easily brute-force individual bytes of the secret and use that knowledge to intentionally create NULL bytes in the hash input.
> XOR doesn't decrease the keyspace (or change it in any interesting way), so any attack on XOR is an attack on bcrypt
I wouldn't make any statement like this unless you've actually gone through the steps to prove it.
Can you be kind enough to explain with a very simple example?
I would appreciate understanding your point - Thank you
(It's also possible to brute-force more than one byte at a time; it would just take longer. For example if the shortest password the attacker can observe is 6 bytes then they would need to try 2^48 possibilities.)
The real downside is that there's a better, proven way to do the same effective thing, which is make a database-only compromise require additional work, without rolling your own crypto. It also supports doing things retroactively for real (not some of the hacks being discussed in this thread) and key-rotation. All the upsides, with none of the downsides.
Please do not ever consider "rolling your own crypto" a walk in the park. Unless you have a serious security background and some actual cryptography education and research never, never, NEVER do this. It is not negligible, and it is not safe.
scrypt(scrypt(scrypt(scrypt(password, salt), pepper2013), pepper2014), pepper2015)
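The layering idea in that line (and the earlier point about hex/base64-encoding between layers) can be sketched with stdlib scrypt. Names and parameters are illustrative; note also that this follows the quoted construction in passing each pepper in the salt position, which other comments in this thread dispute:

```python
import hashlib

def layer(inner_hex, pepper):
    # hex-encode between layers so no raw bytes (e.g. NULs) hit the next call
    return hashlib.scrypt(inner_hex.encode(), salt=pepper,
                          n=2**14, r=8, p=1).hex()

def hash_password(password, salt, peppers):
    h = hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1).hex()
    for pepper in peppers:   # pepper compromised? add another layer on top
        h = layer(h, pepper)
    return h
```

Adding `pepper2016` later just means one more `layer()` pass over every stored hash, no password required.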
The best article I've seen against this technique is by ircmaxell. It's nicely summed up in this sentence: "It is far better to use standard, proven algorithms than to create your own to incorporate a pepper."
Anyone have source material (academic paper, Bruce "The Crypto God" Schneier blog post) that sheds some light on peppering passwords?
I'd be much more interested in how many iterations of bcrypt Slack were using. That has a much bigger bearing on events for me. Anyone at Slack know/want to answer that question?
Yes, many times a dedicated attacker who has read access to your database will also have read access to your source code or config files, but many times they won't. And if they don't, then they won't be able to crack a single one of your passwords, while even with a modern and proper hashing algorithm they still may be able to crack passwords.
Take the scenario of a relatively intelligent hacking or hacktivist group, of which there've been several in the past 5 years. Let's say they're targeting someone they dislike for whatever reason, and find out that person is registered on some forum and decide to compromise the forum. (This tactic of lifting a whole haystack to find a single needle is very common for motivated attackers.) They don't care about any of the other users, they just want to try and crack the hash of one single member and have a full GPU cluster with which to do it. They're also willing to spend weeks trying to crack that one hash.
If the user's password isn't particularly strong, it's going to fall no matter what algorithm they used.
But if the forum is peppering all of their hashes, and those same attackers can only manage to gain access to the forum's database and not its local filesystem, then their chance of cracking that password goes to 0.
This scenario is a bit contrived because odds are motivated and intelligent attackers like these will end up gaining access to the filesystem and reading the pepper with enough time and effort, but the pepper is still an additional defense and means SQL injection alone won't be enough to crack passwords.
From all the advice I've read, security and crypto don't work like that. The assumption is the other way around: a "properly implemented, simple pepper" must be assumed to hurt password security until proven otherwise by rigorous testing and analysis.
Time and time again we read stories of a tiny implementation detail that created a sly and subtle vulnerability that somehow leaks information about the original plaintext by interrogating the ciphertext.
bcrypt with a large work factor and a per-user salt is a PROVEN method to prevent attackers from learning the plaintext. Until I see evidence from a trusted cryptanalyst, I'm not going to roll my own by adding in a pepper they didn't plan on being there.
EDIT: sorry, let me make my point a little clearer. In the event that the hacker can access the filesystem or memory -- wherever you store your pepper -- could the hacker use the pepper and an implementation detail in the peppering technique to learn information about the plaintext or the salt? This question is what needs to be answered by qualified cryptanalysts before developers start using peppers widespread, in my opinion.
Despite hashes being cryptographic primitives, user password hashing is less about cryptographic principles (preventing first and second preimage attacks) and more about increasing the amount of work an attacker must muster to find an input which hashes to the hash value.
Attributes like collision resistance generally mean almost nothing when on the scale of strings under 100 characters in length, which most user passwords are. Practically, you are never going to run into collision issues if you're using MD5 or later. Your goal is merely to increase the amount of CPU time it takes to find a hash's original input.
Because of this, even if AES encrypting a hash with a random pepper somehow reduced the collision resistance of a hash (I'm 99.8% sure it doesn't), it wouldn't at all affect the speed at which the hash is cracked.
>a sly and subtle vulnerability that somehow leaks information about original plain text by interrogating the cipher text.
Hashes are not ciphertext. For all intents and purposes they can be viewed as CSPRNG output. There is nothing you could do to them to leak info about the plaintext as long as your hash algorithm isn't pathological. There are things you could do to reduce collision resistance, but I addressed that above.
Password protection in web apps is not about encryption or decryption.
Well, yes. But what is the definition of "properly"? There are definitely constructions of "pepper" that look simple, but drastically hurt overall security:
bcrypt(hmac(password, key), salt)
It's sort of like the difference between birth control and counting based contraceptive methods (Standard Days Method). Executed perfectly, they are equally as effective. But with a slight error, one stays roughly as effective (losing maybe 5 to 10% effectiveness overall) while the other drops drastically (down to 10 to 20% effectiveness).
Considering using encryption is as effective as using a pepper, and it's less prone to weakening the core password hash, I suggest using encryption instead of peppers.
I would consider using the raw byte-output version of a function a very blatant example of "improper implementation".
Also, I agree regarding encryption. In my example I was actually referring to the random AES key as a pepper, even though it'd probably be better called an "application secret".
1. Hacker steals db but does not compromise web servers (because the hmac pepper key lives on the web servers and not in the db)
2. Hacker can run SQL Injection via web server, but cannot otherwise access web server memory/process
3. HMAC key is stored in a hardware security module and hacker cannot gain physical access
Otherwise such advice should be ignored.
> Anyone can invent a security system that he himself cannot break. I've said this so often that Cory Doctorow has named it "Schneier's Law": When someone hands you a security system and says, "I believe this is secure," the first thing you have to ask is, "Who the hell are you?" Show me what you've broken to demonstrate that your assertion of the system's security means something.
I'm not advocating for shutting-down anyone's discussion. What I, and probably others, are advocating for, is only putting in trust in proven crypto.
The parent comment to all of this basically has the form: "Wow, you could totally solve your password hashing (that isn't broken) by using this scheme I came up with. No one else has looked at it, but boy, it looks difficult to crack to me."
There's a complete difference between that comment and many of the others ("how about if we do X"), and I believe the second is completely valid for discussion, as long as you include relevant experts in that discussion.
Joe and Bob talking about encryption isn't very useful unless Joe and/or Bob are trained in cryptography and have experience applying that crypto in the real world.
Down-vote all you want, but that's my view-point at least.
It's "cite your sources."
To insist on something novel, yes you want a source.
The upshot is that if you get a db dump and are able to brute force some bcrypt hashes, you won't know what usernames they go with. If you get a db dump and our source code, you're still out of luck. If you get ahold of an old server hard drive, you're out of luck. If you root a running server and inspect the process memory, you can obtain this key.
This scheme also allows the mapping key to be rolled, which would immediately invalidate all passwords in the system.
*we also version our password hash records so we can migrate from bcrypt to a new scheme fairly painlessly if it's warranted in the future.
As an attacker, I get SQL access to your DB (meaning no access to the encryption key). I then download the user names, and the hashes. I then attack the hashes offline. I recover only the weakest few percent (since you're using bcrypt). But since the weakest few are those most likely to be re-used (both by different users and by a single user across sites), they are going to be both more valuable to me and easier for the next steps:
Then, I take the highest frequency passwords and the user table, and I start validating them online in your system. Now if I do that too quickly, you'll notice and I'll be shut down. And if I do that all from the same IP, I'll be shut down.
But what if I had a botnet that I could distribute the load across? What if I kept my request rate small enough to stay under the radar of even a moderate-scale system?
I would expect to start seeing meaningful results within days.
If you had 1000 users, then I could surmise that you don't have much traffic, and hence keep the request rate down to perhaps 100 per day. In 10 days I'd have at least a few u/p combinations that I know for a fact worked.
If you had 1000000 users, I could ramp it up quite a bit higher, to perhaps 1000 or 10000 per day.
And since they all came from separate IP addresses, it could be rather difficult for you to tell an attack was going on unless you were looking specifically for it.
Does that mean you should stop immediately? No. It's not that bad of a scheme. But be aware that it doesn't give you (or your users) the level of protection that it may look like on the surface.
The #1 password out of 3.3 million was 123456, which was used 20,000 times.
So extrapolating that for your 2 million hashes, we'd expect the top password to appear roughly 12,000 times.
Running those numbers, we'd expect each guess to have a 12000/2000000 (0.6%) chance of matching. Or more specifically, a 1988000/2000000 chance of not matching.
With some quick running of those numbers, we'd expect a 50% chance of finding a match after trying just 115 random usernames.
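That estimate checks out arithmetically; a quick sketch using the thread's numbers:

```python
import math

p_match = 12000 / 2000000  # chance a random user has the top password
# guesses needed for a 50% chance of at least one hit:
n = math.log(0.5) / math.log(1 - p_match)
print(round(n))  # → 115
```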
I'm not saying it isn't an interesting approach, I just don't think it's nearly as effective as if you encrypt the hash directly (which has no attack vector unless you can get the key).
But if you've got a list of all usernames (probably a relatively small number) and access to a running system, isn't it easy to just try each password against each user until you find a match?
The common practice of limiting logins from a single username wouldn't help with that either.
Later edit: I'm referring to your example in your link:
salt = urandom(16)
pepper = "oFMLjbFr2Bb3XR)aKKst@kBF}tHD9q" # or,
hashed_password = scrypt(password, salt + pepper)
There's nothing wrong with nesting algorithms (see the Facebook hash onion), so you can use the following scheme:
bcrypt(bcrypt(password, salt), pepper)
> bcrypt(bcrypt(password, salt), pepper)
I hope it's obvious that no one should ever do this, since the output would contain the "salt+pepper" bits in cleartext alongside the hash, defeating the entire point of the "pepper":
In fact, this is a perfect illustration of why it's bad to put secret bits into a crypto function in a place that's not designed to take secret bits. Bcrypt does not treat the salt parameter as a cryptographic secret, and other algorithms might not either. And they might leak it in more subtle ways.
This is really vague. What kind of algorithm? With itself, or just anything inside anything else? Passing the raw output of any common cryptographic hash (SHA-x) to bcrypt, for example, completely destroys its security, as bcrypt input is null-terminated.
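The NUL-termination hazard is easy to quantify: a raw SHA-2 digest contains a zero byte surprisingly often, which is exactly why you hex/base64-encode before handing bytes to bcrypt. A stdlib sketch (the sample inputs are arbitrary; the expected fraction is 1 - (255/256)^32 ≈ 11.8%):

```python
import hashlib

# Count raw SHA-256 digests containing a NUL byte. Roughly 1 in 9
# users would get a silently truncated password if the raw digest
# were fed to a NUL-terminated API like bcrypt's.
samples = 10000
hits = sum(
    b"\x00" in hashlib.sha256(i.to_bytes(4, "big")).digest()
    for i in range(samples)
)
print(hits / samples)  # roughly 0.118
```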
(What happens when you nest DES in A*, anyway?)
I would avoid using any particular symmetric algorithm twice. Otherwise if you have an example of algorithm chaining that can weaken security beyond the weakest link, I would love to see it. (Not that I think nesting is a great idea.)
It (poor implementation) is definitely related to implementing pepper on top of a secure password hash, though, which everybody is already doing differently.
> I would avoid using any particular symmetric algorithm twice. Otherwise if you have an example of algorithm chaining that can weaken security beyond the weakest link, I would love to see it. (Not that I think nesting is a great idea.)
“algorithm” is, again, really vague. (So is “nesting”.) But for something contrived and not snarky, here:
h = sha512_hex(password)
sha512_hex(bcrypt(h, gen_salt()) +
           bcrypt(h, gen_salt()) +
           bcrypt(h, gen_salt()) +
           bcrypt(h, gen_salt()))
>“algorithm” is, again, really vague
Something that you can use to hash passwords. What you gave works if you assume gen_salt is seeded per user.
>The weakest link here is 374 bits (4 bcrypts), but the output is 288.
I'm afraid I don't follow. Your bit numbers confuse me, and I don't see how this results in an algorithm that is weaker than either sha512 or bcrypt.
They’re bits of entropy (not counting the password itself) (I think). SHA-512(M): 512 bits; SHA-512(SHA-256(M)): 256 bits, for example.
> I don't see how this results in an algorithm that is weaker than either sha512 or bcrypt.
That’s my point. It’s not easy to get this kind of thing right, so just don’t bother with pepper.
>That’s my point. It’s not easy to get this kind of thing right, so just don’t bother with pepper.
What? Your point is that you haven't demonstrated that it's weaker than the weakest link, therefore you win?
Edit: Okay I figured out where you got 288. Still confused by the 374. Anyway you need to make truncations explicit. You didn't pass all of the sha output to bcrypt. You're taking advantage of an implementation API bug.
I'm not asking for evidence that shoving together functions from google without understanding them can go wrong. That's trivially true.
I want an example where combining hash algorithms is inherently wrong. Like using a block cypher twice can pop out your plaintext, but probably not as extreme.
Edit 2: Oh, 384!
passwordHash = bcrypt(salt + password)
encryptedHash = encrypt(passwordHash, pepper)
decryptedHash = decrypt(encryptedHash, oldpepper)
encryptedHash = encrypt(decryptedHash, newpepper)
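That rotation flow can be exercised end to end. The cipher below is a deliberately toy XOR keystream standing in for a real scheme like AES-GCM or Fernet (do not use this cipher for anything real); only the rotation logic is the point:

```python
import hashlib

def _keystream(key, length):
    # TOY stand-in for a real cipher -- illustration only
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(data, key):
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))

decrypt = encrypt  # XOR is its own inverse

def rotate_pepper(encrypted_hash, old_pepper, new_pepper):
    # Batch job over every stored row; the password itself is never needed
    return encrypt(decrypt(encrypted_hash, old_pepper), new_pepper)
```

Because rotation only touches the outer encryption layer, it can run as a one-off administrative job without any user ever logging in.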
The point of salting password hashes is to prevent identical cleartext passwords from being stored as identical hashes in the database. Salts are often stored in the database, as well.
The point of peppering is to keep a database dump from being at all useful for recovering passwords. It makes sure that a component of the process from cleartext to DB entry is not even in the database, requiring something from the app as well.
Why does encryption work here? Because you've already done a one way function on the cleartext -> salted hash. At that point, there is still no way to reverse the process all the way to get the cleartext. By using a two-way encryption function for the pepper portion, you keep the ability to rotate 'peppers' periodically, in case it is leaked, for example.
But, the original example is hashing the password. No encryption involved. So what makes anyone think that they can reverse a hash?
This would work if you were building a new system today, but if you had a DB full of one way hashes you're not going to be able to retroactively modify the pepper.
Straight from the original article:
hashed_password = scrypt(password, salt + pepper)
hashed_password = scrypt(scrypt(password, salt), pepper)
After hashing the password he is storing the hash along with the users random salt, not retroactively applying the salt a second time.
I think the article's context is if you don't currently use a pepper, you can easily add one and update all of your password hashes in the database.
hashed_password = scrypt(password, salt + pepper)
An additional secret key can be added generically, in (at least)
four ways, to any password hashing function:
1. Store: salt + HMAC_K(PHS(pass, salt))
2. Store: salt + PHS(HMAC_K(pass), salt)
3. Store: salt + AES_K(PHS(pass, salt))
4. Store: salt + PHS(AES_K(pass), salt)
I have used here "HMAC" to mean "some appropriate MAC function" and
"AES" to mean "some symmetric encryption scheme".
These methods are not completely equivalent:
-- With method 1, you forfeit any offline work factor extension that
the PHS may offer (i.e. you can no longer raise the work factor of a
hash without knowing the password). With methods 2 and 4 such work
factor extension can be done easily (if the PHS supports it, of
course). With method 3, you can do it but you need the key.
-- With methods 2 and 4, you must either encode the output of HMAC
or AES with Base64 or equivalent; or the PHS must support arbitrary
binary input (all candidates should support arbitrary binary input
anyway, it was part of the CfP).
-- Method 4 requires some form of symmetric encryption that is either
deterministic, or can be made deterministic (e.g. an extra IV is
stored). ECB mode, for all its shortcomings, would work.
-- Method 3 can be rather simple if you configure PHS to output exactly
128 bits, in which case you can do "raw" single-block encryption.
-- Methods 1 and 3 require obtaining the "raw" PHS output, not a
composite string that encodes the output and the salt. In that sense,
they can be a bit cumbersome to retrofit on, say, an existing bcrypt deployment.
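Method 2 from the list above is straightforward with stdlib primitives. A sketch, assuming scrypt as the PHS; the key handling, encodings, and scrypt parameters are illustrative:

```python
import hashlib
import hmac
import os

KEY = b"app-wide secret kept outside the database"  # the "pepper"

def store(password):
    # Method 2: salt + PHS(HMAC_K(pass), salt), hex-encoding the MAC
    # output so no raw bytes reach the PHS
    salt = os.urandom(16)
    pre = hmac.new(KEY, password, hashlib.sha256).hexdigest()
    return salt + hashlib.scrypt(pre.encode(), salt=salt, n=2**14, r=8, p=1)

def verify(password, stored):
    salt, digest = stored[:16], stored[16:]
    pre = hmac.new(KEY, password, hashlib.sha256).hexdigest()
    candidate = hashlib.scrypt(pre.encode(), salt=salt, n=2**14, r=8, p=1)
    return hmac.compare_digest(digest, candidate)
```

As the post notes, this variant keeps the PHS's offline work-factor extension available, since the stored value is still a plain PHS output.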
The important points (in my opinion) to take into account are:
1. This key strengthening (some people have coined the expression
"peppering" as a bad pun on "salting") can be done generically; the
underlying PHS needs not be modified or even made aware of it.
2. Keys imply key management, always a tricky thing. Key should be
generated appropriately (that's not hard but it can be botched in
horrible ways), and stored with care. Sometimes the OS or programming
framework can help (e.g. DPAPI on Windows). Sometimes it makes things
more difficult. You need backups (a lost key implies losing all the
stored passwords), but stolen backups are a classical source of
password hashes leakage, so if you do not take enough care of the
security of your backups then the advantage offered by the key can go away.
3. For some historical reasons, many people feel the need to change
keys regularly. This is rather misguided: key rotation makes sense in
an army or spy network where there are many keys, and partial
compromises are the normal and expected situation, so a spy network
must, by necessity, be in permanent self-cleansing recovery mode; when
there is a single key and the normal situation is that the key is NOT
compromised, changing it brings no tangible advantage. Nevertheless,
people insist on it, and this is difficult. The "method 3" above
(encryption of the PHS result) is the one that makes key rotation
easiest since you can process all stored hashes in one go, as a
night-time administrative procedure.
4. Key strengthening makes sense only insofar as you can keep the key
secret even when the attacker can see the hashes. In a classical
Web-server-verifies-user-passwords context, the hashes are in the
database; one can argue that database contents can be dumped through a
SQL injection attack, but a key stored outside the database might evade
this partial breach. But if the key is in the database, or the breach
is a stolen whole-server backup, then the key does not bring any
5. If you _can_ store a key that attackers won't steal, even if they
get all the hashes, then you can forget all this PHS nonsense and just
use HMAC_K(pass) (or HMAC_K(user+pass)). The key must thus be
envisioned as an additional protection, a third layer (first layer is:
don't let outsiders read your hashes; second layer is: make it so that
your hashes are expensive to compute, in case the first layer was breached).
This is really strange advice from Thomas Pornin. People rotate keys because not doing so weakens most symmetric encryption schemes. For example, while using AES-GCM with 96-bit nonces one needs to rotate keys after encrypting roughly 2^32 ~ 4 billion messages; otherwise the IV collision probability will be higher than 2^(-32), which is already high enough in most large-scale systems (and really bad things happen when an IV is repeated).
Also, if you have 4 billion hashes stored, and you rotate the key, and you still have 4 billion hashes stored... What's changed? You would need a key ring or derivative keys I guess but I think this is actually a case where ECB does the job.
But I guess we've now proven the point that even a pepper is non-trivial.
This is rolling your own crypto, which is universally bad. To paraphrase Bruce Schneier, anyone can write a crypto algorithm they themselves can't break. Peppering a password hash destroys any future maintainability.
There are issues with using it as implemented in some posts here, with nested bcrypts, to be sure, but I think the concept is still fairly sound, though there are certainly implementation pros and cons.
As for maintainability:
I'm also familiar with crypt(3)-style password hashes, where a prefix uniquely specifies the algorithm (and subvariant) used.
Why wouldn't this be fitting here? You can then easily detect, and deal with, passwords that have been tagged with "previous" peppers, such as forcing returning users to change password if a previous pepper was compromised, etc.
1 https://www.freebsd.org/cgi/man.cgi?query=crypt%283%29 or http://man7.org/linux/man-pages/man3/crypt.3.html, for some reason I can't find a link to a more comprehensive list at the moment
For example, if my salt was CryptoRandom(10) and you increase it to CryptoRandom(15), you've just "rolled your own crypto" according to you.
If that is not the case, then explain the difference between CryptoRandom(15) and CryptoRandom(10) + CryptoRandom(5) (longer salt vs. salt+pepper).
There are a lot of people spreading FUD ("it is unknown!!!") and nonsense (concatenating two strings is literally rolling your own crypto!) in this thread.
I don't know if peppers are worth the dev time, deployment issues, and additional maintenance (e.g. rotation). However, I do know that the people arguing against it here aren't making rational counter-arguments that hold up under basic scrutiny.
But that's not all! Hash algorithms are written assuming the salt is random, and now I have millions of hash outputs in which the last X bytes of the salt are shared. Have you proven that this doesn't increase the attack surface? It certainly sounds like it might. This is exactly the type of side-channel attack that tends to break crypto, and you're giving it away for free.
If the salt remains the same length and you add the pepper on top, it won't make the final hash less secure/strong, due to the way hashing algorithms are folded.
At worst case scenario you've literally added no security at all with the pepper. There's no rational scenario where it reduces the security when all other things remain equal (i.e. you aren't replacing the salt with a pepper, or reducing the salt's length/complexity for the pepper, etc).
Yes, I think this is the most likely failure mode (though not necessarily the only one - crypto can fail in very surprising ways!).
But even this is harmful, since you are potentially making changes to security-critical code for no benefit. At best you get more complexity and more chances to introduce bugs, plus a false sense of security.
That's a nice trick. I've read about it elsewhere but never used it; I will for sure in the future!
Are you sure that the intruder did not have server access? The statement "We were recently able to confirm that there was unauthorized access to a Slack database storing user profile information." is not enough to deduce that this was a SQL injection (although it might very well be).
Stay strong :-)
This wouldn't be my first concern. It would be all of the confidential communication that happens within slack.
From the FAQ on the post, it could be inferred that there was some unauthorized access to certain users' communication logs?
> As part of our investigation we detected suspicious activity affecting a very small number of Slack accounts. We have notified the individual users and team owners who we believe were impacted and are sharing details with their security teams. Unless you have been contacted by us directly about a password reset or been advised of suspicious activity in your team’s account, all the information you need is in this blog post.
edit: you can log in if and when you crack some of the hashes.
Of course, I barely know anything about computer security, but at least it should prevent attacks using rainbow tables I think?
Of course, even if they can't steal everyone's passwords, maybe the hackers will try to crack the passwords of higher profile targets.
I guess that I missed a step in the explanation where you attack the hashes.
However I see that they say that they are using some best practices (bcrypt, "salt per-password") so this attack will be largely mitigated.
If you index shingles (phrase chunks) instead, you lose out on sloppy phrases...you can only match exact phrases. I imagine you can perform a similar statistical attack too.
Hell, just getting the term dictionary would probably allow you to reverse engineer the tokens, since written language follows a very predictable power law.
Hashing also removes the ability to highlight search results, which significantly degrades search functionality for an end user.
Basically, yes, you can do search with encrypted tokens...but it will be a very poor search experience.
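To make the tradeoff concrete, here is a rough sketch of what an exact-match index over keyed-hashed ("blinded") tokens might look like. Everything here (the key, `blind_token`, the in-memory index) is a hypothetical illustration, not anything Slack is known to do:

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical key, known to the search tier but never stored in the index.
SECRET_KEY = b"index-key-known-only-to-the-search-tier"

def blind_token(token: str) -> str:
    # Deterministic keyed hash: the same token always maps to the same
    # value, so exact-term lookup works, but the index holds no plaintext.
    return hmac.new(SECRET_KEY, token.lower().encode(), hashlib.sha256).hexdigest()

index: dict[str, set[int]] = defaultdict(set)

def index_message(msg_id: int, text: str) -> None:
    for token in text.split():
        index[blind_token(token)].add(msg_id)

def search(term: str) -> set[int]:
    return index.get(blind_token(term), set())

index_message(1, "deploy the new build")
index_message(2, "build failed again")

print(sorted(search("build")))  # [1, 2] -- exact terms still match
print(sorted(search("buil")))   # []     -- prefix/fuzzy queries can't work
```

Note what's lost: no prefix matching, no fuzzy matching, no highlighting, and term frequencies still leak statistical structure, which is the point being made above.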
It's configurable for paid accounts, and can be set as low as one day. However, one of the best features of slack (and products like slack) is message history and search. Otherwise, IRC isn't all that different (WRT messaging).
First: Slackbot. This is a Slack-run bot that's in every channel; team owners can customize it to do various things, like scan messages for keywords and give out canned responses. Even if Slack adopted some variant of encrypted chat, each message would still need to be readable by Slackbot, so Slack would still have the means to collect every message.
Second: channel history. When I join a channel, I can see the messages in that channel from before I joined. This means that Slack (the server) must be able to give me those historical messages. In an encrypted group chat, the messages are encrypted only with the keys of the participants at that time, which means newcomers can't read them.
I'm sure there are other features in conflict with end-to-end encryption, too; these are just off the top of my head.
As for the second, the server could ask one of the clients to re-encrypt the channel history with the newcomer's key. It would only fail if nobody was online the moment you joined the channel (and you still could get it later).
Encrypting user data should be a common practice like hashing passwords.
I get the feeling that you've never done this before and don't understand the technical challenge and implications of the added complexity you propose here for an essentially free-to-low-price, all-in-one online communication service.
Slack is not the NSA, encryption is not the answer to every security problem out there.
Of course, that requires a decent protocol, and Mozilla is doing the world a disservice in not marketing Persona better seeing as it's the right solution....
Why's a password so different, seeing as most people reuse those passwords? Why do we essentially allow (and yes, I am excluding those that use password managers in this statement, I'm one of those) access to our webmail and other critical services to random websites on the internet? What makes this right?
> Payment is also mostly less sensitive to availability and latency issues than authentication.
That's patently untrue. Latency issues are nonexistent in both areas, and availability issues are critical in both areas.
Credit card payments online are so ludicrously insecure that it baffles me it's even legal. I only use them when dealing with the US (although some of the major retailers like Apple have finally started accepting 21st century payment methods), and I simply assume my credit card info has been leaking all over the place for ages.
The whole basic premise of credit cards is "we know it's totally broken, we'll just refund you the money because it's cheaper than fixing the problem".
Yes. It might be a hassle should someone misuse it, but the status-quo effectively means if I didn't make the purchase I'm not responsible for it.
More importantly, this was proven before PCI-DSS was a thing.
But if you just want the money shot: http://sakurity.com/img/smsauthy.png
Yes. Typing '../sms' in the field bypassed the 2nd factor. Just, wow.
Amazing what you can do with improperly-implemented input sanitization :)
This probably could've been prevented by disallowing non-number inputs, no?
They were doing input sanitization, but it wasn't the very first thing in the processing pipeline, since "best practice" was to pipe everything through 'rack-protection' first.
As Homakov was the first to state, this was really a black-swan type of bug, the kind that makes it into production 99.9% of the time. Apparently, they were doing the "right thing" and still got burned.
Basically, the form itself could have (and maybe even should have) required numeric-only values, seeing as Authy's codes are either 6 or 7 digits long and contain no alphabetical or special characters.
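That allowlist approach can be sketched in a few lines. Assuming codes of 6-7 digits, the shape check happens before the value touches any routing, URL building, or downstream API call:

```python
import re

# Authy codes are 6-7 digits; validate against that exact shape first,
# before the value reaches any routing or downstream request.
OTP_PATTERN = re.compile(r"^\d{6,7}$")

def is_valid_otp(user_input: str) -> bool:
    # fullmatch requires the ENTIRE string to match; re.match with a $
    # anchor would still accept a trailing newline.
    return bool(OTP_PATTERN.fullmatch(user_input))

assert is_valid_otp("1234567")
assert not is_valid_otp("../sms")     # the bypass payload from the write-up
assert not is_valid_otp("123456\n")   # trailing newline: classic $ gotcha
```

Rejecting anything that isn't purely 6-7 digits would have made the `../sms` payload unrepresentable, regardless of what the rest of the pipeline did with it.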
What could you do with a one-way encrypted phone number? I'm not able to enter a phone hash to make a call.
The previous comment did make the encryption / hash distinction - though I can totally understand how his post might have been misread that he was recommending the same mechanisms for both sets of data.
We can assume they aren't total idiots and there's an Internet-facing application server that connects to an internal-only database server that has this data. Also, assume SQL injection is not the attack vector.
How would you apply encryption to protect the username, name and email from an attacker that has gained access to the application server? I've gained some shell on the server and have 24 hours to extract data. I can see all the files on the server, though perhaps not as root, just as the user that runs the application. How can you, as a security-sensitive application developer, stop me if I've gotten this far?
It's also worth noting that it wouldn't just be the web servers that require your private key; it would also be any mail servers you use for sending your newsletters and such (assuming these aren't run on your web servers, which often isn't the case). Then there's your telephone support staff, who may also need to know your e-mail address so they can do their job effectively. And any other operators that might compile data extracts, e.g. for 3rd parties where users have given permission for their details to be used / sold.
Quickly you're in a situation where your private key is more available across your infrastructure than the e-mail would have been if it wasn't encrypted to begin with.
Now let's look at the cost of such a system. There's an obvious electricity / hardware cost in the CPU time required to encrypt / decrypt this data (after all, CPU time is the general measure of the strength of encryption), and a staffing cost in the time wasted jumping through those extra hoops. The development time, code complexity, etc - it all has a cost to the company.
Don't get me wrong, I'm all for hashing / encrypting sensitive data. But pragmatically we need to consider:
1) are e-mail addresses really that sensitive? Or instead should we be encouraging better security for our web-mail et al accounts (eg 2 factor authentication) to prevent our addresses being abused. Given that we give out e-mail addresses to anyone who needs to contact us, I think the latter option (securing our email accounts) is the smarter one
2) instead of encrypting phone numbers and postal addresses, should we instead be challenging the requirement for online services to store them to begin with? If they have my email address, why do they also need my phone number? Postal address I can forgive a little more if there's a product that needs shipping or payments that need to be made.
* Use http://en.wikipedia.org/wiki/Database_activity_monitoring. If you don't list users on your site and you get a query that would return more than one user record, it's a hacker
* Add some honeytokens (http://en.wikipedia.org/wiki/Honeytoken) to your user table, and sound the alarm if they leave your db
* Use Row-Level Security
* Database server runs on own box in own network zone
* Send logs via write-only account to machine in different network zone. Monitor logs automatically, and have alerts.
* Pepper your passwords: HMAC them with a key held in an HSM on the web server, then bcrypt. Don't store the key in the db. https://blog.mozilla.org/webdev/2012/06/08/lets-talk-about-p...
* Use a WAF that looks for SQL injections
* [Use real database authentication, per user. Not one username for everyone connecting to db. Yes, this is bad for connection pooling]
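The peppering item above (HMAC with a web-server-held key, then bcrypt) can be sketched roughly as follows. Since bcrypt isn't in the Python standard library, PBKDF2 stands in for it here, and the `PASSWORD_PEPPER` environment variable is a hypothetical stand-in for a key held in an HSM:

```python
import hashlib
import hmac
import os

# Hypothetical pepper: in the scheme above it lives in an HSM or at least
# an environment variable on the web tier, never in the database.
PEPPER = os.environ.get("PASSWORD_PEPPER", "dev-only-pepper").encode()

def hash_password(password: str, salt: bytes) -> bytes:
    # Step 1: HMAC the password with the pepper key.
    peppered = hmac.new(PEPPER, password.encode(), hashlib.sha256).digest()
    # Step 2: slow, salted hash. PBKDF2 stands in for bcrypt so the sketch
    # is self-contained; swap in bcrypt.hashpw in practice.
    return hashlib.pbkdf2_hmac("sha256", peppered, salt, 100_000)

def verify(password: str, salt: bytes, stored: bytes) -> bool:
    return hmac.compare_digest(hash_password(password, salt), stored)

salt = os.urandom(16)
stored = hash_password("hunter2", salt)
assert verify("hunter2", salt, stored)
assert not verify("hunter3", salt, stored)
```

The database stores only `salt` and `stored`; without the pepper key a dump alone gives the attacker nothing to brute-force against.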
Edit: Slack's in PHP, I thought it was in RoR for some reason. Oops.
1. Not all SQL statements are parameterizable (dynamic identifiers vs literals)
2. Stopping SQL injection doesn't stop Insecure Direct Object References
3. Developers make mistakes
4. Plugins are a risk (example: http://www.zdnet.com/article/over-1-million-wordpress-websit...)
For parameterization to work you need to be perfect, always. My suggestions are for when someone else fucks up.
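Point 1 is worth illustrating: drivers bind literals via placeholders, but identifiers (table and column names, ORDER BY targets) cannot be parameters, so dynamic ones need an allowlist. A sketch using Python's built-in sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'a@example.com')")

# Literals parameterize cleanly: the driver keeps data out of the SQL text.
row = conn.execute("SELECT email FROM users WHERE id = ?", (1,)).fetchone()
print(row[0])  # a@example.com

# Identifiers cannot be bound as parameters, so a dynamic sort column must
# come from an allowlist, never straight from user input.
ALLOWED_SORT_COLUMNS = {"id", "email"}

def fetch_sorted(column: str):
    if column not in ALLOWED_SORT_COLUMNS:
        raise ValueError(f"unexpected sort column: {column!r}")
    # Interpolation is safe only because `column` was checked above.
    return conn.execute(f"SELECT id FROM users ORDER BY {column}").fetchall()

print(fetch_sorted("id"))                # [(1,)]
# fetch_sorted("id; DROP TABLE users")   # would raise ValueError
```

One forgotten allowlist check anywhere in the codebase is exactly the kind of "someone else fucks up" the defenses above are for.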
Many of the comments have great suggestions. However, very few talk about the most important part of creating mitigations and designing key management/crypto. What is the security target?
Before throwing new designs at a problem, the attackers and attack vectors must be defined. If you don't know who you are guarding against and what they will do (and what data they will steal), then how can you possibly say what is a good mitigation??
One might argue that the threat is obvious, but I'll guarantee you that there are dozens of threats here. List them. Prioritize them. Then mitigate them. It is helpful to fully understand the problem/solution space before jumping in with peppers, salts, extra databases, and other solutions.
Regardless, this is an example of why cloud communication (and ticketing and database off-loading [see MongoHQ] and...) systems probably won't ever become commonplace in most of the government space and the finance and health sectors.
I don't know how common this line of thought is in security. But if it is common, then if you're a small company, aren't you better off not hosting your stuff at these large companies? Doing so puts the information you collect with someone who is far more likely to be "interesting" to the right people.
If you think you haven't been hacked, you probably have (or you are so small that you may have only been probed by bots).
If you haven't actually been hacked yet, it is only a matter of time. Ideally you start designing security layers now, before you are compromised.
You can't prove your cloud provider is using security best practices, while you theoretically can prove (or disprove) the same internally. Few companies do proper auditing, reviews, and pentests, but they have the capability to do so.
But you don't have to. They do. Look at Amazon. They publish their security audit each year, and now every company that uses them doesn't have to do their own audit. They can point their auditor at Amazon's report and say "see our datacenter passes".
Also, do you do a security review on your power company, or do you assume they've done it?
1. VPNs are almost always terrible for security. People tend to get them for one need (i.e. email, one line-of-business app, etc.), but when their computer gets compromised, the attacker now has access to the superset of everything every VPN user has ever needed to access, and in all likelihood a bunch of other things that were never intended to be exposed but were never adequately firewalled internally.
I agree. We might not like rolling out our own instances, but it prevents hackers from being able to grab ALL THE DATA in one fell swoop. It really amazes me that some EHR systems have gone the cloud route.
Also, "cloud" for services like this means "your own private instance of the software running in a private VM in our datacenter" not "your own customer_id in a shared database."
It's why your gmail account is more likely to get hacked than my piddly self hosted imap server. Google's network security is unarguably better than mine, but you are never going to social engineer your way into changing my password, which is actually doable with gmail (happened to my sister in law).
I think you have to assume that you're going to be hacked if you're a big enough target. You don't know what you don't know about your vulnerabilities. The better question is how you're going to design your data and platform to minimize the damage a major hack can do.
I guess what I'm saying is that regardless of who you are, there is no easily discernible best practice playbook, just a sea of tradeoffs generally made by people with a woefully inadequate grasp of the risks involved. Heck, even the best security people are at a disadvantage in the asymmetrical battle of infosec.
Small companies typically can't afford competent and professional security analysts, engineers, penetration testers, and auditors.
Targeted cracking attempts against specific hashes are definitely still an issue though.
Do you have a credible source for the "10..12 is too low for 2015" claim?
HHVM 3.6 on a small Ubuntu server
>>> timeit.timeit("bcrypt.hashpw('this is a password', bcrypt.gensalt(11))", setup="import bcrypt", number=5) / 5
>>> timeit.timeit("bcrypt.hashpw('this is a password', bcrypt.gensalt(12))", setup="import bcrypt", number=5) / 5
>>> timeit.timeit("bcrypt.hashpw('this is a password', bcrypt.gensalt(13))", setup="import bcrypt", number=5) / 5
>>> timeit.timeit("bcrypt.hashpw('this is a password', bcrypt.gensalt(14))", setup="import bcrypt", number=5) / 5
>>> timeit.timeit("bcrypt.hashpw('this is a password', bcrypt.gensalt(15))", setup="import bcrypt", number=5) / 5
That's five repetitions of a bcrypt hash with the work factor passed in bcrypt.gensalt(). The resulting units are seconds.
Good thing you made me re-measure :) That makes 13 my new bcrypt default.
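The sweep above can be automated: double the work until one hash crosses a latency budget. A sketch of that calibration loop, using stdlib PBKDF2 iterations since bcrypt is a third-party package (remember bcrypt's cost parameter is logarithmic: each +1 doubles the work, so cost 13 means 2^13 rounds):

```python
import hashlib
import time

def measure(iterations: int) -> float:
    # Time one PBKDF2 run; bcrypt's work factor plays the same role,
    # except it scales the iteration count exponentially.
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"this is a password", b"fixed-salt-xx", iterations)
    return time.perf_counter() - start

def pick_cost(target_seconds: float = 0.1) -> int:
    # Start low and double until a single hash takes at least the target
    # time, mirroring the manual timeit sweep over work factors above.
    iterations = 10_000
    while measure(iterations) < target_seconds:
        iterations *= 2
    return iterations

print(pick_cost())  # machine-dependent; re-run on your own hardware
```

Whatever the budget (100ms is a common rule of thumb for interactive logins), the point is to re-measure on the actual production hardware periodically, since the right cost drifts upward as hardware gets faster.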
Nevertheless, in all implementations I am aware of, the default for that parameter is 10. And earlier, you wrote:
> If they used ten rounds, it's dire, and just saying "bcrypt" doesn't say much unless you also specify the number of rounds.
tedunangst and I both assumed you were referring to the default 10 work factor of BCrypt and were calling it "rounds" as many of us are doing.
The obvious question that tedunangst is asking (and others in this thread) is whether a work factor of 10 is considered too low.
And in another instance the hacker emailed us asking for ransom.
Log into server. Why is server slow? Run `top`. Hmm, `./exploit` is consuming 99% CPU...