It is absolutely not the case that people tend to get encrypted blobs right nowadays. All-zeros IVs, ECB-mode encryption, failure to authenticate data, repeated CTR nonces, wrappable/seekable CTR counters --- these are all things that we've found just within the last several months.
Major components of popular high level web stacks have had bugs of this severity lurking in them for years.
Don't get me wrong, people get it wrong all the time, but I think it's a minority these days. That could be strongly skewed by the stuff I'm auditing these days, though. In fact, it almost definitely is; I should rephrase that line.
CTR generates a keystream using a nonce and a counter as inputs; neither the nonce nor any single value of the counter can be used for more than 1 item of data.
As I said, I think I understand CTR mode. Or thought I did. I was more curious about how you could seek to a particular counter value, and then was even more confused by the idea that neither nonce nor counter can be repeated, as I thought it was the combination that had to be unique. So maybe I do need to take that course. Hmmm.
There are systems that use CTR mode as a way to do "random access" bulk encryption, because Schneier suggested that in both his major crypto books.
The specific exploitable condition is indeed the recurrence of a specific nonce/counter tuple; the point is, there are systems in which attackers can induce that condition, as opposed to simply having the system blunder into it (for instance, by using the same nonce every time).
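To make that concrete, here's a toy sketch (assuming a recent version of the pyca/cryptography package; the key, nonce, and messages are obviously made up) of what an attacker gets the moment the same key and nonce/counter block are used for two messages:

    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    key = os.urandom(32)
    nonce = os.urandom(16)  # full 128-bit initial counter block for AES-CTR

    def ctr_encrypt(key, nonce, plaintext):
        enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
        return enc.update(plaintext) + enc.finalize()

    ct1 = ctr_encrypt(key, nonce, b"attack at dawn!!")
    ct2 = ctr_encrypt(key, nonce, b"retreat at dusk!")  # same nonce/counter: keystream reuse

    # XORing the two ciphertexts cancels the keystream, leaking the XOR of the plaintexts.
    leak = bytes(a ^ b for a, b in zip(ct1, ct2))
    print(leak == bytes(a ^ b for a, b in zip(b"attack at dawn!!", b"retreat at dusk!")))  # True

With one known plaintext that XOR hands you the other message outright, which is why seekable or attacker-influenced counters are so dangerous.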
No. That was a terrible design. They screwed up the implementation --- developers, no matter how competent, can be relied upon to screw something up in every system --- and as a result Thai Duong and Juliano Rizzo managed to write a CBC padding oracle exploit that worked against huge numbers of real-world ASP.NET systems.
ASP.NET Forms are a pretty good textbook case study of why you should never build unnecessary crypto into your system. Microsoft spends millions every year on external security validation and also keeps a highly regarded team of crypto designers on staff. Still missed the most blatant web crypto flaw of 2010.
Can anyone offer any insight as to why you wouldn't generate a one-time random string with a short lifetime for password reset emails? Sending a blob with data in it (encrypted or otherwise) seems like a potential information leak with no upside.
That is exactly what you should do. The only upside to sending semantically meaningful data in email tokens is that parsing the responses doesn't require a central database lookup --- but that upside itself comes with the downside of losing the control that a simple database table gives you over outstanding tokens. And that's not really an "upside"; it's just a convenience.
You have to be careful though. If someone manages an SQL injection and can read tables, but you're hashing passwords properly with bcrypt, you might think you're [somewhat] safe (i.e. at least they can't write anything).
But if they can read the password reset tokens from your database they can login as any of the users. (This is especially bad if changing a password doesn't invalidate "keep me logged in" cookies.)
The solution is to treat those tokens as passwords, and hash them as well.
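Something like this rough sketch (table and column names are made up, and it assumes a sqlite3-style connection). Because the token already carries ~256 bits of entropy, a single fast hash like SHA-256 is enough to make a leaked table useless, though bcrypt works too:

    import hashlib
    import secrets

    def issue_reset_token(db, user_id):
        token = secrets.token_urlsafe(32)                    # ~256 bits; this is what gets emailed
        digest = hashlib.sha256(token.encode()).hexdigest()  # only the hash ever touches the database
        db.execute(
            "INSERT INTO password_resets (user_id, token_hash, created_at)"
            " VALUES (?, ?, datetime('now'))",
            (user_id, digest),
        )
        return token

    def redeem_reset_token(db, token):
        digest = hashlib.sha256(token.encode()).hexdigest()
        row = db.execute(
            "SELECT user_id FROM password_resets WHERE token_hash = ?", (digest,)
        ).fetchone()
        return row[0] if row else None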
(I know this because I had to clean up after this exact scenario happened to someone using Joomla. They got an admin login from a password reset token; from there they were able to write files to disk and fully take control of the server.)
If you're writing injectable SQL queries, there is nothing I can tell you about crypto and password resets that can help you.
Not all vulnerabilities are equally easy to blunder into. We can presume a competent development team will avoid SQL injection, or at least, will discover accidentally introduced SQLI quickly through testing. The same is not the case with crypto flaws.
Another way to think about it is: if you're using an SQL database at all, you've signed up for SQLI mitigation whether you use the database for reset tokens or not. However, you still have the option of not having to deal with crypto bugs.
So, I'm already sending random strings, but there are a few things that I know are just wrong and broken about my current setup, and I'm wondering what the appropriate way to fix them is.
1. It doesn't limit how many tokens you can send out; I'm not entirely convinced this is an issue yet, since these tokens are 256-bit, so you'd have to create quite a few before random attempts would start working...
2. The tokens do not expire.
Again, I know both of these are wrong, but I'm not sure what the appropriate way to deal with them is.
Limiting [1], i.e. the number of outstanding tokens, sounds like a bad idea, because then an attacker could lock the legitimate user out from being able to reset their password (barring a support call). Doing statistics on how many password reset requests are being made, per host and per credential, seems more efficient.
Adding a simple 24h expiry for [2] seems like a good idea, and it happens to be ridiculously simple to implement, too :)
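Sketching it against a hypothetical password_resets(user_id, token_hash, created_at) table (SQLite-flavored SQL, all names made up), the expiry plus one-time use is a single query and a delete:

    import hashlib

    RESET_TTL_HOURS = 24

    def redeem_reset_token(db, token):
        digest = hashlib.sha256(token.encode()).hexdigest()
        row = db.execute(
            "SELECT user_id FROM password_resets"
            " WHERE token_hash = ? AND created_at > datetime('now', ?)",
            (digest, "-%d hours" % RESET_TTL_HOURS),
        ).fetchone()
        if row is None:
            return None  # unknown, expired, or already-used token
        db.execute("DELETE FROM password_resets WHERE token_hash = ?", (digest,))  # one-time use
        return row[0]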
Oh, by the way: tptacek, I sent you an e-mail weeks ago about a Pycon talk -- did you happen to get it perchance?
> You can always achieve a token timeout by embedding a timestamp and signing the blob.
What's the benefit of this over just sending a random string? One database read for the risk of an attacker being able to create arbitrary valid password reset tokens that never expire?
That particular trade-off strikes me as fantastically bad, especially for an operation like password reset that doesn't happen very often.
> However, because you're typically doing a password reset in a centralized database anyway, there's not really much point.
Personally I find the incurred security risk more compelling than "we're hitting the database anyway, so who cares if we make another trip for the timeout."
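For concreteness, the quoted scheme amounts to something like the following hypothetical sketch (not a recommendation; it's the approach being argued against here):

    import base64
    import hashlib
    import hmac
    import time

    SECRET = b"server-side-signing-key"  # made-up value; must never leave the server

    def make_token(user_id, ttl=86400):
        payload = "%s|%d" % (user_id, int(time.time()) + ttl)
        sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
        return base64.urlsafe_b64encode(("%s|%s" % (payload, sig)).encode()).decode()

    def check_token(token):
        decoded = base64.urlsafe_b64decode(token).decode()
        payload, _, sig = decoded.rpartition("|")
        expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(sig, expected):
            return None  # forged or corrupted token
        user_id, _, expires = payload.rpartition("|")
        return user_id if time.time() < int(expires) else None

Note there's still a server-side secret to manage, and no way to revoke an individual outstanding token short of rotating that key, which is exactly the control a plain database table gives you for free.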
For forgot password, there really is no reason to take this approach, though if you do it sanely it's not a problem. The report generation and other things like that in their code, however, are reasonable to do this way if you do it right.
Most likely they wanted to use the same scheme everywhere in the webapp, and for some reason decided that encrypted data proxied through the user was the best choice.
Assuming that the concept in general is sane, then it makes sense to not special case specific uses (less code where things can go wrong).
> but if you can put in any data you want, you could poison the compression enough to put it into a bad state -- one where effectively nothing compresses properly, and you end up with your own data completely in the clear.
This sounds interesting, and completely counter to my understanding of how compression works. I thought CRIME's innovation was to exploit compression ratios as a proxy for cleartext. Poking a compressor into emitting probabilistically assigned bit-aligned symbols that happen to correspond to its input seems unlikely.
A lot of compression algorithms use fixed size dictionaries over a window.
Let's take the canonical example of zlib/gzip, which does this.
With careful control over the data, you could make it so no dictionary lookup would ever succeed. This would mean no strings would ever be eliminated through backreferences.
Most also have a minimum match length, making your life a bit easier here.
Most are also outputting encoded streams that are basically a little decompression VM (i.e. literal 0, backreference 255 bytes ago, size 30). Because of this, they will not eliminate duplicates where the match is too small (only 2 bytes).
This will get you past the duplicate-elimination phase, but not the Huffman phase.
Getting past the Huffman phase is harder.
To get it to output a stored block, you have to convince it the raw literal length of the block will end up less than the length of the block as encoded.
For zlib, we have roughly

    opt_lenb = (sizeof(compressed data in bits) + sizeof(Huffman trees in bits) + 3 + 7) / 8
    if (stored_len + 4 <= opt_lenb) use stored block
So you do have some chance to beat it by messing with the probability distributions, and do get a little leeway.
On the plus side, you only need to mess with the probability for a single zlib block, not the whole shebang.
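You can see that stored-block fallback directly from Python's zlib bindings: random bytes defeat both the LZ77 and Huffman stages, so the encoder passes the input through essentially verbatim (in <=64KB stored blocks) and the output ends up slightly larger than the input. This isn't the attack itself, just the encoder behavior being exploited:

    import os
    import zlib

    raw = os.urandom(200000)        # incompressible input
    packed = zlib.compress(raw, 9)

    print(len(raw), len(packed))    # packed is a bit larger: stored-block framing, not compression
    print(raw[:32] in packed)       # True: stored blocks carry the original bytes in the clear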
If you can put arbitrary content into data to be compressed and don't have any real length limits (<1MB or so, probably) then skewing the probabilities to have the compressor spit out the data you want is pretty straightforward. I'll write a blog post on that at some point, as it's a technique I abuse in a couple places.
Other than forgot-password, why use blobs at all? Only other rare use of blobs that I can think of is proxying users from one domain to another (single sign-on, session sharing etc.)
My theory is that the reports were really a compressed and encrypted query. Obviously that shouldn't be exposed to the user in any way (because of tampering and information leakage issues) so an encrypted blob is actually a reasonable approach when doing the handoff from one part of the site to another.
I don't get why you would bother using a blob for forgot password... Isn't something like reset.php?account=joeblow&auth=md5(accountname/secretkey/first10charsofcurrentpasshash/yymmdd) enough?
You can't replay-attack the reset link once it's used, it expires in 24 hours, and so long as the 'secretkey' was sufficiently unique, you wouldn't be able to bruteforce or crack it.
All the PHP script needs to do is attempt to build a 'good' md5 hash and see if they match; if they do, let you input a new password to store.
An attacker can do an offline brute-force attack on the "secret key" in this scheme, since attackers can be presumed to know at least one valid password in the system.
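A toy illustration of that offline attack, assuming the attacker controls one account (so they know every input to the MD5 except the secret key) and assuming, purely for the sketch, an unsalted MD5 password hash they can compute for their own password; all names and values here are made up:

    import hashlib
    from itertools import product
    from string import ascii_lowercase

    def reset_auth(account, secretkey, passhash_prefix, datestr):
        msg = "%s/%s/%s/%s" % (account, secretkey, passhash_prefix, datestr)
        return hashlib.md5(msg.encode()).hexdigest()

    # Everything below is known to the attacker for their *own* account...
    account = "attacker"
    passhash_prefix = hashlib.md5(b"password").hexdigest()[:10]
    datestr = "240101"
    # ...plus the auth= value captured from their own reset email (server key pretended to be "zz").
    captured = reset_auth(account, "zz", passhash_prefix, datestr)

    # The search is entirely offline and trivially parallel; a weak key falls quickly.
    candidates = ("".join(p) for n in range(1, 4) for p in product(ascii_lowercase, repeat=n))
    for cand in candidates:
        if reset_auth(account, cand, passhash_prefix, datestr) == captured:
            print("recovered secretkey:", cand)
            break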
Just use a random string as a primary key for a token table in a database and be done with it.
Why do they even need to store arbitrary data in the blob at all? You can simply use it as a token ID for the server. While it does dramatically increase the amount of data that needs to be stored and accessed, it all but prevents attacks like this.
It can be good money, but it can also be a big time sink; really just depends on the site. CCBill was super easy for me on the whole, but I've struggled with others: I only found 3 or 4 bugs in PayPal, never found anything in Facebook, etc.
I submitted a couple bugs a while back, and they asked for some clarification on one of them which I didn't ever take care of (just haven't had the time or motivation). Given that, I really don't have any opinion on their bounty program either way thus far; should I ever put some real effort into it, I'll probably write about my experience.
Edit: And you're as exceptional as Daeken. I'm a builder, not a breaker. I'm awful at security beyond knowing better than to do anything but a random token + short lifetime.