I don't know about the passwords, but my card was successfully stolen and a malicious transaction was initiated from another country. I know this was because of Adobe for sure because I (co-incidentally) used a brand new, fresh, unique e-mail address just for Adobe, and that email was released recently in the dump that the hackers provided.
Luckily the malicious transaction was declined by my bank and they blocked the card for me and they told me that someone had compromised my card details and issued me with a replacement card free of charge.
I only keep posting this in every thread about Adobe because I genuinely want other Adobe customers to understand the gravity of the situation and disable their compromised credit card and get it replaced by a new one as soon as possible.
My creditcard and the one of my coworker got blocked too; I only use this one for company expenses, Adobe being one of them. The only overlapping service my coworker and I paid for with a creditcard was: Adobe. We both got a new one in a week from our banks; but this, to me, makes it look like it was definitely compromised because of the Adobe breach.
It was only the night before the hack that I bought an upgrade to the CS6 suite. In all probability, mine would have been among the most recent transactions in Adobe's database, maybe?
There is only one other service I use this card with, a stock photo site, and they haven't announced any security issues yet. Plus, I am always required to enter my card details on their site each time I buy something, so in all likelihood they aren't storing the card on their servers, which leads us to the culprit: Adobe.
Also, I am very security conscious in general. I am no security expert, but I always use Linux/Mac to make transactions, with Firewalls, stuff like that.
Is there an online service to check whether your email address is in there? I didn't get an email from them (perhaps it got stuck in my spam folder that gets auto-deleted weekly), but my credit card got hacked soon after.
What is that all about? The site makes a big deal of searching through 150 million records. OK, it is not trivial, but any decent DB with proper indexes would have no trouble doing that. Am I missing something?
Also, they want me to enter my email to check... <lost me here already>. Seriously?
A form to enter the sha1 of your email address so they can check against a list of addresses they've already sha1ed? And a form that takes the email for people that don't give a shit or don't know wtf a sha is.
Maybe just a .txt of the hashes too, but then no one's coming to your web service I guess.
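For what it's worth, the "submit the SHA-1 of your address" idea could look roughly like the sketch below. The function names and the lowercase/strip normalization are my own assumptions, not anything the lookup site is known to do, and the leaked addresses are made up.

```python
import hashlib

def email_digest(email: str) -> str:
    """SHA-1 hex digest of a normalized address (normalization is an assumption)."""
    return hashlib.sha1(email.strip().lower().encode("utf-8")).hexdigest()

# Hypothetical server side: only digests of the leaked addresses are kept.
leaked_digests = {email_digest(e) for e in ("alice@example.com", "bob@example.com")}

def is_listed(digest: str) -> bool:
    """Check a client-supplied digest against the precomputed set."""
    return digest.lower() in leaked_digests

print(is_listed(email_digest("Alice@Example.com ")))  # address is in the made-up list
print(is_listed(email_digest("carol@example.com")))   # address is not
```

Keep in mind this only hides the address from casual view: anyone holding a list of candidate addresses can hash them and compare, so it is a courtesy rather than real protection.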
To play devil's advocate here, isn't this actually more secure than something like MD5 or SHA1 without stretch factors or multiple invocations, assuming that the key was not also stolen?
My reasoning is, that in order for an attacker to get the passwords out of this dump, they have to break the 3DES encryption. Brute forcing the key is, as I understand it, still very difficult, and without it they can't get any of the passwords. If someone did find the key however, they'd have instant access to all of the passwords no matter how complex.
On the other hand, if the passwords had been protected using an unsuitable hash algorithm, the highly efficient GPU-based crackers would be able to find millions of people's passwords very quickly, using the sophisticated dictionaries and mangling techniques that are around now. Even quite complex passwords can often be found in this way, since the GPU crackers have got so fast they can try billions of combinations - e.g. even things like "!)@(#*$&%^Test123" can be cracked. Extremely long and complex passwords should be safe, though.
Obviously, I'm not advocating we all switch to 3DES for our password storage, and the huge risk here is that the key was also stolen - but I'm wondering if my reasoning is actually right here, and that people without extremely strong passwords are better off with this leak than if it'd been MD5.
You're right --- had they used something other than ECB mode. For example, if they used CBC mode with a proper IV, assuming the key is not stolen or compromised, the passwords would be quite secure.
The issue becomes verifying the passwords, then: supposing you have a CBC mode oracle, like a HSM, how can you verify two passwords are the same? (This is probably the reason they chose ECB mode in the first place.) In fact, if you allow the user to test if two ciphertexts represent the same plaintext --- and nothing else --- you still break the very definition of secure encryption, namely that a secure encryption scheme should have, as one popular definition, indistinguishable ciphertexts (usually under a chosen-plaintext attack).
So you have to develop some new way to measure security for the scheme, or perhaps somehow measure the damage that an equality oracle can inflict upon an IND-CPA secure scheme. The notion of indistinguishable ciphertexts roughly reflects the inability of an attacker to reliably learn any function of the plaintext from the ciphertext. Throwing that idea out the window seems unwise, since it's such an elegant idea. So, anyway, you're off in uncharted waters, not a good place to be if you're securing users' passwords.
All of that is relatively complex, though, so just let me know if I need to elaborate on some idea more. (I am never sure how in-depth to go in these posts.)
For anyone wondering about the acronyms above, ECB is Electronic Codebook, CBC is Cipher Block Chaining, and IV is an Initialization Vector used in CBC. IND-CPA is indistinguishability under chosen-plaintext attack, a security property defined in terms of that attack model.
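If it helps to see the ECB-versus-CBC difference concretely, here is a rough sketch. Python's standard library has no 3DES, so the block cipher below is a toy 4-round Feistel construction built on HMAC-SHA256. That cipher is my stand-in, not what Adobe used, and not something to deploy; only the mode behaviour is the point.

```python
import hashlib
import hmac
import os

BLOCK = 8  # bytes, the same 64-bit block size as 3DES

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Feistel round function: 4 bytes in, 4 bytes out.
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:4]

def encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy Feistel permutation on one 8-byte block (a stand-in, NOT 3DES)."""
    left, right = block[:4], block[4:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def pad(data: bytes) -> bytes:
    # PKCS#7-style padding up to a multiple of the block size.
    n = BLOCK - len(data) % BLOCK
    return data + bytes([n]) * n

def ecb(key: bytes, plaintext: bytes) -> bytes:
    p = pad(plaintext)
    return b"".join(encrypt_block(key, p[i:i + BLOCK]) for i in range(0, len(p), BLOCK))

def cbc(key: bytes, plaintext: bytes) -> bytes:
    prev = os.urandom(BLOCK)  # fresh random IV, stored in front of the ciphertext
    out = [prev]
    p = pad(plaintext)
    for i in range(0, len(p), BLOCK):
        prev = encrypt_block(key, _xor(prev, p[i:i + BLOCK]))
        out.append(prev)
    return b"".join(out)

key = os.urandom(16)
# ECB: same password, same ciphertext; a shared first 8 bytes also shows through.
print(ecb(key, b"hunter2") == ecb(key, b"hunter2"))
print(ecb(key, b"password1")[:BLOCK] == ecb(key, b"password2")[:BLOCK])
# CBC with a random IV: the same password encrypts differently each time.
print(cbc(key, b"hunter2") == cbc(key, b"hunter2"))
```

The first two comparisons come out equal and the last one does not (barring a 1-in-2^64 IV repeat), which is the whole leak in miniature.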
> The issue becomes verifying the passwords, then: supposing you have a CBC mode oracle, like a HSM, how can you verify two passwords are the same? (This is probably the reason they chose ECB mode in the first place.)
Why is that? What's wrong with verifying a password in CBC mode with different IV?
I am assuming that the DB server cannot decrypt. If the DB server has decryption permissions, and it is compromised, then there's little need in encrypting the passwords anyway as an attacker will simply decrypt them by hand. (They may not get all of them before being detected, but they'd still get some or most --- and they might look for high-value targets!)
Unlike ECB mode, you can't just encrypt the same password again and check to see if the ciphertexts are the same --- the IVs will be different (hopefully!). So, can you somehow check two ciphertexts with different IVs to see if they represent the same plaintext? Nope, probably not: if you could, then the scheme would no longer have indistinguishable encryptions under a chosen-plaintext attack. Briefly, this ability to test if two ciphertexts represent the same plaintext would allow the adversary in the IND-CPA game to simply encrypt both of the plaintexts and test them against the challenge ciphertext. (See the reference for more info on this.)
In fact, this is exactly the property you don't want. You don't want to be able to determine if two users have the same password. If the DB server were able to test ciphertext equality, then it could pairwise-test each pair of users (or again only hunt for high-value targets).
If you were really adamant on encrypting passwords, I'd also suggest that we'd need to pad the password out to the max password length to prevent the revelation of password length. But of course, I very strongly suggest against password encryption. Still: that's another thing to consider in this hypothetical scenario.
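The IND-CPA argument above can be played out as a small game. This is only a sketch: the randomized "encryption" is a toy (an HMAC-derived keystream with a random nonce, good for messages up to 32 bytes), and `same_plaintext` is the hypothetical equality oracle under discussion. All names are mine.

```python
import hashlib
import hmac
import os
import secrets

KEY = os.urandom(32)  # known to the challenger and the oracle, never to the adversary

def encrypt(msg: bytes) -> bytes:
    """Toy randomized encryption: nonce || (msg XOR keystream). Max 32 bytes."""
    nonce = os.urandom(16)
    stream = hmac.new(KEY, nonce, hashlib.sha256).digest()
    return nonce + bytes(m ^ s for m, s in zip(msg, stream))

def decrypt(ct: bytes) -> bytes:
    nonce, body = ct[:16], ct[16:]
    stream = hmac.new(KEY, nonce, hashlib.sha256).digest()
    return bytes(c ^ s for c, s in zip(body, stream))

def same_plaintext(ct_a: bytes, ct_b: bytes) -> bool:
    """Hypothetical equality oracle: reveals only whether the plaintexts match."""
    return decrypt(ct_a) == decrypt(ct_b)

def play_round() -> bool:
    m0, m1 = b"correct horse", b"battery stapl"  # adversary's two equal-length messages
    b = secrets.randbelow(2)                      # challenger's secret coin flip
    challenge = encrypt((m0, m1)[b])
    # Adversary: one chosen-plaintext query for m0, then one oracle call.
    guess = 0 if same_plaintext(encrypt(m0), challenge) else 1
    return guess == b

wins = sum(play_round() for _ in range(1000))
print(f"adversary won {wins} of 1000 rounds")
```

Without the oracle the adversary should do no better than a coin flip; with it, every round is a win, which is why an equality test and IND-CPA security can't coexist.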
> I am assuming that the DB server cannot decrypt. If the DB server has decryption permissions, and it is compromised, then there's little need in encrypting the passwords anyway as an attacker will simply decrypt them by hand.
I am not an expert in security/cryptography, but to decrypt you need a secret key, don't you? In this case Adobe believes the cracker doesn't have the secret key which means if Adobe engineers did use CBC and random IV for each password, the cracker can't learn much even provided with hints in the database.
So if you store the IV and the ciphertext in DB, and when you want to authenticate, you encrypt the plaintext with that IV with your secret key. I don't see why that can't be done. Encryption is really only as strong as your secret key security and the IV in this case looks like a salt in hashing scheme. The difference encryption is bi-directional and hashing is one-way.
So I don't understand why they have to use ECB mode. I am not getting your point.
Why go through all that when you could use the username or email together with the password to encrypt it with 3DES ECB mode? Those would be unique and users who have same passwords would still have different ciphertexts.
3DES has a block size of 64 bits, or 8 bytes. Unlike a hash function where the whole input affects the whole output (the strong avalanche criterion and whatnot), in ECB mode encryption, data is only changed on 8-byte block boundaries.
So, for example, suppose the user's username was 8 characters, their email was 16 characters, and their password was some more characters. Then if you use
3DES-ECB(uname || email || passwd)
where || denotes concatenation, then uname and email will take up 3 blocks and the password will be the rest... this is essentially the same scenario as the Adobe leak, except that the attacker now has to worry about how the password falls across the block boundaries. In this contrived scenario, the password starts a new block of its own, so the addition of the uname/email adds no security here.
Since the uname/email lengths are public, you still might be able to cross-reference sections of identical passwords with other users, depending on how the blocks line up. In any case, this scheme still doesn't offer the unconditional security level you'd like.
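To make the block-boundary point concrete, here is a sketch that only looks at how the plaintext splits into 8-byte blocks; no cipher is needed, since ECB maps equal plaintext blocks to equal ciphertext blocks. The usernames, addresses and password are invented.

```python
BLOCK = 8  # 3DES block size in bytes

def blocks(data: bytes) -> list:
    """Split data into consecutive 8-byte blocks (the last one may be short)."""
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def layout(uname: str, email: str, passwd: str) -> list:
    """Plaintext blocks of uname || email || passwd, as ECB would see them."""
    return blocks((uname + email + passwd).encode("ascii"))

# 8-char usernames and 16-char emails: the password starts on a block boundary.
a = layout("alice123", "alice@example.io", "hunter2!")
b = layout("bobsmith", "bobby@example.io", "hunter2!")
print(a[3], b[3])  # identical final block, so identical final ciphertext block

# A 5-char username shifts everything: the password now straddles two blocks.
c = layout("carol", "carol@example.io", "hunter2!")
print(c)
```

In the aligned case the two users' last blocks match exactly, so the prefix buys nothing; in the shifted case the password is mixed with the tail of the email, which hides some matches but only by accident of length.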
I'd recommend the Matasano crypto challenge. It covers some material similar to this pretty early on, so you get your feet wet here.
AFAIK you need the same IV for both encryption and decryption.
Some calculate an IV using existing components (such as a hash of the email or name), some always use 0x0 as the IV. But the safest method is to have a random IV (preferably also stored in an HSM along with the key) per encrypted account.
"In an update on the data breach disclosed earlier this month, Adobe has said that source code for Photoshop was stolen."
I might be being overly paranoid, but I've shut down Adobe's Air/Acrobat/Flash updaters at the firewall until I hear plausible sounding assurances that Adobe didn't lose _everything_ in this breach, including software signing keys, update servers, DNS SOAs – the whole lot. _Maybe_ some of that stuff was better secured than the Photoshop source code… But would you bet every machine on your network that they "only" lost ~130million account credentials and the Photoshop source code, but nothing else?
Roughly, yes. You can't just run through the database and crack every account that used one of the top 1000 passwords (password, secret, sex, ...). But since you can see all the accounts that have the same password, that lets you:
1. identify all the people who used a popular password
2. identify anyone who happened to use the same password as one you already know, such as your own.
1 + 2 = 3. If you mount an online attack against the people from 1), once one account is cracked you instantly know all the others as well, and can probably dodge any security of the "three fails and you're in timeout" variety.
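The grouping described in these points is only a few lines of code once you have the dump. A sketch with made-up rows (the emails, ciphertext strings and the "known" password are all invented for illustration):

```python
from collections import defaultdict

# Hypothetical dump rows: (email, ciphertext as hex).
dump = [
    ("ann@example.com", "a1b2c3d4e5f60718"),
    ("bob@example.com", "99887766554433221100ffeeddccbbaa"),
    ("cat@example.com", "a1b2c3d4e5f60718"),
    ("me@example.com", "a1b2c3d4e5f60718"),
    ("dan@example.com", "0f1e2d3c4b5a6978"),
]

def group_by_ciphertext(rows):
    """Map each ciphertext to the list of accounts that share it."""
    groups = defaultdict(list)
    for email, ciphertext in rows:
        groups[ciphertext].append(email)
    return groups

groups = group_by_ciphertext(dump)

# Point 1: the biggest groups are the people who used a popular password.
by_popularity = sorted(groups.items(), key=lambda kv: len(kv[1]), reverse=True)

# Point 2: one known password (say, your own) unlocks its whole group.
known = {"me@example.com": "123456"}
recovered = {}
for emails in groups.values():
    for email in emails:
        if email in known:
            for other in emails:
                recovered[other] = known[email]

print(by_popularity[0])
print(recovered)
```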
The security afforded by the 3DES key (assuming it's a secret) should be greater than the security of a password like "adob3" even with a good hashing technique.
There's an essential note which is not quite clear from the article too: while encryption algorithms are reversible, most hashes are not -- sometimes we forget, but they're sort of by definition many-to-one (non-injective) functions -- while encryption is fully reversible (albeit mathematically "hard"). Of course, this won't matter for websites that use the same hashing algorithm, but it's an essential part of why it doesn't make sense to encrypt rather than hash passwords, apart from the risk of key compromise.
Yeah, secure hashes are (hopefully) not mathematically reversible - but when you can try 350 billion combinations per second on a GPU cracking setup, that stops being such a great defense. This is why you have to use a slow hashing function.
So do multiple things to make it harder for the naughty people, i.e. hash and then encrypt the result.
Hashing or Encrypting?
1) If you have to choose one or the other then obviously hashing (with a random salt and a relatively expensive algorithm, e.g. bcrypt et al with suitable work factor) is generally the way to go.
But it is still possible to brute force many of the easy passwords from a DB leak of bcrypt() hashes, it just takes a bit of time. From there you get to know that email address X uses password Y, which may open the door for hacking into other accounts where they've used the same passwords.
The people that get fucked over first from a DB leak of hashed passwords are those with weak passwords. A poor hashing algorithm (unsalted MD5, etc) may even expose seemingly "unguessable/random" passwords thanks to rainbow tables.
Even with bcrypt() hashed passwords you should be able to work through a huge portion of the top 100 passwords for all 130M accounts and come out with a huge number of email/password pairs.
Increasing the work factor of the hashing algorithm is a trade off, too high and you'll soon need extra servers just to cope with the extra CPU load of people logging in and having to check their passwords, too low and the hashes are easier to crack.
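As a rough illustration of option 1: bcrypt isn't in Python's standard library, so this sketch swaps in `hashlib.scrypt`, another deliberately expensive function. The cost parameters (n, r, p) are illustrative assumptions rather than a recommendation, and the function names are mine.

```python
import hashlib
import hmac
import os

# Illustrative cost settings; raising n makes every guess (and every login) slower.
N, R, P = 2**14, 8, 1

def hash_password(password: str):
    """Return (salt, digest) using a fresh random salt per account."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, n=N, r=R, p=P)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, n=N, r=R, p=P)
    return hmac.compare_digest(candidate, digest)

salt_a, digest_a = hash_password("123456")
salt_b, digest_b = hash_password("123456")
print(digest_a == digest_b)                         # same password, different salts
print(verify_password("123456", salt_a, digest_a))  # correct guess
print(verify_password("12345", salt_a, digest_a))   # wrong guess
```

The salt means two users with the same password no longer share a stored value, and the work factor is the knob the trade-off above is about.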
2) But, why not use bcrypt() to hash the passwords, and then some encryption algorithm to encrypt the resulting hashes?
Before anyone jumps in with it, doing this is not "security by obscurity" because you're not relying on the encryption key remaining secret alone to protect the password.
What it does protect you from is a basic leak/dump of the DB being open access for those who want to try cracking the bcrypt() hashes.
They're left with an initial problem of finding the encryption key before they can even start on the bcrypt() dictionary attack.
Sure it just takes someone to grab a copy of the login code (or wherever the encryption key is being stored) but you've protected yourself from a basic SQL injection attack that could be used to just dump the DB without access to the server to compromise the login code.
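A sketch of the layering idea in 2), with two substitutions stated up front: the slow hash is PBKDF2-HMAC-SHA256 rather than bcrypt, and the keyed step is an HMAC with a server-side secret (often called a pepper) rather than reversible encryption, because the standard library has no cipher to hand. The effect being illustrated is the same: a bare DB dump is not enough to start guessing. `PEPPER` and the function names are hypothetical.

```python
import hashlib
import hmac
import os

# Hypothetical server-side secret, kept in code/config/HSM and NOT in the database.
PEPPER = os.urandom(32)

def slow_hash(password: str, salt: bytes) -> bytes:
    # Stand-in for bcrypt: PBKDF2 with an illustrative iteration count.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

def make_record(password: str):
    """What goes in the DB: a per-user salt and the keyed tag, never the raw hash."""
    salt = os.urandom(16)
    tag = hmac.new(PEPPER, slow_hash(password, salt), hashlib.sha256).digest()
    return salt, tag

def check(password: str, salt: bytes, tag: bytes, pepper: bytes = PEPPER) -> bool:
    candidate = hmac.new(pepper, slow_hash(password, salt), hashlib.sha256).digest()
    return hmac.compare_digest(candidate, tag)

salt, tag = make_record("adob3")
print(check("adob3", salt, tag))                         # server with the right key
print(check("adob3", salt, tag, pepper=os.urandom(32)))  # attacker guessing the key
```

Unlike real encryption, the HMAC layer can't be peeled off later, so rotating the secret would need a different design; that is a cost of the substitution, not of the original suggestion.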
Why weren't the hints encrypted? (Including having a random n character 'salt' that is prefixed to them before encryption to prevent the same hints encrypting to the same string).
If the hints were encrypted you couldn't use them to help guess passwords.
Even if the passwords were hashed rather than encrypted, the unencrypted treasure trove of password hints would make the job of cracking the passwords much much easier.
Why were the hints even stored in the same table (or even on the same server) as the passwords? (Maybe they weren't and the hackers got both and combined the two datastores.)
Again, if the server was compromised enough that the source for the login code was obtained then the hints would effectively be in plaintext, but you've still protected the hint data from a simple DB dump hack.
I'm sorry, how can you check easy passwords with bcrypt? Hell, checking just one password per account, assuming bcrypt takes around half a second (what most reasonable implementations take) would take two years. If you want to check the 100 most common passwords for everyone, that's 200 years right there.
Half a second sounds pretty slow for authentication servers handling over 100 million accounts. Here's how I figure it: Assume about 10ms per attempt, since this is a high-throughput login system. Use a whopping 10 computers to do the cracking, and you're already under a month to try most of 100 passwords on every single account.
When using an awful password, bcrypt can only do so much. It can protect you from the ideal case of a single person with a single core that doesn't filter accounts in any way. Now consider how many people have access to this database...
Sure, someone on a single machine isn't going to get far, but that's not a great defence to rely upon.
The answer is one of two things:-
Botnets, which most hacking groups will have some access to. Being in control of a modest 10,000 machine botnet reduces that 200 years to about a week, call it 5 weeks if you limit utilisation to 20% of a single core. Expect more of this when bitcoin mining on botnets becomes less profitable.
Also, despite bcrypt() being designed not to be easy/fast to implement on GPUs because of the memory footprint required, GPUs are growing in size and reasonable implementations exist for FPGAs. ASICs would be even faster.
Upping the work factor to compensate for this makes more work for the CPUs at the site that is using bcrypt(). I know of one company that has more cores utilised in performing bcrypt() checks than they do running the HTTP and DB portions of the site.
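The time estimates traded in the last few comments are easy to check with a back-of-envelope calculation. The sketch below assumes 130 million accounts, one hash at a time per worker, and no early stopping when an account falls; those assumptions are mine.

```python
SECONDS_PER_DAY = 86_400
ACCOUNTS = 130_000_000  # assumed size of the dump

def crack_days(guesses_per_account: int, seconds_per_hash: float, workers: int = 1) -> float:
    """Days needed to try every guess against every account."""
    total_seconds = ACCOUNTS * guesses_per_account * seconds_per_hash / workers
    return total_seconds / SECONDS_PER_DAY

one_guess_years = crack_days(1, 0.5) / 365           # 0.5 s bcrypt, one machine
top100_years = crack_days(100, 0.5) / 365            # same, 100 guesses each
fast_hash_days = crack_days(100, 0.010, workers=10)  # 10 ms hash, 10 workers
botnet_days = crack_days(100, 0.5, workers=10_000)   # 0.5 s hash, 10,000 workers

print(f"{one_guess_years:.1f} years, {top100_years:.0f} years, "
      f"{fast_hash_days:.0f} days, {botnet_days:.1f} days")
```

By this arithmetic the "two years" and "200 years" figures hold up (about 2.1 and 206), and the 10,000-machine botnet does land at roughly a week. Ten single-core workers at 10 ms per hash come out nearer five months than one, so the "under a month" estimate seems to need several cores per machine or some filtering of accounts.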
I wouldn't call Photoshop an achievement, it's a hodgepodge of old libraries and bugs at the best of times.
Adobe refused to patch a vulnerability in CS5 at one point, telling people to purchase and upgrade to CS6 (US$199) if they wanted to not be vulnerable to malicious code execution. In response to the uproar they eventually backported the patch.
If Photoshop, basically industry standard in image-editing software, isn't an achievement... then nothing qualifies as an achievement. And this is coming from someone who prefers GIMP on linux. I would love to produce something even a quarter as popular & usable as Photoshop.
It might be more of a reflection on my colleagues and I, but the phrase "industry standard" is basically used to mean something is crap by us. This is in radiology. UI inconsistency, buggy, crashy (as in 40 minutes to reboot the damn system), incompatibility with other systems etc. When you complain, you're told it's industry standard, or similar. And this is just one system I use. There are other standards too. The company formerly known as Kodak supplied a PACS system which relies on IE6. It is also an industry standard. And it's as bad as things get. In healthcare IT, industry standard = rock bottom standards.
"If PS (or other system) makes your machine need a reboot, the problem isn't Photoshop" The system is provided, hardware, and all, by the manufacturer. The problem is both hardware and software. Flagship model too. GE.
Mammoth corporations have fuck-all to do with common sense. They're run by apathetic shills that couldn't care less about technology, progress, or people (not to knock the talent that works in the trenches).
I'm curious to hear when exactly people think bcrypt became accepted best practice? And how much of a grace period did people have to switch? Were you incompetently negligent if you didn't use bcrypt by 2003? 2007? 2011?
(I ask this as a fairly big fan of bcrypt myself. Somehow I just have the impression that half the peanut gallery comments come from people who literally switched over from md5 hashes yesterday and suddenly feel the need to crow about their great accomplishment.)
(the discussions usually then fragment into the "No, use scrypt instead! GPUs! HashCat! ASICs! Memory-hard vs iterations-hard!" thread, and the "but what if I use an application salt with my MD5 hashes? Or invent my own complification techniques (and keep using MD5)?")
While Coda's 2010 blog post is clearly the most commonly linked-to bcrypt reference – the post itself includes many links, including one to an article by Derek Slager (quoting tptacek) from 2007, links to both Java and Perl implementations from 2006, and a link to a Usenix paper from 1999.
If Adobe didn't switch to intentionally-slow hashes with proper salting until "last year", that puts them over 20 years behind "best practice" (as well as 2 or 3 years behind fully deserving of online mockery, laughable uninformed-newbie levels of security engineering).
Why is SHA-2 in the list? [Looks like cperciva beat me to it, 0 minutes ago]
Something I've always wondered about SHA and MD5, though: if you feed the output of a hash function into its input enough times, will you eventually reach the original value? Will you have traversed the entire output space of the hash, or will there be multiple closed loops, or perhaps even multiple starting points converging on a single terminal loop?
There has to be at least one loop (or fixed point that maps to itself). There's only a finite set of outputs, and the input space of strings as long as a hash is at least that large (in theory, some output hashes may not be possible to generate). Ideally you'd have very large, long loops, which implies few collisions.
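You can poke at this empirically by shrinking the hash until the whole space is small enough to walk. The sketch below truncates SHA-256 to 16 bits, which is my own simplification; the full-size function can't be explored this way, so treat it as an illustration of the shape rather than a statement about real SHA or MD5.

```python
import hashlib

NBYTES = 2  # truncate to 16 bits so all 65,536 values can be enumerated

def f(x: bytes) -> bytes:
    """The iterated map: hash the value, keep the first NBYTES bytes."""
    return hashlib.sha256(x).digest()[:NBYTES]

def tail_and_cycle(start: bytes):
    """Walk start, f(start), f(f(start)), ... until a value repeats.

    Returns (tail, cycle): steps taken before entering the loop, and the loop length.
    tail == 0 would mean the walk eventually returns to the starting value itself.
    """
    first_seen = {}
    x, step = start, 0
    while x not in first_seen:
        first_seen[x] = step
        x = f(x)
        step += 1
    tail = first_seen[x]
    return tail, step - tail

tail, cycle = tail_and_cycle(b"\x00\x00")
space = [i.to_bytes(NBYTES, "big") for i in range(256 ** NBYTES)]
reachable = {f(x) for x in space}

print(f"tail {tail}, cycle {cycle}, {len(reachable)} of {len(space)} outputs reachable")
```

For a random-looking function you'd typically expect a nonzero tail (so you never get back to where you started), several separate loops with many starting points funnelling into each, and a good chunk of outputs that nothing maps to.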
As a small aside about Adobe security practices, I needed customer support from them the other year (its own horror story). As part of this, customer support insisted on setting up an Adobe account, although I'd purchased the product through a third party vendor and not directly from Adobe (as it turns out, thank goodness!).
When looking up that account information, I saw the note I made as to the original password they gave that account that they set up: "123456". I had changed it away from that; I suspect a significant number of their users might not have.
Glad that account contained only a name and ZIP code / town.
AND the serial number. If someone consumed a spare slot on the serial number, I shudder to think of how many hours on the phone with Adobe support it might take to get that slot freed up.
So, if Adobe engineers eventually realized that they needed to upgrade their password security, and they had access to the passwords in their DB (they used 3DES, and they had the key) - why did they not immediately decrypt and hash all passwords?
I suspect as "the worst thing ever on the internet" it'll do wonders for sales/downloads of 1Password/Keypass/Lastpass/whatever, and bring awareness of the need, as well as the ability to easily have separate strong passwords for _every_ site/account/login to "the general public" - which'll be a big win for internet security generally (at the expense of everybody who loses out from exposed and re-used credentials).
I have a possibly-not-too-paranoid suspicion that "the worst thing resulting from the 2013 Adobe compromise" may be yet to be revealed. People have joked for years about the "Adobe updater virus" – but what's protecting everybody who's now so familiar with weekly or monthly Acrobat/Flash/Air update boxes popping up asking for admin credentials? If they lost the Photoshop source code, is it even vaguely plausible to suggest they couldn't possibly have also lost root on the update servers, or their SSL private keys, or admin access to the dns zonefiles, or the adobe.com registrar credentials, or any of the other steps in the chain that'd allow attackers to push a malicious Adobe update?
I will probably download the tarball to check, but it would be swell if that "check your email address" form allowed pattern matching, so I could just grep for a substring instead of typing in my whole address.
Edit: just for laughs:
email@example.com was found. You need to change your passwords now
I wonder what his password is. I guess we'll find out eventually.
The article references xkcd comic 792, and I totally agree: at any given time there is a website people are going to sign up for without ever realizing that its author stores everything in plaintext and could get their username and password. This is why we need to push identity services like Persona.
"Adobe says that they've followed best practices for password storage and protection for more than a year now, as their authentication systems were upgraded to use SHA-256, with salt, to protect customer passwords."
Absolute bullshit! SHA-256 with salt is totally inadequate for password storage, they should use a PBKDF like scrypt.
Depends on the number of rounds used. It can be brought up to a reasonable time complexity this way and is definitely not inadequate. Worse than some alternatives? Sure. But it still gives reasonable protection.
That is the standard in the password libraries really. passlib uses thousands of rounds of sha-256, so does glibc, etc.
Unless they implemented the whole thing from scratch, they wouldn't be using a single run of sha-256. It's not impossible, but I'd say at this point it's unlikely they're doing something silly - it would be a job terminating mistake for whoever implemented the new system after the last fiasco.
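To show what "number of rounds" buys, here is a small sketch. It uses PBKDF2-HMAC-SHA256 from the standard library as a stand-in; the sha256-crypt scheme in passlib and glibc is its own construction, and the round counts below are arbitrary picks for illustration.

```python
import hashlib
import os
import time

def stretch(password: str, salt: bytes, rounds: int) -> bytes:
    """Salted, iterated SHA-256 via PBKDF2 (stand-in for sha256-crypt)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, rounds)

salt = os.urandom(16)
for rounds in (1, 5_000, 500_000):
    start = time.perf_counter()
    digest = stretch("adob3", salt, rounds)
    elapsed = time.perf_counter() - start
    print(f"{rounds:>7} rounds: {elapsed * 1000:8.2f} ms  {digest.hex()[:16]}...")
```

The cost per guess grows roughly linearly with the round count, for the attacker and the login server alike, which is why a single salted SHA-256 and a many-round one are very different propositions.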
How long have you been doing that? There's an email address of mine in that list that is much more likely to have been given to Macromedia back in the late '90s or very early 2000s - I was working at a different company in '02 and am quite unlikely to have given the older email address out after then…