Hacker News new | comments | ask | show | jobs | submit login
Cracking Passwords In The Cloud: Amazon’s New EC2 GPU Instances (stacksmashing.net)
199 points by ssclafani on Nov 15, 2010 | hide | past | web | favorite | 88 comments

This is insane for password cracking. I've been trying to get Pyrit working on the Centos image but have been having trouble compiling it. An 8 node cluster with Teslas is going to bring WPA-PSK cracking down to where WEP was a few years back for those who can afford it.

This is an incredibly disruptive thing for Amazon to do. They've just brought near-government grade crypto-breaking capabilities to the mass market.

They've just brought near-government grade crypto-breaking capabilities to the mass market.

No, they really haven't. Near-government grade KDF-cracking capabilities will be when Amazon announces FPGA Compute instances.

Last time I checked, GPU processing has a better-bang-for-the-buck than FPGA processing, and the gap continues to widen.

I suspect the NSA doesn't care too much about bang-for-buck. 22nm FPGA's [1] seem like they would work pretty well.

In an interesting twist, The Register claims that Achronix's decision to use Intel was driven in part by national security considerations. We've reported extensively on the idea that chips fabbed overseas in insecure facilities could contain hidden kill switches or backdoors that would let an opponent cripple the US military, and Achronix allegedly wants to be able to sweeten its pitch to military customers by offering a home-grown solution.

[1] http://arstechnica.com/business/news/2010/11/intel-shifts-st...

Those are particularly interesting because they're asynchronous FPGAs -- they use local handshaking rather than a global clock to keep everything synchronized. That should make them easier to port to new, smaller process nodes, and they say it's responsible for their unusually high throughput.

Cool stuff, and all the more intriguing considering that Intel's getting involved.

I suspect the NSA doesn't care too much about bang-for-buck.

Bang-for-buck is pretty much the name of the game in brute-force cracking. You're right that NSA probably doesn't have any budget constraints, but they'd still be interested in getting the most hashes/second possible out of $10 million.

World's fastest supercomputer is now nvidia's gpus: http://spectrum.ieee.org/tech-talk/computing/hardware/china-...

Maybe, maybe not. see http://www.renderlab.net/projects/WPA-tables/

Basically, if your WPA network is vulnerable now, it already was vulnerable. A decent password and non-default ssid will still slow people down.

One would imagine that this isn't even remotely close to what government agencies have in terms of crypto breaking power. Don't underestimate what a basically unlimited budget can buy you.

For every instance you an go out and get they probably have a football field sized room full of stuff twice as fast.

Please don't assume that I'm referring to the US Government. I'm not referring to any specific government, just the concept in general.

The major powers have particularly good access to kit and people. There's a significant number of countries for whom this is actually more than they have (in fact I know of two or possibly three countries where the full capabilities definitely supercede existing ministry of interior type capabilities).

> I've been trying to get Pyrit working on the Centos image but have been having trouble compiling it. An 8 node cluster with Teslas is going to bring WPA-PSK cracking down to where WEP was a few years back for those who can afford it.

Are there legitimate uses for password cracking or is this about getting access to other people's accounts?

At my day job we mostly do penetration testing and incident response. Sometimes we need to crack passwords so we have these huge files called rainbow tables that can be used to look up a very high percentage of possible passwords for given algorithms, but aren't infallible and take time to search.

We sometimes crack passwords to do a password strength audit. Sometimes dictionaries aren't really enough (as someone might have chosen an obvious word in another language) so it's easier to just crack the passwords, automate analysis of the obvious and then scan through anything left behind.

WPA-PSK cracking is particularly useful in the UK Local Government sector, where local government in most cases needs to have an annual penetration test, often including their wireless networks. A lot choose WPA-PSK because it doesn't mean spending money on a full-blown wifi network.

The other thing we use password cracking for is when we do incident response work. Sometimes people encrypt things like documents or use PGP containers. This isn't quite making PGP cracking feasible, but certainly the majority of encrypted word documents should be doable with this.

Thanks for the information, that's very interesting.

The sha1 hashes he provides are super weak. I can crack half of them in less than 30 seconds on my CPU with my software (16crack). Hardly material for a GPU:

EF8420D70DD7676E04BEA55F405FA39B022A90C8 "Password!"

5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8 "password"

A9993E364706816ABA3E25717850C26C9CD0D89D "abc"

1902E3D6FC4E78A0BCC50BA12B882769AFBF4A8C "bad"

8F2005004F8BAA7A1090A9BF3B03C48D38E78157 "P4s$"

CD3724AC40034097A3D27865D710E4F791B6AEDB "Bwah"

7110EDA4D09E062AA5E4A390B0A572AC0D2C0220 "1234"


How long does it take your software to crack the other half?

That depends on the number and type of CPUs available. I know better than to race my CPU against a GPU. I won't win. I'm just pointing out the fact that if you have a fast GPU (or a cluster of them as the article suggests), then the hashes should be much better than what is provided. With those hashes, any CPU based cracker can do them easily.

The article says 1 to 6 character passwords. With 12 cores, I can enumerate the full printable ASCII character set in about 30 hours, but I would not try to do that. I would use patterns and word lists first. After 6 to 7 characters, brains begin beating the hell out of brute speed.

In the article it's stated that the hashes are brute forced using the provided charset. I think you used a wordlist, which is a whole other thing.

I use brute force on 0 to 5 character attempts. I get smart at 6 or more characters. No need to use brains or GPU speed to crack 5 char passwords. Here are more weak hashes from his list:

DA39A3EE5E6B4B0D3255BFEF95601890AFD80709 "" no_pass

0D824508182A1AA0EEF9A0B6EE52F8A32AF06F0A "GoOd!" brute_5

A94B95A7A4D432DE056B0030DA879AF841376069 "GPGPU" brute_5

BFE06C47BE2390ACA934AB6A128C141DCEB4072F "G0o|)" brute_5

My point still stands. These should have been better/stronger hashes. I did all of that in less than an hour on a CPU using full enumeration (aka brute force).

>This just shows one more time that SHA1 is deprecated

This just shows ignorance about hashing functions, especially fast ones. If they had used SHA-512, or say SHA-65536, it wouldn't be any more secure against brute-forcing / dictionary attacks. Barring SHA1 being cracked - ie, finding an efficient way to find SHA1 collisions - ie, "reversing" SHA1, it's no more deprecated than any other non-cracked hashing function.

About your only option is s/bcrypt, or something similar, which are intentionally slow / hard, to defeat brute-force attacks like this.

Allow me to be the first one to argue that we should require that users use longer passwords. This is not much of a burden. Your password will last for a very long time, and it takes only a few minutes to memorize a long password.

If computing power doubles every 2 years, then every two years we should bump up the required password length to match. If each character is randomly chosen from amongst 2^6=64 possibilities, then every 6 years we add 1 to the number of characters required.

For instance, if today we require 12 character passwords, which are clearly uncrackable even with Amazon's fancy new GPU instances, then in 2016 we require everyone update their password to a minimum of 13 characters. Every six years we bump it up again.

I think people vastly underestimate their ability to memorize passwords (just think how many phone numbers you probably had memorized back in the day). You could probably memorize passwords thousands of characters long if you bothered, and the only problem is that it would take too long to type them in.

You only need one long password, and all others can be stored in an encrypted file. This is not much to ask.

Bcrypt and scrypt look great, but not fundamentally any better than longer passwords.

> Bcrypt and scrypt look great, but not fundamentally any better than longer passwords.

A technical-only solution is infinitely better than one where you try to change user behavior. You think memorizing a long password is easy because it's easy for you. Try telling that to someone who's 72 and just started using the Internet.

The reality is that anything that reduces friction to adoption is almost always a positive choice for any given company. There are exceptions, like banking, but for the most part, this is true.

This all ignores the fact that longer-length passwords are almost completely pointless, anyway, for a ton of reasons:

(1) The data you store on behalf of the user is probably not important enough to warrant the (very strong) inconvenience.

(2) Proper password storage ((b|s)crypt) can already mitigate a lot of these risks. If tuned properly, even a 4-character bcrypt password can become more computationally difficult to "reverse".

(3) Short passwords can be bruteforced over the wire? Well, you can prevent people from doing this. You control the environment and can make brute-forcing attacks against your login mechanisms unfeasible.

(4) For this to matter at all, some attacker has to steal the entire authentication table with all of the hashes. If that happens, the number of ways you're fucked is much larger than just your users having to change their passwords where re-used elsewhere.

Fundamentally, for most use cases, it should be a user's choice to opt to use a longer password that would be more difficult to crack, or use a shorter password for convenience.

Google already does this by showing a password strength bar when choosing a password. Unless you store very sensitive data, who are you to make that decision on their behalf?

A good long password - the first few words of your favorite catchphrase in lowercase with no punctuation. For example: "well thats the funniest thing". I'm not sure how strong it is, but it doesn't seem too bad.

Some UNIX geek once wrote a tutorial saying that good passwords have a random collection of uppercase, lowercase, and punctuation marks. It's too hard, so people just use "Pa$$worD".

We can't even get people to dump IE6. How the hell are we going to get them to use longer passwords, and change them every year?

Scrypt is better than longer passwords because it's actually possible to implement in practice.

People use IE6 because of external forces. You can add longer passwords to your application and still make it work with IE6.

Some do, some don't. I remember a digg survey a while back - a significant number responded that they had not upgraded because they felt no need to.

So just drop IE6 support - those people will soon wake up. But that has its own problems. If 10% of my users use IE6, and I drop support for it, that could have a significant effect on conversion rates and the like. Those people are going to do something else, maybe even go to a competitor.

The same applies to passwords. If, say, tumblr suddenly required 12 character passwords, then it would be quite a hit to their signup rates. This is simply not going to fly when there exists far less drastic measures (i.e. scrypt).

> How the hell are we going to get them to use longer passwords, and change them every year?

You merely require them to use longer passwords, and require them to change them every year. My university does this. They require a minimum of 8 characters, and they require that we change our password every 3 months or we can't log in.

If I had to come up with a new long password every three months I'd do what undoubtedly countless other people would do in the same situation: I'd write down my password somewhere nearby the computer so I could look it up when I needed it.

Overly onerous password requirements reach a point where they no longer increase security, they just shift vulnerability to a new area. They also piss off users.

If I had to change my password every three months, I'd do something even simpler: I'd stop using the service.

How many people do things like changing their password from "soccer5" to "soccer6" every three months?

That's avoided by not allowing more than three consecutive characters from the old password to be in the new password. It gets really annoying, trust me.

That would mean that they're storing the passwords themselves, hopefully encrypted, rather than just a salted slow hash of them. That makes me nervous. Should it?

you could have a form asking for the previous password and the new password... It then checks the previous password against the salted hash and then has the information to compare changes between the old password and new password without having to store anything

Then you[1] can just alternate between two sequences of passwords: password1, cleverme1, password2, cleverme2, ...

[1] Meaning: anyone who wishes to use the service but isn't willing to come up with an unending stream of genuinely different passwords for it.

Regardless of the length, passwords are obsolete, at least when used in single factor systems. The biggest problem with authentication is identity assurance, an attribute that passwords simply don't possess. Better and more factors are necessary to provide true security in the future, and we're simply making due with passwords in the meanwhile. Phishing and Firesheep are ready examples of the weakness of passwords, and they're not even the easiest methods of compromise available. Passwords have a use, but so do other forms of security-by-obscurity, a category that passwords will inevitably fall into.

Payment Card Industry (PCI DSS 1.2) only requires 7 characters and some must be alphabetic while some must be numeric. So "soccer1" is a perfectly valid, PCI compliant password that will be cracked in less than a minute (offline or online).

Section 8.5.10 Passwords must be at least 7 characters long.

Section 8.5.11 Passwords must contain numeric and alphabetic characters.

OK, I'll bite (and show my ignorance regarding security). What should we use in place of SHA1?

In short, for passwords, never use a hash function. Hash functions are made to be fast. If you absolutely must use a hash function, iterate it a few thousand times.

Ideally, use http://www.tarsnap.com/scrypt.html (made to be computationally annoying to brute force).

made to be computationally annoying to brute force

I think the word "annoying" here wins you the understatement-of-the-day award.

I've used bcrypt [http://bcrypt-ruby.rubyforge.org/] in the past. Automatically handles a salt, and you can "tune" the number of iterations, so you can pick the cost. You can make it so the hash function takes 300ms on fast hardware, which limits the rate that an attacker can brute-force your database.

Edit: Wrote up this comment before the one about scrypt. It also looks nice, but it looks like there's no or only primitive language bindings available.

You're just fine with bcrypt. scrypt is almost certainly better, but even iterating SHA1 repeatedly is still acceptable. What isn't acceptable is using a naked hash function (or a naked hash with a "salt").

Really? All I've been doing is just picking 8 random characters as a "salt", sticking it to the password, and SHA-1 it, and you're saying its not secure? Uh oh.

The good news is, there are libraries for bcrypt for most every major language out there, and they are extremely simple to use. Some languages also have scrypt libraries, which is even better. Either will be a huge improvement over plain salted SHA-1.

"Why Not {MD5, SHA1, SHA256, SHA512, SHA-3, etc}?

These are all general purpose hash functions, designed to calculate a digest of huge amounts of data in as short a time as possible."

See the rest here: http://codahale.com/how-to-safely-store-a-password/

designed to calculate a digest of huge amounts of data in as short a time as possible.

That's a weakness, not a strength. If you can only calculate 100 hashes per second, it will take a lot longer to crack a password than if you can calculate 100 000 hashes per second.

That's the point of that article, use bcrypt because it's slow.

Misinterpreted the parent post. Thanks for the correction.

At the very least, add a random salt added to each plaintext before you run it through the SHA1. This will at least defeat rainbow attacks fairly well.

That said, you need to do more and move beyond SHA1 since you can now reverse a SHA1 into plaintext with the computing power EC2 gives you.

It should be pointed out that it's impossible to "reverse a SHA1 into plaintext". The reason being that, since SHA1 produces a fixed size output and takes a variable (unbounded) input, there are an infinite number of input values for every unique output value. You may find a string that happens to come to the same SHA1, but there's no way to know whether or not it's actually what went into the SHA1 algorithm in the first place.

Actually, you can reverse a SHA1 into plaintext. That's exactly what we're talking about.

Can you know it was the same plaintext that went in? No, but as far as the SHA1 output is concerned, it's a suitable input and a suitable reverse.

For the application we're discussing, the fact that collisions can be engineered is sufficient to generate a reverse of the hash. At no point do you need to generate the original input in order to reverse it for this application.

But that's ok, the server doesn't know what originally went into the algorithm either.


While completely true - it's not an obstacle in a simple implementation of password hashes. Any string that produces the same hash will suffice as a password - as the password is by definition anything that hashes to that value.

Yes, but you generally don't just care about recovering a password that happens to hash the same, so you can use it against a given service; if you've already managed to get hold of their hashed passwords, you usually have considerable control already. The non-recoverability of hashes comes into play when you consider that people reuse passwords on other sites, so if you could reverse all the password hashes for one site, it's conceivable that you could get into the users' accounts on another site. But if you're not recovering the original string, the chance of the password hashing the same way on another site is slim. Of course, so is the chance of a collision in the first place, making all of this purely theoretical in the end.

  > you can now reverse a SHA1 into plaintext
I thought that all hash algorithms were lossy. Would I be able to reverse a SHA-1 of a git commit back into the contents of the git commit with enough computing power?

The term "reverse" is not the right word to use. What is meant is that you can generate another input for which the SHA1 output matches your other SHA1 output. This is because SHA1 has collisions.

So "foo" could hash to 1 and "bar" could hash to 1 also. It doesn't mean 1 means foo, but it means if your password is foo, bar works too when hashed and compared against a stored hash.

If the original input is not very short, it's extremely unlikely that an input with the same SHA-1 hash could be found. These attacks work because the passwords are weak and SHA-1 is fast to compute, not due to any weakness of SHA-1 as a cryptographic hash function.

> That said, you need to do more and move beyond SHA1 since you can now reverse a SHA1 into plaintext with the computing power EC2 gives you.

But this only works for small numbers of characters. You cannot inverse-hash a book length document. You also still can't inverse-hash a 16 character or probably even 12 character password.

This fast hashing stuff is only a problem because people use ridiculously short passwords. The examples in the article are a lazy, pathetic 6 characters. I think people vastly underestimate their ability to memorize passwords. For instance, I am able to memorize a 16 character password by just writing it down a few times and then testing myself. And since I can remember many such passwords, I could easily memorize 32 character passwords, which are probably impossible to crack for the next 50 years or so. And you really only need to remember one such password, because all your other passwords can be put in an encrypted file.

technically you aren't reversing it, you are calculating it through brute-force.

best analogy for a hash is smashing a plate. you may be able to glue it back together but it is not the same plate

btw I wonder how long before somebody calcs and loads a rainbow table onto AWS and charges for lookups. so tempted...


Runs a WPA cracker against your provided pcap files for $17- $40, depending on how fast and how extensive a word list you want to use. (Yes, I realize they aren't using rainbow tables, but WPA uses the SSID as salt, so that wouldn't be as usefu.)

Adding a few cracks and glue doesn't make it a different plate...

SHA-2 would be a good choice. The algorithm is based on SHA-1, but avoids the same vulnerabilites found in the SHA-1 algorithm.


SHA-1's vulnerabilities are entirely irrelevant to its unsuitability as a KDF, making SHA-2 no more suitable. As others have recommended, use scrypt or PBKDF2.

Thanks for this. :) Learn something new every day.

First of all, only the initial release day, and the EC2 GPU killer app has been discovered.

Second, store passwords as a salted hmac, not just a shaXsum. It is still usually a singel command, and WAY more secure than simple shaX or mdX, as it eliminates the risk of prefix/postfix attacks. Adding the salt makes dictionaries pretty irrelevant (as long as each install has a unique salt).

I mean GEEZ; even php does it now: http://us2.php.net/manual/en/function.hash-hmac.php

An HMAC is a message authentication code, not a salted hash. It allows you to send a message over an insecure link and verify, by providing the secret key, that the message (and hash) was not tampered with.

Also, a salted hash is not a particularly good storage mechanism, ranking only above cleartext and naively-hashed storage. Once someone discovers the salt, then it's fairly easy to attack the password database in the same way that you'd attack an unsalted hash -- make a list of all possible passwords, hash them, and see what matches. Machines like the one mentioned in the article can make trillions of these in a second. The problem with hashes for protecting passwords is that they can be calculated very quickly; good for ensuring that every packet on your VPN arrived without errors, bad for ensuring that a bad guy can't make a list of passwords, hash them, and check the hashes against your database.

The best password storage mechanism is via the use of a "slow" hash function like bcrypt or scrypt. You set these up so that it takes a full second of CPU time to generate the hash from the message. Then instead of trying trillions of passwords a second, the attacker can only try one!

If you're using Perl, use Authen::Passphrase. It's a module that lets you easily use bcrypt for new passwords but your old method for older passwords. With an API like that, there's no excuse for endangering your users by using salted hashes!

HMAC does nothing to prevent iterated brute force cracking, the kind of cracking that most benefits from GPU acceleration. There is virtually no security benefit to using an HMAC construction over the bare hash function itself.

Listen to 'jrockway.

This is something I've been wondering about - how can you use pre/postfix attacks on passwords? I can't see any obvious ways it would help, and googling hasn't produced any information.

Shouldn't the nonce be different for each password rather than for each install (I take it that by "install" you mean a server or an application installation)?

Nonce-per-password is used in BCrypt and while I don't know much about security, that's the vibe I'm getting from the folks who do.

With the right hash type (unsalted md4... yes, I'm looking at you Microsoft Windows Active Directory) and a top notch Nvidia or AMD graphics card, one can attempt roughly 600 million hashes per second at home in the living room.

Also, one of the teams used Amazon servers during the 2010 Defcon Crack Me If You Can contest. Here is their write-up: http://contest.korelogic.com/team_CrackHeads.html

600 million? Try 10x that. A top notch HD 5970 does 6 billion MD5 hashes/sec. The MD4 rate is roughly 25% faster.


While the author does not seem to know much about computer security, he identified a legitimate threat. Brute force isn't so brute anymore.

We should stop calling them passwords to users, seeing as the best passwords are the concatenation of several non-words.

I believe we made a misstep with the password. I learned a long time ago that how you frame something for the user changes how they will interact with it. A common example I refer to is a small textarea on a website, and some clients wanting to see that textarea larger to encourage their users to write more in the box, or witnessing a user stopping and editing their post to frame it inside of the available area without going over.

The same with passwords, they imply a word. A much better solution is a pass phrase. As far as the system is concerned, functionally identical. But to the human mind, a completely different animal. A word is a word, but a phrase is limitless. With proper punctuation and capitalization, it has everything that makes a good password good: A-z, symbols, length. Except, being a phrase, it has an edge to a long complex password: you can remember it. A phrase has a beginning, middle, and an end.

Passphrase still implies using actual words, passcode on the other hand does not.

That said, for the most part people who chose weak passwords do so for ease of memory, not because they're so stupid that they think they are only allowed words.

Using actual words isn't such a bad thing if what you want to optimize is password entropy at a given level of memorability for naive users.

What about passwords 8 or more characters long?

You can easily calculate that using the charset and the time it took to crack a 6 character password. 6 character passwords with 95 different characters per digit take 49 minutes (Utilizing one machine and using CUDA-Multiforcer). 7 digits would mean 49*95=4655 (77 hours). But remember: The common password does not use special chars, so the actual number can be much lower.

If anyone is looking for a super simple bcrypt solution for PHP, check out Phpass:

http://dev.myunv.com/articles/secure-passwords-with-phpass/ http://www.openwall.com/phpass/

Anybody know how tarsnap's scrypt lines up against HMAC-based key derivation[1]? In particular, let's say I used the HMAC of the password as the bases for the HKDF?

[1] http://tools.ietf.org/html/rfc5869

HKDF isn't a password hash; it's a key derivation function, used to transform a passphrase into key material suitable for something like AES. It doesn't address the key security problem that scrypt addresses (iterative brute force cracking). It isn't an acceptable substitute.

You can use PBKDF2 as a password hash (even though it too is designed mostly to turn passphrases into keys), because PBKDF2 is iterated to slow down brute force attacks.

isn't it illegal for a German to post on how to crack passwords ?

Why ?

In Germany, linking to illegal material can be considered a crime itself, for example linking to pirated software/media.


IANAL but I suspect the OP feels the same laws could be applied here.

No. Germany (and the UK) applied the American drafted European legislation that says it is illegal to distribute software that could be used to break into computers.

http://www.securityfocus.com/brief/567 http://www.schneier.com/blog/archives/2007/08/new_german_hac...

This has had a chilling effect already

for instance


On September 2007, to comply with new German laws regarding distribution of hacking tools to the public, THC stopped making the program available.

I see. I was not aware of this, thanks for the details.

If this becomes a serious issue, my bet is that the government will be forced to start monitoring or background-checking GPU cluster use. This is one case where the cloud will hurt privacy, because Amazon will not risk their empire to protect criminals. Distributed computers cannot be subpoenaed so easily.

Not sure about this, but I could definitely see "export restrictions" on the service... though what's to stop a foreign company from building the same type of service.

export restrictions easily circumvented by getting a US credit card, just like you need one for Amazon Mechanical Turk

Applications are open for YC Summer 2019

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact