
Password Cracking with 8x Nvidia GTX 1080 Ti GPUs - EvgeniyZh
https://www.servethehome.com/password-cracking-with-8x-nvidia-gtx-1080-ti-gpus/
======
mrb
In 2010 I built an 8-GPU machine[1] (4 dual-GPU AMD HD5970) and wrote an MD5
bruteforcer (then faster than hashcat) that did 28.6 billion password
hashes/sec, then 33.1 billion after a software optimization:

[http://blog.zorinaq.com/whitepixel-breaks-286-billion-passwo...](http://blog.zorinaq.com/whitepixel-breaks-286-billion-passwordsec/)

It's interesting to note that 6.5 years later a single GPU like the Nvidia
1080 Ti can match the whole 2010 machine (32 billion hashes/sec). This is a
doubling of speed every ~2 years. Moore's Law is still alive and kicking
(contrary to what many claim)!
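
As a rough check of that doubling rate, here is a minimal sketch. It assumes the comparison is per GPU: one 2017 card matching eight 2010 GPUs is treated as an 8x speedup over 6.5 years. The function name is illustrative only.

```python
import math

def doubling_period_years(speedup, elapsed_years):
    """Years per doubling implied by a total speedup over a time span."""
    return elapsed_years / math.log2(speedup)

# One 2017 GPU matching eight 2010 GPUs: 8x per-GPU speedup in 6.5 years.
period = doubling_period_years(speedup=8, elapsed_years=6.5)
print(f"{period:.2f} years per doubling")
```

Three doublings in 6.5 years works out to a bit over 2 years each, which is where the "~2 years" comes from.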

[1] Incidentally posting this machine on HN is how I got pointed to Bitcoin
thanks to the reply of a HN user :)
[https://news.ycombinator.com/item?id=2003888](https://news.ycombinator.com/item?id=2003888)

~~~
mtgx
> Moore's Law is still alive and kicking (contrary to what many claim)!

That statement is so often misunderstood, in multiple ways.

First off, Moore's Law isn't technically about performance increases. It's
about doubling of transistors every 2 years on the same die space. We still
got that on _CPUs_ until very recently, even though CPU _performance_ has
stopped doubling every 2 years about 15 years ago. But now even the transistor
count doubling doesn't happen every 2 years - _for CPUs_. If it still did,
we'd all have 32- or 64-core laptops by now. And even then the
performance wouldn't be anywhere close to 64x (compared to a single core).

The second misunderstanding, which exists in your post, too, is that most
people refer to Moore's Law dying when they talk about CPUs (especially if we
discuss performance).

However, GPUs have indeed kept roughly doubling their performance every 2
years, also until very recently. Now it's more like a 30% improvement per year,
or 70% improvement every 2 years, which is still _way_ better than CPUs.

The reason for this is that GPUs pretty much scale with the number of
cores. If you have 4000 CUDA cores in 2017 for 150W TDP and a $1000 price,
then you're going to have, let's say, 7000 CUDA cores for 150W TDP and a $1000
price in 2019, and the performance will be about 70% greater. It doesn't work
like that for CPUs, and it hasn't for like 15 years.
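
To see how the yearly and two-year figures relate, here is a small sketch of the compounding. It assumes performance scales linearly with core count, as the paragraph above does; the helper name is made up for illustration.

```python
def compound_growth(rate_per_year, years):
    """Total fractional improvement after compounding a yearly rate."""
    return (1 + rate_per_year) ** years - 1

two_year_gain = compound_growth(0.30, 2)   # 30% per year, compounded over 2 years
core_gain = 7000 / 4000 - 1                # the 4000 -> 7000 core example

print(f"30%/year over 2 years: {two_year_gain:.0%}")
print(f"4000 -> 7000 cores:    {core_gain:.0%}")
```

Both land in the neighborhood of 70%, so the two ways of stating the trend are roughly consistent.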

~~~
mrb
On a GPU more transistors = more compute units = more performance. Hence my
over-simplification of Moore's Law.

I strongly disagree that the rate of CPU perf improvement has slowed down "15
years ago". What a laughable statement. You have to look beyond core count to
gauge performance. Microarchitectural improvements, new instruction sets (SSE,
AVX), bigger caches, etc. certainly still help keep up the pace. Have a look:
[https://www.hpcwire.com/2015/11/20/top500/](https://www.hpcwire.com/2015/11/20/top500/)
In particular: [https://6lli539m39y3hpkelqsm3c2fg-wpengine.netdna-ssl.com/wp...](https://6lli539m39y3hpkelqsm3c2fg-wpengine.netdna-ssl.com/wp-content/uploads/2015/11/TOP500-SC15-Tech-Trends-Scalar-Processors-Moores-Law-is-fine.png)

Also, perhaps you are confused by the fact that the average wattage of CPUs
sold to consumers is dropping. If, from the performance of a 60-70 watt Netburst
Pentium 4 from 2001, you project the expected performance of a 2017 CPU
according to Moore's Law, then you should look at today's 60-70 watt CPUs, not
at modest 10-20 watt CPUs that seem to be quite popular these days.

------
beefsack
We all know MD5 is broken, but looking at it from purely a brute-force
perspective:

If you look at a US English keyboard, you've generally got 47 unique character
keys. Let's double it and say there are 100 different characters you can type
just using the character keys and shift.

This machine could brute-force crack any 6 character password in under 4
seconds, any 7 character password in just over 6 minutes, and any 8 character
password in just over 10 hours.

And that's using the least efficient method known.
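
For anyone who wants to reproduce those times, here is a minimal sketch of the arithmetic. The hash rate is an assumption: 2.6e11 MD5 hashes/sec is picked only because it is roughly what the quoted times imply, and it is not a figure stated in this comment.

```python
ALPHABET_SIZE = 100      # the rounded-up keyboard character count used above
ASSUMED_RATE = 2.6e11    # hashes/sec; an assumption chosen to match the quoted times

def exhaustive_search_seconds(length, alphabet=ALPHABET_SIZE, rate=ASSUMED_RATE):
    """Worst-case time to try every password of exactly `length` characters."""
    return alphabet ** length / rate

for length in (6, 7, 8):
    secs = exhaustive_search_seconds(length)
    print(f"{length} chars: {secs:,.0f} s ({secs / 3600:.2f} h)")
```

Each extra character multiplies the time by the alphabet size, here 100.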

~~~
pfalke
As a non-infosec guy, could someone shed more light on the implications for
end users?

I get that the combination of password reuse, short passwords and the fact
that some services store passwords in plain text or as MD5 hashes makes it
easy to break into accounts once a single service is compromised.

So my takeaway is not to use longer passwords, but to use a password manager
and have unique passwords for every service. My current setup is 8 character
passwords for online services (easier to occasionally type in manually).

Am I running a risk by not using 12 character passwords?

~~~
dahart
I'm personally convinced 8 chars is now too short to be safe, and I suspect
real attacks are generally much faster than 8 hours for a password of that
length.

Using a password manager to generate random passwords makes you impervious
to dictionary attacks, and it lets you generate and manage longer passwords.
I'm generally using 20 char passwords, and I'd turn it up further if there
weren't so many stupid websites that limit the max length of passwords to 20
characters.

For passwords I need to type, especially if I need them occasionally on a
touch screen tablet, I'll use a long all-lowercase letters password. Some
password managers have a 'make pronounceable' option as well that gives random
syllables and makes typing a 20 char password easier than typing an 8 char
password of completely random characters. 20 chars of lowercase alpha is
_a lot_ more secure than 8 chars of mixed-case alphanumeric and punctuation.
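
A quick sketch of that comparison in bits of entropy. It assumes every character is chosen uniformly at random, and it takes 94 printable ASCII symbols as the mixed alphabet; both are assumptions for illustration. Pronounceable passwords built from syllables would have less entropy than this upper bound.

```python
import math

def entropy_bits(alphabet_size, length):
    """Bits of entropy for a uniformly random password."""
    return length * math.log2(alphabet_size)

lowercase_20 = entropy_bits(26, 20)   # 20 random lowercase letters
mixed_8 = entropy_bits(94, 8)         # 8 random characters from an assumed 94-symbol set

print(f"20 lowercase: {lowercase_20:.1f} bits")
print(f"8 mixed:      {mixed_8:.1f} bits")
```

That is roughly 94 bits against roughly 52 bits, so length wins over character variety by a wide margin.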

~~~
tracker1
Yeah... I feel if you're limiting input that limit should at LEAST be 64-100
characters or more. Then, since you're hashing anyways, I wouldn't worry too
much about limits (other than practical check times, for creation complexity
requirements, etc).

The other side is to use a fairly expensive hash, plus methods to reduce the
use of the login system as a DDoS vector. Having the system and database used
for authentication separate from your actual application is a good start, as
is exponential backoff on bad passwords by IP and username.

Moving to a separate "auth" domain that returns a signed or encrypted token,
and keeping it isolated, means your other processes won't stop running if you
get too many requests for auth at once. Having an exponential and random wait
before returning from a failed login is another. Keeping track of IP/user
requests in an N minute block is also helpful.

Token re-auth may be on the auth domain or the actual service domain, so that
can be different.
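
The exponential and random wait per IP and username could be sketched like this. Everything here is hypothetical: the class and function names, the base delay, and the cap are illustrative, and the store is in-memory only, where a real deployment would want shared storage with expiry.

```python
import random

def failed_login_delay(failures, base=0.5, cap=60.0, rng=random):
    """Seconds to wait after the Nth consecutive failure: exponential, capped, jittered."""
    if failures <= 0:
        return 0.0
    exponent = min(failures - 1, 32)          # bound the exponent so the power stays small
    ceiling = min(cap, base * 2 ** exponent)
    # Random wait in [ceiling/2, ceiling] so response timing is not predictable.
    return rng.uniform(ceiling / 2, ceiling)

class FailureTracker:
    """Counts consecutive failures per (ip, username) pair."""

    def __init__(self):
        self._failures = {}

    def record_failure(self, ip, username):
        key = (ip, username)
        self._failures[key] = self._failures.get(key, 0) + 1
        return failed_login_delay(self._failures[key])

    def record_success(self, ip, username):
        self._failures.pop((ip, username), None)

tracker = FailureTracker()
for attempt in range(1, 6):
    delay = tracker.record_failure("203.0.113.5", "alice")
    print(f"failure {attempt}: wait {delay:.2f} s")
```

Keying on the pair means one noisy IP does not lock out a username for everyone else, though tracking IP and username separately as well is a reasonable variation.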

------
walrus01
As with most really high electrical loads, if you can operate it remotely, do
so. There are places in North America near major hydroelectric dams with
electricity that costs $0.03 to $0.045 per kWh. Running something like this
and the cooling needed for a 3kW thermal load at California electricity prices
is a good way to burn money.

~~~
jjordan
Someone password cracking with 8 GTX 1080s isn't likely worried about the
electricity costs associated with said cracking.

~~~
PeterisP
If you anticipate a full load and include cooling, then within a single year
the electricity already costs more than the GPU hardware. So yes, even (and
especially!) if you're buying top end gear, electricity costs matter: they're
more expensive than the shiny stuff.
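
A back-of-the-envelope sketch of the yearly cost, using the 3 kW load and the $0.03 to $0.045 per kWh range mentioned upthread. The $0.20 per kWh rate is a made-up placeholder for an expensive region, not a quoted price, and cooling overhead is not included.

```python
HOURS_PER_YEAR = 24 * 365

def yearly_energy_cost(load_kw, price_per_kwh):
    """Cost of running a constant load around the clock for one year."""
    return load_kw * HOURS_PER_YEAR * price_per_kwh

load_kw = 3.0                       # load mentioned upthread; cooling would add to it
for price in (0.03, 0.045, 0.20):   # the last value is a hypothetical high-cost rate
    cost = yearly_energy_cost(load_kw, price)
    print(f"${price:.3f}/kWh -> ${cost:,.0f} per year")
```

Whether that exceeds the hardware price depends on what the cards cost and on the local rate, so plug in your own numbers.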

------
dragontamer
> The GTX 1080 Ti is the go-to value card for deep learning at the moment.

Ah, that explains a lot. I recall seeing that AMD cards are being used for
most of the Bitcoin / Ethereum stuff right now, so I thought it was odd to see
team-green used in this case. IIRC, AMD cards have faster integer performance,
but are slower in floating-point than the NVidia cards. Password-cracking is
primarily integer-based however, soooooooo...

But since they're actually building a Deep Learning Platform (and while
waiting for other stuff... using it to quickly-test a password cracking
solution), these benchmarks make sense. Plus, it's an interesting datapoint in
any case.

I do wonder what the AMD-cards would do however. Whether it'd be more
efficient, or faster... (or hell: maybe conventional knowledge right now is
wrong and NVidia is faster)

~~~
jrimbault
I _think_ (not sure) that AMD is faster per $, not faster for mining
specifically? So it makes more sense when mining for bitcoin to buy the
fastest per $ cards?

~~~
mciancia
You need to factor in the performance per watt, and Nvidia is sometimes better
when it comes to that. Anyway, with current bitcoin and ethereum prices, good
luck finding any high end Nvidia or AMD GPU - miners are buying everything.

------
DanCarvajal
Can this brute force a smart fridge?

~~~
ryan-allen
Did you forget the passcode to your beer?

~~~
tqkxzugoaupvwqr
It is a reference to the TV series Silicon Valley (season 4).

------
cm2187
bcrypt: 21kH/s scrypt: 750kH/s

The author didn't mention the work factor so I don't know how comparable the
results are. But I thought the merit of scrypt over bcrypt was that it was
memory hard, i.e. hard to run on a GPU. It doesn't seem to be the case.

~~~
mrb
The thing about bcrypt is that despite using small data structures in memory
(kilobytes) it reads and writes and rewrites a _lot_ of data into them
(megabytes, when choosing common work factors). And the small size of its data
structures is still just a bit too big to fit many parallel instances of
bcrypt in GPU L2 caches. So it is effectively "memory hard" on GPUs.

Contrast this with scrypt which writes memory (typically megabytes) only once,
and a byte is read, on average, only once.

bcrypt and scrypt are both, at the moment, memory hard on GPU, but for
different reasons. bcrypt reads/writes a small buffer many times. scrypt
reads/writes a large buffer once.

Don't place too much importance on comparing the hash/sec of these 2 algorithms
in this specific benchmark. The configurable work factors influence speed a
lot.

One day L2 caches will be big enough and we will see a sudden massive cracking
speed improvement for bcrypt. This isn't going to happen anytime soon for
scrypt.
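
To put rough numbers on the cache argument, here is a sketch. The bcrypt figure is the size of the Blowfish state (four S-boxes of 256 32-bit words plus an 18-word P-array). The 2 MiB L2 size and the scrypt parameters N=2^14, r=8 are assumptions for illustration, not figures for any particular GPU or benchmark.

```python
BCRYPT_STATE_BYTES = 4 * 256 * 4 + 18 * 4   # Blowfish S-boxes plus P-array

def scrypt_memory_bytes(n, r):
    """Size of scrypt's large working buffer: 128 * N * r bytes."""
    return 128 * n * r

def instances_that_fit(cache_bytes, state_bytes):
    """How many independent hash states fit entirely in a cache of this size."""
    return cache_bytes // state_bytes

ASSUMED_L2_BYTES = 2 * 1024 * 1024           # hypothetical 2 MiB GPU L2 cache
bcrypt_fit = instances_that_fit(ASSUMED_L2_BYTES, BCRYPT_STATE_BYTES)
scrypt_fit = instances_that_fit(ASSUMED_L2_BYTES, scrypt_memory_bytes(n=2**14, r=8))

print(f"bcrypt state: {BCRYPT_STATE_BYTES} bytes, {bcrypt_fit} instances fit")
print(f"scrypt buffer: {scrypt_memory_bytes(2**14, 8)} bytes, {scrypt_fit} instances fit")
```

A few hundred bcrypt states is far fewer than the thousands of threads a GPU wants to keep busy, and a single scrypt buffer at those parameters does not fit at all.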

~~~
nly
So one is memory capacity bound, and one is memory bandwidth bound.

------
Tepix
For reference, here are some more numbers:

8x AMD R9 290X
[https://gist.github.com/epixoip/8171031](https://gist.github.com/epixoip/8171031)

8x GTX 980
[https://gist.github.com/epixoip/c0b92196a33b902ec5f3](https://gist.github.com/epixoip/c0b92196a33b902ec5f3)

8x GTX 1080
[https://gist.github.com/epixoip/a83d38f412b4737e99bbef804a27...](https://gist.github.com/epixoip/a83d38f412b4737e99bbef804a270c40)

------
Const-me
I once used my PC for the same thing, and the result was OK. Based on raw
performance, my GTX 960 GPU is 4.5 times slower than a single 1080 Ti, and
thus 36 times slower than 8x 1080 Ti’s.

My GPU was able to try a 20 GB password list in a couple of minutes. That was
the largest password list I’ve found on the internets.

However, neither system can brute force even a short 8-character password. If
the password is lower+upper+digits, that’s (26*2+10)^8=2.18E+14 passwords.
This number is way out of range even for a PC with 8x 1080 Ti.

Those GPUs are only useful in a very small number of corner cases.

~~~
mrb
2.18e14 is nothing. 8x 1080 Ti can crack that in less than 15 minutes (NTLM,
MD4, MD5).

~~~
Const-me
This page says 8x 1080 Ti can do 1.3e10 hashes/sec for NTLM:

[https://gist.github.com/epixoip/a83d38f412b4737e99bbef804a27...](https://gist.github.com/epixoip/a83d38f412b4737e99bbef804a270c40)

To test 2.18e14 passwords at that rate, you need about 4.7 hours.

Add a single extra character to that password and the time will become too
long regardless of your hardware.
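
A minimal sketch of that estimate, and of what one extra character does to it. It takes the 1.3e10 hashes/sec figure quoted above at face value; the function name is illustrative.

```python
RATE = 1.3e10   # hashes/sec, the NTLM figure quoted above

def keyspace(alphabet_size, length):
    """Number of candidate passwords of exactly `length` characters."""
    return alphabet_size ** length

alphabet = 26 * 2 + 10   # lower + upper + digits
for length in (8, 9):
    candidates = keyspace(alphabet, length)
    hours = candidates / RATE / 3600
    print(f"{length} chars: {candidates:.3e} candidates, {hours:,.1f} h")
```

Going from 8 to 9 characters multiplies the work by 62, which turns hours into days at the same rate.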

~~~
nitin_flanker
Also, the OP didn't mention the use of symbols. Cracking even an 8-character
password that has symbols in it will become impractical to even consider.

~~~
lightedman
Process of elimination restricts character and symbol sets, generally,
narrowing your set of possible combinations greatly.

The best way to crack a password isn't to brute-force it first, it's to first
analyze who made the password, and the password system, to narrow down all
possibilities before you try brute-forcing.

Example: if a person is American, you can pretty much assume they're
restricted to the typical US keyboard and its symbols, for 90+% of the
population. Very few people know of ALT codes or unicode or even the character
map, even in IT. That narrows your symbol subset down dramatically. The
password system truncates after 12 characters and has a minimum of 8? You
already know you don't need to try anything with more than 12 characters, and
you can start your password cracking at 8 characters and ignore anything with
fewer than that. That eliminates a whole slew of the brute-forcing that would
otherwise be required, as you've now narrowed down the password range.

All it takes is a little thinking. Man can make it, man can break it, there is
simply no exception.

~~~
algorias
I believe the poster upthread already considered only restricted characters
(upper + lower + digits), so the difficulty they stated is what remains
_after_ your analysis.

> Man can make it, man can break it, there is simply no exception.

Nice platitude, but this is simply not true.

~~~
lightedman
"Nice platitude, but this is simply not true."

You got an example of anything man has made that man has not broken?

"I believe the poster upthread already considered only restricted characters
(upper + lower + digits), so the difficulty they stated is what remains after
your analysis."

No it's not, because they didn't think of things like password truncation
(which my bank annoyingly does) and various other things.

I tested it. It took me almost an hour to crack my chosen mixed-character +
symbol 15 character password with a GTX970 implementing the few rules I stated
above. Howsecureismypassword.net says it would take a computer 16 BILLION
years to crack.

My point very firmly stands.

------
castratikron
I wonder what happens to all of the old cards that are replaced every
generation? Would be nice to snag a couple of those off of eBay.

~~~
AlphaSite
They're there and unbelievably cheap. Just wait for the inevitable death of
GPU mining again.

~~~
castratikron
Would you happen to know the model name of yesterday's deep learning GPU?

------
voycey
Interesting to see real world figures for bcrypt, also interesting that 7Zip's
hashing algorithm seems extremely resilient (I assume it works on the same
kind of difficulty / work factor that bcrypt uses but with a higher factor)

~~~
jrimbault
Slightly tangential but:

7zip amazes me. It hasn't been _regularly_ updated for years, and still comes
out on top of most benchmarks. And yet I find that many, if not most, people
on Windows use WinRAR.

It's good software, easy to install, easy to use, and it doesn't nag the user
with warnings about licensing. I don't quite understand how WinRAR got popular
in the first place with that kind of competition.

~~~
a_t48
I've never had file associations working with 7z on any machine. :/

~~~
0xcoffee
I think you have to run 7zip as admin when setting the associations.

------
hacked_news
There exists a computerphile video about this.

[https://www.youtube.com/watch?v=7U-RbOKanYs](https://www.youtube.com/watch?v=7U-RbOKanYs)

------
cleeus
As a dev you should take a look at the difference in hashes/s between the
algorithms. Some are measured in GH/s, some in MH/s, some only kH/s. E.g. MD5
is way faster than SHA256. So if you only use a strong hash for ensuring data
integrity in the absence of adversaries and performance matters, you may
prefer MD5 even these days.

~~~
ComputerGuru
There's no good reason to use md5, period. If performance matters and security
doesn't, use something like mmhash3. If security matters and speed doesn't,
use something like SHA-512 (optionally truncated to 256 bits). If both matter,
use something like siphash if a prf will do or blake2 if you need a true,
high-speed cryptographic hash.

(Actually, besides portability and interoperability, there's probably no good
reason to use SHA-anything over blake2. Although if you are working in
environments that provide hardware crypto support (Intel SHA Extensions on
Atom, now also supported on Ryzen so maybe we'll see it on the desktop, too),
SHA becomes faster than blake2 and you should use that instead.)

If at any point you find yourself in a situation where your hash being
computed too fast poses a security risk, that should be a HUGE warning. Hashes
should be fast, cryptographic or otherwise. If you need a "slow hash" you are
probably looking for a derivation algorithm and not a hash and should be
looking at scrypt, bcrypt, or even pbkdf2.
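
For illustration, here is roughly how those choices look with Python's standard `hashlib`. This is a sketch, not a recommendation of specific parameters: the payload, the password, and the iteration count are placeholders.

```python
import hashlib
import os

data = b"example payload"

# Fast cryptographic hash for integrity checks: BLAKE2b with a 32-byte digest.
fast_digest = hashlib.blake2b(data, digest_size=32).hexdigest()

# SHA-512 truncated to 256 bits by keeping the first 32 bytes of the digest.
# (This is plain truncation, not the separately standardized SHA-512/256.)
truncated = hashlib.sha512(data).digest()[:32]

# Deliberately slow key derivation for passwords: PBKDF2-HMAC-SHA256.
salt = os.urandom(16)
ITERATIONS = 200_000   # illustrative work factor only
derived = hashlib.pbkdf2_hmac("sha256", b"placeholder password", salt,
                              ITERATIONS, dklen=32)

print(fast_digest)
print(truncated.hex())
print(derived.hex())
```

The first two are meant to be fast; only the last one has a tunable cost, which is the point of using a derivation function for passwords.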

~~~
cleeus
I have not looked deeply at mmhash3 (I guess it means murmurhash3), but
wikipedia says:

[...] When using 128-bits, the x86 and x64 versions do not produce the same
values [...]

which will make it unsuitable for cross-platform applications. I was talking
about a usecase where you could also choose CRC32 from a security standpoint
but want more collision resistance. How does blake2 performance compare to
MD5?

~~~
aappleby
The wikipedia page might be a bit misleading - there's a 128-bit murmur3 that
uses 32-bit math (works well on most every processor), and a 128-bit murmur3
that uses 64-bit math (much faster on 64-bit processors, much slower on 32-bit
ones)

-Austin, Murmur author.

~~~
cleeus
ah, so there is mm3-128-32 and mm3-128-64

that makes it actually a viable alternative to MD5

