
Using Ordered Markov Chains and User Information to Speed Up Password Cracking - Faizann20
http://fsecurify.com/using-ordered-markov-chains-and-user-information-to-speed-up-password-cracking/
======
geezk7
Another way is to use probabilistic graphical models. The paper "Personalized
password guessing: A new security threat" addressed this threat several years
ago.

[https://experts.illinois.edu/en/publications/personalized-pa...](https://experts.illinois.edu/en/publications/personalized-password-guessing-a-new-security-threat)

------
mattcoles
Seems like randomly generated passwords kept in a password manager are the way
to go, then; there's too much risk in letting our personal details bias our
password choices.

~~~
throwaway729
But that's a single point of failure, and there are clipboard attacks too! :(

~~~
tmalsburg2
Correct me if I'm wrong but if you install a malicious application, aren't you
screwed anyway, password in clipboard or not?

~~~
Buge
On desktop yes. But on mobile, where every app is sandboxed to some degree,
not necessarily. Keepass2Android prevents clipboard attacks by installing a
keyboard that autotypes your password, never letting it get to the clipboard.

~~~
amelius
Interesting approach. But keyboards are quite personal (some people choose
theirs carefully), and there's a lot of technology in them (gesture-to-text,
learning dictionaries, et cetera), so I'm wondering about the quality and
user-friendliness of this approach.

~~~
x1798DE
You only use the keyboard to enter the password, not in general. When you open
the database and select an entry, it offers to switch to the password keyboard
for you, then when you are done you switch back.

------
Faizann20
Here's the updated and live link: [https://medium.com/@faizann20/using-ordered-markov-chains-an...](https://medium.com/@faizann20/using-ordered-markov-chains-and-user-information-to-speed-up-password-cracking-d8b2718a502e)

------
matrix2596
Nice idea! Can anyone point to the data? Maybe we can try an RNN to generate
the passwords.

~~~
gwern
Markov chains can do amazing things in password cracking:
[https://arstechnica.com/security/2013/05/how-crackers-make-m...](https://arstechnica.com/security/2013/05/how-crackers-make-minced-meat-out-of-your-passwords/)

But an RNN isn't necessarily going to help as much as you think. An RNN has
two problems compared to a Markov chain:

1\. Markov chains memorize strings very, very easily, accurately, and
scalably; it's easy to memorize phrases, words, suffixes, and prefixes from
the existing corpuses of billions of passwords. That's all a Markov chain
does: memorize & count (sketched below). An RNN, on the other hand, will
struggle to do so because there's no 'place' for it to put all of that;
everything has to be encoded into the fixed set of neural net weights,
otherwise it just doesn't know about it, and the more you ask it to learn,
the more the competing demands fight each other. RNNs augmented with external
memories might help fix this, but those are still cutting-edge research.

2\. Markov chains are also very fast, far faster than an RNN; differences of
multiple orders of magnitude are possible, unless you use an RNN so small as
to be irrelevant (since then it can't memorize anything). For cracking hashes,
a small gain in the plausibility of guesses is not worth making hundreds or
thousands of times fewer guesses (unless perhaps the hashes are something
proper like bcrypt/scrypt, where each check takes seconds, in which case the
guessing phase takes up a much smaller fraction of the runtime and better
guesses may be worthwhile).

~~~
ma2rten
Actually, RNNs/LSTMs are surprisingly good at memorizing in addition to
generalizing. Have a look at the famous blog post "The Unreasonable
Effectiveness of Recurrent Neural Networks" [1], for instance, and notice how
many words it's able to generate from characters. However, your second point
is valid.

[1] [http://karpathy.github.io/2015/05/21/rnn-effectiveness/](http://karpathy.github.io/2015/05/21/rnn-effectiveness/)

~~~
gwern
That's not 'surprisingly' good; it's effective only on a small corpus. You
can see for yourself that if you try to feed it multiple corpuses with many
vocabulary words or proper names, all of which a Markov chain wouldn't break
a sweat memorizing, the RNN has limited memorization ability, and what tends
to happen is that the less common items get overwritten in favor of the
general grammar of English and the vocabulary of the largest corpus:
[https://www.gwern.net/RNN%20metadata](https://www.gwern.net/RNN%20metadata)

~~~
ma2rten
If I understand you correctly, you are comparing Markov chains over words to
an RNN over characters. That's not fair.

This paper shows a large LSTM outperforming n-gram models:

"In this paper we have shown that RNN LMs can be trained on large amounts of
data, and outperform competing models including carefully tuned N-grams. [...]
Unlike previous work, we do not require to interpolate both the RNN LM and the
N-gram, and the gains of doing so are rather marginal."

[https://arxiv.org/abs/1602.02410](https://arxiv.org/abs/1602.02410)

~~~
gwern
> If I understand you correctly you are comparing Markov Chains on words to a
> RNN over characters. That's not fair.

No. Words have nothing to do with it. (An RNN over words would be useless for
password guessing.)

Anyway, your link doesn't demonstrate what you think it does. It's not on a
password corpus but on a much smaller natural-language one, there is no
attempt to equate runtime or model size, and log-likelihood is an irrelevant
measure of performance compared to passwords/s.

~~~
ma2rten
If words have nothing to do with it, then I am not sure how your link shows
that Markov chains are better at memorizing, because a Markov chain over
characters would do a much worse job of generating coherent text.

But I actually acknowledged your second point, so I never said that RNNs are
useful for password guessing, unless maybe you have a very expensive hash
function. However, they are good at memorizing sequences, and log-likelihood
is not an irrelevant measure of performance on passwords: it measures the
ability of the model to generalize to unseen data. In this case that means
generating realistic passwords that are not in the training data, which is
important because otherwise you might as well just use a dictionary attack.

~~~
gwern
> If words have nothing to do with it then I am not sure how your link shows
> that Markov Chains are better at memorizing, because a Markov Chain over
> characters would do a much worse job at generating coherent text.

Over n-grams, it would not, as some of the responses to Karpathy's post noted
by posting Markov-chain text of high quality. The char-RNN shows its greatest
ability in matching syntax and recursive structures and in modeling the
subtler aspects of English grammar, syntax, and semantics... which are all
useless in password guessing. (For example, I could only tell the difference
between the Markov chain and the char-RNN on the C source, because the
char-RNN understood the nesting of syntax, but not on the Shakespeare.)

> However, they are good at memorizing sequences and log-likelihood is not an
> irrelevant measure of performance on passwords. It measures ability of the
> model to generalize to unseen data.

No, it measures a particular loss function proportional to the mean log
probability. In password guessing, the loss function is zero-one: you care
_only_ about guessing a single _exactly right_ password. You get zero points
for generating a realistic password which is one character off. Being able to
model the distribution of 'e's slightly better is irrelevant compared to being
able to memorize common birthday suffixes and guess a few more passwords per
second. Having a better log likelihood on a natural English language corpus is
measuring the wrong thing on the wrong data.
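
Concretely, the zero-one objective looks like this (made-up strings; a near-miss earns nothing):

    target = "hunter2"  # illustrative target
    for guess in ["hunter1", "hunter2"]:
        print(guess, int(guess == target))  # "hunter1" scores 0 despite being one character off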

> you might as well just use a dictionary attack.

Exactly. This is how the best password crackers work: mix-and-match memorized
literals, prefixes, and suffixes extracted from dumps of billions of
passwords. A Markov chain is a souped-up dictionary attack demonstrating 'The
unreasonable effectiveness of big data'.
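
A toy version of that mix-and-match (the word and affix lists are made up; real crackers mine them from password dumps):

    # Combine memorized literals with common prefixes and suffixes.
    words = ["password", "dragon", "monkey"]
    prefixes = ["", "my", "1"]
    suffixes = ["", "1", "123", "!", "1987"]  # e.g. digits, birth years

    def candidates():
        for w in words:
            for s in suffixes:
                for p in prefixes:
                    yield p + w + s

    for guess in candidates():
        print(guess)  # in a real cracker, hash and compare instead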

------
gravypod
Can we make a characteristic scoring metric to help order password cracking
attempts? Is there a standard distribution of characters in passwords that can
be analyzed?

~~~
Etheryte
Every password guessing optimization is about finding logically linked
character distributions in passwords.
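
For instance, measuring that distribution on a corpus is trivial (the corpus here is a stand-in for a real leak):

    from collections import Counter

    corpus = ["password1", "letmein", "qwerty123"]  # stand-in for a real dump
    freq = Counter(ch for pw in corpus for ch in pw)
    total = sum(freq.values())
    for ch, n in freq.most_common(5):
        print(f"{ch!r}: {n / total:.1%}")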

~~~
gravypod
I still think wordlists are the easiest generic way of brute-forcing passwords
en masse, since most people still don't use password generators or aren't as
unique as they think they are.

Sorting wordlists by some kind of metric _should_ improve performance.
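
A sketch of that sorting step, assuming two hypothetical files: a leaked corpus to derive frequencies from, and a wordlist to reorder:

    from collections import Counter

    # Both file names are hypothetical placeholders.
    with open("leaked_passwords.txt") as f:
        counts = Counter(line.strip() for line in f)

    with open("wordlist.txt") as f:
        wordlist = [line.strip() for line in f]

    # Most frequently leaked entries first; unseen entries sink to the end.
    wordlist.sort(key=lambda w: -counts[w])

    with open("wordlist_sorted.txt", "w") as f:
        f.write("\n".join(wordlist) + "\n")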

------
Faizann20
Apologies guys. The link will be up within a few minutes.

~~~
vog
Site is down (again?).

~~~
vog
Oh no, it isn't. It just takes quite a while to load.

~~~
lucb1e
It is now: blank response with status code 500.

~~~
lqdc13
It's always baffling to me. The HN front page is not that much traffic: maybe
1 req/s on average for a few hours, with bursts up to 3 req/s.
~~~
GrayShade
Why are you getting downvoted?

