
Analysis of bank PIN numbers - dangoldin
http://www.datagenetics.com/blog/september32012/index.html
======
kijin
If you're wondering why 1004, a seemingly random number, is so close to the
top of the list -- something that the author does not investigate in any
detail -- my guess is that the database he used contains some major leaks from
Korea. 1004 is a fairly popular password there, because it is one of the few
4-digit numbers that sound like actual words in Korean. 1004 sounds like
"angel" ( _cheonsa_ ).

So if you're actually trying to break into people's accounts, it would be
advantageous to know your victims' ethnicity. It's quite likely that cards
stolen in Koreatown will have a different distribution of PIN numbers than
those stolen in Chinatown.

~~~
heyitsnick
Interesting. When you say 1004 'sounds like angel', do you mean saying 'one
thousand and four' in Korean sounds like cheonsa? or 'one zero zero four'? Or
the numbers look like letters that spell out angel? / tangent

~~~
1331
The number read in Korean ("one thousand and four") has the same pronunciation
as the word for angel: both are 천사 ("cheonsa").

[http://translate.google.co.kr/?tab=wT#en/ko/one%20thousand%2...](http://translate.google.co.kr/?tab=wT#en/ko/one%20thousand%20and%20four%0Aangel)

~~~
JoeAltmaier
It's read "one thousand four". The 'and' is unnecessary and ambiguous. One of
my hot buttons.

~~~
talkingquickly
Admit I don't have strong feelings either way but out of genuine curiosity,
why is one thousand four less ambiguous than one thousand and four?

You could argue one thousand four is actually easier to misinterpret because
it could easily (although incorrectly) be read to mean "one thousand fours."

edit: typo.

~~~
JoeAltmaier
It matters when fractions are involved. Try reading both of these:

    
    
      100 2/3
    
      102/3

~~~
talkingquickly
Thanks, hadn't thought about that scenario. I have a feeling it's going to
start bugging me as well now...

~~~
JoeAltmaier
Sorry!

Just to make it worse for you: why don't people also pronounce 1492 as "one
thousand and four hundred and ninety and two"?

~~~
talkingquickly
Surely that comes from the same language structure which means you wouldn't
say beans and eggs and tomatoes and potatoes, you'd say beans, eggs, tomatoes
and potatoes, i.e. in a list of "and" items you only say "and" between the
penultimate and final items?

------
kleiba
When I opened my Wells Fargo checking account, they let me choose my own PIN
number for my debit card. I was quite surprised, because previously other
banks always sent me a 4-digit number that I had no influence on.

I became even more surprised when I was told that I don't have to restrict
myself to 4 digits! So, I entered a sequence of 12 or so digits that I happen
to be able to remember easily but that otherwise follows no pattern such as
the ones mentioned in this article.

I thought I was so smart!

Except when I later found out that a lot of card readers in convenience stores
accept a maximum of 8 digits for card PINs :-( I can't use my card there. So
much for being a paranoid computer scientist...

~~~
xtreme
You can't change your card's PIN? That's news to me.

------
jasonwatkinspdx
I have to say, I've really liked some of the posts on this blog. The
comparison of Monte Carlo and Markov models of Chutes and Ladders[1] is very
intuitive. I think it's a great way of understanding forward vs backward
statistical reasoning without falling into the usual borderline religious
divides.

[1]: <http://www.datagenetics.com/blog/november12011/index.html>

~~~
andrewcooke
that's an interesting article that shows the difference between markov and
monte-carlo approaches, but it seems to be incomplete. in particular, although
it claims it will calculate an _exact_ solution (just after the "Markov
Chains" subtitle) it never does; the example uses repeated calculations and
stops at some point to give an _approximate_ answer. am i misunderstanding
something?

------
zokier
And why do banks (or credit card companies) even let users pick their own PIN
codes? Wait, do not answer that, I know. Some beancounter has calculated that
the cost of somebody guessing PIN is significantly less than cost of support
calls when people forget their random PINs. Which kinda sucks that security is
compromised to make more profit.

For reference I have (afaik fully random) bank-assigned PIN on my card.

~~~
wazoox
Many banks apparently don't let you choose PINs; however from the limited
sample of PINs I know of (mine, my wife's, my company's and a few others) they're
not random; they all have doubled digits (like 2554).

~~~
finnw
It could be a coincidence. About half of all 4-digit numbers contain at least
one repeated digit. (Of 10000 possible numbers, 10 * 9 * 8 * 7 == 5040 contain
four distinct digits.)

~~~
ralph
Right, and about a quarter contain the same digit at least twice in a row,
which may be the "double digit" the grandparent referred to.

    
    
        $ pins() { seq -f '%04.0f' 0 9999; }
        $ pins | egrep -c '(.)\1'
        2710
        $ pins | egrep -c '(.).*\1'
        4960
        $
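
For anyone without a shell handy, here is a rough Python cross-check of the two counts above, printed next to the closed forms you get by counting the complement: with no adjacent repeat there are 10 * 9 * 9 * 9 strings, and with no repeat at all there are 10 * 9 * 8 * 7. The function names are made up for this sketch.

```python
def has_adjacent_repeat(pin: str) -> bool:
    """True if some digit is immediately followed by itself, e.g. 2554."""
    return any(a == b for a, b in zip(pin, pin[1:]))


def has_any_repeat(pin: str) -> bool:
    """True if some digit occurs more than once anywhere, e.g. 4840."""
    return len(set(pin)) < len(pin)


# All 4-digit strings from 0000 to 9999, leading zeros included.
pins = [f"{n:04d}" for n in range(10_000)]
adjacent = sum(has_adjacent_repeat(p) for p in pins)
anywhere = sum(has_any_repeat(p) for p in pins)

# Brute-force count next to the closed-form count; each pair should agree.
print(adjacent, 10**4 - 10 * 9**3)
print(anywhere, 10**4 - 10 * 9 * 8 * 7)
```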

------
peterjmag
Interesting analysis, but I question the use of a proxy for the source data. I
doubt that PINs like 1234 and 1111 are nearly as common when an actual bank is
involved. At least, I would hope that people take their ATM PIN more seriously
than a password for a relatively unimportant website account, but maybe I'm
giving the general public too much credit.

~~~
sitharus
Here in New Zealand banks reject sequential and repeating numbers for your
PIN, and they also check for your birthdate.

So the trick is to find their anniversary or first child's birthday.

~~~
drstewart
That's awfully nice of them to significantly reduce the search space.

~~~
sitharus
It is rather. Though given that the majority of transactions are electronic
here, people share PINs all the time, which irks me.

~~~
sjwright
You're sharing pins for electronic transactions? What the what? (I'm over the
pond in Australia, and I've never shared any of my various pins with anyone.)

~~~
sitharus
I wouldn't! A lot of people do though, it's the downside of a mostly cashless
economy, but with the rise of internet banking and near-instant transfers
online it should go away.

~~~
sjwright
But why would you? What is the pin for??

~~~
JamesBlair
As someone who knows someone else's PIN: if they're too inconvenienced by
making the payment themselves and don't realize what security it's meant to
offer, they'll be quite content to hand the card and the PIN to someone they
trust.

------
zero79
Thought it would be worth mentioning that the heat map appears to show people
love using MMDD dates (and possibly MMYY, but harder to conclude from the
data), as shown by the lower left heat intensity and the heavy usage of 0 and
1 as the first digit. In addition, the bottom 20 has no candidates that would
fit into MMDD.
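
To put a rough number on how much of the keyspace dates can occupy, here is a small sketch that counts the 4-digit strings that parse as MMDD or as MMYY. It assumes a leap year so that 0229 counts, and `is_mmdd` / `is_mmyy` are names invented for this example, not anything from the article.

```python
import calendar

LEAP_YEAR = 2012  # assumption: any leap year, so that 0229 is accepted


def is_mmdd(pin: str) -> bool:
    """True if the PIN reads as a valid month followed by a valid day."""
    month, day = int(pin[:2]), int(pin[2:])
    if not 1 <= month <= 12:
        return False
    # monthrange returns (weekday of the 1st, number of days in the month).
    return 1 <= day <= calendar.monthrange(LEAP_YEAR, month)[1]


def is_mmyy(pin: str) -> bool:
    """True if the first two digits form a valid month (any two-digit year)."""
    return 1 <= int(pin[:2]) <= 12


pins = [f"{n:04d}" for n in range(10_000)]
mmdd = [p for p in pins if is_mmdd(p)]
mmyy = [p for p in pins if is_mmyy(p)]

print(len(mmdd), len(mmyy))
# First digits that occur among all date-like candidates.
print({p[0] for p in mmdd + mmyy})
```

That is 366 MMDD candidates and 1,200 MMYY candidates out of 10,000 PINs, all of them beginning with 0 or 1.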

------
steve8918
My PIN number is 8+ numbers long, and to my chagrin, I discovered in 2006 that
most ATMs in Europe didn't accept anything more than 6, and most accepted only
4. I'm not sure if things have changed since then, but in the 5 countries I
was in, the only ATM that accepted 8 was in Amsterdam, near the end of my
trip.

~~~
rrreese
Did machines just reject your pin, or did they accept the first 4-6 digits?

~~~
danielweber
This happened to me in the United States.

I had an 8-digit PIN, but only the first 4 mattered to the ATM. The screen
would blink after the first 4 numbers, too, to help you notice you already had
it.

Then two weird things:

1. The bank got bought by someone else, and the new bank demanded 6 digits. I
had stopped using anything past the first 4, but the new PIN was still the
first six digits from the original form I filled out years earlier.

2. They sent me my full PIN in the mail. At first I was surprised by this,
since one-way-hash and all that jazz, but on reflection one-way hashing a
six-digit PIN is pretty silly.
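
A rough illustration of why hashing buys so little here: with only 10^6 possible six-digit PINs, anyone holding the hash can simply try them all. This sketch assumes a plain unsalted SHA-256, which is purely an assumption for the example and not a claim about what any bank actually does. A salt or a deliberately slow hash would raise the cost per guess, but not the number of guesses.

```python
import hashlib
from typing import Optional


def hash_pin(pin: str) -> str:
    """Hypothetical storage scheme: unsalted SHA-256 of the PIN digits."""
    return hashlib.sha256(pin.encode("ascii")).hexdigest()


def recover_pin(target_hash: str, digits: int = 6) -> Optional[str]:
    """Try every PIN of the given length until one hashes to the target."""
    for n in range(10 ** digits):
        candidate = f"{n:0{digits}d}"  # zero-padded, e.g. 42 -> "000042"
        if hash_pin(candidate) == target_hash:
            return candidate
    return None  # no PIN of this length produces the target hash


stolen = hash_pin("271828")  # made-up example PIN
print(recover_pin(stolen))   # needs at most 1,000,000 attempts
```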

------
zissou
1-2-3-4-5? That's amazing. I've got the same combination on my luggage.

------
poopicus
High five for using 'PIN numbers' and not caring about repeating the word
'numbers'. That'll probably drive some people nuts, but not me!

~~~
bobbles
In this case it actually works because he's looking at the 'numbers behind the
PINs' :)

------
aw3c2
The title is wrong and very misleading. There is no single mention of "bank"
in the linked post. Hell, it even says that it was collected from random
password leaks. It says

_Given that users have a free choice for their password, if users select a
four digit password to their online account, it’s not a stretch to use this as
a proxy for four digit PIN codes._

but I highly disagree with that extrapolation.

~~~
apl

      > but I highly disagree with that extrapolation.
    

That's great, but your counter-claim isn't obvious at all. So please
elaborate. Are you alluding to banks generally setting people's PINs for them,
using more appropriate distributions? Or, God forbid, do you believe that
users are _more careful_ when picking sensitive PINs?

I have serious doubts about the latter.

~~~
grey-area
_Are you alluding to banks generally setting people's PINs for them, using
more appropriate distributions? Or, God forbid, do you believe that users are
more careful when picking sensitive PINs?_

Well yes, of course.

Firstly, banks do set the pin first, and the vast majority of people probably
never change it.

On your second point, for a bank I would pick a complicated pin and/or
password; for some throwaway website/app/game account I'd choose something
simpler and easy to remember. Many people probably use the same pin for loads
of services, but wouldn't use it for their bank for obvious reasons.

Most people have some perception of levels of security, even if they only have
a binary concept of 'involves my money' or not, and trying to extrapolate from
website logins for some unimportant data to banking pins which involve real
losses _for the user involved_ is not at all convincing.

~~~
apl

      > Most people have some perception of levels of security
    

We're just trading intuitions here, of course, but I'm virtually certain that
people's sensitive passwords are as abysmal as their non-sensitive ones. Sure,
you'd pick a good password; however, the mere fact that you're debating this
point and that you know what password goodness entails tells me that you're
not a helpful sample of the population in question (namely _everyone_ ).

I'm going to look into finding some data pertaining to this issue. What I'll
definitely grant you is that the extrapolation isn't perfect.

------
robocaptain
But... the N... stands for Number!

Whyyyyyyyyyyyyyy!!!!!

Potentially relevant XKCD: <http://xkcd.com/1108/>

------
gpcz
It's not a perfect fit, but it's unsettling to me how closely the leading PIN
digit graph corresponds to Benford's Law (
<https://en.wikipedia.org/wiki/Benford%27s_law> ). You would think intentional
human randomness would not fit this distribution.

~~~
T-hawk
The humans aren't being random. They're using numbers from sources in real
life. A large fraction of the data points are years from the twentieth century
as 19xx and a smaller fraction as 200x. MMDD or DDMM format dates also follow
the Law in tending to lead with digits 0-3. Or consider the 2580 case or other
patterns on the keypad, where the leading digit will skew towards the
beginning of the digit-alphabet.

So Benford's Law will apply even on a data set that has no numerical meaning
in itself, transitively from the _real_ sources of the numbers.
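
For reference, Benford's Law predicts that a leading digit d (1 through 9) appears with probability log10(1 + 1/d). A minimal sketch of those expected frequencies follows. Note that the law has no slot for a leading 0, which PINs allow, so any comparison with the article's graph is approximate at best. `benford` is just an illustrative name.

```python
import math


def benford(d: int) -> float:
    """Expected share of values whose leading digit is d, for d in 1..9."""
    if not 1 <= d <= 9:
        raise ValueError("Benford's Law is defined for leading digits 1-9")
    return math.log10(1 + 1 / d)


# Print the expected share for each possible leading digit.
for d in range(1, 10):
    print(d, f"{benford(d):.1%}")
```

The shares fall from about 30.1% for a leading 1 to about 4.6% for a leading 9.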

------
buro9
> Now that we’ve learned that, historically, 8068 is (was?) the least commonly
> used password 4-digit PIN, please don’t go out and change yours to this!
> Hackers can read too! They will also be promoting 8068 up their attempt
> trees in order to catch people who read this (or similar) articles.

That's cool, mine is now 8093

~~~
yen223
Thanks for sharing.

~~~
StavrosK
It was actually 8095, he fooled us.

------
adastra
I would take this with a grain of salt. He doesn't say what exposed passwords
he used as his sources, but I'd bet the LinkedIn passwords were part of it.

But a lot of people will knowingly use an insecure password for a site like
LinkedIn, because they don't really care if it's hacked. Using a secure
password for bank accounts, email, etc. is critical, but I don't think it's
realistic to ask people to have unique, secure passwords for their account on
forums.49ersfans.com. That just isn't going to happen. It's interesting, but
I'm skeptical his sources are even remotely a good proxy for bank account PIN
numbers.

------
rafski
Interestingly, I've never received a default bank pin in Ireland that wouldn't
have 2 repeated digits in it. I have a theory they do this to make pins easier
to memorise.

~~~
sitharus
Last time I signed up to a new bank there was a box on the paper form for your
pin number, which was then typed in a clear text field by the staff member
setting up my accounts.

I stuck with my old bank who used a secure entry keypad so I could set my pin.

~~~
JamesBlair
Which bank are you with? One of the few things that annoyed me about NBNZ was
the utter lack of security with setting the pin and online banking password.

~~~
sitharus
NBNZ, but given that I haven't set a PIN for at least 10 years they may have
changed the process. Same for online banking.

When I last did it the PIN was a secure keypad, and online banking required
them to send a letter and then authenticate in branch.

Photo ID was required for both steps.

~~~
JamesBlair
Ah, that's unfortunate.

------
mrmaddog
2580 seemed like a fairly obvious PIN choice to me, but I'm completely baffled
on the significance of 1004 in position #6. It isn't easy to type, has no
good geometric representation, and doesn't seem like a year/date/zipcode or
any common number. Any ideas?

(I'd also venture that one of the reasons 2468 is so much more popular than
1357 is because of the nice symmetry that 2468 has)

~~~
waxjar
The only thing I could think of was Ubuntu version 10.04. I doubt that
explains it. I thought any variation on that (1.004, 10.04, 100.4) might have
some significance in popular culture, like the 42 has, but didn't find
anything.

I'm curious.

~~~
pbhjpbhj
Seems like 1004 would be a pretty easily typed and spatially easy PIN to
remember -
[http://static.guim.co.uk/sys-images/Guardian/Pix/pictures/20...](http://static.guim.co.uk/sys-images/Guardian/Pix/pictures/2007/12/04/atm460.jpg) -
poke your elbow out and place your hand over the keypad and it's middle
finger, thumb, thumb, index finger. Quick and easy.

------
hahainternet
Perhaps this is a chance for a quick poll. I believe that every PIN number
I've ever received from a bank in the UK has had a duplicated number. Some
examples are: 1142, 4840.

Does this happen to anyone else?

~~~
pmjordan
As 'finnw has mentioned, almost 50% of 4-digit numbers have at least one
repeated digit, so there is high chance of ending up with such a number. FWIW
I have 2 UK cards from different banks where the PIN has no repeated digits. I
suspect you're seeing a pattern where there is none.

Incidentally, codes with repeated digits make it harder to guess your code
based on fingerprints, as there are more possible combinations.
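
A quick sketch of the fingerprint argument: if an attacker can see which keys are smudged but not the order they were pressed in, the candidates are every 4-digit code that uses exactly that set of keys. The function name and the example key sets are made up for illustration.

```python
from itertools import product


def candidates(smudged_keys: str, length: int = 4) -> list[str]:
    """All codes of the given length that use every smudged key and no others."""
    keys = set(smudged_keys)
    return [
        "".join(code)
        for code in product(sorted(keys), repeat=length)
        if set(code) == keys  # every smudged key must appear at least once
    ]


print(len(candidates("1249")))  # four distinct digits
print(len(candidates("124")))   # three distinct digits, one of them used twice
print(len(candidates("48")))    # two distinct digits
```

Four smudged keys leave 4! = 24 orderings, while three leave 36, so a single repeated digit does widen the search. With only two keys it drops to 14.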

~~~
hahainternet
Excellent, thank you. I feared that I was, but the alternative is also fairly
plausible. Cheers!

------
BOYPT
All credit card passwords leaked and compressed into a 3-line Python program

<https://gist.github.com/3748634>

------
crusso
Yes, my bank PIN is the least frequently used out there. I feel safe.

... uh... shoot. ^H^H^H^H^H^H^H CARRIER LOST

------
jconnop
I love stats analyses like this.

Pleased to say my pin occupies one of the darker regions of the heatmap :)

~~~
sukuriant
And now all the hackers that care have inverted the map and are attacking you
with the inverted version now :P

~~~
alanh
Oh no, he reduced the attack space from 20% to 80%! Oh wait…

------
Kilimanjaro
Unbelievable how ONE in SIX pin numbers are either 1234 or 1111.

I guess they're right, mine is 1234

------
chayesfss
I learned to program on an 8088 and then 8086, guess it wasn't too popular

------
DomKM
"Analysis of bank _Personal Identification Number numbers_ "

------
hammock
What's the horizontal line around xx12? He doesn't explain this.

~~~
bkerins
Backwards birthdays, like 2311 for November 23. In Europe this would be the
usual way of writing a birthday already.

------
Kilimanjaro
Besides the repeating numbers, I dare to say a big percentage of pins start
with 19__ because people use their year of birth as pin.

------
bluedot
God damn it, my pin is (ranked) number 9999.

~~~
squeakynick
Now the world knows :)

