Analysis of bank PIN numbers (datagenetics.com)
255 points by dangoldin 1242 days ago | 111 comments



If you're wondering why 1004, a seemingly random number, is so close to the top of the list -- something that the author does not investigate in any detail -- my guess is that the database he used contains some major leaks from Korea. 1004 is a fairly popular password there, because it is one of the few 4-digit numbers that sound like actual words in Korean. 1004 sounds like "angel" (cheonsa).

So if you're actually trying to break into people's accounts, it would be advantageous to know your victims' ethnicity. It's quite likely that cards stolen in Koreatown will have a different distribution of PIN numbers than those stolen in Chinatown.

-----


Interesting. When you say 1004 'sounds like angel', do you mean saying 'one thousand and four' in Korean sounds like cheonsa? Or 'one zero zero four'? Or do the numbers look like letters that spell out angel? / tangent

-----


The number read in Korean ("one thousand and four") has the same pronunciation as the word for angel: both are 천사 ("cheonsa").

http://translate.google.co.kr/?tab=wT#en/ko/one%20thousand%2...

-----


It's read "one thousand four". The 'and' is unnecessary and ambiguous. One of my hot buttons.

-----


It's read "one thousand four" in US English. In British English (and in many Commonwealth countries like Australia and New Zealand; not sure about Canada) it's read "one thousand and four", regardless of whether the "and" is unnecessary or ambiguous.

-----


Admit I don't have strong feelings either way but out of genuine curiosity, why is one thousand four less ambiguous than one thousand and four?

You could argue one thousand four is actually easier to misinterpret because it could easily (although incorrectly) be read to mean "one thousand fours."

edit: typo.

-----


It matters when fractions are involved. Try reading both of these:

  100 2/3

  102/3

-----


Odd; I read improper fractions differently.

100 2/3 - One hundred and two-thirds. 102/3 - One hundred two over three.

-----


Thanks, hadn't thought about that scenario. I have a feeling it's going to start bugging me as well now...

-----


For what it's worth, the "and" is mandatory in my language and we manage to disambiguate 100 2/3 from 102/3 by putting a pause between either "hundred" and "and" or "two" and "thirds".

100 2/3 - one hundred ... and two thirds

102/3 - one hundred and two ... thirds

Alternatively, use "plus" instead of "and" in the first case.

-----


Sorry!

Just to make it worse for you: why don't people also pronounce 1492 as "one thousand and four hundred and ninety and two"?

-----


Surely that comes from the same language structure which means you wouldn't say beans and eggs and tomatoes and potatoes, you'd say beans, eggs, tomatoes and potatoes, i.e. in a list of "and" items you only say "and" between the penultimate and final items?

-----


http://xkcd.com/725/ and http://xkcd.com/386/ both come to mind...

-----


Yeah, 5555 sounds like hahahaha in Thai :)

-----


Thank you very much for this information. I was wondering.

-----


When I opened my Wells Fargo checking account, they let me choose my own PIN number for my debit card. I was quite surprised, because previously other banks always sent me a 4-digit number that I had no influence on.

I became even more surprised when I was told that I don't have to restrict myself to 4 digits! So, I entered a sequence of 12 or so digits that I happen to be able to remember easily but that otherwise follows no patterns such as the ones mentioned in this article.

I thought I was so smart!

Except when I later found out that a lot of card readers in convenience stores accept a maximum of 8 digits for card PINs :-( I can't use my card there. So much for being a paranoid computer scientist...

-----


You can't change your card's PIN? That's news to me.

-----


I have to say, I've really liked some of the posts on this blog. The comparison of Monte Carlo and Markov models of Chutes and Ladders[1] is very intuitive. I think it's a great way of understanding forward vs backward statistical reasoning without falling into the usual borderline religious divides.

[1]: http://www.datagenetics.com/blog/november12011/index.html

-----


that's an interesting article that shows the difference between markov and monte-carlo approaches, but it seems to be incomplete. in particular, although it claims it will calculate an exact solution (just after the "Markov Chains" subtitle) it never does; the example uses repeated calculations and stops at some point to give an approximate answer. am i misunderstanding something?

-----


And why do banks (or credit card companies) even let users pick their own PIN codes? Wait, do not answer that, I know. Some beancounter has calculated that the cost of somebody guessing a PIN is significantly less than the cost of support calls when people forget their random PINs. Which kinda sucks, that security is compromised to make more profit.

For reference I have (afaik fully random) bank-assigned PIN on my card.

-----


I opened an account at Citizens Bank, who did not let me choose my pin code. It went something like this.

Received card, no pin, called, sent new, received card, no pin, pin came 2 weeks later.. for the first card, no pin for second card, third attempt nothing was sent, went to the bank, they told me I had to call, gave up after 2 months, over the next year they sent me several new cards (security breaches?), but I never had a pin to activate them.

One year later they change their terms and conditions, impose new fees, deplete my checking account, deplete my savings overdraft account, and then send me to collections... all for opening a couple of accounts and depositing a few hundred dollars.

-----


I seriously doubt they just up and decided to take all your money. What policy or agreement did you violate?

-----


Account inactivity. Cards were never used.

-----


>And why do banks (or credit card companies) even let users pick their own PIN codes?

Sure the PIN sucks, but ATM security is actually two factor, something you have and something you know. The cards can be duplicated, they've made good inroads on that but haven't rolled those cards out in the states.

It's an example of "good enough" security. Not only is the barrier to theft much higher than the cash next to it in their wallet, you only have a 30% chance of getting into their account before the ATM eats the card. If you do get in, you can typically only withdraw $300 and your face has been recorded.

>Which kinda sucks that security is compromised to make more profit.

I find it kind of strange that you're picking up on this, do you not read everything on HN as a stream of soul-crushing money-first attitudes too?

-----


In the UK, cards are issued with random PINs. I believe some have the facility to change them, but I would guess the majority do not. I was wondering this when reading the article; so can I take it that in the US you define your own PIN? How does that work, do you do it at the bank or at the ATM the first time you use it?

-----


I'm not aware of any UK cards that don't let you change the PIN, just pop them into a cash machine and you'll find an option to do it on the menu.

Of course, my not being aware doesn't mean you can't, but I know from experience that most of the big banks let you do it, and I can't think of a single example of either me not being able to do it or me hearing of anyone else not being able to do it.

-----


That could be true. But how many people do you think actually change their PIN? I think it will be a low single-figure percentage; I think many aren't aware you can make PIN changes.

-----


I only have anecdotes rather than data (suspect the same is for you), but personally I would have guessed that nearly everyone does. Obviously I don't know the situation of all my family/friends, but for example in my close family both parents and both siblings have always changed to PIN numbers they chose themselves, I've known colleagues go change their PINs after getting a new card, etc...

I suspect there's no way to know which of our guesses is closer to the truth without a bank providing stats, which seems unlikely to happen. Or a poll somewhere..

-----


In every letter I've had with a new bank card it's told me how to change my PIN if I want to. I'm not sure I know anyone who hasn't changed theirs.

-----


In Finland you actually don't get to pick your own PIN. The bank sends one to you when you get a new card and that's that.

-----


Most US banks are that way. Almost all, in fact. I thought it was a law until I joined a credit union which lets me change my PIN at their ATMs.

-----


I don't have sufficient knowledge to say that you're wrong about this, but I DO know that the largest banks ALL allow you to change your PIN. Examples:

  * Bank of America
  * Chase
  * Citibank
  * Wells Fargo

-----


Many banks apparently don't let you choose PINs; however from the limited sample of PINs I know of (mine, my wife's, my company's and a few others) they're not random; they all have doubled digits (like 2554).

-----


It could be a coincidence. About half of all 4-digit numbers contain at least one repeated digit. (Of 10000 possible numbers, 10 * 9 * 8 * 7 == 5040 contain four distinct digits.)

-----


Right, and about a quarter contain the same digit at least twice in a row, which may be the "double digit" the grandparent referred to.

    $ pins() { seq -f '%04.0f' 0 9999; }
    $ pins | egrep -c '(.)\1'
    2710
    $ pins | egrep -c '(.).*\1'
    4960
    $

-----


Further to my reply to sibling-reply finnw, here's the pattern of digits, further showing his 10 * 9, etc., breakdown.

    $ pins |
    > sed -r '
    >     :a; s/^([0-9])(.*)\1/\1\2a/; ta; s/[0-9]/a/
    >     :b; s/a([0-9])(.*)\1/a\1\2b/; tb; s/[0-9]/b/
    >     s/([0-9])\1/cc/; s/[0-9]/c/
    >     s/[0-9]/d/
    > ' |
    > sort | uniq -c | sort -n
         10 aaaa
         90 aaab
         90 aaba
         90 aabb
         90 abaa
         90 abab
         90 abba
         90 abbb
        720 aabc
        720 abac
        720 abbc
        720 abca
        720 abcb
        720 abcc
       5040 abcd
    $
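For anyone who finds the sed script hard to follow, here is a minimal Python sketch of the same idea: relabel each digit with a letter in order of first appearance, then count the resulting patterns. The function names `digit_pattern` and `pattern_counts` are made up for this illustration and are not from the article or the comment above.

```python
from collections import Counter


def digit_pattern(pin):
    """Relabel digits as letters in order of first appearance, e.g. 2554 -> abbc."""
    letters = {}
    out = []
    for ch in pin:
        if ch not in letters:
            # First time we see this digit: give it the next unused letter.
            letters[ch] = "abcd"[len(letters)]
        out.append(letters[ch])
    return "".join(out)


def pattern_counts():
    """Count how many of the 10,000 four-digit PINs fall into each pattern."""
    return Counter(digit_pattern(f"{n:04d}") for n in range(10_000))


if __name__ == "__main__":
    # Sort by count, then by pattern name, similar to `sort | uniq -c | sort -n`.
    for pattern, count in sorted(pattern_counts().items(), key=lambda kv: (kv[1], kv[0])):
        print(f"{count:7d} {pattern}")
```

The counts should line up with the table above: one pattern with a single digit, seven with two distinct digits, six with three, and `abcd` for all four distinct.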

-----


I know it is best to use a full uniform distribution for assigning PINs, but I do not know whether any bank does it. Would they really dare send out '0000' to 1 in 10000 of their customers?

-----


You can still have a uniform distribution across the subset of four digit numbers that don't look suspicious to a human.
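One way to do that is rejection sampling: draw uniformly from all 10,000 PINs and retry whenever the draw fails the filter. The sketch below is only an illustration; `uniform_pin` and `not_all_same` are hypothetical names, and the filter shown (reject PINs made of a single repeated digit) is just an assumed example of "suspicious", not any bank's actual rule.

```python
import secrets


def uniform_pin(is_allowed):
    """Return a PIN drawn uniformly from the 4-digit PINs accepted by is_allowed.

    Every accepted PIN is equally likely, because each draw is uniform over
    all 10,000 candidates and rejected draws are simply discarded.
    """
    while True:
        pin = f"{secrets.randbelow(10_000):04d}"
        if is_allowed(pin):
            return pin


def not_all_same(pin):
    """Assumed example filter: reject 0000, 1111, ..., 9999."""
    return len(set(pin)) > 1


if __name__ == "__main__":
    print(uniform_pin(not_all_same))
```

The filter has to stay small, though. The more PINs it rejects, the more it shrinks the space an attacker needs to search.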

-----


Interesting analysis, but I question the use of a proxy for the source data. I doubt that PINs like 1234 and 1111 are nearly as common when an actual bank is involved. At least, I would hope that people take their ATM PIN more seriously than a password for a relatively unimportant website account, but maybe I'm giving the general public too much credit.

-----


Here in New Zealand banks reject sequential and repeating numbers for your PIN, and they also check for your birthdate.

So the trick is to find their anniversary or first child's birthday.
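To get a feel for how many PINs such rules remove, here is a rough sketch. The exact rules are assumptions on my part: "repeating" is taken to mean all four digits equal, "sequential" a straight run up or down (0123, 9876), and the birthdate check a handful of DDMM/MMDD/MMYY/YYYY forms. None of these function names or rules come from any real bank.

```python
def is_repeating(pin):
    """All four digits identical, e.g. 7777."""
    return len(set(pin)) == 1


def is_sequential(pin):
    """Each digit is exactly one more (or one less) than the previous one."""
    steps = {int(b) - int(a) for a, b in zip(pin, pin[1:])}
    return steps == {1} or steps == {-1}


def birthdate_pins(day, month, year):
    """Assumed set of PINs derivable from a birthdate."""
    dd, mm = f"{day:02d}", f"{month:02d}"
    yy, yyyy = f"{year % 100:02d}", f"{year:04d}"
    return {dd + mm, mm + dd, mm + yy, yyyy}


def is_rejected(pin, birthdate=None):
    """Apply the assumed rules; birthdate is an optional (day, month, year) tuple."""
    if is_repeating(pin) or is_sequential(pin):
        return True
    return birthdate is not None and pin in birthdate_pins(*birthdate)


if __name__ == "__main__":
    all_pins = [f"{n:04d}" for n in range(10_000)]
    allowed = [p for p in all_pins if not is_rejected(p)]
    print(len(allowed))  # 10 repeating + 14 sequential PINs are removed
```

Under these assumptions only 24 PINs are rejected before the birthdate check, leaving 9976, so the rules cost very little search space while removing some of the most popular choices.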

-----


That's awfully nice of them to significantly reduce the search space.

-----


That's one thing the article should have taught you: a large search space is worthless when, given the chance, customers restrict themselves to tiny parts of it.

-----


It is rather. Though given that the majority of transactions are electronic here people share PINs all the time, that irks me.

-----


You're sharing pins for electronic transactions? What the what? (I'm over the pond in Australia, and I've never shared any of my various pins with anyone.)

-----


I wouldn't! A lot of people do though, it's the downside of a mostly cashless economy, but with the rise of internet banking and near-instant transfers online it should go away.

-----


But why would you? What is the pin for??

-----


As someone who knows someone else's PIN: if they're too inconvenienced by making the payment themselves and don't realize what security it's meant to offer, they'll be quite content to hand the card and the PIN to someone they trust.

-----


Additionally, if I'm reading this properly, it only includes people who chose numeric passwords when they could have chosen anything (presumably beyond alphanumeric even). This means that all of the folks who chose a password outside of the PIN domain (presumably because they know better) don't have their PINs represented here, and I'd wager that a better proportion of those numbers are beyond 1234, 0000, etc..

-----


My buddy gave me his card yesterday and asked me to go to the ATM for him.

His PIN? 4321

-----


Thought it would be worth mentioning that the heat map appears to show people love using MMDD dates (and possibly MMYY, but harder to conclude from the data), as shown by the lower left heat intensity and the heavy usage of 0 and 1 as the first digit. In addition, the bottom 20 has no candidates that would fit into MMDD.
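As a rough check on how small the date-shaped part of the space is, this sketch enumerates every PIN that reads as a valid MMDD date, plus the DDMM variant. The helper names are hypothetical, and a leap year is assumed so that 0229 counts.

```python
import calendar


def valid_mmdd():
    """All PINs that read as MMDD for some real date (leap year assumed)."""
    pins = set()
    for month in range(1, 13):
        # monthrange returns (weekday of the 1st, number of days in the month);
        # 2000 is a leap year, so February contributes 29 days.
        days_in_month = calendar.monthrange(2000, month)[1]
        for day in range(1, days_in_month + 1):
            pins.add(f"{month:02d}{day:02d}")
    return pins


def valid_ddmm():
    """The same dates written day-first."""
    return {p[2:] + p[:2] for p in valid_mmdd()}


if __name__ == "__main__":
    mmdd, ddmm = valid_mmdd(), valid_ddmm()
    print(len(mmdd), len(mmdd | ddmm))
```

That is 366 PINs for MMDD alone and 588 when DDMM is included, out of 10,000. If dates are as popular as the heat map suggests, a lot of users are crowded into a few percent of the space.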

-----


My PIN number is 8+ numbers long, and to my chagrin, I discovered in 2006 that most ATMs in Europe didn't accept anything more than 6, and most accepted only 4. I'm not sure if things have changed since then, but in the 5 countries I was in, the only ATM that accepted 8 was in Amsterdam, near the end of my trip.

-----


This happened to me, too. For once, carrying travelers checks came in handy.

In Turkey, most ATMs accept my extra-long PIN, but very few have UIs designed to fit more than 4 digits. On many of them, the digits will continue outside of the form field and sometimes all the way off the screen.

-----


I ran into the same issue while traveling in South America. Some ATMs would not work at all with my 8+ PIN.

Strangely, there was one ATM that accepted 4 digits and then automatically continued to verification of the PIN. I never got a chance to enter the rest of the digits, but it still passed verification and allowed me to withdraw.

-----


Did machines just reject your pin, or did they accept the first 4-6 digits?

-----


This happened to me in the United States.

I had an 8-digit PIN, but only the first 4 mattered to the ATM. The screen would blink after the first 4 numbers, too, to help you notice you already had it.

Then two weird things:

1. The bank got bought by someone else, and the new bank demanded 6 digits. I had stopped using anything past the first 4, but they were still the first six from the original form I filled out years earlier.

2. They sent me my full PIN in the mail. At first I was surprised by this, since one-way-hash and all that jazz, but on reflection one-way hashing a six-digit PIN is pretty silly.
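To illustrate why hashing buys so little here: with only a million possible six-digit PINs, anyone holding the hash and the salt can simply try them all. The sketch below uses SHA-256 with a salt purely as an assumed example; it says nothing about how any bank actually stores or verifies PINs, and `hash_pin` and `crack` are names invented for this illustration.

```python
import hashlib


def hash_pin(pin, salt):
    """Salted SHA-256 of a PIN (illustrative scheme only)."""
    return hashlib.sha256(salt + pin.encode()).hexdigest()


def crack(target_hash, salt, digits=6):
    """Brute-force every PIN of the given length until the hash matches."""
    for n in range(10 ** digits):
        candidate = f"{n:0{digits}d}"
        if hash_pin(candidate, salt) == target_hash:
            return candidate
    return None


if __name__ == "__main__":
    salt = b"example-salt"
    stored = hash_pin("271828", salt)
    print(crack(stored, salt))  # at most 1,000,000 hash computations
```

A deliberately slow hash raises the cost per guess, but the keyspace is still only 10^6, so the protection stays limited.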

-----


I don't get why banks restrict the number of digits of a pin. The bank I'm with limits your pin to 4 digits.

Is it a usability issue, or don't they realise that more digits means more pin options?

-----


On the other hand, my bank only allows 4, and when I go overseas, I sometimes can't use my card at all because they require at least 5/6/8.

-----


I wonder if you could prepend your PIN with 0's. If the input is interpreted as a number, it might work. If they cast it to a string, it probably wouldn't.
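The difference between the two cases is easy to see in a toy comparison. This is only a sketch of the two hypothetical behaviours described above; whether any real ATM or bank backend behaves either way is unknown to me.

```python
def matches_as_number(entered, stored):
    """Hypothetical system that parses the PIN as an integer before comparing."""
    return int(entered) == int(stored)


def matches_as_string(entered, stored):
    """Hypothetical system that compares the digit strings exactly."""
    return entered == stored


if __name__ == "__main__":
    print(matches_as_number("001234", "1234"))  # leading zeros vanish on parsing
    print(matches_as_string("001234", "1234"))  # the strings differ in length
```

A numeric comparison would also make 0042 and 42 the same PIN, which is one reason to expect PINs to be handled as digit strings.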

-----


I've been using 10 in Canada for several years now. I initially carried cash on me as I expected it not to work at a lot of places... So far I've only had one issue which was at a parkade. The payment machine only accepted six digits on debit cards.

-----


1-2-3-4-5? That's amazing. I've got the same combination on my luggage.

-----


High five for using 'PIN numbers' and not caring about repeating the word 'numbers'. That'll probably drive some people nuts, but not me!

-----


In this case it actually works because he's looking at the 'numbers behind the PINs' :)

-----


Blame the people who coined the initialism, not the public who uses it the way that sounds best.

-----


I just call them PI /pɪ/ numbers!

-----


The title is wrong and very misleading. There is no single mention of "bank" in the linked post. Hell, it even says that it was collected from random password leaks. It says

> Given that users have a free choice for their password, if users select a four digit password to their online account, it’s not a stretch to use this as a proxy for four digit PIN codes.

but I highly disagree with that extrapolation.

-----


  > but I highly disagree with that extrapolation.
That's great, but your counter-claim isn't obvious at all. So please elaborate. Are you alluding to banks generally setting people's PINs for them, using more appropriate distributions? Or, God forbid, do you believe that users are more careful when picking sensitive PINs?

I have serious doubts about the latter.

-----


> Are you alluding to banks generally setting people's PINs for them, using more appropriate distributions? Or, God forbid, do you believe that users are more careful when picking sensitive PINs?

Well yes, of course.

Firstly, Banks do set the pin first, and the vast majority of people probably never change it.

On your second point, for a bank I would pick a complicated pin and/or password, for some throwaway website/app/game account I'd choose something simpler and easy to remember, many people probably use the same pin for loads of services, but wouldn't use it for their bank for obvious reasons.

Most people have some perception of levels of security, even if they only have a binary concept of 'involves my money' or not, and trying to extrapolate from website logins for some unimportant data to banking pins which involve real losses for the user involved is not at all convincing.

-----


  > Most people have some perception of levels of security
We're just trading intuitions here, of course, but I'm virtually certain that people's sensitive passwords are as abysmal as their non-sensitive ones. Sure, you'd pick a good password; however, the mere fact that you're debating this point and that you know what password goodness entails tells me that you're not a helpful sample of the population in question (namely everyone).

I'm going to look into finding some data pertaining to this issue. What I'll definitely grant you is that the extrapolation isn't perfect.

-----


Agreed. It's also worth noting that the dataset does not represent the general population, but only the people who use 4-digit passwords on some websites.

-----


>Firstly, Banks do set the pin first, and the vast majority of people probably never change it.

That's certainly not the case at my bank. I live in Canada, so perhaps things are done differently here, but when I was issued my card they made me choose my PIN.

-----


As supporting evidence, the author points to the high frequency of "2580", which is vertically down on a numeric keypad but not on a computer keyboard. That dissuades me from accepting your high level of disagreement.

-----


But... the N... stands for Number!

Whyyyyyyyyyyyyyy!!!!!

Potentially relevant XKCD: http://xkcd.com/1108/

-----


It's not a perfect fit, but it's unsettling to me how closely the leading PIN digit graph corresponds to Benford's Law ( https://en.wikipedia.org/wiki/Benford%27s_law ). You would think intentional human randomness would not fit this distribution.

-----


The humans aren't being random. They're using numbers from sources in real life. A large fraction of the data points are years from the twentieth century as 19xx and a smaller fraction as 200x. MMDD or DDMM format dates also follow the Law in tending to lead with digits 0-3. Or consider the 2580 case or other patterns on the keypad, where the leading digit will skew towards the beginning of the digit-alphabet.

So Benford's Law will apply even on a data set that has no numerical meaning in itself, transitively from the real sources of the numbers.
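For reference, the distribution Benford's Law predicts for a leading digit d is log10(1 + 1/d). The sketch below just tabulates that formula; it does not use the article's data. The law only covers leading digits 1 through 9, while a PIN can start with 0, so any comparison with the PIN graph is loose at best.

```python
import math


def benford(d):
    """Benford's Law probability that the leading digit is d (1 <= d <= 9)."""
    return math.log10(1 + 1 / d)


if __name__ == "__main__":
    for d in range(1, 10):
        print(d, f"{benford(d):.3f}")
```

The probabilities fall steadily from about 30% for a leading 1 to under 5% for a leading 9, which is the shape to compare against.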

-----


The naming of Benford's Law has long bothered me. It feels to me like it should be Benford's Observation. Awesome catch, though! I love it.

-----


> Now that we’ve learned that, historically, 8068 is (was?) the least commonly used password 4-digit PIN, please don’t go out and change yours to this! Hackers can read too! They will also be promoting 8068 up their attempt trees in order to catch people who read this (or similar) articles.

That's cool, mine is now 8093

-----


Thanks for sharing.

-----


It was actually 8095, he fooled us.

-----


I would take this with a grain of salt. He doesn't say what exposed passwords he used as his sources, but I'd bet the linkedin passwords were part of it.

But a lot of people will knowingly use an insecure password for a site like linkedin, because they don't really care if it's hacked. Using a secure password for bank accounts, email, etc. is critical, but I don't think it's realistic to ask people to have unique, secure passwords for their account on forums.49ersfans.com. That just isn't going to happen. It's interesting, but I'm skeptical his sources are even remotely a good proxy for bank account PIN numbers.

-----


Interestingly, I've never received a default bank pin in Ireland that wouldn't have 2 repeated digits in it. I have a theory they do this to make pins easier to memorise.

-----


Last time I signed up to a new bank there was a box on the paper form for your pin number, which was then typed in a clear text field by the staff member setting up my accounts.

I stuck with my old bank who used a secure entry keypad so I could set my pin.

-----


Which bank are you with? One of the few things that annoyed me about NBNZ was the utter lack of security with setting the pin and online banking password.

-----


NBNZ, but given that I haven't set a PIN for at least 10 years they may have changed the process. Same for online banking.

When I last did it the PIN was a secure keypad, and online banking required them to send a letter and then authenticate in branch.

Photo ID was required for both steps.

-----


Ah, that's unfortunate.

-----


2580 seemed like a fairly obvious PIN choice to me, but I'm completely baffled on the significance of 1004 in position #6. It is not easy to type, has no good geometric representation, and doesn't seem like a year/date/zipcode or any common number. Any ideas?

(I'd also venture that one of the reasons 2468 is so much more popular than 1357 is because of the nice symmetry that 2468 has)

-----


In Spain, 1004 is the number to call the main phone company. It is usually associated with something bad, as you can imagine :-P but it is an easy 4 digit number to remember. Not sure if that is the case in other countries...

I still prefer the Korean explanation, though

-----


The only thing I could think of was Ubuntu version 10.04. I doubt that explains it. I thought any variation on that (1.004, 10.04, 100.4) might have some significance in popular culture, like the 42 has, but didn't find anything.

I'm curious.

-----


Seems like 1004 would be a pretty easily typed and spatially easy PIN to remember - http://static.guim.co.uk/sys-images/Guardian/Pix/pictures/20... - poke your elbow out and place your hand over the keypad and it's middle finger, thumb, thumb, index finger. Quick and easy.

-----


http://news.ycombinator.com/item?id=4536506

-----


Perhaps this is a chance for a quick poll. I believe that every PIN number I've ever received from a bank in the UK has had a duplicated number. Some examples are: 1142, 4840.

Does this happen to anyone else?

-----


As 'finnw has mentioned, almost 50% of 4-digit numbers have at least one repeated digit, so there is high chance of ending up with such a number. FWIW I have 2 UK cards from different banks where the PIN has no repeated digits. I suspect you're seeing a pattern where there is none.

Incidentally, codes with repeated digits make it harder to guess your code based on fingerprints, as there are more possible combinations.
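The fingerprint point can be checked by brute force: given the set of keys that show smudges, count the 4-digit codes that use exactly those digits. This is a small sketch with a made-up helper name, and it assumes the attacker learns which keys were pressed but nothing about order or repeats.

```python
from itertools import product


def candidates(smudged, length=4):
    """All codes of the given length that use every smudged digit and no others."""
    digits = set(smudged)
    return [
        "".join(code)
        for code in product(sorted(digits), repeat=length)
        if set(code) == digits
    ]


if __name__ == "__main__":
    print(len(candidates("1234")))  # four distinct keys: 4! orderings
    print(len(candidates("124")))   # three keys, one of them pressed twice
```

Four smudged keys leave 24 candidates, while three leave 36, because the attacker does not know which digit is doubled or where. The advantage reverses with only two keys (14 candidates) or one (a single candidate).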

-----


Excellent thank you. I feared that I was but the alternative is also fairly plausible. Cheers!

-----


4960 of 10000 4-digit numbers contain at least one duplicate. Also, the probability of getting a number that has an unusual property is, counterintuitively, as an old math joke states, 1.

-----


http://en.wikipedia.org/wiki/Uninteresting_numbers

-----


In Sweden I've had pins without duplicated digits.

-----


It's called the birthday paradox :)

-----


1/3 of mine have duplicated digits.

-----


All credit card passwords leaked and compressed into a 3-line Python program

https://gist.github.com/3748634

-----


Yes, my bank PIN is the least frequently used out there. I feel safe.

... uh... shoot. ^H^H^H^H^H^H^H CARRIER LOST

-----


I love stats analyses like this.

Pleased to say my pin occupies one of the darker regions of the heatmap :)

-----


And now all the hackers that care have inverted the map and are attacking you with the inverted version now :P

-----


Oh no, he reduced the attack space from 20% to 80%! Oh wait…

-----


Amount of hackers that care: ~0. Haha.

-----


Unbelievable how ONE in SIX pin numbers are either 1234 or 1111.

I guess they're right, mine is 1234

-----


I learned to program on an 8088 and then 8086, guess it wasn't too popular

-----


"Analysis of bank Personal Identification Number numbers"

-----


What's the horizontal line around xx12? He doesn't explain this.

-----


Backwards birthdays, like 2311 for November 23. In Europe this would be the usual way of writing a birthday already.

-----


Besides the repeating numbers, I dare to say a big percentage of pins start with 19 because people use their year of birth as pin.

-----


God damn it, my pin is (ranked) number 9999.

-----


Now the world knows :)

-----



