
Falsehoods programmers believe about addresses - ColinWright
https://www.mjt.me.uk/posts/falsehoods-programmers-believe-about-addresses/
======
FooBarWidget
I read this article a long time ago and I took it seriously. So instead of
asking for address, postal/zip code, city, state/province, I just put up a big
text area labeled "Full address" so that people have complete freedom about
what to fill in.

80% of the users ended up only filling in their street address, not their
postal/zip code, city and state/province, even though they're from countries
where (most?) addresses satisfy that format.

We ended up reverting to the previous form where we explicitly asked for
postal/zip code etc.

~~~
ubernostrum
This is where most of the "falsehoods programmers believe" articles fall flat.

The thing to do is _not_ try to design a single form that accommodates every
possible address or name or whatever. The thing to do is examine your use
case, design a form that works for, say, 99.9% or more of the people you want
it to reach, and then if you really want to get the last 0.1% have someone who
can do customer service and has access to a freeform text box into which
_they_ \-- not the end user -- will enter the information.

Because when you get right down to it, the freeform text box is the only thing
that accepts 100% of valid addresses (or valid names, or whatever). But it
also necessarily accepts a ton of _invalid_ ones, too, and so it causes more
trouble for the common cases than it saves in the uncommon ones.

~~~
ryandrake
"Punt and let customer support handle it" is a pretty lazy solution, in my
view. Additionally, what a developer might think of as a "99.9%" solution is
more often than not a 99% solution, or a 90% solution. With enough customers,
dealing with everyone who can't use your slapped-together form will be more
expensive than doing it right.

Software doesn't become great by assuming someone else will handle the edge
cases and details, although you may get away with selling it for a while until
your user base grows.

~~~
ghshephard
Except, in this scenario, it's relatively easy to come up with a good 99.9%
solution:

Seven text fields will cover greater than 99.9% of users:

    
    
      Name
      Address Line 1
      Address Line 2
      City
      State
      Country
      Postal Code
    
    

So much of that article discussed stuff that was irrelevant - Live in
Singapore? Great, just fill in Singapore, Singapore, Singapore. I've done that
on many, many sites and it's always worked fine. Don't have a postal code in
Ireland? Most people there learn to try EIRE or 00000.

I think a better article would have been: "Here are the scenarios in which the
7-text address system doesn't work, and how you can make it better." \- But I
couldn't find anything that suggested it wouldn't work just fine.

~~~
lucaspiller
My country unfortunately doesn't fit that scenario. Here in the UAE there is
no residential postal delivery, so if you want to receive letters you have
them sent to a PO Box. Most people usually use their office's, but I have my
own. That means you can reach me with simply:

    
    
        PO Box XXXX
        Dubai
        UAE
    

For sending parcels to be delivered by a courier you can include a physical
address, however most streets here don't have names (and even if they do
people have no idea what they are) so people go by directions. Which means
something like:

    
    
        Flat XXXX
        Building Name next to / opposite / near Landmark
        Area Name
    

In most cases it's best to include a phone number so the courier can phone for
directions if they get lost (without street names that's easy to do), and of course
everyone requires a name even though I'm the only person living here and using
this PO Box. That means my 'full address' ends up being:

    
    
        Name Surname
        Tel: 05XXXXXXXX
        Flat XXXX, Building Name
        Next To Landmark
        Area Name
        PO Box XXXX
        Dubai
        UAE
    

In most cases that never fully fits in the length of the fields, or they do
something silly like requiring a postcode but limiting it to 5 digits (luckily
my PO Box number is only 4 digits in length, but in reality they are [0-9]*).
Anything I order from eBay has "NOTPROVIDED" on it as I left out an optional
field they think should always be included :D

Edit: Area names have their own fun: There is a road going from one end of the
country to the other which in Dubai is called Sheikh Zayed Road (it has a
number and different names in other emirates :D), but that is also the name of
an area along part of the road.

~~~
rtb
So what would fix this for you? Same 7 fields, but longer?

------
edmccard
"An address will exist in the country's postal service's database"

This is the one that I've run afoul of -- not because I live in a brand-new
building, or a houseboat, or 30 miles from anywhere on an unnamed road; it's
just that there is no door-to-door delivery for houses within a 2-block radius
of the local post office, so we have a PO box.

The trouble happens when a business that needs my street address uses the USPS
database for address verification. One example is online stores that don't
ship to PO boxes. Some of these sites have a form with a sort of "Are you
sure" prompt when my street address isn't recognized; others just refuse to
accept it.

Even worse was when the local company that picked up my trash was bought by
one of the larger regional "waste management" operations, and all the drivers'
routes were re-planned for "efficiency" (evidently using software that hit
some USPS database); the upshot was that everyone on my street had their
address removed from the pickup routes.

~~~
IshKebab
I have this problem too, but just because the UK postcode database that a lot
of sites use is wrong. There's not even a way to report it.

It means I can basically never pass a credit check at this address.

~~~
lucaspiller
A lot of companies verify postcodes with QAS, so probably worth reporting it
to them:

[http://www.qas.co.uk/knowledge-centre/product-
information/ad...](http://www.qas.co.uk/knowledge-centre/product-
information/address-postcode-finder.htm)

They are run by Experian, who are one of the main credit check companies in the UK,
so you might be able to hit two birds with one stone :D

------
hawkice
A falsehood only distantly alluded to:

That people have addresses at all, or can describe their residence in an
unambiguous or clear way (even using GPS coordinates).

I used to live in a place I couldn't even remotely give directions to. It was
deep within a neighborhood of a poorer country, none of the streets had names,
none of the buildings are numbered. I lived in a building where none of the
apartments had numbers or names.

If I wanted something delivered, I would go to a local shop for the company
delivering it, and show my ID, and they would have it routed there if it
wasn't in the building already.

If you wanted a billing address for, I don't know, tracking me down,
initiating lawsuits, something like that? I honestly just assume that's
impossible.

~~~
dfranke
Furthermore, just because you don't have an address doesn't mean you can't
receive mail! At least as recently as a few decades ago, there were rural
communities in the US where nobody had a street address; the postal service
knew where everybody lived and would deliver mail given just a name and town.
I'm not sure whether this arrangement still exists in the US, but I'm pretty
sure it still does in Ireland and probably elsewhere.

~~~
protomyth
The push for 911 changed a lot of places by assigning street names, but the
911 folks are sometimes the only ones who know the names. UPS still delivers
to some vague addresses on the reservation. Shipping to "House 311 behind the
school" does tend to confuse a few Internet merchants.

------
UrMomReadsHN
Is it common to believe that post codes don't start with zero? All of New
England (ME, MA, NH, CT, RI, VT) have zero starting post codes. Plus
apparently a part of New Jersey. Map here:
[http://en.wikipedia.org/wiki/ZIP_code](http://en.wikipedia.org/wiki/ZIP_code)

Also, since it brought up naval vessels, here's the addresses of all US Navy
ships: [http://www.navy.mil/navydata/ships/lists/ship-
fpo.asp](http://www.navy.mil/navydata/ships/lists/ship-fpo.asp)

Anyways, the post office does a wonderful job delivering mail considering how
complex addresses can be and how people can have messy handwriting.

Also - You can have an address that doesn't map to a physical location that
totally looks like it should (not just PO boxes). Consider RPI's address.

[http://rpinfo.rpi.edu/locateRPI.html](http://rpinfo.rpi.edu/locateRPI.html)

"Correspondence to and from Rensselaer Polytechnic Institute uses the official
address of 110 Eighth Street, Troy, NY 12180. This address serves as a mailing
address only; you will not find a building with that number on 8th Street."

~~~
mgkimsal
I got into an argument in school with my computer programming teacher. In a
BASIC course, we had to design a system to accept an address, and I
was treating my ZIP code as a string.

IIRC, something like

50 INPUT "Your ZIP code?", ZIP$

Everything was reviewed via handwritten code and flowchart before we were
allowed to type it in in the lab, and I was told "ZIP code is a number, but
you're putting it in to a string, that's wrong, fix it".

"But a ZIP code starting with 0 would then not have the 0 at the front when we
show it back to the user" I said.

"ZIP codes don't start with 0"

Me: "Some do"

Her: "No, they don't"

I lived in Michigan, but was a huge infocom fan, and they were in MA, and had
a leading 0 in their ZIP code. We argued about this for a good few minutes,
and it wasn't until I brought in something the next day that demonstrated
legit ZIP codes starting with 0 that I 'won'. Crap like this reinforced my
distrust of authority and cynicism in life (for better or worse).

EDIT: Anyone else remember the "New Zork Times" newsletter? :)

EDIT 2: It wasn't until much later that I learned the ZIP code system we had
wasn't actually completely formed until after she was at least a teen, if not
a full adult - she simply wasn't exposed to stuff that I was earlier, it
didn't really impact her, and she just assumed they were all numbers with no
leading 0s. In some ways a minor point, but... it also taught me about stuff I
took for granted not always having been there, even something as basic as
addresses. Didn't really learn that until much later after that class.
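
For anyone who wants to see the leading-zero problem for themselves, here is
a minimal Python sketch. The ZIP value is just an illustrative code with a
leading zero, not one taken from the story above.

```python
# Storing a ZIP code as a number silently drops a leading zero.
zip_text = "02138"           # illustrative ZIP code with a leading zero

as_number = int(zip_text)    # what a numeric variable or column would hold
shown_back = str(as_number)  # what the user sees when it is displayed again

print(shown_back)            # the leading zero is gone
print(zip_text)              # the string version is unchanged
```

Keeping the value as text from input to output avoids the round trip through
a number entirely.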

~~~
zaroth
The worst is when you store it as a string, having learned from others'
mistakes, but then upon exporting to CSV... _Excel_ kindly strips the leading
0 and you get a bug report that your software is screwing up the ZIP codes!

Took a bit of searching before finding this:
[https://www.webdigi.co.uk/blog/2010/handling-leading-zero-
in...](https://www.webdigi.co.uk/blog/2010/handling-leading-zero-in-csv/)
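
One commonly cited workaround is to write the cell as a tiny formula that
evaluates to a text string. A rough sketch, assuming the only consumer of the
file is Excel; `excel_text_cell` is a made-up helper name, and I'm not
claiming this is exactly what the linked post recommends:

```python
import csv
import io

def excel_text_cell(value: str) -> str:
    """Wrap a value as an Excel formula that evaluates to a text string."""
    # Double any embedded quotes so the formula stays well-formed.
    return '="' + value.replace('"', '""') + '"'

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["name", "zip"])
writer.writerow(["Example Person", excel_text_cell("02138")])

csv_text = buffer.getvalue()
print(csv_text)
```

The trade-off is that any other program reading the file sees the literal
`="02138"` rather than the bare code, so this only makes sense for an
Excel-specific export.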

~~~
WatchDog
Excel is the bane of my existence. I give a lot of speeches about how to
preserve data precision. However, it seems to be largely in vain, as there is
often a lot of back and forth when handling data between different groups;
even programmers don't seem immune to screwing up a CSV file with Excel.

~~~
UrMomReadsHN
Oh god office workers love putting everything in Excel. Even when it makes no
sense...

~~~
TeMPOraL
Can't blame them. The only other choice they usually have is some crappy,
poorly thought-out "database system" written by IT people who are poster children
of this article. I'm a programmer and I defend use of Excel in the offices -
I've actually worked in such an office before and I know that all the
alternatives suck more.

And I guess that's one of the biggest mistakes programmers make - assuming that
real-life data will conform to some imaginary, bureaucratic, fixed format.
Even schemas ain't fixed in real life. That's why people use Excel.

------
pella
Some useful links:

\- The free and open global address collection :
[http://openaddresses.io/](http://openaddresses.io/)

\------ "OpenAddresses Hits 100 Million"
[https://www.mapbox.com/blog/openaddresses-100m/](https://www.mapbox.com/blog/openaddresses-100m/)

\- OpenStreetMap "addr:housenumber" FREQ
[http://taginfo.openstreetmap.org/keys/addr%3Ahousenumber#val...](http://taginfo.openstreetmap.org/keys/addr%3Ahousenumber#values)
[ more "14" than "13" ]

\- OpenStreetMap "addr:street" FREQ
[http://taginfo.openstreetmap.org/keys/addr%3Astreet#values](http://taginfo.openstreetmap.org/keys/addr%3Astreet#values)

\- OpenStreetMap "addr:postcode" FREQ
[http://taginfo.openstreetmap.org/keys/addr%3Apostcode#values](http://taginfo.openstreetmap.org/keys/addr%3Apostcode#values)

\- Derek Sivers: "Japanese addresses: No street names. Block numbers."
[http://sivers.org/jadr](http://sivers.org/jadr)

\- Wikipedia : "Address (geography)"
[http://en.wikipedia.org/wiki/Address_(geography)](http://en.wikipedia.org/wiki/Address_\(geography\))

------
nvivo
There are so many misconceptions from developers, even more when building apps
used worldwide. If you plan to accept data from different countries, free text
with no validation is the only acceptable answer.

I remember once we had to remove validation from names because some countries
don't even have last names, and others have real names with two or even one
characters.

~~~
matthewmacleod
_free text with no validation is the only acceptable answer._

Unfortunately, it's not an acceptable answer in practice.

In my experience, users aren't clear what to do when presented with a free-
form, multi-line text box in which they can enter their address. This results
in frequent missing data – users aren't aware they need to include a postal
code, or county, or country…

This is probably because users are generally conditioned to expect separate
text fields for separate tokens in their addresses.

There's a middle ground though – extract the tokens you need (postal code,
country) and allow the user to freeform the rest. And don't even think about
trying to validate addresses in any real way – you'll fail!
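
A minimal sketch of that middle ground in Python. The class and field names
are hypothetical, and the only "validation" is checking that the tokens you
actually need were filled in:

```python
from dataclasses import dataclass

@dataclass
class TokenizedAddress:
    country: str      # extracted token
    postal_code: str  # extracted token; kept as text, may be empty
    free_text: str    # everything else, exactly as the user typed it

def missing_fields(addr: TokenizedAddress,
                   postal_code_required: bool = True) -> list[str]:
    """Return the names of required fields left blank; nothing deeper."""
    missing = []
    if not addr.country.strip():
        missing.append("country")
    if postal_code_required and not addr.postal_code.strip():
        missing.append("postal_code")
    if not addr.free_text.strip():
        missing.append("free_text")
    return missing

# A PO-box style address with no postal code at all.
po_box = TokenizedAddress(country="UAE", postal_code="",
                          free_text="PO Box XXXX\nDubai")
print(missing_fields(po_box, postal_code_required=False))
print(missing_fields(po_box))
```

Whether a postal code is required has to be a per-destination decision;
hard-coding it as always required is one of the falsehoods from the article.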

~~~
nvivo
Of course I'm not saying single multiline textbox for every app out there.
That would be crazy. :-)

But I'm always surprised how many sites ask for information that will never be
used for anything, and assume things like the length or characters valid in
zip codes or phone numbers.

I recently heard on the "There is no such thing as a fish" podcast that there
is a country somewhere where the post office locates places by directions given
by the sender! And that is officially accepted! Crazy world, try validating that.
:-)

~~~
jacquesm
So, what's the proper way to deal with this then?

Ask a series of questions and adapt the follow up to the questions depending
on the answer given.

Starting out with 'select your country' and then expand from there, the more
you know the more you can narrow down the remainder of the input.

That would be a nice little widget to be able to throw onto a form 'world
accurate address input fields'.

And for some localities it will indeed display a freeform text field, but for
localities where there is more structure it could supply that structure and
make certain bits mandatory.
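
Roughly what I have in mind, as a Python sketch. The field specs below are
invented for illustration and are nowhere near complete or authoritative:

```python
# Hypothetical per-country specs: (field name, required?) pairs.
FIELD_SPECS = {
    "US": [("address_line_1", True), ("address_line_2", False),
           ("city", True), ("state", True), ("zip_code", True)],
    "GB": [("address_line_1", True), ("address_line_2", False),
           ("town", True), ("postcode", True)],
}

# Fallback for localities without a known structure: one free-form box.
FREEFORM = [("full_address", True)]

def fields_for(country_code: str) -> list[tuple[str, bool]]:
    """Pick the form fields to show once the country is known."""
    return FIELD_SPECS.get(country_code.upper(), FREEFORM)

for code in ("us", "gb", "ae"):
    print(code, fields_for(code))
```

The lookup key is the weak point: everything hinges on the user being able
to pick a "country" that matches how their mail is actually handled.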

~~~
tombrossman
Picking an arbitrary starting point like 'country' still has problems. I'm in
Jersey Channel Islands, which isn't a country at all (much like Vatican City &
other territories).

We are British but not part of the UK nor members of the EU. We're served by
the British Royal Mail system and use UK-style postcodes, but sometimes when I
input my address I _must_ choose a country, so pick UK, which triggers UK VAT
on my order. We're exempt from UK VAT so dealing with this is frustrating.
Some retailers do waive it when I raise the point but I have to remember every
time.

Total edge case I know, but I hope it demonstrates that even picking a really
broad starting point like 'country' can still fail sometimes.

~~~
hayksaakian
Sorry if this comes off as tangential:

What kind of internet service do you have where you live (speed wise) ?

I've always considered island life, but poor internet service has put me off.

~~~
Symbiote
Jersey is a tax haven (40% of the economy is evading tax), lying 20km from
France and 160km from Britain, so there's no reason the internet access has to
be bad.

[http://www.jtglobal.com/Jersey/Personal/Broadband/Products/H...](http://www.jtglobal.com/Jersey/Personal/Broadband/Products/Home-
broadband-tariffs/) — it's a bit expensive, and has a usage cap, but that
seems more to do with the lack of competition than it being an island.

~~~
tombrossman
> (40% of the economy is evading tax)

Citation needed. You are welcome to pay maximum taxes where you live if you
like, but jurisdictions with lower tax rates keep the pressure on governments
to deliver a good return on investment to their citizens. When I moved from
San Diego to Dallas after the dotcom bust I slashed my taxes which was great
(no income tax in TX). This wasn't tax evasion, it was common sense. San Diego
was no longer a good investment.

Also, you totally missed the Gigabit fiber-optic broadband plans[0]. I have a
1GB _unlimited_ plan from their competitor for £55/month[1]. Standard fair
use clause applies but I've never been throttled and I use massive amounts of
data every month.

[0] [http://www.jtglobal.com/Jersey/Personal/JT-Fibre/Fibre-
Tarif...](http://www.jtglobal.com/Jersey/Personal/JT-Fibre/Fibre-
Tariffs/Tariff-details/) [1] [https://web.sure.com/jersey/internet/home-
internet/unlimited...](https://web.sure.com/jersey/internet/home-
internet/unlimited-broadband/plan/15172)

------
smcl
"A road will only have one name"

This one's particularly common in Edinburgh. In fact in the example they use -
"Regent Road" connects to Princes Street, which becomes Shandwick Place, then
Atholl Place. At this point the main fork becomes Dalry Road which then
becomes Gorgie Road which becomes Stenhouse Road and then Calder Road - all of
which are roughly a straight line:

[https://www.google.com/maps/dir/Calder+Road,+Edinburgh,+UK/S...](https://www.google.com/maps/dir/Calder+Road,+Edinburgh,+UK/Stenhouse+Rd/Gorgie+Rd,+Edinburgh,+UK/Dalry+Rd,+Edinburgh,+UK/Atholl+Pl,+Edinburgh,+UK/Shandwick+Pl,+Edinburgh,+UK/Regent+Road,+Edinburgh,+UK/@55.9539937,-3.179938,15z/data=!4m44!4m43!1m5!1m1!1s0x4887c427942a3bbb:0xb5f6c3bd2a7a564c!2m2!1d-3.2872785!2d55.9230398!1m5!1m1!1s0x4887c6f3045dd509:0x37d6bdd187c2d7d9!2m2!1d-3.2590407!2d55.9296972!1m5!1m1!1s0x4887c6f92cf9f363:0x1e2575905cb7c023!2m2!1d-3.2400333!2d55.9365554!1m5!1m1!1s0x4887c7a8bc280a39:0xdd1e5bdda92e84e7!2m2!1d-3.2218567!2d55.9425158!1m5!1m1!1s0x4887c7a48581d1f5:0x1f38a0b91d4eebab!2m2!1d-3.214204!2d55.9474842!1m5!1m1!1s0x4887c7a325f46fc1:0xd4ba3dd16343dd18!2m2!1d-3.2105438!2d55.9491457!1m5!1m1!1s0x4887c789c2b1a9ff:0xb23c9c457a777421!2m2!1d-3.1777818!2d55.9538905!3e2)

~~~
mtmail
There's also the chance a single segment of the road has several names. For
example when crossing a bridge or a round-about. Never mind all local names vs
official names, old names and translations.

~~~
smcl
Correct, and indeed there can be different names on different sides of the _same
road_. I just recalled another example from Edinburgh[0] where one side of the
street is called "Lochrin Buildings" while the other side of the street is
called "Gilmore Place" (then both sides of the street become Gilmore Place,
which then invisibly changes its name to Granville Place before changing into
Polwarth Gardens).

[0] =
[https://www.google.cz/maps/@55.941979,-3.203582,3a,75y,341.7...](https://www.google.cz/maps/@55.941979,-3.203582,3a,75y,341.75h,100.38t/data=!3m4!1e1!3m2!1sT0PUyqDdTmZ8msxqf8Yt4w!2e0?hl=en)
(I think the smaller lettered "Gilmore Place" underneath hints at a "yeh, we
know this is sorta Gilmore Place but not really")

------
2ion
The restrictions that apply to an address are seldom only the ones in a
single software system. In fact, your own address data could be the least of
the problems you have to worry about.

When actually using all the addresses you stored for shipping stuff, it is
almost guaranteed that the shipping company will cut off or drop lines from
labels, and of course every shipping company is going to have its own quirks.
Maybe just because not every address is going to fit onto a fixed label area
in a fixed font size.

I have in fact lost several shipments to my address(es) due to every single
kind of the above caveats.

------
oneeyedpigeon
Of course, they're falsehoods that _everyone_ believes about addresses (except
maybe for postal workers), but programmers are the only ones who have to
actually think about them.

It would be nice to come up with some sort of conclusion or recommendation.
Should addresses just be used as one big blob of text, and never parsed at
all? should there be individual per-country libraries for parsing them? should
we just address everything by coordinates (which doesn't solve the houseboat
problem)? How about a unique identifier for every person on the planet, plus a
gps tracking system that guarantees big brother can deliver to you whenever,
wherever?

Now that the average piece of post is a big package that needs signing for,
rather than a small letter that can just go through the letterbox, I quite
like the idea of centralised pigeon-hole buildings, that have existed in many
towns as the only method of delivery, but are now being born everywhere thanks
to amazon, etc. That's quite a different problem, though :-)

~~~
curryhoward
I think the consensus is to go with the "big blob of text" approach. You
mentioned that postal workers are the only people qualified to parse
addresses—let them do it. The validation done on your end shouldn't be much
more than asserting that addresses contain non-whitespace characters.

I don't know how well this works in practice, but it's the most "correct"
thing to do.

~~~
walshemj
Bad mistake: you get a metric fuck ton of shit in your data that way.

You should always police your input data properly; one big text field is lazy
programming that leads to a big technical debt later on.

And yes that does mean handling apostrophes and all those other edge cases
properly.

~~~
to3m
Name of recipient, state/province/etc., postal code and country, maybe. But
for the rest, that bit (at least in the UK) that goes in the middle... what
policing can you do? That part is just a text blob. You can't do much
meaningful with it except for showing it to somebody (e.g., on a label affixed
to the package you're sending) and have them figure it out.

~~~
Dylan16807
Well sure, once you've separated name and country and postal code you've
already pulled out most of the data.

The blob has been tamed to 1-2 lines, and you're probably best off giving
'line 1' 'line 2' 'line 3' fields. At this point the chance of confusion is
minimal, even if you can't validate very well.

------
jakub_g
"Wikipedia has a photo of a parcel where a Russian/Cyrillic address was
displayed on a computer with the wrong character encoding, and transcribed
from that. Reportedly a Russian postal worker was able to reverse the mapping
and deliver the parcel."

The linked URL doesn't work anymore but thanks to Archive.org and reverse
image search in Google I managed to find it:

[https://web.archive.org/web/20130909052627/http://en.wikiped...](https://web.archive.org/web/20130909052627/http://en.wikipedia.org/wiki/File:Letter_to_Russia_with_krokozyabry.jpg)

[http://upload.wikimedia.org/wikipedia/ru/thumb/f/fa/Letter_t...](http://upload.wikimedia.org/wikipedia/ru/thumb/f/fa/Letter_to_Russia_with_krokozyabry.jpg/755px-
Letter_to_Russia_with_krokozyabry.jpg)
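
For the curious, "reversing the mapping" looks something like this in Python.
The word and the pair of encodings (KOI8-R bytes shown as Latin-1) are
assumptions for illustration; I don't know which encodings were actually
involved in the letter:

```python
original = "Москва"  # illustrative text, not the address from the photo

# Bytes written as KOI8-R but displayed as if they were Latin-1.
garbled = original.encode("koi8_r").decode("latin_1")

# Reverse the mapping: recover the raw bytes, then decode them correctly.
restored = garbled.encode("latin_1").decode("koi8_r")

print(garbled)
print(restored)
```

This only round-trips cleanly because Latin-1 assigns a character to every
byte value; with a lossy step in between, the postal worker's job gets a lot
harder.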

~~~
breakingcups
That is one dedicated postal worker.

------
shdon
It's not even true that a single building has only one address, must exist in
one town or even in one country. There's a house that has one address in
Baarle-Hertog (Belgium) and another address in Baarle-Nassau (Netherlands),
with different house numbers too.

------
kimdouglasmason
An observation:

My street address is

XXXX Martin Luther King Jr Way, Apt XXX

The number of organizations that accept the address, but chop off the
apartment number when they send mail because it is too long is ridiculous.
Even better, it tends to be government departments. For example, the IRS does
this.

Martin Luther King Jr Way has to be one of the most common street names in the
US. Many orgs can't even get the simple cases right; I don't hold out much
hope for the obscure cases.

See the Google 'real-names' debacle for just how wrong people can get this
kind of thing, even when they're being loudly told that they're doing it
wrong.

~~~
samatman
Pro-tip: no one will ever, ever misdeliver a package addressed to {house#} MLK
#{Apt#}. I've lived on an MLK. If you live on Lakeshore Drive in Chicago, LSD
works just as well. Where MLK is concerned, he has an official holiday,
Americans know what those three letters mean.

~~~
TazeTSchnitzel
MLK Jr. for extra clarity?

------
drivingmenuts
That article can also be read as a list of things that need to be fixed by the
various postal systems.

We can issue addresses to computers, many of which cannot be considered to be
in a fixed place, yet somehow we can't issue a permanent, unique address to
something that's not likely to move around much.

~~~
rmc
Reminds me of a quote from a geocoding session at an OpenStreetMap conference.
"Addresses are not a theoretically hard problem. The problem is that people
don't follow standards, or have the same standard."

Trying to get everyone on the planet to massively change how they view the
world is not easy.

------
zimbu668
I didn't see this one mentioned: in addition to crossing city and county
lines, ZIP codes also cross state lines.

[http://www.answers.com/Q/How_many_zipcodes_cross_state_lines](http://www.answers.com/Q/How_many_zipcodes_cross_state_lines)

------
koenigdavidmj
One more: in SW Portland there is a section east of the 0 line on the street
grid where all buildings have a leading zero. 0634 and 634 are different
addresses on the same street.

------
bojanz
And the solutions for C++, Java, PHP:
[https://github.com/googlei18n/libaddressinput](https://github.com/googlei18n/libaddressinput)
[https://github.com/commerceguys/addressing](https://github.com/commerceguys/addressing)

------
pcthrowaway
This article was submitted here nearly two years ago (as you will find out if
you click the link to the HN discussion at the bottom of the article). But I
thought of one not included in the article.

From Portland's Wikipedia page:

> On the west side, the RiverPlace, John's Landing and South Waterfront
> Districts lie in a "sixth quadrant" where addresses go higher from west to
> east toward the river ... East-West addresses in this area are denoted with
> a leading zero (instead of a minus sign). This means 0246 SW California St.
> is not the same as 246 SW California St. Many mapping programs are unable to
> distinguish between the two.
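
It's easy to see how software ends up conflating the two. A small Python
sketch, using the addresses from the quote; the helper names are made up:

```python
a = "0246 SW California St"
b = "246 SW California St"

def house_number_as_int(address: str) -> int:
    """Parse the leading house number as an integer (drops leading zeros)."""
    return int(address.split()[0])

def house_number_as_text(address: str) -> str:
    """Keep the leading house number exactly as written."""
    return address.split()[0]

# Treated as numbers, the two addresses collide.
print(house_number_as_int(a) == house_number_as_int(b))

# Treated as text, they stay distinct.
print(house_number_as_text(a) == house_number_as_text(b))
```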

~~~
apaprocki
The city deserves a "bug" for that. Computers might not be able to distinguish
between those easily, but I'd bet most humans (especially ones not from
Portland) would have no idea either. Addresses are not set in stone -- it's
easy enough for them to fix and avoid the entire issue.

edit: I've had my zip code changed within the last few years. That causes the
same amount of pain as changing any other part of the address and is done
without much fanfare.

------
jmharvey
I'm not sure how many of these falsehoods programmers actually believe. But
one falsehood I've actually seen in the wild isn't included:

"(Direction) Street" is necessarily the same (or different) than "Street." Or
even if they're different, users will understand the difference, at least on a
local basis.

I had a GPS that would always omit any directions that prefixed a street name.
I was occasionally thrown for a loop when it told me to turn on Beacon St in
Boston, when it really meant North Beacon St, which is a nearby, but
unrelated, street.

------
jacquesm
Patrick has an excellent variation on this theme:

[http://www.kalzumeus.com/2010/06/17/falsehoods-
programmers-b...](http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-
believe-about-names/)

I live on a road that has two sets of numbers, both identical (but several
hundred meters removed from each other) in two different towns but with the
same name. Getting mail and packages delivered here is for want of a better
word a challenge.

~~~
masklinn
Is it because the post sucks at its job? I would expect this to be handled
correctly for the sole reason that you could have two unconnected roads with
the same name in two different cities and end up in essentially the same
situation, no?

~~~
jacquesm
I think if they would be further away from each other it would be less of a
problem.

------
GigabyteCoin
Now I understand why PayPal lets scammers register with an address that reads
"asd, asdf,asdff, Turkey" and immediately allow somebody with that address to
send me funds. (Ultimately stealing/using the credits they purchased instantly
from my site with the fake paypal account setup on a stolen credit card)

Without a current streetmap of the entire world, how would you really know
it's a bogus address?

------
dmckeon
Similar to "Street names don't recur" \- A named road will be continuous: a
friend lives in a condominium, in one of a group of 8 buildings bounded by 3
fairly normal roads. There are 4 separate driveways between buildings leading
to parking.

At some point the condo units were renumbered with street numbers, and the
disjoint driveways were all given the same name: <Condo> Lane.

Regular delivery drivers - USPS, UPS, FedEx, and pizza seem to cope, but taxi
drivers or other irregular visitors who expect numbers to be continuous along
streets are almost always baffled.

Similar to "A road will only have one name", for emergency services - fire,
ambulance, and police - a similar case arises when a route passes through
several small towns, each with its own set of street numbers, possibly with
variations of proper street name, and perhaps with different
direction/cardinality mappings.

For a motorist calling 911, reporting that one is at street number 123 on El
Camino Real (on the SF peninsula) will probably map to several possible
locations, depending which of the 12 or so towns one is in.

------
dasil003
When I moved to London and opened up a Lloyds Bank account (then Lloyds TSB),
I was confused to find they did not consider my office postcode W1F 7RB valid.
I poked at it a bit and found some programmer assumed that the first half of a
UK postcode was \w+\d+. The hilarious part was the branch I was opening an
account at was in W1S, so their form wouldn't even take their own postcode.
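
Here is a quick Python reproduction, assuming the check was anchored to the
whole first half of the postcode. The looser pattern is only an illustration
that happens to accept these examples; it is not the official postcode
grammar:

```python
import re

ASSUMED = re.compile(r"\w+\d+")               # pattern described above
LOOSER = re.compile(r"[A-Z]{1,2}\d[A-Z\d]?")  # illustrative alternative only

for outward in ["W1F", "W1S", "SW1", "M1"]:
    print(outward,
          bool(ASSUMED.fullmatch(outward)),
          bool(LOOSER.fullmatch(outward)))
```

The assumed pattern insists that the first half ends in a digit, which is
exactly what W1F and W1S don't do.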

------
montecarl
Isn't it a false dichotomy that you can either have a complicated multipart
form or a freeform text box for addresses? Why not by default show the
multiform box that provides some nice (optional) validation to catch the vast
majority of cases and also give the user an option to fill out a freeform text
box if the former doesn't work for them?

------
6t6t6
Maybe a good approach would be to use a format that somehow fits 99% of the
addresses and a link on the bottom of the form with the text "Problems fitting
your address in that form?". When the user clicks the link, all the fields of
the form would be substituted by just a multi-line input field. Then you have
a solution for the 1%.
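
On the storage side this could be as simple as keeping two shapes of address.
A Python sketch with hypothetical class and field names:

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class StructuredAddress:
    line_1: str
    city: str
    postal_code: str
    country: str

@dataclass
class FreeformAddress:
    text: str  # whatever the user typed into the multi-line field

AnyAddress = Union[StructuredAddress, FreeformAddress]

def label_lines(addr: AnyAddress) -> list[str]:
    """Turn either kind of address into lines for a shipping label."""
    if isinstance(addr, FreeformAddress):
        # Keep the user's own line breaks, dropping blank lines.
        return [line.strip() for line in addr.text.splitlines() if line.strip()]
    return [addr.line_1, f"{addr.city} {addr.postal_code}".strip(), addr.country]

print(label_lines(StructuredAddress("1 Example St", "Springfield", "00000", "US")))
print(label_lines(FreeformAddress("PO Box XXXX\nDubai\n\nUAE")))
```

Everything downstream (labels, exports, tax rules) then has to cope with the
free-form case, which is probably where the real cost of the 1% shows up.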

------
markbnj
This hit home for me as a couple of months ago we faced the choice between
continuing to develop a parser-based approach to location extraction from free
text, or moving to an entity extraction and search approach, i.e. geotagging
or geocoding. Notably we were just trying to get city and state, and even
limiting ourselves to the U.S. the combinations we were seeing made parsing
seem like a game of increasing complexity delivering diminishing returns. We
ultimately went with a search-based approach and it's been working much better
and is more tolerant of format variations.

------
SeanLuke
> No buildings are numbered zero

> Counterexample: 0 Egmont Road, Middlesbrough, TS4 2HT

A more fun counterexample: the Black History Museum in Richmond, Virginia is
located at "00 Clay Street". That's 00, not 0.
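
A small sketch of why this matters for storage: converting the building number
to an integer silently turns "00" into 0, so the two become indistinguishable.
The helper names are hypothetical; the point is only that the number is safer
kept as text.

```python
def house_number_via_int(raw: str) -> str:
    """Round-trip through int, as an integer column would; loses leading zeros."""
    return str(int(raw))


def house_number_as_text(raw: str) -> str:
    """Keep the number exactly as written, apart from surrounding whitespace."""
    return raw.strip()


if __name__ == "__main__":
    for raw in ("00", "0", "123"):
        print(repr(raw), house_number_via_int(raw), house_number_as_text(raw))
```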

------
firegrind
My mainland European address saves me a fortune in online shopping - payment
and delivery both usually fail.

My two neighbours and I share a driveway but we have our own gates and house
numbers. The street is unnamed and unnumbered like the other roads in the
immediate area.

There are at least two valid postcodes for the property, which is a few
minutes' walk from a major administrative boundary. Postal deliveries might
turn up once in four to six weeks.

When I put this info into card validators, then tell them that my bank is in a
different country to the one I'm ordering from, they generally barf.

------
MatthewWilkes
One I always enjoyed that's not on the list (because it's UK specific, I
guess) is:

Buildings don't have multiple postcodes in different outcodes (
[https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdo...](https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Outward_code)
).

Counterexample:

* Merchant Venturers building, Woodland Road, Bristol BS8 1UB

* Merchant Venturers building, Park Row, Bristol BS1 5UB

------
0942v8653
So I guess this is nothing more than a fantasy:
[http://www.xkcd.com/208/](http://www.xkcd.com/208/)

~~~
UrMomReadsHN
"Some people, when confronted with a problem, think, “I know, I’ll use regular
expressions.” Now they have two problems."

------
je42
repeat:
[https://news.ycombinator.com/item?id=5791489](https://news.ycombinator.com/item?id=5791489)
?

~~~
babo
It's still as worth reading as it was 598 days ago. :-)

------
chris_wot
There are so many formats - is there a universal format for addresses? In all
seriousness, this is where XML and XSDs would really shine!

~~~
chris_wot
However, turns out this is a problem that is solved by Google :-)

[https://github.com/googlei18n/libaddressinput/wiki/AddressVa...](https://github.com/googlei18n/libaddressinput/wiki/AddressValidationMetadata)

------
rmc
As someone from Ireland, a country with unusual addresses, this is spot on.

------
Hussell
I've personally seen a building with a fractional street number, in Kingston,
Ontario, Canada. I've also had to deal with irregular addresses in Canada.
Working on a Canada-only program, I was expecting addresses to have the
components:

    [unit-number, ]building-number street-name
    city/town, province/territory, country
    postal-code

I was fortunately already expecting characters from the two official languages
of Canada, English and French, so I was prepared to deal with accented
characters.

Later, I had the opportunity to work in Iqaluit, Nunavut, Canada, which
violated most of my assumptions, both explicit and implicit. First, the
territory (not province) of Nunavut is a relatively recent creation, having
been created by splitting off a part of the Northwest Territories on April
1st, 1999. Before that, the addresses were all in a different territory.

Second, Iqaluit uses a system where every building in the city has a unique
number. Currently (2015) the highest number is rapidly approaching 7000, but
at the time it was in the 5000s. In addition to their unique number, some
buildings also have a name, which is sometimes written only in the Latin
alphabet, sometimes written only in the Inuktitut syllabary, sometimes either,
and in at least one case both. When a building has both a name and a number,
people may use just one or the other. (I haven't found a building without a
number yet, but I'm no longer going to assume there aren't any.)

Street names were not introduced until 2003, and when they were, all street
signs were labeled in both the Latin alphabet and the Inuktitut syllabary.
Since the system of uniquely numbering every building is continuing, most
people ignore the street names unless they're actually talking about streets,
not buildings. Nonetheless, some attempts have been made to get everyone to
change their mailing addresses to include the street. In every case, everyone
has agreed that use of the Inuktitut syllabary should be encouraged.

All these peculiarities are in the territorial capital, where almost all the
territorial government and law-enforcement addresses are, so anyone dealing
with addresses for the Canadian government should be aware of this (but
probably isn't).

On a related topic, the US has long had a system of two-letter abbreviations
for its states, commonly used in its addresses. Canada eventually introduced a
standard set of two-letter abbreviations for all its provinces and
territories, being careful not to duplicate any of the US state abbreviations.
However, many people still use the traditional abbreviations, which are of
variable length, sometimes have completely different French and English
versions, and sometimes include hyphens to prevent confusion with US state
abbreviations. (So 'T-N' might appear, meaning 'Terre-Neuve', the French name
for Newfoundland, with the hyphen mandatory to prevent it from being mistaken
for the US abbreviation for Tennessee. Periods and capital letters with
accents also appear, e.g. 'Î.P.É.')

Since its introduction, the "standard" two-letter system has seen at least
three changes. Quebec was PQ before 1991 and is now QC, although sometimes QU
or QB show up; Nunavut was added in 1999 (previously part of NT, now NU); and
Newfoundland changed its name to Newfoundland and Labrador in 2001, and its
abbreviation from NF to NL in 2002. Also, the territory formerly known as
"Yukon Territory" officially changed its name to just "Yukon" on April 1st,
2003. (What is it with the Canadian territories and changing important stuff
on April 1st?) Its postal abbreviation did not change, however. It's still YT,
not YK, despite the latter being used fairly often and making more sense now.

This matters because not all two-letter abbreviations appearing in the
database (this includes your database) are on the standard list, either
because they were entered incorrectly, or because they were correct when they
were entered, but have since changed, and the database wasn't updated for fear
of breaking working code. As a result, a naive lookup-table to get the full
province name from the two-letter abbreviation will fail.
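
A hedged sketch of a more forgiving lookup: it normalizes the input and maps
the legacy or mistaken codes mentioned above (PQ, QU, QB, NF, YK) onto the
current ones before resolving the name. The alias table is an assumption built
only from the examples in this comment and is certainly incomplete; the
function name is made up.

```python
from typing import Optional

# Current two-letter abbreviations for provinces and territories.
CURRENT_CODES = {
    "AB": "Alberta",
    "BC": "British Columbia",
    "MB": "Manitoba",
    "NB": "New Brunswick",
    "NL": "Newfoundland and Labrador",
    "NS": "Nova Scotia",
    "NT": "Northwest Territories",
    "NU": "Nunavut",
    "ON": "Ontario",
    "PE": "Prince Edward Island",
    "QC": "Quebec",
    "SK": "Saskatchewan",
    "YT": "Yukon",
}

# Legacy or commonly mistaken codes from the comment, mapped to current ones.
LEGACY_ALIASES = {"PQ": "QC", "QU": "QC", "QB": "QC", "NF": "NL", "YK": "YT"}


def province_name(code: str) -> Optional[str]:
    """Resolve a possibly outdated code; return None if it is unrecognized."""
    key = code.strip().upper()
    key = LEGACY_ALIASES.get(key, key)
    # Caveat: an "NT" stored before 1999 may refer to a place now in Nunavut;
    # no alias table can resolve that without the rest of the address.
    return CURRENT_CODES.get(key)
```

Returning None instead of raising lets the caller decide whether to flag the
record for review, which matters when the old values cannot simply be
rewritten in place.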

~~~
Cyranix
> I've personally seen a building with a fractional street number, in
> Kingston, Ontario, Canada.

There's at least one fractional street number here in Victoria, BC along Fan
Tan Alley:
[https://www.flickr.com/photos/goddess_spiral/3377009251/](https://www.flickr.com/photos/goddess_spiral/3377009251/)

------
r3pl4y
Once you look at Korea, all those rules are only the basics...

------
pbhjpbhj
I know a UK semi-detached house that has 2 postcodes.

------
anon4
> Addresses will have a street

> An address will include a state

These two are quite annoying whenever I have to write my address somewhere
that presumes them. Here an address is city, city area (by name), block inside
the area (by number, sometimes a letter). Blocks are numbered in the order
they're built, so their numbering doesn't follow any pattern. And while there
is a street passing by and it does have a name, the building itself doesn't
have an address on the street.

And no, we don't have states. From the description above, you might think that
the city:city area might be used like state:city, but no. City is city-sized,
city area is neighbourhood-sized.

Additionally, there are addresses on streets, but those are not the same
places as area-block numbers. Sites around here that need to get your address
either include every possible field and ask you to only fill in the applicable
ones, or give you a free-form text area after asking for which city and post
code you're at.

I think amazon gets it right - country, administrative area (state in the US,
something else in other countries, maybe nothing in some), city, postal code,
two freeform lines.

------
mahouse
I thought this was going to be about memory addresses. :)

~~~
dysoco
Yeah me too, although I must say it's quite an interesting article.

------
je42
awesome! nice list.

------
Rygu
Developers can't know everything there is to know in the world. Developers
aren't supposed to know specific stuff like this. Some parts of modern society
are easy to digitalize, other (often historical) parts aren't. I think it's up
to entrepreneurs to find ways to solve these problems, and create a better
world by doing that. Don't blame/shame developers for stuff like this. It's
not even remotely fair.

When you're training developers remember that you're not training demigods.

~~~
jacquesm
Well, after reading this article and other variations on the theme that's one
less set of mistakes to make.

Even if developers are not demigods they shouldn't be above learning.

Entrepreneurs have very little chance of fixing this, it's mostly a local
government thing and since it isn't actually broken I highly doubt that
something will change. So 'deal with it' is the appropriate response.

~~~
Rygu
Learning is important. Part of being a developer is that you must keep
learning new techniques and facts that help you accomplish your goals or your
clients' goals. So I'm not saying we don't need to deal with
internationalization, localization, and globalization. When the time comes,
deal with it. I'm just saying there's no shame in being a developer with
little or no expertise in those subjects. Many commenters here don't seem to
acknowledge the many amazing technical things that developers generally do
know about.

Thanks for taking the time to respond in words.

