Dear American Website Owner (rychter.com)
142 points by jwr 1947 days ago | 148 comments



Dear German Website Owner

(yeah, this works both ways)

I do have a state in my postal address, and if I have to enter my address as 47374 Richmond instead of Richmond IN 47374, it will probably go to Virginia first, and maybe they'll get it to me eventually.

Just saying. Internationalism is hard. If you want to go for the global market, it would behoove you to try to do it right.

-----


Doesn't the author address this pretty directly?

you should remove the State field if the country isn't set to U.S.A. or at least make it optional

He doesn't say "Remove state completely".

-----


I guess I should be more specific. This is an actual problem I've had on actual German sites.

-----


What exactly were you buying from Germany that you couldn't find in the USA? Oh... I see...

-----


Books in German. Also contact information as a freelancer for translation agencies, who really should know better (but their Web designers often don't). But your joke was very funny.

-----


Absolutely, which is why I would recommend providing multi-line edit boxes for addresses.

That way you can enter your address any way you see fit. You really know best how to address mail so that it gets to you, why would I want to pretend I know better?

-----


So you have a strong AI doing natural-language parsing to figure out from that freeform address whether you need to charge sales tax/VAT?

If you're doing business across state/province or national borders, getting that structured address data (and getting it right) can be awfully important.

-----


Seems like the easiest solution would be for web developers to show the address in their country's format, if their customer is from their country, and to show a textarea box for anyone else.

-----


Yeah giving an end-user a free-form open box to enter their address as they see fit is a GREAT idea. Let me know how that works out for you.

-----


It works very well for me. Provided the goal of the address field is to copy/paste it verbatim to an envelope - if you want to actually understand the different parts of the address this might be counterproductive.

-----


A lot of people use fields such as state and zip code for demographic data collection. A free form textbox makes that more difficult.

-----


Dear everybody else:

Some of us don't even have postal codes. Those profane-looking ones we enter aren't real.

Love, Ireland.

-----


At least you don't even have them. We (Albanians) have them, but neither the govt nor the professionals know about them!

-----


But you have Guinness, or as you would say, beer. I'll trade any day.

-----


I think this blog post REALLY overlooks the fact that interface localization isn't easy, even with a simple web form.

The interface for a search engine is a single input field with dynamic suggestions and people were in an uproar when Bing didn't give suggestions that were localized to the user's country or language. That's to say, if one of the biggest software companies in the world can't get it right, then what hope does the average web developer have?

-----


Not every aspect of this is easy, true, but that's no reason not to tackle the ones that are. It's trivial to simply not require a state or a postcode or try to verify a phone no if the user selects a country other than the US.

There should be a library for this. We have libraries that can handle the insanely capricious details of DST for all countries and subnational entities on the planet; we should have no problem providing one list of states and two regular expressions per nation.

(I know it's more or less impossible to reliably catch (e.g.) invalid UK postcodes, but we wouldn't have to. We'd mainly want to catch obvious typos without making things needlessly hard for customers overseas.)
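
A rough sketch of what "one list of states and two regular expressions per nation" might boil down to, in Python. Everything here is hypothetical: `COUNTRY_RULES`, `DEFAULT_RULES` and `check_postcode` are made-up names, the table covers only three countries, and the patterns are deliberately loose so they catch obvious typos rather than prove a postcode exists.

```python
import re

# Hypothetical per-country rules: whether a state is expected, and a loose
# postcode pattern (None means "no check"). Illustrative only, not complete.
COUNTRY_RULES = {
    "US": {"needs_state": True, "postcode": re.compile(r"^\d{5}(-\d{4})?$")},
    "DE": {"needs_state": False, "postcode": re.compile(r"^\d{5}$")},
    "IE": {"needs_state": False, "postcode": None},
}

# Countries missing from the table get no requirements at all.
DEFAULT_RULES = {"needs_state": False, "postcode": None}


def check_postcode(country: str, postcode: str) -> bool:
    """Return True if the postcode looks plausible for the country."""
    rules = COUNTRY_RULES.get(country, DEFAULT_RULES)
    pattern = rules["postcode"]
    if pattern is None:
        return True  # nothing to check, accept whatever was typed
    return bool(pattern.match(postcode.strip()))
```

The important design choice is the default: a country the table doesn't know about is accepted as typed, so overseas customers are never blocked by rules written for someone else.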

-----


Wufoo is pretty good for pre-made forms. And they advertise that they can do US and International phone, mail and currency fields.

-----


If I wanted to write a blog post which elaborated on all possible internationalization issues, it would become a PhD dissertation and take a year to write.

And then it wouldn't even touch on all the issues.

You have to start somewhere — and let's face it, dealing with the two silly problems I mentioned would get us 80% of the way there.

-----


I gave it a shot of my own. I didn't do all the permutations, and surely left out big swaths that I've never even thought of, but it's a start...

http://hicks-wright.net/blog/stupid-american-website-owners/

-----


This post made me laugh. And I think it is exactly what I was trying to get at when I said that internationalization isn't easy. Nice!

-----


I wrote about Credit Card forms a while back as well.

http://www.jasonlotito.com/programming/what-i-know-about-des...

Anyways, you mention in your post:

"Making something a simple form that is fully internationalized and has useful validation is no where near as easy as they would like us to believe."

It's not easy, but it's not as time consuming and as difficult as you think it is. Getting the phone numbers right doesn't take years. A week is all you need from design to testing, and you are done.

Postal Codes are easy as well, considering there are numerous services that allow you to verify the Postal Code against the address. However, even taking something as simple as the common case, a user from the US entering a 6-digit code has probably made an error, and catching that solves a common problem.

House numbers and other addresses never presented a significant problem. I've seen more problems with not accepting different encodings than with the actual input.

And why is this important, even if you just want to serve an American audience? Because not every American lives in the US. Consider just the Armed Services, stationed all over the world.

No.

If you don't want to do it, that's fine. But don't complain that it's too much work or not worth it.

-----


Of course I exaggerate in the post, you're right that it'll probably take just a week to get coded up. (Which means three weeks, the way development often goes.)

The question is whether or not that time is well spent, or if I would be better off using that time to improve whatever product/service the form is attached to. Especially when you're on a lean (i.e. startup) budget, the decision becomes pretty clear.

-----


You are throwing a lot of ifs.

If you are providing a service that needs to be programmed. If you are short on funds. If you are only focusing on the US.

The problem isn't that. The problem is this:

When you are providing a service that relies on processing international orders... When you are no longer a startup and earning money... When you want to accept orders from the international community...

If you don't want to support the international market, don't. But if you do, do it right. I've seen it happen: "Launching in Canada!" and they still have a restriction of 5 characters for their Zip Code, or require all numbers.

It's silly. That's what the article is referring to. Wanting to accept international customers but doing it poorly. You'd be better off not offering the service until you can do it right than offering it up poorly.

-----


I had to deal with International phone numbers before. Obviously, I had to deal with other issues, too, but the phone number was the important piece. Early on, before PayPal or other services caught on, we created a system to verify credit card users with their telephone. They would enter their telephone number and we'd call them. It was simple.

The problem was with international users. The system was simple: it would call the exact phone number the user entered. This meant we really wanted to get it right the first time. The other problem was that we had a single call center setup out in New York. This meant international callers needed to be called with special country codes and what not.

Now, I feel it's safe to say if you ask someone in Europe to give their phone number, they are going to give their phone number like they would to any of their friends. They aren't going to enter in the country code, and they aren't going to know to prefix it with a special code so that an automated system from the US can call them.

The thing is, it's not fair for me to simply provide them with the requirement "Give us your phone number so we can call you from New York, USA." It really isn't professional.

So, I spent some time (lots of time) reading up and learning about international phone numbers, and coding together a system that went a long way toward fixing this problem. A user could enter in their phone number, and if they didn't enter in a country code, we'd be intelligent about it and add it for them. How did we know where they lived? Two sources: CC Info and the IP address. We could be intelligent and assume the two should mostly match up. Obviously, if the CC address was the US, and the IP was somewhere off in Asia, red flags beyond just the errors for phone numbers would pop up (but, even then, you had to be careful!).

I spent a lot of time fine tuning the system, working hard to make sure that a phone number would get through however the user entered it, and we could call. We had a really good success rate with numbers outside the North American norm. Enough that the cases that did fail I couldn't even figure out manually.

I was a bit saddened that it was all for nothing when we eventually removed the 'feature' and the requirement for a phone number. I understand the reasons, but from a problem solving point of view, it was a lot of fun.

-----


I don't think your assumption is right — if you tell a European to give you his phone number 'just as for SMS messaging', you'll always get the right thing, with the country code, beginning with a '+'.

To send a text message you have to use a full international phone number, so everyone uses them. You can't deliver an SMS message without a full number.

-----


Couple things to consider. First, this was written up in 2003, and expanded on in 2004. I don't know if text messaging was as big then as it is now. Secondly, we weren't sending them a text message. While asking them might not hurt, we didn't want a cell phone. Asking for a phone number "just as for SMS messaging" might be more confusing. I'd rather just ask for the phone number, and fix it myself.

At least, those were my thoughts at the time.

-----


That sounds like a really awesome bit of code to look at. Can you share it?

-----


Oh, it gets worse fast.

I hate street address validators that insist you live in a house or apartment with a number on a given street.

Because?

I live in flat m at street number n on Foo Street.

Here in Scotland -- where I live -- there are three ways to put this on an envelope, all recognized by the post office:

Flat m, n Foo Street

m/n Foo Street (this is the one in the Post Office postcode lookup database)

Flat xFy n Foo Street (where y is floor number and x is apartment number on that floor, e.g. 3F2, 12 High Street)

... So why do so many Javascript address validators throw out m/n or xFy format addresses for having an illegal character in the middle? Including British ones, that back onto the official Post Office lookup database?

Address formats aren't standardized internationally. They aren't even standardized within countries with a unified postal service.
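
A minimal sketch of a validator that would not trip over these formats, in Python. The function name `address_line_ok` and the rule it applies (reject only empty lines and control characters) are assumptions for illustration, not how any particular Post Office lookup works.

```python
import unicodedata


def address_line_ok(line: str) -> bool:
    """Accept any non-empty line that contains no control characters."""
    text = line.strip()
    if not text:
        return False
    # Reject only control characters (Unicode category "Cc"); slashes,
    # commas, and letters mixed with digits are all fine.
    return not any(unicodedata.category(ch) == "Cc" for ch in text)
```

With this, "3/12 High Street" and "3F2, 12 High Street" both pass, because the check never tries to guess what a house number should look like.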

-----


I've argued many times for making address one big text field. Let people who live in the country, who own the address, format it how they know it should be formatted.

-----


I recently made some parts of my web application worse in this regard in order to be compatible with more US payment processors and some regulations. I don't think I'll have a huge problem with international users yet, but if it needs more code to satisfy them then I'll add it when necessary.

Unfortunately, one big text field doesn't cut it when every other system you interact with requires separate fields.

-----


Me too! I've been arguing this for years, and I have yet for someone to reply with a good reason why it shouldn't be handled this way.

Bad reasons given:

- Safeguard against users who can't be relied on to format their own address properly. Fix 1: Don't worry about that. Fix 2: If you really want this, leave the rest of us an option to use a freeform address instead.

- "Our webshop database has these fields." Fix: Change the database, then.

-----


"Our webshop database has these fields." Fix: Change the database, then.

Much harder when the "database" belongs to a third party such as UPS or Paypal.

-----


There are several legitimate reasons for requiring users to enter in their address in separate fields. The two main ones are:

1. If you need to integrate with third party systems who require you to break up the address. In case you've never tried, parsing addresses is very hard -- if not impossible -- depending on how many formats you need to support. All it takes is one third party library that requires you to break them up to make your life miserable.

2. If you need to categorize your users by country, state, zip, etc. For example, if you need to handle different tax laws for different states or if you need to generate reports on how many users you have from country XYZ.

-----


Are there any promising open source libraries to fill this niche? Or startups offering it as a service API?

Honestly this seems like a problem that could be solved once and we all would benefit... And like any good problem, there is probably a profit to be made from it.

-----


It seems like it would make it that much easier if it were a text box and a country selection (rather than just a text area, so you can reliably know the country), since most countries that I'm familiar with have a limited number of ways of representing an address. The larger variation seems to be between address formats of different countries.

-----


Is parsing the addresses on the back end harder than providing the same number of support formats on the front end?

-----


Parsing on the backend, even when only considering United States addresses is incredibly difficult and error prone. There are apartment numbers, rural routes, PO boxes, circles, streets named after compass directions (i.e. North, South, East, West), streets whose direction names appear before or after the street itself (e.g. N. Main St. vs Main St. N.), military addresses, and the list goes on and on.

It's a terrible system really, and the only solution that I've been able to rely upon is having the user parse their own address and supply it to me. At least that way I don't have to spend hours looking through regex statements and edge case detection logic to figure out why a street address was parsed in a particular way.

-----


For the US, the USPS can help you with some of that.

-----


Yes, but they make good money licensing those databases for managed use to nice companies like:

http://www.semaphorecorp.com/

-----


Because chapter one in the databases for dummies book used address as an example, with fields for address, town, state etc.

Then you stick them altogether and print them on the envelope, thus covering chapter2 = report generating.

-----


'"Our webshop database has these fields." Fix: Change the database, then.'

Or let the app first sort out the raw input, make sure it makes sense, then pass it on to the database.

I can see wanting to have fine-grained table fields. But I don't like seeing the schema drive the model or the UI.

A single textarea really is a strong idea (and, like others here, I have been advocating it for a few years with mixed results).

Seems like these are the sort of mundane tasks computers should be good at. Let humans be quirky and let computers sort it out. Prompt for confirmation as needed.

-----


The only reason I can think of is for internal statistics. For example the website owner can run a report on their customers broken down by a state or country. Nowadays with GeoIP, this is obsolete.

-----


What if the destination/shipping address is different from the billing address, or from where you buy it? Using geoip here is mostly irrelevant and incredibly dangerous if you want to get a strong opinion of who your customers actually are, as opposed to where they are buying it from.

-----


Or you can just make the state field optional, add an address2 field, let zip codes be a free form string and phone-numbers a free form string. What cases would not be covered then?

-----


Which would make so much more sense in the first place. I don't even know why they made them separate fields in the first place. To give you a huge listbox of all states to choose from?! Oh, wow!

-----


While accepting a generic address is nice, if you actually use the address for anything, knowing the Country and Zip+4 for the US gives you a lot of power for things like advertising campaigns. Also, the chances of a user entering bad data when given an unformatted field increase dramatically.

My suggestion is to give the option to enter generic information but don't let that become the default.

-----


Or:

* allow the user to enter a free-form address, normalize it and then prompt the user for confirmation that the normalization was correct.

* Allow the user to edit fields on the confirmation page.

* Log any differences between the normalization and what the user changes for further refinement of the normalization process.

* Don't require fields that may not be relevant to all addresses that you plan to capture.

* Allow some sort of contact avenue on the address confirmation page so that they can complain when the interface doesn't allow them to properly enter their address. (If people hit a brickwall while trying to submit their address, they are more likely to just give up if they have to hunt down contact information. At the very least, if there is an easy way for them to complain, you get some sort of feedback even if you lose them as a customer.)
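
The "log any differences" step in the list above can be sketched in a few lines of Python. The function name `changed_fields` and the field names in the usage example are hypothetical; the normalization step itself is assumed to exist elsewhere and is not shown.

```python
def changed_fields(normalized: dict, confirmed: dict) -> dict:
    """Return {field: (ours, theirs)} for every field the user changed."""
    keys = normalized.keys() | confirmed.keys()
    return {
        k: (normalized.get(k), confirmed.get(k))
        for k in sorted(keys)
        if normalized.get(k) != confirmed.get(k)
    }
```

For example, if the normalizer guessed `{"city": "Richmond", "state": "VA"}` and the user corrected the state to `"IN"` on the confirmation page, the result would be `{"state": ("VA", "IN")}`, which is the kind of record you'd want to log for refining the normalizer.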

-----


Why are you normalizing it?

Is there some huge performance advantage in having the state field be an Id into a states table? Perhaps it would make updates easier if they ever rename Washington?

Why not normalize the name? There must be lots of Johns in your DB

-----


Others have stated the reason for normalizing it already: interacting with third party software and/or services.

It makes more sense to normalize it at the beginning in order to give the user a chance to approve that it was processed correctly. You could just do the normalization at the point when you need to interact with the 3rd-party services, but then it's hidden in the back-end so you have to be confident in your normalization processing.

You could roll out a solution where the confirmation page is just there to allow the user to approve that normalization works on their free-form address, but still store it as a chunk of text in the database. Then when you are confident enough in your normalization process, you could just remove the user confirmation part.

-----


That huge listbox of states is designed to normalize the incoming data so that you don't end up with addresses in California that use "Ca", "CA", "Calif." and "California" -- or any number of misspellings that users are prone to input.

Sure, there are other ways to normalize an address from freetext input, but they often require using a third party system from FedEx or UPS.
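
For the spelling variants mentioned above, a simple alias table already goes a fair way. This is only a sketch in Python: `STATE_ALIASES` and `normalize_state` are hypothetical names, the table lists a handful of entries, and it does nothing about genuine misspellings.

```python
# Hypothetical alias table; a real one would cover every state and more
# spellings. Keys are lower-cased with periods removed.
STATE_ALIASES = {
    "ca": "CA",
    "calif": "CA",
    "california": "CA",
    "in": "IN",
    "indiana": "IN",
}


def normalize_state(raw: str):
    """Map free-text state input to a two-letter code, or None if unknown."""
    key = raw.strip().lower().replace(".", "")
    return STATE_ALIASES.get(key)
```

A `None` result is the cue to ask the user to confirm or pick from a list, rather than to reject the form outright.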

-----


Freeform address boxes can make some things more difficult:

http://news.ycombinator.com/item?id=1234105

-----


Not sure how many people have tried to actually make international address forms, but it's really difficult. Despite a lot of searching, I have yet to find any web site or book or library that can handle all international addresses properly while still letting you validate that the data is reasonable (city/state/postal code) etc.

It's not that American-run web sites don't want to support international formats, but when 95% of your business is from America, it just doesn't make business sense to deal with the complexity of international addresses for 5% of your revenues.

I'd love it if someone can point me to a library/book/tutorial of the appropriate way to handle this situation. NOTE: I haven't looked that recently, but did about a year ago.

-----


Once you start to look at the problem, it becomes really hard. It is not only address and phone number formats. The name alone is very hard. American style is given name and family name I guess, and as a German that sounds very convincing. But then you suddenly stumble across a middle initial, whereas in Germany parents give their kids any number of additional forenames, from zero to five or so, not just one for the middle. Then countries like Spain and Iceland traditionally don't have a family name, but surnames derived from the father or the mother, or just the surname from both in a certain order. You have potentially two family-level names then (and potentially the same name twice!). So the person complaining in the blog post probably doesn't know what it means to get these things right, even if you only consider highly developed countries. Even if you only want the name.

Maybe someone should start a library for this problem, and then we all lay back and watch it grow to 50,000 LOC.

-----


It can happen within countries too. Most UK websites will require you to enter a "county". That's fine in England, because England is divided into lots of counties (e.g. Essex, Hertfordshire, Oxfordshire etc.). However, Scotland does not have counties any more. They were abolished before I was even born and replaced with regions. It might sound nit-picking, but it's one of these things that adds to the chip on our shoulder that suggests folk in Engerland don't really care what is going on in the rest of the UK (I don't know how Wales and N.I. are organised).

-----


FWIW, not all parts of England are in counties either, so don't feel like you're being singled out. York isn't in Yorkshire, for example. It is its own unitary authority. Even Amazon demands a county, so I put North Yorkshire. Close enough.

The Royal Mail are happy for old curmudgeons and the like to use traditional counties in addresses, if they wish: http://www.abcounties.co.uk/bpa/bpacontents.htm

All that Royal Mail cares about is your house number and postcode, so you could probably even get away with telling websites that you live in Wibbleton in Foobarshire.

-----


York isn't in Yorkshire, for example. It is its own unitary authority. Even Amazon demands a county, so I put North Yorkshire. Close enough.

It's not just "close enough" - it's entirely correct. York is in the "ceremonial" or "geographical" county of North Yorkshire. And addresses are geographical references, after all: http://en.wikipedia.org/wiki/Ceremonial_counties_of_England (though, technically, you should probably just go with whatever the Royal Mail determines is valid)

-----


Correct. They messed around with the local authorities where I live last year - I was in Cheshire and I'm now in something called "Cheshire East". My neighbours are in "Cheshire West and Chester".

The address "52 High Street, Northwich, Cheshire West and Chester" would reflect the correct local authority name but is incredibly confusing - are we in Northwich or Chester? "Cheshire" is still correct.

-----


They were abolished before I was even born and replaced with regions.

And then, in 1996, "regions" were replaced with "council areas:" http://en.wikipedia.org/wiki/Subdivisions_of_Scotland

that adds to the chip on our shoulder that suggests folk in Engerland don't really care what is going on in the rest of the UK

I suspect most in "Engerland" care as much about Scotland as the average Scot cares about English affairs (well, except for holding positions of power within our parliament).

-----


I just wanted to point out (with humor!) that you are grumpy about English folk not understanding the rest of the UK, and yet you yourself (as a Scot) have no idea how Wales and N.I are organised!

-----


What's the point of having a state field at all? Isn't there a unique mapping between zip code and state?

About the phone numbers, my favorite annoyance is when companies, who always are fond of making stupid mnemonics out of the letters corresponding to their phone number, don't let me enter my mnemonic for my phone number, which I picked from Google Voice specifically because it makes my name, but insists on numbers.

-----


What's the point of having a state field at all? Isn't there a unique mapping between zip code and state?

Redundancy. The postal service can usually deliver if you screw up one of city, state, and zip code. If you got rid of city/state then any error in zip code would result in failure to deliver.

-----


I only proposed getting rid of the state field, not the city. I have no idea whether cities share zip codes, but that seems a lot more probable than states sharing zip codes. And even if they did, unless that zip code also had two cities named the same, it would still uniquely identify an address.

-----


What's the point of having a state field at all? Isn't there a unique mapping between zip code and state?

No. Zip codes may cross state lines.

-----


I am not sure that is completely true. I thought ZIP codes could not cross state lines (at least in the US). And after looking at them, I couldn't find any that were not unique. [http://www.aggdata.com/free/zip-code]

Edit: Sorry, There are a few that cross state lines, when using the first two digits as the state code. However, each ZIP code is unique to a town; therefore, one could use it in the above scenario.

-----


"Each ZIP code is unique to a town"? No, not really. Small towns without their own post office typically share a ZIP code with the closest city because that's where the post office that delivers their mail is.

-----


I guess my point is, the ZIP code should be a pretty good indication of the state that the person resides in. No, it is not perfect, but I think it lends itself to being something that can help with the OP's concerns.

-----


Wikipedia used to have a list of them, I remember there being at least 5, possibly more.

-----


Google brought me to this article; I'm showing the jump to near where the issue is mentioned.

http://en.wikipedia.org/wiki/ZIP_code#By_geography

-----


Zip codes may cross state lines.

Example, please?

-----


"Fort Campbell (ZIP code 42223), primarily in Kentucky, also has some roads in Tennessee."

source: http://en.wikipedia.org/wiki/ZIP_code#By_geography
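
Since a ZIP code can map to more than one state, a lookup has to return a set rather than a single value. A small Python sketch, with loudly hypothetical names (`ZIP_TO_STATES`, `state_matches_zip`) and a two-row table built only from ZIP codes mentioned in this thread:

```python
# Illustrative data only: 42223 is the Fort Campbell example quoted above,
# 47374 is the Richmond, Indiana example from earlier in the thread.
ZIP_TO_STATES = {
    "42223": {"KY", "TN"},
    "47374": {"IN"},
}


def state_matches_zip(zip_code: str, state: str):
    """Return True/False if the ZIP is known, or None if it is not in the table."""
    states = ZIP_TO_STATES.get(zip_code.strip()[:5])
    if states is None:
        return None  # unknown ZIP: no opinion either way
    return state.strip().upper() in states
```

A `False` here is better treated as "ask the user to double-check" than as a hard error, which keeps the redundancy between ZIP and state useful without punishing edge cases.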

-----


They certainly can include multiple municipalities.

-----


You could probably do without city as well if using zip codes. Obviously this doesn't hold water for international setups, but it could be nice for the US.

-----


Unfortunately, no, you can't.

There are plenty of zip codes in Missouri that have a dozen small towns that all have 1001 Broadway as valid addresses.

There are at least 5 zip codes (possibly more, there used to be a list on wiki) that cross state, county, and city boundaries. So it's entirely possible that without the zip+4, you'd have to have both the city and the state to correctly locate an address.

-----


Until the wrong zip code is written or it's illegible. They go to great lengths to deliver mail that is damaged, incorrect, etc. and not having this extra information would end up being too expensive to continue this practice. They would have to say "get it right or we throw it out".

-----


I wish author was a little more constructive here. What phone number pattern would he recommend we use?

-----


Why use any pattern? Let me enter the phone number in a way I got used to it and deal with it.

-----


There's a cost/benefit analysis to be done. People do make mistakes, and checking their submission against common patterns will save you lost sales.

As suggested in a comment on a rant I wrote a few years ago, one strategy is to 'validate' against something reasonable, but instead of rejecting anything that fails, ask the user to confirm that they really do want something unexpected.

I'm not sure what 'reasonable' would be for an international phone number, but whatever you pick based on the country they choose, let them override your validation.
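
A minimal Python sketch of that "validate, but let them override" strategy. The names `Verdict`, `PHONE_PATTERN` and `soft_validate_phone` are hypothetical, and the pattern (an optional `+` followed by 7 to 15 digits) is just one assumption about what 'reasonable' could mean.

```python
import re
from enum import Enum


class Verdict(Enum):
    OK = "ok"              # matches the expected pattern, accept silently
    CONFIRM = "confirm"    # unexpected, ask the user to confirm
    ACCEPTED = "accepted"  # unexpected, but the user already confirmed


# Hypothetical "reasonable" pattern: optional +, then 7 to 15 digits once
# spaces, dashes, dots and parentheses are ignored.
PHONE_PATTERN = re.compile(r"^\+?\d{7,15}$")


def soft_validate_phone(raw: str, confirmed: bool = False) -> Verdict:
    """Never reject outright; escalate to a confirmation prompt instead."""
    compact = re.sub(r"[\s\-.()]", "", raw)
    if PHONE_PATTERN.match(compact):
        return Verdict.OK
    return Verdict.ACCEPTED if confirmed else Verdict.CONFIRM
```

The form would show an "are you sure?" prompt on `CONFIRM` and resubmit with `confirmed=True`, so an unusual number costs the user one extra click instead of a lost sale.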

-----


Well, I suppose you could just validate numbers from the most common countries you see and hope the rest don't make mistakes. Call it the Amdahl's law of internationalization.

-----


Confirmation is OK, but taking a hard line is a no-no. I've had endless problems trying to jam my Irish mobile phone number (for a long time, my primary phone) into UK phone number fields.

-----


Because that's scary for users who have anxiety worrying they didn't input things correctly. If you tell them the phone number format and have a state dropdown, people rest easy knowing you will get their contact info correct.

-----


Ummm...no. Do you not see the problem with that reasoning?

-----


I'm not snitko, but I don't see the problem. Could you please explain your reasoning?

-----


To be blunt, inputs you don't validate are inputs that will be entered wrong a significant portion of the time. Other concerns may override this for many fields, but unusable phone numbers are generally considered a problem.

-----


If you force people to enter a phone number in a format they don't have, you will just end up with a bunch of 555-555-5555 numbers. Either way, you are going to get incorrect information.

It's much better to realise that your information will not always be 100% accurate no matter what you do.

-----


The idea is to minimize incorrect information. A 555 number at least tells you something that a number in an unfamiliar format doesn't - whether this was intentional or mistyped.

I'm very much for allowing people to enter any format, but seeing how badly people mis-enter NANPA phone numbers, it is simply wrong to not acknowledge that you will see far more mistakes from an open input box.

-----


Yes, but if you over-validate you also risk losing sales. What to do doesn't seem obvious at all, at least to me. You'd want to do some kind of cost-benefit analysis.

Also, wouldn't a phone number in effect be validated for most online credit card purchases? I thought credit card processors often match the phone number you provide to the phone number on record for your account to prevent fraud. Skip the javascript validation, and let the cc processor validate the phone number.

...but I have no idea what I'm talking about, please correct me if I'm wrong.

-----


It's not universally true, no.

If you under-validate, you risk sales you can't complete. Mind, that's less dire when you have email addresses - but we know the problems with trying to validate those on forms.

-----


What, the whole "being able to place an order even if you're not American" thing?

-----


Probably the best approach is not using any pattern at all. The less you force your users into something that will not work most of the time, the better. I always struggle to input information even in forms designed specifically for my country.

Just let people enter numbers with all the spaces and the dashes they wish. After all, if a human will ever need a phone number, he will be able to read it in any form. If these characters are a problem for you to store, simply strip them. The routine to do this takes the same effort to write as the one to check the input.

-----


It's curious how many forms do not allow spaces and dashes in numerical data like phone numbers and credit card numbers. Given how easy it is to strip characters out of a string, one wonders why people bother mandating a particular pattern.
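
To show how little code the stripping takes, here is a Python sketch (the function name `strip_separators` is made up for illustration):

```python
def strip_separators(raw: str) -> str:
    """Remove spaces and dashes, leaving every other character untouched."""
    return raw.replace(" ", "").replace("-", "")
```

After stripping, a plain `isdigit()` check on the result is enough to tell whether a credit card number contains anything other than digits.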

-----


Exactly.

And note that if you overzealously validate, you'll get garbage data anyway. What do you think I enter as a phone number if I can't enter my number at all?

-----


The best approach is to fix it for the person behind the scenes. You can keep the original input (you should), but you should also reformat the phone number in an acceptable format for your uses. This does take a lot of work, however. It's not reasonable to ask someone to add their country code to their phone number. However, you do have the ability to add it yourself. And you should. It won't always be a human that needs the phone number.

-----


For phone numbers, this library might help: http://code.google.com/p/libphonenumber/

-----


Very nice. Anyone have links for other languages? Python and Ruby in particular.

My take is that we need to validate (people screw up and often), ask them to verify when we can't validate and proceed after verification.

-----


E.164 format. Every valid number in the world can be represented. The downside is that most people don't even know that this exists, so when building an international system, you have a filter based on the country the user is based in and automatically apply any codes needed. Just strip everything out of the field that isn't a number and prepend the country code if the region doesn't normally include it. That means that 210-555-1212 would become 12105551212 for a U.S. number and 21-555-1212 would become 55215551212 for a Brazil number.
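
Roughly, in Python, the strip-and-prepend step could look like the sketch below. The lookup table and function name are invented for illustration, and treating a leading "+" as "calling code already present" is an assumption of the sketch, not part of the rule stated above.

```python
import re

# Illustrative table only; a real system would need every country calling code.
CALLING_CODES = {"US": "1", "BR": "55"}

def naive_international_digits(raw, country):
    """Strip everything that isn't a digit and prepend the country calling code."""
    digits = re.sub(r"[^0-9]", "", raw)
    if raw.strip().startswith("+"):
        # Assumption: a leading '+' means the calling code was already typed.
        return digits
    return CALLING_CODES[country] + digits

print(naive_international_digits("210-555-1212", "US"))  # 12105551212
print(naive_international_digits("21-555-1212", "BR"))   # 55215551212
```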

-----


The transform isn't that simple. For example, UK numbers start with a 0 to indicate STD (long distance) - but that 0 must be stripped for calls from another country. I.e., 01234 567890 is dialed from abroad as +44 1234 567890.
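
As a hedged Python sketch of what that extra step means: the rule table below is hypothetical and holds only the UK entry (calling code 44, national prefix 0); real numbering plans vary by country and need their own rules.

```python
import re

# Hypothetical per-country rules: (calling code, national trunk prefix).
RULES = {"GB": ("44", "0")}

def to_international(raw, country):
    """Convert a nationally formatted number to '+<calling code><number>'."""
    calling_code, trunk_prefix = RULES[country]
    stripped = raw.strip()
    digits = re.sub(r"[^0-9]", "", stripped)
    if stripped.startswith("+"):
        # Already in international form; just remove the punctuation.
        return "+" + digits
    if trunk_prefix and digits.startswith(trunk_prefix):
        # Drop the national prefix, which is not dialed from abroad.
        digits = digits[len(trunk_prefix):]
    return "+" + calling_code + digits

print(to_international("01234 567890", "GB"))  # +441234567890
```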

-----


Additionally, these rules can and do change. For example, I know that the national numbering plan in the Czech Republic changed around 2002, so that the leading 0 for national calls was removed. Getting all websites in the world to understand changing numbering plans would be quite a feat, and for most sites not worth the effort.

http://www.itu.int/itudoc/itu-t/number/c/cze/76045.html

-----


This gets into dial semantics vs. number format, which then becomes an element of user behavior. I don't know what user behavior is like in a large number of other countries, but in the U.S. I know the typical reaction to "enter your full phone number" is area code + number, not 1 + area code + number.

-----


Indeed. In the UK, the zero is always regarded as part of the area code.

-----


http://wtng.info/ would be a good place to start. It's a lot of work to do right. You have to support full international numbers as short as 7 digits (Tokelau has +690 XXXX) and as long as 15 digits (in Germany the telco basically gives subscribers a subnet-like functionality, as extra dialled digits are passed along during call origination).

Ideally you'd be able to parse it into a punctuation format based on local practice, but just about everyone punts on that.

-----


Good question. Unless there's some table of valid phone number patterns corresponding to various regions, the best you can do might be to check for dashes, digits, dots, and parentheses. Maybe also check that it doesn't exceed some arbitrary but generous length, too. (Ugly, I know.)
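
Something like this Python sketch, where the length limit is an arbitrary guess and allowing spaces and "+" goes slightly beyond the list above (both are assumptions, as are the names):

```python
import re

MAX_LENGTH = 30  # arbitrary but generous; an assumption, not a standard
ALLOWED_CHARS = re.compile(r"[0-9+().\- ]+")

def looks_like_phone_number(raw):
    """Permissive check: only phone-like characters, sane length, some digits."""
    raw = raw.strip()
    if not raw or len(raw) > MAX_LENGTH:
        return False
    if not ALLOWED_CHARS.fullmatch(raw):
        return False
    # Reject input made of punctuation only, such as "---".
    return any(ch in "0123456789" for ch in raw)

print(looks_like_phone_number("(210) 555-1212"))  # True
print(looks_like_phone_number("call me maybe"))   # False
```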

-----


We've been told for years that database normalization is good. Removing data redundancy is good. This is one reason forms are broken into so many separate fields as discretely as possible.

That said, once a company decides to do business outside their country of origin, some serious rethinking of customer-facing forms (and fields, if applicable) is in order.

-----


I should probably clarify something, given how heated the discussion is becoming.

I could have written the same thing as a nerdy techie article, pointing out flaws in input validation. Do you think this would have gotten my point across? How many people would have read that article? Would it have made it to Hacker News?

Look: this is not a rant about Americans. It just so happens that the US has developed most of the internet as we know it and it just so happens that most of e-commerce happens there. I'm sure any number of websites in other countries are guilty of similar sins, but you have to start somewhere.

The point is, we should start thinking more carefully about how we validate forms and store user data in that "global economy" everyone is bragging about.

-----


"Would it have made it to Hacker News?"

Heaven knows, we never see nerdy, technical material on this site.

Are you trying to "clarify" or "justify"?

EDIT: Serious question. At the same time you're saying "this is not a rant about Americans", you're also saying "but none of you would have paid attention if I hadn't made it one".

-----


I think on some level this is a rant about Americans, even if you didn't intend it to be one. My Indian and German friends are aware of the fact that not every country has states, even though their countries do. I would bet money on the same being true for Brazilians and Australians. I would bet money on Japanese people being aware of the fact that not every country is partitioned into prefectures.

Maybe this is because Indians, Germans etc. are not socialized to see their way of running a nation as the only way, at least not in the way Americans are. Maybe not. In any case it's a nice demonstration of how collective cultural ignorance leads to annoyed visitors and, one might presume, lost sales.

-----


Brazilian websites are just as bad as, if not worse than, American ones about this, probably because websites that do online sales in Brazil tend to cater to a primarily domestic clientele. If you happen to live in Argentina, as I do, entering your address when you buy things in Brazil can be comically frustrating.

Americans definitely don't have a monopoly on ignorance when it comes to other countries' address formats.

-----


See my post elsewhere, and maybe your German friends aren't Web designers, but I've had problems with addresses on German forms - just because I speak German doesn't mean I live there. So this is a universal problem, not an American one: internationalization is hard, because there is a tremendous amount of domain-specific knowledge that is not convenient to find.

-----


"My Indian and German friends are aware of the fact that not every country has states, even though their countries do"

Apparently not: http://news.ycombinator.com/item?id=1232278

-----


YES YES YES YES YES! I've always wondered about this. I once submitted a form saying that I wanted something delivered to New York, Singapore. Was very lucky that it even got delivered, much less sent to the right address.

-----


I thought we already knew this whole web2.0 thingie is more than rounded corners. It's the "web services and integration" that's interesting.

There is more than one service offering a geocoding API. Use any of them to do the hard work for you.

Your form should have one single field, named "Address". Let the user type whatever s/he wants. Get this input and pass it along to Google Maps, or Yahoo Maps, or WhateverMaps. Your geocoding service will return things in a way that is obviously more parseable.

-----


I've had that issue with a Department of Education website. I moved to Canada for a second degree recently and I was unable to set my address with them to here because the state field did not reset when I selected Canada.

I agree with this article: if you set a country other than USA, the state field should go away or, better, don't preselect the US and add the field later.
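
On the server side, the same idea could be sketched in Python like this. The field names and the set of countries are illustrative assumptions, not a complete list of places whose addresses need a region.

```python
# Illustrative assumption: countries whose postal addresses need a state/province.
COUNTRIES_REQUIRING_REGION = {"US", "CA", "AU"}

def validate_address(form):
    """Return a list of error messages; the region is required only where it applies."""
    errors = []
    country = form.get("country", "").strip().upper()
    region = form.get("state", "").strip()
    if not country:
        errors.append("country is required")
    elif country in COUNTRIES_REQUIRING_REGION and not region:
        errors.append("state/province is required for this country")
    return errors

print(validate_address({"country": "PL", "state": ""}))  # []
print(validate_address({"country": "US", "state": ""}))
```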

-----


all of which required me to provide a "State" name (I don't have one)

As the author is in Poland, I'd argue that voivodeships are equivalent, in terms of being the highest level subdivision of the country. As a Brit, I'd tend to go with the county instead.

-----


How about Puerto Rico (in those cases where Puerto Rico is listed as a country instead of a state-like entity, which is how the US Postal Service treats it)? Hungary? Japan? Germany? Anywhere else that doesn't use a state in the postal address?

Edit on second thought: how about the vast majority of forms of this nature, which have a helpful dropdown list of the fifty valid states, and sometimes DC for the District of Columbia, but almost never PR for Puerto Rico, let alone Guam, the Virgin Islands, and the other areas of the world that the US Postal Service (and law) considers part of the United States?

I lived in Puerto Rico for years, and this is an utterly rampant problem for millions of actual American citizens living in the United States, let alone the rest of the world, all due to the fact that most Americans don't know jack about the world they inhabit.

-----


How about Puerto Rico (in those cases where Puerto Rico is listed as a country instead of a state-like entity, which is how the US Postal Service treats it)? Hungary? Japan? Germany? Anywhere else that doesn't use a state in the postal address?

"State" is really just equivalent to "second-level subdivision" (a bit like how "ZIP code" can mean "postal code" in general, as long as the box allows freeform input). Most countries don't call their second-level subdivisions "states" but counties, municipalities, council areas, and the like. Puerto Rico has municipalities and Japan has prefectures, for instance.

the other areas of the world that the US Postal Service (and law) considers part of the United States?

The places you mention are all unincorporated organized territories, meaning they are not considered to be "part of the United States proper." The word "proper" is somewhat important in this distinction, of course, and the USPS's definitions do nothing to help clear the waters on this one.. :-)

-----


They do if you want to receive mail. Puerto Rican municipalities have nothing whatsoever to do with mailing addresses (and in fact the situation is worse, since many PR addresses include an "urbanization").

Moreover, Indiana has counties that correspond to Puerto Rican municipalities. Drawing this sort of specious parallel is entertaining, to be sure, but doesn't help people place orders.

-----


Drawing this sort of specious parallel is entertaining, to be sure, but doesn't help people place orders.

It does if the order form requires this information. A little extraneous information on the postage label isn't going to cause significant delays for your mail as long as the underlying details are right (street address and ZIP code for most of the US - everything else is filler). Mail delivery systems are designed to deal with this (hence the introduction of the ZIP code in the first place) because people don't always address their mail to technical specifications.

-----


Seems like in Britain the closest equivalent to US states is countries (England, Wales, Scotland). But countries within countries is just too danged confusing.

-----


Nope. Counties are.

-----


In what way are "American Website Owners" responsible for this problem? I've seen it plenty of times going in the other direction. Such a typical European way of thinking...

-----


And fix those drop-down menus for country where I have to scroll allllll....the.....waaaaay....dowwwwwwn to reach United Kingdom!

-----


If it's an HTML <select> element then you can click it to give it focus, then start typing the name of the option you wish to select.

-----


But it would be easier for the customer if the website guessed from the customer's IP address which country they were in, and put it at the top of the list (as well as in its normal position).
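
A small Python sketch of just the list-building part, assuming the guess comes from some IP lookup that is not shown here (the function and parameter names are made up):

```python
def country_options(countries, guessed=None):
    """Alphabetical list, with the guessed country repeated at the top."""
    ordered = sorted(countries)
    if guessed in ordered:
        # Keep the guessed country in its normal position as well.
        return [guessed] + ordered
    return ordered

print(country_options(["Poland", "Albania", "United Kingdom"], "United Kingdom"))
# ['United Kingdom', 'Albania', 'Poland', 'United Kingdom']
```

If the guess is missing or not in the list, the plain alphabetical list comes back unchanged.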

-----


My favourite are the dropdowns that place "popular" countries at the top. This means that if I scroll through the list then my country may or may not be where I expect it based on how the developer was feeling that day.

-----


That sounds like a good little usability trick. I might do that next time I have a countries dropdown in a form.

-----


While we are on the topic, can we lose the dropdown list of 196 countries you got from the ISO table?

Seriously, how many products have you shipped to Afghanistan or Albania? Yet you make everyone scroll past these to get to the only country you actually deliver to (+ Canada if you're lucky).

-----


Also annoying are email address parsers that don't accept .name domains.

-----


Dear Non-American Website User:

You do not live in the United States of America. You think all American website owners are the same. And then you decide it might just be a good idea to lump the 95.4% who know what they're doing with those who don't. But you know, most American website owners are no more alike than most American <anything>.

The moment you whine about the few to the many in a blog, two things do happen:

- you irritate the people who know what they're doing

- you don't reach the people you should

If you don't like something about an American website, tell them, not the rest of us. That way, you're more likely to actually accomplish something. Which is, after all, the American way.

-----


Do you respond to other general advice to Website operators in the same way, or just to foreigners telling Americans what to do?

-----


An interesting point about tone.

If this had been pitched as "sites do international addresses wrong" as opposed to "Americans do it wrong!", there'd probably be less defensiveness, less snideness, less "typical American can't read", etc.

-----


Look, I'm an American, and having lived elsewhere, I've been bitten many a time by this kind of thoughtless antipattern. And when I am, I don't think, "Oh my, this Web designer of uncertain origin didn't study his field," I curse, moan, and wail, "Stupid bleeping Americans have failed again to understand the complexity of their world!"

I don't know where you're from, obviously, but having grown up in Indiana, I'm all too aware of the oblivious American nature, and I find the original post entirely justified in its tone. As far as I'm concerned, the defensiveness here is good ol' American exceptionalism at work, and anybody who wants to have global business had better get over that.

I second or third the notion that a nice library would be a Good Thing.

-----


"I'm all too aware of the oblivious American nature"

Correct, but it's only somewhat less parochial to assume that one's own country is uniquely oblivious to the rest of the world than to be oblivious to the rest of the world. After all, you were the first person in the thread to start pointing out how sites from other countries around the world fail in their own ways.

The simple truth is that without being whapped in the face by alternate standards for things like address formats, people assume everyone else does things their way. That a Pole runs into this much more quickly than an American is a natural consequence of living in a smaller and less populous country with many nearby neighbors.

"As far as I'm concerned, the defensiveness here is good ol' American exceptionalism at work"

Please! I've dealt with too many people from other countries to buy that. When "outsiders" start criticizing people who identify with any nationality based on their nationality, those people usually react with defensiveness and hostility. Part of dealing with people from other countries is learning how not to provoke such reactions.

-----


If you have a free-form field, how often do people screw up their own address or phone number?

And if you alienate them, does your support cost go up or down?

-----


Is this the place to complain about other car drivers too?

-----


Actually, as a web site owner, for a number of my sites, they are INTENTIONALLY USA only. Sorry that you happened to stumble across them, but I either do not care about your business, don't want to deal with the hassle (international shipping, phone, regulations, etc.), or legally cannot serve your area.

-----


I'm pretty sure this criticism is not leveled at your sites, which doubtlessly have nowhere to enter a country, either. But if they have a space for your country but don't make it possible to enter foreign addresses, which is very often the case, then the operator has a problem and should probably think a little harder about their data structures and form layout.

-----


Acually, as typical American, you are unable to read and understand such short text. It is directed to those website owners who do want to deal with the hassle you don't care about.

"The moment you put a "country" field in your form, two things should happen:" - that sentence should give you a clue.

-----


"Acually, as typical American"

It helps, when mocking a people, to convey such mockery through proper use of their language. ;)

More seriously, there's no need to douche up the conversation by throwing around slurs against nationalities.

EDIT: Apparently people disagree on that need. Ah, well.

-----


It used to annoy me too. When I saw those forms, I usually assumed a website owner didn't want to talk to me anyway. But then I realized that's okay. The US generally doesn't care for the rest of the world and I believe they have the right and the privilege to do so. So it's either I move to the US or stop complaining. And US companies that, say, want my money - they usually have the right kind of forms already.

-----


I disagree; there's nothing wrong with wanting to raise web standards. If more people complained, we'd have a better web. Some sites still don't have encryption when entering credit card numbers.

-----


Sure, but the design of a web form is hardly a standard.

-----


I think being international is a standard of sorts. Eg:

- handling international dates (say, Feb 1 2007 rather than 01/02/07 or 02/01/07)

- allowing people to enter their normal phone numbers and addresses

- knowing that United Kingdom is preferred for addressing and Great Britain is not.
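
For the date item, a minimal Python sketch of two unambiguous renderings; the month table is hardcoded so the output doesn't depend on the server's locale, and the function name is made up:

```python
from datetime import date

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def unambiguous_date(d):
    """Spell out the month so day and month cannot be confused."""
    return f"{MONTHS[d.month - 1]} {d.day} {d.year}"

d = date(2007, 2, 1)
print(unambiguous_date(d))  # Feb 1 2007
print(d.isoformat())        # 2007-02-01
```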

-----


I want to see the library and code that handles all of those possibilities.

-----


Ask a website nicely - 95% of them handle this fine. Or write it yourself.

-----


Dear Foreigner,

Okay it’s probably time to come clean.  In reality the problem you highlight is all a secret plot to frustrate you and hence make you less effective and unable to compete with us in the global economy.  We were all told to do so through coded messages that are embedded into episodes of American Idol (I bet you thought all that stuff Paula Abdul said was just drunken babbling)

It’s all part of a far reaching plot by our government to subvert the rest of the world.  Hence Google pulling out of China and the creation of Microsoft Windows (we secretly use a highly stable open solution and just use a keyboard shortcut to make it look like Windows when you walk into the room much like that fake looking spreadsheet that was embedded into the DOS version of Tetris)

Sorry for the inconvenience (though not really as pointed out above),

America

-----


How does this reply add value? The blog post presents a problem that I (being an American) rarely even think about. I wanted to see how others deal with this issue. WTF is this reply doing?

-----


This is what your downvote button is for.

-----


I did downvote, but at the time, the parent post had over 15 points and I wanted to know why.

-----


The idea that people other than Americans are real consumers with real money to spend is inherently funny; the idea that it might make good business sense to not put gratuitous hurdles between them and your checkout is practically hysterical. It's not like the US has an external trade deficit larger than the GDP of Taiwan or anything.

-----



