Îñţérñåţîöñåļîžåţîöñ (googlewebmastercentral.blogspot.com)
141 points by pwim on Sept 10, 2011 | 30 comments

An extremely important point that most native English speakers just don't seem to get: don't force a localized version on your users.

In many countries your users are probably going to prefer English to a translated version for many reasons, and that's even before taking into account that you probably can't translate it well anyway (even with a lot of resources and effort), which makes it even worse.

Always make it ridiculously easy to switch to English, always. In many cases you probably should make English the default even though you have a localized version (this is especially true for web sites: few things are as annoying as having to change language again and again and again, on every device, browser, etc.). Example: YouTube shows an annoying message telling me that a localized version is available, and dismissing it forces a page reload, so you have to restart the video. True story.

This of course depends heavily on your target audience and their culture, but don't assume a translated page/app is a favor. You just might end up with some confusion and a group of really outraged users. (Boy, do I loathe applications that assume I want a translated version. Always give me the option to change it to English during installation.)

Direct machine translations might be better than nothing for some cultures, but for others they're a slap in the face and will just make your company look pathetic. Example: Adobe.

I'm a non-native English speaker and all my software is in English. Still, I understand why developers/publishers want to default to my local language.

It all depends on your target demographic. If you are targeting young, geeky people then it's OK to have English around. But if you are targeting every joe sixpack and grandma, then it's probably smarter to present localized versions.

Optimum of course would be to query the underlying platform for locale information. That way everybody gets what they want almost always.

And while we are talking about localization, I need to point out that localization is much more than just translating the UI texts. It also includes stuff like date and number formats, correct units, and lots of other stuff. I'd even argue that those are more important than translations.
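To make the date/number point concrete, here is a minimal sketch of the same value rendered under three sets of conventions. The `CONVENTIONS` table and both helper functions are hand-written for illustration (they are not any library's API); a real project would more likely take this data from CLDR via a localization library.

```python
from datetime import date

# Hand-written illustration table (an assumption, not a library API).
# Real applications should take these conventions from CLDR data.
CONVENTIONS = {
    "en_US": {"decimal": ".", "group": ",", "date": "{m:02d}/{d:02d}/{y}"},
    "de_DE": {"decimal": ",", "group": ".", "date": "{d:02d}.{m:02d}.{y}"},
    "sv_SE": {"decimal": ",", "group": " ", "date": "{y}-{m:02d}-{d:02d}"},
}


def format_number(value: float, locale_code: str) -> str:
    """Format a number with two decimals using the locale's separators."""
    conv = CONVENTIONS[locale_code]
    us_style = f"{value:,.2f}"  # e.g. 1,234,567.89
    # Swap separators through a placeholder so they do not clobber each other.
    return (
        us_style.replace(",", "\0")
        .replace(".", conv["decimal"])
        .replace("\0", conv["group"])
    )


def format_date(day: date, locale_code: str) -> str:
    """Format a date using the locale's short numeric pattern."""
    pattern = CONVENTIONS[locale_code]["date"]
    return pattern.format(y=day.year, m=day.month, d=day.day)


if __name__ == "__main__":
    for code in CONVENTIONS:
        print(code, format_number(1234567.891, code), format_date(date(2011, 9, 10), code))
```

Note that the US and German date strings for the same day differ only in field order and separator, which is exactly the kind of thing users misread when only the UI text is translated.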

"Optimum of course would be to query the underlying platform for locale information." - This isn't true.

Most people are not in control of that. For many, many years I had to work on a Swedish version of Windows, because that is what the company paid for. But that doesn't mean I want my CD-burner application or whatever to talk to me in "bad" Swedish. Never mind trying to Google localized error messages; that's another pain I'd rather be without.

That situation has somewhat improved in Windows 7. In the Enterprise and Ultimate editions users may choose their language freely. The MS docs are a bit unclear, but apparently there are Language Interface Packs for other editions too. And I think the system locale is settable even without the corresponding language pack, but I'm not sure what it actually does.

But yes, I forgot that the language settings in Windows are a bit strange.

> Always make it ridiculously easy to switch to English, always. In many cases you probably should make English the default even though you have a localized version (this is especially true for web sites: few things are as annoying as having to change language again and again and again

Yeah, like google.com. Whatever I do, it just does not seem to be possible to permanently set the US version of google.com as the default instead of the local version. It's especially frustrating when traveling to other countries and getting it in a language you do not understand.

I think what fixed this for me was setting the browser language to English too. Google stopped trying to force the language change after I changed the browser settings in this way, or in some similar way (maybe the locale).

I have that as my homepage to make sure that the cookie is set every time, and I still occasionally end up with search results or google products in non-English.

Agreed. And it annoys the hell out of me if I fire up my Android browser and end up on a Hebrew home page, again.

Somehow I thought that we already had a mechanism to negotiate the language between server and browser. Ah well, let's just ignore Accept-Language, shall we?

Would this order of detection be good:

1) Detect in Cookie (Pre-set by user)

2) Detect Browser's locale (You can choose it via options)

3) Guess via IP address (This is if there's no locale for some reason but Geo IPs tend to be bad)
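A minimal sketch of that order, assuming a hypothetical list of supported languages and that the cookie value, the raw `Accept-Language` header, and a Geo IP guess have already been extracted from the request. The function names are made up for illustration, and the header parsing is deliberately simplified (it handles q-values but not every corner of the spec).

```python
from typing import List, Optional

SUPPORTED = ("en", "sv", "he", "de")  # hypothetical set of translated languages
DEFAULT = "en"


def parse_accept_language(header: str) -> List[str]:
    """Return the language tags from an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a tag without an explicit q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(weighted)]


def match_supported(tag: str) -> Optional[str]:
    """Map a tag such as 'en-us' to a supported primary language, if any."""
    primary = tag.split("-")[0]
    return primary if primary in SUPPORTED else None


def choose_language(cookie_lang: Optional[str],
                    accept_language: Optional[str],
                    geoip_lang: Optional[str]) -> str:
    # 1) An explicit choice stored in a cookie always wins.
    if cookie_lang in SUPPORTED:
        return cookie_lang
    # 2) Otherwise honor the browser's preferences, in the user's order.
    for tag in parse_accept_language(accept_language or ""):
        matched = match_supported(tag)
        if matched:
            return matched
    # 3) Only then fall back to the (unreliable) Geo IP guess.
    if geoip_lang in SUPPORTED:
        return geoip_lang
    return DEFAULT
```

For example, `choose_language(None, "sv;q=0.8, en-US", "he")` picks `"en"`: the browser ranks English above Swedish, and the Geo IP guess is never consulted. Whatever the outcome, the page should still offer a visible way to override it.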

What examples have you seen that are good at presenting English (or similar languages) as an option?

Some examples I know of:

Facebook.com presents the most popular languages on the front page, with an option to click for more.

YouTube.com, where the bottom of the page just says "Language: English" with no other indication of i18n, even though the website is used by plenty of non-English speakers.

I hate websites that guess my locale via my IP address. What a PITA when travelling abroad, or browsing through foreign proxies...

If my browser's locale is English, display me your website in English.

1. A cookie is of course great, but keep in mind that cookies aren't permanent and don't propagate across browsers and devices; that's why I'm so bothered by YouTube's approach. Actually, YouTube's approach isn't _that_ bad: although they ignore my browser locale, they only inform me that there is a localized version. The problem is that they do it in a really obtrusive way. If there were a small localized flag in the top-left corner or something, that'd be fine and I wouldn't have to dismiss the thing all the time (or if they respected my browser locale).

2. Browser locale is a neat idea but it has flaws. You might not be at your own computer (school, work, café, etc.), which raises the question of what permissions you need to change the setting; there should also be a way to set it temporarily for the current session (does any browser let you do this?). So don't assume it is correct; most users surely don't know the setting even exists. Debian.org was perhaps the first site I stumbled upon that followed the browser's locale (before I knew sites could even check for it), with no mention of it and no other way to change the language. I actually thought it was a phishing site at first, and after assuring myself that I was at the right domain I couldn't for the life of me find the English version. I started googling and found others with the same problem in forums, and that's how I found out that there was a setting in my browser and that Debian keyed off it. That was a really horrible and time-consuming experience. They've fixed it quite well now: they default to the browser locale but list all languages at the bottom so you can change it quickly, and they also explain why they defaulted to what they did and how to change it. Great!

Facebook not only presents the most popular languages on the frontpage, they also check your Geo IP (I think) and put the language they guess you'd prefer first on the list (nice). The problem is that they don't respect my priorities. Even though I have English first in my browser locale, they default to Swedish (my second choice); only if I remove Swedish altogether does it finally default to English. So they think they know better than my browser what language I really prefer, sigh... (I assume that later on the setting is stored in the account (I don't use Facebook), which is great of course, but on sites that don't require an account, such as YouTube, please choose your defaults carefully.)

3. There are obvious problems (and benefits of course) with this. You really have to weigh which option is more cumbersome for your users, and to do this you need an understanding of the culture and your target group for every language you implement. Some might be turned off by an English site and others will scream in agony if they can't find a quick way back to the English site.

Always assume you might get your default wrong (if the computer is at an internet café the cookie might be incorrect as well), and always make it easy to switch to English.

One thing I really dislike (popular with large corporations) is a whole page that makes you choose a region before entering the site without explaining why. Are they basing the language on this choice (most likely; I just hope they haven't implemented mine yet), and/or is the information on the site only relevant for that specific region? If I go to the site to look up warranty information for a product, I must choose between a badly translated page that is seldom up to date but has localized information, and the official/US/UK site that will be comprehensible and up to date but whose information might not be valid in my region anyway. OK, that was more of a rant, but it's true for so many sites it's not funny, and even when they do a good job of it they communicate it badly, so the experience is bad anyway.

Localization is a difficult problem that has the potential to annoy many people no matter what you do.

We're currently going through our internationalisation test phase for a major release and it's been a huge headache.

My one piece of generalised advice: DO NOT leave internationalisation testing to the end of the release. Make sure you're pumping Unicode strings through your app in unit tests. Make sure you're regularly viewing your app in other languages, using mock strings if you haven't got the translations yet. The earlier you find these defects, the easier they are to fix. There's a perception that internationalisation defects are easy fixes, just a matter of updating a messages file. The truth is that major parts of your app may not work unless you regularly test them. And don't rely on the Unicode abilities of your chosen framework, language or operating system. If you haven't tested it, it doesn't work.
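A minimal sketch of both ideas, under stated assumptions: `pseudo_localize` generates mock strings in the spirit of the article's title, and `store_and_load` is a hypothetical stand-in for whatever path a string takes through your own app (database, template, HTTP response). Neither name comes from a real library.

```python
import unittest

# Illustrative mapping only (an assumption); extend it to taste.
ACCENTED = str.maketrans({
    "a": "å", "e": "é", "i": "î", "n": "ñ", "o": "ö", "t": "ţ",
    "A": "Å", "E": "É", "I": "Î", "N": "Ñ", "O": "Ö", "T": "Ţ",
})


def pseudo_localize(text: str) -> str:
    """Return a mock 'translation': accented, about 30% longer, bracketed.

    The brackets make truncated strings easy to spot, and the padding
    simulates languages whose text runs longer than English.
    """
    padding = "~" * max(1, len(text) * 3 // 10)
    return "[" + text.translate(ACCENTED) + padding + "]"


def store_and_load(text: str) -> str:
    """Hypothetical stand-in: replace with a real trip through your app."""
    return text.encode("utf-8").decode("utf-8")


class UnicodeRoundTripTest(unittest.TestCase):
    SAMPLES = ["Îñţérñåţîöñåļîžåţîöñ", "שלום", "日本語", "Ω→µ"]

    def test_strings_survive_the_round_trip(self):
        for sample in self.SAMPLES:
            self.assertEqual(store_and_load(sample), sample)

    def test_mock_strings_survive_too(self):
        mock = pseudo_localize("Save file")
        self.assertEqual(store_and_load(mock), mock)


if __name__ == "__main__":
    unittest.main()
```

Running the UI with every string passed through something like `pseudo_localize` also exposes hard-coded English text immediately: anything that shows up without brackets never went through the translation layer.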

Here's a real life scenario:

1) Develop a project in a rush

2) Becomes successful

3) Boss/Client decides to launch it in 2/5/10/22/34/82 countries

4) Project becomes one big hack

One simple Unicode test and 10 minutes of initial thinking would've saved all this trouble.

> We can’t recommend this enough: reuse the same template for all language versions

This can get quite tricky, because languages don't easily map onto 1-to-1 items that can be just translated independently and stuck into a template. See the classic "localization horror story": http://interglacial.com/tpj/13/ (and HN discussion: http://news.ycombinator.com/item?id=2095334)

It's not a problem with HTML, but with the method you use to construct translation string key names.

If you have `<button>file</button>` and use it in both "[file] a complaint" and "save to a [file]", then you're in trouble indeed.

But you can fix that by adding context to translation keys in the template, e.g. in TAL you can do `<button i18n:translate="'file (verb)'">file</button>`.
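The same idea outside of TAL, as a minimal sketch: the catalog is keyed by (context, source string) rather than by the source string alone. The `CATALOG_DE` dictionary, the context labels, and the `translate` helper are all made up for illustration.

```python
from typing import Dict, Tuple

# Hypothetical German catalog keyed by (context, source string).
CATALOG_DE: Dict[Tuple[str, str], str] = {
    ("verb", "file"): "einreichen",  # as in "[file] a complaint"
    ("noun", "file"): "Datei",       # as in "save to a [file]"
}


def translate(context: str, message: str,
              catalog: Dict[Tuple[str, str], str]) -> str:
    """Look up a message in its context; fall back to the source string."""
    return catalog.get((context, message), message)


if __name__ == "__main__":
    print(translate("verb", "file", CATALOG_DE))
    print(translate("noun", "file", CATALOG_DE))
```

With a catalog keyed only by `"file"`, one of those two buttons would necessarily get the wrong word.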

It's not just an issue of polysemy (that "file" means two different things in English), but that the structure itself may need to be different in different languages, because languages have different grammars, and different ways of dealing with objects/numbers/etc.

In the example I linked above, there's a whole mess of rules to capture if you want to handle singular/plural correctly in different languages, including those that distinguish more cases than singular and plural. It boils down to the fact that word-for-word translation doesn't work in all but the simplest cases: you can't translate "[a] [b] [c] [d]" by having 4 lookup tables for [a], [b], [c], and [d] and then plugging them into a template independently.
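A minimal sketch of the plural part of the problem, assuming gettext-style plural rules: each language gets a function that maps a count to a form index, and the translator supplies one complete phrase per form. The tables and function names here are illustrative, not any library's API.

```python
from typing import Callable, Dict, Tuple


def english_rule(n: int) -> int:
    """Two forms: singular for exactly 1, plural otherwise."""
    return 0 if n == 1 else 1


def russian_rule(n: int) -> int:
    """Three forms, chosen by the last digits of the count."""
    if n % 10 == 1 and n % 100 != 11:
        return 0  # 1, 21, 31, ...
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1  # 2-4, 22-24, ...
    return 2      # 0, 5-20, 25-30, ...


PLURAL_RULES: Dict[str, Callable[[int], int]] = {
    "en": english_rule,
    "ru": russian_rule,
}

# One complete phrase per plural form, never glued together word by word.
FILE_COUNT: Dict[str, Tuple[str, ...]] = {
    "en": ("{n} file", "{n} files"),
    "ru": ("{n} файл", "{n} файла", "{n} файлов"),
}


def file_count(lang: str, n: int) -> str:
    """Pick the right plural form for the count and fill in the number."""
    form = PLURAL_RULES[lang](n)
    return FILE_COUNT[lang][form].format(n=n)


if __name__ == "__main__":
    for n in (1, 2, 5, 11, 21, 22):
        print(file_count("en", n), "|", file_count("ru", n))
```

The English-style `if n == 1` check silently produces wrong Russian for counts such as 21 or 22, which is why the rule has to live with the language data rather than in the template.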

Gettext now offers much the same solution (message contexts, `msgctxt`).


Oh yes, this is tremendously fun! I worked on i18n/g11n/l10n for Y! World Cup and other games.

Compare: http://us.wc.fantasysports.yahoo.com/ http://hk.wc.fantasysports.yahoo.com/ http://tw.wc.fantasysports.yahoo.com/

It's a tremendous amount of work, and you have to identify translatable phrases AND contexts, not just strings and words. Websites that plan on catering to multiple languages and regions should incorporate i18n from the beginning, not as an afterthought, which would require going through lots of code with a fine-toothed comb =)

If i18n is a priority on your project, make sure you show your designer(s) this article. With the current trending emphasis on grids and horizontal rhythm, left-aligned labels are quite fashionable.

An effective band-aid solution is to set left-aligned labels to `text-align: right`, thereby visually linking the variable-width label to the field.

Good to see that Hacker News deals correctly with those beautiful glyphs :-)

It's pretty hard to screw up copying a string from one place to another without modification, at least when it comes to encodings without null bytes.

Some programs manage it somehow, though.

You'd actually be surprised how many web apps get this wrong.

I assume Arc Lisp is no longer ASCII-only? I seem to remember that being a complaint when it came out.

Actually, Arc Lisp now allows for any kind of unicode madness. Øþ@ßðđŋħjłĸŝ¢łµæ¶ŧ←↓→ΩŁ§ÐªŊĦJ

See: http://news.ycombinator.com/item?id=111100

Good article. Now can anyone recommend a service for replacing the copy in a site with copy in different languages, so I can use my same templates?

Something like Google Website Optimizer, but one that replaces strings with the appropriate language?

I'd be interested in knowing what made them write that IntelliJ IDEA provides "decent solutions for bidirectional editing", considering that JetBrains' Dmitry Jemerov has specifically said that they don't see the business case in supporting RTL editing:


I just want to point out that the title uses a letter which doesn't exist in any language: t with a cedilla. It's a Unicode fuck-up which plagues the Romanian language to this day. Not just on computer systems, in real life too!

Nothing like early morning Unicode abuse. ;)

I'm a web developer with technical editing experience and a native speaker of Arabic. I have translated many popular Firefox extensions into Arabic and worked on the "Google in your language" project, and I also speak English and German fluently. I'm willing to help any HN member who wants to localize their startup/website to an RTL language, for free. If you contact me, I will gladly review the Arabic localization and RTL design, and help with globalization in general, in my spare time, as a thank-you to a community I'm a big fan of.
