
Accept my accept-language - hc5
http://blog.choibean.com/2014/03/accept-my-accept-language.html
======
Udo
Pretty much every major site will completely ignore not only your browser's
language settings but also onsite user account settings and other desperate
attempts at selecting the language. The only thing that matters is your IP
address.

For example, accessing Google: my browser is set to accept English only. I'm
entering the English URL. In my account settings I periodically reset
everything I can find to English (settings apparently decay, too). Google
_knows_ I want the English version. Yet, they still give me the interface in
whatever language my IP address comes from. And not only the UI, search
results as well.

Recently it's gotten even worse than that: Google figured out I'm actually
German, so they start defaulting to German more often now - ignoring
everything else. At least with the IP address-based routing it was impersonal.

I happened to be in Sweden when I linked my Facebook calendar to my Google
calendar. Ever since that day, my friends' birthdays are given to me in
Swedish. Facebook _knows_ I want English, yet for some reason this is how it's
got to be.

The same abuse is apparently considered best practice at new startups as well:
recently I was testing a browser game for an acquaintance who's on their
development team. Because I was in Portugal at the time, I of course got the
site in Portuguese. Manually switching that to English, the game still started
up in Portuguese. It's been doing that ever since. Every email I get from that
company is in Portuguese, too, even though I tried everything I could to set
my language to English.

It's a source of endless frustration, maybe even a hostile act. They're
effectively saying "_Your choices don't matter, we know what's best for you.
You're from country X, so you _must_ speak Xish. People are on the internet to
enjoy regional separation. Really, it's best._"

~~~
nrser
can anyone provide insight into the business reasoning behind this? i really
can't conceive of why you would want to supersede a user's exact, known
language with a guess. sites are pretty difficult to use when you can't read
anything. maybe there are technical issues for some sites, but Google search
is the worst i've ever dealt with, and they def have some resources behind
that.

~~~
preinheimer
Mostly it's because people don't know how to change that setting. Imagine
walking up to a computer in a shared space (hotel lobby, library, etc.) and
it's been configured to send out accept-language: <something you don't
understand>.

Many of the people reading Hacker News will be able to find and change that
setting. My mom never will. She'll just know she went to google.com, and saw
Chinese.

If you're using a computer in a country, and websites seem to be showing you
things in the language of that country, that's something you can probably
understand. If you're using a computer and some websites insist on showing you
some other language, you'll be confused.

~~~
ds9
It's true that most people leave the defaults. However, there may be an easy
solution for a subset of cases.

Do the browser defaults in any country include multiple language settings? It
seems likely that in most countries, the default would be only one language.
And if this is the case, then if multiple alternatives are present in the
request headers, it's very likely the user or computer admin has deliberately
changed it, and that in turn would mean that sites should respect the choices.

This might still be wrong when the settings were made by someone other than
the current user, or there are multiple languages default-configured, but it
might be a step in the right direction.
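
A minimal Python sketch of that heuristic, under some assumptions: the function name is made up, and it counts distinct primary languages rather than raw tags, because a header like "en-US,en" lists two tags but only one language.

```python
def looks_deliberate(accept_language: str) -> bool:
    """Guess whether a header was customised, using the
    'more than one language listed' heuristic described above."""
    primary_languages = set()
    for item in accept_language.split(","):
        # Drop any ";q=..." parameter and normalise case.
        tag = item.split(";")[0].strip().lower()
        if tag and tag != "*":
            # "en-us" and "en" both count as the primary language "en".
            primary_languages.add(tag.split("-")[0])
    return len(primary_languages) > 1


print(looks_deliberate("en-US,en;q=0.9,ko;q=0.8"))  # two languages listed
print(looks_deliberate("en-US,en;q=0.8"))           # one language, two tags
```

It is only a guess: a browser installed on a system with two configured languages could send both without anyone having touched the setting.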

------
rwg
You probably want something like "en-US,en;q=0.9,ko;q=0.8". (Note the addition
of "en" between "en-US" and "ko".) Some quick testing with Firefox, which lets
you directly alter the Accept-Language: header in requests in about:config,
shows that fedoraproject.org has "en" versions of resources but not "en-US"
versions. Since your Accept-Language: header only lists "en-US", "ko" ends up
being selected.

 _EDIT:_ I just noticed your guesses at the bottom of the post. Your second
guess is correct. See §14.4 of RFC 2616:

 _As an example, users might assume that on selecting "en-gb", they will be
served any kind of English document if British English is not available. A
user agent might suggest in such a case to add "en" to get the best matching
behavior._
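
To make the difference concrete, here is a rough Python sketch, not any real server's code. The function names and the `available` set are assumptions for illustration. It compares an exact-match server with a lenient one that also tries shorter prefixes of each requested tag (similar in spirit to the "lookup" scheme in RFC 4647).

```python
def parse_accept_language(header):
    """Return language tags ordered by descending q-value.
    A missing q parameter means q=1.0; q=0 entries are dropped."""
    entries = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0].lower()
        if not tag:
            continue
        q = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        if q > 0:
            # Sort by q descending, then by original position.
            entries.append((-q, position, tag))
    return [tag for _, _, tag in sorted(entries)]


def strict_match(header, available):
    """Exact matching only: 'en-us' does not match an 'en' resource."""
    for tag in parse_accept_language(header):
        if tag in available:
            return tag
    return None


def match_with_fallback(header, available):
    """Also try progressively shorter prefixes: 'en-us' falls back to 'en'."""
    for tag in parse_accept_language(header):
        candidate = tag
        while candidate:
            if candidate in available:
                return candidate
            candidate = candidate.rpartition("-")[0]
    return None


available = {"en", "ko"}  # assumed: the site has "en" and "ko" but no "en-us"
print(strict_match("en-US,ko;q=0.8", available))
print(strict_match("en-US,en;q=0.9,ko;q=0.8", available))
print(match_with_fallback("en-US,ko;q=0.8", available))
```

With the strict matcher, adding "en" to the header is what makes English win; with the lenient one the original header would already have worked.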

------
delroth
I just fixed dolphin-emu.org, this was a bug in our code that would not detect
en-US as being "compatible" with en. See
[https://github.com/dolphin-emu/www/commit/ddef974c6f601bc2db...](https://github.com/dolphin-emu/www/commit/ddef974c6f601bc2dbbcc499c8990f4dd074615f)

As a French guy living in the German part of Switzerland with Accept-Language
configured to get English content, I'm kind of ashamed to have that kind of
bug in my language detection code. I'm always complaining about other websites
language detection, looks like I should have looked at my own code first!

~~~
spacehunt
Please don't do this... I _detest_ sites that try to be clever and serve me
Simplified Chinese even though I only have zh-hk in Accept-Language:.

~~~
delroth
I already have exceptions for things like that. I think our code handles
zh_{CN,TW,HK} separately, as well as things like pt_BR vs. pt.

    
    
        > curl -I -H 'Accept-Language: zh-hk,en;q=0.8' https://dolphin-emu.org/
        HTTP/1.1 200 OK  # No zh_HK translation (yet!)
    
        > curl -I -H 'Accept-Language: zh-cn,en;q=0.8' https://dolphin-emu.org/
        HTTP/1.1 302 FOUND
        Location: http://cn.dolphin-emu.org/?cr=cn
    
        > curl -I -H 'Accept-Language: pt,en;q=0.8' https://dolphin-emu.org/
        HTTP/1.1 200 OK  # No pt translation (yet!)
    
        > curl -I -H 'Accept-Language: pt-br,en;q=0.8' https://dolphin-emu.org/
        HTTP/1.1 302 FOUND
        Location: http://br.dolphin-emu.org/?cr=br
    

i18n is hard but I think I've been doing a fairly good job on it. Proud to
have more than 50% of our visitors from outside of the US!

~~~
spacehunt
Having now read the full code and not just the diff, I have to say it looks
pretty good. I note that plain "zh" is not redirected to the cn site. ;)
Whether it should or not is debatable though -- I actually think ignoring "zh"
altogether is a rather prudent move if it is intentional.
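
For illustration only (this is not the dolphin-emu.org code), the behaviour discussed in this subthread could be sketched in Python roughly like this. `REGION_SENSITIVE`, `pick_site` and the `sites` mapping are hypothetical names: "en-US" may fall back to "en", but Chinese and Portuguese variants only ever match exactly, and a bare "zh" matches nothing.

```python
# Hypothetical: language families whose regional variants are not interchangeable.
REGION_SENSITIVE = {"zh", "pt"}


def pick_site(requested, sites):
    """requested: language tags in preference order.
    sites: dict mapping a lowercase language tag to a site identifier."""
    for tag in requested:
        tag = tag.lower()
        if tag in sites:
            return sites[tag]          # exact match always wins
        primary = tag.split("-")[0]
        if primary in REGION_SENSITIVE:
            continue                   # never guess between zh-cn/zh-tw/zh-hk
        if primary in sites:
            return sites[primary]      # e.g. "en-us" is served by "en"
    return None


sites = {"en": "www", "zh-cn": "cn", "pt-br": "br"}  # assumed set of translations
print(pick_site(["zh-hk", "en"], sites))
print(pick_site(["zh-cn", "en"], sites))
print(pick_site(["pt", "en"], sites))
print(pick_site(["pt-br", "en"], sites))
```

A request for only "zh" returns `None` here, which mirrors the "ignore plain zh" behaviour noted above.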

------
crazygringo
Language choices are a mess. There can easily (and often) be conflicting data
based on:

- accept-language header

- URL that includes language/region codes as a subdomain or part of the path

- language preferences set in a cookie or account

- IP region detection

In the end, any website is trying to provide the right language most often for
their users, and there are no easy answers. When I access webmail from an
Internet cafe in China, I don't want the interface popping up in Chinese just
because the browser's accept-language is configured for Chinese. Fortunately,
it doesn't.

Most web users have never even heard of accept-language, it's just
automatically configured by whatever language their browser was installed in,
which isn't always the language you want to be browsing in. (E.g. you bought
your laptop overseas because it was cheaper, so it runs in English instead of
your own language.) It's not a surprise that IP address detection provides the
best default experience most of the time, which can then be overridden by URL
or user choice, and that accept-language is fairly irrelevant.

~~~
delroth
What we've done for dolphin-emu.org:

* In all cases, a fairly visible language picker is displayed at the top of the page, with internationalized language names.

* If someone goes to a language-specific subdomain (fr.dolphin-emu.org, cy.dolphin-emu.org, ast.dolphin-emu.org, ...), they get this version.

* If someone goes to the generic/english dolphin-emu.org, the system checks whether the user has a "nocr" cookie. If so, they get the english website. Otherwise, they get redirected based on their Accept-Language.

* If a user uses the language picker, we assume they know what they want and set the "nocr" cookie to disable redirections in the future.

* When the user gets redirected from the standard/english version to an internationalized version, a message is shown in english saying that they have been redirected based on their browser preferences, with a link to go back to the english version (and set the "nocr" cookie).

I thought for a pretty long time about this and think it is a good compromise
between providing the best version for our users and not being
annoying/guessing too much. In the end, more than 50% of our users now are
shown internationalized versions of our website, which is a very good number
in my opinion.
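
The rules above could be sketched in Python along these lines. This is an illustration under assumptions, not the site's actual code: `Decision`, `choose_site` and their parameters are invented names, `preferred` is assumed to be the already-parsed, lowercased Accept-Language list, and the language picker itself is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    site: str                  # language version to serve
    redirect: bool = False     # whether the user is sent to another subdomain
    show_notice: bool = False  # whether to show the "you were redirected" message


def choose_site(host_lang: Optional[str], cookies: dict,
                preferred: list, translations: set) -> Decision:
    # A language-specific subdomain is always served as requested.
    if host_lang is not None:
        return Decision(site=host_lang)
    # The "nocr" cookie disables redirection: stay on the English site.
    if "nocr" in cookies:
        return Decision(site="en")
    # Otherwise walk the Accept-Language preferences in order.
    for tag in preferred:
        if tag == "en" or tag.startswith("en-"):
            return Decision(site="en")
        if tag in translations:
            return Decision(site=tag, redirect=True, show_notice=True)
    return Decision(site="en")


translations = {"fr", "pt-br"}  # assumed subset of available translations
print(choose_site("fr", {}, ["en"], translations))
print(choose_site(None, {"nocr": "1"}, ["fr"], translations))
print(choose_site(None, {}, ["fr", "en"], translations))
```

Using the language picker would then amount to setting the "nocr" cookie before the next request reaches this function.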

~~~
yepguy
This seems like a pretty good solution, except that your language picker
includes country flags, which don't make sense for many users.

~~~
delroth
They do make sense for many users, and they are the closest you can find to a
proper graphical representation of languages. When I add a language that I
know to be official in several countries, I look at my analytics to see where
most users come from and use the flag from their country. I can't remember a
time where it did not also match the country with the most speakers.

~~~
yepguy
It's a common enough practice that most people usually know what it means, but
there's a reason you don't see flags on Wikipedia, Facebook, or Youtube.
Languages are spoken in many countries, and countries are multilingual. There
are quite a few articles around the web on this topic, but that's basically
what they boil down to: languages are not countries. Some users may be
confused or offended that their flag is not represented.

~~~
nfoz
And as a Canadian I find it generally a little weird that the Canadian flag
often means Canadian French, and I have to click the US flag to get English
(which is of course a slightly different English than Canadian English which
is probably unavailable).

I guess it's something like "language most unique to that country", no but
that's not right either... I don't know.

------
raving-richard
Google is^H^Hwas* really bloody annoying when it comes to this. English (en-us
and en) is the only language in my accept header. When I lived in Geneva
though, Google always used to serve me pages in German (presumably Swiss-
German). Gee, that's logical. (Geneva is a mainly French speaking city, though
over 40% of the population is non-Swiss.)

Where I live now is another French speaking area. I just checked and it seems
they are no longer serving French pages to me. But they were even just one
year ago. (I don't use Google by default, so I don't know when they changed.)

Admittedly, that was an issue with geo-detecting rather than the website
having bad language detection.

* They seemed to have stopped.

Air France is (though they have many faults) actually alright at detecting my
language. And mostly gives me English pages...

~~~
tazjin
My accept-language header only has en_GB and en in it. Google still randomly
serves me pages in Swedish and German (which are both languages I speak, but
which I both explicitly disabled in my Google account settings).

The best case of this was when they launched the preview for the new Google
Maps version - there was a landing page with some information and a button in
the middle. This page was served to me in three languages at the same time
(the header, the button and the info text) - presumably served by different
internal components that all handle languages differently.

------
darklajid
My accept header only contains en-US and en. I tend to get served German
content (and Google's especially bad about this).

Please, I hope someone hears your complaints and starts fixing things. That
issue is highly annoying.

~~~
masklinn
> and Google's especially bad about this

Google is a Royal Pain in the Ass on this point. They completely disregard any
request configuration and decide on output language based on IP geolocation
(which is pretty much always Not What I Want, even more so in multilingual
countries such as Belgium or Switzerland[0]), then Chrome "helpfully" suggests
translating documents.

[0] where it won't even send you something matching your actual geographical
location's language, usually sending the country's most common language —
dutch in Belgium and german in Switzerland

~~~
tonfa
I live in Switzerland and google does follow my accept-languages
(en-US,en;q=0.8,fr;q=0.6,de;q=0.4). When going to google.com in incognito I get
google.ch in english which is what I asked.

~~~
_delirium
I don't get that behavior in Firefox. I have 'en-US' and 'en' as my preferred
languages (in that order), set via Preferences->Content->Languages. But when I
go to google.com in incognito, I get google.dk in Danish.

I guess English is preferred here commonly enough that it's at least listed as
one of the two alternate google.dk languages in an easy-to-find place under
the search box, along with Faroese. "Google.dk på: Føroyskt English". And if
you click "English" it stays with it for the session. No luck if you wanted
something else like German, though.

~~~
tonfa
Ok, I think the trick is to have at least one more in addition to en-US+en (I
have a couple more).

------
jtokoph
> 1. the default quality value is being parsed wrong, and English is being
> assigned q=0 instead of q=1 or

> 2. en-US doesn't match en and is being bypassed

Or: 3. They are simply checking your IP address and not looking at your header
at all.
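
To show what guess 1 would look like in practice, here is a small Python sketch. The header value, the `available` set and the `default_q` parameter are assumptions chosen for the demonstration; the only thing that changes between the two calls is the q-value assumed for entries that carry no explicit q.

```python
def best_language(header, available, default_q=1.0):
    """Return the available tag with the highest q-value (exact matches only).
    default_q is what an entry without an explicit q parameter receives."""
    best_tag, best_q = None, 0.0
    for item in header.split(","):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        q = default_q
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        # q=0 means "not acceptable", so it can never beat the initial 0.0.
        if tag in available and q > best_q:
            best_tag, best_q = tag, q
    return best_tag


header = "en-US,ko;q=0.8"     # assumed example header
available = {"en-us", "ko"}   # assumed: both versions exist on the server

print(best_language(header, available))                 # correct default of q=1
print(best_language(header, available, default_q=0.0))  # buggy default of q=0
```

With the correct default the first entry wins; with the buggy default it is treated as unacceptable and the lower-ranked language is served instead.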

~~~
bhrgunatha
This is what so many websites do now. It causes me constant aggravation. It's
nothing to do with your browser settings. They infer your language from your
IP address, and for most cases that IS the right thing to do. However for me
it isn't. I really really wish there was a way to configure your browser to
force websites to accept your language settings.

The only other option is to enable cookies so that the website language choice
is saved - which also invites countless tracking cookies which I do NOT want.

Your web site does NOT know better than me which language I want to read.

~~~
epsylon
> and for most cases that IS the right thing to do.

The problem is that for a large minority of people this is absolutely
catastrophic. Think of the Western business traveller going to Japan or
China...

~~~
bhrgunatha
You're preaching to the choir, I'm in the suffering minority; that is my
problem exactly. I was just pre-empting the usual replies. Every time the
subject comes up, people always respond with "Most people can't configure
their browsers correctly" and "these websites do extensive testing and for
most visitors they are right".

------
junto
I have blogged about this several times. Google are one of the worst
offenders. I'm not sure if it is insular non-travelled US developers with a
deep love of IP-to-geolocation databases, or an anally retentive legal
department, but it really sucks as a user experience.

From an advertising perspective this is a major market that is being
overlooked, because guess what, I don't look at ads that much, but you can bet
your bottom advertising dollar that I'm definitely not going to read it in a
language that isn't my mother tongue.

IP address != language preference

It is about time that developers got that through their thick skulls.

Finally, over here in Europe we can live in whichever EU country we want to.
This means that we can move countries easily. I've already been in four of
them. I don't think I'm an edge case by any means. People migrate.

------
midas007
I'm working on locale stuff for a Rails app right now (just updated the
i18n_data in fact).

The assumption will be that country is mostly orthogonal to language b/c people
are übermobile. Further, that the dialect of the language should not force
assumptions of other preferences... only autodiscover initial settings as
close to desired as possible. (Fuck, why isn't there a standard for this
common, hard-to-manage shit the OS already knows.)

i18n is taking up tons of time to get (mostly) right, but I believe it's one
of those things not to botch because it's such a huge signal to everything
else about your app.

If I want to be the most obscure hipster paying in Lesotho Loti, read
Catalonian, have a "," for thousands separator and use UTC tz, by Flying
Spaghetti Monster that's what it's gonna allow.

~~~
steveklabnik
You might enjoy
[https://github.com/jcasimir/locale_setter](https://github.com/jcasimir/locale_setter)

~~~
midas007
Interesting, thanks.

Current Gemfile:

    
    
      # ...
    
      # i18n
      gem 'rails_locale_detection' # consider locale_setter
      gem 'rails-i18n', github: 'steakknife/rails-i18n'
      gem 'i18n_data', github: 'steakknife/i18n_data'
      gem 'countries_and_languages', require: 'countries_and_languages/rails'
      gem 'country_select' # for simple_form
    
      # tz
      gem 'tzinfo-data', '>= 1.2014.1'
      gem 'tzinfo'
    
      # symbols and images
      gem 'svg-flags-rails'
    
      # idn
      gem 'resolv-idn'   # resolv unicode patch
      gem 'idn-ruby'     # unicode IDNA domain resolution
      # ...
    
      # ...

------
pytrin
Those sites are not relying on accept-language, but rather on IP geolocation
to select the default language. I sometimes use a non-US proxy when I'm
feeling vigilant, and Google always uses the IP of the proxy to determine what
language to serve me (even though my browser accept-language hasn't changed).

------
nemetroid
I'm running an English version of Windows 7 in Sweden. The accept-language
headers in each of my installed browsers are:

* IE9: sv-SE

* Firefox: en,sv;q=0.5

* Chrome: en-US,en;q=0.8,sv;q=0.6

I'm going to go ahead and suggest that the reason English comes before Swedish
is due to my system language, and that Swedish otherwise would come first. The
"users will have the wrong settings" argument seems moot to me.

~~~
pornel
I know people who have their OS in English even though they're not fluent in
English - they can't be bothered to reinstall OS that came with the laptop (or
cracked torrent) and understanding a few words like "ok/cancel" is enough.

------
gioele
To avoid such problems in Rack/Rails Ruby projects I suggest the
rack-i18n_best_langs gem (regardless of the name, it does not depend on the
i18n gem) I wrote:

    
    
        https://github.com/gioele/rack-i18n_best_langs
    

> Differently from other similar Rack middleware components,
> rack-i18n_best_langs returns a list of languages in order of guessed
> importance, not a single language.

> Language discovery is done using three clues:

> * the presences of language tags in paths (e.g. /service/warranty/ita),

> * the content of the HTTP Accept-Language header,

> * the content of the rack.i18n_best_langs cookie when set.
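
The idea of returning an ordered list instead of a single language can be sketched in a few lines of Python. This is not the gem's API or its actual weighting: `best_langs` is a made-up name, and the priority order used here (path tag, then cookie, then header) is an assumption.

```python
def best_langs(path_tag, cookie_langs, header_langs):
    """Merge the three clues into one de-duplicated list, most important first.
    Assumed priority: path tag, then cookie, then Accept-Language header."""
    candidates = ([path_tag] if path_tag else []) + list(cookie_langs) + list(header_langs)
    ordered, seen = [], set()
    for lang in candidates:
        key = lang.lower()
        if key not in seen:
            seen.add(key)
            ordered.append(key)
    return ordered


# A request for /service/warranty/ita with a cookie and a header present.
print(best_langs("ita", ["en"], ["en-US", "it", "en"]))
```

Downstream code can then walk the list and pick the first language it actually has content for.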

------
Aldo_MX
Unrelated with the accept-language issue, but somehow related:

I needed to create a yahoo account, and I registered it selecting the kimo.com
domain (kimo.com is a Chinese domain owned by yahoo). Since the first moment I
set my language preferences to English.

No matter which yahoo service I'm visiting, I always get welcomed by _at least
the login prompt_ in Chinese, I can't really complain, because I was the one
who looked for a rare domain, but it's an annoyance for me, because yahoo
assumes that I understand Chinese because of the domain.

------
nraynaud
There is also the problem of getting the original content, I speak 3 languages,
I intend to read the original source if it's one of those languages. I don't
want unpaid-intern translation.

MS C# documentation deserves a special kind of hell, because they detected I
reverted the language to english; and now they present me a special
translation mode of their freaking doc where there are huge tooltip texts
everywhere.

~~~
Too
I don't know when, might be due to the language of your Windows installation,
but sometimes in .NET they even translate exceptions and other error codes to
your local language making it impossible to use google for troubleshooting.

~~~
nraynaud
yeah, translated error messages are a pain to google. sometimes we can get
away with error codes.

(I'm using mono on a mac, and I'm not really doing important stuff).

------
pornel
Sorry, when I implement language negotiation I interpret "en-US" as lack of
preference.

The problem is that en-US is the default and I can't tell difference between
user not setting language and user choosing en-US.

Add "en" or even "en-GB" to your Accept-Language header.

~~~
Eiwatah4
It isn't the default for most people. Download a browser and OS localized to
German, French, or British English and Accept-Language defaults to that
instead of "en-US".

------
petercooper
accept-language is like any of 1000 other idealistic parts of Internet specs
that has good intentions but is so poorly used (or misused) that almost no-one
implements it correctly, instead simply doing the simplest thing that works
best for 99% of the audience.

------
raverbashing
If I access Google using a European IP it will show the Google page for the
Country I am in, regardless of my accept-language

(I don't have an accept-language for Dutch, Italian, German, French but in all
these cases I was shown the local page)

