What they don’t tell you when you translate your app (ericwbailey.design)
173 points by flowerbeater on Sept 9, 2021 | 230 comments

> People may prefer the English experience because they expect the translated version to be inferior

As an Italian, I can relate so much to this, because translated apps will happily treat verbs as adjectives and vice versa. A couple of examples:

* Flixbus' app translates "Open ticket" to "Biglietto aperto" (treating "open" as an adjective, not as an imperative verb). Correct translation should be "Apri biglietto". Nothing bad, just unnecessarily confusing (what is an "open ticket"? As opposed to a "closed" one?)

* EasyJet's app does the reverse and makes it much worse. The English version likely says something like "Gate close: xx:xx (am|pm)". They mis-translated this as "Gate chiuso: xx:xx", which actually means "Gate has closed", even though the gate is still open. So you get a small heart attack, notice the actual closing time, curse the translators, and go on with your life.

I think the issue is that many translation databases just hold the English text and then all the translations. So the entry is “Open ticket” and then you just drop in the translation anywhere that phrase shows up. But sometimes “open” is a verb, sometimes an adjective.

The actual identifier should be something like “Open a ticket (imperative, button)” and then that phrase has translations, including the English “Open ticket”.

That's actually a nice idea in forcing even the default language to be handled by the same workflows, processes, and tools as the other languages. I've found that in a lot of those cases there simply is lots of context missing from the strings that should be translated. If all you get is the English text without any indication of how and where it's used in the UI, you're bound to make such mistakes in the translation.

For example, it took me a while to figure out why Word 2007 in its German version used the word »Gliederung« for the stroke of a shape. But translating »outline« in a word processor to mean »document outline« instead of »shape outline« is actually quite understandable.

Back then I tried thinking about automatic or semi-automatic solutions to get a bit more context for the translator. The trouble is that most UI toolkits make it very hard to impossible to solve this, unless the developer actually knows enough about the problem to always include context and a description. Qt has (had? That was pre-QML, I think) a nice mode in its translator UI where the XML UI description could be used to show the string in its UI context. Windows Forms had a way of changing the form's language and simply replacing all strings directly in the designer (which has the problem that the translator might accidentally destroy all layout). Most things that are used just from source code have no visual way of relating strings to UI at all.

In most places I worked that used translation systems, all languages were translations, including the default one. The code used message keys like "thing.title", "thing.add_action", "thing.on_save_error", or something like that.

I really like this approach because it makes the code and especially templates much more readable. You usually don't care about the verbose form of the text that should be displayed, and those types of keys give you just enough information to understand what it is.

Problem is, it makes it harder to outsource the translations, and well, as it is known, naming things is hard.

Oh, cool. You just reminded me of a feature I had built into my web app many years ago when we implemented translations. We accepted internal commands in our search box, and one of the commands told the app to display the language text identifiers alongside the language text. It was a great mode for developers, QA, and translators. Developers and QA could easily locate text that needed to be put into the language system, and translators could work page by page to find the identifiers they needed to translate.
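
A minimal sketch of that kind of debug mode, assuming a simple key-based catalog. The `CATALOG` contents, the key names, and the `translate` helper are all hypothetical, and the gate string is only an illustrative Italian rendering:

```python
# Hypothetical Italian catalog keyed by synthetic identifiers.
CATALOG = {
    "ticket.open_action": "Apri biglietto",
    "gate.closes_at": "Chiusura gate: {time}",
}


def translate(key, show_keys=False, **params):
    """Look up a message; optionally append its identifier for QA and translators."""
    # Fall back to the raw key so missing entries stay visible in the UI.
    text = CATALOG.get(key, key).format(**params)
    return f"{text} [{key}]" if show_keys else text


print(translate("ticket.open_action"))
print(translate("gate.closes_at", show_keys=True, time="10:40"))
```

With `show_keys=True`, every label carries the identifier that produced it, so a translator can go page by page and find the entry to fix.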

Yep, there's a ton of software that wants to use the English text as the translation key, which leads to all sorts of bad results.

Take Open Ticket: it can be a verb phrase for creating a bug report, a verb phrase for opening an existing one, or an adjective phrase indicating a bug report is still active. The same goes when ticket means a ticket for transportation or an entertainment event. If your key is the English text, you can't translate those three usages differently, which is not good.

But also, minor edits to the English text are hard to manage for the translations. Some systems have a way to suggest an existing translation, but it requires a translator to affirmatively select it. If the key doesn't change, you can still use the existing translations until the translators review the English change and decide whether they want to make a similar change.

Of course, the worst thing that people try to do is numbers; there are tools for that, but trying to do Open Ticket vs Open Tickets as singular vs plural falls over with languages that have a form for one, two, three, or more, or even more forms.
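
To illustrate why a singular/plural pair is not enough, here is a rough sketch of plural selection with an English and a Polish rule. The Polish rule follows the commonly used gettext plural formula for that language; the `PLURALS` table and the `ticket_count` helper are made up for the example, and a real project would normally rely on gettext's ngettext or a similar library instead:

```python
def english_rule(n):
    # Two forms: singular for exactly 1, plural otherwise.
    return 0 if n == 1 else 1


def polish_rule(n):
    # Three forms: 1; numbers ending in 2-4 (except 12-14); everything else.
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1
    return 2


# Each locale pairs a rule with one message per plural form.
PLURALS = {
    "en": (english_rule, ["{n} ticket", "{n} tickets"]),
    "pl": (polish_rule, ["{n} bilet", "{n} bilety", "{n} biletów"]),
}


def ticket_count(locale, n):
    rule, forms = PLURALS[locale]
    return forms[rule(n)].format(n=n)


for n in (1, 2, 5, 12, 22):
    print(ticket_count("en", n), "|", ticket_count("pl", n))
```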

And then you get people trying to do string math. Delete this ticket vs Delete this image need to be translated as whole units, you can't add 'delete this' to the type name, gendered verbs and objects and sometimes even more complex stuff makes it not work.
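
A small sketch of the string-math problem, using Italian as an illustration. The message keys and both helper functions are hypothetical; the point is only that the demonstrative has to agree with the noun's gender, so a shared prefix cannot be reused:

```python
# Whole-sentence messages, one per object type (illustrative Italian).
MESSAGES_IT = {
    "ticket.delete": "Elimina questo biglietto",  # masculine noun
    "image.delete": "Elimina questa immagine",    # feminine noun
}


def delete_label_concatenated(noun):
    # String math: one translated prefix glued onto any noun.
    return "Elimina questo " + noun


def delete_label(kind):
    # Whole unit: each object type gets its own complete message.
    return MESSAGES_IT[f"{kind}.delete"]


print(delete_label_concatenated("immagine"))  # wrong gender agreement
print(delete_label("image"))
```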

I was using Qt recently and saw that example code did that English keying. It's good that they're promoting creating translatable UIs, but I don't know if it's the right thing to do if they're encouraging people to do it by using English text as the key.

I think that approach is quick and easy for an English only developer to understand and do, but it's hard to get quality results. A synthetic key tied to the context lets the same English text be translated differently as appropriate.

Tools that show translators the application context are really helpful, too. Bulk translation in a spreadsheet is an OK place to start when there are a lot of new translations to do, but everything needs to be checked where it's used as well. Especially for languages that tend to result in layout issues when added to formerly English only apps, like German (lots of very long compound words) and RTL languages like Arabic.

>The actual identifier should be something like “Open a ticket (imperative, button)”

Or even better: "ui.ticket.actions.open" — trying to shoehorn linguistic categories into translation files is a painful experience, but dumb specific IDs work great and make untranslated captions apparent.
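
As a rough sketch of how such IDs behave, assuming English is stored as just another catalog. The `ui.ticket.status.open` key, the `CATALOGS` layout, and the `caption` helper are invented for illustration:

```python
CATALOGS = {
    "en": {
        # The same English text is allowed to appear under two different IDs.
        "ui.ticket.actions.open": "Open ticket",
        "ui.ticket.status.open": "Open ticket",
    },
    "it": {
        "ui.ticket.actions.open": "Apri biglietto",
        # "ui.ticket.status.open" has not been translated yet.
    },
}


def caption(locale, key):
    # Fall back to the raw ID so an untranslated caption is obvious on screen.
    return CATALOGS.get(locale, {}).get(key, key)


print(caption("it", "ui.ticket.actions.open"))
print(caption("it", "ui.ticket.status.open"))
```

The missing Italian entry shows up as the bare ID rather than silently reusing a translation written for a different context.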

Or even Create Ticket. From the description, it seems a ticket is being created not opened, despite the slang that people incorrectly use for buttons.

If a ticket is closed upon completion, it stands to reason that creating a ticket is "opening" it.

> If a ticket is closed upon completion, it stands to reason that creating a ticket is "opening" it.

That's faulty reasoning.

At the inception of a ticket, it is first created and then opened. It is common to have these programmed to work as a single button push, but they are two actions, and creation always happens first, even when the ticket is not opened. Later, when the ticket is completed, it gets closed.

When you go into your house, an opening must first be created, either a doorway or some other hole in the wall like a window. Then, after the hole is created, you can enter the house.

Or you have a list of tickets that you have purchased in the app, and may need to select one from the list to show to the controller.

Or maybe the translator should get annotated screenshots of the app. Laid out in a story board.

If you just hand someone a list of strings to translate, there's no way you'll get sensible results.

Also, you should test your translations. Some of the translations that I've seen (even from reputable companies) are so bad that it's pretty obvious no native speaker has ever looked them over.

The tension is that people want to reuse translations. So maybe you have the story board for the first version. Then, someone makes a new button. In Django, they'd put _("Open ticket") as the text, see there's already a translation, and think they're good to go. Sure, having every page looked at by a translator for every language every time you make a change would be ideal, but also a bit costly and slow. I think there are better options in the middle.

I don't think there are any "middle options" that result in a good product. You want a localized app, but you don't want to put the work in.

If you add some new text somewhere in the UI, you need to start the app and make sure it looks right. If you only do that for one language, and don't check other languages, then there's going to be one language that's broken.

So your app is going to look broken in one language. And you probably will never find out, because the people who run into the bug don't speak your language.

> You want X, but you don't want to put the work in.

Yes! That is exactly what people want.

> Also, you should test your translations. Some of the translations that I've seen (even from reputable companies) are so bad that it's pretty obvious no native speaker has ever looked them over.

Testing is a must. Before I fixed + linted it, we often had community-provided translations which would cause the app to crash due to missing/additional format strings.
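
A minimal sketch of what such a lint could look like, assuming the messages use Python's `str.format` style placeholders. The `placeholders` and `check_translation` helpers and the sample strings are hypothetical, not the commenter's actual tooling:

```python
from string import Formatter


def placeholders(template):
    """Return the set of field names used in a str.format-style template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name is not None}


def check_translation(source, translated):
    """Return a list of problems; an empty list means the placeholders match."""
    try:
        found = placeholders(translated)
    except ValueError as exc:
        # Unbalanced braces and similar errors would crash at format time.
        return [f"malformed format string: {exc}"]
    expected = placeholders(source)
    problems = []
    if expected - found:
        problems.append(f"missing: {sorted(expected - found)}")
    if found - expected:
        problems.append(f"unexpected: {sorted(found - expected)}")
    return problems


print(check_translation("Gate closes at {time}", "Chiusura gate: {time}"))
print(check_translation("Gate closes at {time}", "Chiusura gate"))
```

Running a check like this over every catalog before a release catches the crash-causing strings without needing a native speaker.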

Yup. The “key” for the translation is the English phrase itself. It also makes English text changes weird because you either have to change the key across all languages to match the new English or you leave the key alone and change/add an English “translation” that is the new text.

Personally I think it is better to use a “surrogate key” that isn’t the English text itself.

I think a bit more context would be even better.

The usage of "discover" and "find out" in English and Portuguese comes to mind.

Words from the "discover" family in English are generally used when talking about discovering something that nobody or few people knew (somebody discovers a cure for some disease), while "find out" is generally used at a more personal level (somebody finds out that someone else bumped his car).

In Portuguese you can only "find" (encontrar) physical things. You can't "find out" information.

This makes me think that in some instances some sort of descriptive context on the meaning might be necessary.

Obj-C and Swift allow comments to go along with translation strings, so you would add

    let buttonStr = NSLocalizedString("Open ticket", comment: "For user to open a ticket")
And the comment would make its way into the eventual xliff file sent to translators

This looks like the right approach: giving context to the translation strings.

Many translation tools give the locations in the code where a string is used. It's a first step, though translators are not always able to read code.

In my job I sometimes deal with rollouts of centrally maintained software to subsidiaries in other countries. Translation is often done by simply sending an Excel file with all string identifiers and their English values to a key user in that country, and maybe they translate it themselves, maybe they give it to some agency. So we can be 100% sure that there will be several additional rounds for them to figure out what the string is really supposed to mean versus what they initially thought it would mean.

Yes, we improved on this somewhat. In the last rounds they got access to the software in English beforehand, and there are now also access keys to press to see the string id for any label in the app. It is still a very time consuming process, and I love it when a rollout is done to a country where we can just say, consumer facing texts get translated, our employees all speak English well enough to use this as is.

> The actual identifier should be something like “Open a ticket (imperative, button)”

The identifier should be a GUID with an option for tagging and comments for developers and translators. Other languages depend on other contexts which are not represented here, like multiple forms of "present" tense, or the current time of day. Keying translations on English-language concepts is a bad idea, as a lot of languages are unlike English. Treating English as one of the translations (and not the reference) is a good idea that will prevent problems with future translations and avoid re-architecting your whole localization pipeline (and code base).

That sounds like a nice setup.

This is what the gettext contexts and the pgettext macros are for: https://www.gnu.org/software/gettext/manual/html_node/Contex...
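
Python's gettext module exposes the same idea as `pgettext` (Python 3.8+). A rough sketch follows; `DictTranslations` is a hypothetical in-memory stand-in for a compiled .mo catalog, the "button" and "status" context names are invented, and the `context + "\x04" + msgid` key mirrors how GNU gettext stores contexts:

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical in-memory catalog; real apps load compiled .mo files."""

    def __init__(self, entries):
        super().__init__()
        self._entries = entries

    def gettext(self, message):
        return self._entries.get(message, message)

    def pgettext(self, context, message):
        # Context and msgid are joined with the EOT character as the lookup key.
        return self._entries.get(f"{context}\x04{message}", message)


italian = DictTranslations({
    "button\x04Open ticket": "Apri biglietto",
    "status\x04Open ticket": "Biglietto aperto",
})

print(italian.pgettext("button", "Open ticket"))
print(italian.pgettext("status", "Open ticket"))
```

The same English msgid resolves to two different Italian strings, and an unknown context falls back to the English text.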

Another example I saw was "(N) seconds ago" (e.g., "30 seconds ago") vs. just "seconds ago" (i.e., someone just posted this). They were translated into a single phrase. Hilarity ensued.

I think Mozilla's translation system called Fluent can handle that.


> So you get a small heart attack

This is something overlooked by devs and PMs sitting comfortably in their chairs.

People using apps in the modern world, especially mobile apps, are tired, stressed, busy, unfocused, and on the move. Small things like that added to the mix can induce a lot of stress.

For many devs/PMs it's just a piece of text. For the user it's much more.

Translations are a somewhat underappreciated part of app development for many people. I know several languages and I checked all new translations in our app each time, but few people cared as much.

Tom Scott used an example of a phone system for results of an STI test to demonstrate this: https://www.youtube.com/watch?v=LZM9YdO_QKk

I have a similar experience as a German on the internet.

Aliexpress (straight up wrong translations), Discord (anglicism, adjective ordering and weird sentence/tone structures) and plenty of others I don't remember, the list is pretty long. Size doesn't really seem to have an effect either.

Another big issue is the potential bugs you encounter. If you just get a translated error message without any error number or something similar, it's a very frustrating experience to troubleshoot it. I've spent quite some time retranslating error messages to solve issues. Add to that the fact that knowledge bases are often outdated in the translated languages.

Aliexpress in Dutch is lovely!

Plenty of items on Aliexpress can be shipped from multiple locations. "China" is almost always one of the options. Well, in Dutch they've translated that to "Porselein". That is a valid translation, if you are talking about plates and dishes made from porcelain.

I wonder how actively harmful this bad translation is to their business.


screenshot: https://fransdejonge.com/wp-content/uploads/2019/11/Screensh...

Weird, I sometimes get the French translation of AliExpress (when my cookie expires I suppose) and I haven't noticed "China" being translated as Porcelaine in French. I wonder why it's different. Also now I wonder why I get the French version, since I live in Flanders (although I'm a French speaker).

AliExpress translations are totally incomprehensible though anyway. I really don't think they should show the translated versions by default.

(By the way, your second link doesn't respond for me)

"Ships from" though is perfect Dutch.

> Size doesn't really seem to have an effect either.

This holds true in all areas of software development—nay, in business in general. To the point where I’m not really sure why people do expect large players to do a good job, because they just about never do.

Large organisations are very close to incapable of producing good results—their software will be clunky and slow, their translations present but bad, their customer support painful. Small organisations are more likely to be able to produce good results. Notwithstanding this, small enterprises are often unable to match large for certain resource availability (including time!), which acts as a balancing factor so that small is not often uniformly superior to large, though it’s much more likely to be superior in a certain subset of fields; and this is the case with i18n/l10n.

I think this actually stands to very simple reason when examined numerically: have enough mass and you’ll produce average results (regression to the mean); be small and you’re more likely to deviate from the mean, whether for good or for bad, and if for bad you’re more likely to fail, so you’ll tend to end up with more above-average small players.

I set my Android phone to English mostly to avoid badly translated apps. Not so much for the major ones, but you never know if you might get some really bad machine-translated stuff in some apps.

As there isn't an easy way to set this per app, it makes more sense to me to just switch the phone entirely to English.

That's also one of the reasons why I don't use a localised version of a desktop OS. Especially on Linux, where everything follows the OS locale and the translations of many open source applications are either only half-done or leave a bit to be desired in terms of quality, the original English strings are just a better experience if you're proficient enough.

I'm not saying this to bash open source translations (let alone translators) or anything. A good translation takes a lot of work. That's just how things seemed to be last time I tried, and I don't really have the energy to contribute myself nowadays.

Of course there are other reasons for not using a localised desktop especially if you're a technical person, such as better web searchability in case of problems. But the inconsistent quality of translations is probably one of the top reasons for me.

I wonder if there is just any point in doing translations for software unless you are fully committed to having full time I18n people testing the actual app in the language. At my past job it was basically a throw over the wall thing where we send them yml files and they send us back yml files without the translators ever touching the app.

The translators also likely did not understand which words were key app terms which should stay constant. For example "benefit" is a key term which can not ever be substituted out with something like "improvement" even if it seems like it makes more sense in the particular string. But without knowing the app well, you wouldn't know this.

Seems like it would be better just convincing everyone to use the English versions of everything since those are perfect and the majority of the world knows it now.

> Seems like it would be better just convincing everyone to use the English versions of everything since those are perfect and the majority of the world knows it now.

There are lots of e.g. elderly and less educated people in non-English speaking countries who don't speak or understand much English, at least not enough to be comfortable. You don't even have to go further than Europe for that. The aging might not be a key target audience for many apps and it might make business sense to not bother with translation, but there's a significant number of people who might be excluded by that, and it would be difficult to accept that as a solution at least for the most commonly needed apps.

It also seems a little presumptuous that English should be pushed as a general solution to everyone in the first place, but you don't even have to go there before you get into practical problems.

Sometimes this doesn't work, as happens on the desktop with Google, for example. English is my default language, and I'm in a country where English is also the official language, but Google, in its wisdom, decided that the default for google.com is Swahili. Yet I don't know of anyone who uses Swahili as their default language on digital devices. Microsoft too: every once in a while they would send me text messages in cryptic Swahili whose meaning I have never bothered to find out.

I always use the English version of everything simply because error messages in Dutch aren't going to be helpful at all for any amount of troubleshooting.

In the Netherlands, almost anyone who has about any device he owns configured in Dutch is almost certainly technically illiterate. People keep everything English not even because the Dutch translation is inferior but because any online documentation one will find is based on the English version. This is so entrenched that any notes I even keep to myself on my computer are in English rather than my native language without even giving it any second thought.

It is honestly somewhat strange and awkward to read technical documentation in Dutch. Many of the translated words they use take a moment for me to figure out since I never heard them in Dutch.

Same here. Some IT departments even have a policy of setting up every server in English to make troubleshooting easier.

> translates "Open ticket" to "Biglietto aperto"

I have a good example. I won't name names. I saw an Italian localization on a "like" count in social posts that localized "N people like this" as "N persone come questa" [N people like this one]

This is surprising. You'd really think someone as big as EasyJet would get that right.

We're doing a translation now into Japanese and the translator is actually taking the time to look at screenshots and use the app to see the text in context. It makes a huge difference.

As you've pointed out, it's one thing to see the string "open" in an XLF file, it's quite another to realize its intent. This requires setting up demo environments for each translator though.

Heck, Microsoft got it wrong — Swedish versions of Windows used to have a folder called "Vanliga filer" (Ordinary files). This was, of course, the "Common files" folder, but the translator, probably having no context, picked the wrong meaning of "common", and chose a Swedish word which does not have the "shared" meaning.

In El Salvador (our TLD is "sv") we sometimes get apps or websites in Swedish because "sv" is used as a language abbreviation for Svenska/Swedish. Spotify was like that for months when it launched here.
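
A tiny sketch of how that mix-up can happen: "SV" is the ISO 3166 country code for El Salvador, while "sv" is the ISO 639 language code for Swedish. The lookup table, both helper names, and the "en" fallback are assumptions made for the example:

```python
# Country code -> default language code (illustrative, deliberately tiny).
DEFAULT_LANGUAGE_BY_COUNTRY = {
    "SV": "es",  # El Salvador -> Spanish
    "SE": "sv",  # Sweden -> Swedish
}


def naive_language(country_code):
    # The bug: reuse the country code as if it were a language code.
    return country_code.lower()


def default_language(country_code, fallback="en"):
    # Look the country up explicitly instead of reinterpreting its code.
    return DEFAULT_LANGUAGE_BY_COUNTRY.get(country_code.upper(), fallback)


print(naive_language("SV"))    # treated as Swedish
print(default_language("SV"))  # Spanish
```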

Another data point:

When Windows 10 just came out, many of the apps in the Start menu had a label next to them, which in the Russian version said "Создать" (~Create). It took me some time to figure out that it's a poor translation of "New" (~Recently added). A proper Russian translation would be "Новое".

The problem isn't the translation, per se, it's that translation isn't tested (or tested thoroughly enough).

Amazing Microsoft can sell a whole other version of Windows and mess something like that up.

Everything that impacts the user experience needs to be tested.

> You'd really think someone as big as EasyJet would get that right.

Well even Apple has a wrong translation on the iOS keyboard in French. The "Return" key is translated as "Retour" (generally meaning "Back" rather than "Carriage return") instead of "Entrée".

It might have changed in the last years though, I don't know.

That’s not a mistake.

The English term ‘Return’ refers to a carriage return (on a typewriter, it means to move the carriage back when you’ve reached the edge of the paper).

The French translation of “carriage return” is “retour chariot”, hence the ‘Retour’ key on French keyboards.

In Apple terminology, the Enter key is a completely separate key on the number pad with a different function and has no equivalent on iOS’ on-screen keyboard.

Therefore, the ‘Retour’ translation is appropriate for Apple’s software ecosystem.

> hence the ‘Retour’ key on French keyboards.

But as far as I know there has never been a key called "Retour" on French keyboards. Both Return and Enter are called "Entrée" in French by everyone I know, maybe because Apple's keyboards don't have any text on them, while PC keyboards do have "Entrée" written at this location[0].

Retour is only ever used to mean "Back", and Back is also translated as "Retour" on iOS. Which means in some cases (like in the Mail app) you have two different "Retour" buttons, one that inserts a newline while the other one cancels what you're doing.

I'm certainly not the only person to have a problem with it because I'm the one having to explain why the keyboard says "retour" to iPhone users, even though I mostly use Android phones.


> Both Return and Enter are called "Entrée" in French by everyone I know

Most likely because everyone you know has learnt that on PCs. On PCs, they're the same button. In Apple's ecosystem, they're completely different buttons that do entirely different things.

> But as far as I know there has never been a key called "Retour" on French keyboards

On Apple keyboards, it has always been referred to in the operating system and documentation as the "touche Retour". Though you're right, there isn't a symbol on the keyboard in Europe in order to sell the same model for the whole region.

> Back is also translated as "Retour" on iOS. Which means in some cases (like in the Mail app) you have two different "Retour" buttons, one that inserts a newline while the other one cancels what you're doing.

The latter is actually a UI error in most cases. Apple's HIG demands that "< Back" buttons actually have a label to say what you're going back to.

But yeah, they have a hard time following their own advice.

> I'm the one having to explain why the keyboard says "retour" to iPhone users

Now you can explain: it's just Apple terminology dating back to the 1980s. :-)

The secret is that you can't verbatim translate into Japanese as well, which forces more attention (also it's a bigger economy)

My Italian teacher in college had the following saying: "Traduttore, traditore": the translator is a traitor.

Thank you for providing a translation of that. Traitor.

French here, I do the same, French translations are unbearable.

Not only will there be tons of small mistakes with nouns/verbs as you mentioned, but maybe worse, there is often "overtranslation".

Some words are well established as technical jargon in English, and should _not_ be translated.

- "cloud" -> "infonuagique"

- "email" -> "courriel"

- "freeware" -> "gratuiciel"


While I agree that we (fellow Frenchman here) tend to overtranslate, some of our portmanteaus are kind of clever.

My personal favourite is illectronisme, a combination of illettrisme (illiteracy) and électronique (electronic), to refer to people who are not good with computers.

My favourite is autof = selfie. It’s auto-photo, but if you pronounce it, it’s photo said backwards, because you hold the camera backwards or use the front-facing camera.

“Open ticket” doesn’t even really make sense in English. Maybe “Buy ticket” or “View ticket” or similar would make sense there.

Flixbus is German, I suspect the translation was from German and not English in this case?

In German adjectives and verbs are easily distinguishable without context. The verb "open" would be "öffne" and the adjective "open" would be "geöffnet". So it's unlikely this mistake would happen if you translate from German to Italian.

It's common to talk about "opening" files to view them, so I assume that's why the developer chose that term, even though "view" would have been better.

Don't German UI conventions follow the convention of using infinitive verbs for commands normally? ('öffnen' instead of 'öffne'?) [the only exception I know is the adjective "rückgängig" for "undo" ]

Though even there your point still stands - they can be easily distinguished from the perfect form ('opened' (geöffnet) vs 'to open' (öffnen) ). So I guess I'm nit-picking a bit.

Native German speaker here.

Have no clue about UI conventions nor grammar, but I'd expect either "Öffne Ticket" or "Ticket öffnen". "Öffnen Ticket" is wrong, just as "Ticket öffne".

I suppose the Flixbus app was made by native German speakers coding in English. That would explain why they chose "open ticket" rather than "view ticket" or similar. As some of the parent posts said, you also "open files" here, so they probably just did what they assumed is right.

Denglish (Deutsch-Englisch) is full of this.

Happened to me as well; for most of my life I used "eventually" incorrectly, thinking it means "maybe".

  * German "eventuell" = English "maybe"
  * German "schlussendlich" = English "eventually", "at last"

Oh yes, word order is also a major problem in software translated to German. You can tell when an English speaker programmed something...

It annoyed me more than it should that Word for Mac for years had a menu command "Beenden Word" (Quit Word) where the order of the words was obviously hard coded...

Or how Siri says "In 50 Meter Sie haben Ihr Ziel erreicht." (it should be "In 50 Meter haben Sie Ihr Ziel erreicht")

In English you can just take the sentence "You have arrived at your destination" and prefix it with something like "In 50 yards", and it's a perfectly valid sentence: "In 50 yards you have arrived at your destination". It might sound a bit mechanical, but it's not wrong.

If you do the same in German, it just sounds very confusing and wrong.
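
A short sketch of the difference, using the German sentences quoted above. The `TEMPLATES_DE` keys and both helper functions are hypothetical; the idea is that each full sentence gets its own template, so the translator controls the word order:

```python
TEMPLATES_DE = {
    "arrived": "Sie haben Ihr Ziel erreicht",
    # A separate whole-sentence template lets the verb move to second position.
    "arrive_in": "In {distance} haben Sie Ihr Ziel erreicht",
}


def announce_by_prefix(distance):
    # English-style assembly: prefix glued onto the standalone sentence.
    return f"In {distance} " + TEMPLATES_DE["arrived"]


def announce(distance):
    # One translated template per complete sentence.
    return TEMPLATES_DE["arrive_in"].format(distance=distance)


print(announce_by_prefix("50 Meter"))
print(announce("50 Meter"))
```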

I’ve noticed a few other German-English anomalies in my time around German people. First is “some-” instead of “any-”. Second is the use of “since”, e.g: “I’ve been working here since 8 years”. Present tense confusion - e.g: “I’m having” instead of “I have” - my understanding is that in German there is only one present tense, whereas English has a few subtly different ones, so it’s not surprising that some confusion ensues. “Driving a bike” is always funny. Saying “with X years” instead of “aged X years”.

In all of those cases it’s still obvious what the person means. And to be clear I don’t mean to pick on anyone here, I just find the language differences interesting. Far be it from me to judge - I can barely speak one language, let alone two.

English has borrowed so many words from so many other languages that there are false friends everywhere. As a native English speaker and German learner, it took me a minute to get over the same thing with "aktuell" - yes, of course I want the actual news!

Many many years ago, trying to learn German as an English speaker, I was taught Maybe (EN) -> Wahrscheinlich (DE).

Is that correct? If it's even remotely close to being correct, it definitely makes sense pedagogically to avoid the ambiguity.

I learned:

- maybe = vielleicht

- probably = wahrscheinlich

- definitely = bestimmt

But a big part of the gap between being able to speak in a language and being able to comprehend a language is that there are often plenty of ways to communicate/translate the same concept. It's much easier for a learner to say "vielleicht" every time they mean to indicate "maybe" than it is for a learner to learn that "vielleicht", "eventuell", "möglicherweise", etc. all basically map to the concept of "maybe" (which of course conceptually maps to its own set of English words - "maybe", but also "perhaps", "possibly", etc.).

It gets even hairier because word choice is highly culture-bound and the semantics are not guaranteed to be the same as the top dictionary definition. "Could you maybe take a look at this?" is not really asking someone to "maybe" take a look, the asker definitely wants them to take a look, it's just a construction that carries a deferential tone.

Oops, you are right, now that I think about it it would be weird to see a button labelled "öffne" instead of "öffnen".

I just thought that it was unlikely that the verb/adjective confusion comes from German->Italian translation, it's more likely from English->Italian.

It's a common mistake that I've already seen in software translated from English to German as well, it's just what happens when you translate English strings without context.

In many apps, to "open" a document means to "show" it. And a ticket is conventionally a paper document.

So it's pretty easy to get from Open ticket to Papers Please.

An open ticket, in English, means a flexible transport ticket rather than one with fixed departure times.

Confusingly, "Open Ticket" is a valid term for a ticket with no fixed date it has to be used on.

While most modern European languages are heavily analytical, the English language is pushing straight into isolating-language territory. Morphological differences between nouns, adjectives, imperatives, infinitives? Who needs those? Just line the words up in the correct order!

It is true that English is extremely analytical, but most European languages are fusional rather than analytical: https://en.wikipedia.org/wiki/Fusional_language

It's crazy to believe that EasyJet wouldn't pay an Italian guy for a few months to translate the buttons and commands and notifications in their application, considering Italy is a fairly large market for them.

Hopefully it's better now, but back when I lived in Sweden, I had to routinely translate strange Swedish wording to English to figure out what arcane computer term the translator had misunderstood.

Open ticket is ambiguous in English anyway. It could mean the ticket to which this refers is of type open or click here to view the ticket.

I have a different complaint than most others here and that is that most UIs require me to mentally translate from software developers English to real English. It's no surprise then that translating to a completely different language is difficult and error prone.

Edit: mismatched asterisks.

It’s really embarrassing that two large European companies are unable to pay 20 € per hour for a mediocre translator, whom you could find on Upwork within minutes.

I wish users could edit the text, and the app would then adopt the wording that a majority of users agreed on. It can't be that hard.

Sounds like a good way to end up with all the strings in your app translating to "penis".

Facebook did that early on, no idea if they still do. As a professional software localizer, I was not amused by the resulting translations.

I don't see why, as a local user, I can't override a crappy translation at least for my account.

> People may prefer the English experience because they expect the translated version to be inferior

There are multiple comments here about how it usually is inferior.

But even when it's not, there can still be reasons to stick to English.

I've done a lot of work with Brazilian graphic designers, and they all use Photoshop/Illustrator in English -- the Portuguese version is essentially "unusable". Not because the translations are necessarily bad, but because Photoshop has its own bespoke vocabulary.

E.g. what's the difference between "image" size and "canvas" size, between auto "tone" and auto "color", between "crop" and "trim", or between "vibrance" and "saturation"?

In layperson English these are essentially synonyms, but mean different things in Photoshop. And if you want to follow any Photoshop tutorial, or communicate with any designer, you need to know these "English" terms, just like every programmer needs to know "if" and "then".

Translating adds yet another layer of confusion that hinders more than it helps.

Of course, this is specific to professional tools that require training -- it doesn't really apply to consumer software intended for a general audience.

It does apply to troubleshooting in other languages though. Trying to fix my dad's French software issues over the phone when my own computer uses the English version of the software is its own version of hell.

English is usually a good choice because tutorials are easier to follow: there are way more tutorials in English than in other languages.

> People may prefer the English experience because they expect the translated version to be inferior

No, it is not about expectations. In +95% of cases[1] the localized[2] version is objectively worse, to the point where it often only is possible for me to understand by first translating it back to English.

If you give me a localized version first, and don’t give me an obvious way to permanently choose English, I’m likely gone.

[1] Mostly excluding the big ones (MS, Apple, etc), but quite often they fail too

[2] My first language is one of the smaller European languages (<20M speakers), perhaps bigger languages have higher quality.

I think that's exactly what the author meant.

>[1] Mostly excluding the big ones (MS, Apple, etc), but quite often they fail too

MS docs

I mean they're good when displayed in English, but defaulting to "native_language" is annoying.

One annoying issue with translated apps the author doesn't cover is "googlability", the fact that we often need to google an error message or menu label in order to solve a problem. This leads to a lot of bilingual people running their software in English just in case they need to paste an error message into a search engine.

Totally agree. The Firefox devtools UI uses your browser language, and the error messages are translated too. A really annoying thing is that there doesn't seem to be an easy way to switch only the devtools language to English.

On the other hand, Chromium devtools doesn't even have translations, so it doesn't have this problem. And the new Edge (Chromium) translates the devtools by default, but there is an option that lets you switch back to English in devtools only.

Yes. Having unique error codes should be a best practice with translated apps.

I've always felt that a popup with an error code should have a "search the web for this code" button.
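A rough sketch of that idea, assuming a hypothetical app with a made-up error code `APP-1042` and a made-up message table. The message is localized, but the code stays identical in every language, so it remains searchable. The search URL format is just an example.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

# Hypothetical localized messages keyed by a stable, untranslated error code.
MESSAGES = {
    "en": {"APP-1042": "The file could not be saved."},
    "it": {"APP-1042": "Impossibile salvare il file."},
}


@dataclass
class AppError:
    code: str

    def message(self, lang: str) -> str:
        """Localized text when available, English otherwise; the code is always appended."""
        for candidate in (lang, "en"):
            text = MESSAGES.get(candidate, {}).get(self.code)
            if text is not None:
                return f"{text} [{self.code}]"
        return f"[{self.code}]"

    def search_url(self, product: str) -> str:
        """URL for a 'search the web for this code' button."""
        return "https://duckduckgo.com/?" + urlencode({"q": f"{product} {self.code}"})


error = AppError("APP-1042")
print(error.message("it"))
print(error.search_url("ExampleApp"))
```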

That’s where “error code” essentially comes from, an actual code you could look up in a manual. Errors have evolved to be more of “error messages” which then have grammar, word choice, and a million other concerns.

I wonder why the culture of message codes has disappeared.

From IGN's article about the difficulties of game development (https://www.ign.com/articles/turns-out-hardest-part-making-g...):

“Planning ahead helps, but nothing will prepare you for German,” [Joe Mirabello] said. “German destroys your best laid plans. German will defeat you. That text field you thought would only ever need a single 10-20 character word? Nope. German has a unique word for that and it’s a hundred and twelve characters long. We even have a native German developer on our team and he refuses to translate our games into German. This is all said tongue-in-cheek, of course, just to illustrate a point, and that is; whatever scaling flexibility you think you’ve planned for in your UI to account for localization? It’s not enough. It’s never enough.”

It's not that German has a "unique word" for things, it's that they use compound words: multiple words written together without space to form a new word. This is a feature found in most Germanic languages except English, because reasons. It's just that German is the most commonly translated of them.

So, for example, in Dutch you would write sciencefictiontelevisieserie instead of "science fiction television series"; it's not a "unique word", just four words strung together. There are some examples that can be quite long; the longest in the dictionary is meervoudigepersoonlijkheidsstoornissen, or "multiple personality disorders", although you can easily make it longer by adding more words: meervoudigepersoonlijkheidsstoornisbehandeling ("multiple personality disorder treatment") or meervoudigepersoonlijkheidsstoornisbehandelaaropleiding ("multiple personality disorder treatment education"). I miss this in English by the way; you can get creative with it and form new compound words quite easily.

Sometimes the addition or lack of a space can change something quite a lot, so you can't just insert them because it's convenient.

It sure can be annoying fitting these things in boxes at times though.

[1]: https://twitter.com/spatiegebruik/status/1434538804883427330

You add a reference but never refer to it! Let me explain for the English here :)

The Twitter link shows a picture taken at a race event, where it says on the door: wedstrijd secretariaat, meaning secretariat competition in English. It's two words, so the first modifies the second (adjective) rather than forming a compound noun, thus some wedstrijd (competition) of the secretariat seems to be held there. Writing wedstrijdsecretariaat as one word makes it a compound noun and translates as competition secretariat which is (presumably? :D) what was meant. Ha-ha! Germanic humor, I guess. (I really enjoy them at least, since it really is what people wrote and they don't even realize it. Probably ties into pentesting, where I also exploit what people incorrectly wrote?)

> Sometimes the addition or lack of a space can change something quite a lot, so you can't just insert them because it's convenient.

Correct, but note that hyphens between the parts are always legal if you think it's more readable.

For example meervoudigepersoonlijkheidsstoornisbehandeling was not hard to get for me but then the ...behandelaarsopleiding variant is really stretching the possibilities and I'd definitely start to hyphenate there, also because it's a bit of a false start (it's an education, but you're starting off with a disorder and then segueing into treatment and then again veering off into it being an education that you're describing -- it's a bit like "The old man the boat." in English: a garden-path sentence or an intuinzin which starts off making you think it's one thing and then continues in a way that forces you to reevaluate it).

Also, if you have a reason why you didn't put an "s" between behandelaar and opleiding I'd be interested! It feels to me like there should be one but I don't know the rule.

Ah yeah, I started writing an example of how making a space can make a difference in meaning with that as a somewhat humorous example, but I couldn't really come up with a good way to explain it in English last night so I removed that part, but seems like I forgot to remove the reference.

meervoudigepersoonlijkheidsstoornisbehandelaaropleiding is a bit of a contrived example, it's just an example of how you can "invent" compound words on the spot and how you can have quite a lot of words in a compound word.

> Also, if you have a reason why you didn't put an "s" between behandelaar and opleiding I'd be interested! It feels to me like there should be one but I don't know the rule.

No reason; there probably should be an "s" (I think?). The previous comment went through quite a few revisions before the final version that I posted; I just missed it.

Reminds me of the pseudo-locales Windows Vista added, which "translate" English strings into things that still look like English but use unusual characters and come out longer, in an attempt to catch UI issues before fully localized versions are ready.
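For anyone who hasn't seen one, here is a toy pseudo-localizer. It is only a sketch of the general technique, not how Windows implements it: the character table, the 40% expansion factor, and the bracket markers are all assumptions picked for illustration.

```python
# Illustrative character map; real pseudo-locales use a much larger table.
ACCENTED = str.maketrans({
    "A": "Å", "E": "É", "I": "Î", "O": "Ö", "U": "Û",
    "a": "å", "e": "é", "i": "î", "o": "ö", "u": "û",
})


def pseudolocalize(text: str, expansion: float = 0.4) -> str:
    """Accent the vowels, pad the string, and wrap it in brackets.

    The accents reveal hard-coded or badly encoded strings, the padding
    simulates longer translations, and the brackets make truncation visible.
    """
    padding = "~" * max(1, round(len(text) * expansion))
    return f"[{text.translate(ACCENTED)}{padding}]"


print(pseudolocalize("Open ticket"))
```

Note that this naive version would also mangle format placeholders such as `{name}`, so a real tool has to skip those.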


> People may prefer the English experience because they expect the translated version to be inferior

Even more, bilingual people exist!

A translated version is always worse. With a good human-made translation, it may just be a matter of making things un-Google-able or misrepresenting certain concepts. With an automatic translation, it's usually completely unusable.

I'm a native speaker of Dutch. I'm a near-native speaker of English. Having a page with both languages interspersed is completely acceptable! Don't "helpfully" translate everything which isn't in the configured language - you're only making things worse.

Exactly! Ironically, the most Anglo-centric assumption of them all is that people are only fluent in exactly one language. Configuring anything to be truly multilingual as opposed to "in another language" has terrible UX.

What do you mean? Having parts of the interface in one language and other parts in another?

I'm fluent in multiple languages (as most European devs) and I've never really heard or thought about this concept so I'm intrigued. What kind of software are you referring to that could have this feature?

I'm talking more about the behavior of websites or applications.

For example, I'm on macOS. I want the OS GUI to be in English and to use the US keyboard layout, because I'm used to that and buttons/labels aren't a big deal anyway.

However, most of my communication with my coworkers (for example, MS Teams) happens in Italian, so I'd prefer those programs to display their UI in Italian (so that everyone using the program would be on the same page) and to have Italian spellcheck.

When I open Safari to look at docs, Wikipedia or search results, I want those in English. But e-commerce sites like Amazon need to be in Italian. Except if I'm shopping for technical books or manuals: I need those in English.

For some of those needs there's a workaround, some I found completely impossible to solve (I can't seem to get the spellchecker to switch reliably, my solution is simply to disable it: I make zero grammar mistakes in Italian and most people are willing to put up with my broken English).

Generally speaking, I find that most UIs are downright hostile to "mixed" needs like mine, and I end up defaulting to the US/English locale everywhere because it's the least broken (except for units of measurement. Come on guys, inches? Fahrenheit?)

The concept of locale itself is broken and wrong: I want it.IT or en.US contextually, neither is the correct one, why should I be asked to definitively pick either? In many cases, localization is downright harmful. Excel comes to mind, but even several ETL tools "helpfully" "translate" decimal points to commas by default!

Websites could greatly benefit from this, e.g. social media, where you're likely to be part of both a local community that speaks your native language, and a global community that speaks English.

Ah ok, I agree and I have some of the same issues. I'm using Firefox and the spellchecker works well. I regularly switch between 3 languages. Same for Thunderbird, plus there is an extension that remembers which language I'm using with a specific contact, which is great.

Regarding the en-US locale, I've read that some people use the en-CA locale instead, this way you get (partly) American spelling, but with metric units and international standards like A4 paper size and reasonable date format.

en-IE is a good choice in Europe: defaults to €, Anglo-Irish spelling, metric, 10 September style dates.

You can set per-application language settings in macOS. Go to System Preferences -> Language & Region -> Apps. Click the + button to add an app then choose the language you want.

You can also do this in iOS and iPadOS. Go to Settings, scroll down to the bottom where your app is listed, tap on its settings, then click on Language.

While this is true, as another near-native speaker of English, reading things in my native language always feels easier. There's a slight stress on the mind when reading and using English that I don't even realize exists, except for the 2% of the time when I get to interact with UI in my native language and realize that it feels significantly better this way.

A translation doesn't have to be worse, especially in a technical context. It's just that the translation market has been steadily going downhill since the 90s, and nobody cares about quality any more.

My parents are software translators. They've been in the business since before I was born; back when software was just starting to be translated. You have no idea how much prices and quality have fallen. It's really, really sad.

Software localization used to involve the localizers working together with the developers, making UI changes, testing the real software, and using translation memory tools as an aid to ensure consistency.

These days people just get a pile of strings to translate with no context, machine translation is used by default (and agencies pay less because they give you a garbage MT version to start off with, as if it doesn't take as much time to fix it as it would to translate from scratch), and translation memories are used with no cross checks, often translating things wrong due to entirely different context.

Further, localization is often treated as an afterthought, with developers having no idea of what the technical requirements for good localization are. Plural forms, placeholder reordering, etc.

If you want a good translation, you need to pay for it, but nobody wants to do that these days; they just want the bare minimum so they can claim to have their software available in such and such language.
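To make "placeholder reordering" concrete, a small sketch assuming made-up message templates: with named placeholders the translator can move the pieces around freely, which positional `%s`-style slots or string concatenation make hard.

```python
# Hypothetical templates; the German sentence needs the user before the file.
TEMPLATES = {
    "en": "{file} was deleted by {user}",
    "de": "{user} hat {file} gelöscht",
}


def render(lang: str, **values: str) -> str:
    """Fill a template by placeholder name, so word order is up to the translator."""
    template = TEMPLATES.get(lang, TEMPLATES["en"])
    return template.format(**values)


print(render("en", file="report.pdf", user="Anna"))
print(render("de", file="report.pdf", user="Anna"))
```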

> A device’s location/IP address isn’t indicative of the language preference of the person using it

I really don't understand why this is so popular (Google being a major offender). The browser already sends the preferred language(s) as an HTTP header.
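The header in question is `Accept-Language`. A simplified sketch of honoring it on the server side, assuming a hypothetical list of supported languages; the function names are made up, and real frameworks ship more complete negotiation than this.

```python
def parse_accept_language(header: str) -> list[str]:
    """Return the language tags from an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a tag without an explicit q-value counts as 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(weighted)]


def pick_language(header: str, supported: list[str], default: str = "en") -> str:
    """Pick the first preferred language the site supports, ignoring IP address."""
    by_lower = {code.lower(): code for code in supported}
    for tag in parse_accept_language(header):
        if tag in by_lower:
            return by_lower[tag]
        primary = tag.split("-")[0]  # e.g. "nl-nl" falls back to "nl"
        if primary in by_lower:
            return by_lower[primary]
    return default


# A Dutch speaker browsing from anywhere still gets Dutch.
print(pick_language("nl-NL,nl;q=0.9,en;q=0.8", ["en", "nl", "id"]))
```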

It's a major PITA as I live in a country where I don't speak the language well. It's even worse when there's no obvious way to just get it in English. It's triple worse if they decide to "localize" content for you based on your location even if you have everything set to a different country anyway.

Google really sucks at this; I can set it to English or Dutch all I want, but I still get suggestions and results in Indonesian. Funny enough, the date format is always in the confusingly reverse "month/day/year" in spite of their ham-fisted forcing of everything else.

Seconded. I'd really love to finally hear from the person at Google that made this decision. There has to be some reason, of all companies I'd trust Google to be both big enough to have a good overview of what it should be (based on complaints, user research, the impact they know this decision has) and engagement testing (since they do a lot of things based on data).

Not that I'd know where to complain to, but Google employees have friends so it would reach them in some modicum anyhow if others experience this problem as well. And everyone who ever went to a country whose language they're not very comfortable in will be having this problem.

I must say, I don’t encounter this issue. I live in New Zealand, so you would expect Google to default to English. Instead, because I use my computers in French, it defaults to French — even when I log in and tell it (repeatedly) to favour English results.

(Don’t ask why I use my OS in French but want English results; it’s not that interesting) :-)

All of this is so recognizable, but I'd add one more: Make sure the user has a way to switch the translation back to English.

I have some sites showing in, say, Chinese, where even the current language is a Chinese glyph. Nothing on such a page is readable for a non-Chinese speaker. So you get to click around randomly until some menu opens where you see the word 'English' which brings you to a page you can read enough to get to your own language.

Some sites, like Google, will helpfully change the language depending on your current IP address. Trying to find how to switch to English from say, Korean, when Google surely knows that I'm English and don't speak Korean (I'm reminded of this[1]) should be forced upon the chimps writing their UIs.

What's wrong with using national flags? It's so easy for the user, don't designers care about us?

[1] https://news.ycombinator.com/item?id=28336850

National flags and languages are not a 1:1 map. Some flags have multiple languages, and some languages have multiple flags.

And that can be "close enough", until you for example serve English speaking people in Ireland the Union Jack. Both languages and flags can be sensitive topics in certain parts of the world.

Also, often the country and language settings need to be independently modifiable, e.g. for pricing vs. product description.

Agreed. The symbol for pricing could surely be the actual symbol though (€$¥£ etc).

It's not just about the currency, but also the value of the price. To use an example I have on my table right now, German newspapers and magazines usually target Austria and Switzerland as secondary markets and have different prices for each of them. So an issue that's 3.95 € in Germany can be 4.30 € in Austria and 6.30 CHF in Switzerland. Even though Germans and Austrians use the same currency, they don't pay the same price.
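A sketch of keeping the two settings independent, using the prices from the magazine example above. The `Preferences` type and `price_for` helper are hypothetical names for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal

# Prices from the magazine example: same language, different markets.
PRICES = {
    "DE": (Decimal("3.95"), "EUR"),
    "AT": (Decimal("4.30"), "EUR"),
    "CH": (Decimal("6.30"), "CHF"),
}


@dataclass(frozen=True)
class Preferences:
    language: str  # drives the UI text only
    market: str    # drives price and currency only


def price_for(prefs: Preferences) -> str:
    """The price depends on the market, never on the UI language."""
    amount, currency = PRICES[prefs.market]
    return f"{amount} {currency}"


# An English-speaking reader in Switzerland and a German-speaking one in Austria.
print(price_for(Preferences(language="en", market="CH")))
print(price_for(Preferences(language="de", market="AT")))
```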

Why conflate the choice of language with the choice of currency? If there's a need to differentiate by those countries, give them that choice. If not, don't.

Or you could display everything in Korean to those with a Korean IP even though in some cases it will happen to be an Austrian who'd be thankful to see a German flag (or any flag!) on the screen to help them change the language. Then they can worry about the currency.

Same for country and date/time/number formats.

In fact, the first external link in TFA is to a whole website [http://www.flagsarenotlanguages.com/blog/why-flags-do-not-re...] apparently dedicated to this.

First example is English:

> How will users from these countries react to an English, British or American flag?

I stopped reading right there because no one cares. If you have users that care then provide them with choices, just don't let them get stuck wandering around a page in Russian or Greek or whatever because you thought someone might be offended by a flag. Perhaps they're offended by crappy websites with no obvious way to change the language? I know I am, show me any flag from any part of the former British empire and let me get on with what I was doing.

If anyone in Ireland is offended at the use of a Union Jack to signify the button for English language then they need to grow up. Fast.

I don't know much about Ireland, maybe it's not that sensitive an issue there. So how about this: what's the correct flag to show for Arabic in Israel?

Last time I fiddled with some automated kiosk at Ben Gurion airport I noticed they used a Jordanian flag for Arabic. That's an interesting choice, because if I had to guess most of the people choosing Arabic at that kiosk would not choose that as their flag.

They're not choosing their flag, they're choosing a language.

So have another example: if you were to translate your application into Tibetan (for some reason), what flag would you use for it?

(The Tibetan flag is literally illegal to display in Tibet, as the Chinese government considers it a symbol of the separatist movement. So you probably don't want to use that...)

I'd show them a picture of the Dalai Lama's face. Would that cause trouble? For whom?

I've had issues with Google's localization choices for a long time. I wonder if anyone else has had this issue with other language combinations.

I'm a bilingual Japanese/English speaker. Setting Google's search languages to English and Japanese causes the following:

- Random Japanese words show up in things such as Google Maps, even though my display language is set to English.

- Japanese results will be prioritized over English ones. For example, If I search for "the beatles", it will show the Japanese Wikipedia page as the top result before the English version. For some sites (like Discogs) only the Japanese version of the site will show up.

If I just set English as my search language, searching in Japanese can bring up results that are entirely in Chinese, even though I've set my preferred languages as English and Japanese (in that order).

Yep. I'm trilingual English/French/Bulgarian, and I have all three as languages in Google search, and they're mixed up too often. I can understand Google proposing the French spelling of an English word and results for it, but almost every time when I search something in Bulgarian I get results in Russian, even when I use words that don't exist in Russian. The languages aren't even that close, and they aren't the only Cyrillic ones...

Well, those are quite close to each other from the orthographic point of view, I guess: Ukrainian or Serbian are visually very distinct from either Russian or Bulgarian, while to tell the latter two apart you need some actual knowledge about the differences of those languages: say, that the abundance of letter "ъ", words ending in "ът"/"та"/"то" and tons of prepositions (i.e., often repeated two-three letter words) are a pretty good indication of a Bulgarian text.

Yeah, it's annoying, especially since I've told Google explicitly the languages I know. I do suppose that a lot of people haven't set their languages, and the automatic detection works well enough, most of the time.

I’m learning Japanese and I’ve had a very similar experience... although I haven’t noticed the Chinese results, which is surely a result of my slow progress! (o_0)

In my experience, the Chinese results tend to happen when the entire query is in kanji. Queries that have at least some kana generally aren't an issue. If it doesn't make sense to use kana, then I'll sometimes add "site:jp" as a workaround, since most Japanese-language sites do use the .jp TLD.

BTW, re: Google products, sometimes this is really annoying and it's not easy to find where to change the language in the UI.

A trick that often works: add ?hl=en to the URL.
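If you want to script that, a small sketch using only the standard library. The helper name is made up, and whether a given page honors `hl` is up to the site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_language(url: str, lang: str = "en") -> str:
    """Return the URL with its hl parameter set to lang, replacing any existing one."""
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "hl"
    ]
    query.append(("hl", lang))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_language("https://www.google.com/search?q=test"))
```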

I've run into this problem but I don't know there is a good solution. Let's say for whatever reason the app guesses the user wants Japanese. The app has no way to know what to present to allow the user to switch languages. Should they put a button that says "Language"? How does that help a Chinese speaker, a Korean speaker, a Thai speaker? Should they put "English"? Same point as above. As you complain, they'll likely put 言語, the Japanese for "language", and only if you click it will you see other languages, and it may be buried under 設定 (settings). Sure it's not useful for you, but neither is "English" or "Language" useful for a large part of the world.

I don't know there is a good solution. Checking my iPhone it's 設定ー>一般ー>言語と地域ー>iPhoneの使用言語 so fairly buried in language not useful for someone who doesn't know Japanese. Checking apple.jp, the place to switch is at the bottom right and it just says 日本, no indication that if you click it you'll get a list of countries and if you don't know Japanese you'd likely not know that means "Japan".

National flags - who doesn't know their own flag or any of the major ones enough that they'll miss a Union Jack, Stars and Stripes or a Hinomaru?

A few issues:

You can't trust that flags will be available, and you open yourself up to political/territorial disputes.

Windows doesn't render the Unicode flags (likely due to maintenance and territorial disputes).

Mainland Chinese iPhones don't render the Republic of China flag

Would you use the current Afghanistan flag, or the Taliban flag for Pashto (Afghani)?

Some languages don't have recognisable flags (our translation platform doesn't have a flag for Cantonese)

Flag to language is a many to many mapping. My Android lists ~107 available languages under 'English'.

Why would anyone pick anything other than:

- a Union Jack, because that’s the origin

- an American flag, because it’s the largest English speaking nation

- or their own flag (e.g. Australian)

if the language is English?

(1) You're proposing displaying a Union Jack to users in the Republic of Ireland

(2) You have a dependency on the number of states in the US. British people tolerate seeing the US flag, but it implies en-US, rather than en-GB

It's much more simple to avoid these issues by using a generic "A/文" symbol

> (1) You're proposing displaying a Union Jack to users in the Republic of Ireland

Yes. They're on a website, not watching an orange march go through their town.

> (2) You have a dependency on the number of states in the US.


> British people tolerate seeing the US flag, but it implies en-US, rather than en-GB

Who cares? It's a website.

National flags are not in general a very good way of labelling languages. There are far more languages in the world than countries. In any case, since people are looking for a language they understand it's good enough to write each language name in the corresponding language, like Wikipedia does (wikipedia.org).

But that's not the question, anyway. The question is how to label the button that lets the user change the language when the user might not understand the current language at all. Perhaps a big bright "?" ...?

National flags are actually a very good way of labeling languages for two reasons: first and most important, almost everybody already understands that a flag-looking icon (or two-flags-stacked-on-each-other icon) is used for switching languages. That's already a very strong practical reason to use them. Second, for like 70% of the most popular languages there are flag assignments that won't mortally offend the speakers of those languages: maybe they'd rather see a different flag, but generally they'd grumpily agree that "guess it conveys the intent good enough, whatever, I've managed to choose the actual language I want to use".

Another good option is a letter A and a Chinese character.

Localization is treated as a translation job across the whole tech world, but in reality it should be a UI/UX design job. Most localizations are awful even by translation standards, though. And not because they lack some kind of context etc., but because nobody cares: they are done by big agencies using mostly automatic processes, with prices racing to the bottom.

PS. I have 10+ years of experience with open source localization and tried to make it my job at some point. I escaped the industry very quickly.

+1 to that. If you design UIs, you need to understand concepts like Right-To-Left, one-few-many for numbers and counts, etc. Same for phrases that use different typefaces with an aim of concatenation (e.g. "red" + "apples") where other languages may have more than 2 words or need a reverse order (e.g. "apples in red").
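To illustrate one-few-many: a sketch of the plural categories for Russian, restricted to non-negative integers. The rule below follows the usual one/few/many split for Russian integers; the function and table names are made up, and a real app would take the rules from its i18n library rather than hand-coding them.

```python
def plural_category_ru(n: int) -> str:
    """Plural category for a non-negative integer in Russian."""
    if n % 10 == 1 and n % 100 != 11:
        return "one"
    if n % 10 in (2, 3, 4) and n % 100 not in (12, 13, 14):
        return "few"
    return "many"


# One noun form per category, for "apple".
FORMS = {"one": "яблоко", "few": "яблока", "many": "яблок"}


def count_apples(n: int) -> str:
    return f"{n} {FORMS[plural_category_ru(n)]}"


for n in (1, 3, 5, 11, 21, 22):
    print(count_apples(n))
```

An English-style `if n == 1` check gets 21 and 22 wrong here, which is why the singular/plural pair baked into many UIs doesn't survive translation.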

Modern UI design tooling allows for integrations with Localization Management Systems, which will perform automated translation or pseudolocalization in order to allow the designer to preview what their text looks like in another language or at another length.

One aspect not entirely covered in the article is when you have a deeply technical piece of software and the dominant vocabulary of the technical field is in English.

We get many requests from users of Ardour (a crossplatform digital audio workstation) to disable automatic translation based on their system language setting, because the terms used in the original/English version are the ones they are used to.

It also leads to some hilarious discussions among translators (at least for those of us watching from the outside). The funniest one I recall was the Portuguese and Brazilian Portuguese translators discussing how to translate the word "Roll" in the context of a DAW's "transport control" (i.e. the "play" button)

Hmm, haven’t come across the language icon before: <http://www.languageicon.org/>. Unfortunately, it looks rather shabbily done and very abandoned. Serving a JPEG for the large image instead of a PNG (to get an alpha channel while still being easily copyable) or SVG (for best results except for most copy-and-paste purposes), speaking of 2013 in the present tense, no HTTPS, claiming “copyright and hassle free” but it’s actually using an extremely problematic barely-specified license (claiming “a CC license” with no link, and a whole bunch of terms so that it’s a poorly-modified CC-BY-SA-NC), claiming you can download “rar or zip” and it includes SVG and more but it’s actually ZIP only with an eclectic mixture of bizarre formats (not including SVG), colours (some obviously wrong) and sizes, the icon itself is not aligned to any sort of sane grid and has been hand-placed and angled… all up it’s an unhappy mess. That’s not the way to go about trying to make a universal language icon that you want adopted. Pity, because the idea is decent (though the original article is quite correct that labels are far better than icons anyway).

https://materialdesignicons.com/icon/translate is a reasonable alternative to be used with a dropdown selector

PNG/SVG available. Under Apache 2.0 [0]

[0] https://github.com/google/material-design-icons/blob/master/...

If you actually care about people being able to find the switch language button, use flags. Usability trumps correctness here.

Please don't. Flags are not languages. (http://www.flagsarenotlanguages.com/) There are many languages which cannot be clearly distinguished with a flag icon -- for example, there are dozens of languages which could be represented by an Indian flag (Hindi, Bengali, Telugu, Urdu...), and some languages which don't have any clear flag, or whose flag could be politically complicated to display (like Catalan, Romani, or Tibetan).

Use the localized name of a language to indicate the language (e.g. English, Español, Svenska, 中文). It's unambiguous and takes very little effort to implement.

> It's unambiguous

On the contrary, if you just use the name of the language then it's not at all clear to a user who's using the "wrong" language where the language selector is.

Then supplement it with a standard language switcher icon: either A/文 or a globe.

That will still be less recognizable for most users than a flag.

lmm may be suggesting the use of flags as iconography only for the button that opens a popup where you can choose a language. That would be somewhat less bad.

Iconography already exists for that. On Windows, it’s a symbol comprising an A and a 文. On Apple systems, it’s a globe icon. I think the globe icon is seen more often on other platforms and on websites. Having the localised language name next to the icon helps, too.

I tend to try designing the app with as much culturally-neutral iconography as possible (difficult). ISO icons are useful (although many designers hate them). I also try to leave as much as possible to the platform (Apple platforms). Again, designers tend to hate that. The problem is that every custom element requires both a visible string, and at least one "invisible" one (voiceover). It can get a bit dense. It's nice, if I can rely on the built-in Apple versions.

I've also been caught out by choosing culturally-biased icons and visual elements.

I've used ibabbleon.com, in the past, and I'm told they do a good job. Not too expensive, fast, and technically correct.

Nothing beats having the end-users do the translations, though. I have been able to do this, with some of the open-source stuff that I've done. It can be an ... iterative ... process, though, as they can do things like send you translations with illegal characters, or in formats like UTF-8(BOM).
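
For the BOM case specifically, a minimal sketch in Python: the standard `utf-8-sig` codec decodes UTF-8 and drops a leading byte order mark if there is one. The helper name `decode_translation_file` is made up for illustration, and this does nothing about illegal characters.

```python
import codecs


def decode_translation_file(raw: bytes) -> str:
    """Decode contributor-supplied bytes as UTF-8, tolerating an optional BOM."""
    # "utf-8-sig" strips a leading BOM when present and otherwise
    # behaves like plain "utf-8".
    return raw.decode("utf-8-sig")


with_bom = codecs.BOM_UTF8 + "Apri biglietto".encode("utf-8")
without_bom = "Apri biglietto".encode("utf-8")

# Both inputs decode to the same string.
print(decode_translation_file(with_bom) == decode_translation_file(without_bom))  # True
```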

One nitpick about localization:

If you speak several languages, try setting your device to use one _language_ and keep your locale to US.

Now you can spot which developers understand the difference between a language and a locale and which don't (hint: on large enough apps you'll land on pages using the wrong one, i.e. determining the language from the locale). Or the opposite (watch the UI quote you prices in euros despite your locale being US).

If someone sets the language to (say) French but keeps their locale as English, do you write a thousand as 1,000 or 1.000? What about the reverse case?

Congratulations: you just opened the Pandora's box of locale vs language!

My understanding would be that locale should dictate numerical formatting. But one could argue the opposite and also be right.

If you were discussing POSIX locale, the definitions are quite clear, and there's no ambiguity. But that's also because POSIX subdivides locale into many subsections.

From "man setlocale":

       LC_ALL              All of the locale

       LC_ADDRESS          Formatting of addresses and
                           geography-related items (*)

       LC_COLLATE          String collation

       LC_CTYPE            Character classification

       LC_IDENTIFICATION   Metadata describing the locale (*)

       LC_MEASUREMENT      Settings related to measurements
                           (metric versus US customary) (*)

       LC_MESSAGES         Localizable natural-language messages

       LC_MONETARY         Formatting of monetary values

       LC_NAME             Formatting of salutations for persons (*)

       LC_NUMERIC          Formatting of nonmonetary numeric values

       LC_PAPER            Settings related to the standard paper size (*)

       LC_TELEPHONE        Formats to be used with telephone services (*)

       LC_TIME             Formatting of date and time values

Not to disparage the conversation (the technical bits are certainly interesting), but for almost every application, this is the point of diminishing returns. You could invest infinitely in getting every bit of design and copy flawless for every language and locale permutation, and your company would be worse off because you should have spent that time elsewhere. In many cases, it’s simply not unrealistic to expect your customer to use Google Translate.

If we're opening Pandora's boxes...

I've got GNOME set up in English, but my region to be the Netherlands. It will helpfully display local dates, but that also results in the month names being in Dutch.

Maybe that's what the Gnome interface limits you to (no idea) but LC_TIME and the timezone are not inherently linked under Linux.

If by "local dates" you mean the format rather than the timezone, then LC_TIME=en_DK will give you English text with RFC3339 YYYY-MM-DD formatting.
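
To make the "categories are independent" point concrete, here is a rough sketch of how a POSIX-style program picks the locale for one category from the environment: LC_ALL wins, then the category's own variable, then LANG, then "C". The function name `effective_locale` is invented for this example, and GNU gettext's extra LANGUAGE variable is deliberately left out.

```python
from typing import Mapping


def effective_locale(category: str, env: Mapping[str, str]) -> str:
    """Resolve the locale name for one category from environment variables.

    Precedence: LC_ALL, then the category's own variable (e.g. LC_TIME),
    then LANG, then the default "C" locale. Empty values count as unset.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name, "")
        if value:
            return value
    return "C"


env = {"LANG": "en_GB.UTF-8", "LC_TIME": "en_DK.UTF-8"}
print(effective_locale("LC_TIME", env))      # en_DK.UTF-8
print(effective_locale("LC_MESSAGES", env))  # en_GB.UTF-8
```

So messages and date formatting can come from different locales without touching the timezone at all.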

Unfortunately I don't really know what LC_TIME is :) But if I look through the calendar widget that drops down when I click the time in the top bar, it says e.g. that the previous month is called "augustus" and that today is "vrijdag". Looking into my "Region & Language" settings, my language is set to English (United Kingdom), and formats is set to Nederland (Nederlands), which is said to cover numbers, dates and currencies. Especially for currencies I'd prefer my local currency, but I'd prefer for numbers to use the UK system, and not quite sure what I'd like for dates as long as it's not the US system.

I checked the documentation for the ICU Unicode library and Apple's Foundation library, and they both say that numerical formatting is a property of the locale rather than the language. I'd be surprised if other major platforms did otherwise.
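
A minimal sketch of what keeping the two settings apart can look like, assuming an app that stores a UI language and a formatting region separately. The `UserSettings` type and the tiny `SEPARATORS` table are hand-written for illustration; a real app would take this data from ICU/CLDR rather than hard-coding it.

```python
from dataclasses import dataclass

# Hand-written illustrative table: region -> (grouping separator, decimal separator).
SEPARATORS = {
    "US": (",", "."),
    "GB": (",", "."),
    "DE": (".", ","),
    "NL": (".", ","),
}


@dataclass(frozen=True)
class UserSettings:
    language: str  # decides which translated strings are shown
    region: str    # decides how numbers are formatted


def format_number(value: float, settings: UserSettings) -> str:
    """Format a number using the region's separators, ignoring the UI language."""
    group, decimal = SEPARATORS[settings.region]
    # Format with the fixed "," / "." convention first, then swap both
    # separators in a single pass so they cannot clobber each other.
    text = f"{value:,.2f}"
    return text.translate(str.maketrans({",": group, ".": decimal}))


print(format_number(1000, UserSettings(language="fr", region="US")))  # 1,000.00
print(format_number(1000, UserSettings(language="en", region="NL")))  # 1.000,00
```

French text with a US region still gets 1,000.00 here, because the language never enters the formatting function.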

Yeah my understanding is people would classify this as locale too, but it's always seemed weird to me. I guess my question is whether this is about formatting to begin with. Periods mean something different depending on the language, right? It seems less about displaying it differently and more about conveying correct information. But then again, mm/dd/yy and dd/mm/yy are often exposed as a formatting option...

Ouch. If you are writing user interface for something that doesn't need to be "correct" (legally, or whatever), I find it best to accept whatever decimal separator (what you want, or change by locale, or both, whatever), and never use or allow thousands separators. Bonus points if you document it somewhere on the interface.

If you really, really, really need a thousands separator, use spaces.
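
A sketch of the "accept either decimal separator, never a thousands separator" rule, in Python. The function name `parse_amount` and the exact validation pattern are my own assumptions, not anything from a particular app.

```python
import re
from decimal import Decimal

# Digits, optionally followed by ONE separator ("." or ",") and more digits.
_AMOUNT = re.compile(r"\d+(?:[.,]\d+)?")


def parse_amount(text: str) -> Decimal:
    """Parse a user-entered amount, treating "." and "," alike as the decimal mark.

    Anything that looks like it contains a thousands separator (a second
    separator, spaces between digits) is rejected instead of guessed at.
    """
    cleaned = text.strip()
    if not _AMOUNT.fullmatch(cleaned):
        raise ValueError(f"cannot parse amount: {text!r}")
    return Decimal(cleaned.replace(",", "."))


print(parse_amount("12,50"))  # 12.50
print(parse_amount("12.50"))  # 12.50
```

Note that under this rule "1,000" is read as one, not one thousand, which is exactly why it is worth documenting the rule in the interface.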

Locale would be a country so e.g. UK? Then it'd be just like the US (1,000).

This is correct, but if we're being pedantic (it is HN after all) the locale code for the UK is "gb". The language / locale would be fr-gb.

I work on this sort of thing for airline ticketing pages. fr-gb would mean the customer wants the page text to be in French, but they want to buy a ticket using the UK system (i.e. Using GBP instead of EUR as the currency if possible, and all the formatting differences that would be specific to "gb").

Quick edit: technically I guess your locale doesn't have to determine your store region, that's just how we do it. As far as I know there's nothing except development effort preventing us from allowing someone to book a flight in French, using GB locale, and with preference for the Japanese point-of-sale system.
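
Roughly, as a sketch: split the tag, use the language part for page text and the region part for currency. This only handles the simple two-part "language-REGION" form, and the `REGION_CURRENCY` mapping and function names are hypothetical, not our actual system.

```python
from typing import NamedTuple, Optional


class LocaleTag(NamedTuple):
    language: str
    region: Optional[str]


def parse_tag(tag: str) -> LocaleTag:
    """Split a simple tag such as "fr-gb" into language and region.

    Scripts, variants and extensions are out of scope for this sketch.
    """
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    region = parts[1].upper() if len(parts) > 1 else None
    return LocaleTag(language, region)


# Hypothetical mapping from region to the currency offered by default.
REGION_CURRENCY = {"GB": "GBP", "FR": "EUR", "US": "USD", "JP": "JPY"}

tag = parse_tag("fr-gb")
print(tag.language)                 # fr  -> page text in French
print(REGION_CURRENCY[tag.region])  # GBP -> prices in pounds
```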

I really appreciate websites that allow me to use "Denmark - English" and see prices and payment options I understand, in the language I understand best.

locale defines decimal formatting

I got stuck with this a few years ago. I live in New Zealand. My bank used iOS’ number pad keyboard to let me enter a deposit amount in its app.

When I switched my phone to French language but left the region and formatting as English (New Zealand), it was clear the app developers didn’t know the difference between language and locale because suddenly I couldn’t deposit amounts with a decimal point; they didn’t account for the fact ‘,’ would show up on the number pad rather than ‘.’.

It meant that they detected my phone’s language rather than locale. It used French’s locale settings for the decimal separator despite the fact I had specifically set my locale settings to remain in English (New Zealand).

I complained and it was later fixed to detect locale instead — but seriously, language vs locale, not rocket science.

A related pet peeve of mine: if you want us to volunteer time and energy correcting or adding translations for you ("Help with the translation!"), then please make it as low friction as possible. Just linking to some third party website and expecting me to make an account, email-verify, figure out how it works, all to add a couple phrases of translation, is a good way to demotivate people trying to do free work for you.

What would you consider a low friction workflow?

For a long time in my open source application I had an ODF spreadsheet with all the translations, people could download it and send back to me. But that caused edit conflicts and was a pain to maintain. I've since moved to Weblate, which is basically what you describe, a third party website where users have to register and figure how it works, and I've got a lot more translations in and it is way easier to manage for me.

It's about proportional overhead. When I found uBlock (before it was Origin I think) was available in my language, I was willing to create an account because I knew I wanted to use it long term, and planned to work on as many phrases as I could, so it seemed worth the overhead.

99% of the time though, it's a word or two that I wish to correct in a site I might never visit again (just like I might correct a typo in a Wikipedia article that I'll never see again), and the overhead is many times more work than the actual translation. If I was able to (for eg.) hover on the vertical Feedback button that many sites now have, see a Translation Feedback option, and paste in the offending phrase with some context, and the correct translation, I'd be much more likely to do it. That can perhaps then automatically go into the third party website as from some common guest user, maybe even given lower priority since they likely require more processing - you can even warn me with "Register here to ensure your translation is seen" to manage expectations. But this way, the long tail of users that are put off by the friction can still together contribute - many eyes, shallow mistakes, etc.

> my open source application

My gripe is with the many large (for- and non-profit) organizations that do this though, to be clear. If it's something from a solo developer or a small team, it's understandable that the overhead of processing these might be more than they can bear (though that can be reduced by some categorization on the feedback form).

What I don't get is why some companies with obviously large marketing budgets have so poorly translated websites. Most recent example I stumbled upon: 1password's German website. It sounds so horribly bad. Everything sounds like a word by word translation of English texts. "Halten Sie Ihre Familie online sicher" - wiebitte? Nobody speaks like that. 1Password immediately feels like a scam when I read those sentences.

Wow, 100% of users on English-speaking site said that they prefer to use English language. I suspect that 100% of them are using Internet also.

This has a lot of good points, but the site itself has a huge distraction:

Why is the font so big?

To clarify, 30px size text on a 22" 1080p monitor looks more like a headline that is really long than body copy.

I realize that viewing websites on computers is much less the norm than it used to be, but such large text feels really strange.

Probably because the author is thinking about people with poor vision?

IMO, it's poor design for a page to handle accessibility issues that are already built into the browser. (It's trivial to shrink and zoom in a browser.)

I have to wonder what the complaint can be if it's trivial to shrink text - aren't you simply looking at this from the perspective of someone with better vision?

My own recommendation for general web body type in typical English fonts: never go above 20px, and 18px is generally a better upper limit.

The unspoken assumption here is that you're designing for younger people, which is natural as most designers are younger, but that doesn't make it any better a recommendation than "never go below 18px and 20px is a better lower limit".

Surely it depends on your audience and/or those you wish to attract.

A large part of my recommendation is consistency. For general content sites, it’s not good to be too different from the mean. Font family and size is definitely such a property where you don’t want to deviate too much: twenty years ago, 20px for body type would have been outrageously large; twenty years ago, 11px was definitely on the small side, but not outrageously small as it is now. Conventional sizes have definitely slid upwards, but they’ve peaked in roughly the range 16–20px, and I declare that anything higher than 20px is just too much, pointlessly limiting what you can fit on the screen—any time people go above there, you will see people zooming out because they find it too big.

(And this recommendation is designed for desktop-sized displays; on mobile displays, just use 16px, and certainly don’t go above 18px.)

I am emphatically not assuming designing for younger people; quite the contrary. You will find me saying “don’t go below 16px, you make things harder for many people”. But once you’re in the range 16–18px, going larger just doesn’t help—the people that want to go larger will (or should) already be used to going larger by zooming in, and you’re making life harder for everyone else by preventing the screen from fitting much content at once.

> For general content sites, it’s not good to be too different from the mean.

The mean is decided by those designing sites and they are overwhelmingly younger people - in fact, usually they're among the youngest of all adult workers - and it shows. If sites were designed for the mean amongst society then the move would be to larger fonts.

Personally, I welcome sites that use larger font sizes, I'm not sure why I need to strain my eyes one iota for text in a virtually infinite space. I know I'm not alone (though perhaps for different reasons). Deviating from the mean would work for a substantial part of the population who are currently treated like second class citizens on the web.

> But once you’re in the range 16–18px, going larger just doesn’t help—the people that want to go larger will (or should) already be used to going larger by zooming in,

I've had to show several oldies how to zoom. They do get used to it because they have to.

> and you’re making life harder for everyone else by preventing the screen from fitting much content at once.

Do the people who need this not know how to zoom out?

It looks fine in my locale.

Serious answer: the author has an "under construction" notice at the top of the page.

Design-me: "wow, people really do skip over top level banners now.

Now what do I use if something's important?"

The banner on this site is too skinny, and the background color is too dull. It doesn't jump out as anything important, and I skipped right over it as well.

A wider banner with more intense background color would go a long way.

I saw the banner too but wasn't sure exactly what he meant by "janky" and "updating in the open".

I thought it was going to be an example of a problematical translation

Alert boxes. Can't go wrong!

I typically close those too. The trick is to make them modal and persistent until you enter your email address.

I went through a lot of these pains. The biggest one was undoing the misuse of localisation tools to display content differently on each geographic region, and misuse of plurals.

I would also add collaboration efforts: how to make localisation work with continuous integration rather than going waterfall, where you make a release and then have to wait 4-6 days to localise half a dozen strings.

In case the author is here: for me, the font size was at a level where I found it was most comfortable to read after zooming out all the way to 50%. And I have neither great eyesight (I should go for new glasses... soon...) nor a retina screen with zoom or anything. Probably looks great on phones, but on desktop for me the font was set uncomfortably large.

Another concern: If your app/website is available in a language, customers will expect support in that language.

I've had lots of support conversations where the customer is typing in their own language while I type in English. Google Translate isn't perfect but it gets the ideas across well enough that we can usually resolve their issue.

For consumer-facing apps, online translation, community support and screenshots work well enough that it's only a minor inconvenience.

I do this over Facebook Messenger at least once per fortnight.

YouTube still shows dates for "Premieres on..." in MM/DD/YY, regardless of the locale. Internationalisation and localisation are such hard problems that even the big FAANG companies fail at them.

I would very much be on board for YYYYMMDD universal standard enforced by whatever means necessary to get it done.

But why people don’t use JAN / FEB / MAR … DD YYYY, IDK, seems low friction and no one gets confused even if they would rather see 7 JAN 2022

Yes, using JAN FEB etc. would be a very effective quick and dirty fix. I guess the only reason why not many complain in this case is that "premieres on..." is a feature that almost no one uses on YouTube.
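
For illustration, a small Python sketch of the formats being compared. The month abbreviations are hard-coded in English on purpose so the output does not depend on the process locale; the helper name `unambiguous` is made up.

```python
from datetime import date

# Hard-coded English abbreviations, independent of the process locale.
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")


def unambiguous(d: date) -> str:
    """Render a date with a spelled-out month so day and month cannot be swapped."""
    return f"{d.day} {MONTHS[d.month - 1]} {d.year}"


d = date(2022, 1, 7)
print(d.strftime("%m/%d/%y"))  # 01/07/22  (7 January or 1 July?)
print(d.isoformat())           # 2022-01-07
print(unambiguous(d))          # 7 JAN 2022
```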

Interesting and related: why Mozilla created Fluent, a project to translate software with: https://hacks.mozilla.org/2019/04/fluent-1-0-a-localization-...

> Translation and localization costs money

If you have a fan base, you can leverage it to get some amount of translation done. A lot of users are happy to help the product they like get better. Granted, the quality will not be as good as the quality you get with professional translators.

The article also did not talk about the actual translation process, which in the case of a product that is released but keeps getting updates is not trivial.

There are tools that exist, but I personally decided to build a workflow around git, with a Python script that generates a status of all the translations: https://github.com/jyaif/ppl-i18n#status

The downside is that contributors need to figure out how to use github to contribute. The upside is that it's free, you get auditability, versioning, and the barrier of entry may actually increase the quality of translations.
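
To give an idea of what such a status report involves, here is a generic sketch. It is not the script from the linked repository: the data layout (one dict of key to string per language) and the function name `translation_status` are assumptions made for this example.

```python
from typing import Dict, Mapping


def translation_status(
    base: Mapping[str, str],
    translations: Mapping[str, Mapping[str, str]],
) -> Dict[str, int]:
    """Return, per language, the percentage of base keys with a non-empty translation."""
    status = {}
    for lang, strings in translations.items():
        done = sum(1 for key in base if strings.get(key, "").strip())
        status[lang] = round(100 * done / len(base)) if base else 100
    return status


base = {"open_ticket": "Open ticket", "gate_closes": "Gate closes at {time}"}
translations = {
    "it": {"open_ticket": "Apri biglietto"},
    "de": {"open_ticket": "Ticket öffnen", "gate_closes": "Gate schließt um {time}"},
}
print(translation_status(base, translations))  # {'it': 50, 'de': 100}
```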

Like Shattered PD with Transifex.

Not to mention messages are almost impossible to google.

In fact, I used to play all my games in English for the same reason: items, places, starts... When you want to know more, the only good wikis and tutorials are in English.

I used to write a blog in French, and it became super popular. Yet it's a shadow of what you can achieve with an English audience.

Sometimes I wish we could switch the whole world to a single language. Get rid of timezones and use metric everywhere while we're at it.

About as tempting as switching all cuisine to Soylent to save us the mild indecision when having to choose a restaurant for date night.

Having just recently been picking a restaurant for an anniversary, I can tell the problem is real and I'm appalled at the sad state of technology for restaurant discovery.

Soylent for two?

Getting rid of time zones is a nice idea for the first five seconds you think about it.

Mainland China did mostly. Everything is +8[0] (besides Xinjiang Time[1])

[0] https://en.wikipedia.org/wiki/Time_in_China

[1] https://en.wikipedia.org/wiki/Xinjiang_Time

Yes. But that’s also only 4 time zones squashed into one, with the vast majority of the population living in 2. The EU could remove time zones, the US too, but worldwide it would be quite unpleasant to completely lose the notion of when a « morning » is.

Also not to forget that currently midnight and the associated change of the calendar date happen at a time when most people are either asleep or otherwise don't really care about that fact.

Not having time zones means that everybody except the privileged minority living near wherever the new meridian ends up will have the calendar date change during their waking hours, which will be rather confusing.

Plus how are things that are currently specified at the granularity of calendar dates supposed to work? Do public holidays and suchlike suddenly start and end at something like for example 3 o'clock (solar time) in the afternoon, because that's when the new midnight (global time) happens at your location? Or does everything now need to specify starting and ending hours, too? Or do you decouple the change of calendar date from midnight (global time), which basically means reintroducing time zones through the back door…

My 5 cents from the antipodes (NZ, so it really should be 10 cents): yes.

For so much of what we do, we have to do it on Europe and North America’s time. It helps to be able to say “if we want to join that 9am conference call in New York, we have to be up at 1am”.

Taking out time zones and saying “oh good, the meeting is at 9am” doesn’t help me to know what time in the middle of the night I have to be up — other than that it’s 9am, which isn’t helpful.

The whole am/pm thing would also probably need to be ditched for military time. (Now that is a thing I would agree on doing anyways)

That’s just a different way to format the same thing. It doesn’t stop the sun being up in one place and down in another.

Great way of erasing all cultural differences and putting an end to diversity. Might as well merge all countries together into one and establish a single religion and a single political system while you're at it.

One problem I stumbled upon frequently is codebases that did not support localized formats, but just assumed a certain format to use, for example through concatenation.

Most programming languages have built-in capabilities for formatting numbers, currencies, etc. for a specific locale. There are also great resources [1] out there that provide all kinds of formats and localized names for countries, currencies, etc.

[1] Unicode CLDR: https://github.com/unicode-org/cldr
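
A tiny sketch of the alternative to concatenation: keep a per-language pattern with named placeholders, so the order of the parts belongs to the translation rather than to the code. The `ADDRESS_LINE` table is hand-written for this example; CLDR-backed libraries ship real patterns for many kinds of data.

```python
# Per-language patterns; the placeholder order is part of the translation.
ADDRESS_LINE = {
    "en": "{number} {street}",
    "de": "{street} {number}",
}


def street_line(lang: str, street: str, number: str) -> str:
    """Build the street line of an address without assuming a fixed order."""
    return ADDRESS_LINE[lang].format(street=street, number=number)


print(street_line("en", "Baker Street", "221B"))  # 221B Baker Street
print(street_line("de", "Hauptstraße", "12"))     # Hauptstraße 12
```

With `number + " " + street` hard-coded, the German line could never come out right.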

My first job out of college involved template-based translations for chemical bottle labels, with information printed in English, French, German, Italian, Japanese and Spanish and different regulations attaching to each language (jurisdictions included the US, Japan, Canada, EU, Germany and UK, each with their own specific requirements). Machine translation was still expensive and unreliable (this was 1991–3), so we had translators build up a phrasal dictionary that we could then apply rules to in order to build up the text that would appear on each label. Thanks to the regulatory regime, a lot of care was taken in designing the system, and with each phrase hand-translated by actual human translators there were no glaring errors.

Looking at after-the-fact attempts at internationalization, there are lots of pitfalls, and it's something that needs to be done intentionally. (I'm still thinking about how best to implement the equivalent of LaTeX's \cref for finl. What works for English doesn't work for other languages: in Czech, for example, “in sections 3 and 4” would be rendered “v sekcich 3 a 4” while “see sections 3 and 4” would be “viz sekce 3 a 4”, although “see sections 3–10” should be “viz sekci 3–10”.)

It’s unfortunate that by default a lot of systems just gather strings into a pile for translation, as if all strings are created equal. But if you have two uses of the same text with different meanings, you have to be careful (e.g. put them in separate translation tables or something). Even for single words, e.g. somewhere there’s a button named “Open” but elsewhere there is a status text indicating that the current state of something is “Open”; lots of things like that. Just because it’s the same in one language, it may not be in another.
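
A minimal sketch of the "separate tables" idea: key each entry by context plus source text instead of by source text alone. gettext's `pgettext` works along the same lines; the catalog and the `translate` helper below are hand-rolled for illustration.

```python
from typing import Dict, Tuple

# (context, source text) -> Italian translation
CATALOG_IT: Dict[Tuple[str, str], str] = {
    ("button", "Open"): "Apri",    # imperative: open this thing
    ("status", "Open"): "Aperto",  # adjective: this thing is open
}


def translate(context: str, text: str, catalog: Dict[Tuple[str, str], str]) -> str:
    """Look up a translation by context and source text, falling back to the source."""
    return catalog.get((context, text), text)


print(translate("button", "Open", CATALOG_IT))  # Apri
print(translate("status", "Open", CATALOG_IT))  # Aperto
```

The same English string now maps to two different translations, and a translator sees which one they are being asked for.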

It would also be nice to have more context besides, say, just a comment. For example, things in toolbars ought to be short (ideally one word), things in menu items might be medium (a few words tops), things in notifications might be medium-to-long, and stuff in a window might have no restrictions at all. When you start from just a string, you do not necessarily know how much space you have and even if your UI can auto-resize, that doesn’t mean you’d want it to in every case.

I'll add my few experiences. The primitives of UI that we use in English are often untranslatable, or need really verbose translations simply because they've been inducted into English as part of the tech UI lexicon.

I speak four languages fluently, and 'OK', 'Abort', 'Retry' and 'Load' are some of the hardest things to translate.

Apart from the translation itself being bad (plenty of comments here already), this is another thing that also bugs me with apps/software in Brazilian Portuguese:

> Words may have radically different lengths in other languages

Sometimes the UI gets completely screwed, and I know it'll just look better in English, if the design was originally done with English text

That's an obnoxiously large font, especially for a .design website

I'd like to add that being able to switch languages on a website is really helpful for learning another language.

I commonly do it with Wikipedia

A few more based on my own experience:

* People don't always speak the language of the country they are in

* There are significant regional differences for a given language. As a French Canadian, I couldn't translate a website to French because it would sound wrong to someone in France. For example, we have vastly different sets of English loanwords.

* The order of things can be different. For instance, German addresses put the door number after the street name. This can break your layout or even your UX in subtle ways.

* You must choose formal or informal pronouns (tu/vous, du/Sie) and use them consistently.

* Labels can make no sense if you don't know what the UI is like. Context is important for translators.

Also make sure whoever translates is familiar with the profession's nomenclature, to avoid phrases like "save to disk" being translated as "leftover on plate", which would be very confusing.

I used to work on the localization team at my company. It's a pretty complicated world in itself. I was nodding my head while reading this, and I think the biggest surprise I ran into is how few companies do a great job of localizing for many different regions. I think this is also an area where large companies have a huge lead over newcomers.

I live in Slovenia... in many cases, it is impossible to even translate stuff directly into our language, because we don't only have singular and plural, we also have "dual" (for two of something). We also have a lot of irregular stuff, like when we count beers:

- eno pivo (one beer)

- dve pivi (two beers)

- tri piva (three beers)

- štiri piva (four beers)

- pet piv (five (or more) beers)

How the hell do you put this into strings.xml?!

Well, I don't know how to put it into strings.xml specifically, but there are localisation systems that handle it without much fuss. It requires a bit more effort from developers and translators, but it's not that hard in concept.

For example, Fluent (https://projectfluent.org/, used by Firefox), or MediaWiki's localisation system (https://www.mediawiki.org/wiki/Localisation#Message_paramete...).
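
The usual approach, sketched in Python: map the number to a named plural category and let the translator supply one string per category. The rule below follows the Slovenian categories as I understand them from CLDR (one, two, few, other, decided by the last two digits) and only covers whole numbers; Android's `<plurals>` resources use the same kind of quantity names. The function names are made up.

```python
def slovenian_plural_category(n: int) -> str:
    """Return the plural category for a non-negative whole number in Slovenian."""
    last_two = n % 100
    if last_two == 1:
        return "one"
    if last_two == 2:
        return "two"
    if last_two in (3, 4):
        return "few"
    return "other"


# One translated form per category, supplied by the translator.
BEER = {"one": "pivo", "two": "pivi", "few": "piva", "other": "piv"}


def beers(n: int) -> str:
    return f"{n} {BEER[slovenian_plural_category(n)]}"


for n in (1, 2, 3, 4, 5, 101):
    print(beers(n))
# 1 pivo
# 2 pivi
# 3 piva
# 4 piva
# 5 piv
# 101 pivo
```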

> The web’s Anglocentric-bias exists largely because of where it was originally created

At CERN, in Switzerland?

I see a lot of bad translations into Brazilian Portuguese, even from big companies/services. It was always obvious to me that they could get better conversion rates if they fixed it. It's tricky, but not that hard compared to the benefit.
