
The Language Barrier Is About to Fall - Trisell
http://www.wsj.com/articles/the-language-barrier-is-about-to-fall-1454077968?
======
korginator
This article is BS. Machine translation has been improving but it's still
nowhere close to being able to provide real-time translation of spoken
languages, and I bet anything it won't be there after another decade. I know
for a fact that Baidu is throwing lots of money and people at fundamental R&D
in translating Asian languages, and it's no piece of cake.

As for "earpieces whispering nearly simultaneous translations", it would be a
cool party trick but I doubt they'll work in real life.

These moonshot projects have been around for decades, and they make for some
cool entertainment at conferences and demos in very tightly controlled
settings. As for the predictions that "the software in the cloud connected to
the earpiece in your ear will re-create the voice of the speaker", the author
should take up writing bad science fiction.

~~~
astrodust
It depends on what the bar for "good enough" and "fast enough" is. Remember
that Google Translate, just one example, was pretty feeble to start but has
proven itself to be adequate for most needs now even if a lot of the meaning
gets lost in translation.

Could that be paired up with Google Voice? Absolutely.

It's not easy, but it's inevitable.

~~~
Brakenshire
I wouldn't say there are that many situations where Google Translate is good
enough, to be honest. It's not so different from an automatic way of looking
words or phrases up in a dictionary.

------
radarsat1
I'm currently living in a country where I don't speak the language. I can't
believe how incredible translation tools are. Even though I don't understand
when people speak to me, I can write and read emails, and accomplish things
that would be incredibly more difficult without this ability (e.g. talking to
my insurance agent). And when I say I can read and write emails, I mean I can
do so in about 1.5x the time a native speaker would take. I have developed very basic
abilities in the language (in the 2 months I've been here), and so I
constantly use translation tools and check in both directions -- translate to
the language, modify it to make sense according to my understanding --
translate it back to English -- all within the space of a couple of minutes.

I still miss out on learning more idiomatic expressions, but I imagine those
will come in time. In the meantime, I am able to get a surprising amount of
business done, in ways that would have been impossible before.

Not to mention that continuously doing all this translation of my written
communications is making me learn the spoken language much faster. Although,
for that, nothing replaces actually speaking with people, of course! ;) But
it's much easier to bootstrap actual conversation when you've been learning
individual words by reading and writing.

Overall I would say that automatic translation tools have made it possible for
my life to move in directions where language would have been a serious
roadblock in a previous era.

~~~
tomp
What is the other language, and how similar is it to English (or other
languages you know)? (Before you answer, my predicted answer is "very
similar".)

~~~
radarsat1
Similar enough, certainly. More similar to my second language. (I know French,
I'm learning Spanish.)

It definitely lowers the barrier, but don't think there's no barrier just
because it's a "similar" language. I'm claiming that automatic translation has
accelerated my learning, not that learning it would be impossible otherwise.

Nonetheless it is quite difficult to live somewhere where you don't speak the
language, no matter what tools you have available. Mostly it doesn't matter,
but when you find yourself arguing with administrators... or the police...

(I almost got into a bike accident the other day. I wasn't even all that
worried about my health, I was mostly scared of having to talk to people,
hospital/police/insurance..! or argue with the guy who hit me...)

------
kazinator
I suspect real-time machine translation is within reach if the speakers are
trained to use a restricted subset of their native language. Moreover, I
suspect that this approach could be both realistic and useful.

This restricted subset situation is _de facto_ already the case; existing
tools don't handle regional dialects and colloquialisms. I just tried "嘘じゃねいぞ"
(uso ja nei zo!) in Google Translate and it came up with rubbish because it
doesn't know that "nei" is a variant of "nai".

The kinds of restrictions I'm thinking of are deeper: avoidance of the full
variety in sentence structure, and excessive complexity, like compounding of
relative clauses. Not to mention language-dependent tricks like double
meanings depending on puns, and generally all figures of speech that do not
translate.

A restricted subset of your native language is something which you already
understand, and learning to speak it is easier than a whole different
language.

People doing business abroad would just study that for a few weeks as part of
the workflow.

The dialect could become richer and thus less restricted as time goes on,
making it easier to use. It could be customizable as well, to a particular
speaker's idiosyncrasies. If a given speaker often uses some figure of speech
out of habit, it could just be defined for that speaker, with a hand-crafted
translation to various languages.
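
A minimal sketch of that per-speaker customization, assuming a lookup table of hand-crafted translations that is consulted before a generic engine. Every name here (`translate_for_speaker`, `placeholder_engine`, the `overrides` table, the speaker "alice") is hypothetical and only illustrates the idea:

```python
from typing import Callable, Dict, Tuple

# Hypothetical table: (speaker, normalized phrase, target language) -> hand-crafted translation.
Overrides = Dict[Tuple[str, str, str], str]


def translate_for_speaker(
    speaker: str,
    phrase: str,
    target_lang: str,
    overrides: Overrides,
    fallback: Callable[[str, str], str],
) -> str:
    """Prefer the speaker's hand-crafted translation; otherwise use the generic engine."""
    key = (speaker, phrase.strip().lower(), target_lang)
    if key in overrides:
        return overrides[key]
    return fallback(phrase, target_lang)


def placeholder_engine(phrase: str, target_lang: str) -> str:
    # Stand-in for a real MT system (an assumption): it only tags the input.
    return f"[{target_lang}] {phrase}"


# One habitual figure of speech defined for one speaker.
overrides: Overrides = {
    ("alice", "it's not rocket science", "fr"): "ce n'est pas sorcier",
}

print(translate_for_speaker("alice", "It's not rocket science", "fr",
                            overrides, placeholder_engine))
```

The table is keyed by speaker, so one person's habitual idiom does not change how the same words are translated for anyone else.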

~~~
resoluteteeth
> A restricted subset of your native language is something which you already
> understand, and learning to speak it is easier than a whole different
> language.

It's interesting to note that before machine translation, there were
attempts to standardize restricted subsets of English as a way of
making it easier for non-native English speakers (for example "Basic English"
and "Basic Global English").

However, I think both in the case of communication with non-native speakers
and in the case of machine translation, it's actually not that easy for native
speakers to anticipate what phrasings will be problematic.

These existing efforts have had the problem that reducing vocabulary can
actually make the resulting language more reliant on things such as verb
collocations that are actually harder for non-native speakers. Machine
translation has an advantage that it can presumably understand a rich
vocabulary, so it might be possible to get somewhat better results by using as
precise words as possible while keeping the sentence structure simple. Still,
especially with modern statistical machine translation, and perhaps even more
so in the future with recurrent neural network-based translation, it may not
be that easy to predict potential pitfalls in an attempt to work around them.
It actually might have been easier to train people to get better results (at
least for specific language pairs) with older, rule-based translation systems,
since they were more predictable.

In either case, there presumably needs to be a certain level of accuracy
before people can start attempting to tweak it by their choice of wording. I
don't think we are there yet, especially for English <-> Japanese. I just
tried entering "I'm not going to tell him that." in Google Translate, and it
gave me "私は彼のことを言うつもりはありません。" ("I'm not going to mention him.") This is a
fairly simple sentence, with no colloquialism, and yet Google Translate fails
utterly. This is much more concerning than something that can be easily worked
around, such as not knowing colloquialisms such as "じゃねー".

~~~
kazinator
Obviously the program must have figured out that there are two clauses: "I'm
not going to X", where X is "tell him that". How the heck does it go from
there to "kare no koto wo iu"? If we translate just "tell him that" alone we
get ことを彼に伝えます which is okayish. If it would just stick some version of _that_
into the "watashi ha ... tsumori wa arimasen" pattern, it would be progress.

------
GuiA
I've been learning Japanese as of late, and automated translation tools are
near useless for anything but sentences that are just a few words strung
together (and even then, nonsense is often returned as soon as a tiny bit of
slang or metaphorical language is used). I also have friends who speak
languages that are not part of the top 10 most spoken in the world (eg
Finnish, Hungarian, etc) and trying to translate their Facebook statuses
returns complete garbage.

Between languages that are fairly close to one another (such as French and
Spanish), sure - you can pipe text from one language to another in Google
Translate and the results will be mostly understandable (although still very
awkward). But anything else, and automated translation still has a ways to go.
Turns of phrases, idioms, general syntax etc. are fairly similar between
languages that are closely related, so automated translation has an easy job
there (with the caveat that it starts falling apart as soon as slang or more
colloquial expressions are used). But step any further than that, and it's a
lost cause.

The article also assumes that there's a 1:1 mapping between everything you can
express in every language, and that it's just a matter of software finding the
right mapping for whatever you're saying into the destination language. But
any translator will tell you that this couldn't be further from the truth -
there are many things I can say in English to my Californian friends that
don't really work in a French conversation with my Parisian friends (let alone
in Mandarin or Japanese). At best you'll be met with blank stares, and at
worst you'll be breaking etiquette in major ways. Part of learning a new
language is learning the space of what can be expressed and how in that
language.

For these reasons, I don't really buy the whole "earpiece that automagically
lets you speak with anyone in the world" anytime in the next few decades.
Which is fine, because learning languages is fun :)

On the other hand, I'm currently in the Netherlands for a conference, and
everyone speaks flawless English. I've had a similar experience in Germany,
Sweden, and a bunch of other countries (but not in my country of origin,
France, where being terrible at English seems to be a point of pride). So
maybe taking a page from their book when it comes to education is a better
idea than waiting for technology to let us be lazy.

~~~
timr
Exactly. I've been taking Japanese recently, too, and the grammatical
structure of the language alone is enough to show that the "real-time
whispering earpiece" is an impossible concept: in English, the verb usually
comes near the beginning of a sentence. In Japanese, it's near (or at) the
end. Translating one language to the other therefore requires a mental "tape
delay", while you map the one grammar onto the other.

Even when you hear bilingual, fluent translators converting Japanese to
English, there's often a weird disassociation between the words the speaker is
using and the words coming out of the translator. The translator is buffering
the words and translating in logical units. Even if you grant the benefit of
the doubt and call this "real time" translation, I know of no translation
system that is capable of any sort of similar kind of judgment with an
unbounded input stream. Deciding where to begin/end translation requires
knowing what the original sentence intended to express. And computers still
suck at that.
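
To make the buffering concrete, here is a rough sketch of "translate in logical units". It assumes the input is already a stream of text tokens with sentence-final punctuation, which real speech does not provide; finding those boundaries is exactly the hard part. The function name and the punctuation set are made up for illustration:

```python
from typing import Iterable, Iterator, List

# Assumed boundary markers; spoken input has no such tokens.
SENTENCE_END = {".", "?", "!", "。"}


def logical_units(tokens: Iterable[str]) -> Iterator[List[str]]:
    """Buffer incoming tokens and release them one sentence at a time."""
    buffer: List[str] = []
    for token in tokens:
        buffer.append(token)
        if token in SENTENCE_END:
            # Only now is the unit complete enough to hand to a translator.
            yield buffer
            buffer = []
    if buffer:
        # Flush whatever is left if the stream ends mid-sentence.
        yield buffer


for unit in logical_units(["I", "ate", ".", "Did", "you", "?", "well"]):
    print(unit)
```

Nothing is emitted until a boundary arrives, which is the "tape delay": for a verb-final language the translator cannot start before the end of the unit anyway.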

~~~
rmc
German, one of the languages closest to English, also does the "verb at end" for
many constructs (e.g. past tense, future tense, "must"/"should"/"can")

------
euske
I think people are overestimating the word "language barrier". People often
forget that the language barrier is not a one-sided problem. Communication is
a collaborative effort (unless we're just spying on them). I bet that something
like keyword spotting for the 3000 most commonly used words is already pretty feasible,
and that would greatly reduce the language barrier. Perfect translation is
neither possible nor necessary to induce mutual effort.
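
A toy sketch of what such keyword spotting could look like on text, assuming a small glossary that maps known source words to target-language glosses. The function name and the two-entry German glossary are invented for the example, and splitting on word characters would not work for languages written without spaces, such as Japanese:

```python
import re
from typing import Dict, List


def spot_keywords(utterance: str, glossary: Dict[str, str]) -> List[str]:
    """Return glosses for the words we recognize and silently skip the rest."""
    words = re.findall(r"\w+", utterance.lower())
    return [glossary[word] for word in words if word in glossary]


# Tiny stand-in for a list of the 3000 most common words.
glossary = {"wo": "where", "bahnhof": "station"}

print(spot_keywords("Entschuldigung, wo ist der Bahnhof?", glossary))
```

The output is only a handful of glosses, not a translation, but it gives the listener enough to start the mutual effort.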

(By the way, I've seen so many US tourists in Europe who don't even try to
speak a single word in the local language. They don't seem to know that a simple
"thank you" in their hosts' native tongue would greatly reduce the barrier and
change their hosts' sentiment towards the US. In this sense, I believe that technology is
more or less neutral to mutual understanding, because not knowing a language
(or the idea that not knowing a language is acceptable) can have adverse
effects.)

Edit: a few grammatical errors. Damn my Japaneseness!

~~~
hodwik
> (the idea that not knowing a language is acceptable)

It's not? I'm in big trouble. There are more than 6,000 languages I do not
know.

~~~
Piskvorrr
The point is probably "it's not acceptable to assume 'everyone knows English,
but most are just too lazy to use it'" rather than "of course you're supposed
to be fluent in any language" ;)

------
tallanvor
The problem is that we still haven't been able to write a system to accurately
recognize the words that are being spoken. Differences in diction, dialects,
and accents make it difficult to accomplish. Just look at what Microsoft,
Google, and Apple are doing - as good as they are, they're nowhere close to
where they need to be.

Even once we solve the voice capture issue, I suspect that we'd still run into
many areas where the computer would have problems determining the proper
context necessary to accurately translate the text. Whatever system is
performing the translation would have to maintain a history of the
conversation in order to have any hope of understanding context, and at a
gathering where you could have multiple conversations occurring at the same
time, the process gets even harder.

I think the only part of the author's belief that we're close to achieving is
being able to have the computer use the original speaker's voice. Siri and
Cortana could already sound more human, but people are currently more
comfortable having them sound robotic. I'm sure that will change in time,
however.

------
pluma
Yet another journalist who doesn't understand how languages actually work.

Other languages aren't just "English with different words and syntax". A
single word can have dozens of different meanings, and the set of different
meanings of a word needn't be the same across languages. And even if you know
the context the meaning doesn't have to be unambiguous. A perfect translation
would need to recognize plausible ambiguities and retain them across
languages.

Language is not code. Language is an expression of thoughts in a form that
builds upon vaguely defined tokens. It's amazing that human language
communication works this well at all given how ambiguous and faulty it
actually is in practice.

~~~
astrodust
Even English isn't English as innuendo and alternate meanings cloud
understanding of it constantly. Even the simple phrase "I'd hit that" means
any number of things depending on the speaker and the context.

This doesn't even touch on how sometimes British English needs to be
"translated" to American English, like in the case of Harry Potter being
adapted with many words changed.

Likewise it's almost impossible to express the nuance of something like
"that's so fucking fucked" in a language lacking a word of similar
versatility.

I'd argue there's no such thing as a "perfect translation". That's why people
who translate novels have a considerable amount of work to do to pick, from
all possible translations, the one that best represents the tone and intent of
the original author.

I still think it's possible to come up with a passable translation
automatically in real-time. It's inevitable.

~~~
pluma
Sure, but translations are lossy. No 1-to-1 translation can ever completely
emulate perfect fluency.

~~~
Piskvorrr
You can get pretty close though ;)
[https://www.youtube.com/watch?v=akbflkF_1zY](https://www.youtube.com/watch?v=akbflkF_1zY)

------
Piskvorrr
"...in ten years time." To quote XKCD's translation (sic!), "we haven't
finished inventing it yet, but when we do, it'll be awesome."
[https://xkcd.com/678/](https://xkcd.com/678/)

This is the same tired old story that keeps on making rounds for the sixth
decade now - "machine translation kind of sucks now, but it will be solved
within the next decade!!!!!!!!1!" This is still very much a current paper -
note the publication date:
[http://www.mt-archive.info/Bar-Hillel-1951.pdf](http://www.mt-archive.info/Bar-Hillel-1951.pdf)

~~~
astrodust
To be fair, translation today is significantly better than sixty years ago.
For one, the output is often usable as-is, and with some tweaking is even
better.

I've known several people that have learned English by leaning heavily on
automated translation until they got a firmer handle on it. If you tried that
in the 1970s you'd still be trying to figure out what the word "the" means.

~~~
Piskvorrr
Sure, it's better - in the "pre-MT" and "post-MT" systems (although it's
called "crowdsourcing" these days). For standalone MT, the output is just-
about-recognizable; nowhere near the babel-fish-like gadget WSJ is presenting
as "just about to happen."

------
tokenadult
真有意思。I see a lot of dismissals of the projections in this article. Thirty
years ago, just as I was working for several years as a Chinese-English
interpreter, I would have agreed with all those dismissals. I no longer agree.
As some of the subcomments here have pointed out, all that automated
translation and interpretation have to do is become less expensive than either
a) learning the relevant language yourself or b) hiring good-quality human
translators and interpreters. Automated translation and interpretation is
already well along that path.

A crucial skill for a professional interpreter or a casual language-learner is
"lexicalization," figuring out what the correct match is between what is
possibly a phrase in one language but only a single morpheme that is part of a
larger word in another language. But the huge advantage automated translation
solutions have is shared learning and long-lasting memory. If the
lexicalization is built into a database, for example a database kept by Apple,
Google, IBM, or some other multinational provider of translation services,
that learned information can be deployed in products over and over and over
again. In the end, that's all that it takes for automated translation and
interpretation to do better at equal cost (or as well at less cost) compared
to human language-learning or human language services. I got out of
interpreting for a living years ago, because I finally came around to the
understanding of how little I could compete with the worldwide efforts made
with new technology to tackle language problems. I still think learning
natural modern human languages is a very intellectually enriching activity,
and all my children do that, but we expect to live in a world a decade from
now with very good real-time interpreting and translation systems for a wide
variety of language pairings.

------
dalke
Compare this line from the WSJ article:

> A decade from now, I would predict, everyone reading this article will be
> able to converse in dozens of foreign languages, eliminating the very
> concept of a language barrier.

with these excerpts from The Guardian at
[http://www.theguardian.com/technology/2016/feb/10/texas-regi...](http://www.theguardian.com/technology/2016/feb/10/texas-regional-accent-siri-apple-voice-recognition-technology)
(referenced two days ago on HN
at
[https://news.ycombinator.com/item?id=11077168](https://news.ycombinator.com/item?id=11077168)
):

> Y'all have a Texas accent? Siri (and the world) might be slowly killing it

> Voice recognition tools such as Apple’s Siri still struggle to understand
> regional quirks and accents, and users are adapting the way they speak to
> compensate.

> ... “I’ve had a bunch of people from Australia and India say they only
> really get along with Siri if they fake an American accent,” said Lars
> Hinrichs, a sociolinguist at the University of Texas at Austin.

~~~
rollthehard6
To be fair, having not tried Siri for a year or two, it has gotten much better
at dealing with the Glasgow accent ; )

I recall around 2000 having to fake a US accent in order to get an airline's
voice response system to put me through to a human.

~~~
astrodust
You probably have to fake an accent to communicate with some American call-
centres as well.

------
cabalamat
For an example of the state of automatic translation c.2010 I give you this:

This is question, engish is faulty therefore the right excused is requested.
Thank google to translate to help. SORRY!!!!!

At often, the goat-time install a error is vomit. To how many times like the
wind, a pole, and the dragon? Install 2,3 repeat, spank, vomit blows

14:14:01.869 - INFO
[edu.internet2.middleware.shibboleth.common.config.profile.JSPErrorHandlerBeanDefinitionParser:45]
- Parsing configuration for JSP error handler.

Not precise the vomit but with aspect similar, is vomited concealed in fold of
goat-time lumber? goat-time see like the wind, pole, and dragon? This insult
to father's stones? JSP error handler with wind, pole, dragon with intercourse
to goat-time? Or chance lack of skill with a goat-time?

Please apologize for your stupidity. There are a many thank you

(from
[https://lists.internet2.edu/sympa/arc/shibboleth-users/2010-...](https://lists.internet2.edu/sympa/arc/shibboleth-users/2010-09/msg00304.html) )

------
pmontra
It must work offline. One of the reasons is that tourists often don't buy a
data plan (imagine a tour of Europe with a different country every couple of
days) and wifi is not where you might need machine translation most, on the
road.

~~~
distances
Many European mobile contracts allow a bucket of roaming data (400MB per month
for me). Not sure what the status is for prepaid SIM cards that tourists from
outside of Europe would likely have.

Also, the roaming costs in EU are regulated, with likely price decreases also
in the future:
[https://en.wikipedia.org/wiki/European_Union_roaming_regulat...](https://en.wikipedia.org/wiki/European_Union_roaming_regulations)

~~~
scarhill
A number of prepaid providers are offering data roaming plans now. See this
list:
[http://prepaid-data-sim-card.wikia.com/wiki/European_Union](http://prepaid-data-sim-card.wikia.com/wiki/European_Union)

------
mistermann
If I access this article via a google search, it is still paywalled. I see
experts-exchange seems to be doing this again also. I thought this was against
google's terms & conditions, no?

~~~
chrismcb
Google has terms and conditions to be listed in search?

~~~
mistermann
Ya, the page when you click through has to be the same as that indexed by
google, I think it might be referred to as cloaking?

------
dcw303
I live abroad and speak far from fluently in a foreign language. I'm
personally invested in machine translation improving, and I really hope it
does.

> Today’s translation tools were developed by computing more than a billion
> translations a day for over 200 million people. With the exponential growth
> in data, that number of translations will soon be made in an afternoon, then
> in an hour. The machines will grow exponentially more accurate and be able
> to parse the smallest detail. Whenever the machine translations get it
> wrong, users can flag the error—and that data, too, will be incorporated
> into future attempts.

If this is the reasoning that we will be there in 10 years, it is ridiculously
optimistic. I'm pretty sure that there is an exponential falloff from common
phrases to a long tail of more nuanced, not as often uttered language. A
million people punching in "where is the bathroom?" is not going to make the
machine know how to impart the acerbic wit of a phrase from a Palahniuk book
to a Japanese business man. Good luck finding someone skilled enough in both
languages and cultures to be able to correct the error.

Real skill in multiple languages requires an intellect to parse the
situations. While I personally hope that will one day be achievable with AI, I
have to think that we are still a long way off.
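
To put rough numbers on the long tail, here is a back-of-the-envelope sketch. It swaps in a Zipf-style power law (frequency of the phrase at rank r proportional to 1/r^s) rather than a strictly exponential falloff; that model, the exponent, and the inventory sizes are all assumptions, not measurements of any real translation log:

```python
def zipf_coverage(top_k: int, inventory: int, exponent: float = 1.0) -> float:
    """Share of all usage covered by the top_k most frequent phrases,
    assuming the phrase at rank r is used in proportion to 1 / r**exponent."""
    if not 0 <= top_k <= inventory:
        raise ValueError("top_k must be between 0 and inventory")
    weights = [1.0 / rank ** exponent for rank in range(1, inventory + 1)]
    return sum(weights[:top_k]) / sum(weights)


# Hypothetical inventory of one million distinct phrases.
for k in (10, 1_000, 100_000):
    print(k, round(zipf_coverage(k, 1_000_000), 3))
```

Under this assumed model, each additional slice of coverage requires learning many times more phrases than the slice before it, so more "where is the bathroom?" data adds little for the rare phrases in the tail.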

------
nsns
The idea that one can translate words without understanding what is being
said seems rather similar to the idea that one can infer the meaning of words
from their constituent letters. Not only does meaning work simultaneously on so
many technically inaccessible levels (context, expectations, allusions,
previous experiences, etc.), but anything mildly interesting (not to mention
proper literature) is based by definition on a _deviation_ from established
meanings.

~~~
lisivka
So, we need to translate text to an artificial language, which is well
understood by both humans and AI, and then render the same meaning in the native
language of the listener.

The Slovakian or Ukrainian language should be a good base for that artificial
language, because they are easy to parse and are well developed, with a large
enough dictionary to cover most cases. Moreover, Ukrainians are very cheap
right now, because of war with Russia.

------
xiaoma
> _You could host a dinner party with eight people at the table speaking eight
> different languages, and the voice in your ear will always be whispering the
> one language you want to hear._

And at this hypothetical dinner party, what would the earpieces whisper in
English (or most other languages) when the Japanese guests said いただきます?

Some words just don't have a translation in other languages.

[https://www.google.com/search?q=いただきます&tbm=isch](https://www.google.com/search?q=いただきます&tbm=isch)

~~~
jcranmer
«Bon appétit» would be an adequate translation for most English speakers (even
though it's French). A more "pure" English translation would probably be a
grace of some kind, which serves more or less the same purpose (although I can
also see some atheists bristling at a grace where they wouldn't bristle at
いただきます or bon appétit).

~~~
xiaoma
Not really. My Japanese roommate would say いただきます _when by himself at his
desk_ before eating a candy bar. It would be surprising if even one English
speaker in the world regularly says "Bon appétit" in that situation.

The question isn't, "How do you say いただきます in English." The better question
is, "Do English speakers have a set phrase they say right before eating." And
the answer is no they don't.

~~~
kazinator
In context, the question is in fact what should come out of the English-
speaking listener's earpiece to explain the "itadakimasu" utterance from the
Japanese speaker, whether or not the English speaker has a custom of uttering
anything before any meal or snack.

Suppose English speakers _did_ always say "Bon appétit", even when eating
alone and just a snack. It's still not the meaning of "itadakimasu". Wishing
for gusto in eating is not the same thing as humbly receiving the meal/snack.

------
personlurking
The benefits to this kind of thing are being able to be very specific and go
deeper into the subject matter with one's second or third languages. Plus I
might argue that smaller languages won't need to die (if everyone can get
along and do business in their own tongue). The drawback is that perhaps
people who learn languages in the near future will seem like those who study
Latin ever since it became a "dead" language.

~~~
Piskvorrr
Note the qualification of "small- _er_ " languages: you are implying
"languages which are still large enough that a translational layer is
_economically_ feasible". Small languages (let me pull a number out of thin air
and assume <20 million speakers) would still be doomed. Guess what: that means
English, French, Spanish, Italian, Russian and Polish for Europe, everyone
else kindly GTFO, we don't have resources for implementing your languages.

~~~
personlurking
Sorry, I don't quite follow your intended meaning, but I will try to expand on
my initial comment below.

If it no longer matters what language someone speaks, a language with fewer
speakers than others wouldn't need to become extinct, as will happen without
tech-aided interpretation. Of course, the viability of a language isn't solely
connected to the ability for a native speaker to do commerce in it, but I
would imagine it's nonetheless an important factor.

From Wikipedia's article 'Lists of endangered languages':

"While there are somewhere around six or seven thousand languages on Earth
today, about half of them have fewer than about 3,000 speakers. Experts
predict that even in a conservative scenario, about half of today's languages
will become extinct within the next fifty to one hundred years."

I would hope that even languages with 3,000 speakers could be 'saved', in the
scenario I propose (assuming a Wikipedic-style worldwide effort - in
conjunction with the tech sector, governments and linguists - to get as many
languages as possible implemented).

~~~
Piskvorrr
Well, that's the "let's assume unlimited resources (inc. time), suddenly
nothing is a major problem" scenario.

The actual scenario tends to be "but you have a choice of all these 20
languages; are you just too lazy to learn a _proper_ language that's on the list?"
My native language has about 10 million users, so I get this a lot (and
consequently, most of _my_ digital UX is in English - significantly less
hassle for me).

------
KhalilK
I speak Arabic, French, English and Spanish; the last 3 are "easier" to
translate interchangeably. Good luck breaking the Arabic barrier, each Arab
nation has its own dialect and hardly anyone speaks "traditional" Arabic
anymore. The language itself is very rich and dense, even machine text
translation has a long way to go, let alone real-time voice translation. This
article is fiction at best.

------
joakleaf
I thought this was about how the entire world is getting to a point where they
are able to speak and understand basic English.

There are certainly countries and population groups that do not speak English
well, but it has changed significantly in the last 10 years.

I wonder if we will get there before 100% accurate automatic translation.

~~~
Piskvorrr
The entire world*, sure.

(the part of it inside my social bubble; smartphone and internet connection
strongly advised, other terms and conditions apply, void where prohibited.)

------
foobarbecue
This "real time whispering earpiece" sounds an awful lot like a bluetooth
headset...

