
Only 4% of languages are used online - taofu
http://priceonomics.com/only-4-of-languages-are-used-online/
======
w1ntermute
And what percentage of people in the world uses one of the languages in that
4% as their primary language? I'm guessing it's a really large percentage,
probably a majority of the world's population. It's not really useful to look
at the _percent of languages_ when they vary in the number of speakers by so
much (and there's a long tail of languages with a vanishingly small number of
speakers).

~~~
vilhelm_s
Indeed. To get a rough idea I added up the first table in
[https://en.wikipedia.org/wiki/List_of_languages_by_number_of...](https://en.wikipedia.org/wiki/List_of_languages_by_number_of_native_speakers)
, and it seems that about 85% of the world population are native speakers of one
of the top-100 (top 1.6%) languages.

~~~
autarch
It's not clear from the Wikipedia list if the language categories are mutually
exclusive. For example, it seems quite likely that nearly everyone who speaks
the Wu, Jin, and Min Nan Chinese family languages also speaks Mandarin.

I'm not even sure what it means to say that Min Nan is the native language
for anyone in Taiwan. Sure, lots of people learn it at home but anyone under
60 is going to be just as fluent in Mandarin.

~~~
elliotlai
The first language I learned is Min Nan. Though my Min Nan is not as fluent as
my Mandarin now, I still consider it my mother tongue.

------
diziet
We parsed a lot of the data from app reviews (about 100 million) in different
countries. Languages used are predominantly skewed toward English or other
European languages. For example, in Israel, 51% of all reviews are in English
while only 46% are in Hebrew.

Writeup on major countries:
[http://blog.sensortower.com/blog/2013/11/27/what-apple-app-s...](http://blog.sensortower.com/blog/2013/11/27/what-apple-app-store-reviews-can-tell-you-about-foreign-language-app-users/)

~~~
minimax
That's pretty cool analysis. Do you think the high proportion of English
relative to other non-native languages could be related to the general
affluence of Apple users? It would be interesting to see how these ratios
stack up relative to the languages used in reviews on some of the Android app
stores in their respective countries. Thanks for sharing.

~~~
diziet
I suspect smartphone users in general skew affluent, and so skew toward those
who can speak/write English. Also, most apps are not localized
to native languages, and so the people that use apps are exposed to English
more often. It's much harder to get reviews on Android, especially for
different countries.

------
melling
I guess the title "Only 240 languages are used online" doesn't sound like a
problem? That probably represents 5 or 6 of 7 billion people?

I think the world should move towards even fewer languages than 240. However,
it would be very sad if we didn't capture the complete grammar, vocabulary,
etc. of those other languages before they die, along with lots of audio, to
preserve them.

~~~
CaveTech
You're promoting both the killing and preserving of languages in the same
paragraph, which seems a bit misguided. Surely if they're worth documenting,
then it's worth trying to preserve their use as well.

Surely the world should be moving to a place where it doesn't matter what
language the source is, at least online. A fully competent Google translate
would seamlessly allow everyone to communicate without destroying culture in
the process.

~~~
nitrogen
Why promote the use of obsolete technology (including language), when it can
be satisfactorily preserved in museums?

~~~
coldtea
Because "satisfactorily preserved" and "museum" are contradictory for a
language.

A spoken language is a living thing -- and it carries with it the patina of
all those that have used it, their stories and how they shaped the language in
return etc.

Relegating it to some museum exhibits (that no one will care about, let's be
honest) is like being satisfied by a polaroid of your significant other,
instead of him/her.

~~~
pbhjpbhj
We're not talking about artifacts though; we're talking about people.

Yes it would be better to be able to observe and use a language within a
native, living body of people in order to be able to well comprehend and
thoroughly appreciate the nuance, breadth and depth of a language. But in
order to do that you also must _require_ that population not to use some other
language which may well benefit them (in terms of education, trade,
international relations, access to information, etc.).

In Wales for example the historic Welsh language is being propped up at vast
expense to a relatively poor region of the UK. All lessons¹ in [state run]
primary schools (4-11 year olds) are used to promote the historic Welsh
language and children are taught that their own language - predominantly
English - is to be deprecated in preference to Welsh.

None of that serves the children. Without government interference, if trends
continued, barely a handful of people of any age would speak historic Welsh.
As it is now - made compulsory for all children in state schools - use of
historic Welsh language is still declining.

That would indeed be a loss to those interested in Brythonic languages or who
mourn for the simpler times of their youth when they spoke historic Welsh or
who despise the English people so fervently that using "their" language is
anathema [I'm not even joking], or whatever. But, would it be any loss for the
people living in Wales as a whole? I warrant not. Indeed it seems it would be
a gain.

\---

¹ PISA results for 2012,
[http://priceonomics.com/only-4-of-languages-are-used-online/](http://priceonomics.com/only-4-of-languages-are-used-online/),
are just in and show Wales is falling behind the rest of the UK and that the
UK is slipping relative to the other countries surveyed. I think an insidious
over-emphasis on historic Welsh is at least in part to blame.

------
jliechti1
I'd bet a large portion of these languages don't even have official written
languages. Some of the Chinese languages have tens of millions of native
speakers and even they have no standardized way of writing their languages.
When you start getting into languages that are specific to a tribe or very
local area, the odds seem pretty low.

------
M4v3R
The www.jw.org website (Jehovah's Witnesses website) is translated into 316
languages at the moment, which is on par with the article's claim (170 actively
used + 140 borderline cases). You can disagree with the site's message, but it
tries to do a good job of sending the message to people all around the world in
different languages. In comparison though, JW publications are printed in 600+
languages, so there's still a lot to do for the website.

Edit: Going into the "publications" section, there's a picker which lets you
select a language from a total of 538.

------
bowlofpetunias
I hope the research cited is better, because the article makes a complete hash
of mixing interface language, script and language actually used to
communicate. All of these are thrown together as "use of language on digital
stuff".

It's quite common around the world for people to use an interface in language X
(because the options are limited) but communicate in their own language. (I personally
do it all the time, so much so that I prefer an English interface over one in
my own language for the sake of consistency.)

Lack of support for certain kinds of script may be an issue, but that never
becomes clear from the article. If the data is heavily based on the formal
support for languages (instead of actual usage) then it's seriously skewed.
And even if it isn't, the article doesn't tell us where the actual problem is.

------
adamnemecek
I can't decide whether this is a good thing or a bad thing.

~~~
TrainedMonkey
That depends on whether you are globalist or nationalist.

Globalist - good, fewer languages means fewer barriers. Good for international
commerce and knowledge sharing.

Nationalist - bad, loss of language means some part of cultural heritage is lost.

Truth is always somewhere in between.

~~~
PavlovsCat
Loss of a language also often means a slightly unique way of seeing the world
and thinking/talking about it is lost. Many languages have cool or weird
features [1], and I care about that even though I can't communicate with
speakers of those languages and therefore will never know the difference. I
don't care about the national heritage of others, it's theirs - I care about
diversity for the sake of it existing, not for the sake of me experiencing it.
If we all thought the same stuff in the same ways, without any barriers, that
would be much worse than having disconnected pockets of diversity, IMHO.

[1]
[https://www.youtube.com/watch?v=QYlVJlmjLEc](https://www.youtube.com/watch?v=QYlVJlmjLEc)

~~~
stan_rogers
That way lies the Sapir-Whorf hypothesis, and madness. The value isn't about
(slightly) unique ways of thinking/talking _about_ things, but (slightly)
unique ways of expressing those thoughts. That is, without at least a good
description of as many of the various grammars that exist, we lose a lot of
data that can help us to understand language as a whole. F'rinstance, if
languages like Hixkaryana had died out before being described, then the
prevailing belief that OVS was an "impossible" regular order (as opposed to a
special construction) would likely have become "truth", and the possibilities
only explored in artificial languages like Klingon. Losing that data means
that we are losing the ability to make inferences as to how language works at
the most fundamental level in the human brain.

On the other hand, it is rather inconvenient (from the perspective of a great
global society) to have as many "pure" languages on the ground as we have (and
have had). But while linguae francae bridge gaps across culture, full-time
bilingualism and multilingualism tend to cross-fertilize grammars (and, of
course, vocabularies), which has the effect of eroding the more interesting
(and the most difficult) aspects of grammars in the smaller languages.

~~~
PavlovsCat
> That way lies the Sapir-Whorf hypothesis

What's wrong with it? I'm not familiar with it, but Wikipedia says " _Current
researchers such as Lera Boroditsky, John A. Lucy and Stephen C. Levinson
believe that language influences thought, but in more limited ways than the
broadest early claims_ ".

> The value isn't about (slightly) unique ways of thinking/talking about
> things, but (slightly) unique ways of expressing those thoughts.

How is talking not expressing thoughts? Anyway.. Then you go on to make good
points, but I don't see what makes you think they're in contradiction to
anything I said. I was not talking about " _the_ value", someone else claimed
that _everybody_ who cares about language diversity must be a nationalist, and
that being a "globalist" means welcoming putting all eggs in one basket. I
completely disagree with that, but for my nilly-willy reasons. If you have
other reasons that's fine, they're good reasons, but to say that my view is
vaguely wrong, because there lies madness for reasons you don't care to go
into, well.

I personally do not care as much about what is accessible to "us" as about
what is accessible to itself, whether I am aware of it or not. If there was a
button to exterminate all parts of the universe we will never have access to
and then forget that happened, I wouldn't press it, even if you bribed me, I
hope. And again, remember this is in response to someone saying the only
reason someone would care is X; I'm not saying my opinion is correct or
perfect in any way, I'm just saying it's not X, cuz it's not :P

~~~
stan_rogers
Just by way of explanation, B L Whorf's hypothesis (a considerable expansion
of Sapir's original opinion) was that vocabulary, and more importantly,
grammar, guided (and was guided by) modes of thought. A particularly famous
example was the claim that Hopi has no past or future tense, which led its
speakers to conceive of time as cyclical in nature, as opposed to the
almost-neurotic need of Europeans to keep historical records and plan
everything out. Quite
apart from the fact that his analysis of the Hopi language was hideously
inadequate (it is as tense-filled as most European languages, lacking only the
explicit present progressive that is a peculiar feature of English and Celtic
languages), he seems to have managed to miss the fact that the culture had
developed a sophisticated set of timekeeping technologies as well. Much of the
follow-on research ("language as culture" is a tasty morsel for anyone bent on
believing that language is nothing more than learned behaviour) can be shown
to use specific meanings gathered from informants for words that can be used
to express much more general concepts. The problem is that the researchers
failed to ask if the words could be used in another way.

The provable aspects revolve mostly around concepts like the colour "grue".
That is, people whose languages lack separate words for blue and green, using
instead a single word for both that could be rendered in English as "grue",
take longer to distinguish between blue and green. It's not that they can't
see the difference, but they initially file blue things and green things under
the same heading, and only go on to distinguish between them when it's
necessary. We have the same problem in English with yellow -- only people
who've spent enough time in the world of colour would be likely to immediately
distinguish warmer/redder yellows (like cadmium yellow) from colder/greener
ones (like lemon yellow). And they learn that (usually) by figuring out that
the two are not interchangeable when mixing colours -- both yellow and blue
done wrongly and yellow and red done wrongly result in a muddy brown rather
than the expected green and orange. It doesn't affect our perception, just our
initial means of filing things, and it means that some circumlocution is
needed when describing a particular yellow to somebody who doesn't have the
technical vocabulary.

We also don't have a single word for "someone who always feels cold" as the
French do, but we all know a few _frileux_ even if we don't have a word for
it, and are aware of that knowledge. We have dropped _hither_, _thither_ and
_whither_, but are still aware of the difference between direction and
location when we use _here_, _there_ and _where_. A large number of the
world's languages don't have sex-linked gender, but their speakers are
perfectly capable of telling the difference between men and women.

The only place where the Sapir-Whorf hypothesis really holds any water is in
the realm of computer languages (and similar ordering-machines-around
environments). Sometimes it _is_ impossible to express a thought/instruction
even when the machine is perfectly capable of carrying it out. Sometimes it's
just so far away from idiomatic use of the language that it's better from any
perspective to find a suitable paraphrase that _is_ idiomatic. But these are
toy languages designed to accomplish specific goals, not general-purpose
languages that anybody has to _live in_. (Try having a relationship in Lisp
and see how far you get.) In the human world, the ability or inability to
express a thought has a lot more to do with experience and culture than with
any limitations that the language you speak may impose on you.

~~~
PavlovsCat
> the provable aspects revolve mostly around concepts like the colour "grue".
> That is, people whose languages lack separate words for blue and green,
> using instead a single word for both that could be rendered in English as
> "grue", take longer to distinguish between blue and green.

What about languages that require the speaker to keep track of their spatial
orientation?

And how come thinking and talking in English and German are both slightly
different for me? Those languages aren't even far apart, but in my mind one
can't replace the other.

> In the human world, the ability or inability to express a thought has a lot
> more to do with experience and culture than with any limitations that the
> language you speak may impose on you.

And again with the strawmen. Where did I claim such radical difference as
being completely unable to express a specific thought, or seeing the world
completely differently? Nowhere.

There's one big difference though, jokes. You can make some jokes in some
languages, but they just don't work in others.

~~~
stan_rogers
That is, though, precisely what the Sapir-Whorf hypothesis claims -- I never
said _you_ were claiming anything of the sort, merely that the hypothesis did.
Linguistics, in general, is a fascinating area of study (though if you are
good at spotting straw men where none exist at all, you'll find yourself
twitching a lot).

------
Jun8
"A language’s Wikipedia presence was one of the most important indicators of
its ability to leap into the digital age."

Which made me curious to search for and find Vicipaedia
[http://la.wikipedia.org/wiki/Vicipaedia:Pagina_prima](http://la.wikipedia.org/wiki/Vicipaedia:Pagina_prima),
it's in the 10k+ article category.

------
wallnerm
I would rather focus on the many opportunities technology and the internet
offer to preserve those 6000+ languages that are currently not being used
online. It's not very bold to say that technology will prevent them from
disappearing completely, unlike many languages in history that were not
preserved.

------
elwell
That statistic is somewhat misleading. Or, at least, not very meaningful. A
more meaningful statistic would acknowledge the different "weights" that
certain languages carry (i.e., "how many people use it").
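
A minimal sketch of that weighting idea, assuming made-up speaker counts (the
languages and numbers below are invented for illustration, not real data). It
compares the share of languages used online with the share of speakers whose
language is used online:

```python
# Hypothetical toy data: (language, native speakers in millions, used online?).
# These figures are invented for illustration only.
LANGUAGES = [
    ("A", 900, True),
    ("B", 400, True),
    ("C", 50, False),
    ("D", 5, False),
    ("E", 1, False),
]


def share_of_languages_online(languages):
    """Unweighted: fraction of languages that are used online."""
    online = sum(1 for _, _, used in languages if used)
    return online / len(languages)


def share_of_speakers_online(languages):
    """Weighted: fraction of speakers whose language is used online."""
    total = sum(speakers for _, speakers, _ in languages)
    covered = sum(speakers for _, speakers, used in languages if used)
    return covered / total


if __name__ == "__main__":
    print(f"languages online: {share_of_languages_online(LANGUAGES):.0%}")
    print(f"speakers covered: {share_of_speakers_online(LANGUAGES):.0%}")
```

With this toy data only 2 of 5 languages are online, yet they account for
1300 of 1356 million speakers, so the two numbers tell very different stories.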

------
bitops
These types of discussion about language always remind me of
[http://www.economist.com/node/15108609](http://www.economist.com/node/15108609)
which I find a fascinating read.

------
ars
I don't think Wikipedia is such a great way to judge a language. A better
measure would be things like chat messages, but they are harder to analyze.

------
noarchy
As someone who lives in an area where language is extremely political, and
government is already involved in language-related legislation in day-to-day
life, I worry that this is a potential vector for more government intrusion on
the Internet. The excuses are practically built-in, from the perspective of
the nationalists.

~~~
hrkristian
Let's hope your countrymen don't take to the internet with their own language
where it isn't appropriate.

------
Anon84
And here is a map of where they are used on Twitter

[http://www.ccs.neu.edu/home/qianz/MapTwitterLanguage/v1/inde...](http://www.ccs.neu.edu/home/qianz/MapTwitterLanguage/v1/index.html)

------
CmonDev
Useless things die out - it's a natural process.

~~~
pbhjpbhj
Things which aren't useless also die out.

~~~
PeterisP
Well, for languages the main way of dying out is when ~2 generations of their
population decide that this language is useless for them and the vast majority
choose not to use it anymore.

There are exceptions like genocide and forced assimilation of conquered
nations; but these refer to extermination of languages/cultures, not the much
more widespread issue of them simply dying out through disuse and voluntary
assimilation.

------
NatCrodo
I wouldn't wonder why it is only 4%. There are thousands and thousands of
languages all over the world.

------
knowitall
Are most languages really a big loss? I suppose there are some interesting
variations in grammar, but what else? What can we learn from some obscure
languages?

------
kimonos
I really agree with this.

