
Identify Any Language Written in the Roman Alphabet at a Glance - tokenadult
https://theweek.com/articles/617776/how-identify-language-glance
======
niccaluim
_…both use accents on vowels, but only Scots uses grave (left-pointing)
accents, like on à in Gàidhlig._

Just a quick note of caution here: "Scots" and "Scots Gaelic" are two
completely different languages, the former being a Germanic language closely
related to English and spoken in the Lowlands, and the latter being a Celtic
tongue largely confined to the Highlands, the Western Isles, and Nova Scotia.
If you can read English you can probably make some vague sense of written
Scots, but unless you have training there's no way you'd understand a word of
Scottish Gaelic. This article is referring to Gaelic, not Scots.

------
brendyn
Some of my Chinese friends use の instead of the Chinese equivalent 的 just for
fun. Personally I distinguish them by just going and learning the languages.
It's easy to distinguish them by noticing that Japanese has curvy characters
mixed with blocky, complicated Chinese ones, whereas Chinese is 100%
complicated blocky characters.

Also I just want to put it out there randomly that if you want to learn a
language but believe you can't, you are almost certainly wrong. If you are
able to read this text, you have demonstrated possession of a wet sack of
neurons capable of learning a second language. I've witnessed or read about
all sorts of people learning a new language: old people, shy people, autistic
people, even people dealing with brain cancer. It is a myth to think children
can learn faster than adults. The only time this happens is when the adult is
hindered by their own reluctance. Go get some beginner materials with audio,
not just
text, and dive in. Don't waste time torrenting 12312^23 TiB of learning
materials. Glossika, Teach Yourself X, Xpod, Xclass101, whatever. Learn how to
ask some trivial questions relevant to your life, write them down because you
will forget, then go chat with a native speaker somehow. Read out your
questions because you're nervous and forgot, and then fail to understand
anything they say, but just pay attention and listen to the sounds of the
language. Then go home and learn a bit more, but don't worry too much about
memorising anything, just listen, and comprehend a little bit. Then meet up
with a native speaker again with some more questions prepared. This time you
might understand 0, 1, or a few words, still be nervous, but you'll be a
little bit better than last time. Basically you just keep this up without
giving up and you will pick up pace. For inspiration, look at blogs like Benny
Lewis' and others. It may take 10000 hours to master, but it only takes
hundreds of hours to find yourself understanding and contributing to group
conversations comfortably. If you can enjoy the process, you'll be able to
study for N hours for any N as the clock keeps ticking regardless. Just set
N=1000.

~~~
adrianN
For some languages a couple hundred hours might be enough to participate
meaningfully in a normal conversation. But if you're for example an English
native speaker and want to learn Chinese, you'll have a really hard time
understanding anything but very simple sentences.

On the other hand if you know English and German well, it is very easy to
learn for example Dutch, and a couple hundred hours will get you really far.

~~~
restalis
You're talking about language overlap being relevant to the amount of
time and effort necessary to get results. That is true, of course, and it is
often thanks to the (shared and) already possessed mental models necessary to
master the new language. The most important bit of this mental model of a
given language is the way speakers phrase their thoughts. This is exactly what
you get right by following brendyn's advice of taking it slow. In time
you'll sense and reproduce the natural expressions. (This advice is especially
valuable for learning English, BTW! Trying to make sense out of the compound
verbs is a poor strategy, therefore just let them sink in slowly in your mind,
each within an appropriate context.)

------
Nadya
Additional bonus: Korean uses lots of basic angles, squares, and lines, and
many Hangul have "three parts", e.g. 한국어의. It's a beautifully simple writing
system.

You can go quite a long comment chain in Japanese without seeing の. I always
tell people "look for lots of simple characters that can be written with only
2 or 3 lines mixed in between a bunch of really complex characters".

    
    
        来週は学校に行きません
    

For those who don't know Japanese, only following my rule, can you identify
the Japanese characters in the sentence above?

Except with names, Japanese writing will have a bunch of Chinese characters
with many "simple" Japanese characters sprinkled in between. To someone who
doesn't read Japanese, both are "unintelligible" but I find people can
identify the more complex Kanji from the more simple Kana quite easily and can
point them out with pretty good accuracy, even if they can't read any of the
kana.

The above example without Japanese characters if you'd like to see if you
guessed correctly:

    
    
        来週学校行
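
For anyone who wants to try the "simple vs. complex characters" rule mechanically, here is a rough Python sketch. It assumes the usual Unicode block boundaries (Hiragana U+3040 to U+309F, Katakana U+30A0 to U+30FF, CJK Unified Ideographs U+4E00 to U+9FFF) and ignores rarer extension blocks. The function names are made up for illustration.

```python
def script_of(ch):
    """Classify one character by Unicode block (rough ranges only)."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:
        return "kanji"  # CJK Unified Ideographs, shared with Chinese
    return "other"


def split_kana_kanji(text):
    """Return (kanji, kana) as two strings, preserving order."""
    kanji = "".join(c for c in text if script_of(c) == "kanji")
    kana = "".join(c for c in text if script_of(c) in ("hiragana", "katakana"))
    return kanji, kana


def looks_japanese(text):
    """Any kana at all is a hint (not proof) that the text is Japanese."""
    return any(script_of(c) in ("hiragana", "katakana") for c in text)


kanji, kana = split_kana_kanji("来週は学校に行きません")
print(kanji)  # 来週学校行
print(kana)   # はにきません
```

Treat `looks_japanese` as a heuristic only: a stray kana in otherwise Chinese text would fool it.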

~~~
umanwizard
Another quirk about the の thing is that some Chinese-speaking people, in
Taiwan at least (not sure about the mainland) use の as a colloquial
replacement for 的, presumably due to Japanese cultural influence. So it's not
completely impossible to see の in a Chinese-language text.

(They still pronounce it "de", only the writing is different.)

~~~
cbd1984
Friendlier link:
[https://en.wikipedia.org/wiki/Martian_language](https://en.wikipedia.org/wiki/Martian_language)

------
schoen
Wikipedia has a more elaborate and detailed guide to this

[https://en.wikipedia.org/wiki/Wikipedia:Language_recognition...](https://en.wikipedia.org/wiki/Wikipedia:Language_recognition_chart)

but this article's approach is actually pretty handy and its tips are very
practical.

------
grondilu
Only tangentially related, but why can't spelling check software automatically
figure out which language I'm typing in? I write mostly in English but
sometimes I write in French or in a mix of both languages and I usually
struggle with the spelling corrector which keeps bothering me.

~~~
hirsin
SwiftKey (Android keyboard) does an amazing job of this. I routinely switch
between French and English, and usually by the end of the first word in the
other language my autocorrect and suggestions are both in the right language.

~~~
slazaro
AFAIK it uses Markov chains, so I think it just lumps all dictionaries
together, and as soon as you write a couple of words, the probabilities for
the following words will be in the proper language. It doesn't even need
specific rules per language, it's all automatic.
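
To make the "lump everything together" idea concrete, here is a toy word-level Markov chain in Python. This is only a sketch of the general technique, not how SwiftKey actually works (I have no idea what it does internally). The corpus, `train`, and `suggest` are all invented for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up training text: English and French lumped together, no labels.
CORPUS = [
    "i am going to the shop",
    "i am happy to see you",
    "je suis content de te voir",
    "je suis en train de manger",
]


def train(sentences):
    """For each word, count which words follow it (first-order Markov chain)."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            following[prev][nxt] += 1
    return following


def suggest(model, prev_word, k=3):
    """Return up to k of the most frequent successors of prev_word."""
    successors = model.get(prev_word, Counter())
    return [word for word, _ in successors.most_common(k)]


model = train(CORPUS)
print(suggest(model, "je"))  # ['suis']
print(suggest(model, "i"))   # ['am']
```

Nothing in the model knows which language a word belongs to. Suggestions after "je" come out French simply because only French words ever followed it in the training text.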

~~~
patates
That is my guess as well but, whatever it is, it works marvelously. Many
times, I write a significant portion of my message _just_ through suggestions.

~~~
eru
That's a funny game to play with your friends (if they also have personalized
suggestions). Just see what you can write using only your suggestions.

------
mdturnerphys
Reminded me of this quiz:
[http://www.nicholaswhyte.info/34l/default.htm](http://www.nicholaswhyte.info/34l/default.htm)

------
sdfjkl
Đ/đ may also be Croatian, where it sounds like a "dj". Technically it could
also be Serbian (which is pretty much the same spoken language, called
Serbo-Croatian), but Serbian is usually written using the Cyrillic alphabet
while Croats chose Roman letters.

------
to3m
This is all very helpful, but how do you spot English?

~~~
aylons
No diacritics, and generous use of apostrophes.

~~~
yohoho22
> No diacritics

An understandably naïve view of things whose errancy could easily be corrected
through your coöperative perusal of The New Yorker.

~~~
eru
Wouldn't it be cöoperative?

~~~
tremon
It wouldn't be. The goal is to separate the second o from the first, not to
separate the o from the c:

 _The diaeresis indicates that a vowel should be pronounced apart from the
letter that precedes it [1]_

[1]
[https://en.wikipedia.org/wiki/Diaeresis_%28diacritic%29#Diae...](https://en.wikipedia.org/wiki/Diaeresis_%28diacritic%29#Diaeresis)

~~~
eru
Oh, true. Naive should have tipped me off.

------
sharpercoder
> Dutch, German, and Afrikaans: Of these three close relations to English, only
> German uses Ä/ä, Ö/ö, and Ü/ü.

Incorrect: Dutch uses the _trema_ on the i, e, o, u, a as well. Examples:
reünie, knieën, ruïne, Aäron, zoöloog.

~~~
pge
And anal English speakers (such as the copy editors of the New Yorker) also
use the diaeresis (the double dot over a vowel). In English, it is used over
the second of two vowels in a row which are voiced separately, such as naive
(should have the double dot over the i) or cooperate (double dot over the
second o). The distinction is between a word like coop (meaning a house for
chickens) in which the two vowels make one sound and a word like cooperation,
in which the two o's are separate sounds.

------
patrickburke
This reminds me of this fine map, "List of writing systems"
[https://upload.wikimedia.org/wikipedia/commons/a/aa/World_al...](https://upload.wikimedia.org/wikipedia/commons/a/aa/World_alphabets_%26_writing_systems.svg)

------
patates
Nitpick: In Turkish, "ğ" is silent by itself but it makes the pronunciation of
the vowel before itself longer and sometimes makes the pronunciation end at
the back of the mouth, especially after "e". "Erdoğan" is indeed just
"Erdooan" though.

------
michalskop
Correction to the article: There is no Ů in Czech (except for CAPS LOCKED
words) - The longer "u" is written as "Ú/ú" as the first letter in a word, and
"ů" in other positions (strange for sure, but because of historical reasons)

~~~
bonzini
It's not just historical. "Ů" is phonologically a longer "u", but it is
actually an alternation [1] of "o": for example see how nominative "dům"
becomes "domu" in the genitive (likewise "stůl", "bůh", etc.). When a "u"
becomes longer, instead, it becomes a "ú" as the first letter of the word, but
otherwise it becomes "ou"; for example see how the feminine nominative of
(some) nouns and adjectives is "a" and "á" respectively, while the accusative
is "u" and "ou", or how the perfective companion of "kupovat" is "koupit".

(Also, see how I sneaked in a "Ů" in the second sentence :)).

[1]
[https://en.wikipedia.org/wiki/Alternation_%28linguistics%29](https://en.wikipedia.org/wiki/Alternation_%28linguistics%29)

------
gazrogers
> Welsh is actually quite different from the other two. It uses lots of ll and
> ff and it uses w as a vowel (e.g., cwm).

Welsh also uses a circumflex accent to extend any of the vowels, and since
both 'w' and 'y' are vowels in Welsh (leading to many jokes by English
speakers about words with no vowels) they can have the circumflex accent too.
I've had problems in the past finding the alt-codes to generate w or y with a
circumflex accent - so those may be unique to Welsh.

From
[http://symbolcodes.tlt.psu.edu/bylanguage/welsh.html](http://symbolcodes.tlt.psu.edu/bylanguage/welsh.html):

> Because of the writing system, Welsh places accents on the letters w
> (phonetic /u/) and y (phonetic /ɨ/ or /i/), which is very unique in languages
> of the world. These symbols require Unicode support apart from that of other
> Western European languages.
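
One way to read the "Unicode support" remark is that w and y with a circumflex sit outside Latin-1, unlike â or ê. A small Python check using only the standard `unicodedata` module (the helper name `fits_latin1` is made up for this sketch):

```python
import unicodedata

# W and Y with circumflex, capital and small (escapes avoid copy/paste issues).
for ch in "\u0174\u0175\u0176\u0177":
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")


def fits_latin1(text):
    """True if every character can be encoded in ISO-8859-1 (Latin-1)."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


print(fits_latin1("\u00e2"))  # a with circumflex: True
print(fits_latin1("\u0175"))  # w with circumflex: False

# Typing a plain "w" plus a combining circumflex and normalizing to NFC
# gives the precomposed letter, which is one workaround when no alt-code exists.
composed = unicodedata.normalize("NFC", "w\u0302")
print(composed == "\u0175")  # True
```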

------
LanceH
Persian will have three dots in a triangle above a single upward stroke or
below the line. Arabic only has the three dot combo above the script on a
multiple upward stroke grouping (sometimes a flat line between upstrokes).

~~~
saadat
The same is true for Urdu as well, so if you want to distinguish Urdu from
Persian: look for a backward moving (i.e. towards the right) horizontal stroke
at the end of a word. This stroke will always run under the preceding letters
of the word, except that some dots of the preceding letters may be moved
beneath the stroke in order to avoid collision.

------
hiphipjorge
I try doing this a lot when listening to people on the street and think I'm
pretty good at it... Of course, I never truly know unless I ask!

------
Grue3
Nothing on Filipino/Indonesian languages? Those always confuse me, since the
users also heavily mix them with English, so you might see a comment mostly in
English but also have a bunch of native words or phrases mixed in.

------
varjag
> You can sometimes tell Danish from Norwegian because Danish sometimes uses
> aa (as in Kierkegaard) instead of å.

That goes both ways (e.g. Haakonsvern in Norway), so no, you really can't tell
them apart that way.

------
vansteen
I like Hacker News for that. The topic of this article is interesting. Thanks
for bringing that up. However, when you read the comments here, you realise
the article is quite wrong :)

------
beyondcompute
Ħħ - Maltese

------
wibr
ß for German!

~~~
atomwaffel
Yes, if you spot ß, that's a dead giveaway for German. You can't really rely
on it alone for identification however because it's not that frequent (or
rather, it's very inconsistent – German can run for paragraphs without a
single ß only to make up for it with five of them in a single sentence). It's
also not used at all in Swiss German.

Another near-certain giveaway is that all nouns in German are capitalised. The
only other language that does that and uses the Latin alphabet is
Luxembourgish, and you're probably not looking at that.
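
The "dead giveaway" letters mentioned around this thread can be collected into a small lookup. This Python sketch uses only hints from the comments here (ß, ħ, ů, ğ, ŵ/ŷ). The table is deliberately incomplete, and the language names are hints rather than proof. `HINTS` and `language_hints` are made-up names.

```python
import unicodedata

# Hints taken from comments in this thread only; not a complete or
# authoritative table. Keys are lowercase, precomposed (NFC) letters.
HINTS = {
    "\u00df": "German",   # sharp s
    "\u0127": "Maltese",  # h with stroke
    "\u016f": "Czech",    # u with ring above
    "\u011f": "Turkish",  # g with breve
    "\u0175": "Welsh",    # w with circumflex
    "\u0177": "Welsh",    # y with circumflex
}


def language_hints(text):
    """Return the set of languages hinted at by giveaway letters in text."""
    normalized = unicodedata.normalize("NFC", text).lower()
    return {HINTS[ch] for ch in normalized if ch in HINTS}


print(language_hints("Stra\u00dfe"))    # {'German'}
print(language_hints("Erdo\u011fan"))   # {'Turkish'}
print(language_hints("hello"))          # set()
```

An empty result tells you nothing, as noted above: German text can run for paragraphs without a single ß.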

------
peterburkimsher
There is a character only used in Taiwanese, not Chinese: 互

By that, I mean the Taiwanese language, which is not the same as Mandarin
Chinese. Both languages are used in Taiwan, although Mandarin is the official
language of the (outgoing) KMT government. Taiwan number 1 ;-)

~~~
hawflakes
It's not true that only Taiwanese uses it. It means "mutual" or "each other"
and is used quite a bit.

互相 mutual 互聯網 internet

------
vansteen
French:

Often used: à è é

Used: â ä ê ë î ï ô û ù œ ç

Very very rare: æ ü ÿ

------
superbatfish
Crap, I upvoted this before I noticed which publication it's from. How do I
downvote it?

~~~
billforsternz
On a meta level I find that just a little troubling. It sounds to me like
"Crap, I agreed with this until I noticed it was an opinion from a tribe I
don't identify with - so I can't agree with it". Maybe theweek.com is some
uniquely evil thing I haven't heard about?

~~~
dragonwriter
"Upvote" means something different than "agree"; one of the things it means is
"I endorse people visiting this".

I can imagine quite a few sources that I wouldn't want to direct traffic to
even if they published something where I agree with the sentiment.

~~~
Karunamon
The idea of voting for a link for any reason other than its quality is
completely antithetical to a karmic voting system like the one used here.

~~~
dragonwriter
Yes, but "quality" is both vague and subjective; not only will different
people evaluate aspects of quality differently, different people will
legitimately have different views on what components "quality" of a link has.
I don't think it's unreasonable to consider the source as one factor in
overall quality (if nothing else as a proxy for things the rater is unable to
evaluate about the article in isolation.)

~~~
Karunamon
But why should things other than _the article directly linked to_ matter? Why
should it be acceptable to downvote an otherwise interesting and correct
article just because of the source?

That smacks of voting for ideological correctness over truth or
interestingness, a problem that otherwise intelligent people should be able to
look past. What makes this site meaningfully different from the front page of
Reddit if people will crap on an article because it comes from a source that
doesn't align with their politics?

~~~
dragonwriter
> Why should it be acceptable to downvote an otherwise interesting and correct
> article just because of the source?

"Correct" is often a probabilistic assessment, not something a potential
up/downvoter can determine absolutely.

The source is often an important input to that probabilistic assessment.

> That smacks of voting for ideological correctness over truth or
> interestingness

Different outlets of the same ideological bent (whether relatively neutral or
not) can have wildly different editorial standards which produce wildly
different reliability.

