Hacker News
Draft of new Latin-based Kazakh alphabet revealed (inform.kz)
62 points by mparramon 97 days ago | 51 comments



I was born in Uzbekistan. In the 1990s Uzbekistan moved its entire language to a Latin-based alphabet. Even as a small kid I knew it was just a stupid sign of nationalism.

Changing your official language to a new alphabet is a pretty expensive thing (especially when your country's economy has collapsed and inflation is extreme). You have to change all official signs and texts and reteach the whole population.

Another stupid thing Uzbekistan did was to publicly burn (literally) Soviet school books without having a good replacement. I remember looking at those burning Soviet math books in my school's backyard.

And it's all done instead of what? Instead of:

1. Teaching the population proper, fluent English as a second/third language along with Russian;

2. Building a competitive free-market economy without clan-type quasi-government monopolies run by top government officials' relatives;

3. Attracting and protecting Western investors.


Or did they simply want to finish what was started in 1920, but was halted by Russian nationalism?


It's worth noting that Russia (then USSR) was the one that introduced the Latin alphabet in 1927 in the first place. Before that, it was Arabic. So, arguably, if they wanted to fully reverse the effects of imperialism, no matter the cost, they should go for Arabic.

From a practical standpoint, though, Arabic doesn't make sense - it's more complicated to learn, vastly more different from the alphabets of all neighboring languages (starting with the whole right-to-left thing), and most other Turkic languages use Latin-based alphabets these days.

But if the choice of Latin is dictated by pragmatic considerations, then it's reasonable to compare it to Cyrillic on the same grounds.


It's worth noting that Latin was introduced by the Asian republics, not Russia or the USSR, which forced Cyrillic on all republics.


It's more complicated than that.

The original attempts to latinize (replacing Arabic) writing systems in those republics were, in fact, carried out by the USSR, and it was very much a top-down thing, not some local initiative. The USSR had a massive country-wide latinization campaign in the 1920s for basically all languages spoken anywhere in the country that didn't already use Cyrillic. Long-term plans were to latinize everything, including Russian itself (this one went as far as a final proposal by a committee in 1930).

https://en.wikipedia.org/wiki/Latinisation_in_the_Soviet_Uni...

For Turkic languages specifically:

https://en.wikipedia.org/wiki/Ya%C3%B1alif

And part of a broader ideological push:

https://en.wikipedia.org/wiki/Korenizatsiya

It makes perfect sense when you realize that the Soviet government, at that point, viewed itself as the first comer in a blaze of worldwide communist revolutions that would ultimately produce a single worldwide federated communist state. Per Marxist dogma about industrial workers being the vanguard of the revolution, the expectation was that the center of gravity would shift to Western Europe once revolutions happened there. Communist utopias of that time period generally assumed that such unification would result in adoption of some common language, and that Latin script would be used for that language. So adopting Latin for Soviet languages made sense in preparation.

When Stalin abandoned the concept of "world revolution" and switched instead to "socialism in one country", with a heavy dose of Russian nationalism underpinning the new ideology, all this was scrapped, and the (by then already established) Latin writing systems were replaced with Cyrillic ones, from roughly the mid-1930s onward. Those Cyrillic writing systems are the ones inherited by the Asian republics at the dissolution of the USSR.


Still seems rather anachronistic and stupid, to say the least. When India became independent, there was a similar nativist drive: you can see how many of the major cities were renamed and such. Luckily people realized there was no real benefit in getting rid of the English-based education that was already in place. Something for which I am personally immensely thankful.


One of the things I like about the Cyrillic alphabet is that it has symbols for diphthongs. Portuguese (my native tongue) fixes this with diacritics (melancia and distância: the diacritic wouldn't be necessary if the 'ia' in distância used a symbol to indicate it is a diphthong). French mixes vowels up in lots of different ways and English spelling is a mess.

Had the Latin alphabet symbols for diphthongs, things could be simpler. Mind you, I am not a linguist.


Technically Latin has had many symbols for diphthongs, but in practice only two became common: Æ and Œ. They are usually classified as ligatures, but that's not technically correct in this case.


> Had the Latin alphabet symbols for diphthongs, things could be simpler.

That is also challenging though because you run into a lot of cases where a syllable is a diphthong in one regional accent, but not another.

For example, in Southern US English, "ride" is often pronounced something like "rahd". In the former, the "i" is a diphthong, but in the latter, the "ah" is not.

Conversely, some vowels become diphthongs. A short "i" in Southern accents often gets pronounced more like "ee-uh". Imagine Foghorn Leghorn saying "pin" as "pi-uhn" and you've got the idea.


Here (Pittsburgh) the vowel in "down" or "town" is a monophthong instead of a diphthong like it is pretty much everywhere else.

I wonder if part of the reason for the diversity of regional accents in English is because pronunciation is more loosely encoded than in languages that have more vowel symbols or modifying marks, or strict phonetic pronunciation. Written English doesn't really provide many strong indicators that a particular pronunciation isn't standard.


It's a problem with phonetic spelling in general, not specifically with diphthongs.


Does this indicate a potential change of sphere of influence, like Turkish not choosing Arabic script and Vietnamese not choosing Chinese?


Very much. The Kazakh president pushes the Kazakh language and Kazakh culture and stresses the distinctions with those of Russia. It's also a practical modern choice though, as the Latin alphabet is more widely used worldwide and Kazakh is not a Slavic language so it gains little from being written in the same script as Russian.


"not a Slavic language so it gains little from being written in the same script as Russian"

So, by this thinking, what exactly does it gain linguistically from being written in the same script as Italian?


Precisely. Besides, the West Slavic and half of the South Slavic languages are written using Latin script. The choice of script was very much bound up with the matter of which civilization (and religion) the respective peoples chose to align with. At the time these scripts were being adopted, it meant choosing between Rome or Constantinople.

So, if anything, a switch to Latin script is a signal of choosing to align with the West, and perhaps technological convenience.


I think it is also a historical coincidence. Cyrillic is an original Slavic thing and its early variant, the Glagolitic script (https://en.wikipedia.org/wiki/Glagolitic_script), was actually created for West Slavic "customers". It was later replaced there by Latin script simply because the early Slavic states there lost their independence to Germanic states. Since early literacy was a rare thing reserved for small circles of clergy and state officials, and since the Slavic population had limited access to those, it was natural that it died out in favor of the Latin alphabet.


It gains international recognizability from being written, like Italian, in the world's most internationally used script.


I am not sure people around the world will start being crazy about all things Kazakh now that those are written with a Latin alphabet.

As far as popular culture goes, Borat is the only thing most people recognize Kazakhstan for (besides it being the No. 2 potassium producer in the world, of course) :)


Is this really of any benefit though? I mean, the Slavic script is pretty well supported by most modern browsers and phones etc. Seems like they are focusing on an utterly stupid problem.


It gains linguistically from using the same script as Turkish and other Turkic languages. These enjoy a considerable degree of mutual intelligibility (similar to, and in some cases closer than, Spanish and Italian, for example), so using a shared alphabet would help.

However, looking at the draft, they did their own thing, instead of taking e.g. https://en.wikipedia.org/wiki/Common_Turkic_Alphabet, or at least using something similar to it. In particular, they decided to use letter combinations instead of diacritics. Their mapping for regular Latin letters is also very different (e.g. U/Y/W).


Ask Mustafa Kemal; IIRC he put the Turks through a similar process (Arabic script to Latin alphabet) in the 1920s....


Most Slavic languages use the Latin alphabet and not script


This is clever:

> The scientists rejected the idea of introducing diacritical marks (glyphs added to a letter, or basic glyphs) as they suppose that because of rare use, the specific sounds of the Kazakh language can disappear.

I see nothing wrong in principle with diacritics (I use two languages that require them daily, in addition to English), but given the dire state of internationalization this makes it easier for foreign software to support Kazakh, and it keeps their entire alphabet within the ASCII range.
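
To make the ASCII point concrete, here is a minimal Python sketch. The two sample strings are made up for illustration and are not taken from the draft alphabet; the only claim is that a digraph-style spelling stays within 7-bit ASCII while a diacritic-style spelling does not.

```python
# Hypothetical spellings, for illustration only (not from the draft alphabet).
with_digraph = "shae"            # digraph-style spelling: plain ASCII letters
with_diacritic = "\u015f\u00e4"  # diacritic-style spelling: s-cedilla, a-diaeresis


def fits_in_ascii(text: str) -> bool:
    """Return True if every character is in the 7-bit ASCII range."""
    return all(ord(ch) < 128 for ch in text)


print(fits_in_ascii(with_digraph))    # True
print(fits_in_ascii(with_diacritic))  # False
```

Software that only handles ASCII correctly would pass the first string through untouched and is likely to mangle the second.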


I guess not using diacriticals makes computing easier, but it's silly to claim that use (or non-use) of diacritical marks can make sounds disappear.

If specific sounds in Kazakh disappear because "they were written the same", it merely means that the particular sound distinction was already being phased out in the spoken language. If the sound distinction is gone, separate symbols won't save them. It will merely become yet another opaque spelling rule for children to memorize.

English "th" represents two different sounds, and English speakers manage just fine, because in its sound system they are totally distinct and never confused with each other.


The sentence from the article is ambiguous. It's not clear that "rare use" refers to rare use of the diacriticals or of the sounds themselves. If the latter, their claim makes sense — the sounds are already rarely used, so aren't worth enshrining in the alphabet.


> English "th" represents two different sounds

Three main ones (excluding compounds where it's really a “t” next to an “h” that aren't related): in IPA terms, they are θ,ð, and t.

> and English speakers manage just fine, because in its sound system they are totally distinct and never confused with each other.

English speakers do okay because they learn fairly quickly that written English isn't phonetic and you just have to memorize spelling, not because of a lack of ambiguity.

Nothing about “Thai” vs. “thigh” vs. “thy” tells you which “th” sound each uses (and in pronunciation, the “th” sound is the only difference between them) except knowing what each word is as a rule unto itself.


I don't think English spelling not being perceived as phonetic does much here. For example, a lot of Russian speakers think Russian spelling is phonetic, and yet there has been no shift to pronounce words as written there either, as is surely the case in most other languages: there are not many writing systems that are truly phonetic, and often this happens because the spoken language has shifted away from the written form.


To say nothing of the regional accents in England where "th" is often pronounced the same as "f".


In which case it would still only be three distinct realizations: [f v t].


25 can't be enough, Kazakh has 9 phonemic vowels. I presume the six Latin vowel signs will still get diacritics?


Looks like they're using digraphs instead:

    To ensure fullness of the Kazakh language sound system, the scientists included 8 digraphs,
    [...]
    The scientists rejected the idea of introducing diacritical marks


Am I crazy, or does the article only list 4 digraphs?

I don't see any vowel ones


I only see seven vowels in the graph.

I also only see four digraphs, for consonants only.

It looks like U is replacing У-with-a-line, and W (!) is replacing У


https://tengrinews.kz/kazakhstan_news/kak-vyiglyadit-proekt-...

see chart with three rows for other vowel sounds


W for У is natural because that's how it works in Arabic.


But Kazakh is not in any way, shape or form related to Arabic. Nor is there widespread knowledge of how to write Arabic among the population.


No doubt, but the academics who designed this alphabet definitely can write Arabic, and the letter W was going spare anyway.


Apart from the digraphs, both Һ and Х are mapped to H. That makes the Latin proposal not round-trip safe.
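
A rough Python sketch of what "not round-trip safe" means here. It assumes only the two mappings mentioned above (both letters going to H); the rest of the draft table is left out, and the helper names are made up for the example.

```python
# Only the two mappings mentioned in this comment; everything else is omitted.
CYR_TO_LAT = {
    "\u04ba": "H",  # Cyrillic capital letter Shha
    "\u0425": "H",  # Cyrillic capital letter Ha
}


def to_latin(text: str) -> str:
    """Transliterate using CYR_TO_LAT, leaving unknown characters unchanged."""
    return "".join(CYR_TO_LAT.get(ch, ch) for ch in text)


def colliding_targets(mapping: dict) -> dict:
    """Return each Latin target that more than one source letter maps to."""
    sources_by_target = {}
    for src, dst in mapping.items():
        sources_by_target.setdefault(dst, []).append(src)
    return {dst: srcs for dst, srcs in sources_by_target.items() if len(srcs) > 1}


# Two different Cyrillic inputs produce the same Latin output.
print(to_latin("\u04ba") == to_latin("\u0425"))  # True
print(colliding_targets(CYR_TO_LAT))
```

Since two source letters collapse into one, no function can recover the original Cyrillic from the Latin text alone; you would need a dictionary or some other context.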


Maybe they did that because Һ and X are pronounced the same in Kazakh?


They aren't - the distinction is voiced/unvoiced.


The distinction is place of articulation - velar versus glottal - not voicing. And, according to Wikipedia, "The letter Һ is used only in Arabic-Persian borrowings and is often pronounced like an unvoiced Х".


Didn't German deal with something similar with S and ß?


Is this a political act intended to reduce Russian influence?


Perhaps they just want to have a greater choice in keyboard selection?


Of course.


Seems you already have a Cyrillic alphabet that everyone uses.


Cyrillic can't represent all of the sounds that Kazakh has. It borrows some symbols but even now it uses symbols that don't exist in Cyrillic. If you ever go to Kazakhstan you'll see that each official sign, such as in the Almaty metro, has 2-3 languages: Kazakh, Russian, and sometimes English.


> even now it uses symbols that don't exist in Cyrillic

To be precise, it uses symbols that don't exist in Russian. Cyrillic is much broader than Russian, though. Ukrainian and Serbian use their own letters, too.

In the USSR, there was actually a standard called the "common alphabet" that combined all letters of all languages spoken in COMECON, both Cyrillic and Latin.

https://ru.wikipedia.org/wiki/%D0%A4%D0%B0%D0%B9%D0%BB:Ob-al...


But 25 characters of the Latin alphabet would represent it better?


There are also 8 digraphs, which, while not really distinct in English, represent distinct sounds.


soo.. one can't do digraphs with cyrillic glyphs? yeaaah..



