Kazakhstan Cheers New Alphabet, Except for All Those Apostrophes (nytimes.com)
68 points by cantrevealname 37 days ago | 55 comments

The alphabet used for a language is bound to be a significant part of the culture belonging to that language. It's going to appear everywhere, absolutely everywhere, all children will have to learn it, all people will have to use it daily, etc. It's not something you can just fuck up and be done with it.

And I get the feeling that this president, however great he may have been for the country in the past and present, is on the brink of defacing it now.

You may be an authoritarian leader, but if you first collect a commission of experts to form a new alphabet, then suddenly push your own idea (ignoring pushback from those experts you selected in the first place) and apparently are not willing to modify your proposal according to what your experts and the public want, you are being a dictator that's making wrong decisions.

I do find it promising that even though a quote in the article mentions that usually all that the president does shall be cheered, this proposal is getting friction from the public. I hope, for the sake of the Kazakh people, that the president will realise that he is in the wrong and will listen more to his people.

Hardly anybody will be surprised by yet another "what a joke of a man he is" moment when this blows up in Nazarbayev's face. Locals are kinda accustomed to it. The man was so infuriated by Cohen's "Borat" not because it insulted the nation, but because it insulted him personally (or so he keeps thinking) lol.

He is a so-so dictator and not much of an issue. The dangerous people in the country are his lieutenants: sharp guys who actually run the country, all waiting for him to finally die.

To locals, his being an unscrupulous buffoon is a source of relief, as many see that they would fare much worse were he as sharp as the leaders of neighboring gangsterlandias.

I don’t think the alphabet is a huge deal.

Here in Montenegro we have two alphabets for the same language, one Cyrillic, the other Latin. The Constitution says neither one is to be discriminated for/against. People just read & write both. Wikipedia and government web sites have transliteration switches. Media mostly use Latin.

If you don't want Russians to come 'protecting' Russian-speaking people and stealing part of your territory (as in the case of Ukraine/Georgia), you should be as far from them as possible. This tactic worked fine for Hitler when invading Austria, and it still works. As I see it, the president made a long-term decision that should protect the country, even if it's harsh.

Alphabet never stopped US from "bringing democracy" to Libya and Iraq. Let's not forget that.

I don't think Kazakhstan is currently on the US radar. But for Russians it's definitely attractive territory to steal. They're just waiting until the country is weakened like Ukraine to steal everything possible. Basically nothing has changed since the USSR: steal and conquer, and if that's not possible, bring chaos and degradation. This can be clearly seen in the Pridnestrovian Moldavian Republic, Abkhazia, etc.

Don't underestimate the Russians. They might go as far as changing their own alphabet so they could invade anyone they choose.

It's funny that the objections from both sides have to do with ease of use on computers:

- The President wants letters that occur only on a standard keyboard, hence the use of apostrophes to modify the sounds of letters as opposed to accented letters.

- Others prefer accented letters because apostrophes would interfere with Google searches and Twitter hashtags.
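To make the hashtag objection concrete, here is a minimal Python sketch. It assumes a hashtag is simply `#` followed by word characters (the pattern `#\w+`), which is a simplification and not the actual rule any platform uses; `find_hashtags` is a hypothetical helper. The accented spelling is the illustrative one suggested elsewhere in the thread, not an official proposal.

```python
import re

# Assumption: a hashtag is '#' followed by word characters. Real platforms
# use more elaborate rules; this only illustrates the failure mode.
HASHTAG = re.compile(r"#\w+")

def find_hashtags(text):
    """Return every substring that the simplified pattern treats as a hashtag."""
    return HASHTAG.findall(text)

# Apostrophe spelling: the match stops at the first apostrophe (U+2019 here).
print(find_hashtags("#s\u2019i\u2019i\u2019e"))
# Accented spelling: accented letters count as word characters, so it stays whole.
print(find_hashtags("#\u0161\u00ed\u00ed\u011b"))
```

Under this assumption the apostrophe version is cut down to `#s`, while the accented version survives intact.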

What is a standard keyboard, anyway? As a native German, I am accustomed to my local „standard“ keyboard. So, no one prevents them from defining a standard keyboard for their new language.

And, as someone brought up, on smartphones your options are basically limitless, with the option to fully customize the keyboard as it’s all virtual.

That's a privilege of big countries like Germany, although Kazakhstan might qualify as well. I live in Estonia, a country of 1.3 million, and although there is a standard Estonian keyboard layout, there's basically zero products using it. [1] The local stores primarily sell laptops and desktop keyboards with the US ANSI layout. This is especially annoying, because ISO layouts like Swedish, UK, or even German would be a closer match to the Estonian standard. I also fear that this is impacting how many young people get into programming, because the Estonian layout binds three essential programming characters < > | to a key which physically doesn't exist on the US ANSI keyboard. I've seen it in practice how a friend tried to learn programming with his US ANSI keyboard, by copy-pasting those characters when needed. Not exactly an optimal situation. [2]


[1] Most keyboards had the Estonian layout in early 2000s, but they soon rapidly lost market share, mostly due to laptops if I had to guess why.

[2] Why not use US ANSI layout in the OS to match the keyboard? Because it doesn't contain characters required to type Estonian words.

> Why not use US ANSI layout in the OS to match the keyboard? Because it doesn't contain characters required to type Estonian words.

You can use dead keys and character combinations to get characters the keyboard doesn't support directly. This is what's done in France, where the standard AZERTY keyboard doesn't have keys for every French letter. For some letters it also only has a lowercase version, not uppercase. (Accents are optional on uppercase letters in French because... who knows why.) There is a dedicated key for the letter ù though, which only appears in a single word of the French language.

> Accents are optional on uppercase letters

Unlike what some French believe, they are not supposed to be optional:


> Accents are optional on uppercase letters

In Spanish it used to be the same. Old printing technologies were not as flexible as a screen, so for large imprints it made more sense to have all letters be the same size whilst not wasting space above the letters.

With both macOS and Linux, you can access all accented uppercase letters simply by using the Caps-Lock key. For some unfathomable reason, Windows makes it much more difficult.

> I've seen it in practice how a friend tried to learn programming with his US ANSI keyboard, by copy-pasting those characters when needed.

Couldn't your friend use trigraphs (https://en.wikipedia.org/wiki/Digraphs_and_trigraphs) in place of these characters?

Note that C and C++ digraphs and trigraphs require < and > as part of the `token`.

Why can't you switch to US layout when coding and switch back to Estonian when writing Estonian text?

I speak Spanish and have a laptop with a US keyboard. There's a similar issue with the <> key. I tried switching all the time at first, but it's a pain in the ass. Eventually I remapped right alt+Z to < and right alt+X to >. It took me an entire afternoon to figure out how to do it on Ubuntu, and I'm pretty sure it can't be done on Windows. So yeah, not exactly a solution for everyone

For Windows, there is the Microsoft Keyboard Layout Creator[1], which would allow you to create such a mapping. I used it to map AltGr+s to ß on a Finnish/Swedish keyboard since I found myself writing a decent amount of German.

[1] https://www.microsoft.com/en-us/download/details.aspx?id=223...

I find it nuts that numbers are unmodified but delimiters like that are not when I mostly use my (US ANSI macOS layout) keyboard for front end coding. My fix was space cadet modifiers. Tapping left/right shift produces opening/closing parenthesis, command does curly braces, and option does angle brackets. Also caps lock is control but tapping it is double quote. You may find with this approach that you have way more keys at your fingertips than you thought.

It's not a pain in the ass (at least in Windows) because there is an OS setting that lets each window remember its keyboard layout. So I don't have to switch very often. The coding windows use US layout and my chat windows use GR (Greek) layout.

> What is a standard keyboard, anyway?

One that has exactly the letters that can appear in a passport's machine-readable field.

I have lots of funny accented characters in my name, and I have lived in several countries under several different names adjusted to the local officials' keyboards. And don't even get me started on forms saying "your name exactly as in your passport, but also without any funny letters".

If there is a realistic chance of writing Kazakh (names) in 7-bit-ASCII-only letters, they should go for it.

They could also use two different but equally valid writing systems. Norway has two different official written languages that both sort of correspond to the same spoken language. AFAIR in former Yugoslavia both Latin and Cyrillic were written, I remember a friend telling me that in school they alternated between Latin and Cyrillic weeks.

As for the two written Norwegian languages, they both utilize the same letters and are, by and large, more like two different accents than distinct languages.

One (Bokmål) is loosely based on Danish (language of the ruling elite up until 1814), adapted to resemble the way Norwegians for the most part spoke, while the other (Nynorsk) was pretty much based on rural dialects.

Nynorsk does make (slightly!) more use of diacritics than bokmål, but most current nynorsk users skip them in daily correspondence; their use is not mandatory. ('too', for instance, being written 'også' in bokmål and 'ôg' in nynorsk.)

Also, both versions of Norwegian use our funny letters æ/Æ (ae, similar-sounding to a in 'bad'), ø/Ø (oe, similar-sounding to ea in 'heard') and å/Å (aa, similar to 'a' in saw).

Disclaimer: I am an engineer, not a linguist. There are probably some inaccuracies in the above, but it should get you through the next linguist pub quiz without causing embarrassment.

Yes, in YU they used both. It was localized to member states, though. Croatia used mostly Latin, and Serbs used mostly Cyrillic. They were often able to read and write both nonetheless.

It probably means the keyboards people in Kazakhstan already have, which I assume are Cyrillic keyboards that can also do Latin script but do not have keys for umlauts and such.

Desktop keyboards contain apostrophes, but smartphone keyboards generally don't. And one proposal (the best, IMHO) used digraphs rather than accented letters. This would work very nicely with computers, smart phones, Twitter, Google, etc.

> ... digraphs ...

I wonder who to credit for encouraging the English to use th instead of thorn. It turned out ok.

True but we now have pretty much the world plus dog thinking that "Ye olde shoppe" sounds very different to "The old shop".

Note that I did not put in a real thorn and instead substituted a Y. See https://en.wikipedia.org/wiki/Thorn_(letter)

Yeah, why not use the Slovene alphabet and add some carons to short vowels and be done?

The way it's been proposed reminds me of the old Wade-Giles custom of adding apostrophes, breves and other diacritical marks. Pinyin is so much cleaner (except a few umlauts).

Pinyin is cleaner but really misleading if you’re trying to pronounce a word just with your knowledge of other Latin-based writing systems.

X- is pronounced like sh

Q- is pronounced like ch

C- is pronounced like ts

-iu is pronounced like yo

-ian is pronounced like yen


In this regard even Wade-Giles does a better job in my opinion.

"Pronounced like" in this context really means "vaguely similar, but actually fundamentally different". There is no English equivalent to the phonemes represented in pinyin by j, q, x, zh, ch, sh, r or z. English simply doesn't have alveolo-palatal or retroflex consonants. If anything, I think that pinyin is less misleading than Wade-Giles, because it doesn't lull you into false confidence. I have a weak preference for bopomofo, but that's another argument entirely.

I think you're right about highlighting the quality of the consonants when it comes to official Mandarin, but those characteristics fade when it comes to regional variants.

To be clear, Mandarin is of the most concern; I just want to raise the issue that, while true, it's not uniform throughout the language.

It's a fair point, but a pretty niche one - I expect that the overwhelming majority of foreign-language pinyin users are learning standard Mandarin. If you're learning Min or Wu as a second language, you've got bigger problems than the quirks of your transliteration scheme.

For native users of pinyin, the consistency with English pronunciation is essentially irrelevant.

Sorry for not making myself clear, and also for belaboring the point. I wasn't referring to separate dialects, but to local non-northern Mandarin, which gets pronounced slightly differently from received pronunciation.

A brief example: yes/is, approx [shi], might be pronounced approx [si], etc. But it's Mandarin and not Hakka, Min, etc.

> Yeah, why not use the Slovene alphabet and add some carons to short vowels and be done?

Kazakh makes a phonemic distinction between [g] and [ɣ]. Slovene and most European languages don’t have that distinction, and their writing systems are therefore not equipped to represent it.

Huh. That's interesting. Do you know if that's positional? I.e. it's predictable? For example in English you know an "s" is /z/ when between voiced vowels, most of the time. Or OE "good" vs "geostran dag".

In any event, as per English, French, among others, an alphabet does not have to orthographically represent the phonemes exactly.

It's cheaper to fix Google and Twitter than it is to change keyboards, so I think the President has a stronger point here.

We have adopted a different approach in Romania: we usually don't use diacritics at all on the internet. It's not that hard to read the original word (though I've heard it's pretty hard for foreigners trying to learn the language to understand what we are writing without them). Of course there are a few words where it's harder to understand the original meaning, but the advantage of using plain ASCII outweighs that.

We do have a local keyboard standard, just most of us don't use it when typing something that's not official.

edit: I'm looking at my driving license and car registration certificate, there are no diacritics and they are official documents.
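The "just drop the diacritics" habit is easy to reproduce in software. Below is a minimal Python sketch, assuming the text only uses letters that Unicode can decompose into a base letter plus a combining mark (true for the Romanian ă, â, î, ș, ț; not true for letters such as Norwegian ø, which has no decomposition). `strip_diacritics` is a hypothetical helper name.

```python
import unicodedata

def strip_diacritics(text):
    """Decompose each character (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Romanian place names written with and without diacritics.
print(strip_diacritics("Brașov și Constanța"))
```

The mapping is one-way: once the marks are gone, the few ambiguous words mentioned above cannot be restored without context.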

Same with Czech on the internet. Sometimes there is confusion, but 99% of the time it works. We use diacritics on official documents though.

English and lots of other languages have words with apostrophes in the middle, so it seems like Google would have worked out the technical issues. And for hashtags you just omit the apostrophes.

It doesn’t even look that inelegant to me:

    Barlyq adamdar ty’mysynan azat ja’ne qadir-qasi’eti men quqyqtary ten’ bolyp du’ni’ege keledi. Adamdarg’a aqyl-parasat, ar-ojdan berilgen, sondyqtan olar bir-birimen ty’ystyq, bay’yrmaldyq qarym-qatynas jasay’lary ti‘is.

Your quotation has two versions of the apostrophe (’ and ‘, two different Unicode code points); is that intentional?
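A quick way to check which marks a passage actually contains is to list their code points. This is a minimal Python sketch; the set of "apostrophe-like" characters is my own assumption about what is worth checking, and `apostrophe_variants` is a hypothetical helper. The sample uses two words from the quotation, with the marks written as escapes so they are unambiguous.

```python
import unicodedata

# Assumption: these are the characters most often used as an "apostrophe".
APOSTROPHE_LIKE = {"'", "\u2018", "\u2019", "\u02bc"}

def apostrophe_variants(text):
    """List the distinct apostrophe-like characters in text as (code point, name)."""
    found = {ch for ch in text if ch in APOSTROPHE_LIKE}
    return sorted((f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in found)

# Two words from the quotation: one with U+2019, one with U+2018.
sample = "ty\u2019mysynan ti\u2018is"
for code_point, name in apostrophe_variants(sample):
    print(code_point, name)
```

For this sample it reports U+2018 (LEFT SINGLE QUOTATION MARK) and U+2019 (RIGHT SINGLE QUOTATION MARK), so the two marks really are different characters to a search engine or a string comparison.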

I wonder if people would complain less if instead of the apostrophe another unused letter was used, such as "x" (or something else?)

    Barlyq adamdar tyxmysynan azat jaxne qadir-qasixeti men quqyqtary tenx bolyp duxnixege keledi. Adamdargxa aqyl-parasat, ar-ojdan berilgen, sondyqtan olar bir-birimen tyxystyq, bayxyrmaldyq qarym-qatynas jasayxlary tixis.

I wonder why they did it - perhaps they were trying to avoid having digraphs mapping to a single sound. An alternative might be carons + acute accents to modify the letter, as in Czech, Slovak and the Balkan languages. So, in the example from the article below...

> Under this new system, the Kazakh word for cherry will be written as s’i’i’e, and pronounced she-ee-ye.

You could define the alphabet so that this is written "šííě". To be honest it seems like this is a tricky example to map to any Latin-based alphabet. Maybe diacritics were avoided to avoid internationalisation issues (i.e. you could easily fall back to the locale "en-us" and your Kazakh doc still looks alright)

> Maybe diacritics were avoided to avoid internationalisation issues

Yes, the article quotes the president as saying: "There should not be any hooks or superfluous dots that cannot be put straight into a computer."

Ah fair enough. I mostly skimmed, but searched through for "diacritic", "acute" and "accent" among others to see if it was mentioned - didn't think to use these.

I can't even find a simple listing of the new Kazakh alphabet anywhere, never mind the experts' recommendation(s)…

> Under this new system, the Kazakh word for cherry will be written as s’i’i’e […]

Ouch. That's just plain horrible.

Yet a lot of, say, Hawaiian words are written like that and nobody cares. Seems like Nazarbayev wants to keep the language ASCII instead of adding diacritics which would make it hard to type.

Why would it make it hard to type? There are probably hundreds of different keyboard layouts so far. That problem was solved long ago.

There are a lot fewer people to complain about that, though. 25 thousand speakers versus 10 million speakers.

To me non-accented Latin characters + apostrophes seems way more computer friendly than accented characters.

Sure, Unicode is supposed to be everywhere, but in practice it’s simply not. On top of that a more limited character set not requiring any sort of IME or special layer shift keys is way way easier to type.

So, when deciding an alphabet for a language, you would go with computer-friendliness as a key metric? Also, what does "computer friendly" mean? How would you break the word "s’i’i’e" into graphemes? "s’" is a single phoneme, but is it a single letter? Does "s’i’i’e" contain letter "s"? If so, why don't we hear it? If the Kazakhs already have the unique opportunity to reform their alphabet, they should do it right and make it phonetic. Introducing apostrophes is simply stupid, and, if not now, they will get rid of them immediately after the emperor dies.
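The grapheme question has a practical side: any software that counts, sorts or hyphenates letters has to decide how to segment such a word. Here is a minimal Python sketch under one possible assumption, namely that a "letter" is a Latin base letter optionally followed by a single apostrophe (ASCII `'` or U+2019). That rule is mine, for illustration only, and `split_letters` is a hypothetical helper.

```python
import re

# Assumption: one letter = a Latin base letter plus an optional apostrophe.
# This is an illustrative rule, not the official definition of the alphabet.
LETTER = re.compile(r"[A-Za-z]['\u2019]?")

def split_letters(word):
    """Split a word into letters, keeping each apostrophe with its base letter."""
    return LETTER.findall(word)

# The word for cherry from the article, written with U+2019.
print(split_letters("s\u2019i\u2019i\u2019e"))
```

Under that assumption the word has four letters (s’, i’, i’, e) and contains no plain "s" at all, whereas a naive character count would report seven.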

But better than digraphs? It may be my bias as an English speaker/writer, but I'd pick sh over s’, for example.

The vowels are trickier, and just doubling them like Finnish would turn s’i’i’e into shiiiie, so you might need something (an apostrophe ;-) ?) to divide the syllables. I still like shii’iie better, but that's just one word.

Not invented here syndrome IRL

Where else does it occur?
