
When written Chinese was simplified there was a short-lived movement to move towards a non-logographic system called Bopomofo or Zhuyin Fuhao, which has the cool aspect of still looking non-Western while being a phonetic system with unambiguous "spelling". My understanding, though, is that it still introduces significant issues with homophones (including tones), which are sorted out in written Chinese by unique characters.

It probably could have been resolved with superscript numbers and a strictly controlled dictionary that mapped each number to a specific dictionary definition for a homophone. But that's not what happened and instead we ended up with Simplified Chinese which is still among the most complex written languages ever created.
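To make the numbered-homophone idea concrete, here is a minimal Python sketch. It is not an existing standard: the sense numbers, the `SENSES` table, and the `tag` and `resolve` helpers are all made up for illustration, using four common Mandarin words that share the reading shì.

```python
# Hypothetical "controlled dictionary": (phonetic spelling, sense number) -> meaning.
# The numbering is invented for this example; no such standard exists.
READING = "shì"

SENSES = {
    (READING, 1): "to be (是)",
    (READING, 2): "matter, affair (事)",
    (READING, 3): "city, market (市)",
    (READING, 4): "room (室)",
}

SUPERSCRIPT_DIGITS = "⁰¹²³⁴⁵⁶⁷⁸⁹"
TO_SUPERSCRIPT = str.maketrans("0123456789", SUPERSCRIPT_DIGITS)
FROM_SUPERSCRIPT = str.maketrans(SUPERSCRIPT_DIGITS, "0123456789")


def tag(spelling, sense):
    """Write a word as its phonetic spelling plus a superscript sense number."""
    if (spelling, sense) not in SENSES:
        raise KeyError(f"no sense {sense} registered for {spelling!r}")
    return spelling + str(sense).translate(TO_SUPERSCRIPT)


def resolve(tagged):
    """Look up the meaning of a tagged word such as the output of tag()."""
    spelling = tagged.rstrip(SUPERSCRIPT_DIGITS)
    digits = tagged[len(spelling):].translate(FROM_SUPERSCRIPT)
    return SENSES[(spelling, int(digits))]


for sense in range(1, 5):
    word = tag(READING, sense)
    print(word, "->", resolve(word))
```

The obvious cost is that every reader and writer has to memorise the numbers, which is arguably no easier than memorising the characters they replace.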

I believe Bopomofo is still used in some dictionaries as a pronunciation guide, however.

I can't speak for Chinese, but Korean has a large number of Chinese loanwords, and because spoken Korean doesn't have tones, this introduces a number of comprehension issues when context is ambiguous and different words are written with the same Hangul. This wasn't really an issue in the past, as Korean used the Chinese system until Hangul started becoming more commonplace.

https://en.wikipedia.org/wiki/Bopomofo

http://www.omniglot.com/writing/zhuyin.htm

https://en.wikipedia.org/wiki/Simplified_Chinese_characters

http://hunjang.blogspot.com/2005/04/koreans-reading-comprehe...



This is getting off-topic, but as a native Korean speaker, I'd say Hanja ("Chinese characters" used in Korean) is overrated. :)

(I do think some amount of Hanja education helps learning more Korean words, but using them in everyday documents is another matter.)

Some newspapers banished Hanja entirely in 1988, when many old-generation scholars decried the sorry state of Hanja education and the impending downfall of the Korean culture.

That didn't happen, and one by one, those newspapers that ran the op-eds of worried scholars followed suit, a process hastened by the introduction of the internet. (Writing Chinese characters with a keyboard isn't really an easy process. Especially when you're speaking Korean: you type Korean first and then have to convert each word to Chinese, so why bother?)

Nowadays, Korean culture is just as strong as before, Korean dramas are popular in China, and many official documents are arguably much easier to understand than in the 80s, partly thanks to the efforts to banish esoteric jargon that nobody could understand without writing it in Hanja (and that was only poorly understood even when written in Hanja: think about it, you can't understand what a telephone is even if you know the Ancient Greek words for "far" and "voice". You could only have a vague guess.)

On top of that, many of these "esoteric jargon" terms were direct imports from written Japanese during the colonization period, so they were never proper Korean words outside a small group of people.


Please feel free to jump in if I'm misrepresenting your language at all! My Korean is personally pretty bad, but I find language features interesting enough and spending the 3 or 4 days it took to learn Hangul made my times in Korea much better (as well as giving me a great admiration for the alphabet).

I remember back even in the mid to late 90s seeing Hanja in the local Korean newspapers, but these days they seem pretty much absent. My wife (who has a pretty good memory for Hanja) laments the inability to figure out homophones sometimes though.

In the North, my understanding is that Hanja has effectively been eliminated for quite a while and things written in the North are pretty much 100% Hangul (as well as some Korean original neologisms to get rid of Japanese and English loan words).


I can't really speak to the utility of using the characters in practice, but looking back on when I was studying Japanese and Korean, it frustrates me that more emphasis wasn't put on the characters, as learning the roots of the words was incredibly helpful in learning vocabulary, in the same way that learning Latin roots would be helpful in studying English. It was, though, much more of an issue in Japanese, where you run into the characters everywhere. Because of the heavy conversational emphasis in most foreign language classes, I didn't learn about character composition (radicals), or the phonetic element to characters until I took a bit of Chinese. When I did finally learn these things, my reading comprehension got a very significant boost.

I'd maybe argue that the use of Chinese characters in Japanese allows me to read much faster than if all writing were in Hiragana, particularly since people tend to identify words more by their shape, but Hangul does this quite well on its own, so I think maybe there's not as strong of a need for it in Korean.


One theory I heard was that Japanese has a rather limited set of syllables, so the ambiguity is greater. In spoken Japanese some of the ambiguity would be resolved by pitch accent, but accent is not written in Hiragana/Katakana.

On the other hand, Korean has a relatively large number of syllables (so less ambiguity), and the modern Korean spelling system is highly morphophonemic (which is a fancy way of saying "words that sound the same in one form may still be written differently, depending on how they sound in other forms"), which also helps a bit.

For example, the words 낫 (scythe), 낮 (day), and 낯 (face) all sound the same in isolation (or when followed by a consonant), but sound different when followed by a vowel.
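As a rough illustration of that point, the Python sketch below pulls the written final consonant out of each syllable using the standard Unicode Hangul layout (code point = 0xAC00 + (initial × 21 + vowel) × 28 + final). The `spoken_coda` rule is a deliberate simplification that I am assuming for this example: it only models the group of finals that collapse to an unreleased [t] (written here as ㄷ), and it ignores every other sound change.

```python
HANGUL_BASE = 0xAC00       # first precomposed syllable, 가
SYLLABLE_COUNT = 11172     # number of precomposed Hangul syllables

# Final consonants in Unicode order; index 0 means "no final consonant".
FINALS = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ",
          "ㄻ", "ㄼ", "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ",
          "ㅆ", "ㅇ", "ㅈ", "ㅊ", "ㅋ", "ㅌ", "ㅍ", "ㅎ"]

# Simplified assumption: these finals are all pronounced like ㄷ
# unless a vowel follows. Other neutralizations are not modelled.
NEUTRALIZED_TO_T = {"ㄷ", "ㅌ", "ㅅ", "ㅆ", "ㅈ", "ㅊ", "ㅎ"}


def written_coda(syllable):
    """Return the final consonant as spelled in a single Hangul syllable."""
    offset = ord(syllable) - HANGUL_BASE
    if not 0 <= offset < SYLLABLE_COUNT:
        raise ValueError(f"{syllable!r} is not a precomposed Hangul syllable")
    return FINALS[offset % 28]


def spoken_coda(syllable, before_vowel=False):
    """Approximate the pronounced final consonant (simplified rule)."""
    coda = written_coda(syllable)
    if before_vowel:
        return coda  # the written consonant carries over to the next syllable
    return "ㄷ" if coda in NEUTRALIZED_TO_T else coda


for word, gloss in [("낫", "scythe"), ("낮", "day"), ("낯", "face")]:
    print(word, gloss,
          "written:", written_coda(word),
          "alone:", spoken_coda(word),
          "before a vowel:", spoken_coda(word, before_vowel=True))
```

The three words get three different written finals but the same pronounced final in isolation, which is the "written differently even though they sound the same" behaviour described above.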


Bopomofo is still used in schools in Taiwan and, until about 15 years ago, was also used as a phonetic alphabet for teaching Mandarin to foreigners.

Pinyin never took root in local schools and the preexisting romanization systems (Yale, Wade-Giles, etc.) are confusing and contradictory and inconsistently used. Look up the history of "Peking" or "T'aipei" to get some insights into this.

You still see Bopomofo on computer keyboards in Taiwan, and it is a popular input system on mobile phones. I learned how to read it 20 years ago, but pinyin is so much easier, even without the tone marks.


Transliterations have existed since at least the 19th century, most notably with the Wade-Giles system.

Now, though, Pinyin is the international standard, and is how mainland schools teach pronunciation.

Taiwan still uses Zhuyin, but in recent years the government has adopted Pinyin for street and city naming. (I'm not sure if schools are still teaching Zhuyin, but friends in their 20s don't know Pinyin.)


I live and grew up in Taiwan. Zhuyin is a learning tool taught to all schoolchildren to teach pronunciation. It also happens to be a computer input method because everyone knows it, and the alphabet fits onto a computer keyboard. No adult regularly uses Zhuyin otherwise.

I've recently switched to using Pinyin for computer input because Zhuyin has a larger alphabet compared to Pinyin, and the Zhuyin keyboard on iOS has smaller keys to fit more characters in the same space, making it much harder to type with. The Pinyin keyboard is essentially an English keyboard that spits out Chinese.


Well sure, but I've found these kinds of systems, and the number of different systems, difficult to use. Latin just isn't a very good alphabet, even for English. I deal with Koreans more and I always find it better to just see the Hangul than whichever transliteration system of the day is in use because trying to shoehorn vowels and consonants that simply don't exist in Latin into Latin is a tough battle.

Zhuyin has the nice aspect of properly representing the spoken language and isn't hard to learn, probably a couple weeks of an hour or two a day and you'll be able to pronounce most things.


I guess I don't understand the problem with homophones. If two words are pronounced identically, presumably there must be some way to disambiguate when speaking, which could be used in the written text as well.


Chinese has a vast number of homophones and the average word length is much shorter than in many other languages, so tone and context are basically used to provide meaning. Now eliminate tone and you're stuck with the Korean problem.

A better way to think about it in English is homonyms. In Korean there's pretty much a 1:1 mapping between pronunciation and spelling, so all homophones are pretty much homonyms. In English we can get clever and use different spellings to tell the difference.

So let's assume we can't do that in English and spelling is entirely unambiguous (like in Korean) -- both the words 'Aisle' and 'Isle' are spelled 'Ile'.

So the sentence "she walked down the ile" is now ambiguous. I don't know if she was walking down an island or down the aisle of the bus or what. Now magnify this to 20 or 30% of your language so that sentences like "I bought a car" get confused with "I bought a tea" (both are 차 'cha').

Or "can you walk on the leg?" vs. "can you walk on the bridge?" 다리 'da-li'

or "I have a bag of grass." vs. "I have a bag of glue." 풀 'pul'

It's one of the reasons Korean<->English Machine Translation is so dreadful.


Native speaker here. :)

Well, there's some degree of such a problem, but in general it isn't such a big deal.

Think of it this way: if a sentence can have two different meanings based on different Hanja (Chinese characters), then they will be ambiguous when spoken. Obviously, a language cannot function if a lot of sentences are ambiguous. Therefore, when ambiguities arise, native speakers invariably figure out some way of suppressing the ambiguity. (A commonly confused word may lose its position to another similar word with distinct sound. An additional word (say, an auxiliary verb) may be used to disambiguate the context, and given enough time, may even become a grammatical suffix. Or people may just decide "what the hell" and just borrow an English word.)

For example, your example of "I bought a cha" might normally be expressed like this in modern colloquial Korean:

나 차 좀 샀어. na cha jom sasseo. I bought a little cha. (This clearly implies "tea": how would you buy a little "car"?)

나 새 차 샀어. na sae cha sasseo. I bought a new cha (= car). (This is somewhat hard to explain, but it implies that the speaker bought something brand new. It would be a rather odd expression to use on teabags.)

Or even more idiomatically: 나 새 차 뽑았어. na sae cha ppobasseo. I "picked up" a new car. (The verb "ppop-da", literally "to pick up", is a colloquial expression for buying something expensive or worth bragging about.)

In fact, English itself is quite prone to ambiguity (although at a different level): famously, "time flies like an arrow" can be parsed in at least five different ways. Although this is a made-up example, I've seen many English learners struggle to understand complex sentences in, say, the New York Times, because pretty much every English word can be a verb or a noun at the same time. Of course, all these sentences are perfectly clear to a native English speaker.


(This clearly implies "tea": how would you buy a little "car"?)

http://api.ning.com/files/iINqnIBAPQtK2AiQcRXTrRTfMH51FKVu6Y...


Thanks for fleshing it out. I'd add also that context is often set (in any language) outside of a single example sentence. So a word, or sentence in isolation can be ambiguous, but in the context of a conversation or a book or whatever can be pretty clear.

Ambiguity can be great as well, lots of poetry, clever puns and jokes rely on ambiguity of specific words to add multiple layers of meaning.


I'm not sure whether or not it's the case with Korean, but in Japanese, many homophones are tonal, though not often explicitly acknowledged as such. Words like 神 (kami: god) and 紙 (kami: paper), or 席 (seki: seat) and 咳 (seki: cough) are fairly consistently spoken with a particular distinguishing intonation, which helps, along with context, to disambiguate them from each other.


Spoken Korean used to have some tones until the 1500s, then lost them. In written Hangul (the original 1400s version) the tones were indicated with a dot diacritic to the left of the character.

It's not known if Hangul influenced the dropping of tones from the spoken language or not, but around the 1500s and into the 1600s tones started being omitted in written Hangul (probably to make it faster to write, like dropping vowel diacritics from Arabic). Hangul also lost some letters and there's been some vowel shift with two different vowels (애 and 에) nearly merging today.

Like in Japanese, it appears the Korean tone system was replaced with a length system to help identify homophones, which lasted until very recently. My understanding is it's still formally part of the language (and is taught in grade school with this rule), but most Koreans don't really use it anymore.

http://askakorean.blogspot.com/2014/01/tonal-vestige-in-kore...

I think today you still hear vestiges of tone in Spoken Korean but it's more used as part of the phrasing and is distinct from statement/question tone changes like in English. You can hear it in longer conversational Korean like here: https://www.youtube.com/watch?v=RMnp3efz6s4


Japanese has pitch accent, but usually only one syllable is accented and there are only 2 "tones". It's not fully tonal like Mandarin.

FWIW, I've never bothered to learn Japanese pitch accent properly. I'm sure that gives me a foreign accent, but in 10 years I can't recall a single instance of being misunderstood due to pitch accent. Context takes precedence over pitch accent.



