Zhou Youguang, creator of the Pinyin writing system, has died (bbc.com)
245 points by jrwan on Jan 14, 2017 | 239 comments

I have long thought that Pinyin is underestimated for its simplicity, conciseness, and its implicit description of regional phonetics in China. It uses all 26 letters of English except for "v", but "v" on a keyboard is used to type "ü" instead, and so Pinyin input does not need a special keyboard layout. A basic QWERTY layout is all that's needed. This is an improvement over the Wade-Giles system, which requires diacritics and curious apostrophes that the international community--which generally had no interest in learning Chinese or the peculiarities of the Wade-Giles system--learned to ignore when writing Chinese words with the Wade-Giles system (e.g. Taipei should be T'aipei. In Pinyin it's Taibei. Tai chi chuan should be T'ai chi ch'uan. In Pinyin it is Taiji quan).
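To make the "v" trick concrete, here is a minimal sketch of what an input method might do with typed letters. The function name is made up for illustration, and real IMEs do far more than this one substitution.

```python
def expand_typed_pinyin(typed: str) -> str:
    """Map the keyboard stand-in "v" to "ü".

    Assumption for this sketch: pinyin has no other use for the letter "v",
    so a plain substitution is safe.
    """
    return typed.replace("v", "ü").replace("V", "Ü")


print(expand_typed_pinyin("nv"))    # nü
print(expand_typed_pinyin("lvse"))  # lüse
```

Because the substitution only touches "v", every other key on a plain QWERTY layout keeps its usual meaning.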

When I was beginning to learn Mandarin while studying abroad in Chongqing, I was very frustrated with understanding the southern accent. Instead of pronouncing "zh", "ch", "sh" (retroflex consonants), the locals would pronounce "z", "c", "s" (dental/alveolar sibilant consonants). As you can see, Pinyin makes this pronunciation disparity easier for learners to understand, and helps them anticipate the way that many southerners will (mis)pronounce Mandarin. Learning all the varieties of pronunciation in China is still a daunting task with or without Pinyin, but at least it describes this major one pretty well.
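The zh/ch/sh to z/c/s shift described above is regular enough to write down as a rule on the pinyin spelling. A rough sketch, assuming input is a single toneless syllable; the function and table names are invented here, and real accents vary more than this.

```python
# Retroflex initials and the dental/alveolar initials they merge into.
RETROFLEX_TO_DENTAL = {"zh": "z", "ch": "c", "sh": "s"}


def southern_variant(syllable: str) -> str:
    """Return the syllable as it might sound with the retroflex merger."""
    for retroflex, dental in RETROFLEX_TO_DENTAL.items():
        if syllable.startswith(retroflex):
            return dental + syllable[len(retroflex):]
    return syllable  # no retroflex initial, nothing changes


print(southern_variant("shi"))    # si
print(southern_variant("zhong"))  # zong
print(southern_variant("ma"))     # ma
```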

Pinyin actually also slightly changed how Chinese words are pronounced. There was once an exploratory idea of abandoning the Chinese written language as a whole, and adopt fully the latin-character-based language. The compromise is pinyin, a latin-based pronunciation system and simplified written language.

This absolutely modernized the nation. It's no small feat to educate 1B+ people in a few decades. The nation's unity is once again at an unprecedented level.

This is not without its cost. Modern Chinese pronunciation is considered less appealing to the ear, meaning that many sounds disappeared in the new system. The resulting language often sounds duller compared to the historical system.

The written language is less artful. The traditional written language is no doubt a more appropriate subject for Chinese Calligraphy.

I guess the global momentum to move to a more latin-based language is not going to be stopped. But it's nevertheless saddening to see a nation's historical roots altered significantly in a short period of time. This did not destroy the roots; they're still there. But the changes are more artificial and more brutal.

I didn't know the pronunciation was simplified with the creation of pinyin. That is really interesting. Could you give an example (e.g. two words or sounds that used to be different and are now pronounced the same)?

If you look at cantonese, an older chinese dialect, you will find 9 tones:


Particularly useful in that section is this graph:


And I think that's a great way to visualize it. Each word has its own tone contour and it just so happens that a lot of them could be clustered and curve-fitted to match a small number of tones. But the number of curves you choose is arbitrary just like how choosing the number of clusters in k-means is not well defined.

So if you treat the pinyin as authoritative you will sound robotic compared to a native speaker who learns each word's actual tone curve naturally. Just like how if you believe english is a phonetic language you will pronounce lots of things wrong. (e.g., as an English learner once I learned how to spell "doubt" I kept trying to enunciate the "b" since I thought the pronunciation I learned from hearing, "dout", was wrong)

Disclaimer: I'm a native cantonese speaker so my knowledge about tones is bullshit. Just hypothesizing why text-to-speech and people learning to speak it always sound funny.

https://en.wikipedia.org/wiki/Cantonese_phonology explains the reason for the disagreement about the number of tones in Cantonese. If you still think there are nine tones, you should be able to provide some minimal pairs (https://en.wikipedia.org/wiki/Minimal_pair) to prove it.

I am a native speaker so I don't really care about the number of tones, I was just citing the link. You can speak to me in monotone and I will figure it out from context just like how you can learn to understand someone with a strong accent.

My theory above is that a fixed number of tones is a linguist madeup concept from data that happens to fit the clustering of the tone graphs. The real tone is a distribution on a per word basis that you learn from natural variations from the other speakers around you. Do the 6 or 9 or 4 tones sometimes fit these distributions? Yes, but it's still just an approximation of the real variations that natives use and shouldn't be treated as an authoritative pronunciation.

Well, even languages with no defined tones such as Japanese have some intonations to distinguish homophonic words in context. And English speech synth still sounds robotic because it lacks the full tonal range of a human speaker.

The simplification of tones in Northern Mandarin predates Pinyin by hundreds of years, similar to how English developed a writing system that is often inconsistent with speech.

Doesn't pinyin just apply to standard mandarin? China has many different dialects with different pronunciations, pinyin wouldn't have simplified those at all.

According to what I read on Language Log (Victor Mair's posts), pinyin also made it massively easier to teach literacy. So the tradeoff could well have been totally worth it just for the literacy gains, and the historical roots might not matter as much.

One wonders what kind of future technological breakthrough it would take to make the traditional characters easier to enter in a computer system (rather than going through pinyin first). I've seen that handwriting in simplified characters is probably on par, in terms of speed and readability, with pinyin, so it doesn't seem that far fetched, given that many phones and computers can accept stylus input these days.

Smartphones have pretty good handwriting IMEs, but if you want to bypass Pinyin, Cangjie is also an interesting system - you can (could) even create new characters with it:


> adopt fully the latin-character-based language

Ironically, everyone else is moving away from latin based characters towards icons, emoji and international symbols. (Really, why is O easier to learn than Off, and | easier to learn than On? At least the latter can be looked up in a dictionary. Anyone wonder what those washing instructions mean?)

Pinyin still requires curious apostrophes to distinguish between xian and xi'an, for example. I think a hyphen would have been the better choice.
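To show why some separator is needed at all, here is a small sketch that lists every way a run of letters can be cut into syllables. The syllable set is a tiny hand-picked subset, assumed just for this example, and the function names are made up; a real table has roughly 400 entries.

```python
# Tiny illustrative subset of valid pinyin syllables (an assumption, not a full table).
SYLLABLES = {"xi", "an", "xian"}


def segmentations(text):
    """Return every way to split `text` into syllables from SYLLABLES."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in SYLLABLES:
            for rest in segmentations(text[end:]):
                results.append([head] + rest)
    return results


def parse(word):
    """An apostrophe forces a syllable boundary; each piece is segmented on its own."""
    options = [[]]
    for piece in word.split("'"):
        options = [done + seg for done in options for seg in segmentations(piece)]
    return options


print(parse("xian"))   # [['xi', 'an'], ['xian']]
print(parse("xi'an"))  # [['xi', 'an']]
```

Without a separator the letters "xian" allow two cuts; with one, only the two-syllable reading is left. A hyphen would do the same job, so the choice of mark is a matter of convention.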

As someone who spent time learning Chinese in Beijing, this article is very confusing to me.

I learned Chinese, like pretty much every other foreigner, via pinyin pronunciation -- and it's great for language courses. And it's also great for road signs that foreigners can actually read.

But I never heard of pinyin being used by/for the Chinese themselves (except, now, as one computer typing system among many, although it still outputs in characters). All my Chinese teachers explained that young schoolchildren were taught basic pronunciation with bopomofo (which uses special characters, not latin letters), not pinyin.

And increases in literacy in China are simply due to widespread schooling and massive efforts at memorization of characters by children -- pinyin has absolutely zero value for functional literacy, because everything is still written in characters. (There are no pinyin newspapers or books I ever saw.)

EDIT: I see from comments below that pinyin is used with schoolchildren in mainland China, thanks. But I still don't see how this has helped literacy, when all reading material is still in characters.

As someone from mainland China and learned pinyin in grade 1 of primary school, which happened in early 90s, in Beijing, I use pinyin daily, for inputting Chinese into electronic devices.

>All my Chinese teachers explained that young schoolchildren were taught basic pronunciation with bopomofo (which uses special characters, not latin letters), not pinyin.

No, we use latin letters in mainland China, Taiwan uses special characters.

The article is very accurate. Older generations like my parents have to learn latin lettered pinyin by themselves to use PCs and smartphones.

And the pinyin form of my name and home address are on my passport.

Can you explain the reasoning behind typing in pinyin to see the output in traditional characters? To a native English speaker this sounds silly and pointless.

Is it something that's mostly tradition or is there advantage to typing this way?

If you are asking why don't they type pinyin and leave it like that, then the reason is that Chinese, particularly mandarin, has a large number of homophones. Learning elementary mandarin sometimes felt like learning new meanings for the sounds "ma" "bu" and "shi" for two years. Despite having nearly the same pronunciation, all those syllables are distinct morphemes with unique characters to represent them, so characters are a clearer form of writing. If you are asking why they don't directly enter characters then the answer is that some people do, but it's not as convenient. To enter characters you draw it in a box with either a touch screen or a mouse and it's quite a bit slower, especially compared to the quality of autocomplete in pinyin entry systems.
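For a feel of what the homophone problem looks like from the input side, here is a toy lookup from a toneless syllable to a few characters that share it. The table is a tiny hand-picked sample and the names are invented for this sketch; a real pinyin IME ranks thousands of candidates using context.

```python
# A few well-known homophones per toneless syllable (tiny sample, not a real dictionary).
CANDIDATES = {
    "ma": ["妈", "麻", "马", "骂", "吗"],
    "shi": ["是", "十", "时", "事", "市"],
    "bu": ["不", "步", "布", "部"],
}


def candidates(syllable: str) -> list:
    """Return the characters a user would have to choose between."""
    return CANDIDATES.get(syllable.lower(), [])


for typed in ("ma", "shi", "bu"):
    print(typed, "->", " ".join(candidates(typed)))
```

Leaving text as bare pinyin would push that choice onto the reader; picking the character at input time settles it once.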

Well explained. Does the similar pronunciation and many homophones cause any issues when spoken?

Sometimes but not very often. The reason is that the rest of the words in the sentence together with the general situation is enough to make words unambiguous. This can be a challenge when learning because you have to map meanings onto both sounds and situations if that makes sense. It also means that Mandarin is great for making puns.

Does being situational make the language more efficient to speak? You don't need nearly as many words when the meaning is tied to context.

I've always been curious about east Asian languages and culture since they're relatively isolated from Western influence and historic ties to the British empire

This might be the case for illiterate people, but these days it's mandatory for Mainland Chinese citizens to complete at least 9 years of education. Different words with the same written form usually have different pronunciations, and different words with the same pronunciation usually have different written forms. It's rare for a word to carry different meanings solely depending on context.

Tones is the obvious answer that others have given, because they are often omitted in written pinyin. But even phonetic representation with tone annotation has problems. Take the following characters as examples:

他: he, him; it 她: she, her

These are both pronounced EXACTLY the same in spoken mandarin, and would be indistinguishable in a phonetic transcription. But standard written Chinese is more expressive than spoken mandarin because it allows for gender clarification.

Also take this example:

買: "mǎi" to buy 賣: "mài" to sell

These have exactly opposite meanings, and are differentiated only by tone. Without tone diacritics you would not be able to tell the two apart. Even with tone diacritics, reading is slowed because the difference between the two is only a small, semantically meaningless diacritical mark, whereas 賣 has an extra symbolic component that indicates its meaning.
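The collapse is easy to demonstrate: strip the tone marks and the two words become the same string. A minimal sketch using Python's standard unicodedata module; the function name is made up, and it deliberately keeps the diaeresis on "ü" since that is part of the vowel rather than a tone.

```python
import unicodedata

# Combining marks used for the four tones: macron, acute, caron, grave.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tones(pinyin: str) -> str:
    """Remove tone diacritics but keep the two dots of "ü"."""
    decomposed = unicodedata.normalize("NFD", pinyin)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


print(strip_tones("mǎi"))  # mai
print(strip_tones("mài"))  # mai
print(strip_tones("mǎi") == strip_tones("mài"))  # True
```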

It is worth noting that the female third-person pronoun 她 is a relatively recent invention inspired by republicanism and the rise of vernacular writing. Some writers in the 1920s and 30s used a more distinctive word 伊 (pronounced yi1) but over time the homophonic female Ta1 became mainstream.

What also happened then was an influential movement to fully romanise written Chinese and some of the proposals are radical even for today as they plan to do away with tones completely.


Pinyin was a scaled down version of these except it was never meant to replace the current writing system.

There's a language/dialect called Dungan which is written in Cyrillic without tone markers, and is to some extent mutually intelligible with Mandarin: https://en.wikipedia.org/wiki/Dungan_language

A little. Intonation and non verbal cues help a bit with disambiguation. Also, while pinyin can have tone markers, they are a pain to type and are often omitted.

It is not rare to see people who are having an oral conversation disambiguate words by drawing them with the index finger in the palm of their hand.

Spoken Mandarin is virtually always spoken with tones (except when sung, or when foreigners speak it), but when written virtually never has tones (e.g. signs). Only children's books write the tone marks. So, along with various other intonation, there's a lot more information communicated during speech.

Sure, it happens all the time. "Let me eat a Chicken" has the same pronunciation as "Let me eat a dk". Same thing as someone named focker...

Not really, except that Chinese is a fantastic language for puns :D

> As someone from mainland China and learned pinyin in grade 1 of primary school, which happened in early 90s, in Beijing, I use pinyin daily, for inputting Chinese into electronic devices.

Do you think learning Pinyin first threw off your early pronunciation of English?

If people learned their languages in unique phonetic characters, would they be more likely to focus on learning the new sounds of other languages?

Like many other Chinese have replied, we did learn pinyin in school and use it to learn how to pronounce new characters. All Chinese dictionaries have pinyin in them.

Pinyin to us is basically what the International Phonetic Alphabet is to English. https://en.wikipedia.org/wiki/International_Phonetic_Alphabe...

This is especially important because each region in China speaks its own dialect of Chinese and sometimes it is almost a completely different language. As a Cantonese speaker growing up, pinyin was my way of learning Chinese.

They are basically different languages (calling them dialects of one Chinese language is really more a political thing -- a bit like calling the languages of Europe dialects of Latin).

Born in a Hakka family, growing up in a Cantonese-speaking region, having studied in places where ~50% of the students are Teochew people, and working with people who speak Wenzhounese, I consider Cantonese, the Hakka language, Teochew, Wenzhounese, and Mandarin Chinese all dialects of the Chinese language, and can't see what is so special about Cantonese. At least Wenzhounese is much more different from Mandarin Chinese than Cantonese is, but why use the deviation from Mandarin Chinese as a standard anyway? Mandarin Chinese itself should be a dialect.

To be counted as a different language, I would say it should be as different as Zhuang language, Tibetan, Uyghur, Manchu, Hmong, etc. People who speak these languages might or might not come from politically controversial regions, but I don't think anyone would consider these languages Chinese for a second.

No, they are dialects.

They share the majority of the syntax and grammar. Their written form are all the same. Pronunciation is the major difference. This level of unity has probably been established since the Qin dynasty.

> Their written form are all the same.

Regional languages in China have separate writing systems when you need to write what is actually spoken.



Easiest way to see this difference is watch the news in Cantonese -- it is mostly 书面语 (standard) mandarin but pronounced as Cantonese, whereas day-to-day spoken Cantonese is completely different in structure and vocabulary. What do you think gets written in HK movie scripts? Definitely not standard written mandarin.

Yes. It's a bit as if French, Spaniards, and Portuguese would all correspond in Latin, and then say, see, it's one and the same language :-)

This is also true of Spanish and Portuguese but they are still considered different languages. If two people speaking different dialects couldn't understand each other they are surely different languages.

To take a different perspective, calling them separate languages helps reinforce a nationalist agenda. It gives your citizenry another reason to feel separate from, and maybe superior to, other nations. Calling Spanish and Portuguese separate languages, instead of two Iberian dialects of modern Latin, helps reinforce the idea that Spain and Portugal are separate nations, and not just different regions of the same peninsula.

Yeah but maybe we should let linguistics and science decide what a language is or is not, and not bend the truth for political reasons?

I mean, 90%+ of Chinese people were taught and speak Standard Mandarin anyway. Language is not a reason for separatism today.

My understanding is that there is no bright line in linguistics and that language definition is by default messy and political

It's not really by default, more like by necessity. Almost all categorization schemes for human cultural artifacts are necessarily blurry at the edges, and just because someone has provided a definition that is useful for one purpose (the study of linguistic differences across cultures) does not mean that it is suited for another purpose (rhetorically implying solidarity or separation between two sets of people).

Mutual intelligibility is the standard criterion, and by that standard, there are 7-8 different language families in China.

There's also Catalan: https://www.theguardian.com/world/2012/nov/22/catalan-langua...

and the possibility that Catalonia might break away from Spain: https://www.theguardian.com/world/2016/jan/10/catalan-indepe...

Fair enough.

To be consistent with the linguistic facts, one should speak of the Chinese languages. You could also consistently speak of a Chinese language with dialects, and then also consider Romanian, French, Spanish, Portuguese different dialects of the Romance language; but that would redefine the terms from how they're currently commonly understood.

Your assertion is a political one, and not universally shared. The Central Party has incentives to make that claim for nationalist reasons, and has been doing so for decades, but that does not necessarily make it true. In my personal opinion, whether it is a dialect or not should be answered by the people who speak it, not by the people who rule them.

Well, you sort of prove my point, that it's a political thing, about the unity of the country.

Here's what linguists say:

1. From "Visible Speech: The Diverse Oneness of Writing Systems" by John DeFrancis:

"Chinese [...] is an umbrella designation for at least eight present-day varieties of what are usually called "dialects" but, since they are mutually unintelligible, might better be considered parallel to the various languages that make up the Romance group of languages."

2. From "Asia's Orthographic Dilemma" by Wm. C. Hannas:

"some eighty million or more people living in China [...] speak non-Chinese languages written in alphabetic or indigenous systems. [...] If we ignore this inconvenient phenomenon and focus on the speech of China's Han population, we find a collection of at least seven or eight mutually unintelligible varieties that in any other context would be called "languages," but which are "dialects" in China, in part for political reasons and in part because of a problem with the translation of the Chinese term fāngyán. The political motivation for claiming that these distinct varieties constitute a single language is fairly obvious: it is easier to govern a country in which the majority believe they are speaking one "language" (whatever the linguistic reality) composed of several "dialects" instead of several related languages.


Most linguists familiar with the classification problem acknowledge that the major Chinese varieties differ from each other at least on the order of the different languages of the Romance family.


We have seen that the Chinese languages differ not just in pronunciation but also in vocabulary and grammar, and that these differences are realized through unique morphemes (or unique uses of shared morphemes) for which characters do not exist at all, do not exist in Mandarin, or are used with different meanings and functions. Consequently, character texts in Cantonese and (where available) in Taiwanese are largely unintelligible to Mandarin readers. Many characters are completely unfamiliar; others are recognizable but make no sense in context. This occurs where conventions exist for writing the non-Mandarin variety in characters. Actually, most of these languages have no established writing system and hence lack even the possibility of being understood by readers of other varieties.

[1] http://pinyin.info/readings/texts/visible/index.html

[2] http://pinyin.info/readings/orthographic.html

It is a subject of argument because there is no clear definition of what a language is. There is spoken language and there is written language, and then there is the culture behind the language. Chinese dialects resemble completely different spoken languages. However, they all share the same written language. Chinese was rarely written down exactly the way it was spoken, and this has been the practice for two thousand years (probably ever since the unification under the First Emperor). In fact, the written language feeds back into the spoken language as well. Chinese speakers may speak the way they write -- that is, they may speak in the same way as the written form, with their various dialect pronunciations. In a way, there is spoken Chinese for illiterate people, and there is spoken Chinese for literate people. Last, with a common written language, the Chinese share a culture. A language communicates meaning, and the meaning is the meaning of the culture. With the same culture, there is little barrier to communicating (even between people who speak different dialects). It is weird, but it is not unusual to have two people communicate in two different Chinese dialects -- wouldn't that be the practical definition of a language?

And by politics, I think one really means culture. It is the need to communicate with each other on a daily basis that forges a language.

Well, I was mostly commenting on traditional Chinese. Traditionally most of the population was illiterate, so there was little problem with having disconnected spoken and written languages. With modernization and the majority of the population becoming literate, spoken and written language have been unified. That has updated both the Chinese spoken language and written language to a common form -- dialects are becoming merely different pronunciation forms. Today, with schools mandating the speaking of Mandarin, dialects (the spoken languages) are on the way out.

This process took place in the mainland, Hong Kong, and Taiwan alike. But due to the isolation, it took slightly different forms. That is why, from a foreigner's view, Chinese dialects appear to be different languages. This is not fundamentally different from the split between French and Spanish; only the extent of time differs. With only decades of separation, the Chinese of Hong Kong and the mainland, e.g., are still viewed by most Chinese as the same language.

Having spent my first few years in mainland China, I can say that the very typical kindergarten I went to taught pinyin and not bopomofo. Pinyin is the standard romanization in mainland China, not Wade-Giles (which seems to be preferred in Taiwan?)

Also, I would say that yes, pinyin did help me learn the language, and was especially important in providing a phonetic standard for characters.

> Pinyin is the standard romanization in mainland China, not Wade-Giles (which seems to be preferred in Taiwan?)

Taiwan is in a bit of a confusing place right now with phonetic alphabets for Mandarin. Here's my understanding - I'll use the abbreviations "WG" for Wade-Giles and "HP" for "Hanyu-Pinyin".

- They traditionally used Wade-Giles to romanise things for international consumption, e.g. Taipei (HP: Taibei), Taichung (HP: Taizhong), Kaohsiung (HP: Gaoxiong) etc.

- They officially switched to Hanyu Pinyin as their official romanisation about 10 years ago, but it seems to only have taken place on a very low scale, e.g. the Da'An (WG: Ta'an) district in Taipei, or Wuri (WG: Wujih) in Taichung. With regards to people - the second president of the ROC-On-Taiwan is known in English by his WG name Chiang Ching-kuo (HP: Jiang Jing-Guo), while the current president is known exclusively by her HP name Tsai Ing-Wen (WG: Zai Ying-Wen). You can see this confusion manifest itself today - we still eat "Peking Duck" (WG), but we refer to the city as "Beijing" (HP).

- Some places in Taiwan are romanized with neither system, like Keelung, which seems to be some ad-hoc romanisation of Taiwanese Hokkien (another Chinese language entirely) that took root in the 19th century. Though interestingly enough, in the most common Taiwanese Hokkien romanisation system it would be "Kelang".

- In terms of education for locals, it's 100% Bopomofo as far as I can tell. I've also yet to meet a Taiwanese person who uses anything other than Bopomofo for input on their phones and computers. Apple actually added an extra tone mark to their Bopomofo IME (high tone is indicated by no tone mark, traditionally), which some Taiwanese people now seem to think is actually part of the alphabet!

- And lastly, in terms of education for foreigners, it seems to be a hugely mixed bag. Many foreigners now demand the use of Pinyin because it uses the Latin alphabet and it's more comfortable for them. A lot of places will still expect you to learn Bopomofo (which IMO has a lot of advantages - it's only 37 characters, stops you thinking in terms of your native pronunciation, and the IMEs for it are way better because they let you filter characters by tone). AFAIK Wade-Giles is dead for teaching, but one famous Taiwanese book for teaching Mandarin I used uses all three phonetic alphabets - WG, HP and Bopomofo, which was quite distracting!

Full disclaimer - my Mandarin is rudimentary and I am not Taiwanese.

It's slightly more complicated. Besides WG and HP, Taiwan actually once used a third system called "Tongyong-Pinyin" (Tongyong means general/universal usage); I'll use TP as the abbreviation.

- Taiwan officially used TP during 2002~2008; the reason was to unify the romanization/pinyin system for Mandarin/Taiwanese Hokkien/Taiwanese Hakka.

- It then switched to HP in 2008 for an English-friendlier environment.

- It is 100% Bopomofo in education. Most Taiwanese people use Bopomofo as our computer input method, but there are alternatives like Cangjie (based on how we write characters) and Ziran (literally "natural", a mix of Cangjie and Bopomofo plus some heuristics). The alternatives were invented because too many characters share the same Bopomofo, so you still have to choose characters after typing Bopomofo. Most alternatives have a low market share among the general public, but a high market share in typing-heavy jobs.

- Because WG was the official transliteration rule of the ministry of foreign affairs before 2002, and they only recommended rather than forced a switch to TP/HP, the names of most cities and people are still based on WG. So are those widespread words (like Peking Duck).

> Many foreigners now demand the use of Pinyin because it uses the Latin alphabet and it's more comfortable for them. A lot of places will still expect you to learn Bopomofo

As far as I know, the rule of thumb is that exchange students are taught using Pinyin, and full-time students use Bopomofo. Teaching materials usually have both, anyway, so I don't think it's a big deal.

> I've also yet to meet a Taiwanese person who uses anything other than Bopomofo for input on their phones and computers.

I've seen older people use the handwriting IME on iOS. You can tell when you see stray simplified characters or other variants that can't be produced with Bopomofo :)

Interesting. My only objection is to the bit about the current president. You said that she uses a pinyin spelling, but "TS" isn't a pinyin consonant.

Whoops! I use Bopomofo exclusively these days - I forgot c is a "ts" sound. Looks like her name is indeed spelled Wade-Giles style.

To add to the confusion, Taiwan seems to have had their own adaptation of wade-giles between 2002-2008 called Tongyong Pinyin...


The moral of the story is - when asking for directions in Taiwan, don't rely on the street signs much for pronunciation :D

She's Cài Yīngwén in HP, according to Wikipedia.

Many would argue that not switching to pinyin was a huge error.

Chinese literacy isn't necessarily that great...



I started learning mandarin only a few months ago, but from what I've learned so far, there is no easy way to switch from 汉字 to pinyin. Pinyin is an essential learning tool, but loses too much context and meaning encoded in the chinese characters. Even texts in the textbook somewhere after the 10th lesson start making much more sense when written in chinese characters than in pinyin.

To understand how pinyin helps Chinese people become literate you have to start from a Chinese perspective. An illiterate Chinese person would be fluent in spoken Chinese. So the barrier is the connection between written characters and pronunciations. Pinyin bridges that. With starter reading materials that are annotated with Pinyin, people can read (sound out) the material and understand it. So pinyin lets people read materials even when they contain characters that they don't know. Pinyin lets people read, and by reading, people learn the written form and become literate.

It's not trivial - I would not want to read an English or German text written in IPA. Nevertheless, I'm convinced that it is quite possible to write Mandarin in some sort of alphabetic script quite easily.

What people have been doing in Chinese is basically write only one syllable of a multi-syllable word, which would be ambiguous in speech, but disambiguated in writing by the character. But then, if your writing were more phonetic, you'd just have to write the full word.

I don't quite understand what you mean. In modern standard mandarin it is pretty much one character = one syllable. They'd have to travel 3000 years back in time to recover the lost syllables.

Yes, yes, absolutely.

My point is that written Chinese is much terser than spoken Chinese (particularly classical Chinese, of course) and that this is enabled by the disambiguation inherent in the characters. That is, where in spoken language you'd use a two-character word, when writing it would be sufficient to write just one character to evoke what's meant.

Now, if you were to write in some sort of alphabetical script, you'd just write the full two-syllable word.

Slightly offtopic. Our own languages start incorporating pictograms and hieroglyphs, there are a little over 1000 emoji codepoints. How many syllables do you need to spell out an emoji?

Virtually all languages use semantic digits to represent numbers. Just like the way you used "a little over 1000 emoji", instead of "a little over one thousand emoji". There's 10 "semantic" digits to the 26 "phonetic" letters on a keyboard.

This article is very accurate. Everyone in the mainland uses pinyin daily for input. Only the older generation uses handwriting for input.

I don't know the stats but 3 out of 10 people I know use Wubi for input instead of Pinyin. Wubi needs to be learned from the ground up but is faster to type by a half.

I think I understand your confusion. I also spent time as an ex-pat in China. This is a bad sentence from the article:

"Before Pinyin was developed, 85% of Chinese people could not read, now almost all can."

It reads as if the development of Pinyin is what led to this big change in the % of Chinese people who can now read. That is not correct and very misleading.

Pinyin was not the catalyst nor the major reason for changes in literacy levels. Rather it was new education policy and development of the new Simplified Chinese Characters (Newer Chinese characters with fewer line strokes in them - makes reading and writing much much more simple - it is still used today and is the official character language in China).

Pinyin was and is a huge deal, both for foreigners learning Chinese and, more importantly, later on when it was chosen as the input method once technology developed that required putting the Chinese language into computers and mobile phones.


> Rather it was new education policy and development of the new Simplified Chinese Characters (Newer Chinese characters with fewer line strokes in them - makes reading and writing much much more simple - it is still used today and is the official character language in China).

Then why do Hong Kong and Taiwan have higher literacy rates?

EDIT: Why the down votes? The parent is making an unsubstantiated claim for which there is conflicting evidence:

1) Reading simplified is not objectively easier in the inside view as some characters with distinct meanings are ambiguously combined, and semantic information stripped from many other characters so if you aren't quite remembering that character you aren't given any hints.

2) Writing simplified is not objectively easier in the inside view as many of the components are changed in form from the characters they derive from, and again some semantic clues are stripped from the character. So there may be slightly fewer characters composed of slightly fewer strokes on average, but those aren't the metrics one should use for measuring difficulty -- a character with more strokes can be easier to write on demand because the strokes are part of components that carry semantic meaning. You don't have to remember each little detail, just the broad gist of a plot for the character and then systematic rules fill in the rest.

3) Finally in the outside view, non-mainland communities like Hong Kong and Taiwan have always used and continue to use the traditional characters. And these communities have higher literacy rates across the board compared with mainland China.

People who have already learned simplified-only find simplified characters easier to read. That shouldn't be surprising, but it is not objective truth. And the PRC likes to toot its own horn and say that simplified characters are cleaner, neater, simpler, more modern, led to higher literacy rates, cure cancer, etc. But that doesn't make it true without further substantiation either.

Hey, I didn't down vote you but based on your question and your edit of an "unsubstantiated claim for which there is conflicting evidence" I think you may have mis-interpreted my comment.

Apologies for any confusion. I was not making a claim nor comparing between characters used by PRC vs. Taiwan vs. Hong Kong, in particular I wasn't ranking them as better or worse or easier or harder.

I understand there are ongoing differences, posturings and debates between PRC and Taiwan and Hong Kong, including about language, and the whole thing is serious and complicated. Bro, I have no dog in that fight. Good luck y'all.

Maybe see my other comment as it goes into more detail. But just on a basic level I think you have mis-interpreted the words "simple" and "easy". A pretty common mistake I have made in the past too. The characters are called "Simplified Chinese Characters" not "Objectively Easier Chinese Characters". The definitions of simple and easy are different. Context matters too.

For example: Contributing useful comments to a discussion on Hacker News is very simple. Clearly it is not easy.

I'm a little skeptical of both the claimed literacy rate and of simplified characters' part in the increase of literacy.

The replacement of characters on a phonological basis is good, but you still have to memorize thousands of characters, simplified or not. Further, I've never seen how "literacy" is defined for those statistics.

IIUC, pinyin was originally proposed as a replacement for characters, and there was some interest when the communist government took over, but the choice was finally made in favor of simplification.

Agree. Simplification just saves some strokes when writing by hand (and when hand writing "cursive style", you skip/elide/join strokes anyway).

Initially, it might even make things more confusing: In traditional, you immediately see that 言 is part of 語, but the same radical in simplified 语 looks different. Similarly, 金 in 錢, but different in 钱.

Be skeptical. I am too. Especially when it comes to statistics coming out of China, where details can be very unclear and the skeptics have been right a lot lately.

We can debate what the actual literacy rates are or how to define literacy. But it's pretty clear literacy in China has gone up in recent decades however you define it. Moreover, we can debate how big a part simplified characters played in the change in literacy, but my comment above was trying to clear up any confusion that pinyin played that big a part in literacy improvement. Pinyin has been taught in schools very recently, but it did not play the big part in improving the % of the population who can read and understand Chinese that the article suggested. You do not need to know one letter of pinyin or even have heard of pinyin to read Chinese fluently. In fact, the push to teach younger students pinyin and the roman alphabet in recent years may actually slow down learning to read Chinese a bit (pinyin is needed to use computers and mobile phones, helps when eventually learning English, and is worth teaching early even with such a tradeoff).

Also, memorizing thousands of characters is not the major problem. Firstly, you don't need to memorize thousands of characters. Words are multiple characters. You can read and understand 90%+ of the Chinese you encounter on a daily basis, involving thousands and thousands of Chinese words, with fewer than a thousand characters. Yes, it is harder to memorize a few hundred characters compared to an alphabet with a couple dozen roman letters, but characters are not all that difficult once you understand the system behind them (radicals, components etc). I know a few hundred million people who seem to have done it just fine.

Lastly, government policy alone could have improved literacy even without simplified Chinese characters. It wasn't a necessary step. And many Chinese can read and understand the gist of texts written in traditional Chinese characters even though they were never taught them. (Similarly, if you know English but were never taught Latin, go spend an hour reading about Latin and etymology and you can probably start understanding a lot more Latin than you realize. errare humanum est)

I'm a native Mandarin speaker. In my experience Pinyin was like a jump-starter for my self-assisted language learning: you can easily look up words in a dictionary with Pinyin and fill your mind with Chinese words. Also, some entry-level books come with Pinyin on top of the Chinese letters so you can learn how to pronounce them by yourself. After 1-2 years one can read Chinese and Pinyin will be put away.

One great thing is you can precisely pronounce name of a street or a medicine with less effort. Great for communication in daily life.

pretty much all children books in China include under Chinese characters also pinyin, exactly because it's used to teach children

i would not say pinyin didn't play role in increasing literacy levels, but it was one of the factors together with simplification of characters and wider availability of education

I think bopomofo is only used for learning in Taiwan, where it's also used for keyboards sometimes.

Incidentally, bopomofo is also pretty neat. It's based off common characters with the sound of the letter.

> "We spent three years developing Pinyin. People made fun of us, joking that it had taken us a long time to deal with just 26 letters"

I actually think it is quite fast. Grasping an entire language's phonology is a huge accomplishment and condensing it to its bare minimum is also impressive.

With that said, I must admit I don't know how pinyin relates to other phonetic systems such as the Bopomofo[0] or Wade-Giles[1] and for all I know it might just be the exact same system with different letters.

[0] https://en.wikipedia.org/wiki/Bopomofo [1] https://en.wikipedia.org/wiki/Wade–Giles

Pinyin is a little more compact (i.e. fewer Roman characters per Chinese character) than Wade-Giles, so storage is more efficient. That makes it easier to input Chinese characters when using an English keyboard (or keypad on a mobile phone).

It's also arguably more aligned with the English pronunciation of the letters.

Not at all. As a phonetics enthusiast I can assure you Pinyin is by far the worst at representing Chinese sounds with English letters. It is made to represent Chinese phonetics, but its creators are seemingly completely ignorant of basic phonetic rules, such as voicing and aspiration, and of how words actually sound in English.

As an example, b in English is what is called an unaspirated voiced consonant, which means you vibrate your vocal folds but send only a little breath when you sound it. You can compare it with p, its aspirated voiceless counterpart, for which you do not vibrate the vocal folds (voiceless) but send a strong breath through your mouth (aspirated). Chinese does not have most of the voiced consonants present in English, so Pinyin naively uses b (also d and some others) to represent voiceless unaspirated consonants, which do not sound like a b, but more like the p in “spade”. This results in most foreigners who learn Chinese with Pinyin bringing English pronunciation into their speech, which makes them sound way too stiff and thus awkward.

Wade-Giles, in comparison, is much more systematic, and does a much better job of hinting to speakers with a European (especially Germanic) language background how to pronounce things correctly. Bopomofo does not have this problem, as it basically invents a new script. Pinyin probably lets you pick up casual spoken Chinese most quickly, but beyond that, it’s a curse.

[Edit]: Voice “folds” not cords, sorry. Also fixed some typos.

Pinyin is very well-adapted for (Mandarin) Chinese. It isn't specifically designed for English learners, but if you understand phonetics and learn the sounds first, that shouldn't matter. The alphabet is originally from Latin and is used in a variety of ways in different languages, e.g. Turkish c sounds like English j, German j sounds like English y, so the foreign accent problem exists generally.

There are two bilabial stop consonants, called p and b, in many different languages, with different sounds, e.g.

                          Mandarin English French
    unaspirated voiced       -        b       b
    aspirated voiced         -        -       -
    unaspirated unvoiced     b        -       p
    aspirated unvoiced       p        p       -
though there is, as you point out, an unaspirated unvoiced allophone of p in "spade". Other stops (t, d, k, g) follow a similar pattern.

English speakers using English p and b in Mandarin will still be understood even if their b sounds foreign, but French speakers might not be, because French p sounds like a Mandarin b.

The only change I'd like is replacing Pinyin -ong with -ung.

I think it would also help to expand -iu to -iou and -ui to -uei, and of course u->ü where applicable. IMEs could still accept both forms.

Pinyin is pretty well-adapted for entering Chinese, but I think that goal is at odds with being easy to pronounce. I'm not sure that having one ISO standard for both purposes is a good idea, although if China ever starts to export more culture, maybe everyone will eventually learn how to pronounce Hanyu Pinyin.

Choosing aspirated unvoiced consonants as "typical" of English is very weird. Native English speakers do not consciously distinguish aspiration but they do distinguish voicedness so it seems it would be more accurate to call the aspirated form the allophone (aspiration only occurs at the beginning of words and stressed syllables, and never after "s" as you point out, so the unaspirated form probably occurs more frequently too).

Still, I agree with your main point. The Latin alphabet is not used like the IPA by most languages, even English has not preserved all the original Latin sounds as used by the Romans. And there's even precedent for some of the sound choices made in Pinyin that would seem completely alien to English speakers. For example "c" is used in all the Slavic Latin alphabets (e.g. Polish, Czech, etc) for the unaspirated version of the Pinyin "c" sound. German and Pinyin use the letter "z" for the same sound, etc.

And maybe replace yu with yü, etc., for consistency.

When I was taught Pinyin, the instructor explained that the phonetics were influenced by the Russian pronunciation of the roman alphabet. Not sure if this is true, or if it explains the issues you're describing. But it seemed plausible that China thought they were going to be closer to the USSR given the shared ideology in the 50's.

In any case, as a person who has a decent pronunciation in Chinese but lacking the ability to fully read/write, Pinyin has been absolutely great for being able to type Chinese informally. I can communicate almost anything I can say, and that's truly wonderful. Part of that is the smart prediction software. But Pinyin has a big part in it too: it's not difficult at all to sound out a word and figure out how to spell it in Pinyin. That's the essential feature that Chinese characters lack, and Pinyin nails that essential quality as far as I can tell.

I know very little Russian and can’t say for certain, but I think Russian is similar to English in these consonants.

I should make it clear I don’t think Pinyin is a bad system, only that it does not map sounds to European languages well. And that is fine, at least for its original purpose. Contrary to common belief, Pinyin was not developed to better educate illiterate people (this does not even make sense if you think carefully), but as the next step after letter simplification (what we know as Simplified Chinese today) towards Chinese Romanisation. The goal was not to represent the sounds of Chinese characters, but to outright replace them. The actual reason behind using (for example) b and p instead of p and p' is exactly this: they were optimising for ease of writing, not sound accuracy.

And in regard to making it easy for people not proficient with Chinese letters to read, be they foreigners or simply illiterate, a phonetic transcription system is indeed important. But although having a standard phonetic transcription was key to the jump in literacy in China, I would argue the same could have been achieved with any decent system, and a few of them were already around by the time work started on Pinyin.

All in all, while Pinyin is good for its original purpose, it achieved its current status not for academic reasons but for political ones. The Chinese Communist government has always had a habit of ignoring existing systems and inventing its own. :p

Definitely not. Russian has no aspirated consonants at all, nor would they pronounce them in the Latin alphabet. In fact a Russian would pronounce the roman "T" exactly the same way a Mandarin speaker would pronounce the pinyin "D" so I don't think there's much credence to your instructor's story.

There was Russian influence, but maybe not for all the letters. Pinyin 'q' is based on Russian 'ч' if I recall correctly.

I don't think pinyin was even meant to represent voicing and aspiration in a way that is similar to English or other Germanic languages. There are many other languages using the Latin alphabet which differ a little in both consonants and vowels, and that doesn't stop them from sharing the letters.

Pinyin does a good job in providing a fairly consistent way to present Chinese sounds using Latin alphabet. It's also the workable way to input Chinese characters on a computer keyboard. It works, it is consistent, and it is now ubiquitous in China.

Well, it's not that simple though. You have 2x2 possibilities:

1. unaspirated unvoiced (av)

2. unaspirated voiced (aV)

3. aspirated unvoiced (Av)

4. aspirated voiced (AV).

German and I think English have 2/aV (b, d, g) and 3/Av (p, t, k), the diagonal so to speak (for the beginning of a syllable).

Mandarin has 1/av and 3/Av. So, 3/Av is pronounced the same - and Pinyin renders those as (p, t, k), as in German/English, so that's actually nice. Wade Giles renders them as (p', t', k'), indicating the aspiration.

German and English do not have 1/av (at the beginning of a syllable). Pinyin then chose to render them as (b, d, g), correctly indicating the unaspirated character, but tempting German/English etc. speakers to voice them (wrongly). Wade-Giles render them as (p, t, k), tempting German/English speakers to aspirate them (wrongly).

No easy solution. Plus, the apostrophes are so often ignored (e.g. Pinyin Taibei = Wade-Giles T'aipei).
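To make the rendering difference concrete, here is a minimal Python sketch that rewrites a leading Wade-Giles stop as its Pinyin letter. The function name and the table are hypothetical, made up for illustration; only the six stop initials discussed above are covered, affricates (ts, tz, ch) are deliberately left alone, and the rest of the syllable is passed through unchanged, which only works for syllables whose finals happen to be spelled the same in both systems.

```python
# Stop initials only. Wade-Giles marks aspiration with an apostrophe,
# Pinyin marks it by choosing p/t/k over b/d/g.
WG_TO_PINYIN_STOPS = {
    "p'": "p", "t'": "t", "k'": "k",  # aspirated unvoiced
    "p": "b", "t": "d", "k": "g",     # unaspirated unvoiced
}


def convert_stop_initial(syllable):
    """Rewrite a leading Wade-Giles stop as Pinyin; leave everything else as-is."""
    s = syllable.lower().replace("ʻ", "'").replace("’", "'")
    if s.startswith(("ts", "tz")):
        return s  # affricates need their own rules; not handled in this sketch
    for wg in ("p'", "t'", "k'", "p", "t", "k"):  # aspirated forms first
        if s.startswith(wg):
            return WG_TO_PINYIN_STOPS[wg] + s[len(wg):]
    return s


print("".join(convert_stop_initial(s) for s in ["t'ai", "pei"]))  # taibei
print(convert_stop_initial("tai"))  # dai
```

The second call shows why dropping the apostrophe matters: without it, Wade-Giles "tai" denotes the unaspirated syllable that Pinyin writes as "dai".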

Technically "p" in English does not default to either the aspirated (pot) or the unaspirated variety (hop). English speakers don't distinguish word meanings by that difference alone, so the two sounds are considered allophones. Perhaps you knew that already and I just misread your comment, but I thought it might be helpful info if not.

"Hop" comes out strongly aspirated for me. Certainly "mop" and "mob" don't sound the same.

You seem to be confused about the difference between "aspirated" and "voiced". Both these words are unaspirated for native English speakers.

Not for initials (consonants):

I find Wade-Giles ch'in more evocative than the Pinyin qin, and similarly hsüan vs xuan, and even tzŭ vs zi and tzʻŭ vs ci. Wade-Giles you basically have to learn not to ignore the apostrophe, Pinyin you have to learn most of the initials. Though the Wade-Giles jih vs Pinyin ri is a bit of a toss-up.

For finals, I find Wade-Giles more consistent than Pinyin "yu", but Wade-Giles "to" less descriptive than Pinyin "duo". But then, it's shorter.

So, yeah, "arguably" :-)

You're right about Bopomofo - both it and pinyin are based on the same pronunciations so there's actually a one-to-one correspondence between the two systems:


As there are only about 400 syllables (ignoring tones) in standard Mandarin, it's a fairly quick exercise to make a complete one-to-one mapping :-)
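As a rough illustration of that correspondence, here is a small Python sketch covering only the 21 initials. The dictionary names are made up for this example, and it says nothing about finals or whole syllables, where the spelling conventions of the two systems differ more.

```python
# Bopomofo (Zhuyin) initials and their Hanyu Pinyin spellings.
BOPOMOFO_TO_PINYIN_INITIALS = {
    "ㄅ": "b", "ㄆ": "p", "ㄇ": "m", "ㄈ": "f",
    "ㄉ": "d", "ㄊ": "t", "ㄋ": "n", "ㄌ": "l",
    "ㄍ": "g", "ㄎ": "k", "ㄏ": "h",
    "ㄐ": "j", "ㄑ": "q", "ㄒ": "x",
    "ㄓ": "zh", "ㄔ": "ch", "ㄕ": "sh", "ㄖ": "r",
    "ㄗ": "z", "ㄘ": "c", "ㄙ": "s",
}

# Inverting the table loses nothing only if no two symbols share a spelling.
PINYIN_TO_BOPOMOFO_INITIALS = {
    pinyin: zhuyin for zhuyin, pinyin in BOPOMOFO_TO_PINYIN_INITIALS.items()
}
is_one_to_one = len(PINYIN_TO_BOPOMOFO_INITIALS) == len(BOPOMOFO_TO_PINYIN_INITIALS)

print(is_one_to_one)  # True
```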

Oh, and here we go: All 400 syllables of standard Mandarin in Bopomofo/Zhuyin Fuhao, Wade-Giles, and Hanyu Pinyin:


>I actually think it is quite fast. Grasping an entire languages phonology is a huge accomplishment and condensing it to its bare minimum is also impressive.

Absolutely. But I don't see how this statement could apply to pinyin. The work done on German's last spelling reform was impressive in this regard, but faulty pinyin was not.

Care to expand on the faults of pinyin, and how they are more erroneous than say the German spelling reform?

My experience with pinyin is that it is very robust, and I don't remember ever encountering pinyin that was ambiguous about how it is read (unlike written English or romanized Japanese). I have also never heard of standard Chinese syllables that are lacking a clear mapping onto pinyin.

To me that is very impressive.

Compare "xi an" with "xian". One is 2 words and one is 1 word. But for some reason Chinese people don't like to put spaces in between words so "xi an shi" is often written as "xi'anshi" which is different from "xianshi".

Mandarin speakers seem to be fairly sloppy with their own phonetic alphabets - you're supposed to have spaces in bopomofo and apostrophes in pinyin to handle cases like these, but they often don't bother.

It becomes clearer with tones I guess, "Xi1 An1" vs "Xian1". But again, proper pinyin with tones seems to be a foreigner thing... I don't recall seeing tone marks or numbers on street signs.

But that's what you have the apostrophe for in Pinyin. Works fine.
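A minimal Python sketch of the apostrophe convention, assuming toneless, already-segmented syllables (the function name is made up for this example): an apostrophe goes before any non-initial syllable that starts with a, o, or e.

```python
def join_pinyin(syllables):
    """Join toneless pinyin syllables, marking ambiguous boundaries with an apostrophe."""
    parts = []
    for index, syllable in enumerate(syllables):
        # A syllable starting with a/o/e could otherwise be read as part
        # of the previous syllable's final, e.g. xi + an vs. xian.
        if index > 0 and syllable[0].lower() in "aoe":
            parts.append("'")
        parts.append(syllable)
    return "".join(parts)


print(join_pinyin(["xi", "an", "shi"]))  # xi'anshi
print(join_pinyin(["xian", "shi"]))      # xianshi
```

With the apostrophe in place the two spellings stay distinct even without spaces or tone marks.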

But pinyin isn't Chinese so your point just doesn't exist.

Well of course it's not ambiguous like English because they had no historical baggage. But properly romanized Japanese is also never ambiguous.

"ene" can be えね and えんえ.

> But properly romanized Japanese is also never ambiguous.

What is "properly romanised Japanese" anyway? There are several modern systems in use [1], often mixed together (even the government can't stick to the government-mandated Kunrei-shiki). Some systems represent ず and づ the same, some represent ティ and チー the same, some represent じ and ぢ the same. ō represents both O+O and O+U.

[1] https://en.wikipedia.org/wiki/Romanization_of_Japanese#Moder...

えんえ should be en'e according to the government system. ティ is used exclusively in loan words anyway as far as I'm aware. Wouldn't you just write it in its original language?

Also the other ones he mentioned are pronounced the exact same way.

(you probably know this but) zu du and zi di do make a difference in terms of which kanji are used. The government has been reforming kanji readings over time though E.g. 藤 ふぢ → ふじ or 図 ヅ (look at the character in the box!)→ ず

Leaving out the apostrophe for えんえ is not standard in any of the three systems. Mixing and matching the systems is also not what I meant by "proper."

I thought 花 and 鼻 were ambiguous in most romanizations - at least I would write it as "hana" in all romanizations I can think of, yet the pitch accent differs. No?

There are systems (basically what looks like a long sideways L) but you're right that romanization doesn't record pitch accent. But there are big regional variations in pitch accent and people won't have trouble understanding you if you use the wrong one, mostly (and you could level the same complaint at hiragana or katakana, after all).

Nevertheless, that's a fair point that I wasn't really thinking about.

I would think the parent is referring to unambiguous in terms of pronunciation. You're right in that you wouldn't know which character was originally used in isolation—or whether kanji was used at all.

As far as I know, there's little (any?) pitch accent that distinguishes between words in Japanese. What pitch accent are you referring to?

Every word in Japanese has a standard pitch accent and some otherwise homophonous ones differ in pitch accent. However, from one dialect to another the pitch accents on the same word can be totally different so I don't think they're usually a big hindrance to communication. Pitch accent differs from English stress accent.


To be clear, you're saying that pitch can change due to regional accent, as opposed to distinguishing between homophones, which is what 'singhblom was implying, correct?

> I thought 花 and 鼻 were ambiguous … yet the pitch accent differs. No?

It does both things. Pitch accent is the only thing distinguishing some homophones (hana, hasi, ame, kumo) but it's also got variation in dialects including a handful where the Tokyo and Osaka versions have the same distinction between two words except with the opposite meaning (sorry, I don't have examples off the top of my head).

Fortunately there aren't that many sentences where it's equally plausible you meant both spider and cloud.

Interesting. Admittedly not a native speaker, I never found pitch accent used to distinguish between homophones during my time in Japan, nor was it introduced in any of the courses I took on Japanese while I was there, including the examples you provide.

I majored in Japanese and it wasn't really seriously introduced until I studied in Japan and I'd already been studying Japanese for three years. But while not learning it won't hinder your ability to communicate too much it can help you sound more natural. NHK publishes a dictionary of pitch accent in standard Tokyo dialect: https://www.amazon.co.jp/NHK-%E6%97%A5%E6%9C%AC%E8%AA%9E%E7%...

Thanks for the reference! Could I trouble you to provide an example from that text that describes such a homophone (as opposed to regional dialect) distinction?

The examples I gave upthread all have pairs with different accents in hyozyungo.

Okay. I was hoping for quotes from the text itself.

> but faulty pinyin was not.

Is this a real opinion or just a political stand against PRC? How is pinyin faulty, specifically with respect to Mandarin Chinese?

What Romanization system is better and how so?

OP elaborates more about this below.

Very cool.

I noticed an interesting quote in the article:

"Before Pinyin was developed, 85% of Chinese people could not read, now almost all can."

While the literacy rate used to be 15% and is now 95%, that means that there are still 54 million illiterate people in the country. Obviously nothing wrong with the quote and it is at least in spirit correct, but it's interesting how vague language can be.
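For anyone who wants to check the arithmetic, here is a small Python sketch. The function names are made up, and the 1.38 billion total-population figure is an assumption used only for comparison; it does not come from the article.

```python
def illiterate_count(population, literacy_percent):
    """People who cannot read, given a population and a literacy rate in percent."""
    return population * (100 - literacy_percent) // 100


def implied_population(illiterate, literacy_percent):
    """Base population that a given illiterate count implies at that literacy rate."""
    return illiterate * 100 // (100 - literacy_percent)


# 54 million illiterate at 95% literacy implies a base of about 1.08 billion,
# which looks more like an adult population than the whole country.
print(implied_population(54_000_000, 95))   # 1080000000

# Applying 5% to an assumed total of roughly 1.38 billion gives a larger number.
print(illiterate_count(1_380_000_000, 95))  # 69000000
```

Either way the gap between "almost all" and "all" is tens of millions of people.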

Edit: while looking for those numbers, came across this which is a bit dated now (2001), but gives more insight into the situation there:


According to Wang Dai, an official with the ministry's Illiteracy Elimination Office, literacy rates stand uniformly near the 95 percent level in the nation's major cities, and throughout its more prosperous coastal region.

In China's less developed western region, however, there remains much to be done. The problem is especially acute in areas populated by non-Chinese ethnic minorities, such as Tibet, where illiteracy rates today run as high as 42 percent.

Nationwide, there are still 30 million Chinese between the ages of 15 and 50 who cannot read at all. Adding in all those defined as "semi-literate" and those, like Hua Lijun, who are above 50, the total approaches 150 million. About 70 percent of all illiterate Chinese are women. The largest gains against illiteracy were made in the 1950s and 1960s when the government made good on its promises to provide at least basic education throughout much of the country.

Communist China is really good at massaging statistics to make things look better than they actually are.

Significant numbers of people in modern China can barely communicate to each other about complex ideas in Mandarin (several hundred million are 2nd language speakers or quasi 1st-language speakers).

Someone reading faulty pinyin and making barely comprehensible sounds might count as literate in fluffy government surveys commissioned to show progress, but in reality being able to read pinyin does not mean someone can read in the way you or I understand it.

If you need to understand the meaning of something written in Chinese, you must see the Chinese character. Just because you can read the pinyin pronunciation doesn't mean you understand what's being said. Being able to produce sounds is not literacy - there's a high amount of ambiguity.

For example, reading

Shíshì shīshì Shī Shì, shì shī, shì shí shí shī. Shì shíshí shì shì shì shī. Shí shí, shì shí shī shì shì. Shì shí, shì Shī Shì shì shì. Shì shì shì shí shī, shì shǐ shì, shǐ shì shí shī shì shì. Shì shí shì shí shī shī, shì shí shì. Shíshì shī, Shì shǐ shì shì shí shì. Shíshì shì, Shì shǐ shì shí shì shí shī. Shí shí, shǐ shí shì shí shī shī, shí shí shí shī shī. Shì shì shì shì.

does not tell you the actual meaning of

石室詩士施氏,嗜獅,誓食十獅。 氏時時適市視獅。 十時,適十獅適市。 是時,適施氏適市。 氏視是十獅,恃矢勢,使是十獅逝世。 氏拾是十獅屍,適石室。 石室濕,氏使侍拭石室。 石室拭,氏始試食是十獅。 食時,始識是十獅屍,實十石獅屍。 試釋是事。
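To see how little the pinyin carries once tones are dropped, here is a minimal Python sketch run on the first sentence of the passage above. The helper name is made up; it strips every combining mark, which is fine for this text but would also turn ü into u elsewhere.

```python
import re
import unicodedata


def strip_tones(text):
    """Remove tone diacritics by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


line = "Shíshì shīshì Shī Shì, shì shī, shì shí shí shī."
letters = "".join(ch for ch in strip_tones(line).lower() if ch.isalpha())

# Without tones the whole sentence is one syllable repeated.
only_shi = re.fullmatch(r"(shi)+", letters) is not None
syllable_count = len(letters) // 3

print(only_shi, syllable_count)  # True 12
```

Twelve characters in the original collapse to a single toneless syllable; the tone marks bring back some of the distinction, but not which character was meant.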


Another story entirely is the idea of adding "Simplified" Chinese characters. Now people who want to be seriously literate have to learn an additional set of characters. But I digress.

For another example of massaging statistics to show progress, notice how when a comparison of "China's educational progress!" is made to the West, a place like Shanghai - whose school district is among the best if not the best in the entire country - is often then compared to the national average of other countries as if one were somehow representative of the other.

That's a ridiculously contrived example. By this logic I guess we need to introduce a system based on Chinese characters because it's hard to interpret the meaning of "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo" (https://en.wikipedia.org/wiki/Buffalo_buffalo_Buffalo_buffal...)

Sign up for WeChat and start talking to people in just pinyin. They'll tell you how much they don't enjoy talking to you.

I like a good discussion, but you are really stretching each and every post you make. I know people that simply don't write characters at all. I know people that prefer characters.

I've written people messages in just pinyin and they tell me they don't understand when it's just pinyin. Maybe they were just illiterate?

Nobody will have a discussion with you in English using Cyrillic characters either, but not because English is uniquely suited to the Latin alphabet. Obviously people who read and write with Chinese characters all the time are going to prefer what they're familiar with.

As a neat curiosity, ambiguity sometimes cuts both ways - reading

云朝朝朝朝朝朝朝朝散 潮长长长长长长长长消

does not help you unless you also know the intended pronunciation:

yún, zhāo cháo, zhāo zhāo cháo, zhāo cháo zhāo sàn; cháo, cháng zhǎng, cháng cháng zhǎng, cháng zhǎng cháng xiāo


I agree with the overall point - Pinyin will never be a replacement for written Chinese. Indeed, both Japanese and Korean struggle with this to some extent - in Japanese, a sentence written in pure hiragana with no kanji is very difficult to understand, and in Korean, many people learn hanja as they are sometimes needed in writing to disambiguate.

Actually, except for the very old most Koreans know pretty much no hanja beyond really basic ones like 大. I always think this argument is ridiculous. Japanese, Korean, and Chinese people have managed to speak their respective languages (or ancestors of them) for thousands of years without being excessively troubled by those pesky homophones despite the fact that until the 20th Century only a small minority of them were literate.

Very much this. Some more thoughts:

* Standard Mandarin has a surprisingly small number of distinct syllables, namely about 400 when tones are ignored, and 1300 with tones. English has about 7000 to 15000. So, homophones are a bit more of a problem in Mandarin than in English.

* English disambiguates some homophones in writing (where/were, there/their/they're, and so on), though as less literate people increasingly demonstrate, one can communicate reasonably well without these disambiguations.

* Spanish has an extremely close correspondence between what is written and what is said, and only disambiguates a very few words (typically question words) with an accent, and the writing system works just fine.

* Given that spoken Mandarin works, and disambiguation of homophones in alphabetic languages is only sometimes done, I am personally convinced that some sort of alphabetic writing for Mandarin is absolutely feasible. That might be Pinyin or Bopomofo or some variation thereof.

* Personally, I find the Chinese script very beautiful, but a colossal waste of time (think of the billions of poor kids that had to learn it).

* Would switching to some sort of alphabetic system cut people off of their cultural heritage? Maybe somewhat, but I'd argue it's mitigated by two things: a) Classical Chinese is in fact so remote from modern one that you have to study it anyways in depth, if you are into it. You could still do that, if you are into it. b) Nearly nobody speaks Latin anymore in the West, the horror. Are we cut off of our cultural heritage? Well, somewhat, but there are translations, and if you're really into it, you can still go and learn it, see a).

Great resources on these issues, btw:

* "The Chinese Language: Fact and Fantasy", John DeFrancis

* "Asia's Orthographic Dilemma", Wm. C. Hannas, extract available here http://www.pinyin.info/readings/orthographic.html

* Pinyin Info, http://www.pinyin.info/

* Language Log, http://languagelog.ldc.upenn.edu/nll/ , in particular Victor Mair's posts

Yeah I'm a fan of all the resources you mentioned. It's hard to talk about this without people accusing you of just being too lazy to learn, never mind that I've already put in the time to learn the Japanese writing system.

One way to disambiguate in Korean, especially in North Korea, is, guess what, simply to use another word in place of the ambiguous one. Why keep using a word that is causing confusion anyway?

It's a chicken-and-egg problem. When most people write in "phonetic" script, people stop using ambiguous words that sound alike, so these words fade away, which further strengthens the adoption of phonetic script.

If everyone around you uses Chinese characters, there's little incentive to avoid homophones, so they will remain a part of literary language, reinforcing the use of Chinese characters.

That is exactly what I meant. The theory that doing without Chinese characters makes languages like Korean and Vietnamese 'worse' is a myth constructed by some Sino-centrists. The decision to keep or deprecate Chinese characters is political, but people find a way to adapt and evolve with the new language.

Very true.

And indeed, Korean words that come from Chinese (about half) have a much higher ratio of homophones than in the native Korean part.

Languages are for communicating, if there is ambiguity, ask and let the person clarify it.

In fact, zhuyin is equally bad at resolving the Lion Poet poem. And in reality, no one really speaks like this.

> Now people who want to be seriously literate have to learn an additional set of characters

It is not up to you to decide what 'literacy' is. In a society where most people use simplified characters, understanding traditional Chinese brings only marginal benefits, just as you don't speak Shakespeare's English on a daily basis.

A more convincing counterexample would be nations like Vietnam and Korea, where Chinese characters have long been deprecated; their people are not illiterate in any way and are doing absolutely fine.

Some traditional Chinese users fall into the trap of tying a language or writing system to their own identity, which keeps them from thinking about the thing they are objecting to in a more rational and constructive way, as in this case. Simplified Chinese is here because people are happy with it and it is not limiting their expression. If the language serves the purpose of enabling communication between people, then it is a good language, and it has every reason to stay. I don't see why a thriving language with a user base of 1 billion should go anywhere based on some niche group's agenda.

Correct me if I'm wrong, but from what has been explained to me, the issue with Simplified Chinese is that it's "simplified" by removing characters for homonyms in Mandarin Chinese. So this works fine for Mandarin speakers, but in places like Taiwan, where they speak a different dialect, it does not make sense since there is one character for different words that are not pronounced the same, making the writing difficult or impossible to read... Which is why they stick to Traditional, and might understandably see Simplified as an affront to their identity.

Ok: you are wrong. There are several classes of simplification and most of them involve standardizing common written shorthand or replacing radicals to form new characters. There is some replacing of characters for homophones, but this is the least common type of simplification. As for Taiwanese, the problem there is that, simplified or not, the language features a lot of words that just don't have a standard representation in Chinese script. The whole idea that Chinese script lets Chinese people just write their vernacular speech and all understand each other is fallacious. Almost all written material in Chinese (especially anything formal) is written according to the grammar and vocabulary of Mandarin (although perhaps it will be read with the pronunciation of a different Sinitic language).

> there's high amounts of ambiguity.

Were that true, Mandarin speakers would have trouble understanding each other when they speak. The poem is written in Classical Chinese, and in that language, the different characters would have different pronunciations (as is also the case in modern Cantonese).

With hindsight, you're right about the simplified characters. Traditional characters are still used in Hong Kong and Taiwan ROC. Simplified characters are easier to write, but not necessarily to read. But Chinese isn't written as much nowadays - it's usually typed in Pinyin and the appropriate characters selected by software, a development that couldn't have been foreseen when simplified characters were introduced.

It is fascinating, and a cute curiosity, but not a serious argument for the necessity of characters as opposed to an alphabetic writing system.

The "shi shi shi shi shi" poem caught my attention, so I looked for a video:


He saw his eleventy-first birthday? That's a rather curious number and a very respectable age for a human.

Being 111 years old would have made him one of the 5 oldest men living on earth ... or at least yesterday, before he died

only about 1500 suerentinarians are thought to have been documented in history, most of whom were women. being male is really rare... maybe only a hundred total out of 50 billion people who have ever lived...

The easiest way to live to be over 100 is to use your father's birth certificate. Pretty common, and easy to pull off because there's nobody else alive who can contradict you.

Interestingly your comment is the only google result for "suerentinarians"

Maybe it's a misspelling of supercentenarians?


Pretty solid evidence of a time traveler. :)

To understand the impact of Pinyin, all you have to do is take a look at the alternative input sources for Chinese on your computer or phone.

On my Mac it's under Settings -> Keyboard -> Input Sources. Add a new input source, select Chinese (Simplified), and take a look at the alternatives:

- Stroke, which uses only symbol characters and numbers to "write" words. Good luck with that!

- Trackpad handwriting. You "write" the characters on your trackpad and it recognizes them. Works pretty well if you already know how to write the characters. But it's much, much slower in my opinion. Plus, you need to memorize how to write thousands of characters and how the characters connect to how the words sound, which there is no systematic mapping for.

- Wubi Xing, which I'm unfamiliar with but which seems to be radical-based. Just look at the keyboard - https://en.wikipedia.org/wiki/Wubi_method

- Pinyin, which uses your regular English keyboard. With a little training, you can sound out Chinese words and immediately type them with autocorrect-like functionality that works amazingly well. I can type Chinese almost as fast as English this way.

In short, Pinyin has made it possible for Chinese people to operate efficiently in the modern world we live in.

Zhuyin is conspicuous by its absence - it maps perfectly well to modern keyboards, and requires slightly fewer letters to write a word ("en" and "an" in pinyin are one glyph, for example).

It's an invented 20th century alphabet, which means there's logic behind it - the left side of the keyboard are initial consonants, in the middle are the vowels, and at the end are vowels with trailing consonants on them like "ang" and "en". Even the consonants on the left have a logical order - the first column are sounds you make with your lips (b, p, m, f), then with the tip of your tongue (d, t, n, l), then the tongue goes further back in your throat (g, k, h).

It really is a well designed phonetic alphabet, tailor-made to Mandarin. And the IMEs for it let you filter your characters by typing a tone - none of the Pinyin ones I tried let me do that, which was my main reason for switching.

A funny thing about how the Chinese type on their phones is that not only do they use Pinyin, but they use the old-school numeric input method (2="abc", 3="def", etc., like SMS in the pre-smartphone era) to type it. Or at least that's what I saw most people do when I went to Beijing, maybe in other regions/demographics it's different.

I find it rather amusing that while in the West we use 30 keys to type our around 30 characters, in China they use 10 keys to type their thousands of characters.
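
To make the 10-key idea concrete, here is a minimal sketch. It assumes the standard phone keypad letter groups (2="abc" through 9="wxyz"); the function names and sample syllables are made up for illustration and are not taken from any real input method. It shows why such a keyboard has to guess: distinct pinyin syllables can collapse onto the same digits.

```python
# Standard phone keypad letter groups (the 2="abc", 3="def" scheme).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}


def to_digits(pinyin: str) -> str:
    """Return the digit keys pressed for a toneless pinyin syllable."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in pinyin.lower())


def collisions(syllables):
    """Group syllables by digit sequence and keep only the ambiguous groups."""
    groups = {}
    for s in syllables:
        groups.setdefault(to_digits(s), []).append(s)
    return {digits: group for digits, group in groups.items() if len(group) > 1}


print(collisions(["ni", "mi", "hao", "gao", "ma"]))
```

A real input method would then rank the candidate syllables (and the characters behind them) by frequency or context; that part is not sketched here.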

Also, it's probably not a bad idea for those of us with a tendency to fat-finger (like myself). I was quite a bit faster writing messages on pre-smartphone keyboards than on smartphone ones (although probably a large part of that is physical key feedback vs. virtual keys). I'd definitely give a 10-key option a try on my smartphone, but it isn't offered.

This is the main input for Japanese as well. I love it and wish an English version was offered.

Cellphone number pad input is very well-suited to Japanese: each key is a row of kana. When I type Japanese on my phone, I always use the number pad input these days.

T9 typing isn't as accurate as QWERTY, though. My wife was very happy to switch to the Google Pinyin keyboard, which supports a QWERTY layout, unlike most Chinese keyboards, which use T9. T9 has to guess much more from your input, so it's less accurate than QWERTY input. The thing is, most Chinese keyboards don't have this option, or people aren't aware of it. A QWERTY layout also helps keep the layout consistent when switching between English and Chinese.

Simple example using Pinyin: 你好吗 is "Hello, how are you?". In Pinyin you'd type: "ni hao ma?".

It sounds like this: https://translate.google.com/#auto/en/你好吗?

I can type fine with a Zhuyin keyboard... Shame you can only buy them in Hong Kong and Taiwan tho :(

Pertinent resources:

* Pinyin Info, http://www.pinyin.info/

* Language Log, http://languagelog.ldc.upenn.edu/nll/ , in particular Victor Mair's posts

* "The Chinese Language: Fact and Fantasy", John DeFrancis

* "Asia's Orthographic Dilemma", Wm. C. Hannas, extract available here http://www.pinyin.info/readings/orthographic.html

There was a short interview with him in Stephen Fry's "Planet word" series of documentaries. His humility was very inspiring.

pinyin is not nearly as good as zhuyin in regards to phonetic representation.

as one example, some words like pinyin's /zhun/ look like they should have one syllable, but there's actually two, as far as i can tell. or at the least, the schwa is definitely missing from the pinyin representation, as in, it's not /zhu@n/, where @ is the schwa.

pinyin is rife with oversights like this, that a learner has no idea about, and indeed instructional guides don't even mention much of the time.

if you're serious about learning chinese, don't use pinyin unless you want to spend extra time undoing the damage done by doing so.

Wouldn't this be offset by being familiar with another Latin-alphabet-based language? For example, in German, the letter "s" sounds like an English "z" and the letter "v" sounds like an English "f."

At that point, you've already disassociated the specific sound from its letter. How is Mandarin/Pinyin any different?

It is indeed a problem for any language that reuses another language's symbols and attempts to jam them into place to fit the phonology of its own.

When a language adopts the symbols of others of a nearby language relative, the fit can sometimes turn out ok, but of course you'll still have some bumps.

But the poor suit is especially apparent when, for example, you try to throw the Latin alphabet into a tonal Sino-Tibetan language.

I'd much rather learn 400 new unique phonetic symbols (or one set of IPA) than have to learn 26 alternate pronunciations whose sounds will compete with the pronunciations that fire off in my brain due to knowing the German, French, Spanish, English and Hungarian varieties of pronunciations of that same letter.

Most languages of the world are written primarily in Latin characters. The only language these characters were designed for is long dead.

The reason for switching to Latin characters has nothing to do with phonetics and everything to do with making the language more practical to use with computers. Keyboards with 400 keys don't really work.

Countless other languages with no relation to English have been using romanized alphabets for many years. Your argument that Chinese is any different doesn't make sense.

Most languages, maybe, but I'm not so sure about most of the population: India and China alone already have more than 37% of the world's population, and then there is most of the Muslim world. You are just reinforcing your Western view of the world. Statistically, our Latin writing systems are in the minority, but because of our wealth we feel entitled to think they are the easiest and fastest way to write.

For some reason I can't edit my old posts, so I'm making a new one.

But after further investigation I'd like to point out that in Zhuyin, just because there's two vowel-containing symbols, it doesn't necessarily mean there's two vowels. And the more I think about it, the more it does seem there's one syllable. And if /zhun/ does have two syllables, it would seem to be fairly unique.

I've asked a Chinese speaker about syllables before, and they didn't understand what I was trying to say. So if a Chinese speaker could chime in on this, it would be much appreciated, because I've been trying to get concise information on this for a while, and I probably shouldn't be telling the internets that it's two syllables if it's actually one.

This is complete and utter nonsense. First of all, plenty of native speakers of, say, Cantonese and other regional languages of Chinese, learn Mandarin using pinyin, and they do just fine.

Second, as a non-native speaker, the advantages of not having to stumble over zhuyin and its problems hugely outweigh any advantage it might be wrongly thought to have. And the benefits continue well past the learning stage.

The /zhun/ example isn't even true. That sound is sometimes broken down for students as multiple sounds, not syllables, for pedagogical purposes when the teachers are trying to get through to a student. That's the case with either pinyin or zhuyin. They explain it as multiple component sounds (not syllables) because the underlying whacky Chinese phonetic theory says it is made up of multiple sounds. (By the way this theory takes about 10 seconds to learn, so the fact that sounds can be broken down into smaller parts is not some mind-boggling obstacle.) If you were taught that it has two syllables, you may want to consider the possibility that your teacher is fuzzy on the meaning of the English word "syllable."

If teachers are using zhuyin, they also break the sounds down to justify or explain how zhuyin works, since zhuyin is so oddly counterintuitive it needs extra explanation to help ease the cognitive burden of using it (which could be the reason it was reformed and mostly dumped decades ago in mainland China to be replaced by pinyin).

Zhuyin does survive in Taiwan, where they keep it around for political reasons. Accepting a superior mainland system would be a linguistic defeat, and would damage their pride, so they cling to an archaic system. They don't want to admit this outright, so they come up with all sorts of reasons to keep it, and they end up not learning pinyin well themselves, so with that handicap they then try to convince naive foreign students that zhuyin is good.

Don't fall for it if you like, let's see... touch typing, reading signs that are written in pinyin, being able to write down the sounds of Characters you haven't learned yet and have them readable by the average Chinese person, and focusing your time and efforts in the right places.

A better argument on your side of the issue would be to say that pinyin is bad because the 'z' and 'h' and 'u' and 'n' are four separate symbols, but zhun is one syllable, so it's (you could say) confusing to the mind (it's not). But that's simply ridiculous. Users of phonetic languages, as all of us are, are perfectly accustomed to seeing multiple sounds and syllables for that matter represented by a number of symbols that does not necessarily correspond to the number of letters.

Mostly the great thing about pinyin is it just does its job transparently without being a distraction. It becomes invisible almost, because since the letters are familiar and the sounds are already readable right off the bat, I can focus on the language and learning words and sounds correctly, instead of having an extra layer in between me and the language.

Pinyin has served me well as a learner, and the results are hard to argue with. I didn't start learning until my 20s, and now pass as a native speaker over the phone. So... don't listen to that guy. Pinyin is great.

I have learned pinyin and zhuyin both, but found zhuyin to be the "superior" system. In pinyin the two vowel sounds ㄜ and ㄝ are written as "e", and both ㄨ(sometimes) and ㄩ are written as "u". Furthermore, typing is faster in zhuyin—the longest syllable takes 3 keystrokes instead of 5. And on top of that, learning zhuyin helped my phonics since it forced me away from using English sounds as a starting point. Another small benefit was that when reading children's books with zhuyin, I generally only looked at it when needed, whereas with pinyin, my eyes would be drawn to the familiar roman characters even when I didn't need them to read a character.

Despite my experience, I still believe claims that either pinyin or zhuyin is "superior" are pretty subjective. Taiwan has undeniably done better than the mainland in terms of literacy and computer literacy... but it's also had the benefit of a superior educational system and a more open, wealthier society. At least for natives, either system is fine. Most foreigners could probably gain something from learning both (and IPA too, for that matter).

> typing is faster in zhuyin—the longest syllable takes 3 keystrokes instead of 5 [as in pinyin]

The longest syllable in pinyin is 6 keystrokes: zhuang, chuang, etc
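
A quick sketch of the keystroke comparison above, assuming one key press per pinyin letter and one per zhuyin symbol, with the tone key and candidate selection left out. The sample table is a small hand-picked set, not a full syllable inventory.

```python
# Hand-picked sample syllables: pinyin spelling -> zhuyin spelling.
SAMPLES = {
    "zhuang": "ㄓㄨㄤ",
    "chuang": "ㄔㄨㄤ",
    "hao": "ㄏㄠ",
    "ni": "ㄋㄧ",
}


def keystrokes(samples):
    """Map each pinyin syllable to (pinyin key presses, zhuyin key presses)."""
    return {pinyin: (len(pinyin), len(zhuyin)) for pinyin, zhuyin in samples.items()}


for syllable, (p, z) in keystrokes(SAMPLES).items():
    print(f"{syllable}: pinyin {p}, zhuyin {z}")
```

On these samples pinyin tops out at 6 key presses and zhuyin at 3, which matches the correction above.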

Too bad shuangpin isn't used more extensively to type Chinese, as it maps easily to pinyin but only requires 2 keystrokes per character. The unused initial keystrokes map to the two-character initial pinyin sounds (i=ch, u=sh, v=zh), but the mapping from keystroke to medial/final sound is more complicated. If there were a way to change from pinyin input to shuangpin (or similar method) input incrementally so it was easier to learn, I think more people would convert.

> Don't fall for it if you like, let's see... touch typing, reading signs that are written in pinyin, being able to write down the sounds of Characters you haven't learned yet and have them readable by the average Chinese person, and focusing your time and efforts in the right places.

It is nice to hear you say this. I practice characters and stroke-order an hour a day. I also read for about an hour each day in some kind of reader or book. My other classmates are really surprised that I hand write every assignment instead of just using a digital input method and have it write the characters for me. But, I think this is why I read and understand a lot faster.

I am trying something new, since January 1. I am trying to learn 30 new characters a week. So far this is proving a bit hard. I need to know the pinyin, pronunciation, meaning, stroke-order and how I might use this character. Perhaps 30 is too many. I'll keep trying.

Not to discourage you, just to create realistic expectations: I'd be very surprised if you can learn and retain 1500 characters in a year with an hour a day, let alone 4500 characters (basically all you need) in 3 years with an hour a day, unless you're blessed with an excellent memory.

I suggest tempering your expectations, or budgeting more time.
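
The arithmetic behind those numbers, as a rough sketch: it assumes a steady 30 new characters every week with no breaks, and it ignores the time needed to review characters already learned, which is where most of the real cost would likely go.

```python
def weekly_plan(chars_per_week, hours_per_day, weeks_per_year=52):
    """Return (characters per year, minutes available per new character)."""
    per_year = chars_per_week * weeks_per_year
    minutes_per_char = hours_per_day * 7 * 60 / chars_per_week
    return per_year, minutes_per_char


per_year, minutes = weekly_plan(chars_per_week=30, hours_per_day=1)
print(per_year, "characters per year,", minutes, "minutes per new character")
```

So 30 a week is roughly the 1500 a year mentioned above, at about 14 minutes of first exposure per character if the whole hour goes to new material.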

sorry, I meant an hour a day just on writing characters for practice. I also spend 1 hour in class 3 times a week and time doing the homework and time trying to read short stories, looking up characters that I don't know.

My memory for language is pretty good, however like you said I may need more time yet still.

Thanks for pointing that out about the two syllables, I'm glad you cleared that up.

Really great comment, except for the tone of the first sentence. Please be civil.

Well in his defense, what I did say about the two syllables was utter nonsense, and I'm glad he pointed it out.

I had seen conflicting information about it before and heard speakers say it as if there were two syllables, so there was an ambiguity in my mind that was cleared up by his confidence.

"if you're serious about learning chinese, don't use pinyin unless you want to spend extra time undoing the damage done by doing so."

Please explain this. I have been learning Mandarin, can you tell me what damage has been done to me?

Can you show us a concrete example of pinyin vs zhuyin and why zhuyin is a better choice?

Let us start small: Dàjiā hǎo. Dàjiā hǎo ma? (大家好. 大家好吗?)

Pinyin lures second-language learners into a false sense of understanding what is actually being said.

Take the first letter, d, for example. That's not a d like you would understand it in other latin-letter-based languages. It's actually more like a t. And actually it's not even like a t, it's more like an unaspirated t.

More information is available here: https://en.wikipedia.org/wiki/Standard_Chinese_phonology

I'd like to see an IPA-based Chinese dictionary, for example, but I won't be holding my breath. IPA suffers from its own problems in that it seems like it's under constant revision, but damn if it's not the most useful tool in understanding the phonology of arbitrary languages.

>Take the first letter, d, for example. That's not a d like you would understand it in other latin-letter-based languages. It's actually more like a t. And actually it's not even like a t, it's more like an unaspirated t.

That issue is by no means unique to pinyin. "S" and "W" in German sound more like "Z" and "V" in English. Colloquial French has half a dozen ways of pronouncing "R", all of them utterly unfamiliar to most English speakers. If you're under the misapprehension that these symbols represent familiar phonemes, then you've simply been badly taught.

"Take the first letter, d, for example. That's not a d like you would understand it in other latin-letter-based languages. It's actually more like a t. And actually it's not even like a t, it's more like an unaspirated t."

I can understand this point, but how is it damaging me? I guess I view this as a "rule" of learning the language.

How would zhuyin have done this better?

Thanks for the link too.

Edit: Thinking about this more, other languages have characters that you pronounce differently than what is written. Spanish for example has a bit of this. As other tonal languages do too.

>I can understand this point, but how is it damaging me?

Were you pronouncing it as /d/ before or were you pronouncing it as an unaspirated /t/?

The problem is that with pinyin, people assume it's a /d/ sound, and their language learning books never tell them it's not a /d/. Conversely, when Chinese people then go to learn other languages, they see a d and say something like unaspirated /t/, and then they wonder why they can never sound like the natives when they try to say d.

If you only relied on pinyin, you'd just always sound wrong and no one would tell you, and you'd never figure it out that it was the faulty phonetic system that hypnotized you into saying something almost the complete wrong way.

Then you get stereotypes like "all Chinese people sound like impression", when they don't have to, but they have these misleading learning tools that are making them sound that way.

edit as a reply (thread reply limit reached):

Here's a concrete example of a failure of pinyin:

Consider that 也 is /yě/, and 得 is /de/.

Except actually, the two e's are not the same sound.

In IPA, the vowels are e and ɤ, respectively.

Zhuyin also correctly makes the vowel distinction with ㄜ and ㄝ.

Pinyin as written does not show a difference between these two distinct sounds, whereas zhuyin does.

Also, take a look at the example for /zhun/ that I give above for pinyin neglecting to show the addition of an entire syllable of a schwa.

You are making a lot of assumptions here about how people learn. When I started learning, pinyin foundation was the absolute first thing. Before anything else. Yes, I was told /d/ is an unaspirated /t/, but I also read it in my text book.

Also, you say people will never tell me I am wrong. I disagree here. People correct my tones all the time. They know I am learning. Even people I try and talk to at a supermarket are pleasantly surprised I talked to them in their language and they offer a correction.

Also, you keep saying zhuyin is better, but you haven't given a single example as to why. You just keep saying pinyin isn't good enough and gives false hope.

I'm not trying to be a jerk, but you made an incomplete argument saying the method I am learning is damaging to my learning and all you state is not knowing the proper pronunciation of characters by simply looking at them. You learn a language, with its "rules".

Certainly it's important to realize that the English text is merely an approximation of the actual pronunciation, but this isn't really that different than any other language that uses the western alphabet. The same thing applies in French, Spanish, German, Italian, etc. The student will quickly learn that even though the letters look the same, each language has its own idiosyncratic way of vocalizing them. Any textbook or teacher that does not emphasize this is not a very good one.

> Consider that 也 is /yě/, and 得 is /de/.

There is no distinction between the two "e"s in writing because the latter follows a different rule in the pinyin system from the first one:


In English you also have similar situations where the same composition gives you different pronunciations:

baked, raped vs gated, naked.

This goes along with what I have been saying. The OP's argument would suggest that anyone just reading outright and not taking the time to learn would pronounce the /e/ all the same. However, learning the "rules" is required to pronounce it properly. This isn't a bad or damaging thing.

Would you say that Pinyin run through ROT-13 is just as easy as Hanyu Pinyin, because it's still 26 characters and you just have to memorise the rules? I found the ambiguities of Hanyu Pinyin really annoying when learning Chinese. Probably still more time-efficient than learning Bopomofo, but Pinyin is definitely a "least worst" choice for me.
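
For the curious, the thought experiment is easy to run with Python's built-in `rot_13` codec. The helper name below is made up for illustration; tone marks are left off because ROT-13 only shifts the basic Latin letters.

```python
import codecs


def rot13_pinyin(text: str) -> str:
    """Shift every basic Latin letter 13 places; everything else is unchanged."""
    return codecs.encode(text, "rot_13")


scrambled = rot13_pinyin("ni hao ma")
print(scrambled)                # av unb zn
print(rot13_pinyin(scrambled))  # ni hao ma
```

The result is still 26 letters with a fixed rule behind each one, but nothing about "av unb zn" hints at the sound to someone who already reads a Latin-alphabet language.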

"Consider that 也 is /yě/, and 得 is /de/. Except actually, the two e's are not the same sound."

I know this. And I see where you are coming from. But again, you are too focused on someone just reading text outright, without learning the "rules" of the language. This is where the problem is. Not taking the time to understand the language you are learning.

Your example is just another "rule" that is found in any textbook. I have probably 30 Chinese language books on my shelf and I can find this and all the "rules" about pronouncing a /d/ as an unaspirated /t/ in almost all of them.

So, it seems that learning pinyin is just fine if a person "learns" properly and just doesn't gloss over the foundation.

I appreciate your explanations, BTW.

You haven't learned pinyin properly. Anyone who learns the utmost basics of pinyin and the language will not stumble on any of this stuff. You sound like you are relying on the phonetics too much. You just need to get the phonetic rules, and then whatever system you are using, you will know the sound in your head and match it to what is on the page. ye and de are super easy to distinguish as two things that have two different rules (read about initials and finals, if you need more info, here for example: http://www.yellowbridge.com/chinese/pinyin-rules.php)

I gotta agree with what you're generally saying. I'm not as familiar with phonetics, but I've found in learning Chinese that one of the most useful things to keep in mind when reading pinyin is to not take the letters literally and to focus on how native people are actually saying things. Like in the examples you give, the pinyin can lure you into bad habits.

>Also, take a look at the example for /zhun/ that I give above for pinyin neglecting to show the addition of an entire syllable of a schwa.

Can't edit anymore, but I just wanted to update this to reflect that there isn't actually a second syllable here.


Please don't insult other users on HN like this.

To be fair, I enjoyed the feedback. I see how this could often snowball, though.

Perhaps my "make a less-than-exhaustively-researched statement then wait and see if someone corrects you" method of testing validity of statements isn't very harmonious. I think this was my bad, here.

What is this "zhun" that's actually two syllables? How do you write it in hanzi? (I'm just a beginner trying to understand.)

pinyin's /zhǔn/ as in 準 in zhuyin is ㄓㄨㄣ. in IPA, it sounds like /ʈʂuən/, and no it's not a diphthong here as far as i know. If you just read the pinyin, you wouldn't know the schwa was there at all.

You sound like you know a lot about this. But educate me: how is ㄓㄨㄣ supposed to better describe 準 as two syllables? Both look like three syllables to me, with zh/u/n or ㄓ/ㄨ/ㄣ.

In fact why is that word even two syllables? From what I understand, if two vowels are together (in English), it counts as one syllable only.

I'm curious about that as well - I have always thought "zhun" is one syllable.

There can be confusions though - the standard way of writing pinyin groups whole words rather than individual characters, so something like "xian" may represent either "西安" (two syllables) or "先" (one syllable).

It's one syllable. The OP in this case is actually being misguided by zhuyin, and would be better served by pinyin. With his poor understanding of the rules of how sounds in Chinese combine in real speech, it looks to him like two syllables, but it's not, because sounds combine in practice.

Or... maybe he's thinking of Chinese opera. Yeah sure when listening to Chinese opera, all kinds of sounds get lengthened into many syllables ;-).

But really I can imagine how this happened. Couple of IPA wonks sit down with a Chinese native speaker, maybe a linguist, and they say, wtf, explain to us what is the difference between 'jun' and 'zhun'? The Chinese person does a hyper-exaggerated slow demonstration of the sounds, and they write that down as the official IPA transliteration that the OP is fixated on, further cementing a misunderstanding of real usage that was implied by the zhuyin transliteration.

As to Xi'an, pinyin has a rule for that kind of case, which is to add an apostrophe in between the parts to resolve the ambiguity.
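
A small sketch of why that apostrophe rule is needed. The syllable set here is a tiny hand-picked subset, just enough for this example, and the function names are made up. Without a rule, "xian" can be cut into valid syllables in two ways; with the apostrophe there is only one reading, and by the same rule the spelling without an apostrophe is read as the single syllable.

```python
# Tiny illustrative subset of valid pinyin syllables (not the full inventory).
SYLLABLES = {"xi", "an", "xian"}


def segmentations(text, syllables=SYLLABLES):
    """Return every way to cut text into a sequence of valid syllables."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in syllables:
            for rest in segmentations(text[end:], syllables):
                results.append([head] + rest)
    return results


def candidate_readings(word, syllables=SYLLABLES):
    """An apostrophe forces a syllable boundary, so segment each part on its own."""
    readings = [[]]
    for part in word.split("'"):
        readings = [r + s for r in readings for s in segmentations(part, syllables)]
    return readings


print(candidate_readings("xian"))   # [['xi', 'an'], ['xian']]
print(candidate_readings("xi'an"))  # [['xi', 'an']]
```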

You know, it could be one syllable. This is a topic that I've tried to search for specific answers for before but I couldn't find definitive information about it.

When I listen to recordings and to people speaking it, it sounds like two, which the Zhuyin seems to support.

Uhm... Not sure about that. I have spoken Chinese every day for the past 20 years and it sounds like one syllable to me.

because ㄣ is /ən/, whereas ㄋ is just /n/.
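
This lines up with a spelling convention in Hanyu Pinyin: after an initial, the finals "iou", "uei" and "uen" are written "iu", "ui" and "un". A minimal sketch that undoes the abbreviation is below; the function name is made up, and the one special case handled is that after j, q and x the written "u" stands for "ü", so "jun" is not an abbreviated "uen".

```python
# Abbreviated spelling -> full final, applied only after an initial consonant.
ABBREVIATED_FINALS = {"iu": "iou", "ui": "uei", "un": "uen"}

# Two-letter initials come first so "zh" is matched before "z", and so on.
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s"]


def expand_final(syllable: str) -> str:
    """Spell out the abbreviated final of a toneless pinyin syllable, if any."""
    for initial in INITIALS:
        if syllable.startswith(initial):
            final = syllable[len(initial):]
            if initial in ("j", "q", "x") and final == "un":
                return syllable  # here "u" means "ü", so nothing was abbreviated
            return initial + ABBREVIATED_FINALS.get(final, final)
    return syllable  # no initial (for example "wen"): already written in full


for s in ["zhun", "dui", "liu", "jun", "wen"]:
    print(s, "->", expand_final(s))
```

So "zhun" expands to "zhuen", which is the same "en" that zhuyin writes as ㄣ, while "jun" stays as it is.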


edit as reply:

>That's another rule. But in return you get 26 alphabets which majority of people on planet recognizes, instead of zhuyin which is only used in Taiwan and some Taiwan-related schools overseas.

I would argue the opposite; you don't actually get double bonus understanding of reusing language A's writing system on language B's, you get double misunderstanding. Aspects of each overlap on each other and the user confuses one for the other when they are in fact different and can't be dropped in as accurate substitutes for the other.

When they are dropped in as substitutes, unintelligible pronunciations are made. In the case of dropping a Latin writing system into a Sino-Tibetan writing system, this occurs even more frequently than when Indo-European languages are dropped in for each other.

If you want the right tool for the job, two different systems might be more appropriate.

>zhuyin which is only used in Taiwan and some Taiwan-related schools overseas.

I'm a rare example, but I didn't learn Zhuyin in Taiwan, nor did I learn Chinese in school. I reviewed the phonetic representations available and decided to use it because it happened to be the best supported by Pleco. If Pleco had IPA support, I would use that instead, because it's an even better tool for the job.

Reply to your second part:

That's a shock to me. Billions of Chinese people and millions of people learning Chinese are doing just fine. If anything, pinyin makes the first few lessons easier.

Of course I am not suggesting that pinyin can be used to replace characters, but it is good for learning pronunciations and useful in typing.

Japanese is different in the sense that each character already carries its own pronunciation, so the use of romaji is discouraged. But for Chinese, the characters rarely reveal their pronunciation, so you need a good way of describing it.

As for Pleco, ironically I see pinyin being featured more prominently, as the only pronunciation in the list view: https://www.pleco.com/

>Japanese is different in the sense that each character already carries its own pronunciation, so the use of romaji is discouraged.

I may be wrong but isn't that only the case with hiragana and katakana? Kanji in Japanese can have multiple readings with multiple pronunciations - which you just kind of have to know.

for that you have Furigana, which is essentially kana for kanji:


Which you will only find in children's books or on names.

"n" in pinyin is pronounced as /ən/ when paired with "u", which shares the same pronunciation as "en"(恩):

zhun, tun, gun, lun...

That's another rule. But in return you get 26 letters which the majority of people on the planet recognize, instead of zhuyin which is only used in Taiwan and some Taiwan-related schools overseas.
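As a rough sketch of that spelling rule: the pinyin scheme writes the final "uen" as "un" after an initial, which is why zhun, tun, gun and lun carry the /ən/ sound. After j, q, x and y the written "u" stands for "ü" instead, which is where 'jun' and 'zhun' part ways. The helper name `expand_un` is hypothetical, and the sketch only handles toneless syllables ending in "un".

```python
def expand_un(syllable):
    """Spell out the abbreviated pinyin final 'un'.

    After j, q, x, y the written 'u' is really 'ü' (jun -> jün).
    After any other initial, 'un' abbreviates 'uen' (zhun -> zhuen).
    Syllables that do not end in 'un' are returned unchanged.
    """
    if not syllable.endswith("un"):
        return syllable
    initial = syllable[:-2]
    if initial in ("j", "q", "x", "y"):
        return initial + "ün"
    return initial + "uen"


for s in ["zhun", "tun", "gun", "lun", "jun"]:
    print(s, "->", expand_un(s))
```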

This answers my question (you replied to it in the original place too). Thank you. I get the point you're making. And I can see how it's a bad system. Now that I think about it, I remember a similar problem with _ei? Sometimes it's pronounced <schwa>i and sometimes <a as in day>i. Yes, I also completely agree that Pinyin seems like a shitty and very non-uniform system given what it set out to do and what it already had to take inspiration from.

Having said that, I still don't get the point you make about tone sandhi and such. How do you propose it be incorporated?

>Having said that, I still don't get the point you make about tone sandhi and such. How do you propose it be incorporated?

I haven't given it much thought, but I would just write the tone with the unique combination of the phonology and the pitch level function that it actually has. Sheet music does this.

A difficulty in this approach would be defining a standardized tone sandhi, and the learning curve required for users to understand and write it.

And just to update, there's not actually two syllables in there.

Sorry can you please elaborate? When I see 我, Wǒ. What do you propose I should do?

我 /wǒ/ in isolation in pinyin isn't so bad, but problems arise when you actually use it with another word, or in other words, almost always.

What I'm about to mention here isn't unique to pinyin, and seems to be one fault you'll find in most (if not all?) transcriptions of tonal languages: a major problem is that they don't represent tone sandhi well if at all.

For example, when you say 我家, you don't say /wǒjiā/. It's more like /wo(short flat low tone)jiā/ than the /wo(medium-low falling-rising tone)jiā/ that wǒjiā would seem to indicate.

I'm a beginner to Chinese so maybe I'm misled, but I find it quite silly that all books and tutorials describe the third tone as "falling-rising". This is something that left me scratching my head at the beginning of my learning because the descriptions in courses just didn't match what I was hearing.

From what I hear, it seems to just be a short low tone the majority of the time, as you describe in 我家. Then it becomes a second tone before a third tone, as books say. But I hardly ever hear it as a falling-rising tone. Practically only when it's pronounced in isolation (e.g. when one is counting). So I wonder why the least frequent case is presented as the rule and the most frequent case is often not presented at all... maybe tradition from old pronunciations of Chinese?
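The pattern described in the last few comments can be sketched in code. This is a simplified illustration, not a full model of Mandarin tone sandhi: it only looks at the next syllable's written tone, and the function name `third_tone_surface` and the label "3-low" are invented here. Longer runs of third tones depend on phrasing and are not handled properly.

```python
def third_tone_surface(tones):
    """Map written tone numbers (1-4) to a rough surface realisation.

    Rules sketched, for the third tone only:
      * 3 before another 3  -> pronounced like tone 2
      * 3 before any other  -> short low tone ("3-low")
      * 3 at the very end   -> full falling-rising tone ("3")
    Other tones are passed through unchanged.
    """
    surface = []
    for index, tone in enumerate(tones):
        following = tones[index + 1] if index + 1 < len(tones) else None
        if tone != 3:
            surface.append(str(tone))
        elif following == 3:
            surface.append("2")
        elif following is None:
            surface.append("3")
        else:
            surface.append("3-low")
    return surface


print(third_tone_surface([3, 1]))  # 我家 wǒ jiā
print(third_tone_surface([3, 3]))  # 你好 nǐ hǎo
```

A transcription that wrote the surface value instead of the dictionary tone would look more like the second list than like the tone marks pinyin prints.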

I'm against Pinyin for the same reason I'm against using Romaji for learning Japanese. Pinyin and Romaji use the Latin alphabet to represent sounds in a foreign language, and if you are a native English speaker like most of us on HN, that's really bad because your brain will try to associate your knowledge of English with the Pinyin you are reading. If you use Zhuyin for Chinese, or the Kana system for Japanese, you are telling your brain to start learning the phonetic system from scratch, which makes learning Chinese or Japanese a lot easier, phonetically at least.

Except that if you use a western keyboard you still need to learn Pinyin or Romaji in order to be able to type the language. In my own attempts to learn a little bit of Japanese I've noticed that it is indeed difficult to associate the 'new' characters to the correct sound without using Romaji as an intermediate representation but my hope is that the more I learn and interact with the language the less this is the case. I think that, especially when using online learning tools, learning Romaji is inevitable and don't think that learning the phonetic system 'from scratch' makes it easier (if it is even possible to completely forget about your own phonetic/alphabetic system).

In short, as a starting learner of Japanese I acknowledge that the problem you describe exists but think that 'really bad' is an overstatement and it is almost inevitable that your brain associates your old knowledge of English (or other languages) to the new language.

>"Except that if you use a western keyboard you still need to learn Pinyin or Romaji in order to be able to type the language."

No you don't. I type in zhuyin every day. The sounds line up in order of the alphabet, starting with ㄅㄆㄇㄈ, in the left column. It's actually easier than touch typing English.

As someone who only knows English and is learning Mandarin / Traditional Chinese, I agree that Zhuyin is easier than Pinyin.

So you have a zhuyin keyboard and are suggesting everyone buy one in order to type Chinese?

No. That's wrong on both counts. Please re-read my comment.

I read your comment 5 times and still do not understand it.

The standard zhuyin input is laid out on the keyboard in order. By in order, I mean it is in the same order as the sounds have been taught for many generations and still are today (https://www.youtube.com/watch?v=YlNGJzYJQDI).

It would be like if English speakers didn't use "qwerty" keyboards but instead used "abcde" keyboards. Touch typing would be far easier to learn. My computer doesn't have zhuyin printed on the keyboard, since I bought it in California, but I type zhuyin on it every day. Even before I could touch type, it was easy because it was in order.
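To make "in order" concrete, here is a small sketch of how the 21 zhuyin initials fall onto a QWERTY keyboard in the standard (Dachen) layout, walking down each key column from left to right. Only the initials are shown, the medials, finals and tone keys are left out, and the variable names are made up for this example.

```python
# Key columns on a QWERTY keyboard, read top to bottom, left to right.
# Some columns skip the number row because that key holds a tone mark.
KEY_COLUMNS = ["1qaz", "2wsx", "edc", "rfv", "5tgb", "yhn"]

# The zhuyin initials in their traditional teaching order.
ZHUYIN_INITIALS = "ㄅㄆㄇㄈㄉㄊㄋㄌㄍㄎㄏㄐㄑㄒㄓㄔㄕㄖㄗㄘㄙ"

# Pair each key with the next symbol in teaching order.
key_to_zhuyin = dict(zip("".join(KEY_COLUMNS), ZHUYIN_INITIALS))

for column in KEY_COLUMNS:
    print(column, "->", "".join(key_to_zhuyin[key] for key in column))
```

The first column comes out as ㄅㄆㄇㄈ, matching the comment above about the left column.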

I learned Chinese in Taiwan with Zhuyin, as a child. The first thing the teachers got was instruction in Zhuyin, and the way our books and teachers taught Zhuyin was by giving latin characters and English words with corresponding letters. Just looking at Zhuyin did not tell my English-trained brain I was learning new sounds from scratch.

In college I was taught Pinyin and we were given instruction on the correct pronunciation of each phonetic sound, positioning of the tongue etc., and the teacher spent time drilling us on that.

What matters is good instruction. Using Romanization doesn't encourage more correct pronunciation. Roman and Cyrillic alphabets are used for all manner of languages with completely different phonologies.

Honestly, once you know either pinyin or zhuyin, learning the other is trivial. It took under a day for me to get it 95% down and then with some regular usage it was done. But one effect was I found I had a much easier time keeping sounds straight than my fellow learners who often confused sounds that had less straight-forward mappings in the system they learned.

Learning IPA on top is an extreme but still useful third point of attack and again, it's trivial compared to the difficulty of actually learning to hear the sounds clearly. It must have been a year before I realized how different the d in pinyin and the d in English were.

Are you opposed to using it as a learning device to help bootstrap? Or are you referring more to its long-term, continued use? Or its use at all?

I'm opposed to using it for beginners who aren't familiar with the phonetic system of the language. Once you have learned the phonetic system properly, and you know the characters in Romaji and Pinyin don't represent the sounds you know from English, of course it's then helpful for learning how to type Japanese or Chinese.

Don't get me wrong, Romaji has useful purposes, such as representing place names in Japan for foreigners, etc. But as a learning device for learning Japanese, it's not a very good one.

I made a visual explanation/exploration tool about pinyin: http://enjalot.github.io/pinyin/

Pinyin really helped me learn Chinese. RIP

Very cool!

Pinyin not only helped me learn Chinese, but also helped me learn English pronunciation later in life. It's not a one-to-one match but close enough.

Bopomofo predates Pinyin and did what Pinyin did, but with a non-Latin alphabet. Looking back, it was always a surprise that Mao adopted a "Western" alphabet for Pinyin when he was very much against Western ideology.

It was an excellent call, especially when writing is done digitally nowadays.

Actually - the legend goes - that Mao wanted to ditch Chinese characters in favour of Pinyin, until Stalin said every great civilisation needs its own writing system.

Chinese writing was seen as a factor in holding back the country - everything was up for change or removal.

I'm currently learning Zhuyin.

I learned Chinese using Pinyin. It's the first thing you learn, and it was a total epiphany. RIP.

For anyone interested in the topic, a couple good books are:

* Chinese: A Linguistic Introduction by Chaofen Sun, and

* The Chinese Language: Fact and Fantasy by John DeFrancis.

Interestingly, Chinese people from mainland China don't know much about him. I even doubt his death will make any headline in the mainstream press. In China, politics (esp. agreeing with the Communist party) is more important than achievements. I hope, though, that in a few decades China will find a way to recognize him.


I was taught pinyin at Chinese university - too bad the Chinese textbooks didn't include the genesis of pinyin - interesting, R.I.P.

I am taking Mandarin classes and our textbook has both pinyin and characters for almost everything. There is always a section in each chapter that is entirely in characters.

I have been buying books on Amazon to help accelerate my reading. I need a story to follow along with. These books indeed do not have any pinyin.

Pinyin is fascinating to think about. Sometimes words are created from pinyin when you say them. Example: New York is Niǔyuē (纽约). Another is how in English we say "ha, ha". In pinyin it has been adopted exactly that way: Hāhā (哈哈).

Interesting facts about pinyin and Simplified Mandarin -- Mao and his cadres were interested in spreading the gospel of Communism, but with high illiteracy rates, people were generally not as susceptible to propaganda.

Both initiatives, pinyin and Simplification, were designed to more easily bring literacy rates up, for the purpose of spreading Communist teachings.

So you invent an essentially new language, make it accessible via Roman letters to even non-Chinese people, and all documentation and literature produced in that new language talks about the glory of the CCP. Pretty brilliant (if sinister).

Silly revisionism. You think Mao engaged in a national literacy program for the sole purpose of inculcating people with Communist propaganda? I don't think so. That's equivalent to saying Hitler built railroads solely so people could get to Nazi party gatherings and transport Jews to camps.

It's possible for Mao to simultaneously be a disastrous autocrat who committed atrocities and to have engaged in projects of national and economic development. Some of them were debacles. Others, like pinyin for Romanization and simplification of writing, were complete successes.

Do you know anything about Chinese? What "new language" are you talking about? Simplified characters and pinyin are not new languages.

The persistence of traditional characters in Hong Kong/Taiwan or diaspora doesn't say anything about the success of simplified characters, which brought literacy to over a billion in the Chinese mainland.

I have some personal experience in this matter; I do not have a natural aptitude for foreign language learning. I learned oral Mandarin as a second language as a child. I was taught reading and writing Chinese with traditional characters and with Zhuyin/bopomofo (still used in Taiwan, at least until recently), not Romanization.

I studied Chinese in college, learning introductory written Chinese again with simplified characters and pinyin. My personal experience as a native English speaker was that with pinyin and simplified characters it was substantially easier to get to a functional level of literacy.

It's not really rocket science. Just from a human memory perspective, simplified characters have many fewer strokes and radicals, and many homonymous characters are merged so there are fewer of them. This also makes keyboard entry somewhat faster and makes smaller characters legible on displays (or even in print).

I'm not arguing that simplified characters necessarily reduce the effort in achieving a very high level of literacy, but for basic literacy, simplified writing/spelling systems are effective.

Agreed that Pinyin is a great help for the learner.

However, have to disagree on the character simplification.

While it would sure make sense to make characters simpler, simplification really only made some characters faster to write (by hand), by reducing the number of strokes. It is very unclear whether it makes things easier to read (in print), or easier to write/remember (when you use a keyboard to type it).

On the other hand, traditional characters retain more of the etymological components which are helpful for creating mnemonic devices to more easily memorize the characters.

Would you be willing to say how long you have been learning Chinese? Just curious. I am taking classes and a lot of outside focus/study. Native English speaker too.

Do you have any evidence for this at all? I'm by no means a Maoist, but this is a McCarthyist "sinister Commie plot" level of conspiracy theory.

Of course propaganda is one of the motivations behind higher literacy. But I do not think it was the main motivation. I will grant that perhaps my cynicism for your idea stems from the fact that I am a Communist.

Pretty much no material is produced entirely or even mostly in pinyin except material intended for children, learners, or Chinese diaspora who can speak but not read the language.

Regardless of the material you want people to read I think it's hard to call increasing literacy rates "sinister."

> So you invent an essentially new language

It’s still the same language, just new ways to write it down. It’s no different to writing Russian or Greek using the Latin alphabet.
