
If English were written like Chinese (1999) - rogerbraun
http://www.zompist.com/yingzi/yingzi.htm
======
allemagne
The author really knocked it out of the park with this analogy. This explained
things to me that I might have learned through experience but had never been
explicitly told before. He even took the time at the end to point out where
his own analogy breaks down. Zhongwen.com is also a fantastic site worth a
look for students of Chinese. Overall extremely impressive, creative, and
interesting.

------
wodenokoto
This article is actually _a lot_ better at explaining the character system
used in both China and Japan than the articles that have hit the front page
today and in the last week or so.

~~~
visarga
It's a fucked up writing system that is pretty beautiful as well.
Unfortunately it has limited the number of syllables (combinations of sounds)
in the language. Why? Because they have to make do with just 2000 phonetic
sounds (335 in modern Japanese) while English letters can combine in many more
ways and form a much more diverse array of sounds.

As a result Japanese people (probably Chinese too) have a hard time processing
some of the English syllables. This creates difficulty in learning English for
Japanese and Chinese, and in turn, it leads to isolation. It's like a
"cultural moat" separating ideographic languages from phonetic ones. If they
tried to give it up, it would be sad, because of all the history and beauty it
has, but on the other hand, in these countries children in the 5th grade can't
properly read a newspaper because they need to already know 2000 characters
and many more combinations of characters by rote. By comparison, a first
grader can read a newspaper in English.

In terms of programming, Chinese is like Perl (even Perl6! - huge, complex,
mysterious, beautiful) and English is like Lua (small, elegant, expressive).

"There is a standing joke among sinologists that one of the first signs of
senility in a China scholar is the compulsion to come up with a new
romanization method"[1]. That's how much they are bothered by the huge initial
cost of learning it.

[1] "Why Chinese Is So Damn Hard" -
[http://www.pinyin.info/readings/texts/moser.html](http://www.pinyin.info/readings/texts/moser.html)

~~~
muddyrivers
"As a result Japanese people (probably Chinese too) have a hard time
processing some of the English syllables. This creates difficulty in learning
English for Japanese and Chinese."

We can also say that Americans and English people have a hard time processing
some of the Japanese/Chinese syllables. All the languages I speak a little bit
have distinctive sounds that are difficult for most non-native speakers to
master unless they put in a lot of practice. It escalates to another level to speak a
whole sentence, that is, a sequence of sounds. That is why it is easy to tell
if a person is a native speaker of a language in most cases.

It is true that Indo-European languages are difficult for Japanese and Chinese
to learn. It is due to other reasons, which would be a long answer.

~~~
visarga
What I am saying is that in English, after learning 26 letters, you can read
any combination of them. If you try to see how many possible syllables there
are, there would be many thousands. Combinations like "eng" "lish" can't be
properly put in Japanese phonetic characters because you can't put "eng"
directly, you have to use "en gu ri shu", inserting vowels all over the place.
Japanese can't properly learn English because their brains are trained with
far fewer possible syllable sounds and they can't make the jump.

But in turn why is it that there are so few syllable sounds in Japanese? It's
because each syllable has to have its own ideograph (usually there are
multiple ideographs) and you can't easily learn 10000 of them. Being limited
to the far fewer ideographs you can memorize, you are also limited to fewer
sounds you can use. On the other hand, with English, after 26 letters you can
read any combination of them, and they form many more possible sounds. So you
learn to make all those sounds as a kid. Even if you come from another
phonetic language you will probably have a similarly diverse sound vocabulary
which will help you transition to English.

So, memory limitations -> fewer characters known than possible English
syllables -> a kind of mental straitjacket in producing other sounds that
fall outside the allowed ones in their native language -> difficulty in
learning English -> cultural isolation. Sad story.

Just look up Japanese people who have learned English from grade 1 to 12, and
see how great their pronunciation is. It's so damn hard for them to make the
sounds.

~~~
justinpombrio
> But in turn why is it that there are so few syllable sounds in Japanese?
> It's because each syllable has to have its own ideograph (usually there are
> multiple ideographs) and you can't easily learn 10000 of them.

Please don't make up facts just because they seem plausible to you. A quick
internet search reveals that the Japanese language predates any writing system
for it:

"Before the 4th century AD, the Japanese had no writing system of their own.
During the 5th century they began to import and adapt the Chinese script,
along with many other aspects of Chinese culture, probably via Korea. However
the Japanese were aware of Chinese writing from about the 1st century AD from
the characters that appeared on imported Chinese goods."[1]

Thus Japanese had settled on having few syllable sounds _before_ there was any
way to write them down, so their syllables _can't_ have been limited by their
script (at least initially).

> Just look up Japanese people who have learned English from grade 1 to 12,
> and see how great their pronunciation is. It's so damn hard for them to make
> the sounds.

You should see English speakers try to pronounce Chinese...

[1]
[http://www.omniglot.com/writing/japanese.htm](http://www.omniglot.com/writing/japanese.htm)

~~~
philovivero
I have a couple things to add to this thread, not everything directly related
to your comment.

1) Knowing the 26 letters is not remotely enough to be able to read the
newspaper in English (earlier commenter made this assertion). Example: "She
caught a cough. Such is life and death, dear!" How does knowing the letter "u"
help you pronounce half those words? What about "g"? What about "h"? What
about even if you put "gh" together? What about "e", "a", and if you shove
"ea" together? In this single paragraph I've ballooned the number of letters
and letter combinations needed into the 40's, and I've barely just begun.
Don't get me started with borrowed words from French and Spanish (c'est la
vie!).

2) Your comment: "You should see English speakers try to pronounce Chinese" -
From direct first-hand experience, this is tough for a completely different
reason than this comment thread is talking about. You've gone off on a tangent
here. The reason Chinese is hard for English speakers is that in English,
tonality suggests overall sentence semantics and does not directly affect any
single word, whereas tonality affects every single word in Chinese. I thought we were
talking about how syllable pronunciations were expressed by the writing
system.

------
WalterBright
The increasing use of icons and emoji suggests that English will become like
Chinese!

(Ever try to look up an icon in a dictionary? This puts paid to the idea that
icons are decipherable by people who don't know the language. Copyrighting the
icons makes even that infinitely worse, as it prevents standardization.)

~~~
Kirth
I'm not sure where you've seen this "increasing use of icons and emoji". When
I encounter people (this is pretty much limited to teenagers) using emoji, they
have very little communicative value.

In the modern era the use of pictographs has become Chinese's Achilles' heel:
the hanzi are not sortable. The very things that define Chinese are what
make it stupidly difficult to get computers to grok the language.

~~~
United857
They usually sort by number of strokes.

~~~
pacaro
There are many ways to sort. Typically sorted first by radical, then by number
of strokes.

There's more than one ordering of the radicals, and choices to be made when
characters have the same radical and the same number of strokes

Paper dictionaries are very thumbable though, the current radical is usually
highlighted in a way that makes it easy to flick through and find what you
need
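
A rough sketch of the radical-then-strokes ordering described above, assuming a tiny hand-made lookup table (`HANZI_INFO` and both function names are made up for illustration; a real tool would pull radical and stroke data from a full dictionary). The code-point tie-break is just one of the possible "choices to be made", not a standard.

```python
# Illustrative data only: (Kangxi radical number, total stroke count) for a
# handful of characters. A real tool would load these from a full dictionary.
HANZI_INFO = {
    "日": (72, 4),
    "明": (72, 8),
    "木": (75, 4),
    "林": (75, 8),
    "江": (85, 6),
    "河": (85, 8),
    "海": (85, 10),
}


def sort_key(char):
    """Order by radical first, then stroke count, then code point as a tie-break."""
    radical, strokes = HANZI_INFO[char]
    return (radical, strokes, ord(char))


def sort_hanzi(chars):
    """Return the characters as a list in radical-then-stroke order."""
    return sorted(chars, key=sort_key)


print("".join(sort_hanzi("海明江木河日林")))  # 日明木林江河海
```

Swapping the first two fields of the key would give the pure stroke-count ordering mentioned upthread.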

------
swort
The letter 'M' in the Roman alphabet is derived from the symbol for running
water in Phoenician, and the letter 'A' is a rotated cow's head.

Phoenician is partly derived from Egyptian hieroglyphics. I love this stuff.

[https://en.m.wikipedia.org/wiki/Phoenician_alphabet](https://en.m.wikipedia.org/wiki/Phoenician_alphabet)

------
yomly
This is a fantastic guide / insight for how Chinese characters work, and does
a great deal to dispel the myth that Chinese characters are virtually random -
that each individual character requires independent memorization.

As someone who has studied both Chinese and Japanese, this article read very
fluidly. Curious how other readers have found this?

~~~
int_19h
As someone with practically zero understanding of the Chinese writing system
beyond the basic notion of what an ideogram is, it was very helpful in aiding
understanding (assuming it is accurate).

------
gerbilly
People might be interested to know that Egyptian hieroglyphics, despite
looking even more like pictograms, are actually an alphabet.

Another cool fact: in hieroglyphics there is more than one way to write a
word because, unlike in most alphabetic systems, some characters are
multisyllabic and can represent two or more syllables.

~~~
int_19h
Wasn't it rather a syllabary?

~~~
gerbilly
I've never heard it described that way.

And like most Semitic languages, hieroglyphics mostly record the consonantal
skeleton of a word.

There are additional symbols (determinatives) to disambiguate homonyms.

------
labster
[work] + [fight] + [sun] is poor Huffmanization of the common English suffix
_-tion_. That's what, 14 strokes when English just takes 7?

Well, maybe that was the point, though -- that the priorities of the language
may change over time, and eventually you're dealing with 2000 year old Cockney
rhyming slang baked into your written language. But that's just the opinion of
one guy sitting on his Vannevar.

~~~
gfaure
That is a part of the analogy which definitely applies -- Chinese script (even
Simplified) often completely ignores Huffman coding-like principles. Otherwise
we wouldn't have words like 繼續 "continue" (41 strokes).
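
For anyone unfamiliar with the principle being invoked, here is a minimal sketch of Huffman coding: the more frequent a symbol, the shorter its code. The function name and the frequency numbers are invented for illustration and are not real corpus counts.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code over freqs."""
    tiebreak = count()  # unique counter so the heap never has to compare dicts
    heap = [(f, next(tiebreak), {sym: 0}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; every symbol inside them
        # ends up one level deeper, i.e. gets one more bit.
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        merged = {sym: depth + 1 for sym, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]


# Hypothetical frequencies, chosen only to make the effect visible.
lengths = huffman_code_lengths({"the": 50, "tion": 30, "continue": 15, "zyzzyva": 5})
print(lengths)  # {'the': 1, 'tion': 2, 'continue': 3, 'zyzzyva': 3}
```

A script following this principle would spend the fewest strokes on the most common words and suffixes, which is the opposite of what the 41-stroke example shows.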

~~~
yorwba
For comparison, the letters in "continue" have approximately 15 strokes total.

~~~
labster
続く is 14 strokes, so Japanese beats English!

I could go on ... and "go on" is only 6 strokes.

~~~
vacri
You could just use the ellipsis, and use only three strokes. Or even simpler,
an em-dash at one stroke. :)

------
Animats
He wrote this before the rise of emoji. We may see something like this as a
cell phone writing form. Teenagers would use it to be incomprehensible to
olds.

------
Y_Y
I've been waiting for years for somebody to do something like this with emoji.
All we'd need is a way to add radicals and a central dictionary.

~~~
PeCaN
In a way, eyes and mouths are like the radicals for emoji.

I'm not sure how you'd sort emoji though, since there's no stroke order.

------
force_reboot
One thing that is glossed over slightly in these discussions is the
distinction between collections of sound-meaning compound characters where the
rebus is the same but the words aren't cognate, and where the words are
cognate.

I've read in some places that the scribes attempted to use the same rebus for
cognate words but everything I could find online was a re-hashing of
Wikipedia (or wherever the Wikipedia article is sourced from), which states
"However, the phonetic component is not always as meaningless as this example
would suggest. Rebuses were sometimes chosen that were compatible semantically
as well as phonetically."

I'm not sure how important this really is or how many characters that share a
common rebus are cognate. But to my aesthetic senses I much prefer characters
that contain etymological information over ones that record only the
pronunciation at the time the character was first written.

------
kwhitefoot
Is this meant to be serious (surely not) or rather like Mark Twain's take on
spelling reform?

~~~
millcoo
As regards Twain: the take you are thinking of was probably not written by
him, but wrongly attributed to him after his death.

He did, however, write an article[0] about spelling reform later in life, in
which he sincerely advocated the use of a simplified "longhand, written with
the _shorthand alphabet unreduced."_ That is, he proposed the use of Isaac
Pitman's phonographic alphabet—the basis of what was then the most popular
shorthand in the English language—without the brief forms, phrasings, and
abbreviations that allow stenographers to write (by omission) at the speed of
speech, but slow down the reading back of what they've written; in Twain's
preferred shorthand, every sound would be on the page.

[0] -
[https://books.google.com/books?id=KoBYAAAAYAAJ&dq=what%20is%...](https://books.google.com/books?id=KoBYAAAAYAAJ&dq=what%20is%20man%20twain&pg=PA256#v=onepage&q&f=false)

(Unfortunately, the Project Gutenberg transcription of this essay does not
include the plates of Twain's shorthand. This is why I've linked the Google
Books scan.)

~~~
kwhitefoot
I didn't realize it wasn't actually his.

Thanks for the link. Unfortunately, Google doesn't seem to want to show me the
content. I wonder if they are implementing some kind of region coding (I'm in
Norway).

I always wonder whether people advocating phonetic spelling have ever
encountered someone who speaks a different dialect. In Twain's case it is
pretty certain that he did as the Huckleberry Finn books include dialect
dialogue. How did he think such people would use a phonetic script or worse a
phonological one? Did he expect everyone to suddenly start speaking the
American analogue of what in the UK we call Received Pronunciation (RP)? If
not then surely another person's script would be even harder to read than it
is already.

~~~
millcoo
> I always wonder whether people advocating phonetic spelling have ever
> encountered someone who speaks a different dialect.

For just this reason, English spelling reform was probably doomed from the
start. Any new standard seems just as arbitrary as the old one to a speaker of
a non-standard dialect. The chosen set of vowels—whatever it might be—would
ring particularly untrue to the ears of millions of English speakers, who
might use any 12 to 20 vowels of a pool of a few dozen in their own speech.

My own example: I grew up in a part of the U.S. that makes no distinction
whatsoever between the vowels in 'thought,' 'lot,' and 'father'—yet these are
three separate sounds in Received Pronunciation, and might be divided into two
elsewhere. Not only can I not make three different sounds for these vowels—I
do not know how they would differ!—I cannot tell the difference between them
when listening to someone who can. (Perhaps the 'a' of 'father' I could note
in contrast to the other two with some conscious effort.) If I were to use a
phonological spelling system that split this vowel group into three, I would
have to memorize by rote—again!—the spelling of many words. What was supposed
to be an easy system that did away with rote memorization, turns out to be
more of the same. Perhaps this second learning curve is not so steep—no
'ough'-es—but I've already gotten past the first one! Why bother again?

If, on the other hand, a phonemic alphabet were provided—something like a more
elegant IPA—making our spelling as idiosyncratic as our speech, we would just
have a new set of problems. First, second language learners would find no
refuge from the dizzying number of dialects and accents while they were still
learning fundamentals. This would be a sore spot for any language, but
especially for a common lingua franca like English is today. And then, the
language of the law, the academy, and business might recede still further from
students unprivileged in class, birthplace, and schooling, if they first
learned to write in their own dialect, not that of the ruling classes.

There is an advantage to an orthography having _some_ remove from the spoken
language, as it provides a common ground for speakers of different dialects to
communicate. Chinese ideograms do particularly well here—to draw us back
toward the article—as entire articles might be written which could be
understood equally well by two speakers of mutually unintelligible dialects,
dialects so far removed from one another that they could be called separate
languages.

I don't think I'm letting you in on anything new here, but as it is a topic
I've put some thought into, I took the opportunity to turn into a windbag.

Sorry about the Twain link. Here's Gutenberg[0], if you're interested in the
text all the same.

[0]
[http://www.gutenberg.org/files/70/70-h/70-h.htm#link2H_4_001...](http://www.gutenberg.org/files/70/70-h/70-h.htm#link2H_4_0013)

------
superobserver
FYI we're close to this with the use of acronyms and abbreviations, not just
emoticons, given English's phonetic basis, etc.

------
golergka
[https://xkcd.com/1709/](https://xkcd.com/1709/)

------
jl6
What a breath of fresh air is a 1999-era web page. Fast loading, simple,
readable.

