1 \ /
\ ---- /
Image search for "kuchi-no kanji + shodou (calligraphy)":
It seems there are two schools on the relationship between strokes 2 and 3: 2 can extend downward past 3, or 3 can underscore 2.
Maybe it is a missing character after all :) (Oh no, I've been fooled by chrome/freetype/something!)
(I was expecting it to be a rant about insufficient unicode usage.)
That said, I expected something Japanese related, given that it's Candy Japan, but you'd think common fonts would have different looking characters for “no glyph” and the kanji for mouth.
Ha, for me, they look quite different in the font used for editing, but not so much in the rendered comment other than a small difference in size.
It's not clear whether or not that naming for that character originated in Japan.
As someone who speaks Chinese, I got a chuckle out of reading 'put your favorite snack in your 口 and 食t it!' due to the association in my mind of that character and its Chinese pronunciation, immediately followed by a 't'.
So in Mandarin you wouldn't find people using 食 as a verb, but in Cantonese it is the correct verb.
In Cantonese it's pronounced something like "sekk"
("sihk" in Yale romanisation, but I think if you're not familiar with Yale you'd try and read that "sick" or "sikh")
The pronunciations of kanji correlate to different Chinese eras and areas.
The usage and choice of characters are influenced by Classical Chinese, which was the literary language in Japan until, uhm, recently-ish.
There is a very good recent YouTube video that explains the usage of Chinese characters in Japanese and which I recommend to anyone interested in the subject:
I was very impressed by the accuracy and knowledge displayed, as most of what is written and said on the subject is ehhh somewhat disappointing.
From my brother who studied Japanese for a while, the Kanji was a real blocker for him.
So I did wonder what it'd be like, given I already know a good amount of Chinese characters.
I guess this video answers that question:
Even more confusing.
I learned a bit of one dialect from a friend's mum.
And the basics were very similar to Cantonese.
Would be interested to learn more dialects once I'm done with Mandarin and Cantonese.
†IPA something like this: /ʃɝ/
Edit: To add a clear example, look at 事. Standard Mandarin would pronounce it "sure", Beijing dialect goes even further and pronounces it more like "shar".
Not all the people in television shows, including talk shows, speak Standard Mandarin. This is especially true of shows produced in the north. They adopt a Beijing accent to varying degrees, which more or less leads to the erhua pronunciation.
For Standard Mandarin, listen to 新闻联播，the official news program from CCTV.
I notice that for non-native Chinese speakers whose mother tongue is an Indo-European language, erhua pronunciation seems to be easier than Standard Mandarin. If erhua pronunciation is pushed to the extreme, it is called 大舌头。
Also, erhua pronunciation is usually NOT used in very formal settings, such as presentations. Some people regard erhua pronunciation as vulgar, except in very widely adopted cases.
An interesting consequence of this is that you only need to learn around 3000 symbols to read a Chinese newspaper, just like how you can ascertain the meaning of an unfamiliar English word by having knowledge of a small set of Latin/Greek roots.
This is not a good result. Learning 3000 characters could take years, and even then there would still be some you don't know. To read an English newspaper you only need to learn 26 symbols. Yes you need to learn the words, but you also have to do that for Chinese which is separate from learning the symbols for each word. It's a hugely inefficient way to write.
Furthermore, while combining smaller words to represent a more obscure word is an improvement over introducing a new character (and is a step towards an alphabet), the examples I've heard are very one way. You might look at 程序員 and think "one who orders rules" makes sense for "programmer." But that sequence of characters could just as easily have meant any number of other things (lawyer, politician, etc).
Agreed. Reminds me of an anecdote from David Moser:
"I was once at a luncheon with three Ph.D. students in the Chinese Department at Peking University, all native Chinese (one from Hong Kong). I happened to have a cold that day, and was trying to write a brief note to a friend canceling an appointment that day. I found that I couldn't remember how to write the character 嚔, as in da penti 打喷嚔 "to sneeze". I asked my three friends how to write the character, and to my surprise, all three of them simply shrugged in sheepish embarrassment. Not one of them could correctly produce the character. Now, Peking University is usually considered the "Harvard of China". Can you imagine three Ph.D. students in English at Harvard forgetting how to write the English word "sneeze"??"
Regarding the grandparent's point of joining characters, my favorite has always been 火雞 (fire chicken) for turkey.
 Why Chinese Is So Damn Hard
Turkeys are not native to Asia so it had to be added late to the language, which is why it doesn't make much sense.
The problem with Chinese is that when you have a new thing like turkeys you have to either create a new character, which then people have to memorize forever, or you have to string together sort-of related ideas into a compound word that has only some tangential relation to its parts. When you have thousands of words that are only somewhat related to their parts, the parts lose their meaning and become not much more than a really large and complicated alphabet.
Chinese was over-engineered to work great for maybe a few thousand words, but the world keeps getting bigger and bigger and every new thing makes the Chinese language worse.
Replace English with any other language using the latin alphabet and this doesn't hold up as there is no indication of meaning. For that matter, there's not a strong indication of pronunciation. I may know English and French but that doesn't help me understand Hungarian or Swedish.
I do not know Chinese but my understanding is those 3000 characters are not only glyphs, they are elements of meaning. Incidentally, that is the same order of magnitude of words one needs to know to read an English paper, with the difference that in English, there is a portion of the meaning that is encoded in grammar, i.e. the relation & order of words. That is a cognitive overhead we take for granted in our language system.
In Chinese, you have to learn the spoken language, and then you have to spend just as much time (if not more) to learn the written language.
Many Chinese characters do contain elements of meaning and elements of pronunciation, but that doesn't mean that you don't have to learn them.
"Dilapidated" contains elements of its meaning in the root lapis = stone, but you'd be hard pressed to deduce the meaning from it.
Chinese characters are made by combining 214 radicals, and most of the characters are written by combining two of these. Most words are created by combining two such characters. It's not really that much more difficult than remembering how to spell an English word.
程序 means program. 程序员 builds on this, not the individual parts of the constituent words.
Chinese is actually a very logical language. The writing system is arguably more complicated than it needs to be (see Victor Mair and his quixotic crusade for romanization), but it conceals a language with very flexible morphemes and simple grammar. I recently had to translate 照顾着 and 被照顾着 for an app, and I had a hard time coming up with concise and decent translations that mirrored the simple relational antonymity of the Chinese.
Knowing Chinese characters is closer to knowing not only the alphabet but also understanding most basic English words plus Latin/Greek stems like auto (self), locus/loco (place), -ous (possessing or full of -), etc. From that base it's pretty quick to learn longer words and there are tons of clues to help you remember them.
Not necessarily true.
For instance, "Not necessarily true" is 18 characters and 2 white spaces, while in Chinese, it could be as simple as "未必".
While English might be among the worst alphabetic writing systems (compared to, say, Spanish, which has a wonderful correspondence between what is written and what is spoken), it is certainly a more efficient writing system than Chinese, in that fewer years of school have to be dedicated to just learning to read and write.
Furthermore, as highlighted by someone else, it is not uncommon for writers of Chinese to just completely forget how to write a word.
In English, you might misspell it, but it'll be rare that you can't render it at all.
Great book on the topic of Chinese, debunking several misconceptions, is "The Chinese Language: Fact and Fantasy" by John DeFrancis.
Also, FWIW, I couldn't read a paper at 3k characters. It's more like 4-5k if the goal is reading rather than slogging through with a dictionary.
Speaking of mainland-only variants, their official translation of computer is "计算机" (lit. computation machine).
A lot of non-English words for computer terms in various languages are translations of outdated English terms for the same that stuck around after the English terms changed.
That's nice, but I think most languages do this. It's not particular to Chinese. The word "computer" comes from "com-" (with/together) and "putare" (to reckon). Or "composition"; to position together. "compile"; to pile together, etc.
It is really interesting that in China people often ask their friends '吃了吗? (Have you eaten?)' rather than '嗨 (Hi)' in daily life. So initially I thought this post was describing something in Chinese w/o my noticing the URL.
Has entered mainstream as a cliche of Scottish parsimony but does exist in the wild.
Though I have extracted one dinner from an Edinburgh man who deploys the phrase, but that was probably because his ( English ) wife had put it on to cook earlier!
I'm in Canada and not a redneck, so I wonder if jeet is actually used in the US.
Not really. Foxworthy's joke isn't that that is actually a word, which it's not; it's that that is how the phrase "Did you eat?" comes out sounding in the accent of his home, which is a couple states east of mine. (Also apparently in the accent of New Jersey, which suggests to me that this particular trait may be more coastal than specifically Southern.)
The one bit of that I consistently use is "agnishna", for "air conditioner".
E.g.: "space ghettos" (in standard American accent) becomes Scottish "Spice Girls" :D
If the question is "你好吗" ("How are you?", although it literally translates to "Are you good?"), you can answer with "好" ("Good"). If you're asked "你是老外吗" ("You're a foreigner?") you can answer "是的" ("I am").
So it usually depends on the verb that was used in the question. Although you can often say "对", which, along with "是的", is the closest thing Chinese has to "Yes".
In this specific case, you would likely just answer "吃了" ("I ate") or "吃了，你呢？" ("I ate, what about you?").
Keep in mind that this is more or less the equivalent of "How are you doing?" or "What's up?" in the US; it's a greeting and people aren't necessarily expecting an actual answer.
Languages are weird, man.
> Square-shaped object (tripled) → high-quality goods spread throughout a protective container (compare 舗 and 販) → quality; counter for goods; (person's) character; grade; value.
But I think I might like to visit.
"口" = mouth, not crate
"品" is a stack of mouth characters
Two or three of something is a pretty common pattern in kanji. Two basically implies "several", while three implies "a fuckton". Two suns, for instance, is "bright". Three suns is a sparkling crystal.
Also, why is 食 translated as "Eclipse" in Google Translate?
Also, since ancient Chinese people thought eclipse was caused by a "dog in the sky" eating the moon, it's reasonable for them to use "食" to describe it. But modern Chinese almost always say "月食", meaning eclipse of the moon, to distinguish from eclipse of the sun, so Google Translate did this wrong. It should translate it to "eat" or "food" first.
I'm a westerner who learned Japanese as an adult. I feel it's quite unfortunate how much of the meaning was lost with the Chinese simplification. I can mostly make out Taiwanese/Republic of China newspapers, but can see nothing in the simplified characters.
Yay, I'm wrong, see below. Thank you internet.
On the other hand, you might recognize 吃 as the 喫 from 喫茶店 (café).
The Chinese simplification was overall a good thing. A lot of the simplifications are from Japanese, even. Like, compare the Traditional 體 with the Simplified 体 (body) - the latter is from Japanese.
Those simplifications appear to have been designed purely for reduction of stroke count (that is, making it faster to write by hand), not for simplification in the sense of making it more simple, logical, and consistent.
(As a matter of fact, that "simplification" introduced further inconsistencies, in that certain radicals were written differently when part of a character, while the traditional writing kept them unchanged.
Example: 金 gold is the left part of money, which you can see in the traditional 錢, but not in the simplified 钱. Similarly 言 in traditional 說 vs simplified 说.)
Japanese kanji come from Chinese characters, but have become very different too. As a native Chinese speaker, I think there is a clear link between modern simplified Chinese and 'near ancient' Chinese characters. Here 'near ancient' means characters used up to the 汉 (漢, Han) Dynasty. Before Han, Chinese was very different too. There has always been a simplification process and a link.
And although I don't know the "喰" character, I can tell it didn't become Chinese 吃 and Japanese 食. 食 is more "ancient", where "吃" seems only used so widely in modern time.
Simplification of Chinese characters indeed started many arguments, but the "traditional" Chinese used in Taiwan has also developed some "simplified" characters.
It was quite funny because staff at restaurants thought I was some kind of weirdo who could point at exactly what he wanted on the menu but couldn't answer basic questions.
They appear to have simplified them predictably in a way that is not impossible to understand if you have a senior HS-level of kanji knowledge.
I personally find that elitist though, but then again, in Chinese history there was always a distinction between what the commoner spoke (白话文) and what the intellectual elites wrote (文言文), so I'm not surprised that this mentality has continued.
I'd also like to add that 食物 is food, which translates literally to "eat" and "thing". Put together it means "edible thing", i.e. food. The original meaning of 食 is still "eat", not "food". It's a modern contraction that "食" means food.
The ghosts part was not supported.
Is a preference for 食 a Taiwanese thing? All the mainland Chinese people I've ever heard say 吃.
Since HK was quite insulated from the Cultural Revolution, and given the evidence from older texts that use 食 all the time (喫 was not really used IINM), it would not be amiss to say that the development to prefer 吃 is quite new. Hence in my other post I mentioned that it was a political agenda that drove the selection of preferred words to use.
addendum: I think there is also a nice narrative in the shift to use 吃 - it was more of a "commoner" word, and communism was then about replacing the elite-sounding words with simpler words that are common to everyone.
乞 is most commonly used with 乞丐 (beggar), but the etymology of the word comes from qi (气) according to zhongwen.com
I'm just suggesting from modern Chinese's perspective, because, after all, we are modern people. 文言文 (uh I don't know its English translation) is fun to read and learn, but it's like Latin since basically nobody writes it anymore.
I do find 文言文 to be quite elegant and terse though.
For the record, 食 is a pictogram of a small amount of food (the "roof" looking thing) stacked on a heavily stylized pictogram of a kind of table or plate (do an image search for 'takatsuki table'). It's claimed the Japanese word for bean 豆, has an older stylization of the same takatsuki table with a little bit of food at the top.
From a learning to read and memorization perspective, most people will probably find doing Look/Cover/Write/Check type drills (either manually or with a spaced repetition flashcard program like Anki) much more effective than using mnemonics based on (sometimes very complex) etymology.
EDIT: found another one: http://imgur.com/BsQsNBb
It uses mnemonics, but it only loosely follows real etymologies. It diverges into nonsensical but memorable stories when this will make things easier to remember. It also has mnemonics to remember tone and pronunciation (for Mandarin Chinese only though).
https://en.wiktionary.org/wiki/%E6%97%A5#Etymology if you didn't get the admittedly horrible pun.)
I don't know which of these is true, if any of them is. But your link doesn't have enough etymological information to indicate one way or another, either!
It's a "leaky abstraction". Makes things easier, until it doesn't. Much like "I before E, except after C"
For example, when you write "inconceivable," you're not regurgitating every single letter in a line from memory. You probably remember the prefix "in," "conceive," and you know "able." You probably also know the common patterns "cei" or "eive" or "con," so the word "inconceivable" really isn't as complex as the initial length makes it look, as long as you know the blocks.
Kanji/hanzi are the same way -- they look complex and inscrutable to the uneducated eye, but they're all made of common building blocks that make it easier to remember them. After all, human memory works roughly the same way all the world around; people wouldn't be able to memorize thousands of 20-stroke characters if they were all completely patternless.
The vocabulary utilizing kanji/hanzi works the same way.
Someone could look at "inconceivable" and say "well shit, that doesn't make sense! It's long, you'd have to memorize so many letters, and the letters themselves have so many bits! Plus it has 'in' in it, which makes no sense because 'in' commonly means 'inside of something', and 'con' usually means 'to swindle someone'! This alphabet thing is completely useless."
It's absurd, reductionist, and a bit offensive.
> people wouldn't be able to memorize thousands of 20-stroke characters if they were all completely patternless.
Well, people don't. 20 strokes is an unusually high stroke count, and people don't remember thousands of those. Simplified Chinese characters were created because traditional characters were too complex and cumbersome for people to remember.
Also why did Korea and Vietnam abandon them entirely?
Or just inertia. Norway has had a steady stream of language reforms over the last century aimed at bringing the official written language better into sync with a majority of spoken dialects. This is a result of hundreds of years of Danish rule that ended in 1814, followed by the period of national-romanticism in the period up to the subsequent break from Sweden, that led to a lot of desire to make language etc. more uniquely Norwegian.
As a simple example, we inherited parts of Danish counting.
It used to be in some parts of the country that we'd say "fire og tyve" for 24 - literally "four and twenty". This was changed to "tjuefire" (twentyfour) in the early 1950's. Anyone who has learned Norwegian in school since then has learned the new form in school and been marked down for using the old forms etc.
Despite that, and being born to parents who were in primary school when this had just changed and who learned the new forms, I still regularly use the old form.
I never learned it at school, and I occasionally had teachers complain about it. I don't use it consistently, to make matters worse - it's not a conscious choice to use a more conservative style or anything, it's just habit I picked up mostly from my dad, which is persisting in my spoken language now, when I'm 41, despite having changed in a language reform a couple of decades before I was born.
This is a difference where there's no practical benefit at all to the old form (it's longer, and the new form is more consistent with spoken Norwegian overall), yet more than half a century later the old form still persists out of habit.
In particular, trying to engineer changes to language tends to take a long time even when there's no resistance to the change.
Is that what they use hanyu pinyin for these days? I've always thought of pinyin as a pronunciation guide for Mandarin, similar to furigana in Japanese.
'what does "in-con-ceiv-able" mean?'
as opposed to 'what does "..." mean'
A Japanese beginner will see 照り焼き and say "uhhhh.. ri..."
compare this with seeing テリヤキ
Really? You don't think someone with less experience in English would say "what does... inkonsayvaybull mean?" There are plenty of instances where the word is not pronounced the way you think it is.
> A Japanese beginner will see 照り焼き and say "uhhhh.. ri..."
A child or beginner would probably be more likely to say "uh, what's that thing with the 日 and the 火, it's something ri something ki." Just because something is a symbol doesn't mean you can't describe it. Children are also very likely to just sketch out a picture of what they remember, even if it's incorrect, and you can usually figure that out.
You're missing his point, and English is a crappy example because its spelling is an unmitigated disaster. For example, if you can read and vocalize the Greek alphabet, you can just ask someone what "νόστιμο φαγητό"* means because you can vocalize it. You only need basic knowledge of the alphabet there. Whereas with Chinese/Japanese you need to have a good base of characters to be able to potentially vocalize an unknown character, which requires much more work than learning a new alphabet.
(* νόστιμο φαγητό means delicious food)
While many Chinese characters have a phonetic component (in addition to a component related to the meaning), it rarely corresponds exactly to the current pronunciation (in Mandarin).
Furthermore, you can very rarely conjure the right character out of the pronunciation and some aspect of the meaning.
If I had to guess, most Japanese people aren't going to have much trouble disambiguating ‘eat’ from ‘eclipse’. (And as zorceta explains, Chinese uses different hanzi for them anyway.)
For example, Twitter never really took off in southeast Asia, but Line is incredibly popular. Why? Stickers. Line offers endless little pictures you can use with your messages, while Twitter doesn't.
Now stickers (and emoji) are taking off a lot more in the West, because they are super compact and effective communication symbols. I think we'll see more and more of it.
I also think that the Latin alphabet could be easily used for Japanese, which does not contain any sounds that do not have an obvious equivalent in English, and even if it did, we could always repurpose a character or sequence of characters for that sound (do we really need a 'c').
Having said that, the Japanese phonetic system writes voiced sounds as a modification of their unvoiced counterparts. Why can't we all do that?
The biggest risk of using Latin is that simply sharing an alphabet could cause spelling conventions of other languages to bleed in.
Also, it's not like you stop learning even after school. For example, according to the Oxford dictionary, English has 171,476 words in current use, excluding inflections and several technical and regional vocabularies. Do all English university students know these words?
• It's possible to know how to say a word, but have no clue how to write it. This phenomenon is called character amnesia, and it affects most native speakers. Phonetic languages allow you to write out a misspelled word, which readers can understand (or autocorrect can fix).
• Likewise, it's possible to know what a symbol means, but have no idea how to pronounce it. This is extra-fun in Japanese, where most kanji have multiple pronunciations.
• Looking up words is harder, as there are no "letters" to sort by. Sorting can be done by stroke count, by radical (four corners or SKIP), or by phonetic spelling (in pinyin or hiragana). Modern technology has made this easier, and some phone apps (like Pleco) can even OCR hanzi. Still, it's far less convenient than phonetic languages.
The only aspect in which logographic systems win is information density. You can fit more words on a single page. This is obvious if you've ever seen Chinese or Japanese copies of works that were originally written in English. The Harry Potter books are crazy thin. Also, Chinese and Japanese tweets can express a paragraph of information.
> Likewise, it's possible to know what a symbol means, but have no idea how to pronounce it.
As a second language learner of English I can attest that this is not just a problem of languages written in logographic systems:-)
>The only aspect in which logographic systems win is information density.
I vaguely remember a paper that claimed that information density is pretty much constant across languages and writing systems, but I couldn't find it just now. There is another thread on HN where people compared the size of the "Universal Declaration of Human Rights" in different languages. I think this misses the point because it doesn't account for intra-character information density.
It'd be much more interesting to render the text into a bitmap and then compare compressed bitmap sizes.
Sorry if it wasn't clear, but by "information density" I meant area on a page or screen, not digital bytes. In the thread you linked to, people correctly point out that digital information density depends on the encoding, and that compression schemes matter far more than language.
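To make the encoding point concrete, here is a rough sketch (my own illustration, not taken from the linked thread) that measures the "Not necessarily true" / "未必" pair mentioned elsewhere in this discussion under two encodings and after zlib compression. The helper name `sizes` is made up for this example, and strings this short say nothing reliable about compression; the only point is that the byte counts move with the encoding while the text stays the same.

```python
import zlib


def sizes(text: str) -> dict:
    """Return the size of `text` in characters and in bytes under a few schemes."""
    utf8 = text.encode("utf-8")
    utf16 = text.encode("utf-16-le")  # little-endian, no byte-order mark
    return {
        "chars": len(text),
        "utf8": len(utf8),
        "utf16": len(utf16),
        # zlib adds fixed overhead, so tiny inputs usually grow rather than shrink
        "utf8_zlib": len(zlib.compress(utf8)),
    }


english = sizes("Not necessarily true")
chinese = sizes("未必")

for label, result in (("English", english), ("Chinese", chinese)):
    print(label, result)
```

In UTF-8 the Chinese phrase takes three bytes per character, while in UTF-16 it takes two, and the English phrase goes the other way (one byte per character in UTF-8, two in UTF-16). So which text is "smaller" digitally is partly an artifact of the encoding, which is a separate question from how much area it takes on a page.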
The paper you're probably thinking of is A Cross-Language Perspective on Speech Information Rate, which (as the title indicates) studied spoken language, not written. Annoyingly, the study was widely misrepresented in the media. It found that languages with lower information density tended to have higher syllabic rates. That is: Spanish contained less information per syllable than English or Mandarin, but Spanish speakers spoke faster to make up for that. Most media summaries of the paper omitted an important finding: the compensations didn't balance out. Different languages had different information rates. In the study, English had the highest. The runner-up (French) was 10% slower. And Japanese was 30% slower at conveying information.
2. This blog post has a more accessible summarization of the data: https://www.tofugu.com/japanese/why-do-japanese-people-talk-...
You can certainly write things out in kana. When I was more serious about studying Japanese, I knew less than 1000 kanji, but had a vocabulary several times that size, and would at times write out the word I meant in hiragana. And if we're counting autocorrect, your IME is going to take that hiragana and let you find the character.
>• Looking up words is harder, as there are no "letters" to sort by. Sorting can be done by stroke count, by radical (four corners or SKIP), or by phonetic spelling (in pinyin or hiragana). Modern technology has made this easier, and some phone apps (like Pleco) can even OCR hanzi. Still, it's far less convenient than phonetic languages.
Eh, I disagree here. It's harder if you're used to looking things up by the spelling, but once you're fast at looking things up by radical, it's not that difficult. My misguided attempts at slogging through 1Q84 while reading at a, at best, middle school level got me pretty fast at looking up kanji. Not any appreciable difference vs. looking things up in a regular dictionary.
Even without autocorrect, you can write a word in English such that most people would understand. Of course, in a logographic system you'd just write a homophone (which is what people actually do, write a simpler word pronounced the same).
As for looking up, it is in principle easier though. You only need to learn the order of about 26 things, not about 200, and can then run iterative binary search over it, and don't have to switch to stroke count. It is possible, of course.
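For what it's worth, the "learn one ordering, then binary search" idea can be sketched in a few lines. This is only an illustration under my own assumptions: the word list is made up, `lookup` is a hypothetical helper, and the ordering is plain code-point order, which happens to match alphabetical order for lowercase ASCII words.

```python
from bisect import bisect_left


def lookup(sorted_words: list, target: str) -> int:
    """Binary-search a sorted word list; return the index of target, or -1 if absent."""
    i = bisect_left(sorted_words, target)
    if i < len(sorted_words) and sorted_words[i] == target:
        return i
    return -1


# A toy "dictionary"; a real one would have tens of thousands of headwords.
words = sorted(["eat", "eclipse", "food", "mouth", "sneeze", "turkey"])

print(lookup(words, "mouth"))  # index of an existing headword
print(lookup(words, "snack"))  # -1, because "snack" is not in the list
```

The same search works for any total order, including radical-then-stroke-count order. The difference being argued about is how much has to be memorized before a reader can compare two entries, not whether binary search applies.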
Anyway, any alphabet is better than Chinese characters.
I don't think English is much better in these cases. In fact, the writing can be so divorced from speech that spelling bees are a thing.
> I was once at a luncheon with three Ph.D. students in the Chinese Department at Peking University, all native Chinese (one from Hong Kong). I happened to have a cold that day, and was trying to write a brief note to a friend canceling an appointment that day. I found that I couldn't remember how to write the character 嚔, as in da penti 打喷嚔 "to sneeze". I asked my three friends how to write the character, and to my surprise, all three of them simply shrugged in sheepish embarrassment. Not one of them could correctly produce the character. Now, Peking University is usually considered the "Harvard of China". Can you imagine three Ph.D. students in English at Harvard forgetting how to write the English word "sneeze"?? Yet this state of affairs is by no means uncommon in China. English is simply orders of magnitude easier to write and remember. No matter how low-frequency the word is, or how unorthodox the spelling, the English speaker can always come up with something, simply because there has to be some correspondence between sound and spelling.
I can read and write (via pinyin) a large number of characters, but cannot recollect their shape in abstraction.
I think that's just because as a foreigner learning chinese in the modern world I've never had to learn this skill.
The difference between Recollection and Recognition.
But because I never write characters by hand, I have a really hard time reading handwritten notes, and that is a problem.
If you're bringing computers into it, isn't text entry in Japanese usually done phonetically anyway?
Here is a website which questions you with some random sample of words from an English dictionary, mixed with randomly generated non-words. Then it estimates the percentage of English words you know.
I am a non-native speaker, and I have scored in the 77% to 89% range, when doing this test several times.
I got 73% and I didn't say 'yes' to any fake words.
73% is apparently "This is a high level for a native speaker."
Writing Japanese entirely in Latin characters would be no different from writing it entirely in hiragana. Have you ever tried reading that way?
Having read English language papers on Japanese linguistics, I can also say that reading the Latin is easy too.
The larger point being, Japanese isn't locked into using a logographic system - it already has two phonetic syllabaries that people could start using exclusively if there was some advantage to doing so.
You stuck an extra “do not” in your sentence
* * *
As far as alphabets go, the Phoenician/Greek/Etruscan/Latin alphabet is pretty ad hoc and mediocre. But hey, it’s what we know. At this point, I think we’re stuck with it.
Similar story for modern Hindu/Arabic/European numeral glyphs. Learning arithmetic would be noticeably simpler if the glyphs expressed some of the symmetries of the number system. Alas.
As far as the alphabet itself goes, I do not think that Latin is that bad. All symbols have a canonical sound associated with them. The problem is that our usage of the alphabet is horribly inconsistent. This is partially due to the fact that English has sounds that cannot be expressed using the "pure" alphabet. Arguably Japanese has this same problem in their system, with the ゃ、ょ、ゅ modifiers. But at least they distinguish those from や、よ、ゆ by size, and are disciplined about their usage, so we can consider the set of compounds to be their own characters and not have a mess.
Of course you still have the ず/づ issue, and the pronunciation of は and を as わ and お in their most common usage. But, even in modern Japanese, these oddities are not universal.
Out of curiosity, are you aware of any numeral system that beats Arabic? By pre-Arabic European standards, Arabic numerals are a masterpiece of symmetry.
It can also be nice to use a “balanced base”, with digits for negative numbers, e.g. in a base ten context you’d have digits for –4 to 5 (or if you’re willing to have multiple expressions for the same number, –5 to 5).
A balanced base twelve multiplication table might look like this: http://i.imgur.com/quEcxH0.png
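To make the "balanced base" idea concrete, here is a minimal sketch of converting an ordinary integer to balanced digits and back. It uses the -4 to 5 digit range mentioned above for base ten (and, by the same rule, -5 to 6 for base twelve). The function names are my own, and this is just one way to do it.

```python
def to_balanced(n, base=10):
    """Return the balanced-base digits of n, most significant first.

    Digits run from -(base - 1) // 2 up to base // 2,
    i.e. -4..5 for base ten and -5..6 for base twelve.
    """
    if n == 0:
        return [0]
    high = base // 2
    digits = []
    while n != 0:
        r = n % base          # ordinary remainder in 0..base-1
        if r > high:
            r -= base         # fold large remainders into negative digits
        digits.append(r)
        n = (n - r) // base   # exact division, since n - r is a multiple of base
    return digits[::-1]


def from_balanced(digits, base=10):
    """Inverse of to_balanced: evaluate the digits positionally."""
    value = 0
    for d in digits:
        value = value * base + d
    return value


print(to_balanced(6))    # [1, -4], i.e. 10 - 4
print(to_balanced(95))   # [1, -1, 5], i.e. 100 - 10 + 5
print(to_balanced(-6))   # [-1, 4], i.e. -10 + 4
```

One nice property that falls out: negating a number just flips the sign of digits where that stays in range, and no separate minus sign is needed for negative values.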
You mix the whole development line of that Latin alphabet into one dismissive argument. I see lots of difference between the Phoenician and the Latin alphabet and FWIW, the Latin alphabet is quite versatile as its wide application shows.
I wonder what you consider mediocre about them?
> Similar story for modern Hindu/Arabic/European numeral glyphs. Learning arithmetic would be noticeably simpler if the glyphs expressed some of the symmetries of the number system. Alas.
I don't think learning arithmetic would be much simpler with other numerals. Even the Romans could do it and they had one of the worst possible numerical systems.
I find our numerals quite fine. My daughter was recognizing numbers before she turned 2. There is some mnemonic to the first four (1 line, 2 corners on the left, 3 corners on the left, 4 corners overall) and most are quite distinct from our Latin letters. 6 and 9 are annoyingly symmetrical of each other, though.
For an example of a better designed alphabet, check out Korean Hangul.
The numerals 1, 2, 3 come from just writing strokes, like tally marks, which over time became connected in handwriting. The other numbers were mostly fairly arbitrary symbols, which morphed slowly over time with occasional replacements and swaps. Otherwise, the symbols have absolutely nothing to do with the numbers they represent or with the base ten number system. Overall, I’d say numbers 0 and 1 are pretty effective. The rest are a huge waste of potential.
Same story for the words/names used to represent the numbers. They are made of arbitrary sounds in arbitrary numbers of syllables, reveal nothing about the theoretical properties of the numbers, some of them are hard to say or easy to mistake, etc. Especially for numbers beyond ten, the names are irregular and confusing. This has a real practical impact. Counting is notably easier for Chinese speaking children than for English speakers.
> I don't think learning arithmetic would be much simpler with other numerals. Even the Romans could do it and they had one of the worst possible numerical systems.
In general, Romans did their arithmetic using little pebbles (“calculus”) on a counting board (“abacus”), and used written symbols only for recording the output of their calculations. This made some types of computation very difficult (because using pebbles to record every step gets cumbersome), which helps explain why science has taken off in the past 500 years in Europe after we started developing better notational conventions and using Hindu–Arabic numerals and later decimal fractions, logarithms, etc.
My son is about 2 weeks old, so I can’t tell you yet how well he learns arithmetic using a different set of numerals. Ask me again in about 10 years.
Fun fact, we do do that in English, at least for C and G. (G was introduced as a modified C to indicate the voicing).
Languages are not solely a means of communication but a part of a people's cultural identity. I think the greater dependence on contextual cues and ambiguity in Chinese/Japanese lends itself much better to linguistic art forms like poetry and literature.
There are pros and cons. A big con with alphabets is that words lose their meaning over time. I find reading Old English (1500 years old) to be less comprehensible than "modern" Latin, despite being a native English speaker and only knowing a little Latin.
I find reading even Early Modern English (400 years old) an effort initially before I get reacquainted with it (Shakespeare).
In 300 years time I hate to think what English speakers will think of our texts.
That said, if I had to choose another language to learn, it would be one with an Alphabet, which seems far easier to me to learn, and type, than memorizing 1000s of symbols.
Easy to learn, no more trying to guess if it's on/kunyomi, immune to mispronunciation from using a foreign alphabet, the list goes on.
Or rather, the 2-3 months would be ten times easier and everything after would be ten times harder.
What is the advantage of using a different symbol for each word, that offsets the huge disadvantages of having to learn and remember a different symbol for each word?
Especially considering that the spoken language already distinguishes between all possible words through pronunciation (and context in the case of homophones.)
So, obviously learning 1000 kanji isn't easy. But doing that is what makes it possible to learn 100,000+ words whose pronunciations and meanings would be otherwise largely unrelated.
It's quite similar to the role that Latin/Greek roots play in English. When you see a word that includes "-graph-" you know it probably involves writing, and similarly when a student of Japanese sees a word with "間 (kan)" they know it involves an interval or space. Throw away the kanji, and your student now just sees "kan" - which means the word will probably involve an interval -- or a barrier, or emotion, or appearance, or a tube, or a building, a warship, a crown, an ending, China, a publication, a government ministry, or.. you get the idea.
Spoken language is quite limited compared to written Japanese.
Assuming yes, do their users have significant problems understanding the written text when pronounced in an audiobook? Are there well-known conventions or shortcuts or explanations that audiobook readers insert into their speech to signal the correct meaning of the word?
Do Japanese audiobooks provide evidence for or against the idea that doing away with kanji in writing would not harm understanding significantly?
It was/is far easier to ensure redundancy of scripts and books, since the cost of reprinting/copying was far lower compared to phonetic systems.
The compactness explains how so many archaic, buddhist scripts could survive to this day.
Could you elaborate on why the alphabetic system is intrinsically more efficient than Chinese characters in terms of recovering messages from partial loss of texts?
It's the same principle as how soldiers are trained to spread out when in battle: If they bunch up, it increases the risk that a single mortar shell (or artillery round or machine-gun burst) could take out a lot of troops.
Though I do not have data to back up my argument, I still reckon the Chinese glyphs/scriptures would have had a better chance of survival.
While I think your point is valid, its disadvantages outweigh the advantage, at least since paper/papyrus was invented.
Being spread out to double in length (double being an arbitrary multiplier) would still be inferior to being dispersed to two physical locations (redundancy). I think this is where don't put all your eggs in one basket holds true.
Plus, important docs must have been actively maintained by hired librarians(?). With human maintenance involved, less in volume could have been an advantage for it is easier to move around and maintain the docs. Ofc, when left out in the wild, it is a different story.
Personally I do not like the Chinese character system, as it has so high a barrier to entry for learners. I love alphabets, Korean Hangeul, or Japanese Hira/Katakana for this matter. Have you tried learning any of those? :-)
I'll try to give an example:
The quick brown fox jumps over the lazy dog
Note: This is composed by me, maybe not very well-written, and maybe it can be even more compact, but you see what I mean ;)
A vomitorium (any modern person associates that with vomiting, i.e., stuff coming out of your mouth) was the name for entrances in Roman amphitheatres.
It's not that a land's-mouth is a crater, more like, mouth and opening are more synonymous in feeling in Japanese.
(idiomatic) To exert maximum effort.
It does not include sounds which are borrowed from other languages, like the Hebrew 'ch' which happens at the top of your throat, or the Spanish trilled r, or the glottal stop which actually occurs in spoken English all the time in some dialects.
If king is 王, kingly is 王ly, and royal is 王al, what is regal?
If mouth is 口 and mouthed is 口ed, why would ate be 食t rather than 食ed?
Japan misinterpreted the Chinese writing system (already terrible) into easily the worst writing system known to mankind. It won't look cute when you go beyond two symbols.
Yes. And that's exactly what they do in Japan. Rather uncommonly among languages, English is a fusion between Romance and Germanic stems. Often you'll see two phonographic bases for the same concept. Easy ones are food: Pork/Swine, Cow/Beef, but also Oral/Mouth, etc.
Japanese is similar: many words have both their native Japanese stem and an imported Chinese sound. The imported Chinese character was assigned to the native Japanese word, even though it sounds totally different. I imagine it makes Japanese quite frustrating to learn as a Chinese speaker, but Chinese is really easy to learn for a Japanese speaker.
Take for example 食 (eat, from the article)
can be read: i, ji, jiki, shoku (which derive from Chinese, imported twice during two dynastic periods, corresponding to Chinese ji/yi/zi)
but also deriving from native Japanese words: ku-, ha(mu), ta(beru), o(su), uka, uke, ke, shi (the last five are very rare).
All of these uses mean variously "to eat". The pronunciations are entirely contextual.
I'm aware that that's what they do in Japan. I'm pointing out that it's idiotic.
In Chinese it'd be 王道 (the tao of the King if translated directly to English), which has a completely different connotation (it's more holistic in concept (i.e. it has an overloaded meaning) when compared to "kingly" in English, which has a more singular-use meaning).
"Royal" itself is an overloaded adjective in English. AFAIK there are no adjectives in the East Asian languages that have the same semantic meaning as "royal". The Japanese version would simply be 王の, which translates to "belonging to the king", while the Chinese version would be 王室的 or 王室之 (belonging to the king's office; I guess you could say crown). Ditto with "regal". There simply isn't any proper translation for the adjectives in English.
Other than that, in Chinese, there is the concept of a radical, which can be combined to inform the readers about the context it's used in. In Japanese, as bemmu wrote, it'd be additional kanas to inform of context.
Fun fact: 之 and の used to denote the same things up to about 300-ish years ago I believe (timeline could be wrong). In Japanese 之 may be pronounced the same as の. Either way, the Chinese and Japanese words are very much the same, barring some minor kanji differences. Grammarwise, however, it's a completely different language
I'm not sure about that. Japanese has adjectival nouns (commonly referred to as na-adjectives), and the 的 suffix.
Additionally, as you identify, the の particle also serves this function; but you give it a much more restrictive role than it actually has (as is typical in English-language Japanese learning material). In general, の marks the genitive case, which simply means that the first noun modifies the following noun in some way. It is often used to show possession, but can also be used in a way close to ~ly in English.
No, it's because I thought making the point in English made more sense in an English-speaking forum than making the point in Chinese. I'd go with 像国王一样(的) "like a king" or 适合国王(的) "fit for a king" for the English senses of "kingly".
The point is that, as bad as the Chinese writing system is, it's still fundamentally a writing system. 女, 娘, and 妮 may all variously mean "girl" (actually, 女 is an adjective), but they are written differently because they are different words (or, as the case may be, stems). Conversely, it makes sense to talk about the pronunciation of a Chinese character. Japan somehow overlooked this principle when trying to adopt writing, and the Japanese system is a total mess. Japanese 漢字 can only be read in context; in isolation, they represent a grab bag of some unrelated words with shared semantics along with assorted nonsense syllables.
> Other than that, in Chinese, there is the concept of a radical, which can be combined to inform the readers about the context it's used in. In Japanese, as bemmu wrote, it'd be additional kanas to inform of context.
This appears to be... nonsense? The concept of a character radical is not restricted to Chinese. It refers to a part of the character that gives you a hint about the overall meaning. For example, the radical of 冷 "cold" is the 冫 on the left, which means "ice". The radical of 切 "cut" is the 刀 "knife" on the right. And the radical of 漢 "the Han race" is the 氵, which means "water" (they're not all helpful). They are an inherent part of the character and are totally independent of any context. And the kanas you describe as "inform[ing] of context" in the OP do no such thing; they encode grammatical suffixes which have no characters of their own. Chinese uses (wait for it...) Chinese characters for the same purpose; Chinese grammatical markers, unlike Japanese ones, do have dedicated characters. Radicals are a completely unrelated phenomenon.
Japanese's appropriation of hanzi is largely a historical accident due to geographical circumstance, but most people learned in the language would agree it's far more efficient than simply using hiragana or katakana or even romaji; disambiguation by pictographs (though in modern times they are more accurately phono-semantic compounds) is of great value in written language where space is at a premium.
It is the opposite. A writing system that requires multiple years to learn is not "efficient".
"Mouth" is both a noun and a verb.
The past tense of "eat" is "ate", not "eated".
No wonder English is so difficult to learn.
Each one has more than 1 reading, a particular stroke order, and many other things.
Except that usually, almost nothing about the arrangement of strokes can be inferred from the sound or meaning of the word, and vice versa.
It's a lot more complex than just using letters, which is something they can also do.
I've just spent the last few hours learning all about languages, how they developed, and each culture's spin on adding as much meaning as efficiently as possible to written symbols. I've always loved languages, so this was more of a brush-up plus learning.
It seems, and rightly so, that ambiguity is death to any character, and that efficiency is also fundamental to it.
I'm not Korean, but I like their style. Literally: I like how their writing is so efficient in relating characters to mouth position. It was created because Chinese characters didn't suit the Korean language. Japan also streamlined Chinese characters to better suit their culture.
Mayan is another wild language full of meaning in such compact symbols. I had a hard time following their characters.
I have no , but I must ...
Edit: Nevermind, HN swallowed the Emojis.
I recognized what the author is doing from the start as a Chinese speaker.
The keyboards look about the same. The trick is there is also a phonetic alphabet that you can use to compose the ideographic characters. Basically Japanese input methods work kind of like autocompletion in an IDE. You spell the word you want using phonetic characters and a little popup lets you pick the ideographic transliteration when you hit the space bar.
Here's a typical Japanese keyboard https://s7.postimg.io/i5fg1c1kr/SKB_KG3_BK_FM.png There are a few different kinds with slightly different ways of working but they're more or less the same. A lot of Japanese just use a standard American layout and spell the phonetic characters using Romaji (Japanese phonetic characters transliterated to Latin characters).
Furthermore, how was the placement of the characters decided? Are they more closely related to QWERTY (so that typewriters don't jam) or to Dvorak (so that the most frequently used letters are on the home row, and so that alternation between left and right hand is maximized), or unlike either? I use Dvorak and if I were to learn Japanese and type it on a keyboard, I'd want the typing experience to be similar to how it feels to type on Dvorak for English compared to QWERTY.
To explain how it works in detail, it's important to first note that Japanese doesn't really use spaces between words. If you're used to English or most Western languages, this probably sounds like it'd be hard to read, but actually it's not. You basically compose one word at a time and it draws an underline under the word you are currently typing, then when you press space bar it autocompletes to the correct ideographic spellings. Pressing space again lets you cycle through different possible transliterations (including leaving it spelled in phonetic characters); it almost always gets it right except for homophones (which there are a lot of in Japanese) or if you want to deliberately pick some unusual/archaic spelling for stylistic reasons. Then you proceed to composing the next word. Sometimes dedicated Japanese keyboards have a separate button for the "autocomplete" function, but most use space bar AFAIK. Wikipedia has a description with a demo image of the Windows IME (they all work pretty similarly AFAIK) https://en.wikipedia.org/wiki/Japanese_input_methods . Cell phone input described on that page is where things get interesting / deviate more if you're interested.
So there's a small amount of extra overhead with selecting the correct transliteration, but it's minute once you get used to it. Japanese is I'd say slightly more information dense than English, so it compensates and the typing speed is about the same. I'm not actually sure how the character layout was chosen for the Japanese keyboards. Looking at it roughly, I'd guess that the placements are approximately matching the English usage frequency of QWERTY corresponding to the frequency of usage in Japanese as that looks about right to my eyes, but that's just a guess.
You can use Dvorak or whatever you want to though. You don't need a specialized Japanese keyboard. Instead of typing the kana (phonetic characters) directly you just type their Romanized form. So instead of typing たべます (phonetic spelling of "I eat") you'd type "tabemasu", but it'd otherwise behave as I described. I know a lot of Japanese that don't even bother using proper Japanese keyboards and just use standard English keyboards, especially programmers. You'll have to fiddle with some settings, but I know for sure that it can at least be done on Linux and Mac and I'm sure Windows can do it too.
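The romaji flow described above (type Latin letters, get kana, press space to cycle through candidates) can be sketched roughly like this. To be clear, this is a toy and not how any real IME is implemented: the romaji table covers only the syllables needed for the examples in this comment, and the candidate dictionary is a hypothetical three-entry stand-in for a real conversion dictionary.

```python
# Toy romaji table: just enough syllables for the examples in this thread.
ROMAJI_TO_KANA = {
    "ta": "た", "be": "べ", "ma": "ま", "su": "す", "ru": "る",
    "sho": "しょ", "ku": "く", "do": "ど", "u": "う",
}

# Hypothetical conversion dictionary: kana spelling -> ranked kanji candidates.
CANDIDATES = {
    "たべる": ["食べる"],
    "たべます": ["食べます"],
    "しょくどう": ["食堂"],
}


def romaji_to_kana(text):
    """Greedy longest-match conversion from romaji to hiragana."""
    out = []
    i = 0
    while i < len(text):
        for length in (3, 2, 1):
            chunk = text[i:i + length]
            if chunk in ROMAJI_TO_KANA:
                out.append(ROMAJI_TO_KANA[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"no kana for input starting at {text[i:]!r}")
    return "".join(out)


def convert(romaji, space_presses=1):
    """Return what the underlined word shows after pressing space N times.

    Zero presses leaves the plain kana; each press moves to the next
    candidate, with the kana spelling itself as the last option.
    """
    kana = romaji_to_kana(romaji)
    if space_presses == 0:
        return kana
    options = CANDIDATES.get(kana, []) + [kana]
    return options[(space_presses - 1) % len(options)]


print(romaji_to_kana("tabemasu"))  # たべます
print(convert("taberu"))           # 食べる
print(convert("taberu", 2))        # たべる (cycled back to the kana spelling)
print(convert("shokudou"))         # 食堂
```

A real IME ranks candidates using the surrounding text and your own history, which is what makes it feel like autocomplete rather than a lookup table.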
Edit: to explain how Japanese writing works a little better, there are actually 3 "alphabets" - two phonetic (hiragana and katakana) plus the ideographic alphabet, called kanji. The phonetic alphabets always correspond to the same sounds, whereas the kanji refer more to a meaning/idea and can be read to correspond to multiple different sounds depending on context. For example 食, the character for eating/food, is pronounced "ta" in "taberu/食べる" ("to eat"/"I eat"), but "shoku" in "shokudou/食堂" ("cafeteria"; the two characters literally mean "eating room").
Kanji usually only serve as the "root" of a lot of words and Japanese writing tends to be a mixture of ideograms and phonetics. For example, 食べる, where べ ("be") and る ("ru") are phonetic characters. If you conjugate the verb to for instance past tense 食べた ("tabeta"/"I ate"), the root character 食 that means "eat" stays the same but the phonetic characters change to indicate tense. It's also completely valid to not use kanji and spell things out entirely phonetically. This is how most people start out, gradually replacing the phonetic spellings with more and more kanji as they learn, but for an adult to write that way is considered childish / uneducated.
Japanese is a lovely language, I recommend learning some. Relative to English I'd say it's actually grammatically much more organized / logical (and hence easier), but on the other hand reading and writing are significantly harder to learn. The easy availability of manga/anime/novels in both untranslated and translated forms across a wide range of language levels makes it much more accessible than it was even just 10-20 years ago.
It's an April Fool's joke, obviously. For Japanese you can use romaji or some other phonetic system; for Chinese you can use pinyin or bopomofo.
I don't think anyone does it kanji-by-kanji. In reality, it's really autocomplete as it exists with English keyboards. You type the first few bits in hiragana or romaji, then autosuggest comes up with commonly-used words and you select the one you want.
In addition, hiragana input on mobile devices is FAR faster than romaji input, so I'm not sure how you lose time.
Typing Japanese on mobile phones is pretty fast and painless. Easier than English on a regular keyboard, though not quite as fast as English "swipe" style input (whatever the term for that is?).
There's often (always?) a reaction when new forms of communication are introduced to language. Usually the older generation railing against the "degradation" or "misuse" of the language the way they learned it. For example txt and emojis currently in English.
Similarly with e.g. Chinese and modern IMEs. Because there are frequently many characters that match phonetically, to save time, typically younger people just pick the first one that's suggested. Mostly due to internet and phone/tablet use.
A Japanese phonetic IME should have enough local context to guess which Kanji is correct. You type, then go back and quickly proofread later.
I recall this window strongly, with ~10 items in it being common. Playing with it now in OS X perhaps it's improved in recent years.
In fact, you can often just type initials for your entire sentence and the IME will guess correctly all the way through. It's like being able to type 'w a y u t?' and getting 'What are you up to?' filled in for you.
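As a rough illustration of the initials trick, using the English analogy from this comment rather than real pinyin: the phrase list and its frequency counts below are invented for the example, and a real IME would draw on a much larger language model.

```python
# Hypothetical phrase table with made-up usage counts.
PHRASES = {
    "what are you up to?": 50,
    "why are you up there?": 5,
    "where are you?": 30,
}


def initials(phrase):
    """First letter of each word, e.g. 'what are you up to?' -> 'wayut'."""
    return "".join(word[0] for word in phrase.lower().split())


def complete(typed):
    """Return the most frequent known phrase whose initials match the input."""
    key = typed.replace(" ", "").lower()
    matches = [p for p in PHRASES if initials(p) == key]
    if not matches:
        return None
    return max(matches, key=PHRASES.get)


print(complete("w a y u t"))  # what are you up to?
print(complete("w a y"))      # where are you?
```

Both "what are you up to?" and "why are you up there?" share the initials "wayut"; the sketch simply picks whichever has the higher count.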
I'm pretty sure Chinese is even more homophonic than Japanese, which is why I'd expect the Kanji inference to work better.
(In Japanese, the topic of the original article, the situation is somewhat better, as mostly only a fairly well defined subset of about 2000 characters is used, next to (two) syllabaries and the Latin alphabet).
Knowing only 250 or even 1000 characters is quite unsatisfying, though. When you read a newspaper, it'll read like this: "At the meeting yesterday, president <someone> said that it is of great importance to <something> the <something>, otherwise surely the <something> will <something> down. <someone> suggested a possible solution, though, by bringing the whole country together to <something> for the future and implement a better <something>."
Sure, you understand 90%, but that's not really cutting it.
Different Romance languages could be represented this way. In the same way that Mandarin and Cantonese use similar character sets with different pronunciations, and with some characters specific to each one, different languages that have Latin roots would have a few of their own characters specific to their own language, but mostly drawing from the Latin pool.
The pronunciation for each would have to be memorized of course.
Romaji works just fine, albeit with the addition of a macron diacritic. Though if it really was the primary writing system, a way to notate tone accent might be necessary (My gut says no, it carries relatively little semantic weight).
For example, how many kanji can be read as 'shuu'/しゅう:
Try to do that with tones.
I'd be willing to bet heavily that the vast majority of those "homophones" are primarily writing-only, domain specific or archaic "shorthands", which are referred to in speech with slightly more verbose alternatives. Switching to a non-character based system would admittedly in that case mean some domain specific writing would be slightly less compact, but that seems a reasonable tradeoff given the unwieldiness of the current writing system.
You'd lose your bet. In that "shuu" link as an example, most (10-12 or so) are common enough that you might hear them in a typical newscast, with that pronunciation.
What makes things manageable is the combinatorics. E.g. there are dozens of kanji read "shuu", and many dozens more read "kan", but most of them are only read that way when part of a 2-character compound, and only a small subset of the possible "shuukan"s are words, and only a subset of those words are common in spoken conversation.
Even then, it is a very homophone-heavy language. I can think of four "shuukan"s off the top of my head that you might hear from a newsreader; it would only be after those that you'd get into domain-specific words. This is pretty typical.
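The combinatorics point can be shown with a small count. This is only a sketch: the two kanji lists are short samples of characters that have these readings, and the word set is a hand-picked handful of real "shuukan" words (week, habit, weekly publication, imprisonment), not a dictionary, so the real numbers are larger on both sides.

```python
from itertools import product

# Sample of kanji with an on-reading "shuu" (far from complete).
SHUU = ["週", "習", "収", "集", "秋", "修", "周", "終"]
# Sample of kanji with an on-reading "kan" (far from complete).
KAN = ["間", "刊", "慣", "監", "感", "官", "管", "館", "観", "完"]

# Hand-picked sample of actual words read "shuukan"; not exhaustive.
SAMPLE_WORDS = {"週間", "習慣", "週刊", "収監"}

# Every way of spelling "shuu" + "kan" with the sample characters.
possible = ["".join(pair) for pair in product(SHUU, KAN)]
actual = [w for w in possible if w in SAMPLE_WORDS]

print(len(possible))  # 80 candidate spellings
print(actual)         # the few that are in the sample word list
```

Even with these tiny samples, only 4 of the 80 possible spellings are in the word list, which is the sense in which the compound structure keeps the homophones manageable.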
Here's a great example of how in writing you can disambiguate "aunt"/"older woman": https://twitter.com/MaggieSensei/status/765769637372030977
In the above example all three are read as おば (pronounced: oba). When spoken you still need to differentiate, but it'd either be obvious from context or you'd just explain it manually.
It creates a situation where you have people who have wildly different voices in their writing than they do in their everyday speaking, which is an interesting phenomenon. (To me, at least)
Also Korean avoids many homophones thanks to its 10 vowels. Japanese has 5.
A bit like English "packed" being written with "-ed" even if it sounds identical to "pact". Helps disambiguation.
(Actually, come to think of it, it's rather analogous to the Japanese way of maintaining the same Kanji while the suffix changes.)
Koreans did have an old writing system made of Chinese characters, where some were used for meaning and others were used to denote Korean suffixes with a similar sound (kinda like how Hiragana started out, I guess). But it eventually died out.
In other "dialects", such as Cantonese or Teochew, the characters are pronounced as 7 or distinct syllables, with 6 different tones, leaving more than 20 distinct pronunciations.
Mandarin has very few available syllables compared to other languages (not only, say, English, but also older Chinese "dialects").
主 = 主シュウ
集 = 木シュウ
终 = 幺シュウ
州 = 川シュウ
衆 = 血シュウ
就 = 京シュウ
秋 = 禾シュウ
收 = 又シュウ
週 = 辶シュウ
周 = 周シュウ
宗 = 宗シュウ
修 = 彡シュウ
習 = 白シュウ
執 = 幸シュウ
秀 = 乃シュウ
渋 = 氵シュウ
拾 = 扌シュウ
袭 = 衣シュウ
捜 = 扌シュウ
祝 = 礻シュウ
Furthermore, people use simpler language when speaking than when writing.
In Chinese, which has even more homophones, it is quite difficult to tease out the meaning of a passage written phonetically (in hanyu pinyin). When speaking and there is a word out of context (such as a name), it is necessary to explicitly disambiguate by putting the word in a common phrase or describing the character's constituent parts. For instance, I would introduce myself as "Zhe as in 'philosophy', Hao which is 'sun' on top of 'sky'".
Modern Japanese could learn a few lessons from hangul imo.