Regarding the simple grammar, I’ll say that tenses and plurals are easy for an English speaker to pick up. But Mandarin introduces new grammatical requirements that English and most other languages don’t even have constructs for, like change of state (which serves as the past tense in many situations) or counting words. Once you move beyond the “simple” grammar, things get complex fast.
Having learned both Mandarin and Japanese, I go back and forth on this. While yes, counters are more front-and-center in these languages, they are not completely absent in English. We do not explicitly call out (or teach) counters as a part of speech, yet while no one says "two dynamites", everyone will say "two sticks of dynamite". I see 'stick' effectively functioning as a counter there. And while venery terms are not counters, and many of them are unused in modern English, those that do persist occupy a niche whose absence you would notice, while being decidedly absent in other languages (in Japanese, I can think off-hand of mure as a catch-all for a collective of animals, and it is less specific than English venery terms).
Overall I agree with your assessment of "interesting but a lot of the things are wrong". For instance the statements:
>And some people have used less used English letters to denote specific chinese pronunciations: Eg. in Xi Jinping, "X" is pronounced as "sh", and in Qing, "Q" is pronounced as "ch".
I was always taught that English letter choices in pinyin were not just because "some people had used" them, but a deliberate choice by Chinese speakers to teach each other proper putonghua, especially speakers of other dialects. And additionally, to accomplish this it borrowed from a Russian perspective on Anglicized sounds, with Russia being both a geographic and political neighbor. I don't know any Russian, but it was taught to me that the "zh-", "q-" and "x-" in pinyin were Russian in origin, albeit in a filtered, haphazard way.
Korean (and Chinese, I'm assuming) has actual counters/classifiers, that is, separate grammatical concepts. 저는 차 두 대를 봤어요 - Here, (대) is the counter for cars (차/자동차). I saw two [counter] cars. This concept doesn't exist in English, except for measure words which serve a different purpose.
For Korean there are around 30-35 or so used in common speech, I believe, with 개 being a general catch-all for objects. Other examples: one has to use 명 or 분 for people, and 달 or 개월 for months (duration).
You could view Chinese nouns as all being mass nouns like English "sand". In the same way you can't say "one sand", but need to add a counter as in "one grain of sand", in Chinese you can't say "one book", you have to say "one volume of book" (一本書).
Sure, you can make distinctions between "classifiers" vs. "measure words" vs. units of measurement, but it feels like the same construct to say 一位人 ("one person"), 一群人 ("a group of people") or 一斤人 ("half a kilo of human"), as grisly as the last one may be.
That would be a good argument, except that the way they function isn't different.
English: measure certain mass nouns or classify certain nouns. Not used for counting every noun.
Korean: Always used, regardless of whether they measure or classify. Six dogs, three months, two papers, four volumes, one bowl, ten things all use counters. Always.
The difference is night and day.
Certain things that you might think of as a noun based on English are, in Chinese, measure words which do not measure a noun. For example, 天 "day" and 次 "time" (as in "it happened three times") are syntactically measure words, but it is not possible to follow them with a noun that they notionally measure. This isn't really distinguishable from saying that all English count nouns are really measure words.
I firmly feel that, despite being distinct from measure words, they are close enough to English concepts that it isn't really hard to relate to and learn. Gendered inflections are far more inaccessible to me as a native English speaker than counters ever were!
That is basically what I was trying to relay: That the choices of letters were picked to represent something "different but close enough", borrowed from alternative Anglicizations that were close at hand.
>Those sounds do not exist in English, hence English speaker cannot hear or pronounce it correctly.
I never had a problem differentiating between the 'q' and 'ch' sounds in Mandarin, and I do not think I am uniquely gifted. Listening to a native speaker pronounce Chongqing for the first time made it immediately obvious. I remember struggling a bit with pronouncing 'zh-' but not because I couldn't hear a difference, it just took a little time. And even despite taking a while to learn the pronunciation, I never struggled in hearing the difference between zhuan and juan.
One doesn't hear letters (or characters for that matter); instead one hears articulations, diphthongs, glottal stops and so forth. I guess you could try to mediate what you hear through an unrelated written language, but that seems like counter-productive extra effort.
I imagine that someone could choose to make deliberate progress on this skill, even though it's not at all a common approach to teaching or learning Chinese. I can report that I know that San Francisco is called 旧金山, and that I know the meaning of each of the three characters as well as their meaning together, but I don't know the sound of any of them. If I heard someone refer to San Francisco in spoken Chinese, I would have no idea what was being referred to, but if I saw it written, I would!
I've also seen Chinese speakers who don't know any Japanese understand the basic meaning of signs in Japanese, and vice versa, because often individual hanzi and kanji continue to share their most basic or common meanings (though by no means always). I realize this is also a far cry from being able to read a newspaper fluently, but I find it very suggestive, since most likely the speakers in question wouldn't be able to read these signs aloud!
Edit: but in support of your intuition about this, Wiktionary, for example, lists 256 Chinese words that use (for example) 市, a huge number of which probably don't have a transparent meaning to a non-Chinese speaker who knows all the individual characters in a given word. And it's a similar situation with other characters, so at least it would require a lot of deliberate study to understand complex texts.
Interestingly, this is only true among Chinese people. In its own official documents (which it has to issue in Chinese), San Francisco uses an entirely different name to refer to itself.
I've always been a little bemused by that choice.
Chinese uses compound words a lot. For example, there is "午餐", which means "lunch", where "午" means "noon" and "餐" means "meal". In this way, it is more like German "Mittagessen", where "Mittag" is "noon" and "Essen" is "eating".
There are also a lot of words that do not make sense, like "天真", which means "naive", while "天" means "sky" and "真" means "real(ly)". This does not make sense at all.
Still, most words fall somewhere between these two categories. For example, "自然" means "nature", and "自" means "itself" and "然" means "happened". So "nature" means "it just happened itself". This kind of makes sense, but it is actually pretty blurred for most people.
So 天真 (tiānzhēn), along the same lines, roughly translated, means "the sense of reality that you have when you are born or which you are gifted by nature", unsophisticated and naive. Don't know if that makes sense, but I've always thought about these two words together and felt like I understood them better through context.
Something else I find is that when speaking to many of my Chinese in-laws I regularly get various stories and explanations for words and phrases. Sometimes they are straightforward and sometimes there are literary or historical references that I would never have been able to derive on my own. :)
You mixed up the word "Essen" (meal, food) with the word "Essen" (noun of the word "essen" [to eat]), which means eating.
So translated word for word, "Mittagessen" also means "noon meal".
Edit: You could go one step further and make it "mid day meal".
The same is true of many English words as well, actually: many of them started out as composites that were meaningful in the past but today are not. For example, “understand.”
(Additionally, modern English spelling is complicated in ways that (to use the author's other comparison) the Devanagari syllabary used by Hindi (Nepali, Marathi etc.) is not, and is only 'phonetic' in very complicated ways (e.g. often representing Middle English phonology) to the point that English orthography is not entirely dissimilar to Chinese orthography.)
This is overstated. Both of those concepts are present at a robust level in English. As such, English obviously does have constructs for them.
Measure words are the really obvious one. Any treatment of English grammar will mention the distinction between "count" nouns, which have plural forms, and "mass" nouns, which don't. Mass nouns require measure words in exactly the same manner that Chinese nouns do. They are common; some mass nouns that are almost always used to refer to discrete items, but which nevertheless require their appropriate measure words, are "pants" (you can have a pair of pants, but not a pants), scissors (ditto), and bread (which has its own specialized measure word, "loaf").
Change of state is often not marked syntactically in English, though it can be. But it is very commonly marked lexically -- see the distinction between "being married" and "getting married", or "being on fire" and "catching fire".
I think lexical marking of grammatical concepts is an under-studied phenomenon. There is a traditional division of verbs in linguistics into those that express "states", "activities", "accomplishments", and "achievements". You can read about it here: https://en.wikipedia.org/wiki/Lexical_aspect (and just look at what they named the concept!).
It's called "lexical aspect" because the different categories are expressed, in English, by choosing different words. But they don't have to be; the difference between an activity and an accomplishment is expressed in Mandarin Chinese with a syntactic marker. Where English says "look" and "see", Mandarin has 看 and 看到. Where English has "listen" and "hear", Mandarin has 听 and 听到. Where English has "search" and "find", Mandarin has 找 and 找到. It's only a lexical distinction if you assume that English is more Platonically correct than Chinese is.
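To make the contrast concrete, here is a minimal sketch of the pattern described above: English needs a separate dictionary entry for each accomplishment verb, while the Mandarin forms can be produced by a rule. The table and the function name `accomplishment` are made up for illustration, the data is just the three pairs from the comment, and real usage of 到 is certainly not as uniform as a one-line function suggests.

```python
# Activity verbs from the comment, with the English activity/accomplishment pair.
# English has to list the second word of each pair; Mandarin can derive it.
PAIRS = {
    "看": ("look", "see"),
    "听": ("listen", "hear"),
    "找": ("search", "find"),
}

RESULT_MARKER = "到"  # the syntactic marker discussed above


def accomplishment(activity_verb: str) -> str:
    """Derive the accomplishment form by appending the marker (illustrative rule only)."""
    return activity_verb + RESULT_MARKER


for verb, (english_activity, english_accomplishment) in PAIRS.items():
    print(f"{english_activity} -> {english_accomplishment}: {verb} -> {accomplishment(verb)}")
```

The point of the sketch is only that the English column is a lookup table while the Mandarin column is a function, which is what "lexical" versus "syntactic" amounts to here.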
Similarly, Indo-European languages generally have a syntactic distinction between factual conditionals ("if I'm the king, why do I have to wait?") and counterfactual conditionals ("if I were the king, I wouldn't have to wait!"). (Fun side note: for hopefully obvious reasons, where this distinction exists for sentences set in the future, they're called "future more vivid" and "future less vivid" as opposed to "future factual" and "future counterfactual". We use a different word even though the grammatical distinction is identical.) English makes this distinction syntactically, as you'd expect. But it also makes it lexically -- "hope" and "wish" are the more-vivid and less-vivid equivalents of each other.
Actually something like 80% of characters are phonetic-semantic. More info: https://www.hackingchinese.com/phonetic-components-part-1-th...
Singaporean and Malaysian Chinese also use Simplified Chinese.
> In English, if you can speak something, you can write it too
Compared to Spanish, Italian or Japanese hiragana/katakana, this is not true at all in English. It is _more true_ than in Chinese/Japanese (Kanji), but still not much. It is in fact one of the things that English Learners struggle with the most!
The characters are designed to be written with a brush dipped in ink. The shape, order and direction are arranged such that a right handed person has minimal chances of smudging prior strokes.
This kind of muscle memory seems to be very beneficial even to recognizing the characters (much like autoencoders or transfer learning in AI).
I'd be interested because so far I mostly hear from people trying this approach, but not actually succeeding.
I would agree that the value of writing characters diminishes quickly after the first few hundred.
I can only write a handful of characters (probably less than 100?).
I probably would get the characters mixed up less frequently if I practiced writing them, but I don't think I would have been able to learn so many if I was doing that.
In Chinese, writing characters, and knowing their meaning helps a lot more, I guess.
Beginner learners often write ㅁ wrong, making it almost look like a ㅇ.
义 in isolation might mean "virtue", but most characters have a handful (or more) of meanings, and when it comes after 主, 义 takes on more of its "idea" meaning.
But all in all, pretty good!
I particularly like B14607.
It also doesn't aid you in pronouncing it because it's not obvious which tone it is, unless you already know it.
I cannot help sharing a Chinese quiz with you to celebrate our new year: please use 20 different Chinese words to express “I” or “me”.
(Funny but inappropriate comment self-censored.)
A more precise way that the author could have made this point might be something like this: "In English, we do sometimes use pitch to convey meaning, for example to show which word in a sentence is most important, to show whether a sentence is meant as a question or not, and to show certain kinds of emotion. But it doesn't cause one word to turn into another. In Chinese, it often does."
To be fair, there is no equivalent of 施氏食獅史 in Chinese either. The text is written in classical Chinese and is not intelligible when read aloud. Obviously, the pronunciation of classical Chinese was different.
In that case I should definitely not use that as an example of this point! Thanks. :-)
That is quite a claim.
Off the top of my head:
Each of the listed words can also have stress applied, or not, in different ways and still be differentiated by the “tone” as it would be called in Chinese. I’m not calling it tone since we don’t really have a word for it other than “pronunciation” but it’s essentially the same thing as what is happening in Chinese with tones, albeit with a very limited set of cases rather than being a pervasive feature of the language.
But feel free to call the thing I’m talking about whatever you want. But calling it “stress” doesn’t make it the same thing as the kind of stress that you were talking about with your example sentences.
Did HE contract the disease?
Did he CONTRACT the disease?
In each of these sentences, if you pronounce “contract” as if it were a noun describing a legal agreement, you are going to sound somewhat off. Same with the other words listed.
Well, that is, unless you and your listeners don’t know how to pronounce “contract” differently in each usage. But again this is only one of the examples.
1. Did HE contráct the disease? - "Was it that person (or someone else) who got infected with the disease?"
2. Did HE cóntract the disease? - "Was it that person (or someone else) who commissioned a third party (to create?) the disease?"
3. Did he CONTRÁCT the disease? - "Is the thing that happened with him and the disease that he got infected with the disease, or something else?"
4. Did he CÓNTRACT the disease? - "Is the thing that happened with him and the disease that he got a third party to create the disease, or something else?"
The ALLCAPS is likely focus prosody, but there's still a difference between 'catching' and 'commissioning', which is usually described as a difference in the placement of 'stress' within the word: whether it's on the first syllable or the second syllable (in this case). Since English is stress-timed, it also affects vowel quality. But that's rather different from tone. (Or, to abstract away from terminology, the difference between English 'cóntract' (commission) and 'contráct' (catch) is different from what goes on in Mandarin with tone distinguishing between lexical items.)
3. Did he CONTRÁCT the disease?
5. Did HE contráct the disease?
Does that make it more clear? In other words, these sentences show the same "tones" (different from "tone"), but different stress.
You were saying tones are just stress, but they are not. The stress here is different from the tones.
With #3, the stress is on "contract"; with #5, the stress is on "he"; the "tones" (again in quotes because we don't really use that word for it in English, although I'm saying the underlying phenomenon is the same) are the same in both, although the stress is different.
You can change the tones and the stresses independently of each other, and when you change the tones of the syllables, you get different meanings for the words.
I continue to believe that the case of, say, digest (such as a compilation of summaries) versus digest (such as a creature processing food to extract nutrition from it) is very similar to 好學 (hǎoxué) (of a study topic: easy to learn) versus 好學 (hàoxué) (of a person: loves learning).
You perhaps have a cartoon level understanding of Chinese tones that causes your confusion. Tones in Chinese aren’t always singsong or rising or falling as you seem to think they must be. They can be subtle (including a neutral tone and very deemphasized uses of all the other tones).
The English examples I posted are quite analogous to tone differences in Chinese in real usage, and your attempts to assert otherwise are lacking relevant evidence.
However, I don't have any more evidence than you do, just my assertions to yours. So I'll wrap up with a fitting quote from Frederick Jelinek: "Every time I fire a linguist, the performance of the speech recognizer goes up."
So Mandarin lexical tone and English lexical stress are quite clearly functionally equivalent in many ways, and I certainly would be unsurprised if an ML algorithm treated them as representationally similar. But that's still different from English stress and Mandarin tone being the same phenomenon in phonetic terms --- again, in terms of the actual acoustic signal.
So the áccent thing is acoustically different from Mandarin tone. Using the digest example:
a. dígest (such as a compilation of summaries)
b. digést (such as a creature processing food to extract nutrition from it)
The (a) one is usually realised as /ˈdaɪdʒɛst/, while the (b) one as /dəˈdʒɛst/. So not only does the first syllable in (a) have a different vowel than in (b), but the first syllable in (b) will have a drastically shorter duration than the first syllable of (a). These acoustic correlates in English are very different from what occurs with tone in Mandarin, which doesn't affect syllable duration or vowel quality in the same fashion. You can visualise the acoustic waveforms of the accent-thing in English and compare them against the acoustic waveforms of the tone-thing in Mandarin and see that they involve different acoustic properties. (So no need for any linguistic theory etc.)
But... you introduced the accents first in this conversation! So you get to decide what you meant by them.
I guess we can’t communicate since we’re using a different language. Learn how pinyin works with tone markers and then we can talk.
I'm assuming your word-pairs are supposed to be examples of differences in pronunciation of the 'verb' vs the 'noun'. Or did you have something else in mind?
Well, the three words "I don't know", if fully realized, are only three lexical items.
English has three words that can be spoken without opening your mouth, with the meanings "yes", "no", and "I don't know". Yes and no are conventionally spelled "uh-huh" and "uh-uh", and consist of voicing interrupted by an [h] (in the case of yes) or a glottal stop (in the case of no).
The "I don't know" sequence consists of voicing broken into three tonal segments. It is a single three-syllable lexical item in which every syllable is "mm" (if your mouth is closed) or "uh" (if it's open), but which uses a tone sequence copied from the phrase "I don't know".
It's a good model for how languages develop tone in the first place.
> It's a good model for how languages develop tone in the first place.
I'm not fully convinced of that. It's a really specialised instance and it's still standing in for a phrase (a whole proposition in fact) rather than a lexical item as such. Full-blown lexical tone, at least in some cases, seems to develop as the result of reinterpreting tonal correlates of some other phenomenon, e.g. Punjabi tone as resulting from reinterpretation of aspiration.
I would have said that the "I don't know" sequence matches this description -- it's just the tonal contour of the original phrase, with the other details stripped out. I don't see how that differs from Punjabi syllables preserving their (originally) allophonic tone while losing their (originally) phonemic aspiration, such that the tone becomes phonemic.
> I wonder if that's really the only item of that type in English.
Obviously I can't guarantee that, but I think it's pretty likely for several reasons:
- I can't think of anything similar.
- The yes/no question answers (including the "I don't know" sequence) are a clear conceptual group and differ from normal English in a few ways. The "no", "uh-uh", is also (I conjecture) unique, being the only English lexical item to contain a phonemic glottal stop. (Plenty of English words feature an obligatory glottal stop, but that stop is allophonic, as when the /t/ of "kitten" is transformed into a glottal stop by the /n/ that follows it.) It is of interest that the only phonemic glottal stop and the only phonemic tone should occur in these two closely related words. I strongly suspect that this group of answer words has these unusual features because of the constraint that you can pronounce them without opening your mouth.
- It's a single word that contains a lot more information than one English word should. For the Latin word "nescio" to mean "I don't know" is unsurprising; all Latin verbs are marked with subject agreement and in the case of a first person subject that is generally the only subject marking. English subject marking is almost always mandatory. This is just more evidence that the tone sequence is quite different from normal words.
- English has a non-vestigial system of grammatical tone at the sentence level; tonal marking is obligatory for yes/no questions. But sentence-level grammatical tone can't really coexist with lexical tone. So I'd be very surprised to see other instances of lexical tone outside very special cases. (For example, the phrase "I don't know" can't ever be a yes/no question, and so can't conflict with the English interrogative tone.)
because, as you elaborate, the English example you mention is really isolated - it's not enough to spawn a system of tone. Even in this case (the "I don't know" sequence), the tone isn't really contrastive.
> the phrase "I don't know" can't ever be a yes/no question
Sure it can, though the contexts wouldn't be that frequent. And some speakers have final rising tone even on non-yes/no questions.
> English has a non-vestigial system of grammatical tone at the sentence level
And of course there's more than just question-associated tone, there's also focus-associated tonal prosody.
It's not contrastive in that there is no minimal pair that would demonstrate it. But good luck being understood if you don't produce it correctly. The lesson I would draw here is more "minimal pairs aren't sufficient to demonstrate every phoneme" than "it's not a phoneme until a minimal pair exists".
> the English example you mention is really isolated - it's not enough to spawn a system of tone
I didn't claim and don't believe that English is developing lexical tones. But I do believe that the development of lexical tone in this single unusual word is fundamentally similar to the larger-scale development of tone in other languages, and that the still-obvious relationship between the phrase and the tone sequence makes this a good example to help English speakers understand how it can happen.
> And of course there's more than just question-associated tone, there's also focus-associated tonal prosody.
Very true, but much harder to describe.
Perhaps. (Minimal pairs are, of course, simply a diagnostic tool.) It would be a pretty unproductive phoneme though.
> I didn't claim and don't believe that English is developing lexical tones. But I do believe that the development of lexical tone in this single unusual word is fundamentally similar to the larger-scale development of tone in other languages, and that the still-obvious relationship between the phrase and the tone sequence makes this a good example to help English speakers understand how it can happen.
I'm not deeply versed in tonogenesis - but that is interesting as a potential source (phrasal tone). I wonder what conditions could lead to phrase-related tonal patterns becoming prominent/frequent enough to be a source of more prototypical lexical tone.
> > And of course there's more than just question-associated tone, there's also focus-associated tonal prosody.
> Very true, but much harder to describe.
It's been discussed fairly extensively though, at least as far back as Jackendoff (1972 or thereabouts).
None of those turn into completely different words based on which tone you use to pronounce it; a person would still know what you mean based on the sentence context but it would sound off.
(that is, there are word pairs that are distinguished only by tone)
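For anyone who wants to see what "distinguished only by tone" looks like on paper, here is a small sketch using pinyin. It strips the four tone diacritics and groups words by what is left; anything that ends up in the same group with a different original spelling differs in tone alone. The word list is a handful of well-known textbook examples chosen for illustration (not taken from this thread), and `strip_tones` is a hypothetical helper, not part of any pinyin library.

```python
import unicodedata
from collections import defaultdict

# Combining marks used for pinyin tones 1-4: macron, acute, caron, grave.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tones(pinyin: str) -> str:
    """Remove tone diacritics from a pinyin syllable, keeping the dots on ü."""
    decomposed = unicodedata.normalize("NFD", pinyin)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


# (character, pinyin, gloss): textbook examples, assumed here for illustration.
WORDS = [
    ("妈", "mā", "mother"),
    ("麻", "má", "hemp"),
    ("马", "mǎ", "horse"),
    ("骂", "mà", "to scold"),
    ("买", "mǎi", "to buy"),
    ("卖", "mài", "to sell"),
]

# Group words whose pinyin is identical once tone is ignored.
groups = defaultdict(list)
for char, pinyin, gloss in WORDS:
    groups[strip_tones(pinyin)].append((char, pinyin, gloss))

for toneless, members in groups.items():
    print(toneless, members)
```

Nothing comparable happens if you flatten the pitch contour of an English question or a focused word: the sentence sounds off, but the words stay the same words.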