

"shi1 shi4 shi2 shi1 shi3" - a Chinese tongue twister - soyelmango
http://www.yellowbridge.com/onlinelit/stonelion.php

======
blahedo
This is almost certainly written in literary classical Chinese (even if
written recently). Basically, in Chinese, words that were once pronounced
differently merged in pronunciation in modern Mandarin, though since the
writing system is quite divorced from sound, the written forms never changed
(at least, not until Simplified Chinese under Mao).

What that means is that a _lot_ of classical Chinese literature has a bit of
this problem, in that if you read it aloud, it is tricky to fully comprehend.
(This poem/story takes it to an extreme, obviously.) For a long time, up
through the early 20th century, it was fairly common for written things to be
written in this literary Chinese even if the writer would _say_ their ideas
completely differently. A "write as you speak" movement has largely changed
that: in modern written Mandarin, many words are two or three syllables long,
and written with two or three characters. Etymologically, the words may have
derived from compounding monosyllabic words, but in the modern spoken (and now
also written) language, the sub-word syllables are simply no longer words in
any meaningful sense.

(An analogy on a much smaller scale in English: many dialects of English merge
/ɪ/ and /ɛ/ before a nasal sound, so that "pin" and "pen" sound identical.
Speakers of those dialects will often _say_ "straight pin" or "ink pen" (or
"pig pen" or "cow pen") to distinguish which kind of pin/pen they mean; they
may or may not make the distinction in writing. So the writing is unambiguous,
the speaking is unambiguous, but reading a written thing aloud _is_ ambiguous. When
it's just one or two words it's no big deal, but imagine that nearly every
word suffered from that problem....)

EDIT: It turns out there's a Wikipedia page on the poem (naturally, should've
checked that first). A lot of this is covered there:
[http://en.wikipedia.org/wiki/Lion-Eating_Poet_in_the_Stone_D...](http://en.wikipedia.org/wiki/Lion-Eating_Poet_in_the_Stone_Den)

~~~
fhe
One solution to the "many characters sounding the same" problem was turning
Chinese into a tonal language. I guess technically that makes them sound
different (modern Mandarin Chinese has 4 tones; in older times there were
more; Cantonese reportedly still has 9).

Even with tones though, there are still too many characters sounding exactly
the same. In a Chinese dictionary (you might want to pause for a moment and
imagine what a Chinese dictionary looks like. If you come across an unknown
English word in writing, you can look it up by spelling. But what's the
equivalent thing to do for a Chinese character?), under any one sound
specification, you can easily find 10-20 characters.

Where I disagree is this: yes, classical written Chinese is tricky to
understand, but not because its words are mostly one syllable long. It is
tricky to understand in the same sense that Shakespearean (or Chaucerian)
English is tricky for modern English speakers to understand: because of
unfamiliar vocabulary and old-style grammar. If this opinion is valid, I
further propose that, when the Chinese talk, they might rely more heavily on
semantics to parse sentences in real time, since based on sound alone there
might be too many character candidates. In other words, they use what's being
talked about to do some pretty heavy-handed pruning as they process incoming
syllables.

~~~
heinel
On Chinese dictionaries. I actually don't think there is any difference from
English ones, at least in the way I use them. When I use an English dictionary
I use the word's spelling. I don't check how the word is pronounced when I do
this. When I use a Chinese dictionary I use the number of strokes that make up
the character, which puts all the characters in a sort of order not unlike an
alphabetical one. I also do not need to know how the word is pronounced when I
do this.

There is no such thing as "turning Chinese into a tonal language." Chinese is
tonal to begin with. For native speakers, a different tone does sound as
distinct as a different phoneme. The problem the poem exemplifies is the sort
we run into when we reduce the language into only the phonetics -- a problem
I'm not convinced is unique to Chinese. In fact, just the post above pointed
to an example of such in English.

------
poutine
The overloading of the shi sound in Mandarin approaches absurd levels.

I've always liked:

四 是 四 ， 十 是 十 ， 十 四 是 十 四 ， 四 十 是 四 十 ， 四 十 四 只 石 狮 子 是 死 的

sì shì sì, shí shì shí, shí sì shì shí sì, sì shí shì sì shí, sì shí sì zhī shí
shī zǐ shì sǐ de.

Translation: 4 is 4, 10 is 10, 14 is 14, 40 is 40, 44 stone lions are dead.

Then get a southerner with, say, a Sichuan or Yunnan accent to say this (in
Mandarin). They cannot properly pronounce shi (they say it like si). The above
just sounds like an angry bee. Given that shi and si are used a lot, this
becomes a real pain for a non-native speaker.

This makes it tough when you're buying something that is 44 RMB and you can't
tell whether they said "is 14" or "44" or what.

~~~
nonce42
My impression (from very brief study of Mandarin) is that Mandarin has a lot
more homonyms than most languages, e.g. lǐ has a ton of meanings:
<http://en.wiktionary.org/wiki/l%C7%90>. It also seems to me that a lot of
Mandarin phonemes are much closer together than in most languages, e.g. ch and
q are both a similar "ch" sound (likewise sh and x). Or maybe these sounds just
seem similar to me because of my English upbringing?

Do linguists have a way to quantitatively measure how close together the
sounds and words are in a language? Some sort of Shannon entropy measure,
maybe? Or a way to measure how spread out words are in "phonetic space". I
couldn't find anything, but I'd like to know if there's a way to measure these
things objectively.
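
To make the entropy idea a bit more concrete, here is a minimal sketch of one possible measure: the conditional entropy H(character | syllable), i.e. how many bits of uncertainty remain about which character was meant once you have heard the syllable. This only captures homophony, not how far apart sounds sit in "phonetic space", and it is not an established linguistic metric as far as I know. The counts in `TOY_COUNTS` are made-up illustrative numbers, not corpus data, and the function name is hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical usage counts for (syllable, character) pairs.
# These numbers are invented for illustration only.
TOY_COUNTS = {
    ("shì", "是"): 50,
    ("shì", "事"): 20,
    ("shì", "市"): 10,
    ("lǐ", "里"): 12,
    ("lǐ", "李"): 4,
    ("lǐ", "理"): 4,
    ("wǒ", "我"): 40,
}


def conditional_entropy(counts):
    """Return H(character | syllable) in bits for a table of counts."""
    total = sum(counts.values())
    # Group the counts of all characters that share a syllable.
    by_syllable = defaultdict(list)
    for (syllable, _character), n in counts.items():
        by_syllable[syllable].append(n)
    entropy = 0.0
    for group in by_syllable.values():
        syllable_total = sum(group)
        for n in group:
            # p(syllable, character) * log2 p(character | syllable)
            entropy -= (n / total) * math.log2(n / syllable_total)
    return entropy


if __name__ == "__main__":
    print(f"{conditional_entropy(TOY_COUNTS):.3f} bits")
```

A language with no homophones at all would score 0 bits, and two equally likely characters under a single syllable would score exactly 1 bit, so running the same function over comparable word lists for two languages would at least give a comparable number.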

~~~
blahedo
On the first part: See my other post for more, but basically, much (not all)
of the homonymy is from older and/or written-only forms; and yes, ch and q
don't sound any closer to a native Mandarin speaker than, say, sh and s do to
a native English speaker or u and ou to a native French speaker.

On the second part: no defined measure that I know of. It would be a little
tricky in that language is a moving target, everyone speaks it slightly
differently, and even for a single speaker the "location" of a particular
phone is more of a probability distribution even _after_ you factor out
varying context. That said, there definitely are charts that map out the space
and take a stab at identifying the prototype location of each phone in the
sound space, so it's not entirely implausible that you could summarise that
with a distance measure. I strongly suspect that the value of the measure
would not vary much among languages with similar-size phoneme inventories,
though.

------
lisper
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo. :-)

[http://en.wikipedia.org/wiki/Buffalo_buffalo_Buffalo_buffalo...](http://en.wikipedia.org/wiki/Buffalo_buffalo_Buffalo_buffalo_buffalo_buffalo_Buffalo_buffalo)

~~~
maw
If we remove the adjectives formed from proper nouns, turn the nouns into
"people", turn the verbs into "bully", and add a pronoun and an adjective, we
get

    
    
        people whom other people bully bully people.
    

This makes sense.

But when we remove the pronoun and adjective we get

    
    
        people people bully bully people
    

which is, as far as I can tell, gibberish (without changing meanings such that
they don't correspond with the original sentence). It can't have anything to do
with many people persons, because in the original there's an adjective there.

If we assume that, just as there are socially adept people, whom we call
"people people", there are also socially adept buffalo called "buffalo
buffalo", we can get "Buffalo buffalo buffalo buffalo buffalo". But to go
beyond that is to claim an incoherent sentence missing many words is coherent
and grammatical.

~~~
artlogic
It seems to me that if you replace "whom other" with "that" you end up with a
similar meaning:

    
    
        people that people bully bully people
    

I've found it fairly common to hear people drop the "that" when speaking - so
in this case:

    
    
        people people bully...
    

the first people refers to the people being bullied while the second people
refers to the people doing the bullying. It's a bit clearer if you replace
some of the nouns - for instance:

    
    
        dogs people bully bully cats
    

which I believe is a perfectly legal English sentence.

~~~
maw
_dogs people bully bully cats_

Aha! It finally makes sense. Thanks.

You should edit the Wikipedia article, because your explanation is way better
than the one contained therein.

------
linker3000
Even though it was tough to go through slough in the snow, I ploughed through
without a thought.

~~~
soyelmango
I've heard Slough is tough enough to go through even without snow :P

------
misterbwong
Here's the Wikipedia entry on this poem along with the audio.

[http://en.wikipedia.org/wiki/Lion-Eating_Poet_in_the_Stone_D...](http://en.wikipedia.org/wiki/Lion-Eating_Poet_in_the_Stone_Den)

~~~
byw
It's interesting to compare the original version in Classical Chinese with the
vernacular Chinese one.

The Classical Chinese version just sounds (err, reads) so much more awesome,
like a smooth-talking wisecracker who can't be bothered to spend extra
breaths, while the vernacular version reads like a seven-year-old's essay.

I always had the impression that Classical Chinese was painstakingly crafted
and preserved by a small group of elitists who were obsessed with aesthetics
and abhorred ease of learning. The ultra-minimalist syntax seems very
"inorganic", but it makes every line poetry and an invitation for puns. As a
result, most people were illiterate, but if you were well-read, boy, the fun
you could have with words.

------
emehrkay
NPR just had a story about this

[http://www.npr.org/templates/story/story.php?storyId=1295525...](http://www.npr.org/templates/story/story.php?storyId=129552512)

------
baby
The number after the word only indicates what tone it is. It's always good to
review which tone is which:

shi1 = shī

shi2 = shí

shi3 = shǐ

shi4 = shì
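
The same mapping can be done mechanically. Below is a minimal sketch that converts a numbered syllable such as "shi3" into its diacritic form. It assumes lowercase input with a single trailing tone digit and uses the usual pinyin placement convention (the mark goes on "a" or "e" if present, on the "o" in "ou", otherwise on the last vowel). The function names are made up for this example, and it does not handle capitals or "v"/"u:" spellings of "ü".

```python
# Tone-marked forms of each vowel, indexed by tone number minus one.
TONE_MARKS = {
    "a": "āáǎà",
    "e": "ēéěè",
    "i": "īíǐì",
    "o": "ōóǒò",
    "u": "ūúǔù",
    "ü": "ǖǘǚǜ",
}


def mark_index(syllable):
    """Return the index of the vowel that should carry the tone mark."""
    for vowel in ("a", "e"):
        if vowel in syllable:
            return syllable.index(vowel)
    if "ou" in syllable:
        return syllable.index("o")
    # Otherwise the mark goes on the last vowel (e.g. "liu" -> "u").
    for i in range(len(syllable) - 1, -1, -1):
        if syllable[i] in TONE_MARKS:
            return i
    raise ValueError(f"no vowel found in {syllable!r}")


def numbered_to_marked(text):
    """Convert e.g. 'shi4' to 'shì'; tone 0 or 5 (neutral) drops the digit."""
    if not text or not text[-1].isdigit():
        return text  # no tone number given, leave unchanged
    syllable, tone = text[:-1], int(text[-1])
    if tone not in (1, 2, 3, 4):
        return syllable
    i = mark_index(syllable)
    return syllable[:i] + TONE_MARKS[syllable[i]][tone - 1] + syllable[i + 1:]


if __name__ == "__main__":
    title = "shi1 shi4 shi2 shi1 shi3"
    print(" ".join(numbered_to_marked(s) for s in title.split()))
```

Applied to the five syllables in the title of this post, it yields "shī shì shí shī shǐ".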

~~~
praptak
True, it is also good to have an idea which tone means what. As far as I know
it's: high-level, rising, falling-then-rising and falling (the shape of
the marks over the i gives the hint)
<http://en.wikipedia.org/wiki/Mandarin_phonology#Tones>

~~~
nivekastoreth
[http://www.mdbg.net/chindict/rsc/audio/voice_pinyin_cl/shi1....](http://www.mdbg.net/chindict/rsc/audio/voice_pinyin_cl/shi1.mp3)
(shī)

[http://www.mdbg.net/chindict/rsc/audio/voice_pinyin_cl/shi2....](http://www.mdbg.net/chindict/rsc/audio/voice_pinyin_cl/shi2.mp3)
(shí)

[http://www.mdbg.net/chindict/rsc/audio/voice_pinyin_cl/shi3....](http://www.mdbg.net/chindict/rsc/audio/voice_pinyin_cl/shi3.mp3)
(shǐ)

[http://www.mdbg.net/chindict/rsc/audio/voice_pinyin_cl/shi4....](http://www.mdbg.net/chindict/rsc/audio/voice_pinyin_cl/shi4.mp3)
(shì)

For the falling-then-rising tone (shǐ), changing the "shi3" to "chi3" in the
URL gives a better idea, I think, of what the inflection sounds like.

~~~
nwomack
Agree on the last point -- at least, for Taiwan. In fact, those idealized
graphs caused me a bit of a headache early on. Native Mandarin speakers in
Taiwan tend to drop the end of the 3rd tone, so it falls slowly then holds at
the bottom for a minute, then rises slightly or even just cuts off.

------
boyter
It's an amusing one because it's easier for foreigners learning Mandarin to say
than for Cantonese-speaking Chinese who later learn Mandarin.

In my Chinese class in the north they just used it as a way of testing southern
Chinese speakers' ability at speaking Mandarin.

~~~
ezl
While it's true that native Cantonese speakers can generally be identified,
it's not the case that they lack the ability to speak Mandarin. They just speak
Mandarin with a southern regional accent. Native Mandarin speakers from Taiwan
are also easily distinguished from Beijingers.

Generally, Cantonese speakers actually pick up Mandarin a lot more easily than
Mandarin speakers pick up Cantonese.

Adopting an accent can be a lot harder than adopting a dialect because 1. your
brain doesn't parse the new sounds well if you didn't grow up hearing them and
2. you have no experience creating these sounds, so it's a lot harder to learn
to produce them. There's a lot of literature about native accent adoption in
second language acquisition that suggests an age dependency due to increased
neural plasticity in youth (even though there's a lot of dispute about that
critical period for language (not accent) acquisition).

------
sunicrass
...that's what Shi said. :-)

~~~
soyelmango
Hmmm, there is potential here for multi-lingual homophonic exercises...

------
barrydahlberg
I've asked a few Japanese people to say "She sells sea shells by the sea
shore..." and it comes out sounding a lot like that... Tricky because Japanese
uses Shi and Chi sounds but not Si on its own.

Don't worry, they get plenty of revenge on me.

------
est
there is a whole lot of Chinese stuff like this:

<http://gist.github.com/568487>

------
garply
Ironically, when pronouncing this you don't move your tongue at all.

~~~
blahedo
Not quite true: although the tongue is not as obviously involved as it is in
stop sounds (like /t/ or /k/), friction between the tongue and the air is
what makes the sound /ʂ/ (here transcribed as "sh"), while during a vowel sound
(/i/), the air passage is open. So the tongue does have to move a little
relative to the roof of the mouth; just not very much.

~~~
garply
I really don't move my tongue when I say these in a row. Is this because I'm
speaking with a Beijing accent? Apparently what's coming out for me is: ʂ̺ɻ̩.
Or should that combination also force me to move my tongue?

When I lay off the accent and move the i to the front of my mouth a bit, I do
seem to move my tongue a little. I'm not sure I'm doing it right though, the
i's in non-Beijing Mandarin are actually always a bit stressful for me, but
I'm pretty sure I'm pronouncing my "shi" correctly for Beijing.

~~~
blahedo
It's likely that your tongue isn't moving _much_. It's also possible that the
tongue isn't moving front-to-back (as it would for the English word "she") or
that it isn't moving with respect to your lower jaw or that it isn't moving
very much compared to nearly any other set of sounds you might utter. But the
tongue _must_ be moving if you are making any differentiated sound at all
there.

------
kroger
I like to use this dictionary to check pronunciation, meaning and stroke
order:

[http://www.mdbg.net/chindict/chindict.php?page=worddict&...](http://www.mdbg.net/chindict/chindict.php?page=worddict&wdrst=0&wdqb=chi)

------
kijinbear
Not exactly a tongue twister, but here's something similar from Korean: 가가가가가?
(gagagagaga?)

In the Southeastern (Gyeongsangdo) dialect of Korean, this means "Does he have
the surname 'Ga' ?"

------
jerf
"This thread is useless without" mp3s. In all seriousness, I'd love it if
somebody took a crack at this because I'd love to hear it. (And I'm sure the
karma would flow.)

~~~
clevercode
[http://en.wikipedia.org/wiki/Lion-Eating_Poet_in_the_Stone_D...](http://en.wikipedia.org/wiki/Lion-Eating_Poet_in_the_Stone_Den)

------
jankassens
A recording a friend of mine made:
<http://kassens.net/shi-shi-shi-shi-shi.mov>

------
Luff
Imagine how hard it must be to make understandable text-to-speech / speech
synthesis for Chinese.

~~~
soyelmango
...and how hard it would be for voice recognition systems too.

~~~
ced
That's unlikely. How do Chinese people write a sentence on the computer? Old
systems would have you type the letters, e.g. "shi", then display a long list
of all the possible characters with that sound (for all 4 tones). Newer
systems use context to figure out the right characters for the whole sentence
at once (probably using naive Bayes --- see Peter Norvig's spell checker), so
that I can just type "Wo shi Jianada ren" and the output would be the correct
6 Chinese characters. I never need to specify the tones.

So I would assume that voice recognition could do even better by analyzing
tonal information.
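
A rough sketch of the "use context for the whole sentence" idea follows. I don't know what real input methods do internally, so this swaps in a plain bigram model with a Viterbi-style search rather than naive Bayes. The candidate lists and bigram counts are invented toy values, the input is assumed to be already split into toneless syllables, and every name here is hypothetical.

```python
import math

# Toy candidate characters for each toneless syllable (illustrative only).
CANDIDATES = {
    "wo": ["我", "握"],
    "shi": ["是", "十", "石", "事"],
    "jia": ["家", "加"],
    "na": ["那", "拿"],
    "da": ["大", "打"],
    "ren": ["人", "认"],
}

# Made-up counts of how often one character follows another.
# "<s>" marks the start of the sentence.
BIGRAMS = {
    ("<s>", "我"): 5,
    ("我", "是"): 5,
    ("是", "加"): 2,
    ("加", "拿"): 4,
    ("拿", "大"): 4,
    ("大", "人"): 3,
    ("大", "家"): 4,
}


def score(prev, cur):
    """Log score of `cur` following `prev`, with crude additive smoothing."""
    return math.log(BIGRAMS.get((prev, cur), 0) + 0.1)


def decode(syllables):
    """Pick the highest-scoring character sequence for the syllables."""
    # best maps the last character to (log score, characters chosen so far).
    best = {"<s>": (0.0, [])}
    for syllable in syllables:
        nxt = {}
        for cur in CANDIDATES[syllable]:
            # Keep only the best way of arriving at each candidate.
            nxt[cur] = max(
                ((s + score(prev, cur), path + [cur])
                 for prev, (s, path) in best.items()),
                key=lambda item: item[0],
            )
        best = nxt
    return "".join(max(best.values(), key=lambda item: item[0])[1])


if __name__ == "__main__":
    print(decode("wo shi jia na da ren".split()))
    print(decode(["da", "jia"]))
```

With these toy counts the same syllable "jia" comes out as 加 in the first sentence and as 家 in the second, without any tone being typed. A speech recognizer that also hears the tone could shrink each candidate list before this step.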

~~~
soyelmango
Thanks for the link to Peter Norvig's spell checker - interesting reading
there.

------
ccarpenterg
A Spanish tongue twister: Tres tristes tigres tragaban trigo en un trigal.

~~~
soyelmango
What does it mean? Three sad tigers...

~~~
ccarpenterg
Three sad tigers ate wheat in a wheat field.

