
Toki Pona: A created language that has only 100 words - chippy
http://www.theatlantic.com/technology/archive/2015/07/toki-pona-smallest-language/398363/
======
yongjik
> The point is simplicity. And in Toki Pona, simple is literally good. Both
> concepts are combined in a single word: pona.

That's how you end up with words like English "cool", which means low(-ish)
temperature, except when it's about people, or "blue", a particular color,
except when it's about feelings.

It's anything but simple, and honestly it doesn't sound good to me, either.

~~~
canjobear
How often does that really lead to confusion though?

I can't think of a time someone said "blue" meaning sad and I thought they
meant "blue" the color.

Humans are great at disambiguating things.

~~~
yongjik
If you are a native speaker, rarely. Otherwise, all the freaking time (for
the first ten years or so).

E.g., consider the difference between a "cool guy" and a "cool attitude". If
you think about it, they describe almost totally opposite situations.

Or consider "No, it's cool." After all, "it's hot" means the temperature is
high.

~~~
duaneb
> "cool attitude"

Do you actually use this much, though? I had to do a double take because I'm
not familiar with the phrase and I'm a native speaker. Typically you might use
cool in this way to describe an action or gerund, or with a prepositional
phrase, or as an adverb.

Heck, if anything, I'd say a "cool attitude" is a _calm_ attitude—as in, the
person played it cool.

~~~
afandian
Maybe not that exact phrasing but "he gave me a cool look" can mean exact
opposites. Depends entirely on dialect / register. The direct metaphorical
version ("he gave me an unfriendly look") is reasonably common in English. The
other reading of it isn't so much, but is perfectly understandable (probably
incorrectly).

------
Houshalter
This reminds me a lot of a natural language processing tool called word2vec.
It reduces words into vectors, where each dimension represents something about
the meaning of the word. E.g. whether it's a place, or if it's male or
female.

And these vectors are not designed by humans, but learned from data. It's
optimized so that words that occur in similar contexts have similar vectors.
E.g. places tend to occur near words talking about places, like "there" or
"go", and male names tend to occur near "he" and "his", etc.

You can do really interesting things with these vectors. Like
'King'-'man'+'woman'='queen'. Recently a computer program beat the word
analogy part of an IQ test using this. It's useful in computer translation.
Take two languages and constrain a few identical words to have the same
vector. Then all the other words also end up with similar vectors.
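
To make the analogy arithmetic concrete, here is a minimal sketch in Python. The vectors are hand-made three-dimensional toys (the dimensions loosely stand for royalty, maleness and femaleness), not anything learned by word2vec, and the names `VECTORS`, `cosine` and `analogy` are made up for illustration. Real word2vec vectors are learned from text and have far more dimensions.

```python
import math

# Hand-made toy vectors: [royalty, maleness, femaleness].
# These values are assumptions for illustration, not learned embeddings.
VECTORS = {
    "king":  [0.9, 0.9, 0.1],
    "queen": [0.9, 0.1, 0.9],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, vectors=VECTORS):
    """Return the word nearest to vec(a) - vec(b) + vec(c), skipping the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


print(analogy("king", "man", "woman"))
```

With these toy values the nearest remaining word to 'king' - 'man' + 'woman' is 'queen'. The input words are excluded from the search because the result vector usually stays close to them.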

Anyway, with something like this, the actual words and symbols don't matter,
just the 100 or so dimensions needed to accurately represent the meaning of
every word. Of course speaking a 100 dimensional vector is way less efficient
than a single word or symbol.

But perhaps using this, we could come up with an absolute minimal set of
symbols necessary. And some rules for combining them to get all words.

I know a lot of the comments here are complaining that this doesn't make a
good natural language. But that's not the point of an auxiliary language. The
main thing auxiliary languages should optimize for is how easy it is to learn.
Simpler is better. Reducing the words and grammar is hugely important.

Another advantage of such an approach is that it might be more amenable to
machine translation, which is another plus for an auxiliary language.

~~~
rhaps0dy
That may be the way to reduce duplication and bloat in the Lojban [1] root-
word list. Currently, about 1300 root words exist. A lot of them have
overlapping or redundant meanings, and it has already been proposed to
(manually) unify them and reduce their number.

Combined with the already existing rules for word composition, and the
grammar, it could make a very nice language.

The idea is still very far from possible completion though. One (of several)
things to look at is: how do we characterize word composition? Adding word
vectors? Averaging them? I guess the operation could be learned too.
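
A rough sketch of the two simplest candidates, in Python, assuming made-up three-dimensional vectors for two hypothetical root words (`root_a` and `root_b` are placeholders, not real Lojban roots). It shows one thing worth knowing before choosing: averaging is just addition scaled by 1/n, so under cosine similarity the two give the same direction.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def compose_add(vectors):
    """Compose by element-wise addition."""
    return [sum(column) for column in zip(*vectors)]


def compose_average(vectors):
    """Compose by element-wise averaging (addition divided by the count)."""
    n = len(vectors)
    return [total / n for total in compose_add(vectors)]


# Illustrative values only; real vectors would be learned from a corpus.
root_a = [0.8, 0.1, 0.3]
root_b = [0.2, 0.9, 0.4]

summed = compose_add([root_a, root_b])
averaged = compose_average([root_a, root_b])

# Same direction, different length: cosine similarity is 1 up to rounding.
print(cosine(summed, averaged))

# Both operations ignore word order, so "a b" and "b a" compose identically.
print(compose_add([root_a, root_b]) == compose_add([root_b, root_a]))
```

Since both operations are order-insensitive, a compound and its reversal get the same vector. That is one reason a learned composition function might be needed if word order carries meaning.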

[1] Lojban is an artificial language based on predicate logic
[http://lojbo.org/](http://lojbo.org/) .

Source for the "could be used in AI" claim in that website:
[http://www.goertzel.org/new_research/lojban_AI.pdf](http://www.goertzel.org/new_research/lojban_AI.pdf)
.

------
zachrose
Cool. There's a similar project put forth by linguists in the early 70s called
Natural Semantic Metalanguage, with about 70 "primes" that all other words can
be factored out as, like you would have to do if you wrote a dictionary
without circular definitions.

Unlike Toki Pona or the other languages mentioned in the article, Natural
Semantic Metalanguage is mostly used for thought experiments and to analyze
existing languages for ambiguities and double meanings.*

[https://en.m.wikipedia.org/wiki/Natural_semantic_metalanguag...](https://en.m.wikipedia.org/wiki/Natural_semantic_metalanguage)

*"Double meanings" may not be the right word, but is there a word to describe what happens when a system is non-orthogonal and the same piece winds up with multiple responsibilities?

~~~
mintplant
Overloading?

~~~
zachrose
Totally, perfect!

~~~
oneeyedpigeon
For us geeks :-) others know such words as 'homonyms'.

~~~
zachrose
But within NSM scholarship, overloading is probably more descriptive than
words that happen to sound the same but have well-known distinct meanings. In
blurb form:

"This book is based on two ideas: first, that any language--English no less
than any other--represents a universe of meaning, shaped by the history and
experience of the men and women who have created it, and second, that in any
language certain culture-specific words act as linchpins for whole networks
of meanings, and that penetrating the meanings of those key words can
therefore open our eyes to an entire cultural universe. In this book Anna
Wierzbicka demonstrates that three uniquely English words--evidence,
experience, and sense--are exactly such linchpins. Using a rigorous plain
language approach to meaning analysis, she unpacks the dense cultural meanings
of these key words, disentangles their multiple meanings, and traces their
origins back to the tradition of British empiricism. In so doing she reveals
much about cultural attitudes embedded not only in British and American
English, but also English as a global language."

[http://www.amazon.com/Experience-Evidence-Sense-Cultural-Eng...](http://www.amazon.com/Experience-Evidence-Sense-Cultural-English/dp/0195368010/ref=pd_bxgy_14_2?ie=UTF8&refRID=1ME4VECZ3R26YBX27A6Z)

------
monitron
There was a nice episode of The Allusionist about Toki Pona, where you can
hear people trying to learn and speak it:

[https://soundcloud.com/allusionistshow/tokipona?in=allusioni...](https://soundcloud.com/allusionistshow/tokipona?in=allusionistshow/sets/allusionist)

------
escherize

        Toki Pona has a five-color palette: loje (red), laso
        (blue), jelo (yellow), pimeja (black), and walo (white). 
        Like a painter, the speaker can combine them to achieve 
        any hue on the spectrum. Loje walo for pink. Laso jelo 
        for green.
    

I wonder if there's a reason for using the pigment color combinations instead
of the light combinations of color.

~~~
2muchcoffeeman
Because that is what most of us learn as children?

Even now, other than white and black, I'd have to think what light
combinations make what colour.

~~~
david-given
Of course, one of the problems with colours is that different cultures have
different colour values.

The Celtic languages like Scots and Irish Gaelic distinguish between _dearg_,
which is the red of paint or blood, and _ruadh_, which is the red of hair or
heather. To them these are completely different colours, and a native Gaelic
speaker would be really confused to find that English translates them both as
the same word.

Toki Pona's limited root vocabulary really just demonstrates its cultural
biases.

~~~
JshWright
Or the fact that many languages don't distinguish blue and green at all.

------
peteforde
I actually know the woman who created Toki Pona, here in Toronto.

She taught me that professional translators can only translate into their
native language, which I found frustrating but it made sense.

~~~
diskcat
Wait what?

This doesn't sound right to me at all.

There is no arbitrary skill ceiling that can only be achieved by being a
native speaker.

~~~
PhasmaFelis
I can kind of see the logic. It's certainly _possible_ for an adult to learn
to speak another language indistinguishably from a native speaker, but in
practice there's a surprisingly large gap between that and a person who can
speak fluently on complex subjects with no fear of being misunderstood, but
makes occasional minor errors--or even usages which are grammatically correct,
but _sound_ stilted--that betray them as non-native.

There's a big jump between 99.9% and 100%, is what I'm saying, and if you want
a really professional translation job, you're not going to settle for 99.9%.
And the thing is, you can't tell the difference between 99.9% fluent and 100%
fluent without _being_ 100% fluent. So if you want to be sure you're getting
100% fluency, a native speaker is your best bet.

~~~
Grue3
Imagine a translator who is 50% fluent in source language and 100% fluent
(native) in target language. He would often misunderstand the source text and,
even though he can produce target language natively, the translation would be
incorrect.

Now imagine a translator who is 100% fluent in source language and 50% fluent
in target language. He completely understands the source text and, using only
simple sentences, can accurately convey the meaning in the target language.

This thought experiment demonstrates that the knowledge of source language is
_far_ more important for the accuracy of translation.

~~~
PhasmaFelis
Both of those people would be equally unacceptable as a professional
translator. It's true that #2 would be better if those were your only two
options, but that's an unrealistic limitation.

A guy who is 99% fluent in source and 100% fluent in target will generally
produce a flawless translation--if he happens across a word or idiom in the
source that he doesn't quite understand, it will stand out to him, and he will
pause to research it before continuing.

A guy who is 100% fluent in source and 99% fluent in target will produce a
perfectly clear but very slightly stilted translation, because he's not aware
of the few mistakes that he makes.

If you're translating a commercial product and you have my two guys to choose
between, you would want #1.

~~~
diskcat
> A guy who is 100% fluent in source and 99% fluent in target will produce a
> perfectly clear but very slightly stilted translation, because he's not aware
> of the few mistakes that he makes.

Why isn't the reverse true? It is only that the lossiness of the conversion
process is invisible in the result, so somebody reading it would not know that
it's wrong. It still produces the same quality of translation, i.e. one that's
not perfect (or as good as can be).

Additionally, there are so many words in English that it's not possible to
know them all, and so one has to assume that all people are at most 99%. But
my English alarm doesn't go off every time a word like 'mellifluous' isn't used.
So I even doubt the claim that a guy who is 99% in the target language will
make a 'stilted' translation. There exists a skill level where a person can
produce a text that won't read as 'wrong' but won't be as good as it could
possibly be and this skill level is sufficient to not have a text feel
stilted. Not to mention most people do not have the same writing skills, and a
very good high school student writer writes a much more pleasing text than his
schoolmate who is not very good.

------
chippy
The article also describes another more complex language. Toki Pona on
Wikipedia:
[https://en.wikipedia.org/wiki/Toki_Pona](https://en.wikipedia.org/wiki/Toki_Pona)

Both languages seem to encourage thinking about language and how we describe
things and the world. A bit like E-Prime (of which I'm a fan).

~~~
kseistrup
The other, more complex, language is called Ithkuil ⌘
[https://en.wikipedia.org/wiki/Ithkuil](https://en.wikipedia.org/wiki/Ithkuil)

~~~
zdkl
Iun-niu ti casexh

I keep thinking of implementing a crude translator for that lovely language
but lack the skills. Anyone interested in playing around with it?

------
personjerry
So it's like the vi of languages? I.e. memorize the relatively few meanings of
the "words" and then customize/combine them to the context. Seems like with
languages there would be a lot more room for ambiguities though.

~~~
solipsism
No. vi has a grammar, like all natural languages, including this one I guess.
vi's is simple, mostly combining a noun with a verb, plus some simple
modifiers. It sounds like the interesting thing about this language isn't that
it has a grammar, it's that it has so few (and well chosen?) words that
virtually every noun and every verb must be expressed by combining these
building-block words. vi doesn't seem like that to me.

------
nikolay
Link to the homepage: [http://tokipona.org/](http://tokipona.org/)

------
scottyates11
“What is a car?” “You might say that a car is a space that's used for
movement, tomo tawa”

Luckily, it is not widely used. Otherwise, it would be a nightmare for
translators, especially for Google Translate developers!

Before I looked it up on the wiki, I thought it was a language used by tribes
in Africa or an ancient civilization, but it was created in 2001!

Find out more about it on Wiki:
[https://en.wikipedia.org/wiki/Toki_Pona](https://en.wikipedia.org/wiki/Toki_Pona)

~~~
kaoD
Toki Pona isn't meant as a universal language to be translated from nor to
translate into (unlike e.g. Esperanto). When you translate from/to, a lot of
subtleties are lost (which is exactly what Toki Pona intends).

Toki Pona is just a "toy" language (for lack of a better adjective), somewhat
like yoga for the language (for lack of a better simile). Each person uses
Toki Pona in their own way: for some it's pushing language to the limit, for
some it's just fun, some others consider it an experiment in psychology, an
experiment in language construction, an introspection tool, a challenge,
poetry in and of itself...

Obviously it's not the best to communicate precisely, just like yoga asanas
aren't meant to walk.

~~~
sago
Perhaps 'recreational' rather than 'toy' is a better word. :) The analogy with
yoga is good, or plenty of programming languages. They make you better as a
person, stretch you, widen your knowledge, and expand your mind, without
necessarily being directly useful. For me it is enjoyable, definitely poetry.
Though the opportunities to actually speak it are very limited.

sina sona ala sona e toki pona? (do you know it?) ... if so: sina kama sona
tan seme?

~~~
kaoD
I assume "sina kama sona tan seme?" means "where did you learn it from?". Did
I get that right?

A small nitpick: proper nouns have to be preceded by a Toki Pona common noun
or noun phrase (in this case toki, language) and then the proper noun in
capitalized case. E.g.: toki Toki Pona, telo Coca Cola, etc. I.e. proper nouns
must modify a preceding common Toki Pona noun.

---

mi toki e toki Toki Pona. taso mi jo ala e tenpo mute. tan tenpo suli la mi
toki ala e toki Toki Pona. mi kama sona tan tan mute. lipu Tokipona.net en
lipu Tokipona.org en lipu Reddit.com/r/tokipona li pona. lipu Tokipona.org li
anpa. taso lipu Forums.tokipona.org li pali. sina wile la sina ken toki tawa
mi lon ni: "el" en nimi mi pi lipu HN, lon kulupu Gmail.

(I speak Toki Pona, but I don't have a lot of time. I haven't spoken Toki Pona
in a long time. I've learned it from many different sources. tokipona.net,
tokipona.org, /r/tokipona are good. tokipona.org is down but
forums.tokipona.org is still working. If you want, you can contact me at "el"
concatenated with my HN username at gmail.)

~~~
sago
Nitpicking is welcome! mi toki mute alla pona. I've not seen 'toki Toki Pona'
used that way though, I've rarely seen it capitalised and used as a proper
noun. So 'sona e toki pona' (nimi pi lipu pi jan Piljin e 'o kama sona e toki
pona!'). That said, the rule for the extra verb I end up dropping quite often
from carelessness, so 'sina telo allo telo Coca Cola?' rather than 'sina telo
allo telo telo Coca Cola?' So it was a good nitpick.

'tan seme' is usually 'why', I think (I learned it as a compound lexicon entry
I guess). 'sine sona kama tan seme?' for where from?

Thanks.

~~~
kaoD
Ah! Very true. My Toki Pona is very rusty as you can see. I forgot a lot of
compound lexicon and idioms :(

I've always seen 'Toki Pona' used as a proper noun following the proper noun
rules to avoid confusion with 'good talk' and the like (which I guess is the
purpose of the rule in the first place, as well as discouraging proper nouns).
I learnt it a long time ago so maybe the requirement has been dropped? Or
lousy usage since it's very common that people forget to follow the proper
noun rule. In 'kama sona e toki pona' I'd be inclined to understand "learning
to talk well" rather than "learning Toki Pona". Perhaps that was the intent?
Anyways, there's so much outdated and non-canonical material, and most of the
canon has holes and doesn't even follow its own advice. And actual usage has
diverged so much too (plus, with zero native speakers, everyone has their own
usage). "Language-ing" is hard :P Even more so in a vague language like Toki
Pona.

The Coca Cola example would require 'e' to mark the direct object ('sina telo
ala telo e telo Coca Cola'). I've grown so accustomed to a common noun
following 'e' that it missing sounds really jarring.

mi toki e toki Toki Pona tan ni: toki Toki Pona li musi tawa mi. toki Toki
Pona li pona kin tawa ali! tenpo suli la mi toki mute e toki Toki Pona. taso
tenpo ni la mi toki lili e ni.

------
V-2
Interesting. Eye is "oko" (same as in Polish), a man is Jan ("John"/"Ian" in
Polish), a hand is "luka" ("ręka" or "renka" in Polish, "ruka" in Russian), a
leg is "noka" ("noga"), a mouth is "uta" ("usta") etc.

But the creator isn't Polish or even Slavic, even though the interviewed
language fan (?) Krzeminska is. Go figure

~~~
riffraff
"oko" is also similar to words of Romance origin (compare: "ojo" in Spanish
and "occhio" in Italian).

Anyway, the author of the language basically just got random words off of many
languages, see

[http://archive.is/6lxwq](http://archive.is/6lxwq)

------
miseg
Is there a good sub-Reddit with interesting topics like this about language?

~~~
doublec
There's a subreddit for constructed languages:
[https://www.reddit.com/r/conlangs/](https://www.reddit.com/r/conlangs/)

------
a_c
> In Chinese, the word computer translates directly as electric brain.

Computer has two translations in Mandarin, while "electric brain" is the only
translation in Cantonese.

~~~
hawkice
This is straight off the top of my head, but are you referring to 计算机？ I have
always considered that best translated as 'calculator', but obviously it is
used in a lot of places where the English wouldn't be calculator.

Of all the massively incorrect reporting out there on Chinese language and
culture, this seems pretty innocuous.

~~~
sswezey
If you think about it historically, calculator would be a more accurate
translation - it was something that computed or calculated numbers for you. In
German, calculator and computer are also both Rechner.

------
rosser
Previous discussion:

[https://news.ycombinator.com/item?id=9914534](https://news.ycombinator.com/item?id=9914534)

------
cromwellian
First thing I thought of was Darmok and Jalad at Tanagra. :)

------
nightmiles
Original article from July 2015:
[http://www.theatlantic.com/technology/archive/2015/07/toki-p...](http://www.theatlantic.com/technology/archive/2015/07/toki-pona-smallest-language/398363/)

The BI copy doesn't even bother correctly distinguishing pull quotes from the
following text in its copy-paste job.

~~~
dang
Thanks. Url changed from
[http://uk.businessinsider.com/the-worlds-smallest-language-h...](http://uk.businessinsider.com/the-worlds-smallest-language-has-only-100-words-and-you-can-say-almost-anything-2015-7).

------
thealistra
I can't see this kind of language used for technical documentation or at any
kind of tech workplace. There are already nuances with current languages. This
would be terrible.

~~~
majewsky
A fruitful interpretation could be that there is no one language for all jobs.
Just as programming languages are chosen for the specific task at hand, you
can also choose a natural language by its effectiveness for the current task.

