I love Toki Pona and have been using it for a little while. You can check my YouTube channel if you wish to see/hear it live in action: https://www.youtube.com/@janPolijan
Toki Pona gives you the fun part of using a foreign language, but it skips the boring part of having to learn a ton of grammar and vocabulary. And the phonology is delightfully simple, so you won't have a bad accent. There's a growing and very active community around it and people write and share novels and songs. But it's also useful just as a personal language because it's really like a game for your mind. After a while, you reach proficiency and then fluency and it gets even better. You obviously wouldn't want to do engineering in Toki Pona, but explaining quite complex topics is absolutely possible if you keep pushing it (for example, someone has put up a YouTube video explaining non-Euclidean geometry in Toki Pona). One final thing: contrary to popular belief, Toki Pona is not designed to be as minimal as possible (the "tuki tiki" conlang, with only 39 words, modifies Toki Pona and pushes towards true minimalism). Toki Pona sure is tiny, but it's mostly designed to be fun and retain expressiveness.
I guess my question is: as you invent new phrases to describe more complex words, aren’t you effectively expanding the size of the language, even if this is “implicit”? Snowman in English is counted as a word, for instance. The advantage here of compound words is that you can easily get a gist without knowing exactly the meaning, similar to characters in Chinese that have a phonetic and meaning radical both present.
There are common forms for certain words of course, but no lexicalisation. In Toki Pona, you're not adding words or set phrases to the language's lexicon. So there are no compound words that would be set in stone, and things can and do change depending on the situation. Let's take an example: coffee. The word "telo" is some form of liquid or water, so coffee could simply be just telo (if you're in a context where it's clear), but you may just add that it's the dark/black liquid ("telo pimeja"), or the strong hot drink ("telo wawa seli"), or the liquid of no sleep ("telo pi lape ala"). There are many other possibilities depending on which aspect you want to emphasize and how brief or precise you want/need to be (if you need a lot of precision, you'll end up using several sentences, but you'll get there).
Set phrases aren’t consciously created, they’re the accumulated contextual/cultural meaning that people remember for the phrase, until people no longer use it to point at the literal meaning. If some group always drinks coffee and never tea, then eventually “strong hot drink” means “coffee”, and an outsider who asks for their own group’s “strong hot drink” (always meaning tea) is going to get a bitter surprise.
Same goes for accents. Any group will develop an accent, and it will be dominated by the group's native accent. Toki Pona won't be able to escape those, not even at small scale. I also predict the inclusion of loan words, should the language ever go beyond purists.
I don't think they would be surprised, since anyone having used the language enough to get to that point would know how it works, and they would know that they are not in their usual context.
It's like leaving out units of measure. We sometimes omit them, but we practically never lose track of when it makes sense to do so, and we rarely have to think about it consciously.
I have a convertible. My nephew-in-law? (Wife's brother's son; now I wonder how you express that in Toki. You don't bother, he's just a young relative maybe?) had seen that car when I visited several times over the course of a few years, and it just happened that in his entire short life up to then, he had never seen it with the top up.
One day we show up with the top up.
We're inside and he comes in to tell his parents my car has "a cover or shield".
I think the idea is you don't bother to unless it actually matters in a given situation, and when it does, you just add more words.
Writers probably get this more easily. I think they are trained and edited constantly to avoid unnecessary fluff, and to identify what is pointful atmospheric detail and what is pointless detail.
Does the story actually change depending on what drink dad enjoyed with his morning paper? Maybe if it was spiked, or if it was a pointed aspect of his character that he drank chocolate which others find childish, and he knows it and doesn't care. So it could, but in 99% of scenes I don't think it does matter, it only matters that it's a common thing that people do, and a common setting prop. If the language were not English and the standard way to say "morning coffee" was something else instead of specifically coffee, the scenes and stories all function exactly the same.
> I think the idea is you don't bother to unless it actually matters in a given situation, and when it does, you just add more words
Yes I understand that. Vietnamese works a bit like this although with a far less limited vocabulary.
I was asking about the specific case where it does matter. How complicated is it to differentiate between these two similar drinks when the difference is important?
Probably painfully, like the full real scientific names for chemical compounds and biological species, where there are hundreds or even thousands of variations of things.
I would imagine that in a system like this, you end up adding only the particular extra detail that matters at the time, and so you almost never say quite the same thing as what we mean by "coffee".
Instead it would be just "drink" most of the time (and for all I know maybe even that is too specific and it's really just "liquid" or "liquid food", but anyway...)
And then when you come in from shovelling snow, your partner has "hot drink" ready for you, because "hot" is the extra property that matters.
And when you get up in the morning, you want "invigorating drink" to get going, because "invigorating" is the extra detail that matters. (setting aside that I bet "invigorating" is NOT one of the precious few 120 available words, but there will be something like active or up or positive or fast)
And you rarely bother trying to express the whole bag of properties that "coffee" conveys.
And if you DO once in a while, maybe that is not so different from English anyway.
"a steaming hot mug of black coffee" is a lot of syllables, and we would say all of that in English if we happened to want to express all of those facets, so maybe it's not all that different?
When you ask for "a cup of coffee" in different countries, you may also get pretty surprised at what you get (it may be a tiny cup, or a large glass, sugary or bitter, with milk or without...).
not even just countries - my family from California visited NYC and kept getting frustrated when they said they just wanted a "regular coffee," until someone finally realized they meant "black coffee" and it was racist to say black, so they had to order "plain coffee, no milk or sugar"
That... is exactly how a language is expanded, "set phrases" are established and dialects are born.
If Toki Pona was used as an actual language in an everyday setting in a community, it would quickly get set, established terms and phrases for things, and essentially expand the vocabulary.
The Roman Catholic church uses Latin for official definitive documents so they presumably have ways of expressing at least some modern concepts.
"Reginald Foster, a former plumber’s apprentice from Milwaukee who, in four decades as an official Latinist of the Vatican, dreamed in Latin, cursed in Latin, banked in Latin and ultimately tweeted in Latin, died on Friday at a nursing home in Milwaukee. He was LXXXI. "
Or the community would literally just expand the vocabulary directly, unless toki pona has mechanisms to preempt this which doesn’t seem to be the case. Except for the scripts, which both seem to be ideographic, but even then nothing precludes creating new ideographs.
And inevitably at one point something like "hot brown water" will become a single term for coffee. And then it will be contracted into something like "hobrow" because the original term was too long :)
You say that because you’re 1st gen. If this language went anywhere then the people in the 2nd gen will start adding complex words and structures that you never taught them.
This has all been observed before and is inevitable. You want them to call it “telo pi lape ala” but that’s shit so they’ll call it tala or lapela or something. That will then become the word for it.
There are real languages that kinda sorta work that way already.
For example in Tok Pisin helicopter is called "mixmaster bilong Jesus Christ" (Jesus Christ's food blender) or piano: bigpela bokis he got whitepela teeth hegot blackpela teeth sappos you hittim him he cry out (big box with white and black keys that cries out when you hit it)
Sure, piano is piano and helicopter is helicopta, and it's been like that for years, but these expressions are documented and widely known as the initial words people came up with to describe things they didn't yet have words for.
The more those objects became part of daily life, the more incentive there was to find proper words.
The point of my comment was to find an example of a _natural_ language where people came up with ways to refer to new objects by using simple vocabulary (As opposed to a conlang like toki pona)
I want to know how/where the word for plane came from? balus
But also remember that Tok Pisin is a recent language, created from the colonisation efforts, so it's not strange to find that modern words will have a translation. Helicopters and pianos were around at the time.
it's possible the word comes from some substrate language, possibly the name of some flying animal.
The thing with pidgins is that when they start out, they are everybody's foreign language and are used as a lingua franca. Generally people have their own native language where they may or may not have words for a given concept, but when they talk with people from another language group they may or may not be able to just sneak in such a word and be understood. That's why long turns of phrase are sometimes used to describe stuff; not because the speaker doesn't have a word, but because the speaker and the listener don't have a common word.
Interesting. I asked GPT4 if it can translate and it responded with a very confident yes. Below is the translation.
In Toki Pona, the translation for "Would you like coffee or tea?" would be:
"sina wile e kafe anu telo nasa?"
Here's a breakdown:
"sina" = you
"wile" = want/need
"e" = direct object marker
"kafe" = coffee (borrowed word)
"anu" = or
"telo nasa" = literally "intoxicating liquid", often used to mean "tea" or "alcohol" (there's no specific word for "tea" in Toki Pona, but in some contexts, "telo nasa" can be used to mean tea)
It's worth noting that because Toki Pona has such a limited vocabulary, some words and concepts may be represented differently than in other languages.
The equi-complexity hypothesis states that all languages are equally complex and just trade off complexity in one area for simplicity in another. However, it is not yet accepted and highly disputed.
Languages often seem to have a lot of accidental complexity though, which doesn't seem to add anything and just makes them harder to learn. Like irregularities. They seem to come from a lack of design.
Just like in English: "Show up" is, ostensibly, two words, but it's a phrase with a different meaning than either "show" or "up". So yes, pretty much. But "Blue Sky" is two words indicating a single thing, and that thing is both "blue" and "sky", so no, you're not really expanding the language.
That said, I don't speak Toki Pona, so I don't know how the language is used by the various communities that actually speak it.
This kind of phrase usually has a historical origin.
I tried to find out what it is for "show up" but apparently, this is a very recent one and originally meant something else entirely:
"verbal phrase, by 1826 as "to disgrace through exposure;"
It then changed to the current meaning only around 1888, just ~135 years ago:
"The meaning "to put in an appearance, be (merely) present" is by 1888."
Unfortunately, I couldn't find any explanation based on historical use of the phrase (phrasal verb, to be accurate), but given it's a quite recent one, it probably originated from people just trying to emphasize "completion" of the "show" action by adding "up", like with most other phrasal verbs.
Thanks! Yes, I remember reading older novels (Austen, maybe?) where the "disgrace through exposure" sense was used... it was obvious what was meant, but surprising.
"disgrace through superior performance" is a current meaning. AA does a thing, and then (especially if AA appears to be getting praise for it or appears proud of it), BB either does it better or does something more difficult (usually accompanied by a derisive remark about AA or AA's work).
Note that as it is descriptive rather than prescriptive, each translation has a superscript indicating what fraction of English-speaking Toki Pona speakers used that particular translation when the book was written. I don't have my copy on hand but I recall that there was a community survey and Sonja Lang used some kind of scale such as 4 for a universally accepted meaning down to 1 or 1/2 for a highly idiosyncratic/fanciful translation only used by a small number of people.
I suspect if the language evolves some people may compress these further so that "tomo tawa" becomes "tomo-tawa" becomes "tomotawa" which may help with parsing written text. Perhaps fluent speakers do not have this problem, but I typically have to read a sentence more than once to parse it correctly.
> Snowman in English is counted as a word, for instance.
Note that for linguists, the definition of “word” is so slippery as to not have much use. It boils down to “because it’s written that way,” and even that is a stretch, because this definition is different for other languages. And of course, not every language is written.
To use your Chinese / Mandarin example, Mandarin does something very similar to Toki Pona in terms of inventing new phrases to describe more complex concepts (I avoid saying “words” here because it’s not a useful concept for our purposes): chéngyǔ are four-character idiomatic phrases that are not words, but references to much longer cultural stories. They have a distinct meaning, but they are not considered words.
ChatGPT seems to do an ok job, although I have no idea how accurate it is. Here’s one translation of the comment:
> mi olin e toki pona. mi toki kepeken ona tan tenpo lili. sina ken lukin e tomo tawa mi lon lipu "https://www.youtube.com/@janPolijan". toki pona li pona tawa toki musi, taso ona li jo ala e toki mute e nasin toki mute. kalama ona li pona li lili. kulupu pi toki pona li suli e toki. jan li toki e toki mute e toki musi. taso, toki pona li pona tawa toki insa tan ona li musi tawa lawa. tenpo kama la, sina ken toki pona mute. sina ken toki e toki suli. taso, sina wile ala toki e toki pi ilo tawa kepeken toki pona. jan li toki e toki pi ilo sona suli lon tomo tawa lipu. toki pona li lili, taso ona li musi li ken toki e toki mute.
Re-translated back to English in a new chat:
> "I love Toki Pona. I've been speaking it for a short time. You can see my channel at "https://www.youtube.com/@janPolijan". Toki Pona is good for playful conversation, but it doesn't have many words or complex grammar. Its sound is pleasant but minimal. The Toki Pona community enhances the language. People speak many fun conversations. However, Toki Pona is great for introspective conversation because it's fun for the mind. In the future, you might speak a lot of Toki Pona. You can have long conversations. However, you might not want to discuss technology using Toki Pona. People discuss technological topics in great depth on channels. Toki Pona is small, but it's fun and can express many things."
Lots of little mistakes. Here's an accurate back translation:
I love Toki Pona. I've been speaking it from a short time. You can see my vehicle at https://www.youtube.com/@janPolijan. Toki Pona is good for fun conversation, but it doesn't have a lot of talking or ways of talking. Its sounds are nice and few. The toki pona community is growing the language. People do a lot of talking and have fun conversations. But toki pona is good for thinking in because of it and is fun for the mind. You'll be able to talk a lot in Toki Pona in the future. You'll be able to say large things. But you won't want to talk about transportation technology in Toki Pona. People talk in about big knowledge gadgets in book vehicles. [lol] Toki Pona is small, but it's fun and can say a lot of things.
Sure. Here's a very quick translation of the first part (up to "people share songs"):
toki pona li suli mute tawa mi. mi kepeken ona tan tenpo lili. sina wile lukin anu kute e toki pona uta lon la, o lukin e lipu Jutu mi lon ni: https://www.youtube.com/@janPolijan. toki pona li pana e musi toki. taso pali pi kama sona li pona li sama ala toki ante. pilin ike li lon ala tan ni: sona pi nimi ale en sona pi nasin toki li wile e pali lili. kin la, nasin kalama pi toki pona li pona li suwi, la uta sina li ken ala pana e kalama ike. kulupu pi toki pona li kama suli li pali mute. jan li sitelen e lipu li pana e kalama musi.
I've always thought it was interesting that in English there are lots of different words which mean the same thing but a different scale - like pebble / rock / boulder. It could easily just be little rock, rock, big rock. In a lot of cases the distinction does matter though, because what we can _do_ with a thing changes at different scales. We can throw a pebble without really hurting someone, a rock would be mean, and a boulder would mean you're a giant.
I don't have ads, so you can safely check the whole channel. If you'd rather preview a single video, check this one instead, as it's shorter and actually has better spoken toki pona, plus subtitles are available in English (and other langs): https://www.youtube.com/watch?v=VGx_gjTqEyU
I've been delving deep into the structure of toki pona and noticed some intriguing gaps:
Abstract concepts - The language seems to skate over complex philosophical or scientific ideas. You won't find an easy way to express ideas like "justice" or "quantum". It's a testament to its minimalist design, but it also means some concepts are lost in translation.
Specific flora/fauna - While you can use terms like "kili", "kala", or "soweli" to describe plants, fish, or mammals, respectively, don't expect to convey specific species such as "maple tree" or "cat". It's a broad brush approach that lacks granularity.
Intensive modifiers - Toki pona leans hard on the word "mute" for intensifiers. This simplification might hinder nuanced expressions, as there's a big gap between "mute" and a neutral descriptor.
Collective nouns - The language only really has "kulupu" to describe collectives. So, distinguishing between, say, a "flock" and a "crowd" might get tricky.
Idiomatic phrases - A repertoire of idioms or set phrases for common topics would make conversations feel more culturally rich and engaging. It's one thing to communicate efficiently, but another to do so with character and flair.
This isn't criticism but an observation of the trade-offs made for simplicity. Every language has its quirks, and understanding these nuances can make for a richer linguistic experience.
> Abstract concepts - The language seems to skate over complex philosophical or scientific ideas. You won't find an easy way to express ideas like "justice" or "quantum". It's a testament to its minimalist design, but it also means some concepts are lost in translation.
Justice could be translated as “sama pona” (literally “good sameness”). Quantum could be translated as “wan” (literally “unit”); so a quantum leap in the sense of a non-gradual physical movement would be “tawa wan” (“single movement”), whereas a quantum leap in the sense of a large change would be “ante suli” (“large difference”).
You're right that it is very easy to lose nuance in translation. I guess that's true of translating between any two languages that are quite different.
> Specific flora/fauna - While you can use terms like "kili", "kala", or "soweli" to describe plants, fish, or mammals, respectively, don't expect to convey specific species such as "maple tree" or "cat". It's a broad brush approach that lacks granularity.
The idea is that rather than naming things, you describe them. A maple tree could be “kasi suwi” (“sweet tree”) or “kasi pi ma Kanata” (“Canadian tree”), depending on context. A cat could be “soweli pona” (“friendly animal”) or “soweli utala” (“fighty animal”) or “soweli utala pi luka kiwen” (“hard-clawed fighty animal”) or “soweli kalama” (“noisy animal”) or “soweli pona utala kalama” (“noisy fighty friend animal”), depending on your relationship to the individual.
Toki pona forces you to think about which aspects are relevant to your meaning, and state the context explicitly. This is one of the stated intentions of the language, and therefore a feature rather than a bug :)
> Idiomatic phrases - A repertoire of idioms or set phrases for common topics would make conversations feel more culturally rich and engaging.
There are some of these, for example “jan pona” (“good person”) very commonly means “friend”.
> “jan pona” (“good person”) very commonly means “friend”.
But “good person” and “friend” are two very different things. I might describe a well-known charitable figure as a good person, but that wouldn’t mean I would be calling them my friend. One is a value judgement and the other is a personal relationship.
But English has similar ambiguities. For example, a "girlfriend" is usually not just a girl friend. Other expressions are used when there is no romantic relationship.
I don't know toki pona but there is probably another expression for actual good people. Maybe something like "person of high value".
The language is also geared towards expressing ideas in general much more than expressing your own feelings. In this way, a "good person" is someone you would consider a friend even if you've never met, which, while not exactly a concept that's entirely sane on the internet, makes a lot of sense in a more tight-knit community (eg: a good person is a friend to the community and therefore a friend to me).
I have a direct answer: once you get good enough at toki pona to want to distinguish, you do some personal reflection about what a friend is to you and find more specific ways to say that, such as jan poka (accompanying person) or jan mi (my person). You can also specify whether jan li pona tawa mi (good to me) or li pona tawa kulupu (good to the broader community).
Even in English the term "friend" is ambiguous though, some people refer to someone they've only met once as a friend already, whereas others really need to think about who they consider a friend.
If I ask you "do you have a pet?", and you respond "yes, I have a cat", I can easily imagine the pet you have and be generally right (maybe except the colour). How would you respond in Toki Pona so that an average person would understand you have a cat, not a dog or a guinea pig or fish or a rabbit?
From other comments, you would probably describe the pet’s features (likely with a fair amount of confusion / fuzziness), or some feature-descriptions might be stand-ins for specific animals.
For instance, imagine that rather than a cat your pet is a chuckwalla. While that name is quite precise, it is almost certainly useless to the average conversation partner because most people have no idea whatsoever what a chuckwalla is (or that there’s something called a chuckwalla in the first place).
If you have an unusual pet, the name is useless, agreed. But everyone knows what an average pet, like a "cat" or a "dog" or a "guinea pig", looks like, and it would be weird to describe what a cat is in response to this question in most non-conlang languages.
> Justice could be translated as “sama pona” (literally “good sameness”)
But how is that not just a compound word that needs to be learnt? Because I would not guess justice from "good sameness", it might as well mean morality.
So now you have to learn these compound words and the idea of a 130 word vocab is a bit dubious.
Yeah I seem to be noticing a common theme here of claiming that new words are easy to create by putting these simple words together without realising that this doesn't really help anything without people agreeing on which combinations of words mean which things.
If everyone starts to use "good sameness" for "justice", you now have a 131 word vocab. Repeat for how many combinations become common. Eventually you will end up with as many words as any other language. It's almost like the proponents of the small vocab of this language don't actually understand what a vocabulary is
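To put rough numbers on that argument, here's a minimal Python sketch. The base size of 130 is just the figure quoted above, and the helper names `phrase_count` and `effective_vocab` are made up for illustration. It only counts raw word sequences and ignores grammar entirely, so treat it as an upper bound on what could become a fixed phrase, not as a claim about how many actually do.

```python
def phrase_count(vocab_size: int, length: int) -> int:
    """Ordered word sequences of a given length, repeats allowed.

    Assumption: every sequence is a candidate phrase; real usage is far smaller.
    """
    return vocab_size ** length


def effective_vocab(base_words: int, fixed_phrases: int) -> int:
    """Items a learner must memorize once some phrases have a fixed meaning."""
    return base_words + fixed_phrases


if __name__ == "__main__":
    base = 130  # figure quoted in this thread, not an official count
    for length in (1, 2, 3):
        print(f"{length}-word sequences: {phrase_count(base, length)}")
    # One conventionalized phrase ("good sameness" = justice) gives 131 items.
    print("after one fixed phrase:", effective_vocab(base, 1))
```

So there is plenty of room for two-word and three-word conventions to pile up; whether they actually get fixed by the community is exactly the point in dispute.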
The idea of the language is that you describe what you want to say in order for yourself to understand what you mean by each word
The commenter thinks justice is good and about equality, do you? It's mostly a challenge in self-discovery, to know how you would describe different things with the limited dictionary
It's not really made to communicate, although it is of course possible
I've studied three foreign languages (four if you count a failed Toki Pona attempt a while ago), and I find it neat to discover a new grammatical concept, such as subjunctive in Spanish, which exists but only barely in English, or to realize that a language is missing an English construct, such as Mandarin, which doesn't have anywhere near as many gendered words as Germanic or Romance languages. As with computer languages, it's rarely the case that something is inexpressible, but it might have to be done differently compared to another language. I don't completely buy into the Sapir-Whorf hypothesis, but I do believe that cultural differences are partly attributable to grammar and vocabulary.
As I understand it, neither Mandarin nor Irish have words that directly mean "yes" and "no."
Then again, half a millennium ago, English had 4 such words: 2 for yes, 2 for no. (Sir Thomas More tried to explain this circa 500 years ago, and even he got it wrong.)
Norwegian and Czech, both of which I speak very badly, preserve two forms of "yes" and I cannot keep them straight.
Learning Mandarin does make me yearn for the simplicity of English "yes" and "no." But I have to admit, each word in Mandarin works a lot harder than it does in English, so you can often say a heck of a lot in just one or two syllables.
In Mandarin and Vietnamese, you can build new words by concatenating words with broad meanings. Say, a plane is a "flying machine", a mobile phone is a "hand machine", and a fridge is a "cold closet". In practice, it's very effective: you can either guess the commonly used concatenations, or make up your own and still have a chance of being understood. I assume you can do that in Toki Pona too?
Most languages are like this. That English words are so opaque is quite a historically contingent thing, not the baseline of how languages usually work. English retains a lot of Latin and Greek and French compounds, which are usually translated more literally in many other languages.
In German, plane is Flugzeug (flight tool), fridge is Kühlschrank (cooling closet). In Hungarian, they are repülőgép (flying machine) and hűtőszekrény (cooling closet) as well, calqued from German.
English speakers often perceive this as some kind of endearing primitive Tarzan-like communication without all those classy prestigious words, but English also has plenty of examples. How about "washing machine" for example?
In English, we make words the same way - it's just that we construct them out of Latin or Greek roots.
"Television" is tele (Greek, 'far') + vision (Latin, 'seeing'). German for television is "Fernsehen" - fern (German, 'far') + sehen (German, 'seeing').
Even English kind of does this, just playing parkour through multiple languages, and with time and use shortening the compound words to shorter alternatives. A plane is an aeroplane, or aeróplanos, an air wanderer. A mobile phone is a mobile telephone, a movable far-voice.
And of course in many languages more related to English as well, such as (a bit infamously) German. And obviously there are compound words in English as well, either concatenated ("snowman", "airport") or not ("cell phone", "compound word").
Well, the kind of computer you have in mind used to be called an electronic computer to distinguish it from a computer or calculator, which described a human being who computed. See, for instance, Nevil Shute's autobiography Slide Rule, in which he describes being in charge of a department of calculators when working on the R100.
It's a fascinating read. You can get it from Anna's Archive.
“Don't you see that the whole aim of Newspeak is to narrow the range of thought? In the end we shall make thoughtcrime literally impossible, because there will be no words in which to express it."
Pragmatically, in English we deal with this by briefly dropping into Latin to talk about Acer caesium, for example, and this is fine because we’re generally not aiming for any kind of linguistic purity - it’s not so easy if we try to use Toki Pona as a highly puristic, controlled vocabulary language.
Doesn't this suggest that ~Hard Whorfism may have some truth to it? Expert consensus seems to be that it is a "fact" that Hard Whorfism is false, though I strongly disagree. I wonder what your take on it is?
This was circulating on Usenet in the early 2000s for a while; I'd love to see it made official:
# Toki Pona: Snoufax II Advanced
## Basic Particles
- *a* - particle
- *e* - object marker
- *en* - and
- *la* - used for conditions and time
- *li* - links subject and predicate
- *o* - vocative particle
- *pi* - of
## Tense and Aspect Modifiers
- *sili* - Continuous or ongoing.
"mi sili toki" – I am speaking.
## Moods and Modality
- *kin* - Potentiality (can/might).
"mi kin toki" – I might speak.
## Technical and Scientific Terms
- *ilo sona* - Computer (knowledge tool).
"mi kepeken ilo sona" – I use a computer.
## Abstract Concepts
- *pona nasin* - Justice (right way/path).
"pona nasin li tawa mi" – Justice is important to me.
## Idiomatic Expressions
- *pakala luka* - Break a limb (good luck).
"pakala luka tawa sina!" – Break a leg (Good luck)!
## Politeness Levels
- *sewi* - To elevate the status of the person being spoken to.
"sewi jan Lisa, mi toki tawa sina" – Honorable Lisa, I speak to you.
## Detailed Descriptors
- *jelo pimeja* - Dark yellow (like mustard).
"mi jo e tomo jelo pimeja" – I have a mustard-colored house.
## Complex Syntax
- *seme* - Universally before a statement to make it interrogative.
"seme sina toki?" – Are you speaking?
## Pronouns
- *ono* - Neutral pronoun (they/them).
"ono li toki tawa mi" – They are speaking to me.
## Numerical System
- *ten* - Ten.
"mi jo e ten tomo" – I have ten houses.
## Prefixes/Suffixes for Intensifiers
- *suli-* - A prefix to mean "very/big".
"suli-pona" – Very good.
## Passive Voice
- *palisa* - Passive.
"mi palisa toki" – I am spoken to.
## Directional Particles
- *noka* - Toward.
"mi tawa noka tomo" – I am going towards the house.
- *monsi* - Away from.
"mi tawa monsi tomo" – I am moving away from the house.
## Negation of Modality
- *ala kin* - Cannot, impossible.
"mi ala kin toki" – I can't speak.
## Temporal Phrases
- *tenpo pini* - Previously, before.
"tenpo pini la, mi toki" – I spoke before.
- *tenpo kama* - Soon, later.
"tenpo kama la, mi toki" – I will speak later.
## Comparison and Superlatives
- *sama lili* - Less like, lesser.
"mi sama lili jan Lisa" – I am less like Lisa.
- *sama suli* - More like, greater.
"mi sama suli jan Lisa" – I am more like Lisa.
## Conditional Statements
- *sama... la* - If... then...
"sama toki pona li pona tawa sina la, sina toki kepeken ona" – If you like Toki Pona, then you speak using it.
## Reflexive Actions
- *sama mi* - Myself.
"mi toki tawa sama mi" – I talk to myself.
## Possession
- *pi mi* - Mine.
"tomo ni li pi mi" – This house is mine.
## Emphasis Particle
- *kin li* - Emphasizes the action or state.
"mi kin li toki!" – I really am speaking!
## Uncertainty or Hesitation
- *...anu...* - Either... or...
"mi toki anu toki ala" – I either speak or don't.
## Verb Structures
- *li... e* - Direct object indicator.
"mi li toki e toki" – I speak the language.
- *tan... tawa* - Cause and effect.
"mi tan tomo tawa tomo sona" – I go from home to school.
## Adjectives and Adverbs
- *sama sona* - Intelligently, knowledgeably.
"ona li toki sama sona" – He speaks intelligently.
- *pona mute* - Very well.
"mi toki pona mute" – I speak very well.
As an experienced Toki Pona speaker, I find this list amusing because most of it is either a pointless, terrible idea or already valid and accepted, with somehow very little in between. I'll leave it to the next commenter to go point by point.
I've been reading, writing, and translating text in Toki Pona for a year.
My main criticism of the language is that the words are so abstract that understanding spoken Toki Pona at real speed is practically impossible. Too many possible translations. Too many ways of combining these words, and no mechanism to identify which words form a compound.
I didn’t understand English at real speed either during my first years of learning. There’s a moment when you easily understand all the words in a conversation, but it still requires a bit of subconscious effort, so you don’t have enough “cortex RAM” left to understand the meaning of the whole conversation at first read/ear.
But the whole point is that English and Toki Pona are totally different languages and Toki Pona has a property which makes it harder than English to understand at speed.
You can't treat all languages the same. Just because you struggled with English doesn't mean that English is as hard to understand at speed as Toki Pona.
Oh. Toki Pona is fun. I learned it a while back and the limited set of vocabulary really leads to two things:
a) Local dialects, similar to sign-languages. e.g. "good fruit" might mean "mango" in one Toki Pona community or "apple" in another
b) It felt kind of freeing, because a lot of words convey a very broad concept. It is actually really interesting to talk about certain emotions, because "a bad day" is "a bad day" and one doesn't really need to go into specifics about _what exact kind of bad_ in order to be consoled by the other person. This is not super helpful if you need a problem solved or need to describe what kind of pain you feel to a doctor, but just for the everyday venting it was kinda nice.
I was reading somewhere that most Americans have an extremely limited working vocabulary of emotions. For instance "I feel bad" can mean guilt, disappointment, abandonment, sickness etc etc.
There are dozens and dozens of possible emotion words but very few actually used in practice, which supposedly limits people's ability to introspect and manage their emotions.
eh, I don't know. Cursive is pretty rare nowadays. You also in my opinion do not really need the capital letters, and many of them look like their lowercase counterparts, but I take your point. I definitely think English can be tricky to learn, though, and feel lucky to speak it natively given that it is the lingua franca in many (perhaps most) contexts.
But it does. Most of the lower case letters are simply very similar to the larger upper case ones. Upper case, or majuscule, is used as the initial letter of a sentence and for proper nouns just as in English. At least that's what Wikipedia says.
A funny game I saw Toki Pona speakers play: make an appointment to meet somewhere in Toki Pona, and then the next time you meet for other reasons, compare notes when you were where, and for what reason, as a result of that appointment.
Presumably nobody managed to successfully communicate the time and place of the appointment so it becomes funny to figure out how everyone misunderstood it.
Numbers and locations are abstract in Toki Pona so it's difficult to communicate exact times and meeting locations. Only the numbers 1, 2, 5, and 100 are officially in the language. Though you can always chain "1" and "2" together and render a place name into Toki Pona-ish phonology and add "ma" ("place/location" in Toki Pona) to it to make it work.
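The chaining described above can be sketched as a simple additive scheme. This is a minimal sketch, not an official algorithm: it assumes the usual word assignments wan = 1, tu = 2, luka = 5 and ale = 100, and the function names `to_toki_pona_number` and `from_toki_pona_number` are made up for illustration.

```python
# Additive number words, largest first. The word assignments are an
# assumption of this sketch: ale = 100, luka = 5, tu = 2, wan = 1.
NUMBER_WORDS = [(100, "ale"), (5, "luka"), (2, "tu"), (1, "wan")]


def to_toki_pona_number(n: int) -> str:
    """Render a positive integer by chaining number words that sum to n."""
    if n <= 0:
        raise ValueError("this sketch only handles positive integers")
    words = []
    for value, word in NUMBER_WORDS:
        # Use each word as many times as it fits, then move to the next one.
        count, n = divmod(n, value)
        words.extend([word] * count)
    return " ".join(words)


def from_toki_pona_number(phrase: str) -> int:
    """Sum the values of the chained words to recover the integer."""
    values = {word: value for value, word in NUMBER_WORDS}
    return sum(values[word] for word in phrase.split())
```

Under these assumptions 7 comes out as "luka tu", while 99 needs 21 words, which gives a feel for why exact times are awkward to communicate.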
The “only 120 words” is an intriguing idea (and illustrates just how far one can get with periphrasis alone), but if this were an actively spoken language the lexicon would likely expand quickly, first to include conventionalized composita, which would probably soon grammaticalize as new roots. I’m also not sure whether most of the existing grammatical markers would survive long; in particular, “li” seems redundant and would likely be optimized away in just a few generations. Similarly, the grammar would likely become more complex and gain complex clauses and some grammatical markers for event structure (e.g. time specification) - these are all important things which right now are not ergonomic to use.
At any rate, very smart exercise in linguistic minimalism. I wonder whether even smaller languages are viable using data driven design (e.g. via mining large language corpora for particularly well connected lexical meanings to identify most useful concepts for periphrasis)
It’s a bit more complicated than this. Of course you can represent any information with any kind of coding schema, but here the trick is to find a schema that’s also reasonably efficient. As so often, this is about compression. In the context of human language, the problem is finding the smallest set of lexical concepts such that you can express a wide range of meanings with minimal combinations. A lexical principal component analysis, if you want.
But yes, quite a lot of things about languages have been subject to unhealthy sensationalism, like the story with “over 50 words for snow in Eskimo” or “Pirahã unable to count”. Linguistics is not that easy.
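The "smallest set of lexical concepts" problem mentioned above resembles set cover, so here is a rough sketch using the standard greedy set-cover heuristic. This is a simpler stand-in, not the principal component analysis alluded to, and the roots and the meanings each one "covers" are toy data invented for illustration.

```python
# Toy data (invented): which meanings each candidate root can help express.
CANDIDATE_ROOTS = {
    "telo": {"water", "coffee", "rain", "soup"},
    "seli": {"fire", "coffee", "soup"},
    "moku": {"food", "soup"},
    "kafo": {"coffee"},
}
TARGET_MEANINGS = {"water", "coffee", "rain", "soup", "fire", "food"}


def greedy_cover(candidates: dict[str, set[str]], targets: set[str]) -> list[str]:
    """Pick roots one at a time, always taking the one that covers the most
    meanings that are still uncovered (greedy set-cover approximation)."""
    uncovered = set(targets)
    chosen = []
    while uncovered:
        best = max(candidates, key=lambda root: len(candidates[root] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            break  # the remaining meanings cannot be expressed with these roots
        chosen.append(best)
        uncovered -= gained
    return chosen


roots = greedy_cover(CANDIDATE_ROOTS, TARGET_MEANINGS)
```

With this toy data the broad roots are picked and the narrow, single-meaning root is never needed, which is the compression effect in miniature. Real lexicon design would also have to weigh how long and how ambiguous the resulting combinations are.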
You also could talk with only one word (unary system), but this misses the mark completely.
I experimented with designing a variant of Toki Pona with only two-syllable words, but then I realized that's why it was a linguist who designed Toki Pona. One of the points: the phonemes in a language are not evenly distributed, and neither are their combinations. For example, /ne/ might be twice as frequent as /pa/. As a programmer, I started with something like a uniform distribution, and immediately realized that it felt awkward.
It's like the secret sauce: choose the right 120 words and their pronunciation wisely!
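The difference between a uniform and a skewed syllable distribution can be illustrated with a small word generator. This is only a sketch: the syllable inventory and the weights are invented for illustration and are not measured from Toki Pona or any corpus, and `make_word` and `syllable_share` are hypothetical helper names.

```python
import random

# Hypothetical syllable inventory and weights, invented for illustration only.
SYLLABLES = ["ne", "pa", "ki", "lo", "su", "ma"]
WEIGHTS = [4, 2, 1, 3, 1, 5]  # here "ne" is twice as likely as "pa"


def make_word(rng: random.Random, weighted: bool, syllable_count: int = 2) -> str:
    """Build a word from randomly drawn syllables, uniformly or by weight."""
    weights = WEIGHTS if weighted else None
    return "".join(rng.choices(SYLLABLES, weights=weights, k=syllable_count))


def syllable_share(words: list[str], syllable: str) -> float:
    """Fraction of all syllable slots in `words` filled by `syllable`.

    Relies on every syllable in this sketch being exactly two letters long.
    """
    slots = [w[i:i + 2] for w in words for i in range(0, len(w), 2)]
    return slots.count(syllable) / len(slots)
```

Generating a few thousand words each way and comparing `syllable_share` for "ne" and "pa" shows the skew directly. Whether the weighted output actually sounds more natural is a judgment the code cannot make.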
If this is too minimalist for you, another interesting conlang is 'Mini' which has 1000 words. I haven't tried it, but I've heard it's pretty easy to learn.
This was recently on a textbin, glad I backed it up:
A More Refined Minimal Language Inspired by Toki Pona & SpamTec & Ytcracker
Basic Vocabulary:
suno - sun, light, day
moku - food, eat
tomo - shelter, place, structure
telo - drink, water, fluid
jan - individual, person
soweli - creature, animal
kama - future, approach, event
tawa - move, direction, towards
ni - this, present
mi - self, I
sina - other, you
ona - they, he, she, it
Grammar:
Basic Structure: Subject + Verb (or Predicate)
mi moku. - I eat.
sina suno. - You shine or You are sunny.
Direct Objects are introduced with "e":
mi moku e telo. - I consume water. (I drink water.)
sina tawa e tomo. - You move towards the shelter. (You go to the house.)
Questions:
For yes/no questions, use "seme" after the verb.
mi moku seme? - Do I eat?
sina tawa seme? - Are you moving?
For open-ended questions, place "seme" where the unknown is.
seme li moku? - Who eats?
sina tawa e seme? - Where are you going?
Tense Indication:
Past: Use pini before the verb. Future: Use kama before the verb.
mi pini moku. - I previously ate. (I ate.)
mi kama moku. - I will eat soon. (I will eat.)
Negation: Use ala after the verb.
mi moku ala. - I don't eat.
Possession: Use pi between the possessor and the thing possessed.
mi pi tomo. - My house.
sina pi moku. - Your food.
------------
If anyone is interested in practicing with me, let me know!
Very cool language! As something auxlang-like, it seems to have run into many of the same issues in creating the vocabulary as other auxlangs, but that's not usually a problem for me since I know a bunch of Romance and South Asian languages. Thanks for sharing!
I am curious how this differs conceptually from "Basic English" which always struck me as quite linguistically naive. English is full of prepositional verbs like "shut up" that have a specific meaning that you have to just memorize. So when you're listing out the "vocabulary" of some reduced version of English, you have to count "shut" "up" and "shut up" as different words. Or "break up with" to mean to end a romantic relationship. It's a set phrase with a defined meaning, not just a phrase whose meaning is apparent from the words used. It needs its own entry in a dictionary.
edit: Not to mention compound words in English, which any linguist would say are just one word ("hot dog") but because of spelling conventions, people tend to parse as two words. (In English, you can tell when something has become a compound noun by the stress pattern; John McWhorter calls this "backshift" and it's also the difference between the noun and verb forms of words like "record" and "rebel.")
The Wikipedia page says explicitly: "However, it was not created as an international auxiliary language. [...] The language is designed to help users concentrate on basic things and to promote positive thinking".
So it seems like the HN title is wrong, and it explains the limitations pointed out in a few comments.
A game idea: you wake up and find yourself in a mysterious land where people speak only Toki Pona. You don’t know why you are there and how to get out of there. In order to go home you have to learn the language and discover the secret of the land through interactions with NPCs.
A long time ago, a tribe of linguist PhDs settled on the island, because funding for the humanities dried up, and tenure track jobs vanished into myth. They were forced to settle on a remote island because it was the closest place with affordable housing.
I've had nearly the exact same idea, except the NPCs are explicitly aliens, to explain why they have such different psychology that their language hasn't grown further. Or perhaps Toki Pona is their trade language, or it's a language only children speak, or something of that nature.
I am sure one can have fun with ChatGPT with this prompt. But making a good game out of it to help others learn Toki Pona would be a great challenge. What game engine? What type of game? How to make the story compelling? The game should progressively replace the main language with Toki Pona as the gamer acquires the language skills; at the end there will be only Toki Pona, and maybe even the game interface will be completely in Toki Pona. It will be a fun project to work on :-)
My first assumption was that this wouldn't need to be more than a text adventure, so my first thought was Twine. But LLMs are literally language-based and (near)-infinitely flexible, and many already are capable of Toki Pona. There are some incredibly elaborate LLM games driven by a single prompt.
You can think of most common languages as "lossless" (at least in comparison), whereas Toki Pona is "lossy". Once you translate English to Toki, it loses a lot of granularity. For example...
Translating English to Toki Pona:
Talk -> Toki
Translating Toki Pona to English:
Toki -> To speak, Communicate, Use language, Think, Say "hello"
Which means, it's now impossible to translate it back to English and know which one is referred to, without context.
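The "lossy" round trip above can be sketched as a many-to-one mapping and its inverse. This is a toy illustration only: the dictionary holds just the glosses listed above, and `invert` and `round_trip_candidates` are made-up helper names.

```python
from collections import defaultdict

# Toy dictionary built from the glosses listed above; many English phrases
# collapse onto the single Toki Pona word "toki".
ENGLISH_TO_TOKI_PONA = {
    "talk": "toki",
    "to speak": "toki",
    "communicate": "toki",
    "use language": "toki",
    "think": "toki",
    'say "hello"': "toki",
}


def invert(mapping: dict[str, str]) -> dict[str, list[str]]:
    """Group the English phrases by their Toki Pona translation."""
    reverse = defaultdict(list)
    for english, toki_pona in mapping.items():
        reverse[toki_pona].append(english)
    return dict(reverse)


def round_trip_candidates(english: str) -> list[str]:
    """Translate to Toki Pona, then list every English phrase that could
    have produced that translation."""
    toki_pona = ENGLISH_TO_TOKI_PONA[english]
    return invert(ENGLISH_TO_TOKI_PONA)[toki_pona]
```

Calling `round_trip_candidates("think")` returns every entry in the toy dictionary, so picking the original phrase back out requires context that the mapping itself does not carry.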
The language forces you to be less specific in your meaning, but also to focus on core human emotions instead of unnecessary complexity. Having a translator perfectly convert English to Toki Pona is gonna be a lot easier than the reverse, anyway, unless you have a lot of context to piece it together.
Note: I tried learning this language many years ago and this is my personal opinion. It may not be shared by everyone.
> You can think of most common languages as "lossless" (at least in comparison),
If only! For most words in most sentences, you can find direct translations, but between most language pairs, I'm pretty sure any given text (just one page or so, let alone books) will have at least one point where you have to interpret the text to translate it: choose what meaning the author intended or what saying/expression conveys the same meaning the closest without going into descriptive mode.
I suppose that might be what you meant by "in comparison", but it seems misleading, at least to anyone not fluent/proficient enough in another language to know that lossless translation isn't how languages work.
But, yes, toki pona lifts this to a whole 'nother level. Does anyone happen to know how Esperanto compares? As far as I know, it should be quite similar to a "normal" (non-con)language in this regard (just simpler to learn, with bonus points if you speak a romance language afaik)
Esperanto was intended as a sort of diplomatic language. It's got flaws, definitely. The sounds and spelling are very much from the creator's native Polish, a lot of important terms are rather obscure («Usono,» from "Usonia" is the word for the United States). That said, it is in the end relatively easy to learn, and it is easy to express the ideas of diplomacy, science, and civil society.
China and Japan used to have a lot of Esperantists before WWII, for that reason.
> After World War I, the League of Nations considered adopting Esperanto as a working language and recommending that it be taught in schools, but proposals along these lines were vetoed by France.
It may be Eurocentric, but it's hell of a lot easier for diplomats to learn than English or French!
An actual Esperanto speaker here. I need to correct this. It was never intended to be a "diplomatic" language, that is, a language spoken only by diplomats among their own kind: the language of a small elite that does not want to deal with the average man on the street. That sounds like a story which was told about the predecessor of Esperanto: Volapük.
Esperanto was at some point the "workers' Latin", because less educated workers could learn it as a means to talk with people from other nations. That ended with the push for English or other "more practical languages" in schools, which continues to this day.
Esperanto still is a working living language with a working worldwide community.
Zamenhof stated multiple times that he wanted to create a universal second language, as opposed to a universal first language. I don't think this distinction makes much sense or had any effect on any design decision, but it was probably important for the marketing of the language. In this sense it was indeed intended to be a "diplomatic" language, so that diplomats can use a single language. (As well as international organizations, merchants, tourists etc.)
That doesn't sound logical to me. If Zamenhof didn't intend for it to be a primary language, one you learn from birth, then why couldn't it still be used by random people? There has been trading between countries for much longer than Esperanto has existed, especially in border regions or small countries but also across oceans and continents.
Esperanto is from 1887. I was curious what holidays were like at the time:
> According to Stowe (1994), “many nineteenth-century Americans traveled, and many more participated vicariously in the experience of travel by reading travel letters, sketches, and narratives in newspapers, magazines, and published volumes” (p. 3). Similarly, the appetite for travel in the U.K. was also voracious --https://regrom.com/2020/08/26/regency-travel-traveling-abroa...
So also a goal Zamenhof could intend. I don't know how you get to the conclusion that, because it wasn't intended for my mom to use while I was a baby, it wasn't intended to be used by my mom or me on holiday if we're not "diplomats", unless you call any tourist an international diplomat
Western Europe is very different from the Europe that Zamenhof grew up in.
You get so many internationalist movements out of Russia because it already was in many ways international internally. Lots of languages and land, but both travel and speech were restricted by authorities, secret police seemed to hover invisibly everywhere. The language of everything important, the language of rulers, was Russian. Vacations were in-country, if they happened at all.
Looking to the UK, France, and the US is in this case misleading.
I think that view of yours is wrong. The distinction is an important one.
By saying you create a diplomatic language, you are marketing towards the elite, as I wrote in my post.
Saying that everybody speaks it as a second language, which is indeed what Zamenhof, the creator of Esperanto, wanted, is a different focus. The first focuses on an elite, the latter on the people.
It's like the distinction between "computers for knowledge workers" and "personal computers": the first is only for a small elite, the latter is for everybody. Or the distinction between "politics for a select few" and "politics elected by everybody": the first is called a form of dictatorship, the latter democracy.
Right, Esperanto wasn't created for the elite, or only for diplomats. That wouldn't make much sense. Also I don't think GP intended to suggest it, but you clarified it anyway, so all's good.
Frankly, this is why despite my admiration for Esperanto, I do not engage in it.
Posts like these are the 'no fun allowed' of constructed languages, and it pops up most often with Esperantists. Like a diplomat, you refuse to let people use words carelessly, or loosely.
Toki Pona is in itself a reaction to that. It's an exploration in wordplay, puns, and local culture.
EDIT: You also left like... a wall of text explaining why Esperanto is far superior to Toki Pona? That isn't fun to read or talk about. If the idea is to replace English as a language of the world, we don't have to bring the stern attitude of an English teacher along with it.
> Posts like these are the 'no fun allowed' of constructed languages, and
> it pops up most often with Esperantists. Like a diplomat, you refuse to
> let people use words carelessly, or loosely.
Wtf? What 'no fun allowed'?! Fun is allowed in the community. How and why are you making that stuff up, and based on what, actually? What interpretation are you constructing that is not based on any reality? We have wordplay, puns and local culture. People do these all the time and annoy the more grammatically inclined people with it all the time. These conflicts inside the community are normal: any community will develop people who care more about the language and people who care less. That's how new concepts are generated.
> EDIT: You also left like... a wall of text explaining why Esperanto is far
> superior to Toki Pona? That isn't fun to read or talk about. If the idea is
> to replace English as a language of the world, we don't have to bring
> the stern attitude of an English teacher along with it.
The wall of text tried to answer the question sincerely of how they compare.
Also, it included my personal bitterness about people who constantly piss on Esperanto for the wrong reasons. Such as exactly this post of yours.
And that's also why I stopped writing it. I wrote that it lacks "functionality" that makes Esperanto more complex. Toki Pona is minimalist; it can't be the best language in the world for everything. But that does not make it bad. People enjoy learning it and, despite what you try to make people in the Esperanto community look like, a bunch of them speak that language too, for its value of minimalism, its value in playing around with the Sapir-Whorf hypothesis regarding depression (it's after all the language of good), its value in finding a community, etc.
You see something, interpret it wrongly and then piss on it, for the wrong reasons.
> Toki Pona is in itself a reaction to that. It's an exploration in
> wordplay, puns, and local culture.
That a niche of people who are inclined to perfectionism, down-beating and snobbishness are also inclined to favor Toki Pona is shown by your comment.
Slight correction: Zamenhof's native languages (so far as we can tell), in the sense of what he spoke at home, were Yiddish and Russian, although he certainly learned Polish at a very young age due to place of residence. Not that it makes much difference in this case - the quirks of Esperanto phonology, such as all those affricates and consonant clusters, are familiar to speakers of pretty much any Slavic language. Esperanto orthography, on the other hand, appears to be inspired more by Czech than Polish - "v" rather than "w", diacritics over digraphs etc.
Thanks for the balanced reply! Yes you're right, I'm only using lossless and lossy as a metaphor here, because of course, any translation will lose information. Even speaking your native tongue with another fluent speaker will lose information because the meaning is interpreted.
However, I'd say that well-developed languages (which for me means having many hyper-specific words and a diverse vocab available) allow interpretation to be more precise and functionally allow communication to happen. With Toki Pona I have the sense that a lot more is lost, and we end up interpreting broad strokes of the original meaning. This isn't bad, it's just different. Poetic if you like.
I just started learning Esperanto on Duolingo and it's been a breeze. I've studied other European languages (Spanish, French, German, Latin) as a hobby before and Esperanto is kind of a streamlined hybrid of all those.
Interlingua may be "more useful" as an auxlang but the size and history of the Esperanto community appeals to me.
I'm also studying Bahasa Indonesia right now; it also has a very regular and simple grammar. It's said to be the easiest non-European language for European language speakers to learn.
> But, yes, toki pona lifts this to a whole 'nother level. Does anyone
> happen to know how Esperanto compares? As far as I know, it should be
> quite similar to a "normal" (non-con)language in this regard (just simpler
> to learn, with bonus points if you speak a romance language afaik)
As an Esperanto speaker with a shallow understanding of Toki Pona, I can offer the following comparison:
1. Esperanto has way more vocabulary, which is useful for translating nuances from one source language to another language and back. This was added over the years and is still expanding, as any other living language does.
2. Esperanto words are way more specific than those of Toki Pona. A sentence which I took from a Toki Pona cheatsheet, "soweli li moku", can be translated as "a land animal is eating", "a cat is drinking", "a dog is ingesting", etc., so it is highly context-dependent. For the specific meaning of "a cat is eating", Esperanto would allow "kato estas manĝanta" or, if you don't care for the present progressive tense, which is how people normally speak it, "kato manĝas" (present tense).
3. Toki Pona glues words together by putting them each after the other, without glueing the roots together. "telo pimeja" was one example of trying to say "coffee". The Esperanto principle would be to say nigroakvo (a kind of water, which is black). Besides that Esperanto allows more nuanced distinctions like adjectives, so "nigra akvo" would mean "black water", the water, which is black.
Also, as I wrote earlier, the roots in Esperanto are way more specific. "akvo" in Esperanto is just water, not liquid, not fluid, not beverage, as "telo" means in Toki Pona. Also, "nigra" in Esperanto is "black", not dark, not unlit, as "pimeja" means in Toki Pona. Besides, that was just an example based on the word coffee, which someone else in this topic used. The actual word for "coffee" in Esperanto is "kafo" and for tea is "teo", because words which are sufficiently internationally understood are put into the language by its speakers.
4. Esperanto is simpler to learn, it provides the same high rewarding learning curve as does Toki Pona, but it's steeper, because it's a bigger language. Depending on intelligence, learning style, being able to focus on things regularly, motivation, connections to other speakers, time I would say people can use the language after 3 months to one year. How good that is, that's another question. I learnt the language more intensively in the first year and then just maintained it, used it and improved upon it.
5. Esperanto uses an agglutination-based way of building words, as I wrote in the example of point 3. Based on my limited understanding of linguistic terms, this could be called "synthesizing" words, but it also supports building phrases in an analytical way, as Toki Pona does. "Ĉu mi povas uzi la lazertranĉilon?" (Can I use the laser cutter?) or "Ĉu mi povas uzi la ilo kiu tranĉas per lazero?" (Can I use the machine which cuts with a laser?). Toki Pona would be something like "mi li pali ala pali e suno ilo" (Can I use the sun-tool?). "ilo", which means "tool", is by the way a word taken from Esperanto, as the creator of Toki Pona also understood Esperanto.
6. Esperanto has 136 years of literature behind it, Toki Pona doesn't. Let's wait another 136 years and see which of all the current constructed languages are then still around and see how much is produced in them.
7. When you learn Esperanto you acquire some vocabulary shared with Romance languages, but also a little bit of other languages in the mix, like German. That can be a little bit helpful when learning those languages after Esperanto. Or when trying to navigate on a trip through Italy: you won't understand 80% of what people are saying, but here and there you will encounter words which sound familiar. This shows the so-called "eurocentricity" of Esperanto, which is then used by some to campaign against the language, as if that property made it the ultimate evil in the world. If they applied the same standard to speaking English, they would stop writing on the internet at all.
8. Toki Pona is overhyped. Esperanto is not hyped anymore. Any long-existing constructed language loses its appeal at some point, because the times have changed, and people don't understand its history and believe anything which at first sounds plausible enough but is in fact bullshit. Worldwide auxiliary languages like Esperanto always have the problem of not being "perfect" enough; they all replace each other in an eternal asymptotic drive towards "more perfection", without ever reaching that goal. Thus in the end the most workable language in that category is still Esperanto, and hopefully it will be for the next 136 years. But that's only my personal pet opinion.
Anyway, as I'm digressing into some general rant about planned languages, I end this comment, I can surely answer concrete questions about comparing these two languages later on, if wanted.
This is precisely the case where ChatGPT (or any LLM) will have trouble that a layperson would have trouble validating: Toki Pona has a limited document corpus and requires contextual interpretation obvious to experts.
In this case, I validated against the hand-translated example in the main link. It did fine.
This is probably easy mode for it, given the tiny vocabulary at hand, even though it has surely seen very little Toki Pona in the wild. Large language models are extremely good at translation. I’ve personally tested ChatGPT-4 against Google Translate and other translators on some hard translations (languages that are not similar to English, such as Chinese), and it has consistently provided better translations than Google Translate in my opinion. As an example, for validation purposes, news stories that have been reported worldwide make things easier, since you can establish the facts of the translation by hand, and then compare what comes out.
What does your comment actually add to the discussion? Yes, ChatGPT can be wrong. So can Google Translate!
If you’d like to provide a clear example of ChatGPT-4 (not ChatGPT-3.5) making up a Toki Pona translation that is completely disconnected from reality, please do. That would be legitimately interesting conversational material. General cynicism isn’t.
ChatGPT absolutely has limitations, and translation is a hard problem, even for the industry-standard tools. Crossvalidation with multiple translation tools and ideally humans is important for any translations that actually matter.
Another challenge is that Toki Pona requires a lot of context, not only from previous sentences, but also visual and communal. For example, I can say 'soweli lili' and point to a cat, and then in all further conversation 'soweli lili' will mean exactly 'cat' until specified otherwise.
I could use a sentence that could literally mean hundreds of thousands of different things, but if I explain it properly once, you are expected to keep it as context.
ChatGPT4 will struggle to keep all context as it will surely accumulate.
I've done a fair amount of translating Toki Pona based on my own sentences, and based on recent online posts. None of that would have been in the training set. It does a good job. Since Toki Pona translation is "lossy", you can even prompt it to give you multiple interpretations and discussion.
Since LLMs' only valid uses involve entertainment and creativity, and Toki Pona is more or less a recreational language without serious application, I think they go together like two peas in a pod.
-edit- After posting, I decided this was too snarky/low value on its own. So to add:
I personally get quite a bit of value out of it as an ad-hoc German tutor and as one-among-many tools for research and learning on new topics for which I am not already an expert.
I've also used it to help me make and troubleshoot very simple code in languages that I am not an expert in.
I personally have trivially found lots of uses that probably all cluster around "educational" to one degree or another.
Is it perfect? Absolutely not. Does it hallucinate/get things wrong? Absolutely. Is it useful despite those limitations? Emphatically yes.
This idea that its "only valid use" is entertainment doesn't even make sense to me. "Entertainment" is perhaps where I find it the most deficient relative to other current options.
"In mathematics, a real number is a number that can be used to measure a continuous one-dimensional quantity such as a distance, duration or temperature. Here, continuous means that pairs of values can have arbitrarily small differences. Every real number can be almost uniquely represented by an infinite decimal expansion."
->
"toki pi nanpa, nanpa lon li nanpa li ken lukin e suli tawa wan. lon ni la, nanpa tu li ken ala li poka li ken lili. nanpa lon ali li ken lukin kepeken nimi nanpa mute tawa pini ala"
->
"Talking about numbers, when a number is present, it can be seen to be important or significant to an individual."
"In this context, the number two might not be significant or might be seen as small or lesser."
"All numbers can be seen using many number names endlessly or without an end."
ChatGPT 4, obviously in separate chats. As you can see, a complete failure to convey and round-trip the original message.
I would definitely not attempt to translate something like that in one shot - I'd read it and digest it for a bit and come up with a way to say the same thing without being unnecessarily obtuse, Thing Explainer style. mi nanpa e mute ijo la kipisi ijo li ken ala lon. ijo wan sin li kama la mi suli wan e nanpa. taso mi ken ala suli kipisi e nanpa. nanpa pi mute ijo la ni li ken ala: ijo tu en kipisi pi ijo wan li lon. taso mi nanpa e suli ijo la kipisi nanpa ale li ken. ijo li kama suli sin pi kipisi wan la mi ken suli pi kipisi wan e nanpa. kipisi li ken lili ale. nasin nanpa ni li ken nanpa e suli ijo e suli tenpo e suli seli. nanpa pi suli ijo la ni li ken: mi sitelen e nanpa la sitelen ni li ken nanpa ni taso li sama ala sitelen pi nanpa ante. taso nanpa mute la sitelen ona li suli ale li ken ala pini. GPT3 doesn't translate this very well, it seems to miss the fine points of grammar, but maybe GPT4 would do better
This is what GPT4 came up with for the translation:
> The text you provided is in Toki Pona, a constructed language with a minimalistic vocabulary. The translation to English is challenging due to the highly context-dependent and simplified nature of the language. Without further context, here's a rough translation:
> "When I count many things, cutting things is not possible. When one new thing comes, I increase the count by one. But I can't increase the count by cutting it. In the case of counting many things, this is not possible: two things and a cut of one thing exist. But when I count large things, cutting all numbers is possible. When something becomes large by cutting one, I can increase the count by cutting one. Cutting can make everything small. This counting method can count large things, long times, and hot things. In the case of counting large things, it's possible: when I write a number, this writing can count it but is not the same as writing a different number. But with many numbers, its writing is always large and cannot end."
> Please note that the translation might not capture all the nuances, and the interpretation may vary based on the context in which the text is used.
I’ve only known about Toki Pona for about 20 minutes, but everything I’m reading says that’s just a normal property of the language, not a limitation of ChatGPT.
As a human, which of the 120 Toki Pona words would you confidently and correctly translate as “continuous” or “real” or any of these other linguistically uncommon concepts?
Based on my own chat with ChatGPT, I feel almost certain that it warned you about this exact limitation of the language, but you ignored that?
"continuous" (in this actually incorrect usage, since the correct term is "dense")
->
said of an algebraic structure such that for every x != y there exists z such that x < z and z < y
->
characteristic ("nimi"?) of a set ("poki") where for ("tawa"?) all ("ale") thing ("ijo") different ("ante" + "lon"?) than other ("ante" + "nimi"?) thing ("ijo") there exists ("lon") a new ("sin") thing ("ijo") such that the first ("wan" + "nimi" + "ijo"?) thing is less ("lili") than the new thing ("sin" + "ijo"?) and the new thing ("sin" + "ijo"?) is less ("lili") than the second thing ("tu" + "nimi" + "ijo"?)
Would need some way to join into a grammatical structure and probably a preamble that establishes that the context is mathematics.
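For anyone who wants the density property spelled out concretely, here is a minimal Python sketch. It uses exact rationals as a stand-in for the reals, and the helper name `between` is my own invention for illustration. I also order the two values first, since the "x < z and z < y" form only makes sense when x is the smaller one.

```python
from fractions import Fraction

def between(x: Fraction, y: Fraction) -> Fraction:
    """Return a value strictly between two distinct values (their midpoint)."""
    if x == y:
        raise ValueError("need two distinct values")
    lo, hi = min(x, y), max(x, y)
    return (lo + hi) / 2

# Repeatedly squeeze the upper bound toward the lower one:
# there is always room for another value in the gap.
a, b = Fraction(1, 3), Fraction(1, 2)
for _ in range(5):
    z = between(a, b)
    assert a < z < b
    b = z
```

The loop never runs out of room, which is the "arbitrarily small differences" part of the original sentence.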
Sorry, I had just deleted it from my history without realizing that it would break the share link. Updated the link with a new session demonstrating the same thing now.
What about going the other way? I'm on chat gpt3.5 FWIW, but when I ask it to translate simple phrases from English to Toki Pona (electric car, barking dog, etc) it hallucinates words that aren't in the ~120 word dictionary I found.
Regardless, there is a very limited dictionary to work with in Toki Pona, so communicating extremely specific ideas seems like it is always going to be problematic. The language doesn’t even have a specific word for “dog”.
I always wanted to train an LLM from scratch. There is an interesting project that does just that with Esperanto and my immediate thought was I'll do it with Toki Pona. Unfortunately, I had vastly underestimated the size of the existing Esperanto corpus and overestimated the size of the Toki Pona corpus.
So, I gave up on that idea because I believe that there is not enough training data, but I'd love to be proven wrong.
GPT3(?) is not that good, making quite basic mistakes, as in, I can pick them out and I'm an absolute noob at toki pona. GPT4 link that someone posted in a sibling comment gives 404 for me so I can't judge that, please include versions when saying that some LLM can do it :)
Here's an excerpt from the linked ChatGPT conversation:
> [User]: Please translate this Toki Pona into English:
> [User]: o! mi sona e toki pona. mi lukin sona e toki mute... taso mi toki ike e toki ale. toki mute li jo e nimi mute. nimi mute li pona ala tawa mi. toki pona li pona e mi tan ni. toki pona li jo nimi lili. nimi lili li pona e me
> [GPT-4]: Here's a translation of the provided Toki Pona text into English:
> [GPT-4]: "Oh! I know Toki Pona. I have studied many languages... but I speak all languages poorly. Many languages have many words. Many words are not good for me. Toki Pona is good for me because of this. Toki Pona has few words. Few words are good for me."
> [GPT-4]: Note: Toki Pona is a minimalistic constructed language with a small vocabulary. The translation may not capture all the nuances, but this is a close representation of the message in English.
You can see what happens when you ask an LLM (in this case I believe it's GPT-3.5) to use Toki Pona as the primary language while also providing English translations:
My initial impressions of Toki Pona were that you’d struggle to communicate anything more than basic thoughts but people in the community have told me they’ve had robust philosophical conversations. Seems like a wonderful group of people in my interactions.
That's part of the idea. It's that you slow down, try to figure out what exactly the other person means by what they are saying. In a language with a fixed vocabulary, context becomes even more important than normal.
Sounds like the opposite problem from lojban, where having highly precise conversation is possible (and like most of the point) but effortful to achieve.
Conlangs (constructed languages) are fascinating! Apart from Toki Pona, I can recommend checking out Esperanto and Solresol, two other attempts at international languages.
I wonder how fast a community could learn Toki Pona if that was the only way of communicating and nothing else was allowed? It'd be interesting to see who has less and more difficulty sectioned by age, culture, education level, etc...
Also worth mentioning that Toki Pona was created by Sonja Lang (a fellow Canadian <3) during a period of depression. Here's her main website where you can see more of her work: https://www.lang.sg/
All of these are glorified Swadesh lists. The concept dates back to the 1950s.
Semdom.org (and/or rapidwords.net) has some useful checklists if you're looking to create a comprehensive vocabulary or map concepts across existing ones.
Christian Richard created Toki Pona after some years of activity in the Esperanto movement, so the two languages have a shared history. This is also why people miss the point when they complain on these HN threads about Toki Pona apparently not being a full, usable language: there was already Esperanto for that, Toki Pona was intentionally created as an experiment, a sort of conlang Oulipo.
Apparently the creator now goes by that name. I was writing out of my own memories of knowing the person under a previous name in the Esperanto and general conlang community over two decades ago.
I am quite interested, but a bit sceptical. On the one side, it would be great to have a language which you can learn in a couple of weeks due to its simplicity and small vocabulary. On the other side, as a programmer I immediately get the notion that a large vocabulary might just get replaced by more complex expressions. This is basically a mathematical necessity. And it is something which we can observe in real life anyway. Whenever people try to say something in a language where they lack the vocabulary, or the language itself lacks a specific term for a new thing, they will use compound expressions. Some will stay, others will gradually be replaced by a specific term.
Also, I find it highly interesting how groups of people will evolve branches of languages, in the large usually called dialects. But also in the small, like with any customs and styles, local variants are developed. Seems to be a very fundamental human trait. That can be an age-dependent variation - most age groups have their version of a language, and also culturally. Like certain groups of immigrants picking up their style of the local language. And on the big scale, look at English, which seems to be the closest to a universally spoken language so far. There are quite some flavors depending on which continent it is spoken on.
The reason we have more than 120 words is because you need more detail than that.
I could say nuclear reactor, or I could say “hot danger room cook mix build make sun” or some other inane lengthy sentence mushed from smaller words. It’s inefficient and imprecise.
This is like an assembly language. Simple but unusable because of the simplicity and ambiguity.
i mean that essentially already happens. all languages share an underlying grammar and im pretty sure that is what allows NLP algorithms to work, but i could be wrong
I agree and I really enjoy the simplicity of the concept. Though I don’t believe anyone is suggesting using it as a full-time replacement for any other language.
Well I guess toki pona is different things to different people, but for me it's about simplification of life using this language as a tool. Kind of like a monk language.
I've read the coffee example used in these comments, so I'll stick with that. If you think life is complicated when things like a double espresso pumpkin spiced ice caramel latte are a normal order in some coffee shop, you can just reject the notion of its existence.
Is coffee even too specific? Perhaps. How about just needing a beverage? toki pona will have your back with that one. Even a "hot beverage". You might get a surprise, perhaps tea, cocoa, or coffee, but this is all you really require.
When monks need to eat, they take a food vessel (bowl) and ask people to fill it with food.
If you can get through life with the simplest of communication, maybe you have simplified your life adequately.
Would it be possible to train an LLM from scratch that would speak Toki Pona? 120 word dictionary over a reduced alphabet would mean a tiny number of possible tokens and theoretically a model could be smaller than the ones used in "tiny stories" experiment (where a simplified almost childish English has been used). Maybe even a local machine would be enough to train it. I wonder if there is a large enough dataset for Toki Pona or if there is a sensible way to synthesize one? I'm no expert in LLMs or Toki Pona, though.
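To make the "tiny number of possible tokens" point concrete, here is a rough sketch of a word-level tokenizer in Python. It is only an illustration under my own assumptions: the function names `build_vocab` and `encode` are made up, the two sample sentences are taken from elsewhere in this thread rather than from a real corpus, and punctuation and unknown words are not handled.

```python
def build_vocab(sentences):
    """Map each distinct word to an integer id, in order of first appearance."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab

def encode(sentence, vocab):
    """Turn a sentence into a list of token ids (raises KeyError on unknown words)."""
    return [vocab[word] for word in sentence.lower().split()]

# Two sample sentences quoted in this thread, standing in for a corpus.
corpus = [
    "tomo tawa mi li jo e kala mute",
    "mi wile tawa wawa",
]
vocab = build_vocab(corpus)
ids = encode("mi wile tawa", vocab)
```

With a closed dictionary of roughly 120 words, the whole vocabulary table would stay about that small no matter how much text you feed it; whether there is enough text to train on is a separate question.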
Heard about the language about 2 months ago and yes, it's fascinating.
Also it's understood by ChatGPT (at least version 4).
I asked it for "my hovercraft is full of eels" and got
"tomo tawa mi li jo e kala mute" which a speaker of Toki Pona
confirmed is a good translation. (Roughly: "My moving house has many fish")
Asking ChatGPT to translate it back a month later got "My vehicle contains many fish." which is not bad for a language that relies on context a lot.
interesting project, for fun yes, but it seems it's based on an outdated assumption that 'thoughts are determined by language' [1] a sort of buy-one-get-one free.
However, human languages haven't been widely adopted with much success when designed/created top-down. Even more advanced and well-developed languages such as Esperanto haven't taken off, though they are more widely adopted.
Personally I agree with Prof Steven Pinker, that Language is an Instinct, and that it evolves in communities, in stages, such as 'pidgin -> creole -> first/native language', most importantly developed & fortified by children.
The human brain has capacity for storing, processing and remembering tens of thousands of words, example [2]:
- 13K words for an average 3-year-old
- 60K words on average by the time they graduate from high-school
- a 3-year-old is likely to remember a new word casually mentioned in a conversation for up to 5 weeks
you may save in storage at the expense of processing, orchestrating, learning, and rather crucially accessibility. Except for fun, it's kinda pointless otherwise.
[1] Sapir-Whorf hypothesis of linguistic determinism "that people's thoughts are determined by the categories made available by their language, and its weaker version, linguistic relativity, stating that differences among languages cause differences in the thoughts of their speakers".
I found it curious that you can ask chat gpt to talk to you in toki pona.
Explain in toki pona what is toki pona.
ChatGPT: toki pona li toki pi toki lili. ona li pali e nimi mute lili tan toki mute. ona li jo e nimi luka wan e nimi tu. nimi luka li jo e nimi tu lon toki mute. nimi mute li ken toki e sitelen sin sama nimi luka. nimi lili mute li pona tawa toki pi toki pona. sina ken toki e ni: "toki pona li toki tawa pi toki lili."
Can someone who speaks toki pona tell us if he's doing a good job?
You can ask ChatGPT to talk to you in just about anything - Lojban, say, or Old Norse. And it will usually do so, but the quality will vary considerably depending on the language (e.g. it doesn't know Lojban nearly as well as it thinks it does).
The thing missing from this analogy is compilation. RISC machine code isn't supposed to be manually written most of the time. I wonder if there's something similar that could be applied to human communication? As in, an intermediate simplified language that's rarely written or read on its own, but instead provides disambiguation between human languages?
Toki Pona is very ambiguous so it's actually harder to parse. All words can be used as any part of speech, so the parsing itself is often derived from context.
E.g. "mi wile tawa wawa" could be parsed as:
- "mi (wile) (tawa wawa)" ("I want, towards strength")
- "mi (wile tawa (wawa))" ("I want to go powerfully/quick")
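The two readings above can be enumerated mechanically. Below is a small Python sketch, not a real Toki Pona parser: the word sets are an illustrative subset I picked, the function name `readings` is hypothetical, and it assumes a one-word subject such as "mi" with no "li" or "e" in the sentence.

```python
# Illustrative subsets only; real usage covers more words and more roles.
PREVERBS = {"wile", "ken", "kama"}
PREPOSITIONS = {"tawa", "lon", "tan", "sama", "kepeken"}

def readings(sentence):
    """List candidate parses for a sentence of the form '<subject> <word> ...'."""
    subject, *rest = sentence.split()
    results = []
    # Reading A: a later word acts as a preposition and opens a phrase.
    for i in range(1, len(rest)):
        if rest[i] in PREPOSITIONS:
            results.append(
                {"subject": subject, "verb": rest[:i], "prep_phrase": rest[i:]}
            )
    # Reading B: the first word is a preverb modifying the verb phrase after it.
    if len(rest) >= 2 and rest[0] in PREVERBS:
        results.append({"subject": subject, "preverb": rest[0], "verb": rest[1:]})
    return results

parses = readings("mi wile tawa wawa")
```

Both parses come out of the same four words, and nothing in the sentence itself says which one was meant; that choice is left to context.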
toki pona is not an attempted universal language! it is a pidgin between a number of languages but neither its creator nor its community wants it to be a universal language.
But you could say "ko lete", "walo lete", "ijo lete", or "lete", I guess. Or "telo lete pi kiwen mute ala" which is kind of a nuisance. ("Not-very-hard cold water.")
I think "telo lete walo" or "ko lete walo" would be widely understood.
This is like writing a new computer language which doesn't contain functions or any equivalent, and where instead of calling a function you just inline its contents.
Nothing prevents you from defining a new word and then using it.
The problem, rather, is that any such definition will be confined to that conversation, or at least that group of speakers - there are no "reusable libraries".
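To stay with the programming analogy, here is a toy Python sketch of what "confined to that conversation" looks like. Everything in it is hypothetical: the `Conversation` class is made up for illustration, and "kafe" is an invented coinage standing for "telo pimeja", not an actual Toki Pona word.

```python
class Conversation:
    """Holds word definitions that exist only inside one conversation."""

    def __init__(self):
        self.definitions = {}

    def define(self, word, phrase):
        """Coin a new word as shorthand for a longer phrase."""
        self.definitions[word] = phrase

    def expand(self, sentence):
        """Replace coined words with their phrases; leave other words alone."""
        return " ".join(self.definitions.get(w, w) for w in sentence.split())

chat_a = Conversation()
chat_a.define("kafe", "telo pimeja")  # hypothetical coinage
chat_b = Conversation()               # a different group of speakers

expanded_a = chat_a.expand("mi wile e kafe")
expanded_b = chat_b.expand("mi wile e kafe")
```

In the first conversation the coinage resolves to its phrase; in the second it stays an unknown word, which is the "no reusable libraries" problem in miniature.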
There once was a tribe that had no idea that the sky is blue. People from afar who were more technologically and scientifically advanced tried to explain that the sky is blue, but the tribe members had no conception of this or capacity to really understand what it means for the sky to be blue. They had never considered what color the sky is.
That is because in their language, they had no words to describe the concept of the sky having a color. So they didn't even comprehend or notice that the sky was blue.
Be careful with any trend to publicize or normalize a smaller language that can be shared with everyone. It's a trap. It's a trap so that people lose sight and memory of things that exist. Who wants you to forget what exists, and why? Who wants to take away words and what words are those?
To me, toki pona and this Newspeak-like idea feel almost like opposites. They just happened to share the characteristics of a smaller or "simpler" vocabulary than English. To speak toki pona to me is to remove jargon and surface level speech, and to think more about the real meaning of what is coming out of your mouth, while Newspeak is more like jargon taking over the whole language until you only repeat Goodthink because that's the only words you have. toki pona makes you think about what you want to say, while Newspeak makes you think about what the party wants you to think.
I personally think of toki pona as a tool of self-reflection and understanding because of how you have to break things down to then speak the meaning out, but maybe others use the language differently. It is in no way a trap to lose touch with reality though. It's not removing words for reality, just expressing it in a different way.