In high school I was the founder of our (small) Esperanto Club. I gave Esperanto lessons each week at our club meeting, designed as a one-year course. Usually, by the end of the year, anyone who attended each week and did moderate self-study outside of class could communicate in Esperanto really well. We even had one international student who really struggled with English and who, after a few months of attending our club, could communicate with us in Esperanto more fluently than in English!
We had a freshman join during my senior year who was a huge advocate of Toki Pona as a conlang. We decided to devote a month of the club to Toki Pona instead of Esperanto, and it was mind-boggling how quickly everyone was able to get a grasp of it. Granted, Toki Pona is much more "wordy" than Esperanto: you often have to use many words to convey an idea that you could express quickly in a language with a larger vocabulary. Regardless, it was an absolute blast to learn and I'm surprised by how much I remember.
Once I got to college there seemed to be a severe lack of interest in Esperanto, or any conlang for that matter, amongst the student body. I could never really keep enough people interested the same way I could in high school, so I gave up after my sophomore year. I really miss teaching people Esperanto. I believe the club at my high school still runs to this day; I'd love to be able to go back and visit one day.
I bloody wish there were good modern Esperanto courses. I found a Michel Thomas method Esperanto course once, but it was incomplete. I would really love to buy a complete one (even for a price higher than what other languages cost). Intuition suggests I'm not the only one interested, although there probably are not too many of us. This means the demand, albeit humble, certainly is higher than the supply. You might consider running a crowdsourcing campaign and making some good modern Esperanto resources if you really feel interested in teaching people Esperanto.
I personally used Edukado[1] for teaching resources for my course. They have a ton of lesson plans, from beginner all the way up to advanced, all supplied for free by the community. This is by far the most popular resource for instructors; however, it may not be as suitable for a beginner, as most of the resources are written in Esperanto. Maybe still worth checking out, though.
It's nice but fairly boring (e.g. too many repetitions of the same stupid questions) and not even nearly as good as Michel Thomas or Pimsleur.
I wish somebody would just take something like one of those courses for e.g. Spanish and replace the Spanish words with Esperanto. Obviously that would mean pirating the actual method, but it would be blatantly hypocritical of me to pretend I care. With a crowdsourcing campaign, however, there is a chance to do that (or something even more interesting) officially, pay the royalties, etc., and I would gladly contribute a reasonable amount of money.
I've said it before, but the problem with having only a hundred words is that they are the hardest one hundred words in any language. To a foreign learner, easy words are hard, and hard words are easy.
To someone learning English as a foreign language, "banish" is easy to understand: it has just one meaning, so once you memorize it, you can recognize it whenever it's used.
"Turn out", on the other hand, has half a dozen meanings and you have to rely on context.
In the extreme case, imagine explaining the word "the" to someone whose native language doesn't have an article.
> imagine explaining the word "the" to someone whose native language doesn't have an article
Much like "turn out", there are many different ways in which "the" might be used. But as contrasted with "a" ("explain it to someone who doesn't have articles"), it's not that bad -- "the" is used to mark noun phrases that are already present in the conversational context, and "a" is used to introduce new noun phrases into the conversational context.
More generally, "the" and "a" are markers for what is known in linguistics as "definiteness", with "the" being an unspecialized definiteness marker and "a" being a more specialized indefiniteness marker. But there are many other determiners that require or mark definiteness -- possessives like my and their are definite; demonstratives like this and that are definite; some is a fully general indefiniteness marker...
(Compare "there's some guy outside scaring customers away" with "there are some guys outside scaring customers away", then consider that a would only be permissible in the first one.)
> To a foreign learner, easy words are hard, and hard words are easy.
Yep. I've pointed out before that most people have the instinct that when a foreigner doesn't know the language well, you should talk to them the same way you'd talk to a small child. But that's completely backwards. The typical small child only knows common words and can handle any native grammar at all. A foreigner will have trouble with common words, effortlessly comprehend rare words (after looking them up), and have extreme trouble with grammar beyond the basics.
I had a Chinese tutor once who was embarrassed when a rare word came up in some reading, and assured me that this was a "really fancy word" and it didn't matter if I didn't know it. The assurance was not needed -- at the time, my vocabulary was negligible; to me there was no difference between a really fancy word and its dirt-common equivalent. It's taken a lot of work to get to the point where I can be annoyed to encounter a fancy substitute for a word I feel I should have known.
> "the" is used to mark noun phrases that are already present in the conversational context, and "a" is used to introduce new noun phrases into the conversational context.
> (Compare "there's some guy outside scaring customers away" with "there are some guys outside scaring customers away", then consider that a would only be permissible in the first one.)
This rule is OK but it fails for the following case: A man you've never seen before runs into your office and says "The President is outside!" Now, the President wasn't part of your conversational context, in fact there was no conversational context, nor any context with the speaker. But "a President is outside!" is clearly wrong.
More generally, any attempt to give simple rules for this part of English grammar will fail, and will not generalize to other languages with articles. The rules of natural languages are extraordinarily complex, although we have the illusion that we can understand them explicitly. That's why rule-based NLP systems have always seemed attractive, and have always failed.
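To make the failure concrete, here is a minimal sketch of the discourse rule quoted above ("the" for noun phrases already in the conversational context, "a" for new ones), written as a toy rule-based generator. The function names `choose_article` and `narrate` are made up for illustration; this is not taken from any real NLP system.

```python
def choose_article(noun, context):
    """Naive discourse rule: 'the' if the noun was already mentioned, else 'a'/'an'."""
    if noun.lower() in context:
        return "the"
    return "an" if noun[0].lower() in "aeiou" else "a"


def narrate(nouns):
    """Attach an article to each noun in order, tracking what has been mentioned."""
    context = set()
    phrases = []
    for noun in nouns:
        phrases.append(f"{choose_article(noun, context)} {noun}")
        context.add(noun.lower())
    return phrases


# The rule works for an ordinary first mention followed by a second mention...
print(narrate(["dog", "cat", "dog"]))  # ['a dog', 'a cat', 'the dog']
# ...but with no prior context it produces exactly the form called wrong above.
print(narrate(["President"]))  # ['a President']
```

Patching this would mean adding a special case for shared world knowledge, then another for fixed expressions, and so on, which is the pattern being described.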
> any attempt to give simple rules for this part of English grammar will fail, and will not generalize to other languages with articles
Can you elaborate? (Genuinely interested: I like languages, don't know much about NLP).
GP's explanation was just wrong, but articles are pretty well defined, easy to explain, and generalize well to other languages.
- "The" (definite article) marks a noun that refers to a particular instance of the class designated by that noun, distinctly recognizable from other instances.
- "A" (indefinite article) marks a noun that refers to a generic instance of the class that isn't explicitly recognizable among other instances.
> The Philippines, The United States, The Great Lakes
I see nothing wrong? The Philippine (Islands). The United (States). The Great (Lakes).
> the poor
This is a challenge for NLP but only due to the ellipsis. "the poor (people)", where "people" is elided. I don't think most native speakers realize "poor" is an adjective here.
> a means of production
> a number of hamsters, the number of hamsters
> a shirt, but a piece of garment, and a pair of pants
> The Philippines, The United States, The Great Lakes
The point is that they are all names for a unique entity. There are no multiple Philippines - there's only one. Yet we should say "The Philippines", while we cannot say "The Switzerland."
Similarly, we aren't talking about "a particular group of united states among multiple such groups" - for example, nobody calls Germany "The United States", even if it's made of many states.
> "the poor (people)", where "people" is elided.
The problem with this analysis is that "the poor people" can be used to mean "this particular group of poor people we are talking about", but "the poor" can't - it's not a definite group of people.
E.g., "I met those refugees on the road, so I asked the poor a few questions." (??)
> a shirt, but a piece of garment, and a pair of pants
I'm saying: why can't we say "a garment" or "a pants"? (Hmm maybe we can say "a garment"? I'm not sure.) Seriously, a pants should be a thing: it's clearly one item. Has anyone ever seen someone wearing "a pant" (i.e., half a pair of pants)?
> There are no multiple Philippines - there's only one. Yet we should say "The Philippines", while we cannot say "The Switzerland."
I think kaoD's point is that islands, states and lakes are not unique entities, so "the" does designate a particular instance of a class in these examples.
I think he might have a point: your three examples are proper nouns that happen to include "the", and they evolved from "definite article (+ attributive) + noun", where the definite article was used to pick out a particular instance of a class.
> Yes it is. It's a specific subset of people (the poor) that can be picked apart from the rest (the non-poor).
I think the issue is that it doesn't necessarily refer to the people mentioned in the first clause. To me it sounds like a non-sequitur, e.g. 'I met those refugees on the road, so I [went elsewhere and] asked the poor a few questions [to see if they would have some insight on the refugees I met earlier].'
I took the trouble to classify all of these according to Huddleston & Pullum ( https://www.amazon.com/dp/0521431468 ). For most of these, they're just fixed (or nearly-fixed) expressions.
1. The Philippines / The United States / The Great Lakes / the poor
5.8.4(b), Fixed expressions containing the definite article.
"In such cases, it is largely arbitrary that the definite article is required rather than a bare noun."
(kaoD says 'I don't think most native speakers realize "poor" is an adjective here', but there's good reason for that -- it is not ordinarily possible to construct a the+adjective phrase.)
Examples: [17iii] I have (the) measles.; [17v] We caught the bus.
2. A means of production
This is not at all idiomatic (Marxist rhetoric invariably refers to the means of production), so this is a simple example of 5.6.2, The indefinite article a.
"The indefinite article a is the most basic indicator of indefiniteness for singular count nouns." (here, means)
3. A number of hamsters
"Both [number and couple] occur in singular form with an obligatory determiner (usually a, but others are possible []), and in addition number can occur in the plural, and take a limited range of adjectival modifiers"
"The definite article the does not occur with number in the sense we are concerned with here."
Examples: [58i] We found huge numbers of ants swarming all over the place.; [58v] An unusually large number of people have applied this year.
4. The number of hamsters.
5.6.1, The definite article the
"The definite article the is the most basic indicator of definiteness."
"Use of the definite article [] indicates that I expect you to be able to identify the referent -- the individual ladder, the set of ladders, the quantity of cement [or quantity of hamsters] that I am referring to."
In order for this phrase to be grammatical, you need to use number in its ordinary, non-quantificational sense, as in "That depends on the number of hamsters you've put in the cage".
Examples: [1] Bring me the ladder!; [6vii] They are interviewing the man who mows her lawn.
5. A shirt / a piece of furniture / a pair of pants
You appear to be talking about the difference between mass nouns (pants / bread / furniture) and count nouns (shirt). This is discussed in 5.2 ("Overview of noun classes and NP structure"), but has absolutely nothing to do with definiteness or choice of determiner. However, a cannot be used with mass nouns because a requires a count of one, and mass nouns cannot be counted.
6. A lion chased him. (a particular lion)
5.6.2, The indefinite article a
Exactly the same as in "a means of production".
7. A lion is a ferocious beast. (lions in general)
5.8.3(i), Generic interpretations
"The interpretation of singular indefinites in the same context, like a lion in [iia], is correspondingly 'any lion that exists'."
"The generic use of a singular definite like the lion is also possible in the context given in [iiia], but is an example of the restricted 'class' use of the definite article (see 5.8.4 below). If instead of lions we were talking about doctors, the definite singular would not generally be possible"
"In The lions are ferocious beasts, the plural definite the lions would also obligatorily be interpreted non-generically."
"With predicates that can only be applied to a set, a singular indefinite generic such as a lion is inadmissible"
Examples: [14i] Lions are ferocious beasts.; [14ii] A lion is a ferocious beast.; [14iii] The lion is a ferocious beast.; [16c] This chapter describes the English noun phrase.
----
I have no real point; this was mostly for fun.
However, I'm a little confused as to why "a means of production" / "the number of hamsters" / "a shirt" / "a lion chased him" are in your list, since they're typical examples of the most common use of the / a as described above.
There is some overlap between the poor and the class use of the definite article, but it's not a perfect match, because it isn't possible to use poor in the same sense without accompanying it with the.
It's a mass noun just because the adjective turned into a noun (sorry, don't know the technical terms in English) when the real noun was elided.
In "the poor people", "poor" is an adjective. If you elide "people" and turn the phrase into "the poor", it has to be a noun (otherwise we'd have a noun phrase without a noun).
Though IMHO it's better analyzed as an adjective with an elided noun, but I'm afraid I lost that war ages ago :shrug:
But I personally prefer to analyze what makes sense, not what's written (considering language is language). I'd rather have a regular case with an ellipsis than "oh, it's just a fixed phrase" and a zillion other corner cases.
I'm not agreeing with him. But "it is a mass noun" has its own problems. It isn't possible to use "poor" in that sense without using the full phrase the poor. I think it's best analyzed as a fixed phrase.
> But "it is a mass noun" has its own problems. It isn't possible to use "poor" in that sense without using the full phrase the poor.
Sure, there's a very broad class of syntactically plural mass nouns without singular countable forms for which that is the case; interestingly, countable nouns with a similar relation to adjectives can become mass nouns when their plural form is used with “the”.
> Sure, there's a very broad class of syntactically plural mass nouns without singular countable forms for which that is the case
Can you give an example of what you're talking about? If you mean something like "scissors", that's not true -- it's perfectly possible to say "I like scissors".
> Can you give an example of what you're talking about?
Pretty much all (there's probably some exceptions, because English) nouns that are syntactically plural and written identically to an adjective are mass nouns used exclusively with the definite article and with the sense of “those (usually specifically people, but this can vary by context) to whom the adjective applies as a mass”. In addition to poor, you have strong, weak, educated, etc.
Nouns where there is a singular form written identically to the adjective tend to have a plural with two senses, both a countable sense that is a normal plural of the singular form and an uncountable sense identical to that of spelled-like-adjective plurals. Examples include Italian/Italians, etc.
As a technical note, I wouldn't call those words "syntactically plural". They are syntactically plural in that they can be the subject of verbs inflected for a plural subject, but that's not the best criterion. There are a few relevant questions:
1. Does it have the form of a plural noun?
2. Does it take plural verb agreement?
3. Does it take singular verb agreement?
Phrases such as "the poor" are hard to call "syntactically plural" because they fail criterion 1. They pass criteria 2 ("yes") and 3 ("no"), but nouns that are syntactically singular can also pass criterion 2.
Scissors:
1. Yes. Scissors appears to be a plural noun in terms of raw morphology.
2. Yes. Scissors has obligatory plural verb agreement.
3. No. Despite the semantics, scissors has obligatory plural verb agreement.
Family (syntactically singular, semantically plural):
1. No.
2. Yes. My family are all painters.
3. Yes. My family is coming to visit me this summer.
With only the verb agreements supporting the classification of "the poor" as plural, you get into an unresolvable argument over whether singular verb agreement doesn't happen because it's impossible (syntactically plural) or because "the poor" are always being considered plurally (syntactically singular).
If you want to consider "the poor" as a mass noun, you also run into the problem that there are no syntactically plural mass nouns. Mass nouns are always syntactically singular. (Well, this depends on how you want to classify scissors and pants, I guess.)
English has no “form of a plural noun”; it has a regular pattern of how, for countable nouns, singular and plural forms relate to each other (and even for such pairs, there are a number of irregular patterns with multiple examples, and a number of pairs with sui generis relations).
“Family” is a singular collective noun; singular collective nouns are a special case and can take plural or singular verbs. “Poor” is a plural mass (not collective) noun; it takes exclusively plural verbs.
> If you want to consider "the poor" as a mass noun, you also run into the problem that there are no syntactically plural mass nouns
There's a whole broad class of them that are structurally just like "poor" as I mentioned and gave examples of in the grandparent post. Sure, if you ignore that broad and very frequently used class of mass nouns, there aren't any other plural mass nouns. But...
> Mass nouns are always syntactically singular. (Well, this depends on how you want to classify scissors and pants, I guess.)
Scissors and pants are normal countable (and so, not mass) plural nouns that just happen to not have a singular form because the quantum unit is conventionally taken to be a pair. This isn't really in question anywhere.
> Scissors and pants are normal countable (and so, not mass) plural nouns that just happen to not have a singular form because the quantum unit is conventionally taken to be a pair. This isn't really in question anywhere.
I don't have a strong position on this, but they aren't normal countable plural nouns. "Two pants" is ungrammatical; it must be "two pairs of pants".
(Expanding a bit: it's easy to say that "one pants" is impossible because pants is plural. But pants require counting in pairs regardless of how many you're talking about; it's not just a restriction on grammatical number. Whether you think "four pants" should correspond to "four pairs of pants" or "two pairs of pants", it isn't possible to say "four pants", and this is not normal behavior for a count noun.)
Another example: "the Chinese people" (we'd just say "the Chinese"), 10M results.
> It seems more likely to have originated as a direct calque of French les pauvres.
Definitely. Spanish "los pobres" too. Phrases with adjectives as nouns (again, I'd say originating from ellipsis) are very common in romance languages.
E.g. "los rojos" = "the red ones"
Notice how in English my proposed adjective ("poor") lacks the plural "s" (as adjectives do in English), unlike Spanish and French, where adjectives agree in number with their nouns.
That's not a response. Those results are overwhelmingly not synonymous with the phrase "the poor", which would be clear immediately if you looked at them.
If I search for "the poor people", I get told there are 22M results. The entire first page is taken up by references to a proper noun, The Poor People's Campaign; searching for '"the poor people" -campaign' cuts reported results to 12M, and the new results prominently feature The Poor People's Crusade. This is not a compelling argument that the phrase "the poor people" is permissible in the same contexts that allow "the poor" (which would be necessary if this were a case of ellipsis).
> Another example: "the Chinese people" (we'd just say "the Chinese"), 10M results.
That's a much better example; I'm willing to concede that "the Chinese people" and "the Chinese" mean the same thing and can be used in the same ways. But you haven't made the argument that the equivalence between "the Chinese people" and "the Chinese" transfers over to an equivalence between "the poor people" and "the poor".[1] I claim you cannot do this, because "the poor people" and "the poor" are not equivalent.
Note also that the parallel construction to "the Chinese" is "the Italians", not "the Italian". But we obviously can't claim that "the Italians" is ellipsed from "the Italian people". We also shouldn't claim that "the Chinese" is ellipsed from "the Chinese people".
[1] For one thing, people in "the Chinese people" is not even the same word as people in "the poor people"; "the poor people" uses Merriam-Webster's sense 2 ("plural form of person"), while "the Chinese people" uses sense 5 ("singular noun of which the plural is peoples; a body of persons that are united by a common culture, tradition, or sense of kinship"). https://www.merriam-webster.com/dictionary/people
As I noted, there are many ways to use "the". You're correct that the rule I gave doesn't cover most of them. You're mostly wrong in the details:
> Now, the President wasn't part of your conversational context, in fact there was no conversational context, nor any context with the speaker.
This is very wrong; the President, like all proper nouns, is part of the conversational context. You may validly assume that your listener knows the identity of "the President" in the same way you assume that they know the meaning of the word "outside". For this reason, all proper nouns, arthrous [= marked with "the"] or not, are always definite.
Some English proper nouns are arthrous and some aren't; there is no rule governing this.
> But "a President is outside!" is clearly wrong.
It's not ill-formed in any way[1], though it raises some questions about how you know there's a president out there without also knowing which president it is. But more importantly, I restricted my comment to cases where "the" contrasts with "a", and this is not such a case. "The President" and "a president" are not parallel formations.
[1] If you insist on capitalizing President [referring to a specific known individual], then it is indeed ill-formed. Proper nouns are definite and "a" is indefinite.
> Now, the President wasn't part of your conversational context, in fact there was no conversational context, nor any context with the speaker.
That's an awesome example! My proposed patch would be: singular instantiated concrete nouns in English always need a determiner, and we should use the most specific determiner that pragmatics will give a plausible meaning to. (So also "our mother lives in Boston" is preferable to ?"the mother lives in Boston" but the second option is possible in some contexts.)
> More generally, any attempt to give simple rules for this part of English grammar will fail, and will not generalize to other languages with articles.
Yeah, that's a great point. For example, Portuguese commonly requires articles with abstract nouns, where English would forbid them.
"O amor é um dos maiores prazeres da vida."
'(the) love is one of the greatest pleasures of (the) life'
Someone learning one language from the other might first think "oh, great, this language has definite and indefinite articles, just like my native language -- it's no problem, I already understand this". But that understanding will fall down when confronted with situations like abstract nouns.
Ancient Greek has definite articles but no indefinite articles, and I was taught that definite articles are mandatory with people's proper names (ὁ Πλάτων, '(the) Plato'). But it's apparently more complicated than that, because for example in
sometimes he's called ὁ Πλάτων and sometimes Πλάτων (for reasons that I don't grasp). And similarly in Portuguese, it's common to refer to people in the third person by a proper name plus definite article (o João, a Maria) but neither obligatory nor universal. My intuition is that it's a very mild honorific and that it's only done when we think the listener is already familiar with the person referred to, but I'm not a native speaker and I bet that intuition isn't quite right either.
So yeah! These rules feel super-simple, and they're really not.
It's probably true that in real natural languages, the 100 most frequent words are also the most ambiguous. Languages have evolved this way because (native) speakers can easily disambiguate, so there's no need to be precise all the time. But Toki Pona is a constructed language, so it might not have this property.
But maybe not the most common hundred concepts. "Turn out" is ambiguous but the notions of "expulsion" or "away" or "count" and "people" may not be.
It would be extremely useful to have a list of the most common core concepts shared by all cultures. Some may be tricky, like colors, but there must be a great many in common because it is possible to communicate after all.
Totally agree, and I have data to support this buried in some drive somewhere.
Interestingly, I had a famous SLA researcher/professor tell me that this was a non-issue. I assume it was because she dealt mostly with Dutch learners of English, but I found it to be fantastically short-sighted.
Reading the article, I realized there must be some optimal sizes for vocabulary. I say sizes, plural, because there must be some layers there in terms of usage frequency. Toki Pona clearly goes to an extreme of simplicity in vocabulary space.
But what is "optimal"? You would expect languages, in their natural evolution, to tend towards it all the time.
Any system is fairly trivial for native speakers — they just learn it.
For non-native learners, the optimum level is probably the closest to what their native language is. A simple example: it’s easier for a native speaker of Japanese to learn Mandarin, since there is substantial overlap in vocabulary via the significant Chinese influence on the Japanese language.
That said, all other things being equal, the number is probably on the low side. As a simple example, Indonesian seems to be a relatively easy language for folks coming from any language to learn, even if their native language is linguistically distant. I believe this is due to the relatively small amount of functional vocabulary.
In English "microscope" is "micro-scope", namely small-view.
I find it funny that many don't understand that this is how languages work, until they learn other languages. Especially with English, which is a bastardization of a lot of other languages. For example, if you learn French basically anything that is fancy in English is just the normal term in French (e.g. "house" -> "mansion" or "famous person" -> "celebrity")
Actually microscope (like most similar words, e.g. telescope, periscope, endoscope etc.) is borrowed from Greek: μικροσκόπιο, which is formed from μικρό + σκοπέω, meaning small + look at.
Actually.... we are saying the same thing. Scope comes from Greek. See here [0][1].
As we're speaking about language, "actually" would conventionally be used to correct someone. This is why there are jokes like "Where does the mansplainer get their water? The well, actually". But in this case, you did not contradict me in any way. So instead you should say: "Specifically, microscope (like...) is borrowed from Greek..." This would create a less contentious comment and still give the added value that you are trying to contribute.
When I studied German I was initially amused that the German word for television was "Fernseher" (far seer). How quaint! Until I stopped and thought for a moment what "television" literally means -- tele from Greek meaning "far", plus Latin "vision".
That's Greek. (Well, the -um ending isn't, but the rest is.)
Latin for small is parvus, and Latin for watcher is visor[1], but Latin doesn't tend to form compounds the way Greek does.
[1] Visor is literally "watcher", an agent noun formed from the verb videre "look at". Microscopium just uses an ordinary noun ending, not an agentive construction. I don't know what happens in Icelandic.
What about an English version of Toki Pona? (maybe Eng-Toki Pona?)
Why? Because English is very common already, and it seems an easy way to not only have usable words in two languages (English and Eng-Toki Pona) but also it would be useful to actually learn. (foreigners could _actually_ use this for real)
This is the biggest hurdle I have with made up languages, they have very little actual utility value.
As an afterthought: a simple charades-like game where you have the limited vocabulary on a board, card or printout, and people have to describe an event, book/story, object, movie, etc... using Eng-Toki Pona. Everyone could do it right now (even kids), and learn a new language at the same time. (one of the easiest ways to learn something is by making it fun and a game)
Edit: A reverse of this - Japanese-Toki Pona, all the same words, but in Japanese. Same with every other language (Russian, German, French, etc...). Then, after you learn one Toki derivative, you can easily pick up others. I figure it's possible you could learn the choppy/odd sentence building technique in your native language, and then add other languages later. Could also be fun as a group game once the English version gets easy. And all of these would be useful everywhere... not just speaking with other Toki Pona speakers.
That's interesting. This seems like actual work to learn though.. (850 words AND grammar rules vs Toki Pona's 100 words and grammar free-for-all)
The reason I made my suggestion is that it would be based on Toki Pona itself, with a subset of all languages as its core philosophy, which seems like a universally good idea.
But to increase adoption of this idea, I feel it needs to have broader appeal and real world value (ie, actually be able to use it somewhere). Otherwise it feels a bit like trivia or a toy.
I have long wondered whether it would be possible to identify the most basic, core words of English, and construct a dictionary such that all definitions eventually reduce to those words. That way, a speaker of a foreign language could learn the meanings of the core words by translation into their native language, and then the process of learning a new word would be: look it up in the dictionary, and if there are any words in the definition that you don't already know, look those up, and so on until everything is reduced to the basic words you already know. The question then is, how many basic words must there be, and which ones are they? I realize nobody actually learns a language like this, but it's still conceptually interesting, analogous to the idealized process of reducing a mathematical proof all the way to the axioms (which, of course, mathematicians don't actually do, but in principle they could).
This exists: https://learnthesewordsfirst.com/ is a "multi-layer dictionary". There are 360 base words; the very first ones are explained with images, then each word is defined using the previous ones.
Then there's a list of 2000 more words defined using only base words.
The last layer is a full dictionary whose definitions use only these 360+2000 words.
I was thinking the same thing! I think it in itself should be useful as a starting point for learning many languages, by just translating them literally into the language of choice.
Works well as a learning tool, but fails miserably for translation. The underlying assumption "the mapping between words in a pair of languages is simple" starts falling apart beyond the most basic introductory course.
Indeed. In the book detailing the language, the author says that the language was inspired by the various pidgins that have formed around the world where English words are modified to meet native phonology and combined with a simplified grammar.
The only one of these that I think doesn't work is lupa, which is a hole or opening, which doesn't necessarily have that much in common with a loop. I guess they could both be circular, but lupa applies to any kind of opening, circular or not.
linja pona - a font for the Sitelen Pona hieroglyphic system with fancy character combination logic (goes way beyond the limits of my rudimentary font knowledge) - the text is all entered in Latin but the font automatically converts to Sitelen Pona (if you try copy/paste some text from the second site linked somewhere else you'll see it).
"In 2008 an application for an ISO 639-3 code was rejected, with a statement that the language was too young.[22] Another request was rejected in 2018 as the language "does not appear to be used in a variety of domains nor for communication within a community which includes all ages".[23]"
> goes way beyond the limits of my rudimentary font knowledge - the text is all entered in Latin but the font automatically converts to Sitelen Pona
It most likely utilizes ligatures—like when you write ‘ff’ or ‘fi’, the letters are joined into a custom glyph (in proper software and context), so this font does the same for custom longer sequences. This is popular lately with programming fonts and such.
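To make the ligature idea a bit more concrete, here is a minimal sketch of longest-match substitution. This is only an illustration of the general technique, not how linja pona is actually built: a real font does this in its substitution tables, and the `GLYPHS` mapping and glyph names below are hypothetical.

```python
from typing import Dict, List

# Hypothetical mapping from typed Latin sequences to glyph names.
GLYPHS: Dict[str, str] = {
    "mi": "<mi>",
    "pi": "<pi>",
    "pilin": "<pilin>",
    "pona": "<pona>",
}


def substitute(text: str, glyphs: Dict[str, str]) -> List[str]:
    """Replace the longest matching letter sequences with their glyphs.

    Characters that start no known sequence are passed through unchanged.
    """
    longest = max(len(key) for key in glyphs)
    out: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so "pilin" wins over "pi".
        for size in range(min(longest, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in glyphs:
                out.append(glyphs[chunk])
                i += size
                break
        else:
            out.append(text[i])
            i += 1
    return out


if __name__ == "__main__":
    print(substitute("mi pilin pona", GLYPHS))
```

The underlying text stays plain Latin letters, which is why copying and pasting it elsewhere gives you the Latin spelling back.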
Yip, it uses ligatures, but it also has a system where you can say put a box around text by using the underscore character w/ square braces ("[_mi_pilin]" has the "mi" and "pilin" characters with a cartouche around them, like on the second line of the linked page below), and also I think does dynamic compounding of characters (see the current complete character list here - https://davidar.github.io/linja-pona/nimi - I haven’t asked the author how they did it, but my guess is that there’s something fun/dynamic going on).
I learned it last year as a fun exercise, and found it pretty interesting that there are so many things you can express with so few words.
However, in the end you'll end up wanting to express much more and the language is just not able to do that, you'll end up lost in the compounds, and things become unclear pretty quickly.
Yip, there are definitely topics I butt into where I hit a wall too -_- . But with time I got much better at navigating them and have had several-hour-long conversations IRL as well as online using just Toki Pona (without resorting to compounds really - I don't know if it's what you're referring to, but compound phrases with fixed meaning aren't really in the spirit of the language). There are people online who I talk with regularly exclusively using Toki Pona.
It may not be a concisely expressive language, but I like that the small number of word units means it would be easier to use as an intermediary language than Esperanto.
Another practical step could be to identify a specific minimal set of language units required for any given popular nonconstructed language. Is this already a thing?
>Another practical step could be to identify a specific minimal set of language units required for any given popular nonconstructed language. Is this already a thing?
Presumably, that would depend on your conversational goal. We could refer to the Apollo lander as a spacecar[0] in a casual conversation but that could be tedious or misleading if a NASA representative were trying to pidgin with an ESA representative about the metal casing of a booster rocket ballooning past the expansion tolerance of the o-rings, using only the 3000 most common English words.
Talking about daily life works pretty well (general chit chat - asking people how they’re feeling, what they’ve been doing, assuming you know something about their lives already there won’t be many ambiguities). Online, I often share pictures as I go (they give points of reference). So I can say “mi pali e pan suwi” (I made sweet bread) and show a picture of a cake; my friend could say “ni li pona ala pona?” (is it good or not?) and I could reply appropriately. I could say if I plan to make it again, maybe talk about the ingredient composition slightly.
Commonly also, sharing toki pona memes/making jokes, finding deliberately silly ways to refer to things. (A running game I have with a friend is referring to lots of places as “tomo awen” (waiting house). Hospital? “tomo awen”, because most of your time there is waiting. Waiting in line at a supermarket? “tomo awen”. And right now, our homes are also “tomo awen”, because we can’t go out...).
One thing I’m looking forward to doing when it gets warm/we’re not quarantined anymore, is going to the zoo, walking around, and deciding how to describe all the different animals: “soweli pi lawa sewi” (animal with a high head) for giraffe, for instance. Going to a gallery or the like would also probably work pretty well - you can point at things, talk about how you feel about them. (It would be more about immediate experience than prior knowledge I guess). Any activity/situation with a lot of ready-to-hand context is likely to work.
In activities where you have a fixed frame of reference, a language with limited vocabulary can work quite well. I played 2-player local co-op of Wilmot’s Warehouse ( https://www.youtube.com/watch?v=TAcyPIJYOx4 ), where the soul of the game is categorising packages with various random/abstract designs and deciding where to put them in a warehouse. That would probably be a good exercise for any language, but it worked in toki pona.
I met up with someone I didn’t otherwise know and we talked about our professions, where we grew up, etc. (One cheat is that in Toki Pona’s 120 word count, borrowed words aren’t included - but there are conventions for transliterating words from other languages if you want to talk about English (Inli) or America (Mewika)).
You can go shopping and cook with someone - trying to describe what ingredients you’re looking for is I guess a game, and once you have them at home you can probably issue instructions with enough accuracy using Toki Pona’s vocabulary.
What’s difficult to talk about? Time and scheduling, especially under time pressure. Trying to clear up a scheduling confusion via txt messages with a friend was absolutely zero fun, especially when misunderstandings start creeping in that can’t be quickly rectified. Scheduling discussions have been the ones where I’ve hated Toki Pona the most.
Talking about technical matters doesn’t work - I can’t talk meaningfully about maths in toki pona beyond the absolute vaguest of euphemisms - and while people have tried to state theorems/etc., it basically doesn’t work in my experience.
Talking without a concrete frame of reference can be hard. If I can’t see you, can’t share pictures with you, can’t use emoji, etc., the range of what I can communicate is severely limited. I can say I worked, I can say I’m feeling good, I can say what country I live in. I can talk about eating flat round bread with red paste on top (pizza), and that my girlfriend didn’t like it or whatever...which is still manageable chitchat I guess, but much more limited. (There are toki pona songs that work well at this level of speech though - e.g. https://www.youtube.com/watch?v=Bjlwov4JiD0 ).
It's interesting to me that you focus on the 'high' meaning of sewi, because I've always focused on the 'religious' or 'divine' meaning. So I would hear "lawa sewi" as something like 'religious head' or 'religious leader'. Probably my first parse of "soweli pi lawa sewi" would be 'the rabbi's cat' or something, rather than a giraffe. Or I would imagine an animal whose head is similar to some concept or representation of divinity. :-)
mi wile e ni: tenpo lili la mi mute ale li ken tawa tan tomo awen pi mi mute! (I want this: that soon all of us can leave our waiting houses!)
Thank you very much for this write-up! This sounds like a lot of fun, maybe I'll spend some of my quarantine time looking into it. Any recommended resources for learning?
There are a bunch of different communities/resources - I learned a lot from the "ma pona pi toki pona" discord, which is probably the best community for learning Toki Pona (people are generally friendly and generous with sharing knowledge). There's also a lot of music written in Toki Pona on youtube https://www.youtube.com/playlist?list=PL2eg_FknCSeoWL6tFBKOn... (some of them are decent!), which is nice for passive consumption.
Honestly, a language like this could be quite beneficial if it was widely used (maybe even taught in schools) as a fallback method for cases where you need to communicate with someone without having a language in common. Instead of trying to puzzle out what the other person is saying, you just switch to Toki Pona and use these very basic concepts to slowly figure things out.
This concept is used heavily in ASL (American sign language).
Simple signs can be grouped to make other concepts.
There are no conjugations or tenses -- explicit number and time are their own signs -- and articles and prepositions are usually dispensed with, much like headlines.
Example "teacher" is TEACH PERSON. Plural would be TEACH PEOPLE. Student is LEARN PERSON, etc. There are the equivalent of modifiers indicated by motion, facial expression, etc.
This sounds wordy but in practice the bit rate is about the same as spoken because some signs stand in for multiple things, plus the omissions and modifiers as above are meaning multipliers.
Thanks! I'm working on a blogpost analyzing the speed gains I got by porting my webassembly environment to Rust. So far it's about 50x as fast on average, but most of the gain is from using an ahead-of-time compiler instead of an interpreter.
It feels like taking a picture and converting it to the lowest-quality image format. Even if you know what was there previously, it's impossible to convert it back to a meaningful piece of information.
Trying to interpret sentences out of context in Toki Pona is indeed tricky!
However, "sina wile ala wile moku e telo pimeja?" (Do you want to eat black water?) can be quite accurately reconstructed as "do you want a coffee?" if you know it was said outside of a cafe.
With a given context that you can refer to (in a cafe, walking in the park, cooking food), the expressive power/range of reference of 120 words is vastly greater than, say, in a letter to a stranger. This is probably one reason why the meme community in Toki Pona is reasonably lively ( https://www.reddit.com/r/mi_lon/ ).
I don't know how you extend that metaphor to image-decompression though! (I guess if you know you're decompressing a picture of a face you can train your decompressor on other pictures of faces first :) ).
Having played with it, I will say that it was an interesting and valuable experience to try and distill something that's emotional to you down into the basic basic concepts required to try and express it, which is exactly the point of TP. It's not for trying to have real conversations in.
Everyone should spend at least some time learning about ancient languages, such as Latin and Greek. If you do, you'll understand that all languages were originally like that, with very few words. The profusion of vocabulary we have nowadays was formed from a small kernel that was enriched with numerous suffixes, prefixes, and the combination of two or more words. For example, as we are closer to Latin, it is useful to learn prefixes such as "ex", "ab", "co", "pre", etc., and how they are used to form thousands of new words of Latin origin.
I've spent some time doing that, as well as learning Toki Pona, and it doesn't seem to me that they're very similar. Latin and Greek have attested vocabularies of thousands of words, many of which have a super-specific meaning and aren't analyzable as compounds. Just one example is that they both have words for specific plant and animal species, which Toki Pona doesn't.
I agree that learning ancient languages and prefixes produces amazing results for our understanding of modern languages that are related to them and is a great idea. :-)
I especially don't think that ancient language speakers felt that, just because they had productive prefixes like our re- or un-, they could always describe concepts with combinations of a few existing morphemes instead of inventing new ones or borrowing them from other languages. Also, even the ancient language speakers didn't understand the etymology or structure of their own languages very well; some borrowings and compounds were already opaque to them!
It's weird to see so many words that I already "know" from the Tokipona language just because I'm Finnish and lived a few years in Poland. The words in Tokipona have of course a "wider scope" in their meaning, but here's a few I spotted scrolling through a Tokipona dictionary [1]:
"Kala" is a fish both in Finnish and Tokipona, "nena" is a nose in Tokipona which is "nenä" in Finnish, "sina" (you) is "sinä" in Finnish, "nimi" (name) is the same in both languages, "noka" (leg) is "noga" in Polish, "ona" (she) is the same in Polish. There's more that I'm easily able to remember like "linja" (line in Finnish) which has a similar meaning in Tokipona, not to mention numbers like "wan" and "tu" and words like "mama" (mom) and "mani" (money) etc.
I'm pretty sure the best way is just to jump straight into Finnish, no point in getting confused by other languages or even toy languages.
If you don't speak any other language than English (your website says you're Scottish) there's some positives in that also as you will have a strong place in your brain for Finnish as that "other language" and you won't freeze so easily when you must speak it. What I'm trying to say is that I at least find myself very often "frozen" when having a conversation in a language in which my level is similar to another language. Polish and Spanish are both languages that I'm able to survive with but if I'm looking for the word e.g. for "Saturday" in Polish I might suddenly find myself stuck in the Spanish word for it. Add Swedish to the confusion and I might as well give up and hope to be understood in English, which is also not my first language, but at least I have a much stronger grasp of it than other languages (except Finnish) so that it isn't subject to the confusion most of the time. YMMV but I hope you get the point and maybe even find some encouragement in it to just start learning Finnish straight away.
Also, since you are Scottish you might already be able to roll your R's and pronounce the letters "ä" and "ö" the "Finnish way" since at least some Scottish accents have those "sounds", which is not the case for e.g. Australian, North American or Southern English accents.
Let me know if you want a Finn to talk to and I'll shoot you an email. ;)
Yeah my biggest problem is making the effort and having the practice of learning vocabulary.
(Well the grammar is my biggest problem too, but everybody struggles with that when learning Finnish. It is very regular, and drilling in a classroom is easy: "minä olen", "sinä olet", "hän on" is very easy. But doing that in real-time when trying to form a sentence is way harder.)
I can manage the rolling Rs, and I can almost say Töölö properly, but I need to concentrate and remember! Until recently I'd been coasting because I was working in English-speaking offices. I suspect now I'm looking for another devops/sysadmin/developer job I should not limit myself to English-only offices, but I'm not sure how realistic that is. It doesn't help when I go to shops and people greet me in English!
I appreciate your kind offer, thank you. For the moment I'll pass. I need to talk more to my wife and neighbours! (We've been told that I MUST not speak Finnish near our child, and keep the separation near total until he's older. Right now I speak English to our three-year-old, and my wife speaks Finnish. He'll understand and reply in the appropriate language to each parent. That's pretty awesome and I'm lucky that I can understand at least 85% of what he says. Though sometimes I have to ask him to translate when he uses words I don't know talking to his mother!)
I'm not sure how well this works in practice. Unless you already know a horse is a wonder dog and a hippopotamus is a water horse, when someone says "river wonder dog" will you know what he means?
Let's say Pat is an English speaker who has a strong understanding of its history and root words, but has never seen or heard of a hippopotamus until reading your comment. Without looking the word up, Pat can break down the word into the Greek roots híppos (horse) and potamós (river). From these roots, Pat can assume that a hippopotamus is a "river horse" -- a four-legged animal, probably a large mammal, that lives in or around rivers.
Let's say Jordan is a practiced speaker of Toki Pona who has never seen a hippopotamus, and has just now heard "river wonder dog" for the first time ever. Just like Pat, Jordan can infer that this refers to a four-legged animal, probably a mammal, definitely big or impressive, that spends lots of time in water. (The context of rivers is lost, so Jordan might assume that this creature lives on ocean shorelines, or even in open water.)
In either language, the creature's name uniquely identifies it to people who are already familiar with the animal, and gives a useful description to people who are unfamiliar with it.
This is actually really common in languages. Like the other commenter noted in Chinese. I'll give another example. Bear(熊) + cat(猫) = panda(熊猫). There's tons of these that don't make sense unless you already know. There are also plenty that you might be able to put together with context (electric + brain = computer). But we can go into any language and find things like this. The article mentions "microscope" which is already a compound word in English[0]. German has "sick wagon" for "ambulance" and "finger shoes" for "gloves". French has "animal companion" as "pet".
I'll also mention that German and Mandarin have a significant amount of compound words. There's even a joke about German, that if you want to make a new word you just smash two words together. Animal that lives in my house? House-animal. Haustier (pet).
[0] English is a "bastardization" of a bunch of languages. There's plenty of words in it that are compound words from other languages.
Others address that this is not uncommon in languages, nor is it unhelpful even if you don't know.
But for a different counterpoint, let's say we gave a specific name to everything. For example, in my new, totally not-fake language, a hippo is a klantomolarnal.
Unless you already know a hippopotamus is a klantomolarnal, when someone says klantomolarnal, will you know what he means?
So, if our options are:
1) colloquial compound words that are informative but inexact and may require context (most languages)
2) unique words for everything with no intrinsic meaning (no language)
3) compound words that uniquely describe a thing (ithkuil, lojban, basically nothing useable)
It seems clear that (1) is our only real option. Toki Pona just has a harder time because of the smaller shared vocabulary which makes things more ambiguous.
In my limited experience, context clears things up tremendously.
(incoming rusty toki pona)
o sina lukin e ni! // Look at that!
o sina lukin e telo suli soweli! // Look at that big, water mammal [hippopotamus]!
Like many languages, you'd have to learn more than the base vocabulary. If you were learning Chinese, could you guess what an electric brain (diànnǎo, 电脑), a vertical rise machine (zhíshēngjī, 直升机) or a cat head eagle (māotóuyīng, 猫头鹰) was? Maybe not, which is why you'd learn the Chinese for computer, helicopter, and owl.
This is how many languages work, but the spirit of Toki Pona is genuinely different (though many learners come in with that impression). Learning compound phrases won't really help you much.
One could say the idea is that, if you have to refer to something in a given context, you try to cobble together a description, just enough to differentiate it from other things. "sike tu" (two circles) will be useful to differentiate a bicycle from a car, but if you want to differentiate them from a pair of glasses you'll probably need to use a different term - "sike tawa" (moving circles) maybe! What you have to learn, instead of set compound phrases, is the ability to improvise names/descriptions for things on the fly :)
For a humorous taste of a limited language, without venturing outside of English, Randall Munroe (the XKCD guy) has a book called, "Thing Explainer", that's a sort of encyclopedia written using only the 1,000 most common words in English.
AFAIK the first edition of the Longman Dictionary of Contemporary English (LDOCE), which "uses 2000 common words in the definitions to make understanding easy", was published in 1978.
I like the idea of a very small language and I think it may be very useful for science and technology, but Toki Pona does not fit the requirements. E.g., it lacks words to express numbers.
Lojban looks interesting for this purpose, but it's way too complicated.
As a student of Japanese, I often get confused when listening to Japanese because there are so few sounds and therefore many homonyms.
This gave me the idea of a language with only two sounds, "ku" and "ka", being used to express everything. Sort of an analogue to the idea of encoding everything using the binary digits of zero and one.
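For fun, a minimal sketch of what that could look like if taken literally as a binary code. The choices here are my own assumptions, not part of the idea as stated: "ku" stands for 0, "ka" for 1, and text is spelled out as the 8 bits of each UTF-8 byte. The function names are made up.

```python
# Assumed convention: "ku" = 0, "ka" = 1.
BIT_TO_SYLLABLE = {"0": "ku", "1": "ka"}
SYLLABLE_TO_BIT = {"ku": "0", "ka": "1"}


def to_kuka(text: str) -> str:
    """Spell out the UTF-8 bytes of `text`, one syllable per bit."""
    bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
    return "".join(BIT_TO_SYLLABLE[bit] for bit in bits)


def from_kuka(spoken: str) -> str:
    """Invert to_kuka: read syllables as bits, regroup into bytes, decode."""
    syllables = [spoken[i:i + 2] for i in range(0, len(spoken), 2)]
    if any(s not in SYLLABLE_TO_BIT for s in syllables) or len(syllables) % 8:
        raise ValueError("not a whole number of ku/ka bytes")
    bits = "".join(SYLLABLE_TO_BIT[s] for s in syllables)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


if __name__ == "__main__":
    spoken = to_kuka("toki")
    print(spoken)
    print(from_kuka(spoken))
```

Under these assumptions a single letter "a" already takes eight syllables ("kukakakukukukuka"), which shows the trade-off: fewer distinct sounds means longer utterances.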
kakakukukakakukukukuka, as the great philosopher once said.
Exactly. Is "kute pona" somebody who is good at listening, or somebody who is obedient? Like (Ingsoc) Newspeak, Toki Pona's poor vocabulary and overloaded semantics lead to a very simple and incurious language perfect for shrinking people's minds.
But Esperanto is the language that inspired the Newspeak of Ingsoc. Esperanto has no word for "bad"; Esperanto has malbona, or "ungood". Similarly maljuna for "old" ("not-young") and malgranda for "small" ("not-big").
But it's easy. Nothing stops you from using words like "vieila" (from the feminine form of the French for old) for "maljuna", though many Esperanto speakers will perhaps understand.
Lojban probably has a system for words like that. Words have many places in the word's "place structure". The word for "to go" is {klama}, defined as:
{x1 comes/goes to destination x2 from origin x3 via route x4 using means/vehicle x5}
{lo klama} is the one who goes
{lo se klama} is the destination
{lo te klama} is the origin
{lo ve klama} is the route of travel
{lo xe klama} is the means of going, or the vehicle
Many places for nouns means that {klama} is five words!
But Lojban has the voiceless velar fricative (/x/ in the IPA, or "kh" in Arabic). It is difficult for English speakers.
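Here is a small sketch of the {klama} place structure described above, treating the five places as slots and se/te/ve/xe as particles that swap the named slot into first position. It is heavily simplified and only handles a single particle in front of {klama}; the English labels and the function name `describe` are my own, for illustration.

```python
from typing import Dict, Tuple

# The five places of {klama}, x1 through x5, with informal English labels.
KLAMA_PLACES: Tuple[str, ...] = ("goer", "destination", "origin", "route", "means")

# Each conversion particle swaps x1 with the place at this index.
SWAP_WITH_X1: Dict[str, int] = {"se": 1, "te": 2, "ve": 3, "xe": 4}


def describe(phrase: str) -> str:
    """Return which place of {klama} a phrase like 'lo se klama' picks out."""
    words = phrase.split()
    if len(words) < 2 or words[0] != "lo" or words[-1] != "klama":
        raise ValueError("expected a phrase of the form 'lo [particle] klama'")
    particles = words[1:-1]
    if not particles:
        return KLAMA_PLACES[0]  # plain {lo klama} describes x1
    if len(particles) == 1 and particles[0] in SWAP_WITH_X1:
        return KLAMA_PLACES[SWAP_WITH_X1[particles[0]]]
    raise ValueError(f"unsupported particles: {particles}")


if __name__ == "__main__":
    for phrase in ("lo klama", "lo se klama", "lo te klama", "lo ve klama", "lo xe klama"):
        print(phrase, "->", describe(phrase))
```

So one root plus four small particles yields five noun-like descriptions, which is the sense in which {klama} is "five words".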