
What's the minimum number of words you'd need to define all other words? (2012) - devilcius
https://www.reddit.com/r/AskReddit/comments/sxqt5/what_is_the_minimum_number_of_words_that_you/
======
Someone
The Oxford Advanced Learner’s Dictionary has a “Defining vocabulary” that they
claim is used to write almost all definitions (I used the Fifth edition, where
it is appendix 10). It’s about 8½ pages, with 5 columns of about 63 lines, so
about 2,700 words.

It doesn’t list inflections, proper names, adjectives for colors such as
_yellowish_, or words used in an entry that derive from that entry (the
dictionary mentions _blearily_ and _bleary-eyed_ being used in the definition
of _bleary_).

They also say they _occasionally_ had to use a word not in the list, but
don’t say how often. Those words _are_ defined in the dictionary, so it is
possible that the reference graph has no cycles.

So, I guess 3,000 is a good first guess.

~~~
aasasd
Notably, 3000 is a good chunk of what's (afaik) considered to be the average
everyday-use dictionary: somewhere from 10000 to 20000 words.

(Though again I'm unsure whether the endless English phrasal verbs are counted
as distinct in these estimates; not counting them would probably be cheating.)

~~~
mikekchar
Generally speaking, in language acquisition papers anyway, vocabulary size is
measured in "word families" rather than words. So "police" and "police
station" are counted as a single "word" as long as you also have "station" in
the list. Phrasal verbs ("look up to" vs "look in to", for example) are
counted separately if I'm not mistaken, because while the root of the word is
the same, it's not the same "word family".

------
mjgeddes
Anna Wierzbicka and Cliff Goddard have studied 'semantic primes', 'the set of
semantic concepts that are innately understood but cannot be expressed in
simpler terms'.

[https://en.wikipedia.org/wiki/Semantic_primes](https://en.wikipedia.org/wiki/Semantic_primes)

The combination of a set of semantic primes and the rules for combining them
forms a 'Natural Semantic Metalanguage', which is the core from which all the
words in a given language would be built up.

[https://en.wikipedia.org/wiki/Natural_semantic_metalanguage](https://en.wikipedia.org/wiki/Natural_semantic_metalanguage)

The current agreed-upon number of semantic primes is 65 (see the list at the
Wikipedia links above).

That means that any English word can be defined using a lexicon of about 65
concepts in the English natural semantic metalanguage.

~~~
PinkMilkshake
I've been following this stuff for years; it's fascinating. I'm particularly
interested in the recent practical applications like Minimal English and its
equivalents in other languages. For those who don't know: unlike other
minimalist English subsets, which usually focus on learnability or clarity,
Minimal English focuses on maximum translatability.

I'm going to get silly now, but I can't help but think the semantic primes -
if you can avoid thinking of them as words or even conscious experience -
represent some core set of cognitive axioms, like the primitive elements for
constructing mental models. As you go to simpler life forms, the "word list"
would get smaller. If there is any truth to that, I wonder what potential
primitives we are missing that would allow us to think more complex thoughts,
and whether you could measure species intelligence by their "vocabulary",
working out what concepts can't be expressed when one of the primitives is
missing. What would happen if you lost the concept of above'ness?

The other thing I find interesting, and it might be no more than a
coincidence, is how there are only the numbers _one_ and _two_, and then you
have to use _many_ or _more_. This in some way matches up with the idea of the
_Parallel individuation system[1]_, whereby young children can only precisely
recognize quantities up to 3, or 1 + 2, and an adult can only precisely
recognize quantities up to 4, or 2 + 2. After that, the brain uses the
_Approximate number system[2]_. So it's like there are only 2 slots to place a
quantity in.

[1]
[https://en.wikipedia.org/wiki/Parallel_individuation_system](https://en.wikipedia.org/wiki/Parallel_individuation_system)
[2]
[https://en.wikipedia.org/wiki/Approximate_number_system](https://en.wikipedia.org/wiki/Approximate_number_system)

~~~
aasasd
> _some core set of cognitive axioms_

This and the rest of the comment remind me of the Pirahã language, in which
there are purportedly two numerals but researchers can't figure out what they
are:
[https://en.wikipedia.org/wiki/Pirah%C3%A3_language#Numerals_...](https://en.wikipedia.org/wiki/Pirah%C3%A3_language#Numerals_and_grammatical_number)

> _Frank et al. (2008) describes two experiments on four Pirahã speakers that
> were designed to test these two hypotheses. In one, ten spools of thread
> were placed on a table one at a time and the Pirahã were asked how many were
> there. All four speakers answered in accordance with the hypothesis that the
> language has words for 'one' and 'two' in this experiment, uniformly using
> hói for one spool, hoí for two spools, and a mixture of the second word and
> 'many' for more than two spools. The second experiment, however, started
> with ten spools of thread on the table, and spools were subtracted one at a
> time. In this experiment, one speaker used hói (the word previously supposed
> to mean 'one') when there were six spools left, and all four speakers used
> that word consistently when there were as many as three spools left._

~~~
daveloyall
Having read only your comment, I'll jump in and solve the puzzle.

    
    
        enough
    
        not enough

------
superice
I am a little surprised that toki pona ("language of good",
[https://en.m.wikipedia.org/wiki/Toki_Pona](https://en.m.wikipedia.org/wiki/Toki_Pona))
is not mentioned. It is a language that consists of about 125 words, which
aims to make you think about describing complicated subjects. To give an
example: The concept "friend" could both be described as "good man" or "man
good to me" depending on whether you think your friend is intrinsically good.

Admittedly, the original question is specifically about the English language,
but toki pona is a nice experiment related to this.

~~~
kpozin
> "[...] Who are you?"

> "A friend!" shouted back the man. He ran toward Zaphod.

> "Oh yeah?" said Zaphod. "Anyone's friend in particular, or just generally
> well-disposed to people?"

Adams, Douglas. _The Restaurant at the End of the Universe_.

~~~
phouchg
"sina jan seme?"

"jan pona" mije li toki wawa. ona li tawa tawa jan Zaphod.

"jan pona?" jan Zaphod li toki. "jan pona tawa jan wan anu ale?"

jan Douglas Adams. ma moku lon pini pi ma suli.

~~~
schoen
pona. taso mi pilin e ni: "restaurant" li "tomo moku" li "ma moku" ala.

~~~
phouchg
sina pona. mi pakala. tenpo ni la mi ken ala ante e lipu mi :-(

------
gojomo
An interesting related talk, touching on the minimality and expressiveness of
both natural and computer languages, is Guy Steele's 1998 talk "Growing a
Language":

Video:
[https://www.youtube.com/watch?v=_ahvzDzKdB0](https://www.youtube.com/watch?v=_ahvzDzKdB0)

PDF:
[https://www.cs.virginia.edu/~evans/cs655/readings/steele.pdf](https://www.cs.virginia.edu/~evans/cs655/readings/steele.pdf)

Prior HN discussion:
[https://news.ycombinator.com/item?id=16847691](https://news.ycombinator.com/item?id=16847691),
[https://news.ycombinator.com/item?id=2359174](https://news.ycombinator.com/item?id=2359174),
& others

~~~
peterkelly
That's the first thing I searched for when opening this thread to see if
anyone else had posted the link yet. Just brilliant.

------
fginionio
I think the approach I would use is as follows:

0\. Get a dictionary.

1\. Form a directed graph, with an edge from each word to every word that uses
that word in its definition.

2\. Remove all words that have no outgoing edges.

3\. If you removed some words, go to step 1. Otherwise, all words left in the
dictionary are minimal.

EDIT: If anyone knows of a machine-readable dictionary, I'd love to actually
do this.
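The pruning procedure above can be sketched in a few lines of Python. This is a hypothetical sketch, not an implementation against any real dictionary: `definitions` (mapping each word to the set of words appearing in its definition) and the toy entries are made up for illustration.

```python
def prune(definitions):
    """Iteratively remove words that appear in no remaining definition.

    `definitions` maps each word to the set of words used in its
    definition. A word no remaining definition uses has no outgoing
    edges in the graph described above, so it can be dropped; repeat
    until nothing changes. What survives is the (non-minimal) core.
    """
    words = set(definitions)
    while True:
        # Collect every word still used by some remaining definition.
        used = set()
        for w in words:
            used |= definitions[w] & words
        removable = words - used
        if not removable:
            return words
        words -= removable

# Toy dictionary: 'big' and 'small' define each other (a cycle);
# 'huge' is defined from 'big' but is used by no definition itself.
toy = {
    "big": {"small"},
    "small": {"big"},
    "huge": {"big"},
}
print(sorted(prune(toy)))  # ['big', 'small'] — 'huge' is pruned away
```

As the reply below notes, this only strips away definable leaves; it leaves every cycle intact, so the result is an upper bound rather than a true minimum.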

~~~
hairtuq
This will not yield a minimal set; in a cycle, it is only necessary to remove
at least one word. The problem is thus to delete the minimum number of
vertices needed to remove all cycles. This is the NP-hard Feedback Vertex Set
problem. Here's a paper that solves it for a dictionary (and there are
others):
[https://arxiv.org/abs/0911.5703](https://arxiv.org/abs/0911.5703)

~~~
fginionio
Looks like you found our answer! Someone's already done the hard work.

~~~
zuminator
This is not necessarily the answer. It's an upper bound for the answer.

------
visarga
Definitions are not enough to fully capture the meaning of a word. To do that
you need full language modelling, grounding of words in other sensory
modalities, and the word's relation to actions taken in the various situations
where it was used.

GPT-2 (of recent OpenAI fame) uses 1.5 billion parameters and, though capable
of interesting results, is far from human level. It also uses just text so
it's incomplete.

[https://blog.openai.com/better-language-models/](https://blog.openai.com/better-language-models/)

Another interesting metric is Bits Per Character (BPC). The state of the art
is around 1.06 on English Wikipedia. This measures the average achievable
compression of character sequences and doesn't include the size of the model,
just the size of the compressed sequence.

[https://arxiv.org/pdf/1808.04444.pdf](https://arxiv.org/pdf/1808.04444.pdf)

~~~
Emma_Goldman
That's true, but it's almost inherent in what a dictionary is, i.e. something
that catalogues the canonical semantic meaning of words, not a complete model
of language and its contextual variables.

------
arooaroo
I used to work for Pearson Longman, and one of their USPs was that their
defining vocabulary was significantly smaller than the main competitors,
namely OUP and CUP. Longman's was just over 2000 (about 2100 IIRC), whereas
OUP's was approx 3000.

Even then, one is rather constrained, and definitions frequently
cross-referenced other words to bootstrap the definition.

------
chasing
Words in the English language are not the same as computer code. I'm not sure
you can fully define most words in terms of other words -- hence the variety.
Dictionaries generally only provide rough sketches of the meaning of a word.
Even synonyms can have slightly different subtexts, connotations, and
histories. Hell, individual words have wildly different meanings depending on
context.

~~~
akozak
You could call this the Wittgensteinian critique of the question.

------
abecedarius
Besides Basic English, I've run into a neat French dictionary for children,
[https://www.amazon.com/Mon-premier-dictionnaire-Roger-Pillet/dp/B0007DU07S](https://www.amazon.com/Mon-premier-dictionnaire-Roger-Pillet/dp/B0007DU07S)

It sticks to a basic vocabulary, has an entry for every word it uses, and goes
heavy on examples and pictures in preference to formal definitions. (And it's
monolingual even though written mainly for learners in North America.)

I don't have it to check, but estimating from memory: around 2000 to 4000
words. I found it useful while bootstrapping up from Duolingo.

~~~
degenerate
If it goes heavy on examples and pictures, then it can probably give a more
relaxed definition for words, knowing the context will be picked up from the
pics and examples. Do you find that true?

~~~
abecedarius
Yes, it was like that. The philosophy was to support learning that tries to
come closer to real-life immersion than typical school foreign-language
classes did. (From my memory of the preface, the only part in English. Of
course nothing back in the 1960s could really approach moving to France --
maybe nowadays you could, using the internet.)

------
YeGoblynQueenne
It depends on what is meant by "define". If we are allowed to use existing
words in a language, L, to create a new language, L', then use expressions in
L' to define each word in L, a single word w, originally in L, suffices.

The idea is to first index each word v in the lexicon of L (including w),
starting at 1 and ending at n, whatever is the number of distinct words in the
language. Alternatively, you can index _meanings_. Then (should be obvious
where I'm going with this by this point) you map a sequence S_k of repetitions
of w of length k in [1,n] to each k'th word, v_k, in L. So now L' is the
language of n sequences S_1,...,S_n of w each of which maps to a word (or
meaning) in L. And you have "defined" L in terms of a single word, the word w.
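The scheme just described can be made concrete in a few lines (a toy sketch; the three-word lexicon and the choice of `w` are arbitrary placeholders, not anything from the thread):

```python
# Toy version of the one-word language L': the k-th word of L is
# "defined" as k repetitions of the single word w.
lexicon = ["the", "cat", "sat"]   # the words of L, indexed from 1
w = "w"                            # the single word of L'

def encode(word):
    """Map the k-th word of L to the sequence S_k of k copies of w."""
    k = lexicon.index(word) + 1
    return " ".join([w] * k)

def decode(sequence):
    """Recover a word of L by counting repetitions of w."""
    k = len(sequence.split())
    return lexicon[k - 1]

print(encode("cat"))    # w w
print(decode("w w w"))  # sat
```

This makes the "cheating" obvious: all the information has moved from the choice of word into the length of the sequence.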

But that's probably not at all what the reddit poster had in mind.

However, it should be noted that natural language is such that there's really
no reason we have many words; it's just convenient, and helps us create new
utterances without having to build long sequences of one word, as above. The
important ability in human language is that we can combine words to create new
utterances, forever, which we can do with one word just as well as with a few
thousand.

Finally, I suspect that if there _was_ a minimal set of (more than one!) words
sufficient to define all other words (meanings) in a language, all natural
languages would converge to about that number of words- which I really don't
think is the case.

~~~
adrianmonk
> _probably not at all what the reddit poster had in mind_

I'm pretty confident the goal is to choose a smallest _subset of English_ so
that, if you know this subset of English and are given a dictionary written in
it, you can learn the entire vocabulary of full English.

That means you're not allowed to create any new words, so you can't create the
magic uber-word w.

> _if there was a minimal set of (more than one!) words sufficient to define
> all other words (meanings) in a language, all natural languages would
> converge to about that number of words- which I really don't think is the
> case._

This amounts to saying there is little to no redundancy in language. I'm not
convinced. For example, once you've got "one" and "plus", the words "two",
"three", "four", etc. are just convenience. Another example might be
opposites: if you have "down", you don't absolutely have to have "up". But the
thing is, people really like convenient ways of saying things. In fact, the
economics probably drive you toward doing this. It makes for shorter
sentences. Think of it like data compression: if a concept occurs often, you
want a dedicated word for it so you can just say that word instead of saying
the definition.

~~~
YeGoblynQueenne
>> That means you're not allowed to create any new words, so you can't create
the magic uber-word w.

Oh, w can be an English word. And the reddit post didn't say anything about
not inventing a new language with only English words (it would be a new
language since it would have completely different grammar and semantics).

But I think you're right that what I propose above is totally cheating :)

~~~
Dylan16807
I'd argue that even if you take the same letters as an existing word, adding
completely unrelated definitions makes a new word.

------
kybernetikos
I looked at this question a while back, and wrote this:
[https://kybernetikos.com/2007/12/03/atoms-of-english/](https://kybernetikos.com/2007/12/03/atoms-of-english/)
(the blog is only up some of the time sadly, I'll fix it eventually).

I took Webster's dictionary from the Project Gutenberg site. I started with
95,712 words. After the initial throwing away of words that weren’t in any
definitions, I was down to 4,489 words. After expanding them, and throwing
away words that weren’t in the expanded definitions, I was down to 3,601
words. Setting recursive definitions as atoms and continuing got me down to
2,565 words.

------
Veedrac
I once found (plausibly from another HN commenter) a text-based adventure
where (almost?) all the words used were replaced with alternative
English-sounding nonsense words, but I have never rediscovered the link.

I feel this would be of interest to the thread, if anyone knows what I'm
talking about or knows how to successfully Google for such a thing.

~~~
AnIdiotOnTheNet
The Gostak

 _Finally, here you are. At the delcot of tondam, where doshes deave. But the
doshery lutt is crenned with glauds.

Glauds! How rorm it would be to pell back to the bewl and distunk them,
distunk the whole delcot, let the drokes uncren them.

But you are the gostak. The gostak distims the doshes. And no glaud will vorl
them from you._

It has been on my to-play list for some time but I haven't got around to it
yet.

[https://ifdb.tads.org/viewgame?id=w5s3sv43s3p98v45](https://ifdb.tads.org/viewgame?id=w5s3sv43s3p98v45)

~~~
nathell
And let us not forget about Lighan Ses Lion, a transcript of a fictitious game
in a made-up language that just happens to overlap with English.

[https://www.eblong.com/zarf/zplet/lighan.html](https://www.eblong.com/zarf/zplet/lighan.html)

------
feyman_r
Reminds me of Randall Munroe's Thing Explainer:

"In Thing Explainer: Complicated Stuff in Simple Words, things are explained
in the style of Up Goer Five, using only drawings and a vocabulary of the
1,000 (or "ten hundred") most common words."

[https://xkcd.com/thing-explainer/](https://xkcd.com/thing-explainer/)

~~~
doh
Love the book. Super fun to read, even for an adult.

~~~
gotocake
It is a great book, and one of the best tools I’ve come across for teaching
kids in a fun way. He has a new book coming out later this year too, which
describes absurdly overengineered ways to solve simple problems. I’ve
preordered. :)

~~~
doh
That's fantastic. I didn't know he has a new book; for the lazy ones, it's
called "How To: Absurd Scientific Advice for Common Real-World Problems" [0].
Also pre-ordered!

[0] [https://www.amazon.com/How-Absurd-Scientific-Real-World-Prob...](https://www.amazon.com/How-Absurd-Scientific-Real-World-Problems/dp/0525537090/ref=tmm_hrd_swatch_0?_encoding=UTF8&qid=&sr=)

~~~
I_complete_me
I find it reassuring that customers that bought this book also bought
Tramontina 80114/535DS Professional Aluminum Nonstick Restaurant Fry Pan, 10"

------
Criper1Tookus
I've actually been wondering about this a lot myself recently, though I have
been thinking of it in terms of "axiomatic English", i.e. the set of words and
grammar/syntax rules from which all other meanings expressible in English can
be represented, and which cannot themselves be explained except through
tautology. It's a really, really interesting question, and answering it would
explain a lot about how we actually think.

------
singularity2001
Just one: "nor"

[https://en.wikipedia.org/wiki/Functional_completeness](https://en.wikipedia.org/wiki/Functional_completeness)

Hope you are one of the 10000 lucky ones whose mind is blown for the first
time.

Or another one: "1"

[https://en.wikipedia.org/wiki/Unary_coding](https://en.wikipedia.org/wiki/Unary_coding)
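Functional completeness of NOR means every other Boolean connective can be rebuilt from it alone. A quick sketch of the standard constructions (the helper names are my own):

```python
def nor(a, b):
    """The single primitive: true only when both inputs are false."""
    return not (a or b)

# Everything else built from NOR alone:
def not_(a):
    return nor(a, a)                    # NOT a  ==  a NOR a

def or_(a, b):
    return nor(nor(a, b), nor(a, b))    # OR  ==  NOT (a NOR b)

def and_(a, b):
    return nor(nor(a, a), nor(b, b))    # AND ==  (NOT a) NOR (NOT b)

# Truth-table check against Python's built-in operators:
for a in (False, True):
    for b in (False, True):
        assert not_(a) == (not a)
        assert or_(a, b) == (a or b)
        assert and_(a, b) == (a and b)
print("NOR alone recovers NOT, OR, AND")
```

NAND works just as well as the single primitive, which is the other half of the functional-completeness result linked above.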

~~~
adrianN
Logic won't help you without a metatheory that links it back to the real
world.

------
MrOxiMoron
this reminds me of
[https://youtu.be/_ahvzDzKdB0](https://youtu.be/_ahvzDzKdB0) awesome talk!

------
hyperpallium
The words needed to define a universal Turing machine (and a program to
simulate a human brain, but that doesn't require additional words).

We could extend it to cover words not conceivable by humans, and any universe,
by using a program to simulate those, but (1) I assume the question implicitly
assumes _human_ words, though (2) it wouldn't require more words anyway.

------
taternuts
Wow, I had no idea there was such a thing as simple.wikipedia.org! It
apparently tries to follow 'Basic English'[0], which is comprised of only 850
words. The simple version[1] of the artificial neural network article is a
lot more approachable than the normal version[2]!

0:
[https://simple.wikipedia.org/wiki/Basic_English](https://simple.wikipedia.org/wiki/Basic_English)

1:
[https://simple.wikipedia.org/wiki/Artificial_neural_network](https://simple.wikipedia.org/wiki/Artificial_neural_network)

2:
[https://en.wikipedia.org/wiki/Artificial_neural_network](https://en.wikipedia.org/wiki/Artificial_neural_network)

------
ggggtez
0 obviously. Babies start with no definitions of words, but here we all are.

The baby learns the words via example, not by definitions.

~~~
catach
I think implicit to the question is that you're _just_ using words for
defining.

------
WhitneyLand
How does this make any sense?

You could have 100 synonyms with the same "definition" but 100 different
shades of meaning, implied degree of strength, or connotations.

You don't necessarily simplify anything by making people add additional words
to get across those subtleties.

Of course some are useless equivalents, but many aren't.

~~~
magneticnorth
Oh, you absolutely wouldn't simplify anything by doing this - ideas that used
to be encompassed by a single word would have paragraph-long descriptions.

It's just a thought experiment about how much you could optimize one dimension
(number of words) if you didn't care at all about optimization anywhere else
in language.

------
lostmsu
The answer is 2: zero and one. What you need is to describe second-order
logic. Just define the "every" quantifier to be 0 0 0, NAND [1] to be 0 0 1,
and all other words as other sequences of 0s and 1s that, for clarity, look
like 1 *. There might need to be some trick to ensure unambiguity when
splitting a "sentence" into "words", but that should be trivial.
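The "trick to ensure unambiguity" is a prefix-free code: if no codeword is a prefix of another, a stream of 0s and 1s splits back into words in exactly one way. A sketch (the particular codeword assignments beyond "every" and NAND are made up for illustration):

```python
# A prefix-free code over {0, 1}: no codeword is a prefix of another,
# so a concatenated bitstring decodes greedily and unambiguously.
code = {
    "000": "every",   # the 'every' quantifier, as in the comment
    "001": "nand",    # the NAND connective
    "10": "zero",     # made-up assignments for two more "words"
    "11": "one",
}

def decode(bits):
    """Split a 0/1 string back into words by greedy longest... rather,
    first-match scanning, which is safe because the code is prefix-free."""
    words, buf = [], ""
    for b in bits:
        buf += b
        if buf in code:
            words.append(code[buf])
            buf = ""
    assert buf == "", "trailing bits did not form a codeword"
    return words

print(decode("000" + "11" + "001"))  # ['every', 'one', 'nand']
```

This is the same reason ithkuil notes further down that binary codes can be self-terminating while unary codes need a separator.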

1:
[https://en.wikipedia.org/wiki/Sheffer_stroke](https://en.wikipedia.org/wiki/Sheffer_stroke)

------
emilfihlman
Taken to the logical extreme, the question is: "how many intrinsic symbols do
we need to convey any meaning when presented to a fully logical being", to
which, in my opinion, the answer is 1 (or 2, really, since 1 is only
"possible").

You might not have words for it, but a fully logical being can decipher any
bitstream given enough interactivity.

So start from 1 and 0, form basis of mathematics and symbols, then start with
physics from all the way bottom.

------
randartie
0) Initialize set X to contain every word.

1) Y = set of words in every definition of the words in set X

2) X = Y - X (all words in Y that are not in X)

3) Repeat from 1 if the set of words in X has changed

Does that reduce all words down to the actual minimal set of words required to
define other ones? Since you can build upwards from the resulting set X to get
the original set of words.

Also, this reminds me of the knapsack problem a little bit (for example what
is the minimum set of coins required to be able to make $X).

------
bloak
I've seen a dictionary that defined 120 words using those same 120 words
(morphemes really), though some of the definitions were a bit ... weak. Toki
Pona also has about 120 words but it's a very different set of words: Toki
Pona's vocabulary is concrete and everyday, while the dictionary's was very
abstract. So probably it's just a cute coincidence that both numbers were
about 120.

~~~
rijoja
Not according to these guys:

[https://en.wikipedia.org/wiki/Semantic_primes](https://en.wikipedia.org/wiki/Semantic_primes)

mentioned by mjgeddes in this very thread

------
_cs2017_
It depends on the context assumed about the audience that is supposed to
understand the definitions.

Do they have the experiences relevant to the word being defined? If not, what
experiences do they have in common with the person providing the definition?

How intelligent are they? Can they understand complex concepts through logic,
through examples or both?

How much do they know about English (besides the few words assumed known)?

------
rsync
Isn't the answer "two"? "One" and "None" (or on and off)?

Of course I see the obvious bootstrapping problem where you relate the
encoding starting with just those two words, but ... somehow I think that's
easier to overcome than it seems ... as in, I think it must be possible.

If Helen Keller can write a book, surely I can relate digital encoding to a
toddler over the course of a year or three, right?

~~~
rfeather
I think it depends on whether there is a distinction between digits/letters
and words; otherwise "26" would be a good starting answer too (since each
letter is its own word).
------
Gunstig2Snath
I can actually answer this question. Back in the day I was going through the
Oxford Dictionary and it mentioned that all the definitions use words from a
list of about 3,000 words. The list, IIRC, was also at the back of the
dictionary. And it also mentioned that on rare occasions they have to use
words outside of those 3,000.

Source: My memory of something I read at British Council Library 17 years ago.

~~~
pbhjpbhj
Others agree with your memory, eg
[https://news.ycombinator.com/item?id=19332648](https://news.ycombinator.com/item?id=19332648).

------
lkrubner
So, once the human race had discovered roughly this number of words (give or
take a few for whatever language existed at the time, and minus the useless
words demanded by grammar), humans had a Turing-complete language? That must
have been a crucial point for the evolution of human culture.

~~~
atoav
Much more crucial than the Turing completeness of the grammar was certainly
the ability to write language down, which conserved knowledge and language
across generations as long as the decoding skills were passed along.

What effect this really had can be observed with the introduction of
mechanical printing presses, which reduced the spatial and temporal distances
of information flow significantly.

The internet might be yet another of those things...

------
hitekker
I am reminded of
[https://en.wikipedia.org/wiki/Natural_semantic_metalanguage](https://en.wikipedia.org/wiki/Natural_semantic_metalanguage)

------
DmitryOlshansky
I bet this heavily depends on what you consider an accurate definition.

------
twotwotwo
There are lots of ways it's not at all the same, but it's at least sort of
interesting to compare this question to the number of dimensions needed for
effective word embeddings.

------
novalis78
Reminds me of Toki Pona — with about 120 words it seems to work.

~~~
schoen
As an avid toki pona user, I've often contrasted it with NSM and noticed
things that are really tough to express in toki pona (very likely
intentionally).

One thing is that toki pona has no built-in comparatives at all. A usual thing
is to say something like

mi sona e ijo mute ala. jan pi pali sama li sona e ijo mute.

'I know not many things. My colleague knows many things.'

ona li suli taso mi suli mute.

'She is big, but I am very big.'

jan ni li jo e mani mute. taso jan ante li jo e mani mute mute.

'This person has a lot of money. But the other person has lots and lots of
money.'

Another thing is that there's no built-in way to make a relative clause at
all.

mi sona e toki. mama meli mi li sona e toki sama.

'I know a language. My mother knows the same language.' (As opposed to 'My
mother knows a/the language that I know'!)

mi sona e toki. mama meli mi li sona ala e toki ni.

'I know a language. My mother does not know this language.' (As opposed to 'I
know a language that my mother doesn't (know)'!)

ona li pali e ijo. mi sona e jan ante. jan ni li pali kin e ijo ni.

'She does something. I know another person. This person also does this thing.'
(As opposed to 'I know another person who does what she does'.)

moku mute li kama tan soweli. mi moku ala e moku ni.

'Many foods come from animals. I don't eat these foods.' (As opposed to 'I
don't eat foods that come from animals'.)

It's also extremely tricky to construct specific tenses and specific logical
conditions. The particle "la" can mean "when", "because", "also", or "if", and
is only supposed to be used once per sentence. This is especially challenging
when trying to contrast things that have happened with hypothetical
conditions. For example

jan olin ona mije li moli la mi mute li pilin ike.

I intend this to mean 'we feel bad because his romantic partner died' but we
can't really disambiguate, for example, 'we will feel bad when his romantic
partner dies' or 'if his romantic partner dies, we will feel bad'.

You can qualify things with "tenpo pini/ni/kama la" ('in past/this/future
time'), but you're not supposed to use more than one "la" in the same
sentence, so it's discouraged to write things like

?tenpo pini la mi moku e ni la insa mi li pilin ike.

'Because, in the past, I ate this, my belly feels bad.'

You can try to break these up into multiple sentences.

tenpo pini la mi moku e moku jaki. mi pali e ni la mi kama pilin ike.

'In the past, I ate gross food. Since I did this, I started feeling bad.'

This gets really challenging if you have to refer to several different things
of the same sort, which perhaps have conditional relations to one another that
apply at different times or in different circumstances. For example, if you
wanted to say "when my mother arrived, the plane that she was on was very warm
because it had a broken air conditioning unit which the crew didn't know how
to fix", you might end up making a long series of sentences that tell a story.

tenpo pini la mama mi li kama kepeken ilo tawa kon. ona li kama la kon lon ilo
li seli mute. ni li kama tan ni: ilo lete li pakala. jan pali li sona ala pona
e ilo lete.

In the past, my mother came using an air travel tool. When she/it arrived, the
air in the tool was very hot. This happened because of this: the cooling tool
broke. Workers did not know how to improve the cooling tool.

But some kinds of conditions don't necessarily lend themselves well to this
form, like if I wanted to say "if she had known that this would happen, she
wouldn't have taken this airplane", or quantifiers like "every Singaporean who
goes to school in Singapore learns English and whatever the government defines
as his or her family's language" or "everyone who was inside the building when
the earthquake happened got injured by some object"...

I don't feel confident about my ability to describe the truth conditions of
the latter two examples in toki pona in a way that's faithful to the English
original.

It's also unclear to what extent we're allowed to stack "e ni:" and "tan ni:"
in order to embed indirect discourse and chained reasons.

?ona li pilin pona tan ni: toki pona li pona tawa ona tan ni: ona li toki lili
li jo ala e nimi mute.

'She was happy because of this: she liked toki pona because of this: it's a
small language and doesn't have many words.'

Edit: also, NSM explications assume that you're deliberately defining new
vocabulary in order to expand your language, which isn't really customary in
toki pona. Even if we figure out how to express a concept or situation in toki
pona, we don't then acquire a single word that we can use for that concept or
situation in the future.

------
raldi
What's the minimum number of words you'd need to define the word "left", as in
"left hand"?

~~~
tchaffee
opposite right?

~~~
pbhjpbhj
Left is 'not right', right is 'not left'. So left is simply 'not not left' ...
ez!

------
gpm
a, b, c, d, e, f, g, j, k, l, m, n, p, r, s, v, x, y, (, ), and, concatenate.

Some hints:

\- "backwards j"

\- "a circle"

\- "a cross"

\- "n, but rotated ninety degrees"

\- "mirror of p"

\- "vv, except no gap"

\- "pixel-wise union n and l"

\- "mirror of s, and make the lines straight"

Semantics are impossible anyways, I challenge you to define the word "dog".

Challenge: Do better, make sure you don't have circular dependencies.

------
SeanLuke
This feels intuitively like it's closely associated with some measure of the
Kolmogorov complexity of a passage.

------
aboutruby
You can go from one word, "entity", to every word.

The tradeoff is density of information, understandability to the readers, and
conciseness.

~~~
ianleeclark
There are things which "are" which are not entities: objects.

------
doxos
Two words. "1" and "0"

~~~
TheLoneAdmin
10 words. "1" and "0"

fixed it for you.

~~~
pbhjpbhj
01000100 01101001 01100100 00100000 01111001 01101111 01110101 00100000
01101101 01100101 01100001 01101110 00111010 00001010 00001010

00110001 00110000 00100000 01110111 01101111 01110010 01100100 01110011
00101110 00100000 00100010 00110001 00100010 00100000 01100001 01101110
01100100 00100000 00100010 00110000 00100010 00001010 00001010 01000110
01101001 01111000 01100101 01100100 00100000 01101001 01110100 00100000
01100110 01101111 01110010 00100000 01111001 01101111 01110101 00101110

00111111

Also spaces??

------
ChlorophZek
Finally, something thought-provoking! Everybody, ready your Internets, this
gentleman deserves an answer!

------
sebringj
"a" and "i": since it's binary, you could define all others.

------
kazinator
Good to see this silly question off r/lisp for once. :)

------
keyle
I'm guessing, and I can't really explain why, but my gut feeling is 42.

------
vonnik
Randall Munroe of XKCD experimented with this in his book Thing Explainer:

[https://xkcd.com/thing-explainer/](https://xkcd.com/thing-explainer/)

------
stretchwithme
One.

~~~
ithkuil
Unary codes still need two symbols because you need a terminator/separator.

Binary codes can be prefix-free, thus self-terminating.

------
agumonkey
and kernel is one of them

------
aaron695
I'd say most nouns need to be seen.

To understand duck you must see a duck (eat a duck, pet a duck, smell a duck,
hear a duck).

Perhaps you could cheat and use pixels and coordinates, using English to draw
photos and videos to explain ducks.

~~~
drewrv
It depends on how well you want to "define" something. Wikipedia describing a
duck:

 _Duck is the common name for a large number of species in the waterfowl
family Anatidae which also includes swans and geese. Ducks are divided among
several subfamilies in the family Anatidae; they do not represent a
monophyletic group (the group of all descendants of a single common ancestral
species) but a form taxon, since swans and geese are not considered ducks.
Ducks are mostly aquatic birds, mostly smaller than the swans and geese, and
may be found in both fresh water and sea water. Ducks are sometimes confused
with several types of unrelated water birds with similar forms, such as loons
or divers, grebes, gallinules, and coots._

But you could also describe a duck in two simple words: "water bird".
Apparently that's a real term:
[https://en.wikipedia.org/wiki/Water_bird](https://en.wikipedia.org/wiki/Water_bird)

~~~
oh_sigh
That's not really a good definition, because it is too expansive. Penguins are
water birds but not ducks.

------
lutorm
Don't Goedel's incompleteness theorems imply that it is impossible to define
all words using words, unless you have some axiomatic words that are not
defined within the system?

~~~
ars
I don't think it applies here. With words you just end up with circular
definitions.

With math a circular definition is unacceptable, and that's when the theorem
comes into play.

~~~
lutorm
I would think circular definitions would be just as unacceptable with words.

~~~
pbhjpbhj
They're not, because we don't acquire language by definition of words (alone;
sometimes not at all; reading dictionaries comes a long way down the road of
language acquisition).

