
The Controlled Natural Language of Randall Munroe's Thing Explainer [pdf] - tkuhn
http://arxiv.org/abs/1605.02457
======
GolDDranks
I thought Thing Explainer is a fun experiment and a delightful book, but as an
attempt (and it isn't a serious attempt) to use only simple and super-
commonly-used language, it doesn't hit the home base. As I speak English as my
second language I'm acutely aware of this.

Thing Explainer uses the 1000 most commonly used lemmas, but words have
multiple senses, and some of them are commonly used and some are not. From the
viewpoint of a language learner, an unfamiliar use of a word might as well be
another word. (Of course the senses might have a clear semantic connection,
which helps guessing.)

Another thing is that phrasal verbs and set phrases are essentially vocabulary
items too – you can't decode them using only extralinguistic knowledge (that
is, knowledge about the world).

Randall Munroe developed a text editor that highlights any words outside his
word list to help with writing the book, but I think an editor that could
handle word senses and multi-word phrases would be a formidable thing. Of
course it needs much more high-level NLP, word sense disambiguation and such.
(Possibly impossible to pull that off cleanly with the current level of tech?)
I'd love to see one.
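
A minimal sketch of the simple version of such a checker, to show where it falls short. The tiny `ALLOWED_WORDS` set is a hypothetical stand-in for a real 1,000-word list, and the function names are made up for illustration; this is not how Munroe's editor is implemented. It only compares surface forms, so it knows nothing about lemmas, word senses, or multi-word phrases.

```python
import re

# Hypothetical stand-in for a real list of the 1,000 most common words.
ALLOWED_WORDS = {"the", "a", "is", "this", "thing", "big", "up", "goes", "and", "it"}

# A "word" here is a run of letters and apostrophes (an assumption, not a standard).
WORD_RE = re.compile(r"[A-Za-z']+")


def flag_uncommon(text, allowed=ALLOWED_WORDS):
    """Return the words in `text` not found in `allowed`, in order, without repeats."""
    flagged = []
    seen = set()
    for match in WORD_RE.finditer(text):
        word = match.group().lower()
        if word not in allowed and word not in seen:
            seen.add(word)
            flagged.append(word)
    return flagged


def highlight(text, allowed=ALLOWED_WORDS):
    """Wrap every word outside `allowed` in asterisks, leaving the rest untouched."""
    def mark(match):
        word = match.group()
        return word if word.lower() in allowed else f"*{word}*"
    return WORD_RE.sub(mark, text)


print(flag_uncommon("This thing is a rocket and it goes up"))
print(highlight("The rocket goes up"))
```

A phrasal verb like "goes up" (as in "the price goes up") passes this check word by word, which is exactly the gap: catching it would need a phrase list or sense disambiguation on top.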

~~~
drb311
The text editor is here:

[http://xkcd.com/simplewriter/](http://xkcd.com/simplewriter/)

Try it!

1,000 words is an unrealistic constraint. But I'm amazed how often I use a
complex word when the simple alternative is better.

Or...

Sticking to 1,000 words is hard. But doing it often makes my writing better.

~~~
eddieroger
I wasn't familiar with that tool, so I put a few of my last HipChat messages
in there to see how I fared. Once I removed specific words or proper nouns, I
still didn't do all that well. Even the preceding sentence had four problem
words, and I'm pretty sure "preceding" would join them.

I agree, sticking to 1,000 words is hard. But I don't think I agree that it
makes the writing better. I look at language as a form of expression, and some
words are just more colorful than others, but it takes all of them to paint a
picture. I honestly don't know if I could limit myself to the 1,000 most
common words, but I wouldn't want to do that if I could. That said, I'm
curious why you think it makes your writing better? Is it that you think more
about what you're saying? Or that you think it will be more easily digested by
others?

~~~
ethbro
To me, language is wielded with dual intention: (1) to convey meaning ("A
thorough lecturer") & (2) to convey meaning in a pleasing manner to the
listener ("A good lecturer").

Expanding vocabulary to accentuate (2) by necessity compromises (1) _for
classes of listeners / readers who are not familiar with that vocabulary
superset_. Which decomposes the optimization problem to "Who is my audience
and what is their comfort vocabulary set?" I would expect it's >1,000 words
even for ESOL listeners. However, it's certainly < "the full set of florid
English words".

And furthermore, I think English writers writing for English consumers (I
count myself among these, sadly) often undervalue writing in the most
effective style for the widest audience when applicable (and nowadays it
almost always is: research papers, comments on a public forum, blog posts, how
to's, etc etc).

Does anyone have any links to courses to help develop a working minimally
spanning English vocabulary for international technical communication?

This is one problem I've had with academic literature in historically liberal
arts fields. "Just learn the obscure English vocabulary _(before you can
understand, work, or research in a field)_ " is a ridiculous bar to set in
front of contributions.

~~~
drb311
Rare words can make writing more efficient and enjoyable. But they don't
always.

To my ears:

"I use language for two reasons:"

Is better than

"To me, language is wielded with dual intention"

But I always enjoy the word "florid", even though it's not in the top 1000
words.

~~~
cyphar
> "To me, language is wielded with dual intention"

I've got to be honest. I actually prefer that one. Not for conveying meaning
but it definitely sounds nicer -- as though it were a part of a
Shakespearean soliloquy.

~~~
ethbro
I believe that's the nicest thing anyone has ever said about words I put on
the internet.

------
adrianratnapala
Thing Explainer is fun -- but I don't think the descriptions in it are
particularly clear. It's just fun to see how he solves the problem of
describing them given his self-imposed limitation.

Even more fun is Poul Anderson's "Uncleftish Beholding":
[http://www2.warwick.ac.uk/fac/cross_fac/complexity/people/st...](http://www2.warwick.ac.uk/fac/cross_fac/complexity/people/students/dtc/students2011/maitland/fun/)

~~~
jgrahamc
Agreed. Often you have to already know what he's talking about to get meaning
from his words.

For example, his description of the Saturn V Rocket (he replaced rocket with
'up goer') has this description of hydrogen: "The kind of air that once burned
a big sky bag and people died (And someone said "Oh the [humans]!")"

I guess that's amusing if you already know about the Hindenburg, or something.

Also, speaking of the propellant used for the Saturn V's F-1 engine he states:
"This is full of that stuff they burned in lights before houses had power".
The propellant in the F-1 engine was RP-1 (kerosene). I assume he was talking
about 'town gas' for lighting.

~~~
paol
Kerosene lamps were ubiquitous before electric lighting. Look at the pictures
in the Wikipedia page[0], you'll recognize them for sure.

[0]
[https://en.wikipedia.org/wiki/Kerosene_lamp](https://en.wikipedia.org/wiki/Kerosene_lamp)

~~~
Pitarou
It seems that, in trying to make a second point, jgrahamc accidentally proved
his first.

~~~
jgrahamc
It wasn't an accident. I was showing that the ambiguity introduced by his
language means that it's hard to understand what's really going on.

~~~
Pitarou
Well that was clever. :-)

------
mendelk
Off topic:

I attended a school with little to no "secular" education, but I was by
nature really curious about things.

At some point, I got my hands on "How Stuff Works", the book[0], and devoured
it. It was a super-enlightening book. I'd even venture to say it set me on my
autodidactic path to programming.

If this book serves the same purpose for someone as that book did for me, its
benefits cannot be overstated IMO.

As an extra bonus, I was considered by my peers to be far more knowledgeable
than I actually was, due to having a layman's understanding (or at least the
appearance thereof) of _so many_ esoteric concepts. 10/10 would read again!

[0] [http://amzn.com/0785824324](http://amzn.com/0785824324)

------
jsingleton
While this is tagged as PDF the link is actually to the website where you can
read the abstract.

The PDF link is:
[http://arxiv.org/pdf/1605.02457v1.pdf](http://arxiv.org/pdf/1605.02457v1.pdf)

------
stepanhruda
I have Thing Explainer right here. My main issue is that I'd like a separate
version with the actual terms so I can have a conversation about the covered
topics without sounding like a moron.

~~~
vanderZwan
What I think would be interesting is if he tried another version _without_ the
1000 most common words.

~~~
andreasvc
I don't think it's possible to form non-trivial sentences without function
words. But leaving out common content words could be interesting.

~~~
weinzierl
Someone wrote a 260-page novel without using a single word that contains
the letter 'e'. That seems impossible to me.

[https://en.wikipedia.org/wiki/Gadsby_%28novel%29](https://en.wikipedia.org/wiki/Gadsby_%28novel%29)

~~~
theoh
There are multiple examples. It is difficult but apparently not so difficult
that nobody attempts it. The accepted term for texts like that is "lipogram";
it's a popular form of the broader category of constrained writing.
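
Checking a lipogram is much easier than writing one. A rough sketch, assuming plain text split on whitespace; the function names are made up for illustration:

```python
def is_lipogram(text, banned="e"):
    """True if none of the letters in `banned` appear in `text` (case-insensitive)."""
    banned_set = set(banned.lower())
    return not any(ch in banned_set for ch in text.lower())


def offending_words(text, banned="e"):
    """List the whitespace-separated words that contain a banned letter."""
    banned_set = set(banned.lower())
    return [word for word in text.split() if banned_set & set(word.lower())]


print(is_lipogram("A bold young man"))
print(offending_words("A novel without that symbol"))
```

The second function is the useful one while drafting, since it points at the words that need replacing.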

Wikipedia has a page for "logology" which is apparently a term for the general
activity of playing with language in that kind of way on a per-letter basis
(so, including anagrams and palindromes, etc.)

[https://en.m.wikipedia.org/wiki/Logology](https://en.m.wikipedia.org/wiki/Logology)

------
teh_klev
Looking at the diagram of "Bags of stuff inside you" in "Thing Explainer" I
was surprised to find that when annotating blood vessels, intestines etc he
uses "hallway" and not "pipe", which I thought would be in the 1000 most often
used English words. How curious.

------
0x0
Sounds very similar to Simple English Wikipedia:
[https://simple.wikipedia.org/wiki/Wikipedia:Simple_English_W...](https://simple.wikipedia.org/wiki/Wikipedia:Simple_English_Wikipedia#Simple_English)

~~~
andreareina
In fact, Randall makes an explicit mention of it in a comic:

[https://xkcd.com/547/](https://xkcd.com/547/)

------
tomc1985
Thing Explainer was a nice experiment, but, really... what's the value in
trying to parse CNL explanations that are harder to understand than using the
actual word?

------
impostervt
self-plug: lets you explore which English words are the most common

The Long Tail of the English Language
[http://blog.wordsapi.com/2015/01/the-long-tail-of-english-la...](http://blog.wordsapi.com/2015/01/the-long-tail-of-english-language.html)

~~~
rspeer
Is that your site? You have a spam problem in your comments.

