
Cleartext: A text editor that only allows the 1,000 most common words in English - henrik_w
https://github.com/mortenjust/cleartext-mac/blob/master/README.md
======
gregschlom
Am I the only one to find that using only the 1,000 most common words actually
makes things _harder_ to understand? The writer ends up having to use
convoluted paraphrases to refer to things where a precise, well defined and
well understood word exists but it's not part of the top 1,000.

~~~
AndrewUnmuted
The technique is working quite well for Donald Trump:
[https://www.washingtonpost.com/news/the-fix/wp/2015/09/15/ho...](https://www.washingtonpost.com/news/the-fix/wp/2015/09/15/how-trump-speak-has-pushed-the-donald-into-first-place/)

> Some of his answers last only a few seconds, some are slightly longer, but
> almost all consist of simple sentences, grammatically and conceptually, and
> most of them withhold their most important word or phrase until the very
> end. Trump’s sentences end with a pop, and he seems to know instinctively
> where to put the emphasis in each one.

~~~
gnaritas
He speaks like a child, it works because his base is stupid and they relate.
It's not a skill, he just isn't that bright.

~~~
jshevek
If Trump honestly believed everything that he says, I would agree that his
thinking would necessarily be shockingly _shallow_ on many issues.

But to compare him to a four year old then make a sweeping statement like
"isn't that bright" ?

How could anyone think this? Isn't it obvious that there are domains in which
he is, for better or worse, exceptionally intelligent within that domain?

~~~
kbenson
I think it's interesting to try to identify his domain. I think it's
marketing. He is exceptional at marketing, and at generating personal wealth
from that. That is different from being a good businessman, or knowing how to
run a business (and for his method of wealth generation, may be in opposition
to it). I think he's very good at perpetuating his own extreme boom and bust
cycles, and coasts along on that. While at the top, he's able to capitalize on
the success to extend his brand,

------
kazinator
It should support a few template sentences which _define_ a word, which is
then allowed to occur in the remainder of the text.

More useful than a document which uses only 1000 common words is a document
which uses only 1000 words, plus words which it clearly defines.

I feel as if I could write almost anything if I have that, and it will be
self-contained and accessible to anyone with the thousand word vocabulary,
plus the ability to internalize definitions, which is a very basic faculty of
the intellect.
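
A minimal sketch of how such a checker might work, assuming a template of the form "A <word> is ..." to introduce a new word. The tiny `BASE_WORDS` set, the `DEFINITION` pattern, and the function name are all hypothetical, chosen only for illustration:

```python
import re

# Hypothetical tiny base vocabulary; a real tool would load the full 1,000-word list.
BASE_WORDS = {"a", "is", "the", "that", "thing", "we", "use", "to", "talk"}

# Assumed template sentence: "A <word> is ..." defines <word> for the rest of the text.
DEFINITION = re.compile(r"^an? (\w+) is\b", re.IGNORECASE)


def disallowed_words(text):
    """Return words that are neither in the base list nor defined earlier in the text."""
    allowed = set(BASE_WORDS)
    rejected = []
    # Split into sentences so a definition takes effect from its own sentence onward.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        match = DEFINITION.match(sentence)
        if match:
            allowed.add(match.group(1).lower())
        for word in re.findall(r"[a-z']+", sentence.lower()):
            if word not in allowed:
                rejected.append(word)
    return rejected
```

Note that the body of the defining sentence is still checked against the allowed words, so a definition has to be written in the restricted vocabulary itself.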

~~~
skystrife
This idea reminds me of an amazing talk by Guy Steele:
[https://youtu.be/_ahvzDzKdB0](https://youtu.be/_ahvzDzKdB0)

He starts the talk by assuming monosyllabic words as his primitives and builds
up the words he needs to use to give the talk by providing definitions for
them first.

~~~
kogus
Thanks for that, I really enjoyed it. Back when Java was new and cool...

------
kf5jak
A similar editor was made for the book Thing Explainer[1] by Randall Munroe
of XKCD, a book that explains all kinds of different things, from space
shuttles to microwaves, using the top 1000 common English words.

[1] [https://xkcd.com/thing-explainer/](https://xkcd.com/thing-explainer/)

~~~
Tharkun
Yeah it's called "thing explainer", but it's more like a "thing convoluter".
It's a work of humour, not a book of science.

~~~
tormeh
Some of the explanations are actually going into a textbook now, to accompany
more traditional text. So it seems professionals disagree with you. I will
concede that it might have been clearer with the top 2000 words or something,
but that's my perspective and plenty of kids might disagree, especially those
that have grown up with weird dialects or even sociolects like AAVE.

~~~
readams
No. He's doing some drawings for text books. There is no chance that the thing
explainer 1000 words affectation will be adopted for the text book for the
purpose of actual teaching.

------
eracer001
Would love a slider that would allow you to adjust the words allowed from 500
most common words in English to 10k most common words. Also, it would be great
if you could compile a windows version.
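
The slider could be sketched roughly as follows, assuming the word list is stored in frequency order (most common first). `RANKED_WORDS` here is a ten-word stand-in, not the app's actual dictionary, and the function names are made up for illustration:

```python
# Assumption: RANKED_WORDS is sorted most-common-first; this short list is illustrative only.
RANKED_WORDS = ["the", "of", "and", "to", "a", "in", "is", "you", "that", "it"]


def allowed_vocabulary(limit):
    """Return the set of the `limit` most common words, clamped to the list size."""
    limit = max(0, min(limit, len(RANKED_WORDS)))
    return set(RANKED_WORDS[:limit])


def is_allowed(word, limit):
    """Check a typed word against the vocabulary selected by the slider position."""
    return word.lower() in allowed_vocabulary(limit)
```

With a ranked list, moving the slider is just a matter of changing the slice; the expensive part is obtaining a reliable frequency ranking in the first place.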

~~~
igravious
Or how about the more uncommon the word the more like the background tone it
is so that you get instant visual feedback?
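
One possible sketch of that fading, assuming a white background and a known frequency rank for each word (rank 1 being the most common). The function name and the `max_rank` default are hypothetical:

```python
def word_gray(rank, max_rank=10_000, background=255):
    """Map a frequency rank (1 = most common) to a gray hex colour.

    Common words come out black; the rarest words approach the background level.
    """
    rank = max(1, min(rank, max_rank))
    fraction = (rank - 1) / (max_rank - 1)
    level = round(fraction * background)
    return f"#{level:02x}{level:02x}{level:02x}"
```

In practice one would probably cap the level somewhat below the background value so that rare words stay faintly visible rather than vanishing.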

~~~
boie0025
I like this idea. Maybe a context menu for each word with scored
alternatives..

------
visarga
I like the idea. This could be enhanced with a thesaurus that would offer
alternatives to difficult words. It could be a useful tool not only for
writing clearer explanations, but also to compose easy readers for English
learners.

~~~
kej
You could build that feature dynamically. When a user tries to use a
disallowed word and then uses an allowed word, you could log that word pair.
Combine and anonymize those logs and you'd be able to show the most likely
replacement words for any word that people often try to use.
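
A rough sketch of that log, assuming the editor can report each (rejected word, next accepted word) pair. The names `record_replacement` and `suggest` are invented for illustration, and anonymization and aggregation across users are left out:

```python
from collections import Counter, defaultdict

# Maps each rejected word to a count of the allowed words users typed next.
replacement_log = defaultdict(Counter)


def record_replacement(rejected, accepted):
    """Log that a user typed `accepted` right after `rejected` was refused."""
    replacement_log[rejected.lower()][accepted.lower()] += 1


def suggest(rejected, count=3):
    """Return the most frequently logged replacements for a rejected word."""
    counts = replacement_log.get(rejected.lower(), Counter())
    return [word for word, _ in counts.most_common(count)]
```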

------
macintux
Attempting to type the Gettysburg Address, which is how I usually experiment
with word processors and keyboards, is an exercise in futility.

> Eight times ten and seven years ago our fathers brought into the world a new
> country, born of free thoughts and doings, and completely sold on the idea
> that all men are created the same.

~~~
Nutmog
Futility? It looks like an improvement on the original:

"Four score and seven years ago our fathers brought forth on this continent a
new nation, conceived in liberty, and dedicated to the proposition that all
men are created equal."

"free thoughts and doings" sounds clearer than "liberty", which could mean
almost anything - did it have no prisons? I guess they really meant freedom
from England's control, which isn't quite free thoughts and doings - that goes
to show how poorly worded the original was.

~~~
macintux
Them's fightin' words. If the Gettysburg Address is poor writing, life has no
meaning.

------
blhack
You guys are missing that this is a joke. This is a reference to this comic:
[https://xkcd.com/1133/](https://xkcd.com/1133/).

Which is poking fun at how _funny_ things sound when you restrict yourself to
only that many words.

~~~
duaneb
A joke that would improve the writing of most people.

~~~
tacos
Sorry, "improve" is not in the list.

~~~
duaneb
See? I need to better my own use of words. We all speak and write in ways
others can not understand unless we make the effort—especially in a place
where not everyone speaks our language.

------
jotux
It's neat but not very useful. I'd actually be interested in an editor that
enforces E-Prime[1] to see if it made writing more clear.

[1]
[https://en.wikipedia.org/wiki/E-Prime](https://en.wikipedia.org/wiki/E-Prime)

~~~
zzkt
[https://github.com/AndrewHynes/eprime-mode](https://github.com/AndrewHynes/eprime-mode)

~~~
kkylin
Thanks! My first question upon seeing this post was: isn't there already an
emacs mode for this?

~~~
throwanem
There is now: [https://github.com/aaron-em/ten-hundred-mode.el](https://github.com/aaron-em/ten-hundred-mode.el)

(There was before, under the name "1000-words.el", but it isn't very good, and
it also isn't, and can't easily be made, available via MELPA. So I wrote this
instead. The PR to add it to MELPA is open and awaiting review; pretty soon
you should be able to M-x package-install RET ten-hundred-mode RET and get
it.)

------
gnoway
Perfect for editing articles in Simple English Wikipedia:

[https://simple.wikipedia.org/wiki/Main_Page](https://simple.wikipedia.org/wiki/Main_Page)

------
jjar
The author of the book already made this.

[https://www.xkcd.com/simplewriter/](https://www.xkcd.com/simplewriter/)

------
crispyambulance
Joke aside, there are ways to compute a "readability score" for any block of
text. This is useful for document writers that need to target particular
grade-levels for their docs (eg driving manual is "8th grade").
[https://readability-score.com/](https://readability-score.com/)

~~~
rspeer
The readability score is based only on the lengths of words, which I guess is
something, but it's not a very good proxy for when you learn the word. The
frequency of the word is a much better thing to measure.
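
A frequency-based measure might look something like the sketch below. This is not how any particular readability site computes its score; the `RANK` table is a toy stand-in for a real frequency list, and `UNKNOWN_RANK` is an arbitrary assumption for words missing from it:

```python
import math
import re

# Illustrative frequency ranks (1 = most common); a real list would have thousands of entries.
RANK = {"the": 1, "cat": 2, "sat": 3, "on": 4, "mat": 5}
UNKNOWN_RANK = 100_000  # assumed rank for any word not in the table


def rarity_score(text):
    """Average log10 of word rank; higher means rarer vocabulary on average."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(math.log10(RANK.get(w, UNKNOWN_RANK)) for w in words) / len(words)
```

Taking the logarithm keeps a single very rare word from dominating the average, which roughly matches how word frequencies fall off in natural text.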

------
jshevek
I think this is a cool project. I appreciate that the creator has shared this
with the world. I think it's sad that people are (appear to be?) criticizing
him simply for making it and sharing it. However....

"If you find yourself on the receiving end of a message that is too hard to
figure out, do everyone a favor and insist on a simpler version."

Do _everyone_ a favor? Presumptive.

If I am on the receiving end of a message that is too hard for me to
understand due to my ignorance of certain terms, or due to difficulties I have
parsing grammatically correct writing, then the _best_ way for me to do
_everyone_ a favor is to work on improving my own vocabulary and/or thinking
ability.

Then I will be better equipped to communicate well with a larger set of the
population, and better equipped to reason well. Improving your own thinking
skills is good citizenship. Improving your ability to communicate well with a
wider swath of the population can help you to build bridges between
communities.

If I were instead to insist that the message achieves 'clarity' by
accommodating my ignorance, then I may be helping some but certainly not
everyone.

"Maybe one that only uses the 1,000 most common words."

Asking others to accommodate limitations to my vocabulary is likely to
increase their cognitive load; most often I'd rather them apply themselves
more fully to other tasks. I can just use a dictionary! Also, this runs the
risk of resulting in text that is _harder_ to understand, if they trade
precise terms for needlessly convoluted grammar.

------
cowardlydragon
How about a standard translator that takes generally accepted conversions of
more complex language constructs into simpler language?

How about a thing that changes less well known words and puts in easier words?

~~~
DanBC
If you come up with a way to automagically translate things into easier
English you could probably sell it to various UK government / NHS / etc
organisations, who all have a duty to make much of their communication
available in "Easy Read".

Here's the Equality Act Easy read version:
[https://www.gov.uk/government/uploads/system/uploads/attachm...](https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/85012/easy-read.pdf)

And the Equality Act non-easy read:
[http://www.legislation.gov.uk/ukpga/2010/15/contents](http://www.legislation.gov.uk/ukpga/2010/15/contents)

Here's a Google search for some easy read documents:
[https://www.google.co.uk/search?site=&source=hp&q=easy+read+...](https://www.google.co.uk/search?site=&source=hp&q=easy+read+inurl%3Agov.uk&oq=easy+read+inurl%3Agov.uk&gs_l=hp.3...666.8052.0.8378.23.20.0.0.0.0.410.2468.0j1j5j2j1.9.0....0...1c.1.64.hp..14.3.874.0.4Hy0Gvi2kHQ)

Here's Mencap's (a charity that works with people with learning disability)
guide to Easy Read:
[https://www.mencap.org.uk/make_it_clear](https://www.mencap.org.uk/make_it_clear)

~~~
suchow
Similarly, in the U.S. there is the Plain Writing Act of 2010
([http://www.plainlanguage.gov/plLaw/](http://www.plainlanguage.gov/plLaw/)).

------
rexf
It would be a huge usability improvement to have autocompletion of valid
words. Currently, you have to type each word, and wait to see if it is
rejected/removed.

Instead of waiting for each word to be typed, why not show acceptable words as
you are typing, so you can tell before you finish typing if it is valid.
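
As a sketch of how the lookup could work, a sorted word list plus a binary search gives prefix completions cheaply. `SORTED_WORDS` is a small illustrative sample and `completions` is a hypothetical name, not part of the app:

```python
from bisect import bisect_left

# Illustrative sample; assumed to be the allowed-word list kept in sorted order.
SORTED_WORDS = sorted(["the", "then", "there", "thing", "think", "to", "top"])


def completions(prefix, limit=5):
    """Return up to `limit` allowed words starting with `prefix`, in alphabetical order."""
    prefix = prefix.lower()
    start = bisect_left(SORTED_WORDS, prefix)  # first word >= prefix
    matches = []
    for word in SORTED_WORDS[start:]:
        if not word.startswith(prefix):
            break  # sorted order means no later word can match
        matches.append(word)
        if len(matches) == limit:
            break
    return matches
```

An empty result while typing would tell the user early that no allowed word can complete what they have entered so far.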

------
dallamaneni
There is a web version of this which I made a few days ago:
[https://news.ycombinator.com/item?id=11424197](https://news.ycombinator.com/item?id=11424197)

~~~
devy
And this was submitted before as well.
[https://news.ycombinator.com/item?id=11367972](https://news.ycombinator.com/item?id=11367972)

------
baby
The most annoying application on earth :)

A better application would be to underline words not included in the whitelist
and to provide a synonym included in the whitelist upon right clicking.

But as I learned in school: a language's diversity is beautiful, why restrain
ourselves in the vocabulary we use?

In particular, a boring text with many repetitions is just hard to read, see
the discussion here:
[https://news.ycombinator.com/item?id=11131391](https://news.ycombinator.com/item?id=11131391)

------
cpeterso
I wrote a simple editor that used Google Translate's web service to round-trip
translate the text you enter in real time. I had been thinking about
intermediate representations that might assist in the automatic translation of
human languages. I wanted to see how one's writing might change to ensure that
the round-trip translation was identical or still made sense, thinking this
would improve the odds that the translated version actually meant what you
intended it to.

------
threatofrain
While this is probably a joke, I would note that a person is rarely writing
for consumption by all demographics. Vampire novels, comic books, chemistry
textbooks, all have their audiences, and it is naive to say "but one day the
whole world may wish to read my book, I must open the gates as wide as
possible", because accessibility is not free.

Don't simplify or complicate your language without reason.

------
drcode
Great app, but this still doesn't address the issue of homographs, where a
word can have wildly different meanings (some of them uncommon meanings) that
just happen to be spelled the same.

For instance, in the live example, the word "Application" is used to refer to
a computer program, when the more common meaning likely refers to the verb, as
in "The application of a band aid".

------
sogen
Check also Rewordify [1], provides suggestions alongside the original text,
has "difficulty" settings, and more, very complete website, highly
recommended. I spent ages looking for something like this.

1.- [https://rewordify.com/index.php](https://rewordify.com/index.php)

------
WalterBright
I read years ago that the vocabulary used on TV was about 2,000 words. A high
school graduate has a vocabulary of 10,000 words, a college graduate 30,000,
and there are a million words in the English language.

What was interesting to me was that I could learn a foreign language by only
learning 2,000 words!

~~~
rewrew
That is interesting. What about languages other than English? Any idea about
the avg. word count needed to learn them? (Used on TV does seem like a good
benchmark for "learned"). Or is it roughly the same for every language, do you
think?

~~~
bane
"being able to read a newspaper" is a typical goal for pre-fluency language
acquisition. In most languages I'm aware of the number of words required to do
that is something around 1500-3000.

~~~
WalterBright
Yes. But I read somewhere that a newspaper vocabulary is targeted at somewhere
around a 5th grade reading level.

------
makecheck
A great idea. See, the point is to be _understood_ by _everyone_. Nothing is
gained by making crazy sentences out of words that most people have never
heard of. Is that supposed to impress people?

I had a friend who seemed to start replacing dead-simple phrases like
“seriously!?” with “are you being facetious!?”. Ugh, this doesn’t help at all.
It will either scream “hey, everyone, I learned a new word!!!” to the people
who know what you said, or “say what!?” to anyone else; and they will be too
busy parsing your words to hear whatever you say next.

And this advice isn’t because everyone you meet will be 6 years old or a high
school drop-out. Simple words are: (1) WAY easier to pick up for _everyone_ ,
and (2) many people are not using their native language, including very smart
people in fields where people like big words.

------
tomc1985
Thing Explainer was a nice experiment.... but that's exactly what it was. An
experiment... not to be repeated.

English has such a wonderfully rich vocabulary... and no cow is too sacred for
the 'innovators' of tech. Everyone: prepare to talk like 6-year-olds!

------
gertef
Guy Steele's _Growing a Language_ is a much better model than "1000 most
common words"
[https://www.google.com/?#q=growing+a+language](https://www.google.com/?#q=growing+a+language)

------
giardini
How about modifying it so one can choose to use the 1,000, or 2000, 5000, or
10,000 most common words, for instance? Should be pretty easy and done
dynamically (although tuning the number _down_ would require more work).

------
galago
Has no one mentioned Basic English yet?
[https://en.wikipedia.org/wiki/Basic_English](https://en.wikipedia.org/wiki/Basic_English)

------
dallamaneni
Check EasyWrite: Cleartext alternative for the web:
[https://github.com/adeekshith/easy-write](https://github.com/adeekshith/easy-write)

------
cdevs
The original machine had a base-plate of prefabulated amulite, surmounted by a
malleable logarithmic casing in such a way that the two spurving bearings were
in a direct line with the pentametric fan. The main winding was of the normal
lotus-o-delta type placed in panendermic semi-boloid slots in the stator,
every seventh conductor being connected by a nonreversible trem'e pipe to the
differential girdlespring on the 'up' end of the grammeters.

------
Erik816
Oh good, now I can make sure I'm following proper Newspeak!

------
dmlhllnd
Presumably the list includes the word "gimmick".

------
frenchie4111
It would be nice if there was a setting that would make it not delete a
word, but mark it in red. That way I can copy/paste things into it.
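
A plain-text sketch of such a "mark, don't delete" mode, using brackets where a GUI would use red. `ALLOWED` is a toy word set and `mark_disallowed` is a made-up name:

```python
import re

# Illustrative allowed-word set; a real tool would load the full list.
ALLOWED = {"i", "can", "see", "the", "big", "sky"}


def mark_disallowed(text, left="[", right="]"):
    """Wrap every word outside ALLOWED in markers, leaving the rest of the text untouched."""
    def mark(match):
        word = match.group(0)
        return word if word.lower() in ALLOWED else f"{left}{word}{right}"

    return re.sub(r"[A-Za-z']+", mark, text)
```

Because nothing is removed, pasted text survives intact and the writer can decide which marked words to replace.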

------
dintech
"At last!" he said. "My good sir! This is remarkable!"

Although in Trob the last word in fact became "a thing which may happen but
once in the usable lifetime of a canoe hollowed diligently by axe and fire
from the tallest diamondwood tree that grows in the noted diamondwood forests
on the lower slopes of Mount Awayawa, home of the firegods or so it is said."

------
grondilu
Has anyone ever written an artistically worthy novel with only those 1,000
words? Sounds like an interesting lipogram.

Also, is there a Vim plugin to do that?

~~~
cpeterso
Lucy Aikin was a 19th century author who wrote novels using single-syllable
words, including versions of Robinson Crusoe and The Swiss Family Robinson.

[https://en.wikipedia.org/wiki/Lucy_Aikin](https://en.wikipedia.org/wiki/Lucy_Aikin)

------
SixSigma
The irony that four words in the title are not in that 1,000 (in quotes here)

A "text" "editor" that only allows the 1,000 most "common" words in "English".

As determined by a similar online version that merely underlines words outside
the vocabulary :

[http://splasho.com/upgoer5/](http://splasho.com/upgoer5/)

------
bootload
_" Use it to tell your family members why their computers act up, or tell
people at work why they should pay you more."_

Very clever application. You could use this idea to only describe problems
using a dictionary of _known nouns /verbs_ to describe technical problems
using a known vocabulary.

------
austinstorm
Delightfully Orwellian.

------
bikamonki
So we can all write like potheads think!

Before downvoting me, and you will, do consider that I have zero scientific
evidence for the above sarcastic comment; however, anecdotally speaking, all
long-term pot smokers that I happen to know seem to speak with 1000 common
words, or less.

------
Brajeshwar
Hemingway[1] is a good editor too. It does not have word limitations. It does
a good job to make sure your sentences are not complex, long, and boring.

1\. [http://www.hemingwayapp.com/](http://www.hemingwayapp.com/)

------
aaroninsf
Oooo I would like to apply this as a filter, for various thresholds, to
various favorite works of e.g. fiction. Not with and replace; just a hard
filter... very Oulipo...

The resultant word count plots would I am guessing cluster by author, maybe
other things (genre?)...

------
skybrian
A while back a friend wrote an Android app for this:

[https://play.google.com/store/apps/details?id=com.fallinghaw...](https://play.google.com/store/apps/details?id=com.fallinghawks.upgoer)

------
EGreg
I am reminded of this talk:
[https://www.cs.virginia.edu/~evans/cs655/readings/steele.pdf](https://www.cs.virginia.edu/~evans/cs655/readings/steele.pdf)

------
dm03514

      The aim of Newspeak is to remove all shades of meaning from   
      language, leaving simple concepts (pleasure and pain, 
      happiness and sadness, goodthink and crimethink) that 
      reinforce the total dominance of the State

~~~
logfromblammo
Whenever I hear the word "amazing", the association that leaps most readily to
the fore of my brain is the word "doubleplusgood".

------
vtlynch
I don't understand the appeal of this at all. The first demonstration gif shows
how this mangles your text. Maybe this would be good for writing English for
audiences who learned it as a foreign language.

------
parallel
This might actually be really useful when communicating online with developers
overseas who don't speak English natively. Recently started using Upwork and
found myself rewriting instructions to use common words.

------
rasz_pl
This could actually be useful if it had a built-in word2vec dictionary/suggestion
function. Don't just delete less common words, suggest a simpler one of equal
meaning instead.

------
traviswingo
I think this is really great. I have just tasked the people in my company to
use this to explain our company in 100 words or less, without losing context
and making sense. :p

------
lsiebert
It's interesting to see people criticizing a software tool for its potential
uses and effect on society when that tool isn't cryptography.

------
rocky1138
Man, this thing totally should have been called Newspeak.

------
personjerry
I feel like this would be much better as, say, a dictionary file for Word; the
restriction seems too aggressive (i.e. what if I want a name?).

------
lallysingh
Actually, a version of this as an aspell dictionary would be lovely. Have your
favorite editor highlight all the words that don't fit.

------
stared
To make it simple, you would need to use <=1000 words _and_ avoid phrasal
verbs. Otherwise it is a kind of cheating (at least, in English).

------
libeclipse
I don't believe this is a good idea. Limiting yourself to the 1000 most common
words does not necessarily lead to clarity or simplicity.

------
brianstorms
Unclear on why this is a good thing. Dumbing down language helps whom,
exactly? I guess if you want to hasten the Idiocracy this is your app.

~~~
SwellJoe
Writing so that children, people for whom English is a second language, and
people otherwise unable to comprehend advanced writing, can understand you is
not a bad thing.

I'm considering applying this to my product documentation, even though our
users are generally well-educated...we can't afford to have docs translated
into a dozen languages, but simple English may be readable by enough foreign
language speakers to make it worth the effort. English is a tough language;
making it easier for new users of the language is great, IMHO.

~~~
yuncun
Agreed. Some people think that big words are good. Me, I think good writing
should be like good code; short and easy to understand.

~~~
filmgirlcw
Well clarity != short/simple. Which is sort of the joke of the whole book/this
idea. It can often take many more words (which just makes an idea more
complicated) if you rely on this kind of idea (which yes, I'm aware is a joke)
than just using the correct word.

It's also very possible to get the gist of a word, based on where it is placed
and the words around it -- even if you don't know exactly what it means.
(Obviously I'm not suggesting that for something like documentation writing --
but by the same token I would never think that limiting yourself to a 1,000
word dictionary for something as important as product docs would be beneficial
in any way).

~~~
SwellJoe
I strongly suspect there is a happy medium to be found. Maybe more than 1000
most popular words...and maybe just augment a limited dictionary with an
addendum for technical terms not in the short dictionary, but that are
extremely common for the subject. E.g. in my case, maybe "domain" isn't in the
dictionary, but I obviously need it to talk about DNS. I would reasonably
expect even foreign language speaking administrators to know this word.

So, it's a joke, but maybe a useful thought exercise, too.

------
tempodox
There are already enough agents out there trying to dumb us down. And I don't
want to be one of them, so thanks, but no thanks.

------
snurk
This sounds awful. We need clear grammar, not limited vocabulary.

There's evidence of limited vocabulary and lazy expression in too many venues.

------
caf
What would be interesting is a greasemonkey script that scales the size of
words to be proportional to the log of their frequency.
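
The scaling itself is a one-liner. The sketch below assumes frequency is given in occurrences per million words and clamps it at 10 so that rare words keep a readable minimum size; both the unit and the clamp are assumptions, as is the `font_size` name:

```python
import math


def font_size(frequency, base_px=4.0):
    """Font size in pixels, proportional to log10 of the word's frequency.

    `frequency` is assumed to be occurrences per million words; values below 10
    are clamped so the size never drops under `base_px`.
    """
    return base_px * math.log10(max(frequency, 10))
```

A userscript would then look up each word's frequency and set its CSS `font-size` to this value.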

------
partycoder
They can also include a Toki Pona translator.

~~~
pspeter3
Have you written anything in Toki Pona before?

~~~
partycoder
I was learning it and I could manage to understand some phrases. There are
many "neologisms" for everyday items though, requires some thinking.

------
jokoon
Isn't this a great opportunity to compress text? I wonder if there are
algorithms that use dictionaries.

------
michaelhoney
I can see a text compression scheme coming up. Convert a string to a list of
numbers, use a lookup table, done.
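
That scheme could be sketched as below, assuming both sides share the same ordered word list. `WORDS` is a five-word stand-in; with a full 1,000-word list each index would fit in 10 bits, since 2^10 = 1024:

```python
# Illustrative shared dictionary; encoder and decoder must agree on its order.
WORDS = ["the", "cat", "sat", "on", "mat"]
INDEX = {word: i for i, word in enumerate(WORDS)}


def encode(text):
    """Turn whitespace-separated words into a list of dictionary indices.

    Raises KeyError for any word outside the shared list.
    """
    return [INDEX[word] for word in text.lower().split()]


def decode(codes):
    """Rebuild the lowercase text from a list of dictionary indices."""
    return " ".join(WORDS[code] for code in codes)
```

This simple version drops capitalization and punctuation, so it only round-trips lowercase, punctuation-free text.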

------
nxzero
To provide some context, Wikipedia has a version that's written using "Plain
English" that's based on limited set of common English words:

[https://simple.m.wikipedia.org/wiki/Main_Page](https://simple.m.wikipedia.org/wiki/Main_Page)

Haven't had the chance to use the tool, but sounds like a good first step to
making English text easier to read.

------
hyperpallium
I guess it must ignore proper names, like Trump and Randall, detected by
capitalization.
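
Whether the app actually does this is a guess, but a capitalization heuristic might look like the following sketch: treat any capitalized word that is not the first word of a sentence (and is not "I") as a likely proper name. The function name is hypothetical:

```python
import re


def likely_proper_names(text):
    """Return capitalized words that do not start a sentence, as candidate proper names."""
    names = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = re.findall(r"[A-Za-z']+", sentence)
        # Skip the first word: it is capitalized regardless of whether it is a name.
        for word in words[1:]:
            if word[0].isupper() and word != "I":
                names.append(word)
    return names
```

The obvious gap is a name at the start of a sentence, which this heuristic cannot tell apart from an ordinary capitalized word.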

------
cloudjacker
This allows more than the top 1000 words. The dictionaries clearly have 3000
lines

------
ape4
Suggestion for the next word would be cool. Since there aren't so many.

~~~
kbenson
Oh, with the likely classifications of the word, such as verb, adverb, noun,
etc, this could be really useful.

~~~
patrickmay
A little bit of coding could eliminate the need for the user altogether!

------
Myrmornis
Ouch, that first example. "Hard to figure out for people"?

------
aquarin
Could this application suggest substitutions from 1000 word list?

------
vph
Don't mean to sound like an insult, but this is the first tool (that I know
of) intentionally designed to make people stupider.

------
dghughes
Joke or not, if you haven't, check out Vsauce on YouTube, "The Zipf Mystery"; it's
somewhat related.

------
artursapek
Trump writes his victory speeches in this.

