
Hapax legomenon - mutor
http://en.wikipedia.org/wiki/Hapax_legomenon
======
struppi
A German example (from
[http://de.wikipedia.org/wiki/Hapax_legomenon](http://de.wikipedia.org/wiki/Hapax_legomenon))
is "Knabenmorgenblütenträume" from Goethe's "Prometheus". It is a wonderful
example of how you can create new words in German by concatenating existing
words:

    Knabe = Boy
    Morgen = Morning
    Blüte = Blossom
    Träume = Dreams

The literal translation of this word is something like "A boy's morning dreams
about blossoms", but the true meaning (according to some internet discussions:
[http://www.gutefrage.net/frage/was-sind-knabenmorgenbluetent...](http://www.gutefrage.net/frage/was-sind-knabenmorgenbluetentraeume-))
seems to be something unfinished: every part of the word depicts something
unfinished or young.

Sometimes I really love my mother tongue :)

~~~
ygra
Technically, English allows the same, except that the parts of a compound are
separated by spaces rather than concatenated or joined with hyphens.

~~~
sanoli
It seems to work much better in German, where the parts are joined and feel
much more like a new word. As for English allowing the same thing with
spaces, though: isn't concatenating words to join their meaning a feature of
pretty much every language?

~~~
losvedir
> _isn't concatenating words to join their meaning a feature of pretty much
> every language?_

No, and that's the point.

The fact that the grammar of the language allows strings of nouns to form
compound nouns (e.g. a "River Steamboat Captain" feels natural to me) is
distinctive and interesting, and English shares it. In Spanish, for example,
you'd need prepositions/conjunctions to express that concept.

The fact that if you were to write this down, you'd leave out the spaces, is
just a weird quirk of the written system, and independent of the language
itself.

~~~
schoen
My favorite example of English's willingness to do noun-noun compounding and
Spanish's corresponding unwillingness is on the Boston subway:

Passenger emergency intercom unit at end of car

Sistema de intercomunicación para pasajeros en caso de emergencia situado al
extremo del tren

There are several things going on there that make the Spanish longer than the
English, but one is the obligatory use of explicit prepositions relating the
nouns to one another in Spanish. In English terms, the Spanish says

System of intercommunication for passengers in case of emergency situated at
the end of the train

See also this list written up by John Cowan of some languages that do
noun-noun compounding and what the implicit meanings of such compounds can be:

[http://recycledknowledge.blogspot.com/2009/12/noun-noun-comp...](http://recycledknowledge.blogspot.com/2009/12/noun-noun-compounds.html)

Notice that not all types of compounds are understood in every language, even
for languages that sometimes allow this!

~~~
sanoli
Thanks, that was a great example.

------
hacknat
As a person who got a degree in ancient languages (Hebrew, Greek and Aramaic),
I'm excited to see this article up. Hapaxes are particularly interesting in
ancient texts, because they bring up issues of understandability and
translatability.

In particular, the Bible is an interesting case. There are 1,500 hapaxes in the
Hebrew Bible, but what's truly amazing is how many dis legomena, tris
legomena, tetrakis legomena, etc. there are. The average English translation
of the Hebrew Bible uses a diminished vocabulary compared to the Hebrew,
sometimes by a factor of 5. I think religious folk of all creeds would be
interested to know how much subjective judgement goes into the translation of
their sacred documents.

------
JacobAldridge
So Googlewhacks are hapax legomena of the open internet corpus. I cannot wait
to utter that phrase at a party.

[Edit: Seems the plural is hapax legomena, not hapaxes legomenon.]

~~~
sirodoht
The reason is that hapax means "one time" in Greek, while legomenon means
"that which is said". So the noun is legomenon, and that is the word that
takes the plural.

A (not very good) analogy is "visible phenomenon". You would not say "visibles
phenomenon", but "visible phenomena".

~~~
JacobAldridge
Yes - for some reason I instinctively grouped it with culs-de-sac and
mothers-in-law.

------
twowo
It's an interesting problem to try to determine what are the limits of a
language and what is a word and what is not. Corpus studies are not sufficient
for that purpose as you will always end up with a large number of hapaxes.
Because language is based on social consensus, the most common sense approach
to the problem would be to determine 'wordiness' of a string by checking how
many people consider it to be a word.

We are trying to do something like this with large-scale studies for English
and Dutch. As it is closely related to the problem, I will allow myself to share
the links: [http://vocabulary.ugent.be](http://vocabulary.ugent.be)
[http://woordentest.ugent.be](http://woordentest.ugent.be)

------
leephillips
As a youth I was fascinated to discover a dethroned hapax legomenon in a
detective novel:

[http://lee-phillips.org/literallyEgregious/](http://lee-phillips.org/literallyEgregious/)

~~~
madaxe_again
I'm glad to know I'm not the only one whose favoured childhood reading was the
"compact" OED in tiny print. That and the '57 and '67 Encyclopaedia Britannicas
- I spent a year (in the '80s) charting the progress of mankind's knowledge in
that decade - effectively a manual diff.

Thinking about it, I now zyxt that this was probably strange.

------
jdmitch
I had thought Alice in Wonderland would have quite a few hapax legomena, even
if mostly nonsense words (e.g. 'twas brillig and the slithy toves...),
but I found from this academic article [0] that Twain's Tom Sawyer actually
has 5% more hapax legomena. The article has some pretty surprising findings
about ratios of hapax/vocabulary - though hapaxes fairly consistently make up
around 50% of the distinct words (the vocabulary) of any text, they steadily
increase in corpora over 3,000,000 words.

[0]
[http://aclweb.org/anthology/J/J10/J10-4003.pdf](http://aclweb.org/anthology/J/J10/J10-4003.pdf)
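
For anyone who wants to try the hapax/vocabulary ratio on a text of their own, here is a minimal Python sketch. It is not the method used in the article: the tokenizer (lowercase runs of letters, with internal apostrophes allowed) and the function name `hapax_stats` are assumptions made for illustration, and a different tokenization will give different numbers.

```python
import re
from collections import Counter


def hapax_stats(text: str) -> dict:
    """Count hapax legomena in a text using a deliberately naive tokenizer."""
    # Assumption: a "word" is a run of Unicode letters, optionally with
    # internal apostrophes (so "don't" stays whole); case is ignored.
    tokens = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)*", text.lower())
    counts = Counter(tokens)
    # A hapax legomenon is a word that occurs exactly once in the text.
    hapaxes = sorted(word for word, n in counts.items() if n == 1)
    return {
        "tokens": len(tokens),        # running words
        "vocabulary": len(counts),    # distinct words
        "hapaxes": hapaxes,
        "hapax_ratio": len(hapaxes) / len(counts) if counts else 0.0,
    }


sample = "the cat saw the dog and the dog saw a bird"
stats = hapax_stats(sample)
print(stats["hapaxes"], stats["hapax_ratio"])
```

In the sample sentence, "a", "and", "bird" and "cat" each occur once, so 4 of the 7 distinct words are hapaxes.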

------
jimmytidey
Love the part on "Sassigassity": a word that appears in Dickens' short story "A
Christmas Tree", and it seems that no one knows what it means.

~~~
ppod
>"The devoted dog of Montargis avenges the death of his master, foully
murdered in the Forest of Bondy; and a humorous Peasant with a red nose and a
very little hat, whom I take from this hour forth to my bosom as a friend (I
think he was a Waiter or an Hostler at a village Inn, but many years have
passed since he and I have met), remarks that the sassigassity of that dog is
indeed surprising; and evermore this jocular conceit will live in my
remembrance fresh and unfading, overtopping all possible jokes, unto the end
of time."

Surely the joke is the peasant's mispronunciation of "sagacity"? Also, an odd
Baader-Meinhof moment: that story about the dog was on the front page of
Reddit last week.

------
doe88
But I'm wondering, if a word only appears one time, in a single book, how do
we know it is a real word and not a new word invented by its author or a
mistake?

~~~
JacobAldridge
What, pray tell, is the difference between a real word and a new word
invented?

~~~
sp332
Reminds me of an author who was given a brand-new Oxford English Dictionary (a
20-volume English dictionary with etymologies) by one of his fans. His wife
was proof-reading a new manuscript when she came across a word that she was
sure wasn't real English. He said "Oh, I'm sure it's in the dictionary!" so
she went to look. A few minutes later she comes in, throws the volume at him,
and storms out. Confused, he flips to the word in the volume and finds the
etymology is... his earlier book!

------
matmann2001
I love the irony that listing hapaxes for the English language, in effect,
nullifies their hapax status.

~~~
yebyen
If I am to interpret the Wikipedia page, I think it's more common to talk
about hapax legomena within a single text, or within a single author's work.
You could assert that a word was a hapax legomenon in the corpus of a whole
language, but then you would have to have read the rest of the language's
original recorded works, and to know about other people reusing the word from
the original reference and about people who overheard your own reference
(assuming you broke the spell) without realizing they were breaking the spell.

How could you ever know if you were the one who ruined it, if it was or if it
wasn't hapax when you first found it?

------
pepijndevos
This was the answer to a sub-sub-puzzle of the MIT mystery hunt 2012
[http://web.mit.edu/puzzle/www/2012/puzzles/into_the_woodstoc...](http://web.mit.edu/puzzle/www/2012/puzzles/into_the_woodstock/sounds_good_to_me/)

------
sfrench
So interesting that this comes up today. We were discussing this yesterday in
relation to an upcoming project where we want to maintain some amount of data
for visitors to a website, but don't necessarily need to retain data for the
visitors that we see very few times.
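
A rough Python sketch of that idea, with made-up names: `split_by_frequency`, the `min_visits` threshold and the sample log are all hypothetical, and a real site would presumably count visits in a datastore rather than in memory.

```python
from collections import Counter


def split_by_frequency(visitor_ids, min_visits=2):
    """Split visitor IDs into those seen often enough to keep and the rest."""
    counts = Counter(visitor_ids)
    # Keep visitors seen at least `min_visits` times (assumed threshold).
    keep = {visitor for visitor, n in counts.items() if n >= min_visits}
    # Everyone else is a "hapax" (or near-hapax) visitor we need not retain.
    rare = set(counts) - keep
    return keep, rare


log = ["alice", "bob", "alice", "carol", "alice", "bob"]
keep, rare = split_by_frequency(log)
print(sorted(keep), sorted(rare))
```

With the default threshold of 2, "carol" is the only visitor seen once, so she is the only one in the rare set.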

------
yebyen
I think this discussion is incomplete without any reference to the word
"Gundible."

cloud.github.com/downloads/shoes/shoes/nks.pdf (Read the introduction)

------
PeterisP
Quite interesting - the concept is commonly used in natural language
processing, but I've never seen it called by that Greek term.

------
bemmu
Is there a book where each word is used only once?

~~~
czr80
I think that might be impossible to sustain for any substantial length.

It's true, admittedly, such constraints are often embraced by authors seeking
novelty; consider someone's novel written entirely without using the letter
'e'. This rule, though, seems excessive - constructions would grow
increasingly baroque, English's famously large vocabulary stretched thin,
meaning squirreled into obscure words, awkward transitions.

And yet, perhaps too hastily dismissing an idea is equally foolhardy.
Exploring Borges's library one may, indeed, see everything in sufficient
time...

~~~
userbinator
I see what you did there.

------
madaxe_again
Heller: polymesmeric. Does it count if it's used as a byline on the cover, I
wonder.

------
PaulAJ
I can just hear Hermione Granger saying "It's 'legoMEnon'"

