
Wolfram Alpha Blog : Yes, That’s Dumb—A Lexicographic Footnote - ujal
http://blog.wolframalpha.com/2009/05/24/yes-thats-dumb-a-lexicographic-footnote/
======
lunchbox
It would be cool if Wolfram Alpha incorporated functionality from Princeton's
WordNet database:

<http://wordnet.princeton.edu/>

Screenshot:
[http://minimalism.linguistics.arizona.edu/~sandiway/wnconnec...](http://minimalism.linguistics.arizona.edu/~sandiway/wnconnect/snapshotl.png)

As you can see from the screenshot, it allows you to map all kinds of
relationships between words (e.g. taxonomies).

I always found WordNet to be an amazing tool, albeit one that could use some
UX work.

~~~
programnature
That is actually what is being used here.

WordNet is one of the sources for the WordData[] function:
<http://reference.wolfram.com/mathematica/ref/WordData.html>

------
cool-RR
Why doesn't Alpha let you actually browse the synonym network? Why does it
show the sixth-level relative of a word but won't let me see all of its
immediate synonyms?

------
dxjones
Interesting: the synonym network containing "black" has 26,330 words.

The entire English language must be partitioned into K non-overlapping
synonym networks. What is the value of K?

I wonder if K = 2, with the other synonym network containing "white".

If K > 2, then how many words are in the "white" synonym network, and which
other words are connected to neither "black" nor "white"?

Whatever K is, it must be relatively small, since there are only so many words
in the language, and the synonym networks seem rather large.
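
To make the question concrete: if each word is a node and each synonym pair is an edge, K is just the number of connected components of that graph. Here is a rough sketch on a tiny hand-made word list. The links in SYNONYM_LINKS are made up for illustration and are not taken from WordNet or from Alpha's data, and the function names are my own.

```python
from collections import deque

# Toy, hand-made synonym links (hypothetical data, not from WordNet).
SYNONYM_LINKS = [
    ("black", "dark"),
    ("dark", "dim"),
    ("white", "pale"),
    ("pale", "ashen"),
    ("quick", "fast"),
]

def build_graph(links):
    """Build an undirected adjacency map from (word, word) pairs."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def synonym_networks(links):
    """Return the connected components of the synonym graph as sets."""
    graph = build_graph(links)
    seen = set()
    networks = []
    for start in graph:
        if start in seen:
            continue
        # Breadth-first search collects every word reachable from start.
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            word = queue.popleft()
            for neighbor in graph[word]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        networks.append(component)
    return networks

networks = synonym_networks(SYNONYM_LINKS)
print(len(networks))  # K for this toy data
```

One caveat on my "K must be small" guess: under this definition, a word with no synonyms at all is its own one-word network, so K could be much larger if there are many such words.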

~~~
mblakele
What about "ash"? I'd expect that to be in both networks: in "black" as a
synonym for "coal" or "cinder", and in "white" as a form of "ashen", "pale",
etc.
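
A quick sketch of why that matters, assuming "network" means everything reachable by following synonym links: a single shared word is enough to fuse two networks into one. The word pairs below are invented for illustration (I have not checked them against WordNet), and the helper names are hypothetical.

```python
def find(parent, word):
    """Return the representative of word's network (with path halving)."""
    parent.setdefault(word, word)
    while parent[word] != word:
        parent[word] = parent[parent[word]]
        word = parent[word]
    return word

def union(parent, a, b):
    """Merge the networks containing a and b."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a != root_b:
        parent[root_b] = root_a

def same_network(links, a, b):
    """True if a and b are connected through the given synonym links."""
    parent = {}
    for x, y in links:
        union(parent, x, y)
    return find(parent, a) == find(parent, b)

# Hypothetical links, for illustration only.
BLACK_SIDE = [("black", "coal"), ("coal", "cinder"), ("cinder", "ash")]
WHITE_SIDE = [("white", "pale"), ("pale", "ashen")]
BRIDGE = [("ash", "ashen")]

print(same_network(BLACK_SIDE + WHITE_SIDE, "black", "white"))
print(same_network(BLACK_SIDE + WHITE_SIDE + BRIDGE, "black", "white"))
```

Without the bridge the two words stay in separate networks; with it they end up in the same one.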

More widely, why must there be K non-overlapping networks that cover the
entire language? Naively, I'd expect an unbounded number of non-overlapping
small networks. But the two ideas of "Basic English" and of specialized
vocabularies lead me to posit that every "significant" network (a network with
N members, where N > 5,000) overlaps with at least one other
"significant" network. As a result, I would suspect that it would be difficult
or impossible to find K non-overlapping sets that partition the entire
language (where K is less than... oh, let's say 100).

~~~
jerf
An unbounded number of networks would be made of an unbounded number of nodes
in the minimal case, and I have a hard time describing the number of English
words as "unbounded". If you apply the full power of Internet debate (now with
extra axioms!) to the twisting and spinning of the term and streeeeetch the
meanings as hard as you can, maybe you can get there, but not by any sane
method.

~~~
mblakele
Oh, is English supposed to be sane? I hadn't noticed.

You're right, of course: I was using "unbounded" loosely. English uses a
finite number of phonemes, and I suppose there must be some upper limit to
the number of phonemes in a word.

------
Mongoose
Wolfram|Alpha: statistical, linguistic analysis... now with computational
racism!

------
rabidsnail
NLTK has been around for some time now and provides much richer functionality
for this sort of thing.

------
wglb
For a moment there I thought this was reddit-like Wolfram bashing.

