
Show HN: WordsAPI - impostervt
https://www.wordsapi.com/
======
wing328hk
Nice project.

I've created SDKs (REST API wrappers) in Python, PHP, Ruby, Java, C#, Android
for the WordsAPI:

[http://restunited.com/releases/424223873313015558/wrappers](http://restunited.com/releases/424223873313015558/wrappers)

(Objective C, Scala, ActionScript SDKs still in beta)

Hope these SDKs make it easier for developers to consume the API.

~~~
atmosx
In how many hours did you do that?!

~~~
wing328hk
Definitely less than an hour (around 30 minutes if I recall correctly), as the
SDKs, documentation, and sample code are automatically generated after entering
the endpoint definition. Please give it a try at
[http://restunited.com](http://restunited.com).

~~~
atmosx
Oh good to know!

------
miket
Looking forward to seeing further development!

For a more comprehensive word API, check out the excellent Wordnik API:
[http://developer.wordnik.com/docs.html#!/word](http://developer.wordnik.com/docs.html#!/word)

~~~
esperluette
You can get WordNet through the Wordnik API too ... as well as data from the
American Heritage Dictionary, The Century Dictionary, Wiktionary, and the GNU
version of the Collaborative International Dictionary of English.

And Wordnik is becoming a not-for-profit, if that makes a difference to you.
:-)

Disclaimer: I am the founder of Wordnik. (You might like my TED talk:
[http://www.ted.com/talks/erin_mckean_redefines_the_dictionar...](http://www.ted.com/talks/erin_mckean_redefines_the_dictionary.html))

That said, I know the guy behind WordsAPI and he's good people. :-)

~~~
deadlysyntax
I absolutely love your TED talk.

~~~
esperluette
aw, thank you!

------
RKoutnik
Slightly disappointed that it doesn't have syllables. Should be easy to add; I
have a list here:
[https://github.com/SomeKittens/Haiku-Generator](https://github.com/SomeKittens/Haiku-Generator)

~~~
sumgy
Out of curiosity, how did you generate that list? I looked into doing
something similar a couple of months back, but I couldn't find nearly as
complete a source for syllable counts.

~~~
RKoutnik
I started with the list from this question [0] and then wrote a one-off script
to convert it to the form linked above.

[0] [http://stackoverflow.com/q/10414957/1216976](http://stackoverflow.com/q/10414957/1216976)

------
abhididdigi
Thank you. I was looking for an API only yesterday, and didn't find any that
is as good as this one. Even the Princeton University one didn't fit into my
workflow, because it is overly complex.

~~~
impostervt
That's pretty much why I built this. WordNet is a great resource, but it's
geared towards lexicographers rather than developers.

~~~
ghchinoy
This is awesome! I'd love to hear more about how this was created
(node/express, etc.)!

~~~
impostervt
It's a Node/Express app. The word data is in Postgres (hosted by Heroku);
user/metric data is in MongoDB (via mongolab.com). Hosting is on Heroku,
fronted by Cloudflare.

~~~
ghchinoy
Thank you! Impressive. How do you limit anonymous web users' queries?

~~~
impostervt
Requests made for the demo don't have an access token. On the back end I look
for this case, and then see if the request has "when" and "encrypted"
parameters. "when" is just a date/time stamp, and "encrypted" is the same
thing, encrypted. If I see both those params, I decrypt the "encrypted" and
make sure it matches the "when" to validate that the server created it, and
make sure the "when" is less than one hour old.

Otherwise, all requests require an access token.
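
A minimal sketch of the check described above, not the site's actual code. It
swaps the encrypt/decrypt step for an HMAC signature, because Python's standard
library has no symmetric cipher; the key, the function names, and the
Unix-timestamp format for "when" are all assumptions made for illustration.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"demo-secret"   # hypothetical server-side secret
MAX_AGE_SECONDS = 3600        # "less than one hour old"


def _sign(when: str) -> str:
    """Return a hex HMAC-SHA256 signature of the timestamp string."""
    return hmac.new(SECRET_KEY, when.encode(), hashlib.sha256).hexdigest()


def issue_demo_params(now=None):
    """Build the "when"/"encrypted" pair the server embeds in the demo page."""
    when = str(int(time.time() if now is None else now))
    return {"when": when, "encrypted": _sign(when)}


def is_valid_demo_request(params, now=None):
    """Accept a token-less request only if the pair matches and is fresh."""
    when = params.get("when")
    token = params.get("encrypted")
    if not when or not token:
        return False
    # Constant-time comparison of the expected and supplied signatures.
    if not hmac.compare_digest(_sign(when).encode(), token.encode()):
        return False
    try:
        issued = int(when)
    except ValueError:
        return False
    current = time.time() if now is None else now
    return 0 <= current - issued < MAX_AGE_SECONDS
```

A request that fails this check would then fall through to the normal access
token requirement.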

~~~
bovermyer
What do you use to cache requests?

~~~
impostervt
Redis & Cloudflare.

------
mtmail
Is this a thin wrapper over
[http://wordnet.princeton.edu/wordnet/](http://wordnet.princeton.edu/wordnet/)
or does it/will it go beyond that?

~~~
impostervt
For now, that's the source of most data. I'll be adding more as time goes on.
First up is pronunciation.

~~~
zo1
Then I'd suggest you properly attribute them for the data, especially if
you're going to charge people for access (in my opinion, IANAL). See their
site on the matter:

[http://wordnet.princeton.edu/wordnet/citing-wordnet/](http://wordnet.princeton.edu/wordnet/citing-wordnet/)

~~~
impostervt
You're right, of course. I turned the "about" page on. Was holding off until I
could spruce it up a bit, but I guess it's ok for now. Wasn't expecting
hackernews to really jump on this.

------
zmillman
Hmm, what about word stemming? I looked up "windmills" and got an empty result
set.

~~~
byoung2
Same with "walked". I've used the Porter Stemming Algorithm in the past, and
it works well.

[http://tartarus.org/martin/PorterStemmer/](http://tartarus.org/martin/PorterStemmer/)

~~~
chrisfarms
The data is stored in postgres, so it should be simple enough to use the
Snowball dictionary/stemmer and the tsvector/tsquery functions to sort this
out.
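
To illustrate the general idea of falling back to a stem when the exact word
is missing, here is a rough sketch. It is a deliberately naive suffix stripper,
far cruder than the Porter or Snowball stemmers mentioned in this thread, and
the suffix list, function names, and in-memory dictionary are all hypothetical.

```python
SUFFIXES = ("ing", "ed", "es", "s")  # tiny illustrative list, not Porter's rules


def candidate_stems(word):
    """Yield the word itself, then forms with one common suffix removed."""
    yield word
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are not mangled.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            yield word[: -len(suffix)]


def lookup(word, dictionary):
    """Return the first candidate present in the dictionary, or None."""
    for candidate in candidate_stems(word.lower()):
        if candidate in dictionary:
            return candidate
    return None
```

With a dictionary containing "windmill" and "walk", this resolves "windmills"
and "walked" to entries instead of returning an empty result. A real stemmer
handles many cases this one does not (for example "running" or "ponies").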

------
las_cases
It is blazing fast for me. I see from another response that this is a Node.js
app but perhaps caching might also explain how fast this is. Also, today I
have learned that jazz also means "have sexual intercourse with".

~~~
impostervt
Got a great speedup when I added Redis. When a word is first requested, the
JSON is put together from the Postgres database tables and then stuffed into
Redis for subsequent requests.

Since there have been so many requests, most common words are in Redis at this
point.
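
The pattern described above is often called cache-aside. A minimal sketch,
assuming a plain dict stands in for Redis and a caller-supplied function stands
in for the Postgres query; the class name and key format are hypothetical.

```python
import json


class WordCache:
    """Cache-aside lookup: try the cache first, build and store on a miss."""

    def __init__(self, build_from_db, store=None):
        self._build = build_from_db                    # stand-in for the Postgres query
        self._store = {} if store is None else store   # stand-in for Redis
        self.misses = 0

    def get(self, word):
        key = f"word:{word}"
        cached = self._store.get(key)
        if cached is not None:
            # Cache hit: decode the stored JSON string.
            return json.loads(cached)
        # Cache miss: assemble the payload once and keep the JSON for later.
        self.misses += 1
        payload = self._build(word)
        self._store[key] = json.dumps(payload)
        return payload
```

The first `get("jazz")` calls the builder; every later call for the same word
is served from the store.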

~~~
las_cases
I have stumbled upon Redis a lot in really cool projects so I definitely need
to take a deeper look into it.

------
WhitneyLand
Nice work. Is the order of definitions supposed to be correct? The word "fast"
has a first definition of "unrestrained by convention or morality", yet I
would expect that to be lower down the list.

~~~
impostervt
The order doesn't convey any meaning.

------
SwellJoe
I would love to see this made available in an editor. When writing, if I could
pull synonyms and definitions up instantly with a single keypress, I suspect
my prose would be better. I like to think I have a good vocabulary, but this
great blog post (which I found through HN) has had me thinking I could always,
and _should_ try to, do better:
[http://jsomers.net/blog/dictionary](http://jsomers.net/blog/dictionary)

Imagine something akin to tab completion for writing prose.

~~~
Swizec
Please don't. Thesaurus prose is some of the worst writing out there. You can
always tell when a writer tries to go [too far] out of their natural active
vocabulary.

Basic rule of thumb for writing prose: synonyms are a myth. No two words have
exactly the same meaning.

~~~
SwellJoe
Did you read the article I mentioned? I believe that context is important.

------
PeterWhittaker
The results for _thesaurus_, _action_ (second word to pop into my head), and
_everything_ are interesting: short set; long, long set; empty set.

------
iguana
This is really cool!

One issue I found is that it provides alternative spellings as distinct items.
Is there a workaround for this?

{ "typeOf": [ "chromatic color", "chromatic colour", "spectral colour",
"spectral color", "citrus", "citrus tree", "pigment", "citrous fruit", "citrus
fruit" ] }
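
One possible client-side workaround, sketched under the assumption that you
maintain your own small map of spelling variants. The map and function names
below are hypothetical and not part of the API.

```python
VARIANTS = {"colour": "color", "citrous": "citrus"}  # hypothetical hand-written map


def normalize(phrase):
    """Replace each known variant token with its preferred spelling."""
    return " ".join(VARIANTS.get(tok, tok) for tok in phrase.split())


def dedupe_spellings(items):
    """Collapse spelling variants, keeping first-seen order."""
    seen = set()
    result = []
    for item in items:
        key = normalize(item)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result
```

Applied to the "typeOf" list above, this leaves six entries instead of nine.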

------
saganus
Very nice work. Interesting what can be done with this.

One question though: does anyone have any idea what the copyright for this
would be? If you happen to use X dictionary (one for each implemented language
for example, or even maybe as a selectable source), could, say, the Oxford
Dictionary sue because you are using their data? AFAIK, facts are not
copyrightable, are they? Where would this stand?

------
senorgusto
I wish synonyms were sorted according to usage frequency instead of
alphabetically... maybe it's not possible with the data source though?

~~~
impostervt
Not possible currently. I'd like to find a source of "how common is this word"
and add some kind of quantifiable number to each word. Perhaps I can scan
existing open source text and just figure it out. On the backlog.
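
A rough sketch of the "scan existing text and figure it out" idea, assuming
relative frequency in a corpus is a good enough measure of how common a word
is. The function names and the simple tokenizer are illustrative assumptions.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Map each word to its share of all word occurrences in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def sort_by_frequency(synonyms, freqs):
    """Most common first; unseen words last; ties broken alphabetically."""
    return sorted(synonyms, key=lambda w: (-freqs.get(w, 0.0), w))
```

The resulting number could be stored alongside each word and used to order
synonyms instead of sorting them alphabetically.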

------
jobposter1234
Would love to see something like this but that would expand abbreviations.
E.g., corp -> corporation (and the reverse).

------
kevinweaver
"An API for for the English language."

For For!

~~~
impostervt
Omg that's embarrassing. Fixing it now.

------
hellbanner
What is the minimum set of atoms needed to construct the rest of the English
language?

~~~
nacnud
The alphabet?

~~~
hellbanner
I meant words.

------
bdoerrfeld
Is [https://www.wordsapi.com](https://www.wordsapi.com) down for maintenance?
I'm getting error responses. Thanks!

------
_bitliner
I was wondering what your market is. I mean, who is going to use a service
like this? What are its typical use cases?

~~~
jfoster
Might be useful for search. (not necessarily just web search, but ecommerce as
well)

For example, instances of "aqua" should probably match the search query
"blue". Google seems like it may already be that advanced, but other search
engines perhaps not. Large-scale search engines probably would keep this in
their own DB, though.
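
As a sketch of what that could look like, assume a search index has a local
copy of "is a type of" relations such as those an API like this returns. The
table and function below are hypothetical.

```python
# Hypothetical local table of "is a type of" relations.
TYPE_OF = {
    "aqua": ["blue"],
    "navy": ["blue"],
    "blue": ["chromatic color"],
}


def matches(query, text):
    """True if a token equals the query or is a direct type of it."""
    q = query.lower()
    for token in text.lower().split():
        if token == q or q in TYPE_OF.get(token, []):
            return True
    return False
```

Here a product titled "Aqua cotton shirt" would match the query "blue", while
"red shirt" would not.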

------
bilalel
Hi,

Nice project! What about adding example sentences?

~~~
impostervt
Good idea - I'll add that to my todo list.

------
dotwebdull
How did you build the is_a relationships (eg, Person is_a Animal)?

Ontology engine? If so, what's your source?

------
gosukiwi
Nice! It seems to be quite useful.

------
nijiko
Not receiving verification emails.

~~~
impostervt
Sorry about that. Using sendgrid, which seems to delay emails from new
accounts for a bit.

------
darkhorn
How does it know that finger is part of hand?

------
deepGem
This is wordnet++. Good going.

------
dspoka
Can anyone compare the pros and cons of this to princeton's wordnet?

------
jbpadgett
Could this be used in some way for robot AI to learn to speak?

------
pseudometa
Looks great!

------
dested
Great service

------
okonomiyaki3000
What? It has no mode for finding anagrams? Useless.

