Definitely less than an hour (around 30 minutes, if I recall correctly), as the SDKs, documentation, and sample code are automatically generated after entering the endpoint definition.
Please give it a try at http://restunited.com.
You can get WordNet through the Wordnik API too ... as well as data from the American Heritage Dictionary, The Century Dictionary, Wiktionary, and the GNU version of the Collaborative International Dictionary of English.
And Wordnik is becoming a not-for-profit, if that makes a difference to you. :-)
Out of curiosity, how did you generate that list? I looked into doing something similar a couple of months back, but I couldn't find nearly as complete a source for syllable counts.
Thank you. I was looking for an API only yesterday and didn't find any as good as this one. Even the Princeton University one didn't fit into my workflow, because it's overly complex.
It's a Node/Express app. The word data is in Postgres (hosted by Heroku), and the user/metric data is in Mongo (via mongolab.com). Hosting is on Heroku, fronted by Cloudflare.
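In outline, each lookup is just an Express route in front of Postgres, roughly like this (the route, table, and column names here are simplified for illustration, not the real schema):

    // Minimal sketch: Express route backed by node-postgres.
    // The 'words' table and 'word' column are placeholders.
    const express = require('express');
    const { Pool } = require('pg');

    const app = express();
    const pool = new Pool({ connectionString: process.env.DATABASE_URL });

    app.get('/words/:word', async (req, res) => {
      const { rows } = await pool.query(
        'SELECT * FROM words WHERE word = $1', [req.params.word]);
      rows.length ? res.json(rows[0]) : res.sendStatus(404);
    });

    app.listen(process.env.PORT || 3000);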
What led you to use Mongo for user data/metrics and Postgres for words? Are there specific features of each that you're using? I'm new to both, so I'm just trying to learn which use cases prefer one over the other.
Requests made from the demo don't have an access token. On the back end I look for this case and then see if the request has "when" and "encrypted" parameters. "when" is just a date/time stamp, and "encrypted" is the same thing, encrypted. If I see both params, I decrypt "encrypted" and make sure it matches "when" to validate that the server created it, and also make sure the "when" is less than one hour old.
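Roughly, the check looks something like this; the actual cipher choice and key handling differ, so treat it as a sketch of the idea only:

    // Sketch of the demo-token check: decrypt "encrypted", compare it
    // to "when", and reject timestamps older than an hour. The AES-CBC
    // setup (key derivation, fixed IV) is illustrative, not the real code.
    const crypto = require('crypto');

    const KEY = crypto.scryptSync('server-secret', 'salt', 32); // placeholder secret
    const IV = Buffer.alloc(16, 0); // fixed IV: fine for a sketch, not for production

    function encryptWhen(when) {
      const cipher = crypto.createCipheriv('aes-256-cbc', KEY, IV);
      return cipher.update(when, 'utf8', 'hex') + cipher.final('hex');
    }

    function isValidDemoRequest(when, encrypted) {
      const decipher = crypto.createDecipheriv('aes-256-cbc', KEY, IV);
      let decrypted;
      try {
        decrypted = decipher.update(encrypted, 'hex', 'utf8') + decipher.final('utf8');
      } catch (e) {
        return false; // tampered or malformed token
      }
      // Must decrypt to the same timestamp the client sent...
      if (decrypted !== when) return false;
      // ...and that timestamp must be less than one hour old.
      return Date.now() - Date.parse(when) < 60 * 60 * 1000;
    }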
Then I'd suggest you properly attribute them for the data, especially if you're going to charge people for access (in my opinion; IANAL). See their site on the matter:
> Princeton University makes WordNet available to research and commercial users free of charge provided the terms of our license are followed, and proper reference is made to the project using an appropriate citation. [1]
You're right, of course. I turned the "about" page on. I was holding off until I could spruce it up a bit, but I guess it's OK for now. I wasn't expecting Hacker News to really jump on this.
The data is stored in Postgres, so it should be simple enough to use the Snowball dictionary/stemmer and the tsvector/tsquery functions to sort this out.
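Something along these lines (assuming a words table with a word column, which may not match the actual schema):

    // Sketch: the 'english' text search configuration applies the
    // Snowball stemmer, so 'properties' and 'property' reduce to the
    // same lexeme and the query matches the stored base form.
    const { Pool } = require('pg');
    const pool = new Pool({ connectionString: process.env.DATABASE_URL });

    async function findBaseForms(word) {
      const { rows } = await pool.query(
        `SELECT word FROM words
         WHERE to_tsvector('english', word) @@ plainto_tsquery('english', $1)`,
        [word]
      );
      return rows.map(r => r.word);
    }

Note that stemming won't catch irregular forms like "ran" -> "run", which is the lemmatizer point made below.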
What you really want is a lemmatizer (stemming approximates lemmatization). I believe that NLTK has a WordNet lemmatizer, but I don't know much about it.
It is blazing fast for me. I see from another response that this is a Node.js app, but perhaps caching might also explain how fast it is. Also, today I learned that "jazz" also means "have sexual intercourse with".
Got a great speedup when I added Redis. When a word is first requested, the JSON is put together from the Postgres database tables, then stuffed into Redis for subsequent requests.
Since there have been so many requests, most common words are in Redis at this point.
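The lookup is basically cache-aside, something like this (buildWordJson stands in for the Postgres queries that assemble the response):

    // Cache-aside sketch using the node-redis v4 API.
    const redis = require('redis');
    const client = redis.createClient({ url: process.env.REDIS_URL });
    // client.connect() must be called once at startup before any gets/sets.

    async function buildWordJson(word) {
      // ...query the Postgres tables and assemble the JSON here...
    }

    async function getWord(word) {
      const cached = await client.get(`word:${word}`);
      if (cached) return JSON.parse(cached);    // hit: skip Postgres entirely
      const json = await buildWordJson(word);   // miss: assemble from Postgres
      await client.set(`word:${word}`, JSON.stringify(json));
      return json;
    }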
Nice work. Is the order of definitions supposed to be correct? The word "fast" has a first definition of "unrestrained by convention or morality", yet I would expect that to be lower down the list.
I would love to see this made available in an editor. When writing, if I could pull synonyms and definitions up instantly with a single keypress, I suspect my prose would be better. I like to think I have a good vocabulary, but this great blog post (which I found through HN) has had me thinking I could always, and should try to, do better: http://jsomers.net/blog/dictionary
Imagine something akin to tab completion for writing prose.
Please don't. Thesaurus prose is some of the worst writing out there. You can always tell when a writer tries to go *too far* out of their natural active vocabulary.
Basic rule of thumb for writing prose: synonyms are a myth. No two words have exactly the same meaning.
Very nice work. Interesting what can be done with this.
One question, though: does anyone have any idea what the copyright situation for this would be? If you happen to use dictionary X (one for each implemented language, for example, or even as a selectable source), could, say, the Oxford Dictionary sue because you're using their data? AFAIK facts aren't copyrightable, are they? Where would this stand?
Not possible currently. I'd like to find a source for "how common is this word" and attach some kind of quantifiable number to each word. Perhaps I can scan existing open source text and just figure it out. It's on the backlog.
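Even a crude corpus scan would give a relative ranking, something like:

    // Rough sketch: count occurrences in a plain-text corpus to get a
    // relative "how common" score. corpus.txt is a placeholder name.
    const fs = require('fs');

    const counts = {};
    const text = fs.readFileSync('corpus.txt', 'utf8').toLowerCase();
    for (const word of text.match(/[a-z']+/g) || []) {
      counts[word] = (counts[word] || 0) + 1;
    }
    // counts['the'] vs. counts['aqua'] then gives a crude frequency signal.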
Might be useful for search (not necessarily just web search, but e-commerce as well).
For example, instances of "aqua" should probably match the search query "blue". Google seems like it may already be that advanced, but other search engines perhaps not. Large-scale search engines probably would keep this in their own DB, though.
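The naive version would just expand the query with synonyms before searching; a sketch (the endpoint URL and response shape here are guesses, not the actual API):

    // Hypothetical query expansion against the API.
    // Requires Node 18+ for the global fetch.
    async function expandQuery(term) {
      const res = await fetch(`https://wordsapi.example/words/${term}/synonyms`);
      const { synonyms = [] } = await res.json();
      // Search for the original term OR any synonym, so a document
      // containing "aqua" can match the query "blue".
      return [term, ...synonyms];
    }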
I've created SDKs (REST API wrappers) in Python, PHP, Ruby, Java, C#, and Android for the WordsAPI:
http://restunited.com/releases/424223873313015558/wrappers
(Objective-C, Scala, and ActionScript SDKs are still in beta.)
Hope these SDKs make it easier for developers to consume the API.