Link to original HN submission: https://news.ycombinator.com/item?id=14337275 It...

snissn · on May 22, 2017

Can you suggest any unsupervised learning? I want to take a body of text associated with users and come up with keywords/topics with each user. Thanks! :)

capybara · on May 22, 2017

Some fairly widely used techniques include LSI, LDA, and word2vec or doc2vec. There are a lot of different techniques out there! I'm one of the creators of Tagger News, and we used LDA with Python's Gensim package. Here's a good tutorial: https://radimrehurek.com/gensim/tut2.html

minimaxir · on May 22, 2017

Note that fasttext is the next generation of word2vec/doc2vec, and shares many of the same creators.

projectorlochsa · on May 22, 2017

How does fasttext compare to vowpal wabbit?

skystrife · on May 22, 2017

Basically the same [1,2].

[1]: https://twitter.com/yoavgo/status/751178795323908096

[2]: https://nlpers.blogspot.com/2016/08/fast-easy-baseline-text-...

gerenuk · on May 23, 2017

Vowpal Wabbit is approximately 4-8x faster than gensim, but its accuracy will be lower.