
Automatic Topic-Modelling with Latent Dirichlet Allocation - cschmidt
http://engineering.intenthq.com/2015/02/automatic-topic-modelling-with-lda/
======
jaydub
If you have the time to watch, I've also found David Blei's talk on LDA to be
a very interesting and clear introduction:

[http://videolectures.net/mlss09uk_blei_tm/](http://videolectures.net/mlss09uk_blei_tm/)

------
charlescearl
David Mimno has an interesting post on the status of Mallet:
[http://www.mimno.org/articles/malletplans/](http://www.mimno.org/articles/malletplans/).
Also of note is the Vowpal Wabbit variational Bayes implementation:
[https://github.com/JohnLangford/vowpal_wabbit/wiki/Latent-Dirichlet-Allocation](https://github.com/JohnLangford/vowpal_wabbit/wiki/Latent-Dirichlet-Allocation).
Spark includes a parallel LDA in the 1.3 release of MLlib.

------
MoOmer
I used Mahout's LDA to topic model lots of web pages. The main issue I came
across was that it just fell flat when I used LDA on different sites in the
same run. I tried a few different types of normalization and chrome
filtering, but it just didn't fly. It performed best for me when it was run
on a per-corpus basis.

------
atto
At Kifi ([https://www.kifi.com](https://www.kifi.com)), we use LDA quite a bit
for document classification, and have open-sourced our Scala library:
[http://eng.kifi.com/reactive-lda-library/](http://eng.kifi.com/reactive-lda-library/)

------
dchichkov
As with any other machine learning algorithm, don't waste your time trying it
on the stock market, kids. Exponentially distributed randomness in,
exponentially distributed randomness out.

And the obligatory:
[http://arxiv.org/abs/1208.4429](http://arxiv.org/abs/1208.4429)

~~~
upquark
What exactly do you mean by exponentially distributed randomness? LDA is
typically applied to textual data, so the application would be, e.g., to
automatically tag/classify financial news. I don't think the prospects of ML
are as grim in this space.

------
primaryobjects
For those interested, I created a node.js LDA library that's pretty easy to
use.

[https://github.com/primaryobjects/lda](https://github.com/primaryobjects/lda)

