
Machined Learnings: LDA on a Social Graph - DanielRibeiro
http://www.machinedlearnings.com/2011/03/lda-on-social-graph.html
======
aksbhat
Here is my research on detecting communities in Twitter's social graph (36
million users) using the label propagation algorithm.
<http://www.akshaybhat.com/LPMR/> You can find both the code and the results there.

~~~
sidupadhyay
Akshay, great use of LP. Did you consider other functions for adsorption
besides a max vote? It would be interesting to see a distribution of possible
communities based on seeded community labels from your current results.

Also, thanks for releasing your code. I'm currently working with the Junto
package <http://code.google.com/p/junto/> but I'll give your MapReduce
implementation a try for the added benefit of running LP on larger datasets.

~~~
aksbhat
The LP algorithm I used [
<http://pre.aps.org/abstract/PRE/v76/i3/e036106> ] is different from the label
propagation algorithms used for semi-supervised learning. The algorithms
implemented in Junto are semi-supervised (generally used with bipartite
graphs), while the label propagation algorithm I used performs unsupervised
community detection in networks.
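
For readers who have not seen the unsupervised variant, here is a minimal single-machine sketch of the general idea: every node starts with its own label and repeatedly adopts the label held by most of its neighbors. This is only an illustration, not the MapReduce code linked above. The function name `label_propagation`, its parameters, and the toy graph are hypothetical, and random tie-breaking with asynchronous updates is assumed.

```python
import random
from collections import Counter


def label_propagation(adjacency, max_iters=100, seed=0):
    """Unsupervised label propagation by neighbor majority vote.

    adjacency: dict mapping each node to a list of its neighbors
               (assumed undirected: if b is in adjacency[a], a is in adjacency[b]).
    Returns a dict mapping each node to a community label.
    """
    rng = random.Random(seed)
    # Every node starts out as its own community.
    labels = {node: node for node in adjacency}
    nodes = list(adjacency)

    for _ in range(max_iters):
        rng.shuffle(nodes)  # asynchronous updates in random order
        changed = False
        for node in nodes:
            neighbors = adjacency[node]
            if not neighbors:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[n] for n in neighbors)
            best = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == best]
            # Keep the current label if it is already a majority label;
            # otherwise break ties uniformly at random.
            if labels[node] not in candidates:
                labels[node] = rng.choice(candidates)
                changed = True
        if not changed:
            break  # every node already holds a majority label
    return labels


# Toy graph (made up for illustration): two disconnected triangles.
graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1],
    3: [4, 5], 4: [3, 5], 5: [3, 4],
}
communities = label_propagation(graph)
print(communities)
```

No seed labels are supplied anywhere, which is the difference from the semi-supervised setting: communities emerge purely from the graph structure. On graphs with weak links between dense groups, the result can vary with the update order and tie-breaking.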

If you don't have access to a Hadoop cluster, I have some multiprocessor code
that is slightly faster, provided you have enough RAM (32 GB for the Twitter
dataset). Drop me an email if you need it.

------
dvse
This kind of latent variable model is not only highly unstable but also tends
to tell you more about the person interpreting the results than the data
itself, e.g.: Chang et al., Reading Tea Leaves: How Humans Interpret Topic
Models, <http://umiacs.umd.edu/~jbg/docs/nips2009-rtl.pdf>

~~~
catechu
While I agree that it can be unstable (inference can get stuck in local
maxima), latent variable models like LDA _can_ be used to rigorously evaluate
textual categories (e.g. journal articles). We take for granted that the
categories we set are "useful", in some sense, so it's interesting to see that
quantitatively questioned.

e.g.:
<http://www.pnas.org/content/early/2010/11/12/1013452107.abstract>

Also, for the curious, the original paper on LDA:
<http://www.cs.princeton.edu/~blei/papers/BleiNgJordan2003.pdf>

