Here is an example of taking tweets that mention 'diabetes' and correlating topics with counties.
https://www.linkedin.com/in/karl-dailey-02557b65/treasury/po...
The original topics were actually better, but I was asked to adjust the vocabulary to wipe out language common across all tweets. You can still see interesting things going on, though. Northeast: charity, hospitals, research. South: Kool-Aid, sweet tea. Etc.
I have also done topic modeling on customer surveys for Comcast (I can't show them), and the topics identified 3 key features that led to low customer satisfaction.
I have also used LDA on grocery purchases (just for fun), and it worked out really well too.
A more interesting topic would be one that is understandable but was not obvious, even to the people intimately involved in the data's subject matter. These are harder to discover, but they do exist - and I have never seen LDA surface them.
Then, when presented with a new article on any subject, create the vocabulary and frequencies for that article, strip out the words stripped previously, and then match the result against all subjects.
Very easy to parallelize, very good results (surprisingly good, for such a simple algorithm in fact).
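The matching step described above can be sketched roughly as follows. This is only a minimal illustration under my own assumptions: the comment doesn't say how the match is scored, so cosine similarity over raw word counts is swapped in here, and the names `word_counts`, `cosine`, `best_subjects`, the stop-word set, and the two toy subject profiles are all hypothetical.

```python
import math
import re
from collections import Counter


def word_counts(text, stop_words):
    """Lowercase the text, split it into words, drop stop words, count the rest."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_words)


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors (assumed scoring)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_subjects(article, subject_profiles, stop_words):
    """Score a new article against every subject profile, best match first."""
    counts = word_counts(article, stop_words)
    scores = {name: cosine(counts, profile) for name, profile in subject_profiles.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: the words "stripped previously" and two subject profiles.
stop_words = {"the", "a", "of", "and", "is", "to", "in"}
subject_profiles = {
    "cooking": word_counts("bake the bread and simmer the sauce in a pan", stop_words),
    "astronomy": word_counts("the telescope tracks a planet and a distant star", stop_words),
}

ranking = best_subjects("A new telescope is pointed to a star", subject_profiles, stop_words)
for subject, score in ranking:
    print(f"{subject}: {score:.3f}")
```

Each article is scored independently of the others, which is why this kind of matching splits so easily across workers.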
The thing is, reviews, messages, comments, etc, in general, do tend to revolve around some central topic(s). For example, this comment right now is about LDA. It's not totally random.
This is easy enough to do. I'm wondering, what exactly is your definition of "interesting"?
Negative reviews:
http://pastebin.com/vkGn7ZLH
Amazon Echo, roughly 34k reviews.
https://en.wikipedia.org/wiki/Non-negative_matrix_factorizat...
It's surprisingly useful for finding interesting trends in reviews.
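For anyone who hasn't seen NMF before, here is a rough sketch of the idea on a tiny review-by-word count matrix, using the standard multiplicative update rules. Everything in it is an assumption for illustration: the vocabulary and counts are made up (they are not the Echo reviews), the helper names are hypothetical, and it is pure Python so it runs without dependencies. On real data you would more likely reach for a library implementation.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factor non-negative V (n x m) into W (n x k) and H (k x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)] for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)] for i in range(n)]
    return w, h


# Hypothetical toy data: rows are reviews, columns are word counts.
vocab = ["sound", "speaker", "music", "wifi", "setup", "connect"]
v = [
    [2, 1, 1, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [0, 0, 0, 2, 1, 1],
    [0, 0, 0, 1, 1, 2],
]

w0, h0 = nmf(v, k=2, iterations=0)  # the random starting point, for comparison
w, h = nmf(v, k=2)
error_before = frobenius_error(v, w0, h0)
error_after = frobenius_error(v, w, h)

for topic, row in enumerate(h):
    top_words = [word for _, word in sorted(zip(row, vocab), reverse=True)[:3]]
    print(f"topic {topic}: {top_words}")
```

The rows of H are the "topics" (weights over words) and the rows of W say how much of each topic a review contains. Both stay non-negative, which is what tends to make the topics readable.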
http://ieeexplore.ieee.org/document/7050791/