
Topic Analysis of Marc Andreessen’s Tweets - dev1n
http://dhurley14.github.io/blog/2015/01/11/marc-andreessen-lda/
======
languagehacker
Get back to me when the implementation can group the topic words into a decent
summary.

The rest of it is pretty bush league.

~~~
dev1n
So I did some research and found a way to access article body text without
having to use diffbot so that'll be fun. I found a python wrapper for
boilerpipe [1] so I plan on redoing this analysis. The amount of data I had to
work with was pathetic. This time around I'll utilize jv22222's suggestion on
how to get the hyperlinks out of the tweets too. Thanks all! This was fun :)

[1]: [https://github.com/misja/python-boilerpipe](https://github.com/misja/python-boilerpipe)

------
mahouse
So 70% of the article explains the process of getting a list of URLs?
Switch to Tweepy, please.

------
jv22222
Full links are available as entity references in each tweet in the API.
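
For anyone following along, here is a rough sketch of what that looks like, assuming the tweet has already been fetched and parsed into a dict in the v1.1 JSON shape, where each entry under `entities.urls` carries an `expanded_url`. The helper name `expanded_urls` and the sample payload are made up for illustration; nothing here calls the API.

```python
def expanded_urls(tweet):
    """Return the full URLs listed in a tweet's entities, in order."""
    entities = tweet.get("entities") or {}
    urls = []
    for item in entities.get("urls", []):
        # Prefer the expanded form; fall back to the t.co short link.
        full = item.get("expanded_url") or item.get("url")
        if full:
            urls.append(full)
    return urls


# Hypothetical payload, trimmed to the fields used above.
sample_tweet = {
    "text": "Worth reading: http://t.co/abc123",
    "entities": {
        "urls": [
            {
                "url": "http://t.co/abc123",
                "expanded_url": "http://example.com/long/article",
                "indices": [15, 33],
            }
        ]
    },
}

print(expanded_urls(sample_tweet))
```

Reading the entities avoids regex-matching the tweet text and then resolving each shortened link by hand.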

