
Ask HN: How would you go about doing a cluster analysis on Tweets? - nate
I'm trying to analyze which Tweets, with which features, get the most engagement.

For example, I'd like to know if a Tweet gets more engagement if it has hashtags or GIFs or videos or mentions, etc. There have been analyses like this before: https://www.quicksprout.com/2014/03/05/what-type-of-content-gets-shared-the-most-on-twitter/ But their methodology isn't shared, so it might not be an accurate analysis. I'm also not sure whether its results are still valid 4-5 years later.

I'm curious how others would go about this. Would you just take the top 10% of a bunch of tweets and do some basic counts on a single variable (has media vs. does not have media), then visualize which variable makes the impact?

Or would you explore how multiple variables interact? Would you do some kind of k-means cluster analysis? All pointers to education, methodologies, helpful tools/software, and previous attempts at this are very welcome.
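
To make the single-variable idea concrete, here is a minimal Python sketch of the "top 10% plus basic counts" approach. It compares how common a feature is among the most-engaged tweets with how common it is overall. The field names (`engagement`, `has_media`, `has_hashtags`), the helper `feature_lift`, and the sample data are all hypothetical and chosen only for illustration; real tweet data would need its own loading and cleaning step.

```python
from typing import Dict, List


def feature_lift(tweets: List[Dict], feature: str, top_fraction: float = 0.1) -> float:
    """Prevalence of `feature` in the top tweets divided by its prevalence overall.

    Values above 1 suggest the feature is over-represented among the
    most-engaged tweets; values below 1 suggest the opposite.
    """
    # Rank tweets from most to least engagement (hypothetical "engagement" field).
    ranked = sorted(tweets, key=lambda t: t["engagement"], reverse=True)
    top_n = max(1, int(len(ranked) * top_fraction))
    top = ranked[:top_n]

    overall_rate = sum(bool(t[feature]) for t in tweets) / len(tweets)
    top_rate = sum(bool(t[feature]) for t in top) / len(top)

    if overall_rate == 0:
        return float("nan")  # feature never appears, so the ratio is undefined
    return top_rate / overall_rate


# Made-up sample: ten tweets with engagement scores 1..10.
sample = [
    {
        "engagement": e,
        "has_media": e in (1, 9, 10),
        "has_hashtags": e <= 5,
    }
    for e in range(1, 11)
]

for name in ("has_media", "has_hashtags"):
    print(name, feature_lift(sample, name, top_fraction=0.2))
```

A ratio like this only shows association within one variable at a time. It says nothing about how features interact, which is where a regression or clustering approach would come in.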
======
neduma
Can this help? https://medium.com/@mroth/how-i-built-emojitracker-179cfd8238ac

