Ask HN: How to graph total tweets about "bitcoin" in given timeframe?

AznHisoka · on Dec 9, 2013

Use the Twitter Search API. Yes, I know people say it's not as complete as the firehose, but it's actually quite close to complete if you query it constantly, without sleeping between requests. Just keep track of the max tweet ID you've seen so far, and on every request ask only for tweets with an ID greater than that.
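
A rough sketch of that polling loop in Python, in case it helps. Everything here is an assumption for illustration: `fetch` is a hypothetical callable standing in for whatever client you use to hit the Search API, `fake_fetch` and `FAKE_TIMELINE` are made-up test data, and tweets are assumed to be dicts with an integer `id`. As far as I remember, the 1.1 search endpoint takes a `since_id` parameter for exactly this purpose, but check the docs. A real loop would also have to deal with rate limits and paging, which this ignores.

```python
from typing import Callable, Dict, List, Tuple

# Made-up data so the sketch runs without network access.
FAKE_TIMELINE = [{"id": i, "text": f"bitcoin tweet {i}"} for i in range(1, 8)]


def fake_fetch(query: str, since_id: int, page_size: int = 3) -> List[Dict]:
    """Hypothetical stand-in for a Search API call: oldest unseen matches first."""
    newer = [t for t in FAKE_TIMELINE if t["id"] > since_id and query in t["text"]]
    return newer[:page_size]


def poll_new_tweets(
    fetch: Callable[[str, int], List[Dict]],
    query: str,
    since_id: int = 0,
    rounds: int = 3,
) -> Tuple[List[Dict], int]:
    """Call fetch repeatedly, each time asking only for tweets with id > since_id."""
    collected: List[Dict] = []
    for _ in range(rounds):
        batch = fetch(query, since_id)
        if not batch:
            continue  # nothing new this round; a real loop would just try again
        collected.extend(batch)
        # Remember the largest ID seen so the next request skips everything before it.
        since_id = max(since_id, max(t["id"] for t in batch))
    return collected, since_id


collected, last_id = poll_new_tweets(fake_fetch, "bitcoin", rounds=4)
print(len(collected), last_id)
```

Persisting `last_id` between runs is what lets you restart the poller without re-fetching or double counting.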

This works especially well for queries that don't return more than XXX results per minute (yes, even for terms like Bitcoin, as you can see here: https://twitter.com/search?q=bitcoin&src=typd&f=realtime).

Then map those tweets to the specific hour (in epoch time or something), or even the specific minute or second if you want more granular detail. Index them in a search index like Elasticsearch or Solr, and then do a faceted search, which will return the number of results for each time period. You can then graph this fairly easily.
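
To make the bucketing concrete, here is a minimal in-memory sketch in Python. It is not a faceted search, just a plain `Counter` doing the same kind of per-hour count that the facet would give you; the function name `hourly_counts` and the sample timestamps are made up for the example.

```python
from collections import Counter
from typing import Iterable


def hourly_counts(epoch_seconds: Iterable[int]) -> Counter:
    """Count timestamps per hour; each key is the epoch second at which that hour starts."""
    return Counter(ts - ts % 3600 for ts in epoch_seconds)


# Made-up tweet timestamps (epoch seconds, UTC) spread over two consecutive hours.
sample = [1386547200, 1386547260, 1386550800, 1386550801, 1386554399]

for hour_start, count in sorted(hourly_counts(sample).items()):
    print(hour_start, count)
```

Swap 3600 for 60 or 1 to bucket by minute or second. The sorted (hour_start, count) pairs are what you would hand to whatever plotting tool you like.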

The flaw I see in your experiment is that there's a TON of noise on Twitter. Maybe filter out tweets from obvious spam bots (i.e., look at the follower/following ratio, ignore users who only tweet hashtags or URLs, etc.).
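
A crude version of those two spam heuristics might look like the Python below. The field names (`followers`, `following`, `text`) and the ratio threshold of 20 are assumptions picked for the example, not anything Twitter defines, so tune them against real data before trusting the result.

```python
from typing import Dict


def looks_like_spam(tweet: Dict, max_following_ratio: float = 20.0) -> bool:
    """Flag a tweet using two rough heuristics; the threshold is an arbitrary guess."""
    followers = tweet["followers"]
    following = tweet["following"]
    # Heuristic 1: the account follows far more users than follow it back.
    if following > max_following_ratio * max(followers, 1):
        return True
    # Heuristic 2: every token in the tweet is a hashtag or a URL.
    tokens = tweet["text"].split()
    return bool(tokens) and all(
        tok.startswith(("#", "http://", "https://")) for tok in tokens
    )


# Made-up examples.
tweets = [
    {"followers": 5, "following": 4000, "text": "Buy bitcoin now"},
    {"followers": 300, "following": 280, "text": "#bitcoin #btc http://example.com"},
    {"followers": 300, "following": 280, "text": "Bitcoin just passed $900 #bitcoin"},
]
kept = [t for t in tweets if not looks_like_spam(t)]
print(len(kept))
```

Run the filter before counting, so the graph reflects what is left after the obvious bots are dropped.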

dylz · on Dec 6, 2013

You'll want to use the Twitter public timeline firehose - https://dev.twitter.com/docs/api/1.1/get/statuses/firehose

Insert any tweet matching "bitcoin" into your database with its associated time. Then you can sum/avg/aggregate by time (e.g. everything from [epoch seconds] to [epoch seconds + 3600]).
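
Something like this, as a minimal sketch using Python's built-in `sqlite3` with an in-memory database. The table name, column names, and sample rows are all made up for illustration; any SQL database would work the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tweets (id INTEGER PRIMARY KEY, created_at INTEGER, text TEXT)"
)

# Made-up rows: (tweet id, epoch seconds, text).
rows = [
    (1, 1386547200, "bitcoin is up"),
    (2, 1386547260, "bitcoin is down"),
    (3, 1386550800, "still talking about bitcoin"),
]
conn.executemany("INSERT INTO tweets VALUES (?, ?, ?)", rows)

# Group tweets into one-hour windows, keyed by the epoch second each window starts at.
hourly = conn.execute(
    """
    SELECT created_at - created_at % 3600 AS hour_start, COUNT(*) AS n
    FROM tweets
    GROUP BY hour_start
    ORDER BY hour_start
    """
).fetchall()

for hour_start, n in hourly:
    print(hour_start, n)
```

Making the tweet ID the primary key also means a tweet delivered twice fails on insert instead of being counted twice.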

makerops · on Dec 6, 2013

I think it's hard/expensive to access the firehose, no?

dylz · on Dec 6, 2013

Wasn't much of an issue for me, but since he's somewhat new, I imagine reading a stream and basically doing an infinite foreach would be a tad easier than dealing with queuing, API rate limits (per hour? I forget which), timed polling, etc.

makerops · on Dec 6, 2013

There is no mechanism for search "completeness" without access to the fire hose.

http://blog.cloudera.com/blog/2012/09/analyzing-twitter-data...

^ that is a good resource if you ever get access to the firehose.

Shoot me an email at anthony@makerops.com, and when I get home I can send you some example code that I've written in the past.