I'm not a coder (undergrad economics) but can learn fast.
What I want to do is find a way of tracking tweets about bitcoin on an hourly basis.
The idea is to compare this with a graph of the price of bitcoin over time and see if there is any relationship between the two.
Why am I doing this? Mainly curiosity and improving my technical skills.
Any pointers appreciated. So far I have started learning about APIs and JSON and am trying to figure out how to use JSON feeds in a program that is good at manipulating data (i.e. not my browser).
Especially for queries where there aren't more than XXX results per minute (yes, even for terms like Bitcoin, as you can see here: https://twitter.com/search?q=bitcoin&src=typd&f=realtime).
Then map those tweets to the specific hour (in epoch time or something), or even the specific minute or second if you want more granular detail. Index them in a search index like Elasticsearch or Solr, and then do a faceted search, which will return the number of results for each time period. You can then graph this fairly easily.
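If you want to try the bucketing step before setting up a search index, here is a minimal Python sketch of the same idea. It assumes the tweets are already saved as JSON and that each one has a `created_at` string in the classic Twitter format (e.g. "Wed Oct 10 20:19:24 +0000 2018"). The sample data and the helper names `hour_bucket` and `count_per_hour` are made up for illustration.

```python
import json
from collections import Counter
from datetime import datetime, timezone

# Assumed timestamp layout of the `created_at` field.
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"

def hour_bucket(created_at):
    """Return the epoch second at which the tweet's hour starts."""
    ts = int(datetime.strptime(created_at, TWITTER_TIME_FORMAT).timestamp())
    return ts - ts % 3600  # round down to the start of the hour

def count_per_hour(tweets):
    """Count tweets per hourly bucket (a stand-in for the faceted search)."""
    return Counter(hour_bucket(t["created_at"]) for t in tweets)

# Made-up sample data standing in for a saved JSON feed.
sample_json = """[
  {"created_at": "Wed Oct 10 20:19:24 +0000 2018", "text": "bitcoin up"},
  {"created_at": "Wed Oct 10 20:45:00 +0000 2018", "text": "bitcoin down"},
  {"created_at": "Wed Oct 10 21:05:10 +0000 2018", "text": "bitcoin sideways"}
]"""

tweets = json.loads(sample_json)
counts = count_per_hour(tweets)

for bucket in sorted(counts):
    label = datetime.fromtimestamp(bucket, tz=timezone.utc).strftime("%Y-%m-%d %H:00")
    print(label, counts[bucket])
```

Use 60 instead of 3600 for per-minute buckets. The resulting counts can be lined up against hourly price data using the same epoch keys.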
The flaw I see in your experiment is that there's a TON of noise on Twitter. Maybe filter out tweets from obvious spam bots (e.g. by looking at the follower/following ratio, or ignoring users who only tweet hashtags or URLs).
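A rough sketch of that kind of filter in Python, just to show the shape of it. The threshold `MIN_FOLLOWER_RATIO` is an arbitrary guess you would need to tune, and the check runs per tweet rather than over a user's whole history, which is a simplification. It assumes each tweet carries a `user` object with `followers_count` and `friends_count` (accounts followed); the sample tweets are invented.

```python
import re

# Assumed cutoff: flag accounts with fewer than 1 follower per 10 accounts followed.
MIN_FOLLOWER_RATIO = 0.1

# Matches a token that is nothing but a hashtag or a URL.
NOISE_TOKEN = re.compile(r"^(#\w+|https?://\S+)$")

def only_hashtags_or_urls(text):
    """True if every whitespace-separated token is a hashtag or a URL."""
    tokens = text.split()
    return bool(tokens) and all(NOISE_TOKEN.match(tok) for tok in tokens)

def looks_like_spam(tweet):
    """Heuristic only: low follower/following ratio or no real words."""
    user = tweet["user"]
    followers = user["followers_count"]
    following = user["friends_count"]
    low_ratio = following > 0 and followers / following < MIN_FOLLOWER_RATIO
    return low_ratio or only_hashtags_or_urls(tweet["text"])

# Invented examples: one normal tweet, one link dump, one follow-spam account.
sample_tweets = [
    {"text": "Bitcoin just crossed a new high",
     "user": {"followers_count": 500, "friends_count": 300}},
    {"text": "#bitcoin #btc http://example.com/x",
     "user": {"followers_count": 50, "friends_count": 40}},
    {"text": "Buy bitcoin now",
     "user": {"followers_count": 3, "friends_count": 2000}},
]

kept = [t for t in sample_tweets if not looks_like_spam(t)]
print(len(kept), "of", len(sample_tweets), "tweets kept")
```

Run the filter before counting tweets per hour so the spam does not inflate the totals.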