Twitter Mood Predicts The Stock Market 59 points by duck on Oct 18, 2010 | 36 comments

 Hey, why stop at passive predicting - if we set up a few boxes with a hundred thousand twitter bots pushing "calmness" updates, we can probably make the Dow keep on going up forever (or at least 87.6% of the time). Think about all the happy 401(k)s.
 Cargo Cult Science :D
 If only the markets were actually science rather than the decision-making of a few huge players. Still, if the HFT algo writers take this research seriously, I bet it'd work for at least a day or two.
 Careful, man. People are getting sued for writing software that fiddles with the stock market. At least, if it fiddles with it without appropriate kickbacks.
 From the top right of page 6:
 > To assess the statistical significance of the SOFNN achieving the above mentioned accuracy of 87.6% in predicting the up and down movement of the DJIA we calculate the odds of this result occurring by chance. The binomial distribution indicates that the probability of achieving exactly 87.6% correct guesses over 15 trials (20 days minus weekends) with a 50% chance of success on each single trial equals 0.32%. Taken over the entire length of our data set (February 28 to December 20, excluding weekends) we find approximately 10.9 of such 20 day periods. The odds that the mentioned probability would hold by chance for a random period of 20 days within that period is then estimated to be 1 − (1 − 0.0032)^10.9 = 0.0343 or 3.4%. The SOFNN direction accuracy is thus most likely not the result of chance nor our selecting a specifically favorable test period.
 I'm not sure about the result not being chance. In particular, aren't they privileging the hypothesis by using the probability of 87.6% guesses being correct? They say the result only had a 3.4% chance of happening, yet we could assign a similar probability to a wide variety of other seemingly unlikely outcomes. I'd much rather have seen the predictor tuned on a subset of the data and then tested on the rest of the data - perhaps the authors will release their predictor and we can try it on 2009. Given that twitter and the stock market are both correlated to what is happening in the world (in fact, caused by what is happening in the world), it is not surprising that they are correlated with each other. Somehow I don't think correlation in the other direction would get the upvotes though - stock market crash predicts sadness on twitter? You don't say. In short, I don't follow all of their statistics, but I am highly dubious of their model having any predictive power.
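For anyone who wants to check the arithmetic in that quoted passage, here is a minimal sketch. It assumes "87.6% over 15 trials" means 13 correct days out of 15 (my reading, since 0.876 × 15 ≈ 13.1), and the names `binom_pmf`, `p_exact` and so on are just illustrative. It also prints the "13 or more" tail probability, which is the figure a significance test would usually use instead of "exactly 13".

```python
from math import comb

def binom_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_days = 15   # trading days in the test window, per the quoted passage
k_hits = 13   # ASSUMPTION: 87.6% of 15 is read as 13 correct days

# Probability of exactly 13 hits by coin flipping (the paper's 0.32%)
p_exact = binom_pmf(k_hits, n_days)

# Probability of 13 OR MORE hits, the more conventional tail probability
p_at_least = sum(binom_pmf(k, n_days) for k in range(k_hits, n_days + 1))

# The paper's adjustment for roughly 10.9 candidate windows in the data set
n_windows = 10.9
p_any_window = 1 - (1 - p_exact) ** n_windows

print(f"exactly {k_hits}/{n_days}:   {p_exact:.4%}")
print(f"at least {k_hits}/{n_days}:  {p_at_least:.4%}")
print(f"any of {n_windows} windows: {p_any_window:.4%}")
```

The exact-13 figure comes out at 105/32768, which matches the 0.32% in the paper. None of this addresses whether the test window was chosen after looking at the data, which is the objection raised above.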
 They managed to test and train on different periods of time.
 > February 28, 2008 to November 28, 2008 is chosen as the longest possible training period while Dec 1 to Dec 19, 2008 was chosen as the test period
 That said, their split exhibits enormous bias:
 > Dec 1 to Dec 19, 2008 was chosen as the test period because it was characterized by stabilization of DJIA values after considerable volatility in previous months and the absence of any unusual or significant socio cultural events
 I don't understand why they would think this is okay.
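One way to avoid leaning on a single hand-picked test window is walk-forward evaluation: train on everything up to some day, test on the next 15 trading days, then slide forward and repeat. The sketch below only builds the date splits over the paper's stated range; it is not the authors' procedure. The helper names are made up, market holidays are ignored, and `min_train=107` is an arbitrary value chosen so that the last window lines up with Dec 1 to Dec 19.

```python
from datetime import date, timedelta

def weekdays(start, end):
    """All Monday-Friday dates from start to end inclusive (holidays ignored)."""
    out = []
    d = start
    while d <= end:
        if d.weekday() < 5:  # 0-4 are Monday-Friday
            out.append(d)
        d += timedelta(days=1)
    return out

def walk_forward_splits(days, min_train, test_len):
    """Yield (train_days, test_days); each test window directly follows its training data."""
    start = min_train
    while start + test_len <= len(days):
        yield days[:start], days[start:start + test_len]
        start += test_len

days = weekdays(date(2008, 2, 28), date(2008, 12, 19))
splits = list(walk_forward_splits(days, min_train=107, test_len=15))

for train, test in splits:
    print(f"train {train[0]} to {train[-1]} | test {test[0]} to {test[-1]}")
```

Reporting accuracy for every window, rather than only the calm one in December, would show how much the headline number depends on the choice of test period.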
 I can see exactly why they would think it was okay. What they were trying to do was to figure out whether twitter moods could predict the unpredictable random movements of the market, rather than the outside edges of the bell curve when anyone could have predicted what would happen, and where the market moves might even be seen as a cause of the twitter moods rather than a result, or a co-result of some other process. Basically, it's the hardest place to test the correlation. Since they got great results, it'll be interesting to see them extend the analysis and see if any interesting dynamics show up.
 Whoops, missed that. Well then, in the absence of unpredictable events, twitter could predict the stock market, but it can't anymore since the market will now factor this new knowledge in (assuming perfect markets of course.)
 Pedantic response: Perfectly efficient markets. Which, as a theory, has been under rather serious siege for the better part of the last decade. You can certainly make a behavioral argument that would reach more or less the same conclusion, however.
 > perhaps the authors will release their predictor and we can try it on 2009
 Can't we somehow look and see if the author's lifestyle suddenly became more lavish? That should be a good predictor of stock market success, but then ...
 I am very angry about this. I was damned close to going live. My code was written. My backtesting was showing amazing results. Now every douchenozzle on earth will be onboard the bandwagon, and render it useless. EVERY TIME I HAVE A GOOD IDEA SOMEONE BEATS ME TO IT! WHARBLEGLARBLE!
 BTW - My 'Calmness Indicator' was simply the ratio of capital letters and exclamation marks to other text. It wasn't 87 percent, but close enough. HULK SMASH! BAD SCIENTISTS! BAD!
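The poster did not share any code, so the following is only a guess at what such a ratio could look like: count capital letters and exclamation marks, and divide by the count of all other non-whitespace characters. The function name `shouting_ratio` and the handling of the edge cases (empty text, text that is nothing but shouting) are assumptions for illustration. A lower value would mean a calmer message.

```python
def shouting_ratio(text):
    """Ratio of capital letters and '!' to all other non-whitespace characters.

    Hypothetical reconstruction of the indicator described above.
    """
    loud = sum(1 for ch in text if ch.isupper() or ch == "!")
    visible = sum(1 for ch in text if not ch.isspace())
    other = visible - loud
    if other == 0:
        # Nothing but shouting (or nothing at all): avoid dividing by zero
        return float("inf") if loud else 0.0
    return loud / other

for sample in ["the market closed flat today.", "Hello!", "HULK SMASH!"]:
    print(repr(sample), shouting_ratio(sample))
```

Averaging this over a day's worth of tweets would give one number per day, which could then be compared against the next day's market direction.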
 Do you have any idea what you would have done after your project went live? Because you can still beat them to that.
 Yes. I do. I was going to bet on the supershort and superlong funds (2-4x average market returns) for short-term periods, and double my money every few days. I was only waiting to build capital. Now I have zero time to build capital before it's leveraged away. FUDGE! I am righteously pissed off now. I had investors on the hook. Now it will all be arbitraged away from me. It's just like quicksilverscreen when Rupert Murdoch litigated it away (one of the senators who voted for the DMCA was one of the lawyers defending Fox). I am ANGRY.
 what's to keep you from going live right now? credit's cheap. max out a credit card? trade on margin? you really have no access to capital? --or you do but you'd rather use someone else's? pivot? i GUARANTEE you that people are already trading based on twitter data. i doubt you're really as bad off as you think... or maybe you are. yes. actually, your position is hopeless. might as well go open source with it now. sourceforge link pls?
 I have so damned many phone calls to make tomorrow. A lot of powerful and wealthy people will never talk to me again because of this. Fuck. Fuck. Fuckity. Fuck.
 You may be mad, but unfortunately this is not the place for it. Let us know if it pans out.
 I just lost billions of potential dollars, because some college kid couldn't see the practical applications and published. What would your response be? What if you were sleeping in someone else's garage, it was less than freezing temperature outside, someone had stolen your car a few weeks ago, and you had put your own kneecaps on the line as collateral? What would it be then? Frankly, I think I'm holding together rather well given my circumstances.
 Relax, dude. I've heard exactly this idea pitched by 2 other people; you'd have been up against "douchenozzles" even if this "douchenozzle" hadn't published.
 I think this further illustrates the theory put forth that ideas are worthless, and it's only implementations that matter.
 Your calmness index is pretty low right now. I wonder how this will affect the market.
 Interesting hypothesis, but I will wait before trusting my savings to this approach. If twitter mood can be correlated with stock market trends, I don't see how it could fare better than traditional socioeconomic indices, which measure mood in a much more controlled and accurate way. Interpreting mood trends from Twitter data assumes some very advanced semantic and linguistic analysis of messages, which even my human brain fails to understand about 25% of the time.
 I still prefer the hem line indicator. ;)
 My hypothesis: This is just another indication that stock market speculation prices spend most of the time almost completely decoupled from the value of the underlying assets. Eventually there is some triggering event, let's just call it a Lehmanbrovent, when some massively leveraged party can't pay off a creditor, and the effect cascades through the market, bringing it back to a real valuation. We call this state a "catastrophe." The decoupled state we refer to as a "healthy market." A good way to test it would be to see whether the accuracy of the twitter indicator varies based on types of market activities, such as: does it completely fail on crashes, but always predict run-ups?
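The test proposed at the end of that comment could be sketched roughly as follows: label each day by how large the actual move was, then compute the indicator's hit rate separately for each label. Everything here is an assumption for illustration: the 2% threshold, the label names, the function names, and above all the `toy` records, which are invented numbers and not real market or Twitter data.

```python
from collections import defaultdict

def regime(ret, threshold=0.02):
    """Label a daily return as 'crash', 'run-up' or 'quiet' (threshold is arbitrary)."""
    if ret <= -threshold:
        return "crash"
    if ret >= threshold:
        return "run-up"
    return "quiet"

def accuracy_by_regime(records, threshold=0.02):
    """records: iterable of (predicted_up, actual_return). Returns hit rate per regime."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for predicted_up, ret in records:
        label = regime(ret, threshold)
        totals[label] += 1
        # A day with exactly zero return counts as "not up" here
        hits[label] += int(predicted_up == (ret > 0))
    return {label: hits[label] / totals[label] for label in totals}

# Invented toy data: (did the indicator predict "up"?, actual daily return)
toy = [(True, 0.031), (True, 0.004), (False, -0.003),
       (True, -0.045), (False, 0.025), (False, -0.001)]

print(accuracy_by_regime(toy))
```

If the hit rate on "crash" days turned out to be near zero while "quiet" days carried the headline accuracy, that would fit the decoupling hypothesis above.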
 Does anybody know how they got this data? Does twitter's API offer the ability to see all incoming tweets? Glancing at their API documentation seems to suggest that they turned this ability off a while ago.
 You can get a pretty big, random chunk of the public timeline with statuses/firehose or statuses/sample
 Here is the original paper - http://arxiv.org/pdf/1010.3003v1. The paper claims their code and data are open, but the link is broken and I was unable to find them by browsing their site... which seems to consist of mostly blank pages.
 I interviewed someone who did semantic tweet research at [some university], and they said they were given an insane sample from the past x years just by making a written request to Twitter.
 From the paper:
 > Third, these results are strongly indicative of a predictive correlation between measurements of the public mood states from Twitter feeds, but offer no information on the causative mechanisms that may connect public mood states with DJIA values in this manner.
 I rather like Eric Gilbert's "Widespread Worry and the Stock Market" on this.
 Anyone want to open a hedge fund?