
Twitter Mood Predicts The Stock Market - duck
http://www.technologyreview.com/blog/arxiv/25900/
======
ivancho
Hey, why stop at passive predicting - if we set up a few boxes with a hundred
thousand twitter bots pushing "calmness" updates, we can probably make the Dow
keep on going up forever (or at least 87.5% of the time). Think about all the
happy 401(k)s.

~~~
chegra
Cargo Cult Science :D

~~~
pessimizer
If the markets were actually science rather than the decision making of a few
huge players - if the HFT algo writers take this research seriously, I bet
it'd work for at least a day or two.

------
Robin_Message
From the top right of page 6:

    
    
        To assess the statistical significance of the SOFNN achieving
        the above mentioned accuracy of 87.6% in predicting the up and down
        movement of the DJIA we calculate the odds of this result occurring
        by chance. The binomial distribution indicates that the probability
        of achieving exactly 87.6% correct guesses over 15 trials (20 days
        minus weekends) with a 50% chance of success on each single trial
        equals 0.32%. Taken over the entire length of our data set (February
        28 to December 20, excluding weekends) we find approximately 10.9
        of such 20 day periods. The odds that the mentioned probability
        would hold by chance for a random period of 20 days within that
        period is then estimated to be 1−(1−0.0032)^10.9 = 0.0343 or
        3.4%. The SOFNN direction accuracy is thus most likely not the result
        of chance nor our selecting a specifically favorable test period.
    

I'm not sure about the result not being chance. In particular, aren't they
privileging the hypothesis by using the probability of 87.6% guesses being
correct? They say the result only had a 3.4% chance of happening, yet we could
assign a similar probability to a wide variety of other seemingly unlikely
outcomes. I'd much rather have seen the predictor tuned on a subset of the
data and then tested on the rest of the data - perhaps the authors will
release their predictor and we can try it on 2009.
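
The arithmetic in the quoted passage can be checked with a few lines of Python. This is only a sketch: it assumes "87.6% over 15 trials" means 13 correct calls out of 15 (the nearest whole-number count), and it also prints the tail probability of 13 or more correct, which the quote does not report.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_days = 15      # trading days in the test period, per the quoted passage
k_correct = 13   # assumption: 13/15 is the whole-number count nearest 87.6%

# Probability of exactly 13 correct coin-flip guesses out of 15.
p_exact = binom_pmf(k_correct, n_days)

# Probability of 13 or more correct, i.e. a result at least this good.
p_at_least = sum(binom_pmf(k, n_days) for k in range(k_correct, n_days + 1))

n_periods = 10.9  # number of 20-day periods, per the quoted passage
p_any_period = 1 - (1 - p_exact) ** n_periods

print(f"P(exactly {k_correct}/{n_days})  = {p_exact:.4f}")
print(f"P(at least {k_correct}/{n_days}) = {p_at_least:.4f}")
print(f"1 - (1 - p)^{n_periods}     = {p_any_period:.4f}")
```

The exact-count figure comes out at about 0.32%, matching the quote, while the "at least this good" figure is slightly larger at about 0.37%.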

Given that twitter and the stock market are both correlated to what is
happening in the world (in fact, caused by what is happening in the world,) it
is not surprising that they are correlated with each other. Somehow I don't
think correlation in the other direction would get the upvotes though - stock
market crash predicts sadness on twitter? You don't say.

In short, I don't follow all of their statistics, but I am highly dubious of
their model having any predictive power.

~~~
b0b0b0b
They managed to test and train on different periods of time.

    
    
        February 28, 2008 to November 28, 2008 is
        chosen as the longest possible training period while
        Dec 1 to Dec 19, 2008 was chosen as the test period
    

That said, their split exhibits enormous bias:

    
    
        Dec 1 to Dec 19, 2008 was chosen as the test period
        because it was characterized by stabilization of DJIA
        values after considerable volatility in previous months
        and the absence of any unusual or significant
        socio-cultural events
    

I don't understand why they would think this is okay.
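
To illustrate why a hand-picked test window is a concern, here is a small simulation. It is a sketch and not the paper's method: it assumes a predictor with no skill at all (each daily call is right with probability 0.5), and the `n_days` value is a hypothetical round figure for the number of trading days in the data set. The paper picked its window on stated criteria, not by accuracy, so this only shows how much a favorable window can flatter a predictor.

```python
import random

def best_window_accuracy(hits: list[bool], window: int) -> float:
    """Highest fraction of correct calls over any contiguous window."""
    best = 0.0
    for start in range(len(hits) - window + 1):
        acc = sum(hits[start:start + window]) / window
        best = max(best, acc)
    return best

rng = random.Random(0)  # fixed seed so the run is repeatable
n_days = 210            # hypothetical: roughly the trading days in the data set
window = 15             # length of the test period quoted above

# A predictor with no skill: each up/down call is right with probability 0.5.
hits = [rng.random() < 0.5 for _ in range(n_days)]

overall = sum(hits) / n_days
best = best_window_accuracy(hits, window)
print(f"overall accuracy:   {overall:.3f}")
print(f"best {window}-day window: {best:.3f}")
```

Comparing the two printed numbers shows the gap between accuracy over the whole period and accuracy over the single most flattering window.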

~~~
Robin_Message
Whoops, missed that. Well then, in the absence of unpredictable events,
twitter could predict the stock market, but it can't anymore since the market
will now factor this new knowledge in (assuming perfect markets of course.)

~~~
enjo
Pedantic response:

Perfectly _efficient_ markets. Which, as a theory, has been under rather
serious siege for the better part of the last decade.

You can certainly make a behavioral argument that would reach more or less the
same conclusion, however.

------
steveitis
I am very angry about this. I was damned close to going live. My code was
written. My backtesting was showing amazing results.

Now every douchenozzle on earth will be onboard the bandwagon, and render it
useless.

EVERY TIME I HAVE A GOOD IDEA SOMEONE BEATS ME TO IT! WHARBLEGLARBLE!

~~~
pavel_lishin
Do you have any idea what you would have done after your project went live?

Because you can still beat them to that.

~~~
steveitis
Yes. I do. I was going to bet on the supershort, and superlong funds. (2-4 x
average market returns) for short term periods, and double my money every few
days. I was only waiting to build capital.

Now I have zero time to build capital before it's leveraged away. FUDGE!

I am righteously pissed off now. I had investors on the hook. Now it will all
be arbitraged away from me.

It's just like quicksilverscreen when Rupert Murdoch litigated it away (one of
the senators who voted for the DMCA was one of the lawyers defending Fox).

I am ANGRY.

~~~
steveitis
I have so damned many phone calls to make tomorrow. A lot of powerful, and
wealthy people will never talk to me again because of this. Fuck. Fuck.
Fuckity. Fuck.

~~~
ghotli
You may be mad, but unfortunately this is not the place for it. Let us know if
it pans out.

~~~
steveitis
I just lost billions of potential dollars, because some college kid couldn't
see the practical applications and published. What would your response be?

What if you were sleeping in someone else's garage, it was less than freezing
temperature outside, someone had stolen your car a few weeks ago, and you had
put your own kneecaps on the line as collateral? What would it be then?

Frankly, I think I'm holding together rather well given my circumstances.

~~~
qq66
Relax, dude, I've heard exactly this idea pitched by 2 other people, you'd
have been up against "douchenozzles" even if this "douchenozzle" hadn't
published.

~~~
pavel_lishin
I think this further illustrates the theory put forth that ideas are
worthless, and it's only implementations that matter.

------
danielnicollet
Interesting hypothesis but I will wait before trusting my savings to this
approach. If twitter mood can be correlated with stock market trends, I don't
see how it could fare better than traditional socioeconomic indices, which
measure mood in a much more controlled and accurate way.

Interpreting mood trends from Twitter data assumes some very advanced semantic
and linguistic analysis of messages, which even my human brain fails to
understand about 25% of the time.

------
jcroberts
I still prefer the hem line indicator. ;)

~~~
RickHull
<http://en.wikipedia.org/wiki/Hemline_index>

------
pessimizer
My hypothesis: This is just another indication that stock market speculation
prices spend most of the time almost completely decoupled from the value of
the underlying assets. Eventually there is some triggering event, let's just
call it a Lehmanbrovent, when some massively leveraged party can't pay off a
creditor, and the effect cascades through the market bringing it back to a
real valuation. We call this state a "catastrophe." The decoupled state we
refer to as a "healthy market."

A good way to test it would be to see whether the accuracy of the twitter
indicator varies based on types of market activities, such as: does it
completely fail on crashes, but always predict run-ups?
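
A minimal sketch of that test, assuming you already have three aligned daily series: the predicted direction, the actual direction, and a label for the kind of market activity that day. The function name and the labels are hypothetical, and the inputs in the test call are made-up placeholders, not real market data.

```python
from collections import defaultdict

def accuracy_by_regime(predicted, actual, regime):
    """Return {regime label: fraction of days where predicted == actual}."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for p, a, r in zip(predicted, actual, regime):
        totals[r] += 1
        if p == a:
            correct[r] += 1
    return {r: correct[r] / totals[r] for r in totals}

# Placeholder inputs purely to show the call shape (not real data).
result = accuracy_by_regime(
    predicted=["up", "up", "down", "up"],
    actual=["up", "down", "down", "down"],
    regime=["run-up", "crash", "crash", "crash"],
)
print(result)
```

If the indicator only works in one regime, the per-label accuracies will diverge sharply even when the overall number looks respectable.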

------
blhack
Does anybody know how they got this data?

Does twitter's API offer the ability to see _all_ incoming tweets? A glance at
their API documentation suggests that they turned this ability off a while
ago.

~~~
japherwocky
You can get a pretty big, random chunk of the public timeline with
statuses/firehose or statuses/sample

<http://dev.twitter.com/pages/streaming_api_methods>

------
robryan
From the paper:

 _Third, these results are strongly indicative of a predictive correlation
between measurements of the public mood states from Twitter feeds, but offer
no information on the causative mechanisms that may connect public mood states
with DJIA values in this manner._

------
brendano
I rather like Eric Gilbert's "Widespread Worry and the Stock Market" on this.

<http://social.cs.uiuc.edu/people/gilbert/38>

------
fleitz
Anyone want to open a hedge fund?

------
known
Trading != Investing

------
meatsock
i for one will be investing heavily in tila tequila

~~~
code_duck
Invest in synthesizing the next Bieber.

------
binspace
I wonder how long it takes for the observation to influence the results,
thereby invalidating the observation.

