
Does Anne Hathaway News Drive Berkshire Hathaway's Stock? - nir
http://www.theatlantic.com/technology/archive/2011/03/does-anne-hathaway-news-drive-berkshire-hathaways-stock/72661/
======
dewarrn1
Well, correlation certainly isn't causation, but being a data-nerd, I had to
take a look. I grabbed Anne Hathaway's web presence from Google Trends, and
Berkshire-Hathaway's performance from Google Finance. After a little
massaging, I dumped the numbers into R, and there is indeed a reliable
correlation between Anne's (per-week) web presence and B-H's weekly average
close (or closest preceding close).

data: hath$AnneTrend and hath$BerkClose
t = 4.6739, df = 373, p-value = 4.135e-06
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval: 0.1372140 0.3286587
sample estimates: cor 0.2352165

It's an r of about 0.24 (so an R^2 of only about 0.06), which isn't much, but
it's not completely spurious; Scarlett Johansson doesn't show the same pattern
(there's a weak correlation, but not reliable at p < 0.05).
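
As a quick sanity check on the figures quoted above, the t statistic for a Pearson correlation can be recomputed from the reported `cor` and `df` alone, using t = r * sqrt(df / (1 - r^2)). The sketch below only reuses the two numbers from the output; the variable names are illustrative. It also shows that the reported `cor` is r, so the share of variance explained (R^2) is r squared.

```python
import math

# Values copied from the cor.test output quoted above.
r = 0.2352165   # sample correlation ("cor")
df = 373        # degrees of freedom (n - 2)

# Share of variance explained is the square of r.
r_squared = r ** 2

# Standard t statistic for testing whether a Pearson correlation is zero.
t_stat = r * math.sqrt(df / (1.0 - r_squared))

print(f"R^2 = {r_squared:.4f}")
print(f"t   = {t_stat:.4f}")
```

The recomputed t should land very close to the 4.6739 that R reported, while R^2 comes out at roughly 0.055.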

Pearson's product-moment correlation

data: hath$ScarlettTrend and hath$BerkClose
t = 1.7687, df = 373, p-value = 0.07775
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval: -0.01016440 0.19071018
sample estimates: cor 0.09120053

However, Scarlett's career hasn't mirrored Anne's all that well. What about
someone a little closer to Anne, say, her Oscar co-host, James Franco? Note
that apart from that, the two haven't really done anything together
(<http://imdb.to/g6vwbp>), although I'd say they've both come to fame
recently.

data: hath$JamesTrend and hath$BerkClose
t = 4.5991, df = 372, p-value = 5.826e-06
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval: 0.1336854 0.3256934
sample estimates: cor 0.2319475

And that might be the clincher. James's little ups and downs shouldn't match
Anne's, and yet his recent rise to prominence seems to share the same, slight
relationship to B-H's stock value. Thanks for reading!

(edit: Ugh, awful formatting! Apologies.)

~~~
joshu
So contemporaneous correlation isn't enough to trade on. You need to look for a
leading signal. You might also want to compare the actors to the SP500.

You can make money even with a fairly low rsq if you have a good portfolio
construction and cheap execution.
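
A minimal sketch of the "leading signal" idea: instead of correlating the two series at the same time step, correlate the signal at time t with the target at time t + lag. The data here is made up purely for illustration (the target simply echoes the signal one step later), and `pearson` and `lagged_corr` are hypothetical helper names, not anything used in the thread.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def lagged_corr(signal, target, lag):
    """Correlate signal[t] with target[t + lag] for a non-negative lag."""
    if lag == 0:
        return pearson(signal, target)
    return pearson(signal[:-lag], target[lag:])

# Made-up data: the target repeats the signal one step later.
signal = [1, 3, 2, 5, 4, 6, 8, 7]
target = [0] + signal[:-1]

same_time = lagged_corr(signal, target, 0)
one_step_ahead = lagged_corr(signal, target, 1)
print(same_time, one_step_ahead)
```

In this toy case the lag-1 correlation is far stronger than the same-time one, which is the pattern a tradable signal would need to show on real data.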

------
andylei
Is there life on Mars? I talked to an employee at NASA, and he said, "Probably
not, but I guess it's not out of the question." Then I wrote a story about it.

~~~
barista
Dang! By just reading the headline, I was about to change my name to Goldman
Sachs. Heck, I'll do it anyway. With a name like Goldman you can't go wrong.

------
grimoire
He used 6 dates:

Oct. 3, 2008 - Rachel Getting Married opens: BRK.A up .44%
Jan. 5, 2009 - Bride Wars opens: BRK.A up 2.61%
Feb. 8, 2010 - Valentine's Day opens: BRK.A up 1.01%
March 5, 2010 - Alice in Wonderland opens: BRK.A up .74%
Nov. 24, 2010 - Love and Other Drugs opens: BRK.A up 1.62%
Nov. 29, 2010 - Anne announced as co-host of the Oscars: BRK.A up .25%

I'm not a statistician, but 6 dates could hardly be considered a correlation.
I would also think that no Anne Hathaway news should result in no positive
price changes for a correlation to exist.

Seems like the article confuses correlation with "funny coincidence".

And it certainly doesn't "drive" the stock, as the headline states. That
implies a causation, which is even stronger than a correlation.

~~~
akronim
_I would also think that no Anne Hathaway news should result in no positive
price changes for a correlation to exist_

That's not correct. Imagine the stock price can be derived from two factors A
and B such that price = A + B. A is correlated with the price, but a positive
price change could also be caused by B. So no change in A does not
guarantee no change in price. And in the case of BRK.A there would be many
such factors.
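
The price = A + B argument can be illustrated with a tiny made-up example. The numbers below are invented for illustration only: A stays strongly correlated with the price even though the price also rises on steps where A does not move at all, because B moved instead.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Invented factor values; price is just their sum.
A = [1, 1, 2, 2, 3, 3]
B = [0, 1, 0, 1, 0, 1]
price = [a + b for a, b in zip(A, B)]

r_A_price = pearson(A, price)

# Count steps where the price went up although A did not change.
rises_without_A = sum(
    1
    for i in range(1, len(price))
    if A[i] == A[i - 1] and price[i] > price[i - 1]
)

print(r_A_price, rises_without_A)
```

Here A and the price are clearly correlated, yet half of the price increases happen while A is flat.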

------
mayank
No it does not. The linked article relies on a naive assumption about NLP and
is devoid of any content.

To substantiate: entity resolution [1] is a widely studied research area in
NLP, and it's fair to say that researchers aren't as naive in their methods as
the linked article would suggest.

But even if naive "NLP" methods were being used, it would be silly not to mine
the link structure of news articles (or sources) to disambiguate Anne Hathaway
and Berkshire Hathaway [2]: there aren't too many news sources that mention
both in the same article, and they are outweighed by the ones that mention only
one or the other (e.g., Us Weekly vs. WSJ).

But even if there were some habitual false positives, no researcher or
programmer worth their salt would unleash their mining algorithm on the whole
web. More likely, you would target low-latency news sources like Bloomberg's
financial feed.

So I would say, no -- Anne Hathaway news almost certainly does not drive
Berkshire's stock if text mining and NLP are the methods, and shame on the
Atlantic.

[1]
[http://en.wikipedia.org/wiki/Name_resolution#Name_resolution...](http://en.wikipedia.org/wiki/Name_resolution#Name_resolution_in_semantics_and_text_extraction)
[2] (PDF)
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.66....](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.66.3235&rep=rep1&type=pdf)

~~~
jcl
It doesn't have to be NLP gone wrong. It could be a bad algorithm based on web
traffic, picking up accidental clicks from Anne Hathaway fans (or some similar
shared metric).

It's also possible that this is a natural effect of the market -- that people
seeing Anne Hathaway headlines are subconsciously reminded to do something
about their stock holdings. In which case, bad NLP would actually work better
than good NLP. :)

~~~
kanwisher
Most algo trading is going to use Bloomberg or Reuters data, since they are
very "clean" datasources that tend to be more focused on finance sectors, and
not a ton of hollywood gossip. Generally they are tagged with the actual
stocks in the articles, so you would have to be doing it on purpose.

~~~
jcl
Sorry, I mean an algorithm based, for example, on the number of clicks on
Berkshire Hathaway articles on a financial site being influenced by Anne H.
fans clicking on the wrong search result.

~~~
mayank
I get your point, but doesn't that seem a tad contrived?

------
mynameishere
Anne Hathaway in the news == no news of importance.

News of importance == bad news == shares in general go down.

...but really, if you have a good theory, don't jawbone on it; open a
brokerage account and make your fortune.

------
anigbrowl
The headline case seems highly unlikely, but there is a fund in London that
trades on Twitter sentiment, an idea which sounds like something out of a
Bruce Sterling satire. You can always sell shovels to traders and the media,
though - hire a few PhDs and a graphic designer and crank out some
infographics and popular attention is assured. Emphasize risk and caution
regularly and sooner or later events will prove you to be a genius, at which
point you can do an IPO.

------
MikeCapone
BRK has a market cap of over 200 billion. To make the price move significantly,
there would need to be pretty big trades, no? If huge trades are made on the
basis of random names mentioned online, that could explain a lot of our
troubles with capital allocation...

~~~
gaius
One word: leverage. In other words, tiny movements in that much stock can have
big effects.

------
aidenn0
It seems to me that if one algorithm did this, other algorithms would start to
notice a correlation between Anne Hathaway news and BRK stock, so they would
also want to buy when there is Anne Hathaway news. Something like this could
easily "infect" large portions of NLP that way if there is insufficient
damping in the system.

------
nostrademons
These are pretty dumb traders if this article is true. Bigrams are not hard,
and who the hell refers to Berkshire Hathaway as just "Hathaway"?

More likely explanation: there're spurious correlations wherever you look, and
The Atlantic just happened to run across one.
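
To make the bigram point concrete, here is a minimal sketch that matches on the two-word sequence rather than on the lone token "hathaway". The headlines are invented examples, and `bigrams` and `mentions_company` are hypothetical helper names; nothing here is claimed to be what any real trading system does.

```python
import re

def bigrams(text):
    """Lowercase the text, split into alphabetic tokens, pair neighbours."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return list(zip(tokens, tokens[1:]))

def mentions_company(text, company=("berkshire", "hathaway")):
    """True only if the two-word company name appears as adjacent tokens."""
    return company in bigrams(text)

# Invented headlines, for illustration only.
headlines = [
    "Anne Hathaway to co-host the Oscars",
    "Berkshire Hathaway reports quarterly earnings",
]

flags = [mentions_company(h) for h in headlines]
print(flags)
```

Only the second headline is flagged, since the first contains the bigram ("anne", "hathaway") but never ("berkshire", "hathaway").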

~~~
droz
My favorite example of this: <http://pubs.acs.org/doi/abs/10.1021/ci700332k>

------
kijiki
<http://www.theregister.co.uk/2008/09/10/ua_bankruptcy_farce/> Suggests that
this sort of automated text mining happens. Of course, it is from The
Register...

------
Gaussian
File this one under "stupid data mining tricks."

------
clu3
Oops, I always thought Anne Hathaway's father's name was Berkshire Hathaway

------
Confusion
Coincidence strikes again. Someone should convince journalists that
correlation is not causation; not even when some other random fact happens to
suggest a connection to our overly enthusiastic pattern-seeking minds.

