
Google Search Terms Can Predict the Stock Market - tonyrice
http://blogs.smithsonianmag.com/science/2013/04/google-search-terms-can-predict-the-stock-market/
======
rm999
Cool paper, but the fact that a silly keyword like 'color' is so significant
without an author explanation makes me question their results. I really wish
they dug deeper and explained unintuitive results like that. There may be a
bug in their experiment.

Also, why would they use the Dow Jones Industrial Average? It is a
ridiculously bad average of stock market performance for many reasons. This
planet money podcast goes into why:

[http://www.npr.org/blogs/money/2013/03/12/174139347/episode-...](http://www.npr.org/blogs/money/2013/03/12/174139347/episode-443-dont-
believe-the-hype)

~~~
PaulHoule
Lots of people use the Dow precisely because it is a bad indicator. If they
don't like the conclusion they get with something better, like the W5000 or
the S&P500, they can try it again with the Dow for a "second opinion."

The whole methodology is a joke anyway. If you evaluate a huge number of
search terms, some of them are going do better than others. So terms that
might mean something ("debt") get mixed up with terms like ("color") that got
lucky.

What you have to do is come up with one strategy that trains a classifier on
the combined words, tunes it up with logistic regression and derives optimal
trading actions from that. And you've got to factor in what you're paying to
the broker and the market makers.

~~~
ISL
Relevant technical terms: 'look-elsewhere effect' and 'false alarm rate'

<http://en.wikipedia.org/wiki/Look-elsewhere_effect>
<http://en.wikipedia.org/wiki/False_positive_rate>

There are more similar considerations. Statisticians have been very busy for
the last century or so.

~~~
rubinelli
As my statistics professor liked to say, "Numbers are meek creatures. If you
torture them long enough, they will say whatever you want to hear."

~~~
asperous
This is the key to the accounting profession

------
DoubleMalt
Google search terms could also predict a completely random time series. You
just have to pick the right search terms.

If you look at enough words the chances, that some of them have predictive
value approaches 100%.

~~~
Iterated
I used to backtest trading indicators a lot and the majority of time the
indicator would work on a stock like Apple but wouldn't work on a stock like
Tyson Foods. Or any other stock. I was just fitting the curve.

------
Symmetry
Ignoring methodology concerns for a moment, there have been times in the past
where people have found statistical regularities like this in the stock
market. Sometimes people have made money on these discoveries. If not someone
else will as soon as the researcher publishes. But making money on them gets
rid of them, and given that this is now publicly available information those
regularities certainly no longer exist, if they ever existed in the first
place.

EDIT: Ok, I couldn't resist the xkcd reference: <http://xkcd.com/882/>

~~~
tgrass
An economist and his friend are walking down the street when the friend sees a
ten dollar bill on the sidewalk.

“Look,” he says, “it’s a ten dollar bill”. “Nonsense,” says the economist. “If
that was a ten dollar bill, someone would have picked it up by now.”

~~~
Symmetry
I'm (and economists) aren't saying that there are never $10 bills on the
sidewalk, just that they don't stay there for long after someone notices them.
Stock prices went down on the weekends for decades before someone noticed. But
once it was noticed it stopped pretty quickly.

~~~
mikeyouse
I'm more interested in the persistent market inefficiencies; growth vs. value,
January effect, low P/E, all interesting examples that haven't been arbitraged
away.

~~~
someben
Check out the behavioral finance literature, like the Shleifer book:
[http://www.amazon.com/Inefficient-Markets-Introduction-
Behav...](http://www.amazon.com/Inefficient-Markets-Introduction-Behavioral-
Clarendon/dp/0198292279)

------
someben
Out-of-sample backtest or it didn't happen. Out of a bunch of random search
queries, chances are several of them will be "predictive" of futures moves in
an index. Big whoop.

------
zazerr
Related: The time Eric Schmidt said

"One day we had a conversation where we figured we could just try to predict
the stock market, and then we decided it was illegal. So we stopped doing
that."

~~~
fennecfoxen
Why would it be illegal? :(

I mean, subject to a variety of risk and regulatory issues, perhaps, and not a
core competency, and a variety of other things, but... the fundamental reason
for insider-trading laws is to make sure the insiders don't abuse their
positions and act against the interests of the company's owners (shareholders)
by trading in stock tips instead of building shareholder value. If you get
knowledge independently -- like if you sent your analyst out to count the
number of cars in a firm's parking lots to estimate hiring/firing -- that's
all well and good (and is something that hedge funds actually _do_.)

Heck, if it works, this sort of thing would be great. Moving the market
earlier means fewer people buy and sell companies at the wrong price. That
sort of knowledge is worth billions. (Think about it from the perspective of
startups: if you could see the future and know whether a company would work
out before you even founded it, then you would build only successful
companies, and you'd be certain they'd be funded well. This is but a small
fraction of that power, but it's still quite meaningful.)

~~~
mynegation
The legality of information you obtain is not about whether you obtained it
independently, but based on whether it is public knowledge (or can be derived
from it) or not.

Let's say you are not connected to company X, but you have a friend that works
there. One day your friend tells you material fact. Now, your friend probably
broke a couple of rules, but _you_ cannot control what people tell you and you
cannot be held liable for what you hear. However, acting on this information
is illegal. Even if you presumably do not use this information, but trade
shares of company X before the information you know becomes public, you are in
murky waters.

Same with Google. Google cannot control what people search for. However,
acting on this information would be illegal, unless they are absolutely sure
that information in the search terms is consistent with public knowledge about
the company.

~~~
bjterry
I believe this is not a fully accurate picture, although the example you give
in your second paragraph is definitely accurate.

At a high level you aren't allowed to use "material, non-public information"
for investment purposes, but information isn't material just because you can
make money off of it in some way, otherwise "channel checks" would be illegal.
Material non-public information has to come from insiders of the company, so
the only argument that could be made that it was illegal for Google to make
investments is based on material information that was being provided to them,
via search terms, by corporate insiders. If they are merely using the
sentiment exposed by the public to them through search terms that is probably
legal. Similarly it's legal for hedge funds to fly planes over department
stores and count the cars in their parking lots to gauge the level of business
they are seeing at Christmas time, even though this isn't public information.

------
mattparcher
Somewhat related, on a more micro level: some have suggested a trend in which
news about Anne Hathaway drives up the price Berkeshire Hathaway’s stock.

Six examples from 2008-2010: [http://www.huffingtonpost.com/dan-mirvish/the-
hathaway-effec...](http://www.huffingtonpost.com/dan-mirvish/the-hathaway-
effect-how-a_b_830041.html)

A computer scientist who works with hedge funds: "We come across all sorts of
strange things in our line of business, strange correlations"
[http://www.theatlantic.com/technology/print/2011/03/does-
ann...](http://www.theatlantic.com/technology/print/2011/03/does-anne-
hathaway-news-drive-berkshire-hathaways-stock/72661/)

As the intelligence of our technology grows, I find it amusing to consider
these less-logical patterns in the algorithms (indirectly) responsible for so
much of our economy.

~~~
OGC
So it's just "Correlation does not imply causation"?

------
Guvante
You are always going to be able to find trends between data points if you have
enough information. In order to tell whether you actually have predictable
information you need to do a double test, which they don't seem to mention
anywhere.

Typically you split your data pool in half and use data analysis to determine
some kind of relation, such as the change in "color" seems to correlate to a
reverse change in DOW Jones. You then generate a prediction algorithm based on
that.

Finally you use your prediction system on the other half of your data, to see
if you are actually predicting or just correlating. Feel free to adjust any of
your methodology but make sure you don't include the second half in your
generation step or else you have only shown correlation not prediction.

------
bdg
This concept is very similar to the "Twitter can predict the stock market"
from a long time back: [http://www.wired.com/wiredscience/2010/10/twitter-
crystal-ba...](http://www.wired.com/wiredscience/2010/10/twitter-crystal-
ball/)

See also, a collection of other related links (helpful if you want to try your
hand at making something like this):
<https://news.ycombinator.com/item?id=2055857>

~~~
someben
That Twitter paper has been torn to shreds over the last couple of years. Here
is me snarking: [http://blog.someben.com/2011/05/sour-grapes-seven-reasons-
wh...](http://blog.someben.com/2011/05/sour-grapes-seven-reasons-why-that-
twitter-prediction-model-is-cooked/)

------
dawilster
I was thinking of something very similar the other day but in regards to
bitcoins. They are very volatile at this stage and whenever they're in the
public eye they either seem to increase or decrease in value. Perhaps somebody
could perform a study as to a relation between the volume of bitcoin related
search terms and it's market value.

~~~
hammock
As soon as someone figures out bitcoin derivatives (e.g. options), you'll be
able to buy or sell the volatility. More broadly, such a development may also
stabilize those price fluctuations.

~~~
someben
What is your source for the ole' "derivatives reduce volatility in the spot
market" assumption?

~~~
fennecfoxen
Theory. If you expect that the price of bitcoins is going to be somewhere in
the next month, but it's wildly deviated from that on the spot market, there
is a monetary incentive to buy/sell bitcoin until the prices are in line with
long-term expectations -- discounted for risk, of course. Derivatives like
futures options reduce that risk, because you can lock in that future price
right now. If anyone's offering, that is.

Derivatives could also reduce risk for users. For instance: you're a business
that accepts bitcoins for payment. But the price of bitcoins can fall by 50%
in a day! So you'd better transfer those bitcoins to cash the moment you get
them, or you're subject to losing half your cash any day. Something that's too
hot to handle like that kind of damages the ability to use it as a medium of
exchange, doesn't it? But you could alternatively buy bitcoin currency
futures, essentially locking in your current price. So maybe you could hang
onto those bitcoins a while and use them to pay people for your referral
program or something. (Oh, sure, there's a cost associated with it, but there
are costs associated with translating cash back and forth as well. Depending
on the price of the derivatives, it might make sense.)

~~~
someben
When people use derivatives to hedge their exposure to "bad things happening"
and the market is not very liquid (i.e. BTC), arbitrage opportunities can
persist. For example, if there is no buyer at an out-of-the-money option
strike price I know is overpriced, then the market cannot correct itself even
though I am "right." When these mis-pricings stick around for a while, the
eventual corrections are often much more extreme. Derivatives reduce overall
volatility when the spot market is mature and liquid, which is a far cry from
MtGox right now.

------
alakin
Been playing around with google trends and trading for a little while now, and
there could be something to this.

If anyone in SF is also into this stuff, we should grab coffee. Email is
antonlakin at gmail.

------
xedarius
Many many moons ago I had a similar idea, although my idea was to spot
patterns of eBay purchases. I had a feeling that people in various parts of
the world would buy certain types of items depending on economic trends. Once
I had identified purchasing trends, it seemed to me that I might be able to
predict trends.

------
AznHisoka
Don't think it's that simple. Example: There's a sudden burst in searches for
"Macys". They could have started a huge advertising campaign. Turns out that
campaign costed them billions of dollar, and didnt have high ROI.

~~~
fennecfoxen
This seems more like a macroeconomic-leading-indicator strategy. The algorithm
worked against the Dow Jones Industrials, not against individual stocks.

------
ianstallings
Someone somewhere at a hedge fund just chuckled and said "ya think?"

------
ericb
How do you get access to google search terms for something like this? Also,
anyone have any other hypothesis about search terms that would predict
movement?

~~~
teraflop
<http://www.google.com/trends/>

Hal Varian, Google's chief economist, has been talking about using Google
Trends this way for a while:
[http://www.frbsf.org/economics/conferences/1103/Varian-
part_...](http://www.frbsf.org/economics/conferences/1103/Varian-part_1.pdf)

------
st8ic
1) This has been around for many, many years. 2) No free lunch.

------
brador
How easy is it to scrape Google trends? Is there an API?

------
andylei
good luck getting 'color' to work out of sample

