
Google Search Terms Can Predict the Stock Market (2013) - 3chelon
http://www.smithsonianmag.com/science-nature/google-search-terms-can-predict-the-stock-market-41584532/?no-ist
======
qf303rjr3
No they can't - or at least, this paper doesn't provide any compelling
evidence that they can.

I read this paper when it first came out a few years ago, and produced an
implementation of the signal. They have heavily overfitted to historical data
- many plausible alternative assumptions for which keywords are predictive are
not profitable in backtest at all, let alone useful as a basis for future
trading.

This is an unfortunate example of non-finance domain experts, who I'm sure are
more than capable in their respective fields, making egregious errors when
they try to apply their knowledge in finance.

[https://xkcd.com/1570/](https://xkcd.com/1570/)

~~~
Kurtz79
Isn't overfitting historical data a basic mistake in any machine learning
exercise, regardless of the domain?

I thought the common practice was to use part of the historical data for
creating the model, and another sizable, non-overlapping chunk to validate it.

~~~
KMag
> I thought the common practice was using part of the historical data for
> creating the model, and another sizable, non overlapping chunk to validate
> it.

One problem is that too often, people break the data into a training set and a
testing set. Then they train N algos on the training data, test them on the
testing data, and then trade on the algo that tested best.

Once you use the testing set for more than one algo, it's really a
meta-training set.

Really, you need a training set, a testing set, and a validation set. If you
use the validation data set with more than one algo, it's no longer a
validation set.

So, you train N algos, test N algos. Pick the best, and validate it. If
validation fails, do you have enough discipline to wait for more data to come
in and try again? Most people do not and will make hand-wavy arguments about
why it's okay to re-shuffle the same data into 3 data sets and try again.
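The discipline described above can be sketched in a few lines. This is a hypothetical illustration with synthetic data, not anything from the paper: the "algos" are just threshold choices for a toy long/short rule, and the split is chronological.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy time-ordered data: a weekly "signal" and next-week market return.
# By construction there is no real edge here.
signal = rng.normal(size=300)
returns = 0.01 * rng.normal(size=300)

# Chronological three-way split: train / test / validation.
train, test, val = slice(0, 180), slice(180, 240), slice(240, 300)

def backtest(threshold, sig, ret):
    """Go long when signal < threshold, short otherwise; return total P&L."""
    positions = np.where(sig < threshold, 1.0, -1.0)
    return float(np.sum(positions * ret))

# "Train" N candidate algos (here, N threshold choices) on the training set.
candidates = np.linspace(-1.0, 1.0, 21)
train_scores = [backtest(t, signal[train], returns[train]) for t in candidates]

# Score all N on the test set and pick the best.
# This step "uses up" the test set: it is now a meta-training set.
test_scores = [backtest(t, signal[test], returns[test]) for t in candidates]
best = candidates[int(np.argmax(test_scores))]

# The validation set is touched exactly once, by the single chosen algo.
val_pnl = backtest(best, signal[val], returns[val])
print(f"chosen threshold={best:.2f}, validation P&L={val_pnl:.4f}")
```

The point of the structure is the last two lines: if `val_pnl` disappoints, the honest move is to collect new data, not to reshuffle these 300 points into fresh splits.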

~~~
JoeAltmaier
It's an infinite regress. You keep needing more data to be completely
'fair'. If the data set is finite, eventually you use all of it. Then where do
you go?

Another route is to model the data source, and train on the model (which you
can run forever to get endless data). Then test on the real-world data. But
that's only as good as the model.

------
jbb555
I don't think they got this right.

It's easy to find terms with hindsight that correlated well with what happened
(even if you test that prediction with some of your data and reject ones that
didn't work). It's not so obvious that they will be any guide to the future.

If aliens attack next week and the market falls week by week for six months,
then a year from now we're likely to find that the stock market decline
correlated strongly with use of the word "alien" in the week before the
attack, compared with the weeks before that, when the market was rising and
the word was hardly mentioned.

It's easy to find correlations with hindsight; the skill is in predicting what
they will be in advance.

------
bko
>Instead of looking at the frequency that the names of stocks or companies
were searched, they analyzed a broad range of 98 commonly used
words—everything from “unemployment” to “marriage” to “car” to “water”—and
simulated investing strategies based on week-by-week changes in the
frequencies of each of these words as search terms by American internet users.

> The strategy was relatively straightforward: The system tracked whether a
> word such as “debt” increased in search frequency or decreased in search
> frequency from one week to the next. If the term was suddenly searched much
> less frequently, the investment simulation bought all the stocks of the Dow
> on the first Monday afterward, then sold all the stocks one week later,
> essentially betting that the overall market would rise in value.

> If a term such as “debt” was suddenly searched much more frequently, the
> simulation did the opposite: It bought a “short” position in the Dow,
> selling all its stocks on the first Monday and then buying them all a week
> later. The concept of a “short” position like this might seem a bit
> confusing to some, but the basic thing to remember is that it’s the exact
> opposite of conventionally buying a stock—if you have a “short” position,
> you make money when the stock goes down in price, and lose money when it
> goes up. So for any given term, the system predicted that more frequent
> searches meant the market as a whole would decline, and less frequent
> searches meant it would rise.

So there were [98] terms followed, with no insight as to how they were chosen.
Then everything was bought/sold on the [following Monday] and sold/bought [one
week] later. That seems like a lot of choices made seemingly arbitrarily by
the researchers. Reminds me of the xkcd comic on jelly beans [0]

[0] [https://xkcd.com/882/](https://xkcd.com/882/)
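The rule quoted above is simple enough to sketch. The data here is entirely invented (random search frequencies and Dow closes, one term, arbitrary seed); only the buy/short logic follows the article's description.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical weekly data: search frequency for one term (e.g. "debt")
# and Dow closing prices -- both synthetic, for illustration only.
weeks = 104
search_freq = rng.uniform(0.5, 1.5, size=weeks)
dow_close = 10_000 * np.cumprod(1 + 0.005 * rng.normal(size=weeks))

position_value = 1.0  # start with one unit of capital
for t in range(1, weeks - 1):
    delta = search_freq[t] - search_freq[t - 1]
    weekly_return = dow_close[t + 1] / dow_close[t] - 1
    if delta < 0:
        # Searches fell: buy the Dow on Monday, sell a week later.
        position_value *= 1 + weekly_return
    else:
        # Searches rose: take a short position for one week instead.
        position_value *= 1 - weekly_return

print(f"final portfolio value: {position_value:.3f}")
```

Note how many free parameters even this toy version has: which term, the one-week change window, the one-week holding period, the Monday entry. Each is a degree of freedom that can be (over)fit to history.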

------
qwrusz
This is the last sentence of the article; it might help you decide whether
reading this thing is worth your time:

> "But why do searches for the words “color” and “restaurant” predict declines
> nearly as accurately as “debt”? Why do “labor” and “train” also predict
> stock market rises?"

~~~
bbctol
I think there's a corollary to the old "Any headline that asks a question can
be answered with 'no'": any question in the discussion section of an academic
paper that they don't even try to answer can be answered with "because you
screwed up somewhere."

------
empath75
This includes the stock market collapse of 2008 in its data set, which was
driven by a once-in-a-lifetime debt crisis. I'm not sure it would work
going forward.

~~~
throwawayReply
Nothing guarantees it, but what makes you think that a tool hooked up in
real time to what people everywhere are thinking and querying _wouldn't_ be a
good predictor for the stock market?

The precise terms used perform randomly, as can be seen from the spread, so
the fact that 'debt' came out on top is less interesting than the fact that
the spread itself is significantly wider than what would be expected if the
terms were distributed randomly with a mean impact of zero.

Or as they put it in the paper, "The distribution of final portfolio values
resulting from the random investment strategies is close to log-normal" ...
"We find that returns from the Google Trends strategies we tested are
significantly higher overall than returns from the random strategies (<R>US =
0.60; t = 8.65, df = 97, p < 0.001, one sample t-test)."
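The shape of that test is a one-sample t-test on the per-term returns against a null of zero. A minimal sketch with invented stand-in numbers (the mean, spread, and seed are mine; only n = 98 and the null hypothesis match the quote):

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented stand-ins: one cumulative return per search term (n = 98),
# tested against a null hypothesis of zero mean return.
term_returns = rng.normal(loc=0.6, scale=0.7, size=98)

n = len(term_returns)
mean = term_returns.mean()
sem = term_returns.std(ddof=1) / np.sqrt(n)  # standard error of the mean
t_stat = mean / sem  # one-sample t statistic, df = n - 1

print(f"mean={mean:.2f}, t={t_stat:.2f}, df={n - 1}")
```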

What the paper is saying is that, as a whole, their basket of terms performed
better than random strategies.

You are free of course to try to reproduce this study to see if such a
strategy can be used going forward. It would be interesting to investigate the
effect of introducing a Bayesian aspect, such as investing more weight (money)
into words that have been performing better, much like multi-armed bandit A/B
testing.
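That bandit-style weighting could be sketched with exponential (Hedge/Exp3-style) weights over the terms. Everything below is hypothetical: the per-term returns are synthetic, and the learning rate of 5.0 is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(3)

n_terms, weeks = 5, 52
# Invented weekly per-term strategy returns; term 0 has a slight real edge.
edge = np.array([0.004, 0.0, 0.0, 0.0, 0.0])
term_returns = edge + 0.02 * rng.normal(size=(weeks, n_terms))

weights = np.ones(n_terms) / n_terms  # start with equal capital per term
cumulative = np.zeros(n_terms)
for week in range(weeks):
    # Allocate this week's capital by current weights, then update the
    # weights from realized performance: exponential weighting, as in
    # Hedge/Exp3-style multi-armed bandit algorithms.
    cumulative += term_returns[week]
    weights = np.exp(5.0 * cumulative)
    weights /= weights.sum()

print("final weights:", np.round(weights, 3))
```

With enough weeks, capital drifts toward whichever terms have genuinely performed better, rather than committing to a fixed basket up front.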

Edit: The selection of the basket of terms is of course important; whether it
comes from knowledge of recent history or is itself part of the algorithm
matters, as mentioned above regarding overfitting.

~~~
nonbel
>"What the paper is saying, is as a whole their basket of terms performed
better than random strategies."

This is a totally meaningless metric unless they took measures to blind
themselves to the validation data. Did they do that, or did they try out a
bunch of different things, run interim analyses on the performance, etc. (as
is almost always done in academia)?

If the latter, all this "test" amounts to is checking whether A=3 after you
consciously set A=5. Disproving such a hypothesis has zero value...

[https://www.kaggle.com/wiki/Leakage](https://www.kaggle.com/wiki/Leakage)

------
edward
If it were a real $20 bill, someone would have picked it off the sidewalk
already.

------
cheriot
If it does now, it won't for long.

Edit: Won't any more. The title needs a [2013].

------
mikkom
Note: This is from 2013 and probably arbed to unprofitability by now.

------
dschiptsov
Partially observable, multi-variable (multi-dimensional), mostly stochastic,
multiple-causation phenomena cannot be predicted from a set of prior
observations, by definition. Correlation does not imply causation. It does
not even imply that the observed events describe the phenomenon adequately.

