
LSTM Neural Networks for Time Series Prediction - shivinski
http://www.jakob-aungiers.com/articles/a/LSTM-Neural-Network-for-Time-Series-Prediction
======
achompas
The key with autocorrelated models is to benchmark them against a naive
alternative. I really appreciate Jakob's point that the LSTM might simply be
modeling one step ahead using the current data point and Gaussian noise. Such
a candid assessment is important in applied work.
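
To make the benchmark concrete, a minimal sketch of that naive alternative (assuming numpy; the random-walk series here is a stand-in, not real data):

    import numpy as np

    def persistence_rmse(series):
        # Naive one-step-ahead forecast: predict y[t+1] = y[t].
        preds, actuals = series[:-1], series[1:]
        return np.sqrt(np.mean((preds - actuals) ** 2))

    prices = np.cumsum(np.random.randn(1000))  # stand-in random walk
    print(persistence_rmse(prices))  # any one-step model should beat this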

I suspect that, similar to other time series applications, you'll find some
interesting signals from exogenous effects. I wonder how LSTMs can incorporate
this exogenous information for time series analysis.

------
rubyfan
I feel like machine learning needs more content like this: practical
application of ML techniques with code samples but, _more importantly_, real
source data, data manipulation, and a firm understanding of the desired output.

------
highd
I'm unclear - aren't you showing results for training data anyway? The network
might just be compressing the trends in the training data into its function.
The question should be whether the same network works on another time series,
right?

I.e. I could build a lookup table mapping the last k data points to the next
one, and you'd need a massive training dataset to make that not work well on
the train set (something like O(exp(k)*k/N) time series).
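
A minimal sketch of that lookup table (my own illustration, assuming numpy):
memorize every length-k window seen in training and "predict" by exact recall.
It scores near-perfectly on the training set while learning nothing that
generalizes.

    import numpy as np

    k = 5
    series = np.random.randint(0, 8, size=2000)  # toy discretized series

    # Memorize every length-k window -> the value that followed it.
    table = {}
    for i in range(len(series) - k):
        table[tuple(series[i:i + k])] = series[i + k]

    def predict(window):
        # Exact recall; constant fallback on unseen windows.
        return table.get(tuple(window), 0)

    # Training-set "accuracy" is essentially perfect on pure noise.
    hits = sum(predict(series[i:i + k]) == series[i + k]
               for i in range(len(series) - k))
    print(hits / (len(series) - k))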

~~~
rubyfan
No, time series analysis doesn't work like that. You can't train on one series
and then predict another series' outcome. When training on a time series, you
are basically looking for a signal that is not portable to some other subject.

It's more than just compressing the training set: a human can use a training
set to learn seasonality, volatility, streak durations, moving averages, etc.,
which are learnings that can be used to infer future movement. The LSTM is
learning from its own observations to predict the next tick.

~~~
highd
Time series can 100% work like that. If you expect your time series data to
come from a similar distribution, that is exactly how you train your LSTM.
It's not just a magic box - you have to train it to encode useful features in
its gates.

Sure, you can prefer to use subsets of a single time series instead of
multiple time series. The issue remains that it doesn't matter what your
performance on training data is. You still need to partition your dataset into
training and test data - otherwise you could just be storing a lookup table
for all you know. It looks like the author has trained on the entirety of the
dataset, and then is just considering that performance...
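
A minimal sketch of that partition for time series (my own, assuming numpy):
hold out the most recent chunk and never shuffle across the cutoff.

    import numpy as np

    series = np.cumsum(np.random.randn(1000))  # stand-in series

    # Chronological split: the test period comes strictly after training,
    # so no training window leaks information from the future.
    split = int(len(series) * 0.8)
    train, test = series[:split], series[split:]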

Let me put this another way. Do this with a random walk: train on every
length-50 window of your entire time series. Say there are only 8 unique
values at each timestep. That means there are 8^50 possible input sequences
into the neural network. A sufficiently complex neural network can fit an
arbitrary function, so if you only have a couple thousand windows, there are
something like 8^50 / 1000 functions that predict the correct output exactly
- and this is on noise! And in all likelihood the neural net will learn that
noise: [https://arxiv.org/abs/1611.03530](https://arxiv.org/abs/1611.03530)
Without comparing training and test results there's no way to know the neural
network learned anything of value - it can get perfect accuracy on training
data that's pure noise!
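
A quick way to see this for yourself - a sketch assuming TensorFlow/Keras, not
the article's code: fit an over-parameterized LSTM on windows of pure noise
and watch training loss keep falling while held-out loss stays stuck near the
noise variance.

    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, Dense

    # Pure-noise target: the increments of a random walk are unpredictable.
    steps = np.sign(np.random.randn(3000))
    X = np.array([steps[i:i + 50] for i in range(len(steps) - 50)])
    y = steps[50:].reshape(-1, 1)
    X = X[..., np.newaxis]  # (samples, timesteps, features)

    split = int(len(X) * 0.8)  # chronological split
    model = Sequential([LSTM(128, input_shape=(50, 1)), Dense(1)])
    model.compile(loss='mse', optimizer='adam')
    hist = model.fit(X[:split], y[:split], epochs=50,
                     validation_data=(X[split:], y[split:]), verbose=0)

    # Expect train MSE to keep shrinking while val MSE hovers near 1.0
    # (the variance of the noise) - memorization, not learning.
    print(hist.history['loss'][-1], hist.history['val_loss'][-1])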

This stuff is really critical to get right if you're doing machine learning.

~~~
rubyfan
What I interpreted the parent comment to mean was two different subjects. For
example, I can't train on weather data from Paris, France and then expect it
to predict tomorrow's weather in Portland, Oregon. Am I wrong on that?

~~~
highd
You may be able to do that; it's sort of a matter of preference which you'd
like to do. If the two datasets share more structure, then it's more
advantageous to share the network. There are also a bunch of hybrid
approaches, e.g. pretraining on every city and then fine-tuning each
independently.
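
A sketch of that hybrid approach, assuming Keras (the per-city arrays below
are random stand-ins for real data):

    import numpy as np
    from tensorflow.keras.models import Sequential, clone_model
    from tensorflow.keras.layers import LSTM, Dense

    # Stand-ins for real (windows, targets) arrays pooled across cities.
    X_all, y_all = np.random.randn(500, 50, 1), np.random.randn(500, 1)
    X_paris, y_paris = np.random.randn(100, 50, 1), np.random.randn(100, 1)

    # Pretrain one shared network on every city pooled together...
    base = Sequential([LSTM(64, input_shape=(50, 1)), Dense(1)])
    base.compile(loss='mse', optimizer='adam')
    base.fit(X_all, y_all, epochs=10, verbose=0)

    # ...then fine-tune a per-city copy starting from the shared weights.
    paris = clone_model(base)
    paris.set_weights(base.get_weights())
    paris.compile(loss='mse', optimizer='adam')
    paris.fit(X_paris, y_paris, epochs=3, verbose=0)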

------
huac
There's a good comment on the article about using this kind of network to
predict direction rather than return. I think that's where this would show the
most promise: if you know the direction that a (liquid) asset or the market
will go, then you can make money via a long/short strategy.
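
A sketch of that reframing, assuming Keras (toy data, not the article's): turn
returns into up/down labels and train a classifier instead of a regressor;
anything reliably above 50% out-of-sample is what a long/short strategy would
trade on.

    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, Dense

    prices = np.cumsum(np.random.randn(2000)) + 100  # toy price series
    returns = np.diff(prices)

    # Windows of past returns -> binary label: is the next return positive?
    X = np.array([returns[i:i + 50] for i in range(len(returns) - 50)])
    y = (returns[50:] > 0).astype(int).reshape(-1, 1)
    X = X[..., np.newaxis]

    model = Sequential([LSTM(64, input_shape=(50, 1)),
                        Dense(1, activation='sigmoid')])
    model.compile(loss='binary_crossentropy', optimizer='adam',
                  metrics=['accuracy'])
    # validation_split holds out the last 20%, which is chronological here.
    model.fit(X, y, epochs=5, validation_split=0.2, verbose=0)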

------
ge96
So... can you or can you not predict stocks with ANNs... haha, guess I won't
be quitting my job any time soon.

>A stock time series is unfortunately not a function that can be mapped.

~~~
1024core
> So... can you or can you not predict stocks with ANNs... haha, guess I
> won't be quitting my job any time soon.

Not sure about "predicting" stocks with ANNs, but how do you explain the
Medallion Fund averaging about 30%/year (ballpark) without _a single losing
year since 1990_?

[https://www.bloomberg.com/news/articles/2016-11-21/how-renai...](https://www.bloomberg.com/news/articles/2016-11-21/how-renaissance-s-medallion-fund-became-finance-s-blackest-box)

~~~
brobinson
Or Virtu having a single losing day in a six-year period?

[https://www.bloomberg.com/news/articles/2015-02-20/high-freq...](https://www.bloomberg.com/news/articles/2015-02-20/high-frequency-trader-virtu-extends-nearly-unblemished-streak)

I wonder what the Sharpe ratios of the trading systems these funds run are.

~~~
nomnombunty
Virtu does high frequency trading so it makes sense that they don't have many
down days. Also HFT strategies can have ridiculous sharpe ratios of like 100
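
For reference, the standard annualization is what makes those numbers huge: a
daily Sharpe gets multiplied by sqrt(252), so tiny-but-consistent daily wins
compound into triple digits. A sketch (assuming numpy, with made-up returns):

    import numpy as np

    def annualized_sharpe(daily_returns, risk_free_daily=0.0):
        # Mean excess return over volatility, scaled by sqrt(252) trading days.
        excess = daily_returns - risk_free_daily
        return np.sqrt(252) * excess.mean() / excess.std()

    # A strategy that wins a tiny amount almost every day:
    daily = np.random.normal(loc=0.001, scale=0.0005, size=252)
    print(annualized_sharpe(daily))  # ~30+, despite tiny absolute gains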

~~~
makeset
True, although conventional metrics like Sharpe ratio or ROI are not very
meaningful for HFT models, because they can't scale with any additional
capital (you can safely assume they are scaled to the max). Their returns are
extremely consistent, but also ultimately limited in magnitude. Rather than
magical money-making machines who have cracked the "secret code" of financial
markets, HFTs are essentially a fixed-cost utility service for reducing market
inefficiency through improved price discovery.

------
techbio
The LSTM shown will or will not predict the direction of a random walk. But it
does OK with a sine function.
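
For anyone who wants to reproduce that, a minimal sine-wave version in the
article's spirit (assuming Keras; not the author's exact code):

    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, Dense

    wave = np.sin(np.linspace(0, 100, 5000))

    # Sliding windows of 50 points predicting the 51st.
    X = np.array([wave[i:i + 50] for i in range(len(wave) - 50)])
    y = wave[50:].reshape(-1, 1)
    X = X[..., np.newaxis]

    split = int(len(X) * 0.8)
    model = Sequential([LSTM(50, input_shape=(50, 1)), Dense(1)])
    model.compile(loss='mse', optimizer='adam')
    model.fit(X[:split], y[:split], epochs=5,
              validation_data=(X[split:], y[split:]), verbose=0)

    # Held-out error should be tiny here, unlike on the random walk above.
    print(model.evaluate(X[split:], y[split:], verbose=0))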

