Using a Keras Long Short-Term Memory Model to Predict Stock Prices (fritz.ai)
296 points by austin_kodra 3 months ago | 166 comments

This is pure nonsense. This isn’t even the right way to begin thinking about this as a forecasting task — the target series should be log-normal returns, not raw asset price. The performance of this model is laughably bad, which is probably why he spends zero time evaluating its effectiveness. You could trivially get better forecasts than this by naively repeating the last-observed price.

This isn’t ML. It’s cargo-cult performance of words and ideas that ML people use.

Even more so since test data can't be from the same time range: i.e. for time series you need to split train/test by date, not randomly, otherwise your model just memorizes the series.

It's standard practice to validate forecasts on non-randomized test/validation splits of the same time series, since this simulates the conditions where the model will be deployed in reality: It will know everything there is to know about the past, and it will know nothing about the future.

See Hyndman's fpp2 — https://otexts.org/fpp2/accuracy.html

Also, his description of rolling window validation: https://robjhyndman.com/hyndsight/rolling-forecasts/
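The chronological split both commenters describe can be sketched in a few lines (a minimal sketch using NumPy; the synthetic random-walk series and the 80/20 ratio are invented for illustration):

```python
import numpy as np

# Synthetic daily "price" series (a random walk -- illustration only)
rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(0, 1, 100))

# Chronological split: train on the past, test on the future.
# Never shuffle a time series before splitting.
split = int(len(prices) * 0.8)
train, test = prices[:split], prices[split:]

# Every training observation precedes every test observation in time.
print(len(train), len(test))  # -> 80 20
```

This is exactly the setup Hyndman's rolling-forecast post generalizes: slide the split point forward and re-fit to simulate repeated deployment.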

The article splits in time, not randomly.

As someone outside the tech sphere on either coast, that's all ML seems to be.

What I've seen from companies marketing to Higher Education is: we have a lot of data; you set arbitrary flags on the data that you believe indicate 'x' (or even better, they have pre-built data expectations), and you will get 'y' outcome.

And none of it is actually based on anything real. It's all anecdote applied to extreme amounts of actual data.

And when I read about ML on here, it seems to confirm my experiences.

> that's all ML seems to be.

You have to be very selective about what you consider "ML" to come to that conclusion. There has been a constant parade of incredible, mind-blowing results out of ML over the past decade, advancing the state of the art by leaps and bounds both in research and in real applications.

Do you not remember how terrible speech recognition and speech synthesis were just a few short years ago? Did you not see DeepMind finally crack Go? Check out BigGAN [1]. Try search in Google Photos. See translation getting better every year.

Yes, there are quacks and charlatans and people who are just plain wrong. But ML is real, and it solves real problems that people failed to solve any other way despite decades of concerted effort.

[1] https://medium.com/syncedreview/biggan-a-new-state-of-the-ar...

I concur totally; however, I think we should add a caveat that is relevant for this audience: only the major consumer tech companies are actually reaping the benefit of applied ML in a way that is profitable.

That's because they have the platforms and applications that people are using at scale. So ML is a force multiplier if you already have a consistent and strong user base for a good product.

If you're trying to get a product or company started, unless you're a pure ML research company like Clarifai (and arguably even then), ML is probably going to cost you more than you gain.

These days, startups founded by people who have worked in big tech companies and have witnessed the power of ML/AI will definitely apply it from day one.

You don't have to invent or implement the algorithm yourself. You can use AI/ML intelligent/cognitive services provided by the big-3 cloud companies to reduce your starting cost significantly.

To have positive ROI, your problem complexity should have crossed the threshold where common-sense traditional solutions don't work any more.

If you do embark on ML research yourself, then be sure to walk the path from simpler models to complex ones while carefully establishing performance metrics.

The problem isn’t that ML has gotten worse. It’s just as rigorous and far more powerful than it ever was. The problem is ML is hard, it hasn’t gotten orders-of-magnitude easier to understand, and there’s enormous incentive now to pass off amateur understanding as complete. The real ML still happens — it’s just drowned out.

I think people are saying that that's something they're not happy with. The big methods in AI, like backprop,

1) work a LOT better for specific problems than statistics or statistical learning ever has (and at this point, I think we can safely say: ever will)

2) include many methods that either can't be explained, or outright shouldn't work, according to statistical theory.

The use of statistics in machine learning is limited to evaluating performance and individual element performance (and even that is tenuous at best in many cases). If you ask, say, why an autoencoder with an LSTM on its compressed representation and Q-learning evaluation would have somewhat decent performance on half the computer games humans ever designed, statistics will not be useful in formulating an answer.

If you ask extremely valid questions, like "why would an LSTM predict anything?", statistics draws a blank. There is no good reason to assume an LSTM will ever converge (and on a truly random dataset, it won't, whereas statistical methods will still allow you to say something).

I think there are two reasons for this:

1) the "upper limit" of complexity a human can understand in a statistical model is lower than the upper limit a neural network can "understand". In statistics the human understanding is critical to getting to a valid model, in machine learning ... it is not. Meaning machine learning can learn relationships a human mind cannot.

2) There must be some fundamental property of the world we live in that matches neural network architecture. In order for backprop to work on real-world problems, it has to be the case that almost all real world phenomena are continuous, both "raw" and in the frequency domain. If this wasn't the case, machine learning would never be able to learn anything.

> can't be explained, or outright shouldn't work, according to statistical theory

When people invented the steam engine, some other people probably said: "Modern physics can't explain how it works. It shouldn't work. It's too complicated." Then a few decades later, physicists discovered the laws of thermodynamics.

It's so much worse to see ML applied to problems that could surely be studied and understood mechanistically, but people are sold on using ML instead in deference to buzzwords alone. The result is that they may get some model of a phenomenon, but they'll never learn a goddamn thing about why it works. What good is that?

It depends on what your goal is; sometimes it’s the destination, sometimes it’s the journey.

Can you clarify why "the target series should be log-normal returns" is important or provide a pointer for more information?

Investment markets operate on relative gain, not absolute gain. E.g.: if you invest in a stock and it gains $5, this would be a great return for a $1 stock but a poor one for a $1000 stock, so the absolute gain doesn't mean anything on its own. A 5% return always means that you've gained 5% on your investment.

Would it be the same (valid) with percentage returns?

Less so. Log-normal returns are better because they have the property that a summation of log-normal returns over contiguous intervals is equal to the log-normal returns of the combined interval. In other words: Losing 5% and then gaining 5% doesn’t put you back at exactly 100%, and log-normal fixes that.

In the extreme, two successive trades, where the first gains 110% and the second loses 100%, “average” out to a 5% return. However, you don’t want to make that pair of trades.
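The additivity property is easy to check numerically (a minimal sketch; the ±5% moves are just the example from the comment above):

```python
import math

# Simple returns don't add: lose 5% then gain 5% and you are NOT back at 100%.
wealth = 1.0 * (1 - 0.05) * (1 + 0.05)   # 0.9975, a 0.25% net loss

# Log returns over contiguous intervals sum to the log return of the
# combined interval; exponentiating the sum recovers the compounded wealth.
r1 = math.log(1 - 0.05)   # log return of the -5% move
r2 = math.log(1 + 0.05)   # log return of the +5% move
assert abs(math.exp(r1 + r2) - wealth) < 1e-12
```

This is why a model trained on a series of log returns produces targets that compose correctly across time steps, while raw prices or simple percentages don't.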

Ah, that makes sense. Many thanks.

I think their point is that predicting "the stock will go up by $1" or "the stock will go down by $1" is worse than "the stock will go up by 0.05%" or "the stock will go down by 0.05%" because of this little paradox:

A $50 increase from $50 is a 100% increase; a $50 decrease from $100 is a 50% decrease.

E.g., if the model finds $50 increases/decreases, those actually correspond to very different wealth changes.

Thank you.

Well, just at a minimum... There's the fact that the majority of large gains and losses exist over single-day frames. That is, guessing "right" or "wrong" on movement means little when one slip on the wrong day will decimate your returns.

Welcome to ML in 2018.

http://scikit-learn.org/stable/modules/generated/sklearn.pre... should normalize the data, so raw prices won't be used; instead, the normalized coefficients are effectively a percentage.
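Note that a min-max scaler fit on historical prices doesn't handle new extremes gracefully. A minimal sketch of the scaling formula (plain NumPy rather than scikit-learn, with invented values):

```python
import numpy as np

# "Fit" min-max scaling on a training window of prices
train = np.array([10.0, 15.0, 20.0])
lo, hi = train.min(), train.max()

def scale(x):
    # The standard (x - min) / (max - min) transform
    return (x - lo) / (hi - lo)

# A new price above anything seen in training maps outside [0, 1];
# with clipping enabled it would instead be flattened to 1.0,
# so distinct new highs become indistinguishable.
print(scale(25.0))  # -> 1.5
```

Either failure mode (out-of-range inputs or clipped-flat inputs) is a problem for a model that only ever saw [0, 1] during training.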

It's the AI winter we're all fearing!

Clearly one LSTM didn't work. Let's try FOUR!

- edit: miscounted number of LSTMs

I'm not a purist believer in the efficient market hypothesis. BUT, I doubt there's much alpha to be gained simply from looking at price data which is widely and publicly available. Also keep in mind that markets are dynamic feedback loops so even if this model had an edge, the act of publishing this article would work against you to neutralize that edge in the future.

There's a good reason the most successful 'quantitative' trading fund, Renaissance Technologies, is so secretive and subjects its employees to a lifetime NDA/non-compete.

If there is edge in the market (and I think there are anomalies that can be traded on), I would assume it to come from correlating proprietary datasets with price, or datasets that have a high barrier to access/analyze. One example that I'd bet still works in some industries would be counting trucks from a supplier to estimate product demand before the company announces earnings ;)

I 100% agree with this as a Daily Fantasy Sports player who has published some blog posts and code related to strategy. Any success I've had has been in sports and formats that are not popular and I purposely do not write or open-source code on. Edge can go away almost immediately; I saw this firsthand in DFS when a number of websites came in with free tools that pros had been using for years.

I play in the stock market a little and my best wins have been on small cap stuff where I've had some edge with unique knowledge of the industry and have taken the time to read SEC filings / keep tabs on earnings reports / closely watch competitors, acquisitions, etc.

To be clear, I still think this article is great from a learning about Keras perspective. That said, to anyone who thinks building some ML models and outperforming the market is easy, remember that edge is only as good as the number of participants in the market who don't have it.

I'm very interested to read some of your blog posts related to strategy/code around DFS, but I don't know what your blog is. Would you mind posting a link to one of these posts?

Definitely, sorry for late reply. Here's a few:

- Pick Em strategies on DraftKings - https://medium.com/draftfast/evaluating-possible-strategies-...

- Thinking in multiples - https://medium.com/draftfast/thinking-in-multiples-7e7c76ee2...

Rentech and the other more secretive top quant firms such as TGS and PDT use nearly exclusively public data.

Funds like Two Sigma that haven't had as good (or scalable) returns are actually the ones that focus on novel data sources.

In reality, everyone in this industry is drowning in data, and the real edge comes from learning how to more efficiently parse and analyze data rather than acquiring more of it. The top performing firms work at a level where the data is merely an abstraction; above-market returns can be achieved from nothing but public data sources, if you've automated the process of extracting signals.

These are bold claims with nothing to support them. Given the claimed secrecy, I don't see how you could possibly know all that, unless you are (or have been) working for Rentech, or one of the others. Is that the case?

Funny things I've heard about Two Sigma. Everyone on the outside: Two Sigma is the best. People at the top of the financial industry: Two Sigma is cheesy.

I'm pretty sure that lifetime NDA is a myth. It gets repeated very often (like a lot of the Renaissance mythos) but I haven't seen evidence for it, and on its face it sounds flagrantly unenforceable. It could be that the compensation alone substantially reduces turnover. Likewise once they've hit "their number", employees may decide to go into tech or philanthropy instead of a competing firm.

For what it's worth, Renaissance (remarkably, in my opinion) advertised a role in the most recent Who's Hiring thread.

The NDA/non-compete may not be enforceable, but if Renaissance believes employees with valuable IP are being poached, they will sue and make it very expensive to hire, and then defend, ex-Rentec employees.


They’re totally up front about this requirement, too, when making offers. (Source: I turned one down.) You are to move to Long Island, buy a house, buy into the employees-only fund and work there for the rest of your financial career.

That's interesting. Did you apply, or did they find you? I could see that offer being attractive.

LOL, No.

How do you even enforce that? If I get a job somewhere else, there's no reason for me to disclose where I'm going.

> How do you even enforce that?

By suing your new employer for $20 million [1].

(On the other side of the coin, they offer like a month of paid vacation a year, plus longer sabbaticals every few years, plus tons of benefits...if Long Island and math are your thing, it's a very nice place to work.)

[1] https://www.marketwatch.com/story/renaissance-millennium-set...

They will prob run regular checks and you are prob. signing off on them having a right to do so.

These predictions can also be self-reinforcing. If enough money believes the model is accurate, it will create its own market conditions. In a dystopian AI stock-prediction world, the best model will be the one with the widest publicity and adoption, not the best data points.

The problem I see with that hypothesis is that the asset pricing must eventually be tied to actual performance--while it may be self-reinforcing to some extent, if it _is_ fundamentally wrong, there will be a reckoning and an adjustment after a high-value stock goes bankrupt, for example.

But the stock market isn’t just a bunch of people placing speculative bets about company performance; it’s a bunch of people placing speculative bets about company performance by buying shares of the companies. If the market is irrational, it can actually prop up companies that would otherwise go bankrupt.

I suppose the company could continuously issue stock in this case.

But in this world, it actually makes sense for those companies to stop doing their normal business and just go into the business of selling their shares.

This reality sounds absurd, but you could argue that BTC market is there. Enough of the market thinks that "always buy" is a good investment, regardless of the real-life value of the asset. Or maybe, less controversially, gold is that market. Any asset that is always increasing in value and not related to the real value of that asset is just a store of value.

Both of those things have valid services that I pay for.

BTC provides a number of services. From money changing and international money transfers to actual investment brokerage. Granted, the number of securities available in BTC is less than spectacular, but it's not zero. I pay for both those services. Now you may argue that those are unregulated services and therefore have trust issues, but one might argue that all markets have trust issues, and the only difference is the level. BTC, so far, seems to be more trustworthy than, for instance, the ECB (e.g. the Greek payment limits and the Cypriot bail in, one of which affected me, and both of them used MY money to achieve political aims, without my approval).

Gold provides a store of value, with a good story behind it. I pay my bank, I believe, around $40 per year for that same service. With frankly, not as good a story behind it (as I trust my bank less than I'd trust a bar of gold under my pillow when it comes to still having value tomorrow. Not that I have the kind of spare change to make that a pressing issue, but ...)

So given that both BTC and Gold provide services that clearly people are willing to pay for, who's to say they shouldn't have a valuation based on that income like every other financial service provider in the world ?

That effect doesn't go very far. It can slightly reduce a company's cost of capital but won't enable them to be profitable over the long term.

Generally, when talking about price movements, the opposite is true. That is, a prediction that the market will move at some point in the future will cause that move to happen right now instead, but only if that prediction is believed.

And if you have a model that people believe which predicts the movement of stocks but doesn't have any good reason for those predictions then it may very well move the price of those stocks in the ways that it predicts. But in doing so the money of the people who believe its predictions will be transferred to the people who don't until it stops having an effect.

If enough people believe a given model and use it, it will actually tend to make the model over- and under-predict. If a lot of people think the market will go up, it's a race to buy before the others, and eventually a race to sell before the others when either expectations shift or all the buyers run out.

Which is to say, belief in a direction can make the direction happen, but belief in a particularly shaped curve won't make the market resemble that curve.

"A research scientist and senior level employee who worked out of his Pennsylvania home […] His noncompetition agreement prevented Magerman from working for one year after leaving Renaissance for any firm engaged in the business of mathematically-based trading of futures and securities." [1]

Not quite lifetime, unless they expected him to drop dead within a year.

[1] https://www.forbes.com/sites/nathanvardi/2017/05/08/inside-t...

Magerman was an early employee. Not sure his agreement would be typical.

A lifetime noncompete? Really? I assume they pay you for life for this noncompete.

My understanding is that legally, the contract says something reasonable, like a year. But the firm will use its considerable resources to impede your career if you attempt to work in mathematical trading after that year's up.

On the flip side, employees are compensated very well, and the turnover is very low.

I'd assume turnover is low if, by leaving, they have to change entire career paths. I'd put up with a lot of shit to not have to learn a brand-new trade.

They make SO much money, though, and get access to the special employees-only fund, so I assume that employees that join are interested in staying.

They don't. They're all qualified for machine learning or data scientist roles in any area of tech. That's a huge field.

Most of them are also qualified for general software engineering roles; some of the people with more theoretical backgrounds might not be.

If that's true, it's only a matter of time until they get sued and lose a large class action suit.

Well, being an employee gives you exclusive access to the best fund in existence by far (35% return every year for decades). Nobody is going to want to kill their golden goose.

Extreme secrecy plus consistently very high returns sounds like the hallmark of a Ponzi scheme.

The fund contains only employee money. That's a very clever Ponzi scheme if it is one.

Renaissance is one large quantitative trading firm, but there are other bigger and more successful ones. I happen to work for another very large and successful quant trading firm.

Bigger sure, but can you point to one that has had better performance than the medallion fund?

Most of them aren't hedge funds; they're private (HFT, electronic trading, boutique firms, proprietary trading are all terms you'll hear), so they don't have public data. It is literally impossible to prove without insider information.

However the Virtu Financials, the Citadels, etc, will always exist, and will be doing exceptionally well whether people realize it or not.

Yeah, a better name for these kinds of funds is "independent research groups." Edgestream is an example of another. Renaissance is the most (in)famous one, but many others exist which profitably manage tens to low hundreds of billions while employing only 10 - 100 people.

I realize this doesn't actually matter and I feel weird defending Renaissance... but Virtu is a public company and hasn't been doing that well, and returns from Citadel's various funds are not hard to find. Also, I don't know how to compare a fund's returns to that of a private business. But Medallion has been around since before HFT was really a thing, and I'm not aware of any HFT places that have been growing at 70% a year for 20 years.

These are wildly different businesses. Virtu is a global electronic market maker, which in reality is more like a mature technology company than a risk-taking hedge fund. Renaissance (Medallion specifically) is a weird fund that manages employee money. Citadel is a large, diversified financial services company that happens to have overlap with both of the previously mentioned firms: they have a large market-making operation and offer a variety of different hedge funds that they market.

It's not unheard of for market makers to have returns well above 100%, and once you get into certain latency-arbitrage strategies, the returns can grow significantly.

The issue is that these businesses are capital limited, so you can't reinvest your massive returns to compound them.

Which company would that be?



> Renaissance Technologies is so secretive and subjects its employees to a lifetime NDA/non-compete.

Medallion fund is profit maker but their strategy is obviously very size limited. Their strategy does not scale.

Betting syndicates can make similarly massive ROI, but they are very growth limited.

What is a betting syndicate?

There is tons of opportunities in smaller markets that RT or others have no interest in, however.

How does a lifetime non-compete work?

They kill you.

Well, if it's in the contract..

I'm shocked that would be even close to enforceable.

I don't know anything about the finance industry, but it seems to me that there are a number of industries where the non-compete term is irrelevant, as you can still become persona non grata. For example Companies B, C, and D know that if they hire someone from Company A, they will get sued. The merits are irrelevant if Company A can outspend them relative to the value of the prospective employee.

Me too, I suppose, if you were paid for life to not work in the field. And it would probably not be enforceable anyway.

Don't worry, it's not. Even in New York, it has to be limited in time and geographic region. Good luck getting more than a year.

All these LSTM stock market tutorials seem to be a variation of this guy's tutorial, which he did a while ago:


Except Jakob explains the folly of the method.

Whereas the author of this piece says the following:

From the plot we can see that the real stock price went up while our model also predicted that the price of the stock will go up. This clearly shows how powerful LSTMs are for analyzing time series and sequential data.

I made this code https://github.com/mouradmourafiq/tensorflow-lstm-regression at least 6 months earlier than the first push in the github repo you mentioned, and I am sure many people did some version before that. A lot of people sometimes just need a reason to play with some technology. I assume that OP used stock market to learn about LSTMs.

The guy I posted uses Keras, which is what this guy's code looks like.

I am sure plenty of people have done lstm on the stock market but the form of the code looked similar between the two.

>From the plot we can see that the real stock price went up while our model also predicted that the price of the stock will go up. This clearly shows how powerful LSTMs are for analyzing time series and sequential data.

Yes I've noticed on HN recently ML and data science have become popular topics but I'm surprised a post like this has so many votes.

The problem with these types of systems is that it's difficult to do backtesting on a single continuous data stream... I write about that here:


I recommend doing something similar to the original post. Neural networks tend to easily produce valid-looking output for a stock price; so does a random walk. You have to find correlations, causations, and then review the results carefully.

For reference, I wrote my own financial advisor (not directly utilizing deep learning, that functions relatively well):


It works relatively well when checking the "causality", which also has its limits: https://blog.projectpiglet.com/2018/01/causality-in-cryptoma...

EDIT: Added some evidence:

* 2016 stocks: https://imgur.com/a/j8YWR

* Early 2018 crypto: https://blog.projectpiglet.com/2018/01/30-weekly-returns-usi...

* Early 2018 stocks: https://twitter.com/AustinGWalters/status/976347632439209985

* All 2018 on Robinhood only: https://imgur.com/a/2CxEFqI

* Also (I am lettergram on HN), created: https://hnprofile.com and https://redditprofile.com

Interesting, but I am a non-believer. Can you provide figures that show gains through the years?

Added in an edit update - I have more, but just tossed some together for you from 2016 - 2018.

> I wrote my own financial advisor (not directly utilizing deep learning, that functions relatively well)

I do not see enough data to backup the claims of how well it works. And if you can't provide that - what good is it as a tool?

This type of scam has been done many times for years...

I preliminarily did the same thing as in this article a while back (with crypto data, since it's easily accessible), just as a way to learn keras.

Somehow my network always learned to output a delay of the input, no matter how hard I tried to shape it. I've searched briefly through the literature and some examples. Some blog posts even claimed they had good time-series prediction when I clearly saw they were having the same problem.

What I mean to say is, be critical of what you find online, and be critical of your results. This is definitely not an easy problem, and some argue that it's unsolvable altogether :)

You are not alone. With time series, that is the easiest solution for any model. That is why, with time series, you need to put a lot of work into creating your datasets and experiment setup: to avoid the easy solution and obtain something useful.

Nobody's gonna do much better than lagging the market price using a model based purely on historical price data, markets aren't nearly that predictable.

Do not minmax scale the data: new data outside ranges seen before will be chopped off. Do make the data relative (high/low), so patterns you find generalize to other domains. Don't fit complex LSTM models on tiny datasets; somehow this is the shiny thing all starters jump on, while the simplest models are the most proven and robust. Don't just share such a model without giving a disclaimer ("I never tried this out on real data, and would not invest my own money in this solution"). This is real money, real economies, and you could horribly crash these. That is also why you don't write with authority about how to build a bridge (when you've never crossed a bridge you've built before). The only reason for the "clearly showing how powerful LSTM models are" is the model responding to test set samples it got wrong and correcting with a lag. How about some evaluation, preferably on another stock than the one trained on?
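The "make it relative" advice above is a one-liner in practice (a minimal sketch; the prices are invented):

```python
import numpy as np

prices = np.array([100.0, 102.0, 99.96, 104.0])

# Percentage returns are scale-free: the same pattern at $10 or $1000
# yields the same feature values, so it can generalize across assets.
returns = prices[1:] / prices[:-1] - 1.0
# First return: 102 / 100 - 1, i.e. roughly a 2% move
```

Feeding `returns` (or their logs) to a model, instead of `prices`, avoids both the out-of-range problem of min-max scaling and the dollar-vs-percent paradox discussed elsewhere in this thread.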

>This is real money, real economies, and you could horribly crash these. That is also why you don't write with authority about how to build a bridge ...

Setting aside that first bit of hyperbole, I don't see the danger in the bridge example. I mean, if engineers are building bridges based on information they've gleaned from internet articles, there are larger issues at play.

As far as markets, retail traders basically exist and have existed within a massive bubble of misinformation since forever. There's a reason order flow is so valuable.

ML is dangerously overhyped (that is an issue at play). This article uses min-max scaling, so it will treat a $30 stock as if it were $20. You want someone like that managing your pension fund?

I refer to engineering code because it is clear to me we don't want our bridges built by those not skilled enough to make safe bridges. Yet it is not so clear with ML/AI, while the potential for damage may be even greater.


These types of experiments pop up from time to time. You have to compare the performance relative to the single-lag error. So the baseline is to use the previous time step to predict the current time step. Ultimately, it's not worthwhile to attempt to predict a stock time series; the SNR is far too low. It's more valuable to attempt to predict a trend, e.g., +1 if there will be upward movement in time window x, -1 if downward, 0 if within some epsilon of the current price.
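The single-lag (persistence) baseline is trivial to compute (a minimal sketch on a synthetic random walk; any model's held-out error should be compared against this number):

```python
import numpy as np

rng = np.random.default_rng(1)
prices = 100 + np.cumsum(rng.normal(0, 1, 500))  # synthetic series

# Persistence baseline: predict that the next price equals the current one.
actual = prices[1:]
naive_pred = prices[:-1]
naive_mae = np.mean(np.abs(actual - naive_pred))

# A forecasting model is only interesting if it beats this on unseen data.
print(f"persistence MAE: {naive_mae:.3f}")
```

On a pure random walk nothing can beat this baseline in expectation, which is exactly why a plot where the "prediction" hugs the price is not evidence of skill.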

11:15, restate my assumptions:

1. Mathematics is the language of nature.
2. Everything around us can be represented and understood through numbers.
3. If you graph these numbers, patterns emerge.

Therefore: There are patterns everywhere in nature.

> Everything around us can be represented and understood through numbers.

One of the biggest problems facing someone trying to assemble a numerical model of some thing or process is whether or not their model is adequate to describe the phenomena they are observing.

This is why psychology remains a statistically modeled science, largely relegated to anecdotal research.

(Note that my intent is not to speak ill of psychology and its cohort, only to illustrate that just because you can slap numbers on a thing doesn't mean you've described it well.

Look at the transition of understanding from Newtonian mechanics to relativity and quantum mechanics. Newtonian mechanics was enough until someone saw Mercury was doing weird things it shouldn't.

And before Newtonian mechanics we had a broken description of the solar system with more exceptions than rules. We were throwing numbers at a system and failing, in some part, to describe it with consistency.)

Well said. Having to add increasingly many edge cases to your model is a sign your model is insufficient.

12:50 Press Return

(For anyone who didn't get the reference, you have an excellent movie & coding soundtrack to catch up on: https://www.youtube.com/watch?v=ShdmErv5jvs)

I haven't thought about this in years. Love that soundtrack. Thanks!

Now, where did I put that drill?

This doesn't take away from the exercise, but I notice the model seems to "predict" the movement just after it occurs.

It just lags the current price. If there is no alpha in the price signal, then the rational thing to do is to just stay at the current price, which is what the algo seems to have learned to do.

Pretty underwhelming. From the very limited experiment demonstrated, the LSTM seems to be playing catch up rather than actually predicting trends.

I noticed this as well.

Many data science thought pieces don't quantify the results, and it's suspicious that they're not included in detail (typically, the R^2 of a simple stock market prediction model is super low, making it impractical to put actual money on the line, as it's barely better than guessing randomly: https://twitter.com/minimaxir/status/1021885939361042432 )

> barely better than guessing randomly

In the stock market something that is consistently better than guessing randomly is very valuable! You don't need much of an edge to make a lot of money.

(I'm very skeptical about approaches like the OP though)

You say that because the prediction line lags the price line?

That’s not playing catch up, if the prediction was made at time 0.

It doesn't specifically say as far as I can see, but I'm highly doubtful that the prediction was made at time 0 (i.e. with no seed data). That would literally be a lucky guess.

You can use whatever prediction technique you like, but if your model is wrong, then so will the prediction be. Predicting stock prices requires considering as many factors as you can gather that go into setting the stock price, and how those factors correlate with each other.

> previous price of a stock is crucial in predicting its future price

This is a poor and incorrect model. The stock price depends on plenty of other variables, many of which are unknown. What's used here is just a single variable.

Signal/noise on the market is extremely low, and determining whether you’re predicting something or just data snooping your way to a function that looks like it predicts something is a serious competitive advantage in itself (... because all you need then is novel functions to try).

Working out a function that passes a split test is inevitable and easy tbh. That function then making money is highly unlikely. This is especially true if your data sources are public.

My favourite part of this is that Francois Chollet (the author of Keras) specifically warns against this exact use of LSTMs in his book (Deep Learning with Python)

This is complete rubbish. Zero evaluation compared to any sane baseline, zero thinking about representation (predicting raw price?!). I'll just say it, ML was frankly better when fewer people knew it and there was less money in claiming to be an expert.

Again, as always, and every time -- if a stock market price-predicting scheme worked, it would not be published on the internet for free. It would not be for sale. It would be in constant use to make money for the authors.

The author uses the stock market as a toy example. He is not actually attempting to do serious forecasting.

Only comparing the prediction accuracy against simpler models like ARIMA or VARMAX (as the author suggests) can tell you whether the model has any use.

It seems like if you do anything with financial data like this, people often aren’t capable of accepting that it’s just a toy example — they judge it as if it must be intended to really make money. I don’t know why.

I saw some guys present who did this with currency trading. They had to retrain the model every night and it didn't trade for the first four hours after the market opened.

Total noob question in this space (algotrading with DL algorithms).

What DL is good at is automating tasks that are easy for humans (e.g. telling a cat from a dog, understanding a sentence from a sound wave, translation, etc.) but hard for machines without DL.

Now, if a task is hard or borderline impossible even for humans to achieve (e.g. predicting the stock market with high accuracy and consistency), why would we believe DL could do a better job than humans?

DL isn’t just good at automating human tasks (see AlphaZero and AlphaGo).

It can be used to recognize patterns and train to solve problems better than humans. Though in this case since the market is people making predictions about a prediction I’m not sure how much it’d help.

There’s probably something to humans using a rough estimate of what the stock has done historically to inform where it could go, though, so its value is probably non-zero.

By DL, I meant generic DL techniques like CNNs and LSTMs (used in this article), which rely on large amounts of labeled data to train and then predict on similar data, thus my comment about automation. AlphaGo (Zero) is very specialized for the game of Go; not sure how much of its specialized algorithm could transfer to other generic use cases.

>Alpha Go (Zero) is very specialized for the game of go, not sure how much of its specialized algorithm could transfer to other generic use cases.

AlphaZero is a generalized successor, and it does just that:


Good to know it generalized, but it seems it's only generalized to board-game-like problems where you have a problem space to search through. Is that understanding correct? If so, it probably won't help in the use cases we are talking about here, right?

Yeah it's not an AGI, but I think the response was more that DL isn't necessarily limited to automating tasks that are easy for humans but hard for machines without DL (which I think was your original statement).

DL could potentially do a better job by recognizing patterns in the dataset that lead towards winning (making more money) that humans might miss. Like how AlphaZero can recognize moves in Go or Chess that humans don't understand are the best moves to make.

It's not obvious to me how this would be done, but I think it's plausible that some clever implementation could help.

I think relating DL with human skills is confusing you. DL already beats humans in recognizing dog breeds. DL could always do a better job, because it looks at more patterns than humanly possible.

The reason I relate human skill with DL is that for tasks where there are real patterns that humans can consistently recognize, DL is applicable, because it relies on human-labeled inputs. For patterns even humans can't consistently recognize (or where there may be no patterns at all), there's no way for DL to be better than humans, simply because the inputs from humans can't be trusted. DL does a better job at e.g. recognizing dog breeds or recognizing cancers because those jobs are possible for humans to begin with. The only difference is that DL can work on more human-labeled data than any real human can, so it can recognize more patterns. But if that human-labeled data and those patterns don't exist in the first place, why would we believe DL could do a better job?

I think you are still confused. Replace DL with software.

Because there's no requirement that deep learning must only excel at tasks humans are good at.

That said, the pitfalls of using ML in trading systems are numerous.

I didn't mean it's a requirement, but a fact: whatever DL can do is based on its input (labeled data), and that input has to come from humans, or the underlying task has to be possible for humans.

These predictions and models work until they don't. And when they don't, they blow up in spectacular fashion.

Look at Long Term Capital Management. It had big names behind it and the model/strategy made a lot of money, until I believe the Russian bond crisis caused the markets to act irrationally. Then everything they had invested in quickly went down the toilet, and it was big enough to almost cause a financial crisis.

I wonder if someone could comment on how AI is used at big investment banks? I assume it's much more than technical analysis (like this model). I imagine that NLP might be helpful for quickly ingesting a news feed and then making buy/sell decisions based on the news in less time than humans can read and react?

As you mentioned, one way is to quickly ingest earnings announcements from press wires/company websites. As soon as the earnings report hits the wire, they want to immediately know how big a miss/beat it was, and whether they should immediately dump it or buy more.

Other examples include analyzing the words/sentiments in earning calls, to see if certain words indicate a bearish or bullish signal. For example if a CEO mentions the word "headwinds", does that historically lead to a much better quarter next time?

Some other companies do analysis with public/private data. For example, you can use a weather dataset to calculate the average temperature for all cities where there are restaurants owned by Cheesecake Factory in a quarter, and see if it will negatively or positively impact sales.

I don't think you need AI for that, though. There are news feeds that are specifically designed to be machine readable. All you need to do is parse it and compare the actual numbers to the expected ones, then calculate the delta for your pricing model. All of this is pretty simple. The real challenge is being faster with this than the competition.

You're absolutely right. None of this is AI. Wall Street is notorious for making stuff sound more impressive than it really is.

I don’t know what’s more embarrassing. This article, or the fact that it’s #1 on HN right now.

I cannot fault your enthusiam but I can fault your method :)

Read Tsay's Financial Time Series Analysis for a better idea how to forecast financial data.

The biggest problem you will face is non-stationarity: the statistical properties of the data are not constant over time. For example, the mean and std dev are not constant over time. Using returns instead of raw prices helps to make better financial forecasts.

Two methods to explore:

1. You are better off predicting future returns; the price forecast is then the current price plus the predicted future return.

2. You could use your neural model to predict the absolute size of returns using realized volatility.
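A minimal sketch of method 1 in code, assuming a numpy array of prices; the price series and the "predicted" return here are invented placeholders, not output of any real model:

```python
import numpy as np

# Invented price series for illustration.
prices = np.array([100.0, 102.0, 101.0, 104.0])

# Work in log returns, which are closer to stationary than raw prices.
log_returns = np.diff(np.log(prices))

# Suppose some model predicts the next log return (value invented here).
predicted_return = 0.01

# Reconstruct the price forecast from the last observed price.
price_forecast = prices[-1] * np.exp(predicted_return)
print(price_forecast)
```

The model only ever sees (approximately) stationary returns; the conversion back to a price level happens after the forecast.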

Is this graph at the end a one-step-ahead prediction? Because if it is, then it looks very much like a 50/50 success/failure ratio. Which is what I would expect, because otherwise you could extract a signal from it.

I'm skeptical of any system that proves it can predict stock prices vaguely.

I would be less skeptical if it was more humble and specific in its constraints. For example, could it predict the % probability that a stock would go up X% if it had N consecutive negative days? Or, how many minutes/hours after earnings are announced is the after-hours price usually indicative of what the price will be the next day?

You can't predict everything, but you might be able to have a probabilistic prediction of specific scenarios. though even then I'll still be skeptical.. daily stock movements are totally random.

No, this is a bad example of prediction; if you bet money on the predicted direction, you'll lose a lot. Drawing a prediction line that merely looks similar to the price line is not correct prediction. At the very least you should show >50% correct direction prediction before this can be considered "well-processed" machine learning.
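Checking directional accuracy is only a few lines; a sketch with invented arrays standing in for real series and model output:

```python
import numpy as np

# Invented actual prices and model predictions, aligned by time step.
actual = np.array([100.0, 101.0, 100.5, 102.0, 101.0])
predicted = np.array([100.0, 100.5, 101.0, 101.5, 102.0])

# Compare the sign of each actual move to the sign of the predicted move.
actual_dir = np.sign(np.diff(actual))
pred_dir = np.sign(np.diff(predicted))

directional_accuracy = np.mean(actual_dir == pred_dir)
print(directional_accuracy)
```

A model that hugs the price line can score a great MSE and still sit at coin-flip directional accuracy, which is exactly the failure mode here.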

From my experience, LSTM or other recurrent neural network models only work "well" at forecasting bounded and periodic or oscillating time series. Might work for something like seasonal sale data, but would fail spectacularly with unbounded and chaotic time series like stock prices.

I used to run a technical analysis site for exchange traded funds (ETFs).

Skipping to the "results" of this experiment, I am seeing nothing. It looks like a slow, short period moving average (SMA)... which does nothing in terms of predicting anything. It just averages out the chaos over time.

> From the plot we can see that the real stock price went up while our model also predicted that the price of the stock will go up. This clearly shows how powerful LSTMs are for analyzing time series and sequential data.

Is laughter or tears the right reaction here?

Isn't the consensus these days that every bit of information is squeezed out, so that the stock price always reflects everything that is predictable?

In addition, major factors are external, and trying to predict future prices based only on past prices is bound to disappoint.

Oblig XKCD https://xkcd.com/1570/

Seriously tho', articles like this need to come with a big banner at the top saying warning: do not try this at home.

I know nothing about ML and not very much about finance, but I find it incredibly unlikely that some random guy on Medium has an algorithm that can be used to accurately predict the market. Why isn't he hilariously rich?

I believe a simple moving average would achieve the same result and that mixing training and test data for validation invalidates your model results...

On a positive note, nice showcase of how simple implementation with Keras is.
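For anyone who wants to make that comparison, a simple moving average baseline is a one-liner with numpy (window length and prices invented):

```python
import numpy as np

# Invented price series for illustration.
prices = np.array([10.0, 11.0, 12.0, 11.0, 10.0, 11.0, 12.0, 13.0])
window = 3

# 3-period simple moving average via convolution with a uniform kernel.
sma = np.convolve(prices, np.ones(window) / window, mode="valid")
print(sma)
```

Plot this next to the LSTM's output: both smooth the series and lag the turns, which is the point of the comparison.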

Seriously, how on earth is this the news #1 of the frontpage of HN? Isn't this community supposed to know better?

Before clicking I thought "I bet they do stacked LSTM to predict the _next_ time step" :|

Let's take it easy on the author... LSTM's are cool especially when you first grok them, and stock data is the most abundant time series data out there.

It's the webdev equivalent of creating a log-in form tutorial and putting the password in a JavaScript variable.

Think the important thing is model performance and tuning hyperparameters, which is missing from this tutorial.

Without that you are just regurgitating some model that does something.

Simple linear regression would work just fine for something rudimentary like this. This trend of using neural nets for everything lately is getting ridiculous.

What about training an NN for linear regression?

Regardless of whether or not this would make anyone money, it's a really nice introduction to forecasting time series using LSTMs. Thanks for the post!

Shame on me for looking. But this is an account that posts only fritz.ai links; it might be nonsense spam.

Well, I'm only on mobile. But the problem with stock prices is often that you don't have good data, unless you are working for the big four or whatever. And even then, they are apparently still working on integrating external data in real time.

Imagine having the whole Twitter stream, maybe categorized by some network in an automatic way.

Also prequential evaluation.

Feed it 3 cycles of a perfect sine wave. What does it predict?

It's tough to make predictions, especially about the future.

-- Yogi Berra

Have heard this quote attributed to Einstein.

Wonder how effective those models will become when everyone starts relying on those models to make trades.

The vast, vast majority of trading is fully automated algorithmic already.

Of which almost all is HFT. And by definition HFT has to be fully automated. I'm not sure if there's any data on the kind of algorithmic trading that the article talks about.

Why on earth is there supposed to be even a correlation between the so-called past and future (bullshitting as a business aside)?

It is like predicting Trump from Nixon or, even better, predicting the next Trump.

Predicting stock prices with deep learning is the "Killing Baby Hitler" of Time Travel. Even though they know it won't work out, everyone tries it their first time.

We'll know if it works when he gets to be a trillionaire.
