Hacker News
Building AI Trading Systems (dennybritz.com)
215 points by dennybritz 6 days ago | 103 comments

I tried doing some forecasting with various neural network models after assembling what I thought was a good amount of forex data. The neural net (I tried various architectures) couldn't do any better than chance. After playing around with it and trying to double-check everything, that was as far as I could get. This puts me ahead of most traders, since most of them lose money, then quit.

This makes me wonder what kind of trading systems can actually have any kind of edge, since some kind of autoregressive time series forecasting system seems pretty unreliable.

On a more general note, how do you move beyond it being gambling? Just because a system backtests well doesn't mean a phenomenon will continue to happen, especially if your system will significantly impact the market you're in. If you make a trend-following system, every time you trade, you're gambling that the trend is more likely to continue than not. If you're right, you'll come out ahead over many trades. If you don't have enough capital to withstand drawdown the way most beginners don't, you won't be able to last long enough for whatever phenomenon you've found to average out.
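To make the drawdown point concrete, here is a minimal sketch of how maximum drawdown could be measured on a backtest's equity curve. The function name `max_drawdown` and the sample numbers are illustrative assumptions, not anything from the article.

```python
def max_drawdown(equity_curve):
    """Largest peak-to-trough loss, as a fraction of the running peak."""
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)                    # highest equity seen so far
        worst = max(worst, (peak - value) / peak)  # loss relative to that peak
    return worst

# Hypothetical account values over time.
curve = [100, 120, 90, 130]
print(max_drawdown(curve))  # fraction of equity lost in the worst stretch
```

If the worst historical drawdown is larger than what your capital (or nerves) can absorb, the strategy may be unplayable for you even if its long-run average is positive.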

It takes a lot of time, effort and risk to do all this, so, this is a long-winded way of saying I don't think it's for me. If you build a SaaS product and it fails, at least you can talk about what you learned from building it and use that in future endeavors. If you lose money trading because your algorithm doesn't work, what do you learn from that besides that your algorithm doesn't work?

The most popular kind of quant trading is using a factor model. The first step is developing some alpha factor, a number that is predictive of how much money you'll make from each stock. So let's say my alpha factor is "companies with good earnings per share will go up." So I first take the EPS for all the stocks in my universe and maybe rank, then zscore. Now I have some positive numbers and some negative numbers. These represent the weights of my portfolio. The positive weighted companies I go long, the negative ones I go short. The bigger the number, the larger my allocations.

Now that I have my alpha factor I backtest it and whatever. Since the mean of a zscore is zero, I know I'm market neutral, so (ignoring some stuff) my factor should have little exposure to the market.
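A minimal sketch of the rank-then-zscore step, assuming a made-up universe of five tickers and invented EPS values. The helper name `rank_zscore_weights` is hypothetical; a real pipeline would also handle ties, missing data and scaling to a capital budget.

```python
from statistics import mean, pstdev

def rank_zscore_weights(eps_by_ticker):
    """Rank the raw factor values, then z-score the ranks into signed weights."""
    # Rank from lowest (1) to highest (N); ties are broken by sort order here.
    ordered = sorted(eps_by_ticker, key=eps_by_ticker.get)
    ranks = {ticker: position + 1 for position, ticker in enumerate(ordered)}
    mu = mean(ranks.values())
    sigma = pstdev(ranks.values())
    return {ticker: (rank - mu) / sigma for ticker, rank in ranks.items()}

# Invented EPS figures for a toy universe.
eps = {"AAA": 5.2, "BBB": 1.1, "CCC": -0.4, "DDD": 3.0, "EEE": 0.7}
weights = rank_zscore_weights(eps)
for ticker, weight in sorted(weights.items(), key=lambda kv: kv[1]):
    side = "long" if weight > 0 else "short" if weight < 0 else "flat"
    print(f"{ticker}: {weight:+.3f} ({side})")
```

Because the z-scores sum to zero, the long and short sides offset in dollar terms. That is only dollar neutrality; beta neutrality would need an extra adjustment.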

If I think it's good, I add it to my other alpha factors and combine them somehow. Could be as simple as adding them all up, or maybe something like using random forests to figure out the best way to combine them, or whatever. Now that I have a bunch of alpha factors all combined, I can run them through the optimization engine.

The optimization engine will adjust the weights of my "ideal" portfolio in order to reduce exposure to various risk factors (thus lowering volatility). My optimizer will also figure out how often I need to rebalance. There's generally a bunch of terms in there that try to reduce trading costs and zero out exposure while not diluting the "ideal" portfolio too much (or else the alpha could be wiped out).

Now, after all of this, I'm ready to trade.

In short, what we're trying to do is reduce our exposure to as many factors as possible and just get exposure to our alpha factor. We don't want the market, price of oil, sex scandal of a CEO, or anything else affecting our portfolio. We are trying to dig up this latent, unearthed, alpha that exists in the market, but doesn't belong to one company or asset.

Taking EPS as proxy for alpha is like trying to recognise banana pictures by looking at average yellow content: it is plausible, but roughly 100 years behind the other market participants. Don't do this kind of stuff with your hard earned money...

He was using that as a simple example.

Yes, I just wanted to make a point: things that have names, that have a paper written about them, that have Wikipedia pages or even Nobel prizes attached to them are in the same category. The market priced them in decades ago.

Thinking you can read Fama papers to take on quant funds, like smabie is claiming at several places in this thread, is like reading Commodore manuals to take on AlphaGo.

I feel like I've read something recently about how there's evidence people don't actually read SEC filings.

It's kind of like the theory that open source doesn't have serious bugs because so many people read it.

You know the saying that the market can remain irrational longer than you can remain solvent? If some people don't care about doing simple analysis, and others assume that someone else is doing it, and still others accept that it's useless to do it if other people aren't, it seems like a none-too-efficient market can be a stable equilibrium.

Similarly, the book The Big Short (Michael Lewis) notes that hedge fund manager Michael Burry read hundreds of prospectuses for mortgage bonds in the years leading up to 2008 and was "certain even then [in 2005] (and dead certain later) that he was the only human being on earth who read them, apart from the lawyers who drafted them." He ended up shorting these and profiting hundreds of millions of dollars.

From my analysis, there's around a 3.9% abnormal return associated with a L/S beta neutral low beta strategy (long low beta, short high beta). Its Sharpe ratio is ~1, though. 3.9% is pretty significant, especially since the beta correlation is less than 2%.

There's a reason why these factors are called "persistent." For systemic reasons, it is hard to arbitrage them away, mostly due to laws, and sometimes tax implications.

That's the ideal described in quantopian tutorials, but I doubt it often works out that way.

From personal experience, it really does actually work that way. Not all quant firms are running traditional market neutral factor portfolios though.

How much money have you personally made with this approach?

Market neutral strategies really only work with significant access to leverage and favorable financing. Retail investors such as myself are unable to get the kind of juice necessary to run a L/S market neutral strat.

Of course, I suppose it would be possible if you discovered some amazing alpha factor. But if you did, you probably would be better just trying to get investors.

So in short, the answer is $0. For my personal portfolio, I run (only started recently) a variable leveraged beta strategy that can be described in my three part series:




Thanks, looks interesting.

Really? So you're saying, not all $10tn of quant funds in the US market are managed in the way you just described?

Most people seem to think indexing is a boring cop-out, but imo it’s the place of humility you get to after you’ve dashed yourself against the rocks of trying to outperform for a few years - or decades - and then realising the whole endeavour is insanity.

It’s accepting that you’ll receive what the market gives you and not a dollar more, and that that’s the best you’re ever going to get.

+1 for indexes. This is the route most successful traders seem to take - alternating between indexes and bonds. You won't see Warren Buffett buying stocks on Robinhood.

You don't see Warren Buffett buying indices either.

If you aren't willing to put in the work, sure just hold an all weather portfolio or whatever. But I don't think it's super hard to beat the market on a risk-adjusted basis, especially as a retail investor. Most of all, it takes a passion for it (trading is a hobby) and also a reasonable amount of time dedication.

For example, there are a number of persistent factors that don't go away. Things like value, low-beta, size factors, etc. It's not super hard to leverage the tons and tons of papers that have been published to construct a portfolio that does really well. And, due to structural reasons, will do well until the laws change (almost all persistent factors exist due to some law).

Especially in the current market environment, it's really easy to make a killing. The market is moving purely on sentiment and day traders have had some really great opportunities to make stacks in the past couple of months.

Friend, if you are up on Hacker News making comments like this, I can guarantee with near 100% certainty that you are not capable of sustained outperformance in the markets. Whether you realise it now, later or never is no skin off my nose, but sooner would be cheaper.

Comments like this are why these "beat-the-market" threads are evergreen on HN. They do a disservice to the community.

I'm really glad these sort of comments were made around 2013 on this community and I started trading cryptocurrency. His comments make sense to me, and I can guarantee you with near 100% certainty there is another millionaire trader reading this thread.

You didn't need to trade to make millions in cryptocurrency if you started in 2013. Just buy and hold.

You are as dumb as you are lucky and this comment is just the definition of survivorship bias. There are always some people who make money off of pyramid schemes (not necessarily saying that bitcoin is one) but that doesn't mean it was at any time at all a good idea to invest in one.

Real estate is another area that is very typical for bubbles, and when the bubble bursts the large majority of people who are overleveraged will be eaten alive by the big investors (much bigger than you) who make money off the poor in times of crisis like they always do.

Survivorship bias applies when the survivors are visible and the losers are not, but this thread is full of losers, projecting their loss on everyone that survived and deeming these dumb luck anomalies.

Since it was very hard to lose money with cryptocurrency even if you tried, you had years where it was a good idea to invest. Who should you listen to? The one that got rich with a better scheme than buy and hold? or the one who sat on the side lines for years, made zero profit, and now holds a grudge at missing such a good investment?

Poor people should not go into real estate. There is enough for everyone. If you'd put some of your wealth into Euro, in the past months alone, you'd have made enough to buy a small apartment when the bubble bursts.

The low beta anomaly has persisted for as long as we have data. If you consider the S&P 500 the market, then yes, it's not difficult to beat it by investing in low beta stocks and leveraging up so your beta is 1. This is a classic and time-tested strategy that will probably always outperform, on a risk-adjusted basis.
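The "leverage up so your beta is 1" step is simple arithmetic. A minimal sketch, assuming beta scales linearly with leverage and ignoring financing costs; the function name and the 0.6 beta are illustrative assumptions.

```python
def leverage_for_target_beta(portfolio_beta, target_beta=1.0):
    """Leverage multiple that scales a portfolio's beta to the target."""
    if portfolio_beta <= 0:
        raise ValueError("portfolio_beta must be positive")
    return target_beta / portfolio_beta

# Hypothetical low-beta stock portfolio with a beta of 0.6.
leverage = leverage_for_target_beta(0.6)
print(f"hold {leverage:.2f}x notional to reach a beta of 1")
```

Whether the levered portfolio actually outperforms depends on borrowing costs, which this sketch leaves out entirely.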

You're arguing on HN that you have a long-term strategy that can reliably beat the market. Such a strategy would be worth billions of dollars. You probably don't have such a strategy.

Good luck with that though.

There are a number of persistent factors. They are well known in academia and to financial practitioners. There are structural reasons for their existence. We are talking about value, fama french size, etc. These are classic factors that everyone knows about.



I think your attitude of dismissing decades of academic research by Nobel prize winners is a little myopic.

Also I don't think you understand what "risk-adjusted" means.

> Also I don't think you understand what "risk-adjusted" means.

You might be right; why don't you try explaining it for me? I suspect it is going to look a lot like a justification for why your strategy is fantastic even though it makes sub-par returns.

Hey, maybe I was a little rude. I'm sorry. The fundamentals of quant finance are a little too long to go into here, but if you're legitimately interested in having a good conversation, I can explain it to you. Email me at sturm@cryptm.org.

I can give you a complete rundown of persistent factors, leverage, risk-adjusted returns, etc.

You should take a beat and search the keywords, "trading capacity constraint."

That will lead you to the reasons why it is very possible to both beat the market and be incapable of scaling it up to billions of dollars.

tl;dr: Someone who has found a way to reliably arbitrage to 25% gains year over year on an inefficiency that will only fill about $100,000 is not going to be able to cash their system in for billions of dollars. But they will be reliably beating the market (on a risk-adjusted basis: assuming the system has risk equal to or less than holding e.g. the S&P).
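The capacity argument can be sketched in a few lines. This assumes, for illustration only, that capital beyond the capacity earns a fallback return (zero by default); the function name and parameters are hypothetical.

```python
def yearly_profit(capital, capacity, edge_return, fallback_return=0.0):
    """Profit when only `capacity` dollars can be deployed at the edge return."""
    deployed = min(capital, capacity)
    idle = capital - deployed  # money the inefficiency cannot absorb
    return deployed * edge_return + idle * fallback_return

# The figures from the comment above: a 25% edge that fills about $100,000.
for capital in (100_000, 1_000_000, 1_000_000_000):
    profit = yearly_profit(capital, capacity=100_000, edge_return=0.25)
    print(f"${capital:>13,}: profit ${profit:,.0f}, return {profit / capital:.4%}")
```

The dollar profit stays flat as capital grows, so the percentage return collapses, which is why such a system cannot simply be cashed in for billions.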

You will find that firms which can actually reliably beat the market do tend to make their founders wealthy. But they can't scale it beyond the capacity allowed for by the inefficiency. All successful hedge funds and prop shops eventually reach a point where they can't reinvest the returns for the same gain.

That kind of sounds too good to be true... This [0] seems to suggest that the low-beta anomaly is essentially an artifact of how risk is measured. Thoughts?

[0] https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2503174

Pics or it didn't happen! We should not fall below wallstreetbets standards on HN.

Warren buys presidents.

Warren Buffett doesn't buy stocks on Robinhood because they couldn't handle his volume. I saw him quoted somewhere not long ago saying that if he was working with a "small portfolio", a few hundred thousand dollars or maybe it was even a few million, that he was quite certain he could return in excess of 100% annually.

But yes, picking stocks is probably not as effective for most people as buying an index fund.

He said the bit about returning 100% in a very old Berkshire meeting (90s). He's recently commented that the market has become either too volatile or crowded to allow the same type of value investing that got him to where he is

He could probably use BH to manipulate the market so his mini-portfolio would have crazy returns. This is one of the theories behind the infamous Medallion fund.

I doubt he could without using BH or insider info.

You need more data to input besides just the price time-series. Successful human traders balance and synthesize a myriad of data sources to make decisions.

I depend on an in-depth understanding of human psychology as one of my data sources. You can't turn something like that into data and input to a model. It is something learned through life experience and study.

+1 Any trading strategy based only on price is a fool’s errand. The information that impacts price needs to be included in the trading strategy. A lot of short term price movement is news driven, thus unstructured text processing of news, social media, and relevant documents will be a key component of such trading strategies.

I am not as familiar with the forex market as with the equity market. But I expect forex to be impacted by political and economic news from the host countries of the relevant currency pairs. All of this needs to be coded into forex trading strategies.

If trading based on just price was so simple, everybody would be doing it successfully.

Price based arbitrage was very successful for Edward Thorp http://www.edwardothorp.com/books/a-man-for-all-markets/

No successful strategy ever has been based on price. Price isn't stationary so you can't do anything with it. You need to be looking at the log returns. Price is completely irrelevant, at least for equities.
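For reference, log returns are derived directly from the price series. A minimal sketch with made-up prices:

```python
import math

def log_returns(prices):
    """Log return between each pair of consecutive prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]

# Hypothetical closing prices.
prices = [100.0, 110.0, 99.0]
returns = log_returns(prices)
print(returns)

# Log returns add up over time: their sum equals the log of the total price ratio.
print(sum(returns), math.log(prices[-1] / prices[0]))
```

The additivity is one reason log returns are more convenient for statistical work than raw prices, though they are still computed from price and nothing else.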

Do you know how return is calculated?

Sorry, but that is wrong. All I use is price and time. See Elliott Wave Theory. Most indicators are perfectly correlated with price, meaning unless you are an HFT you can't trade fast enough to act on them.

Elliott wave theory is quackery. Price is not suitable for any statistical analysis.

It has changed and it is just observations of competing waves of pessimism and optimism and the patterns they demonstrate. I'll put a model trained on TQQQ and SQQQ price and time data only. I will bet you 5k mine will beat yours using whatever inputs you choose over any reasonable time you specify.

trading forex swaps as an individual is a disservice. it's not centralized at all, as a retail trader you get spreaded quotes from some banks that want to make markets and that is it really. many of the biggest fx brokers have been banned in the us over the years too for all sorts of awful things [1]. there is cross bank quoting and such but the whole thing is extremely unsuitable for retail traders, making it prime for low barrier to entry decimation of retail traders which is exactly what you'll see time and again.

if you really care to trade currency rates in a sane way, there are CME futures

[1] http://www.forexscamalerts.com/fxcm-permanently-banned-usa-f...

This is another unhelpful thing about trading: when you seek out information, everyone pipes in telling you that what you're doing is wrong, what they do is right, what you want to do can't be done without giving any explanation or elaboration, etc.

If your algorithm stops working or doesn't work... do you have the experience to know why? wavepruner is saying that models can't capture everything like intuition and experience which comes with time.

Personally over the past few decades investing in US equities, I have found arbitraging information found in Asian-language tech sites and real-world locations such as Shenzhen surprisingly profitable. My best example was a tip I garnered in 2011 from a Beijing KO employee brandishing her new iPhone 4. When I idly asked her if other employees in the huge KO (hundreds of staff) had iPhones, she offhandedly exclaimed 我们都有! ("we all have them"). That moment set me up for a very comfortable retirement.

> since some kind of autoregressive time series forecasting system seems pretty unreliable.

A few months ago I tried to evaluate autoregressive behavior in stock returns. To my surprise it seemed strong on some periods, but then weak on others [1], and as you said not reliable enough to rely on.

My impression is that a lot more information aggregation and processing is required to obtain a sustainable edge worth trading on than what a single developer can achieve in his/her spare time.

Top investment shops have dedicated teams of sw engineers just to deal with the infrastructure that supports their data pipelines, financial model backtesting and deployment.

[1] https://thomasvilhena.com/2020/01/likelihood-of-autoregressi...
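One simple way to see autoregressive behaviour come and go is to compute the lag-1 autocorrelation of returns over rolling windows. This is a minimal sketch with assumed function names and a toy series, not the method used in the linked post.

```python
from statistics import mean

def lag_autocorrelation(series, lag=1):
    """Sample autocorrelation of a series at the given lag."""
    mu = mean(series)
    numerator = sum((series[i] - mu) * (series[i - lag] - mu)
                    for i in range(lag, len(series)))
    denominator = sum((x - mu) ** 2 for x in series)
    return numerator / denominator

def rolling_autocorrelation(series, window, lag=1):
    """Autocorrelation recomputed on each consecutive window of the series."""
    return [lag_autocorrelation(series[start:start + window], lag)
            for start in range(len(series) - window + 1)]

# Toy returns: the first half alternates in sign, the second half is constant in sign.
toy_returns = [0.01, -0.01, 0.01, -0.01, 0.02, 0.01, 0.03, 0.02]
print(rolling_autocorrelation(toy_returns, window=4))
```

If the rolling values swing between strongly negative, near zero and positive, the dependence is not stable enough to trade on by itself, which matches the observation above.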

There must be some information content left in stock price time series data, as evidenced e.g. by the price Momentum factor [1], which has been replicated in a number of studies (e.g. [2]), observable over the last couple of decades.

[1] https://en.wikipedia.org/wiki/Momentum_investing

[2] https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2961979

You don't have to outperform to have a viable strategy. Harnessing small returns from multiple uncorrelated sources is a perfectly viable plan.
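The arithmetic behind combining uncorrelated sources: under the idealised assumption of N return streams with equal mean, equal volatility and zero correlation, an equal-weight mix keeps the mean but cuts volatility by the square root of N. The function name is a hypothetical helper.

```python
import math

def combined_sharpe(individual_sharpe, n_streams):
    """Sharpe ratio of an equal-weight mix of uncorrelated, identical streams."""
    # Mean return is unchanged; volatility falls by sqrt(n_streams).
    return individual_sharpe * math.sqrt(n_streams)

for n in (1, 4, 16):
    print(n, combined_sharpe(0.5, n))
```

Real strategies are never perfectly uncorrelated, so the actual improvement is smaller than this upper bound.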

In a literal market, you can find alpha in tomato trading by:

(1) monitoring other people’s tomato transactions in as much detail as possible;

(2) spying on people who are about to buy tomatoes and rapidly make changes behind the scenes just before they make a purchase; or

(3) pointing voice analysis at The Food Network looking out for recipes that call for fresh tomatoes, tracking tomato tankers in major sea lanes, monitoring storm tracks in the top tomato growing zones, etc, and adjusting your position appropriately.

It sounds like you are trying (1) when (3) might be better, or even (2) if you are not jail-averse (or your local jurisdiction has institutionalized high speed market fiddling to the point of being legal.)

Mind providing a source for what a 'literal market' means? Thanks.

As in a town square market, as seen in human culture for the last few thousand years.

For non-native speakers, market has a specific meaning in US English. In the US a “market” typically means a small shop selling the most common fruits, vegetables, snacks, and household items.

In Europe a market is a larger and more general area for autonomous small traders, usually also trading food and household goods.

The autonomy and low barrier to entry make them a good analogy to a stock market.

(They also suffer from availability and quality issues. A large business can arrange to purchase and retail all of these goods on behalf of consumers under one roof — a supermarket.)

Let's say I want to predict the crop yield of a field. Sure, looking at the yield in previous years would help. But the yield is just a nonlinear projection of a point in high-dimensional space that has dimensions like weather, water availability, pest infestations, farmer skill, etc. All of these dimensions are incredibly relevant to forecasting, but once we've projected our points onto the yield axis, most of this information is gone. So if you want to take advantage of this information, you need to do your fitting in the original high-dimensional space.

>>> This makes me wonder what kind of trading systems can actually have any kind of edge.

The secret is simply to have an edge.

If you're trading on behalf of clients, you don't care what happens to the market because you don't depend on the high or low to make money.

If you're buying or selling for yourself, same thing. Guess who's buying coal and oil, power plants and refineries and assimilated. They sell what they have and buy what they need.

If you're making money on arbitrage, you are making sure that New York and London quote consistent USD, GBP and EUR prices in every direction. You could make money, but you had better be faster than the other firms and more careful at the same time, because you're not the only one doing it. Any time you buy one side, the other side might have changed before you can balance out.
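A minimal sketch of the consistency check behind this kind of currency arbitrage, using invented quotes and ignoring spreads, fees and latency. The function name and rate conventions are assumptions.

```python
def triangular_round_trip(usd_per_gbp, eur_per_gbp, usd_per_eur):
    """Value of 1 USD after converting USD -> GBP -> EUR -> USD."""
    gbp = 1.0 / usd_per_gbp   # buy pounds with one dollar
    eur = gbp * eur_per_gbp   # convert the pounds to euros
    return eur * usd_per_eur  # convert the euros back to dollars

# Hypothetical quotes where the EUR/USD leg is out of line with the other two.
multiple = triangular_round_trip(usd_per_gbp=1.25, eur_per_gbp=1.15, usd_per_eur=1.10)
print(f"round trip returns {multiple:.4f} per dollar")
```

A result above 1 means the three quotes are inconsistent and the loop is profitable on paper. In practice the gap is usually smaller than transaction costs, and one leg can move before the others fill.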

There are clear factors that drive many markets. When the weather is cold, people consume more energy for heating. When it's hot, they go out to barbecue and buy more sausages. When a drought or crop disease wipes out farms, prices of food and meat go up. Those are some examples that are easy to understand.

The stock market is not about speculation. It's about buying real items in the real world and providing services.

You don't learn. There are two rules: you don't talk about what works and you don't talk about what doesn't work. I worked professionally in the quant space up to 2008, and I still get calls for interviews from people wanting to dig out what I am doing nowadays. What is popular or known loses alpha pretty quickly due to overcrowding.

There are lots of ways to produce an edge. Forex is slightly different because you are trading a currency (this actually makes things easier in some ways) but, a few years ago, a lot of the cutting edge was news releases.

So inflation comes out at X% and then you try to jump ahead of other people reacting to the news.

Speaking very generally, you are looking for data that has information about future returns. So this may include past values of the time series (this is kind of complex though because a stock price does trend, that company is investing capital to earn a return which compounds in the price so stationarity is...complex) but may include other time series/their past values i.e. price of other stocks, economic data, etc.

So this could be responding to changes in liquidity, it could be seeing some repeatable behaviour by investors and jumping ahead of it, etc.

Quant is not about adding to the efficiency of markets though. They aren't using these models to determine the value of something, they are more about looking at the value of other things to determine the value of a given asset. So these strategies end up being correlated to liquidity in a lot of instances (but not all). This is a generalisation but...it is a very odd thing to have occurring in society...would this exist if investors didn't have an irrational demand for microsecond liquidity? Probably not.

Also, determining whether something is a real signal is just part of statistics, isn't it? This has definitely been an area where there has been quite a lot of innovation as increases in computational power have made non-parametric stuff more feasible (I am not an expert on this, it is just my understanding).

Btw, I should add I used to work in finance and I have some experience with this kind of thing as I do quite a bit of "quant investing" but in gambling (it is far easier to just copy what people do in finance and apply it to gambling than come up with it yourself). And just based on my experience, it makes most sense to employ a mixed approach. So learn about business valuation, and then build a five-factor model...watch what it does, and then filter its picks with your knowledge. A lot of quant strategies are vaguely ludicrous if you have an understanding of the fundamentals of investing, like you are trying to use a computer to replicate a human...and people wonder why it doesn't work? It is an overcomplicated shortcut (to give you a concrete example, the blowup of value and funds like AQR was very obvious...you just had to look at the utter garbage stocks they owned). So I think a combination of human and computer beats either separately (one fund that does this is Marshall Wace).

Just a reminder: nobody ever wrote about their super successful trading strategy. It's just never happened. If you have the wherewithal to research and build a trading system that works, then you're smart enough to know that the moment you reveal your edge to the world - it disappears. Even if you don't discuss the innards of your strategy, but you talk about your process or the system your strategy is built on, you've revealed too much.

Yup. The only reason I am starting to write about it now is that I am no longer running the system. You could argue that it's not useful to write about systems that worked in the past, but I would disagree. New systems can work 99% the same way, but get an additional edge from somewhere else, like new data or better models. Most of the engineering will always be the same.

With the notable exception of Ed Thorp, who managed to write Beat the Market first, and then start a hedge fund to exploit the strategy 7 years later, and only when a reader proposed they go into business together.

Though it helped that the period was 1967 to 1974. The piranhas were a little slower back then

This is a simplified version of the truth. There is a lot of information that you can safely share because the number of people that will know where to look for it, know how to implement it, what to even do with it, how not to make any one of 100 possible stupid mistakes while implementing it - is very low.

Case in point: Warren Buffett. All of his process is public knowledge, he constantly writes and talks about it. And yet somehow it didn't make him lose his edge.

Not exactly. We know the broad strokes. He figures a price for a company by estimating their lifetime earnings and then factoring in the time value of money. He's talked about a mental checklist he goes through when looking at quarterly reports, but he never talks about the exact details of this process. Whether he's lost his edge, that's debatable too. The time since he's beaten the S&P is now entering into the long-term.
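The "broad strokes" amount to a discounted cash flow calculation. A minimal sketch, assuming a fixed discount rate and a finite list of forecast earnings; the numbers are invented and this is not a claim about Buffett's actual checklist.

```python
def present_value(cash_flows, discount_rate):
    """Sum of future cash flows, each discounted back to today."""
    return sum(cash / (1 + discount_rate) ** year
               for year, cash in enumerate(cash_flows, start=1))

# Hypothetical earnings forecast for the next five years, discounted at 8%.
forecast = [10.0, 11.0, 12.0, 13.0, 14.0]
print(round(present_value(forecast, 0.08), 2))
```

The formula is public; the edge, if any, is in how the earnings forecast and the discount rate are chosen.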

I agree except if your strategy is something everyone uses then it becomes a self-fulfilling prophecy of winning.

If that's the case, more often than not you won't be in the high-frequency realm. In other words, the "self-fulfilling" trades are usually short ideas that hedge funds highly publicize or maybe a penny stock someone wants to pump (illegally) - and everything in between.

Where higher frequency trades do become "self fulfilling" can be in intraday technical analysis. For example, there are many people that follow RSI signals in options. You'll see retail people trade this, and then market makers will step in and bring things back in line - because options don't trade on technicals...

First thing that jumped to mind was moneytron: they were predicting the market, and it turned out to be a fraud.

I find most of this article to be “successful people can’t explain why they are successful so they say a bunch of arbitrary things they’ve noticed”.

He found success pursuing relative advantages, infrastructure advantages, and building custom tools from scratch.

But absolute vs relative advantages, plumbing together canned solutions vs building your own from scratch, infrastructure-level advantages vs decision making advantages...all of those contrasts exist in other businesses everywhere. None of those are specific to trading.

> “in my experience, nothing beats learning by doing or finding a mentor”

This hits the nail on the head.

The best way to become a profitable trader is with a mentor, but it’s nearly entirely luck. You drive an Uber or tend bar and happen to make friends with someone successful who is willing to guide you. Trying to seek out a mentor online is nearly impossible, as everyone who is findable and willing is almost certainly a better marketer than trader.

The other way to become a profitable trader is to start trading with real money. It’s amazing how quickly one can learn how to mend a boat, when the boat starts sinking.

readers may also be interested in Benter's paper "Computer Based Horse Race Handicapping and Wagering Systems: A Report" -- https://www.gwern.net/docs/statistics/decision/1994-benter.p...

> This paper examines the elements necessary for a practical and successful computerized horse race handicapping and wagering system. Data requirements, handicapping model development, wagering strategy, and feasibility are addressed. A logit-based technique and a corresponding heuristic measure of improvement are described for combining a fundamental handicapping model with the public's implied probability estimates. The author reports significant positive results in five years of actual implementation of such a system. This result can be interpreted as evidence of inefficiency in pari-mutuel racetrack wagering. This paper aims to emphasize those aspects of computer handicapping which the author has found most important in practical application of such a system
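As a rough illustration of the logit-based combination the abstract mentions: each horse's log model probability and log public probability are weighted, exponentiated and renormalised. This is a sketch from my reading of the approach; the weights `alpha` and `beta` below are made-up values that would normally be fitted on past races.

```python
import math

def combine_probabilities(model_probs, public_probs, alpha, beta):
    """Blend model and public win probabilities with a log-linear (logit) mix."""
    scores = [math.exp(alpha * math.log(f) + beta * math.log(p))
              for f, p in zip(model_probs, public_probs)]
    total = sum(scores)
    return [score / total for score in scores]  # renormalise to sum to 1

# Hypothetical three-horse race.
model = [0.50, 0.30, 0.20]    # fundamental handicapping model
public = [0.40, 0.40, 0.20]   # probabilities implied by the betting pool
print(combine_probabilities(model, public, alpha=0.6, beta=0.8))
```

Setting `alpha=0, beta=1` reproduces the public odds, so the fitted weights show how much the model adds beyond what the crowd already knows.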

Arguably the paper describes the state of the art from three decades ago, applied to betting on Hong Kong horse races, not market price movements.

The parent comment is the most useful one in the thread so far for anyone who seriously wants to learn about quantitative trading.

Sports betting is essentially the same thing as proprietary trading in financial markets. The paper gives a good summary of a technique that was very successful in its day.

There is very little publicly available material on quantitative techniques that are useful for proprietary trading. Lo and MacKinlay's "A Non-Random Walk Down Wall Street" was good, but that's 20 years old.

The mathematical literature on gambling is a lot more accessible. It's also probably easier to consistently make at least small money gambling, because the barriers to entry are lower.

Yeah this is a great paper on the subject. Although horse betting is different than financial markets due to the parimutuel system.

Does that mean you have to find a greater edge to cover the house edge?

Writing AI trading systems has been the coding I do for fun since 2012. I'm a little under break-even so far, but I keep at it because I find it so interesting. Since I started, every single week I have learned a new way of thinking about a problem I encountered or a new approach to problems that still stand in my way.

Questions like, how do you choose a stoploss? Well, you can pick it statistically based on history or you can use a supervised label. You can even use the stoploss calculated for stock A to pick the stoploss you use on stock B because you found a condition under which those two stocks became almost identically correlated. How do you want to pick the supervised label? You can do spectral analysis to pick the stoploss too. You can use sentiment as a stoploss, sourced from google news or twitter or stocktwits.
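The simplest of those options, picking a stoploss statistically from history, could look like the sketch below: take a low quantile of past returns as the threshold. The function name, the tail fraction and the sample returns are all illustrative assumptions.

```python
def historical_stop_loss(returns, tail_fraction=0.05):
    """Return the historical return at the given lower-tail fraction."""
    ordered = sorted(returns)  # worst returns first
    index = min(int(tail_fraction * len(ordered)), len(ordered) - 1)
    return ordered[index]

# Hypothetical daily returns for one stock.
history = [0.01, -0.02, 0.03, -0.05, 0.00, 0.02, -0.01, 0.04, -0.03, 0.01]
stop = historical_stop_loss(history, tail_fraction=0.1)
print(f"exit if the position loses more than {abs(stop):.0%} in a day")
```

The other ideas in the comment (supervised labels, a correlated stock's stoploss, sentiment) would replace how the threshold is chosen, not how it is applied.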

It doesn't have to be, 'well I measured the average profitable stoploss to use over the last 10 years across all stocks and that isn't working so I quit'

Things like that, you get to fit the ideas together and then test them in the real world.

There are some things I would like to share.

1. Just because you have a good forecast doesn't translate into cash. It has to be paired with a trading strategy. This is probably why the author thinks the answer is RL, because coincidentally if you approach this problem with RL, it does the forecasting + strategy.

2. I have measured a correlation between heavier processing (using a higher big O) and better out-of-sample performance.

The criticisms of the NN approach, like non-stationary data, have obvious solutions that a 'by the book' trading approach + ML approach doesn't really teach beginners, so they dismiss it.

It is my belief right now that there are people who are prepping data from sources like iextrading then using things like sagemaker to develop good enough forecasting and combining it with a statistics+rules based trading strategy to make living wages.

That said, I have 5k account size for my NN obsessions, and my 401k is 'by the book'.

person_of_color is totally right when he says it is a Moby Dick of programming.

> Just because you have a good forecast doesn't translate into cash. It has to be paired with a trading strategy. This is probably why the author thinks the answer is RL, because coincidentally if you approach this problem with RL, it does the forecasting + strategy.

Exactly, this is one of the nice things about RL. You don't have to do a bunch of handwaving to turn your predictions into a strategy.

It sounds like a lot of fun! I love the idea that there’s one metric ($) to measure the effectiveness of your strategy/code.

Any recommendations or hints on where to get started (assuming I’m decent with python/pandas etc)?


1. I would start in the numerai tournament. I did this for 3 years after my first two years by myself on the market. It's useful because they provide ML-ready data, and you can iterate very quickly. If you do not have ML experience, numerai will teach you about many different types of overfitting and the many correct and incorrect ways to deal with them. An example would be that some ML people always apply dropout, but when you have a small signal to begin with, dropout can drop out the signal, and then there is only noise left for the model to fit and of course it will then perform poorly. The other thing it will help with is the hopelessness that you will encounter from hitting a wall (hitting a wall is common in ML, and should be expected): the scoreboard shows individuals who have broken through that wall, so you know it is possible. I stopped participating after stabilizing in the top 20 because they change the format of the tournament every so often and I wanted my Saturdays back. You don't need to reach the top 20; I hit a wall around rank 100 back when they used actual bitcoin to pay people. You just need to do well in one of the rounds where everyone else fails so you can go through the process of 'what did I do that I'm not aware of that made me succeed where everyone else failed?'

2. Read Advances in Financial Machine Learning by Marcos Lopez de Prado. This goes over the false assumptions that outsiders make, and then outlines rookie mistakes (I made many of the mistakes described in the book, then read it when it came out). It will also break you out of the thinking that leads to typical approaches and explain why it is unrealistic to expect them to work.

3. Become familiar with retail trader mistakes like overtrading, improper sizing, and emotions, as well as the fact that you cannot rely upon regulatory bodies to prevent fraud from occurring; they only act after it has occurred. (This is for scenarios where your model says short this stock, then you see that the stock is fraudulent but it continues to exist.) Learning blackjack probabilities + sizing helps with developing a strategy. Things like, do you want a trading system that has 60% accuracy and 10% profit each time, or one that has 45% accuracy but 200% profit each time? It's interesting because even if you have a 50% accurate 200% profit/50% loss strategy, you still need to calculate the probability of seeing enough losses in a row to bankrupt you if you have the wrong size. In college for me this was covered in the Discrete Math class.

After steps 1+2+3 I think people who have some level of control over their emotions have the right foundation to code a system. There are people that should not trade because they don't have the right personality profile.
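The sizing arithmetic from step 3 can be sketched roughly as follows. This is only an illustration under simple assumptions: trades are independent, every win and loss has a fixed size, and a fixed fraction of the account is risked each time. The function names and the example numbers (200 trades, a streak of 8, risking 100% or 10% of the account) are hypothetical.

```python
def expected_return(p_win, win, loss):
    """Expected return per unit staked for fixed win/loss sizes."""
    return p_win * win - (1 - p_win) * loss

def prob_losing_streak(n_trades, streak, p_loss):
    """Probability of at least `streak` consecutive losses within n_trades."""
    # state[k] = probability of currently sitting on a run of k losses
    # without having reached the full streak yet.
    state = [0.0] * streak
    state[0] = 1.0
    hit = 0.0
    for _ in range(n_trades):
        new = [0.0] * streak
        for k, p in enumerate(state):
            new[0] += p * (1 - p_loss)   # a win resets the run
            if k + 1 == streak:
                hit += p * p_loss        # streak reached, stop tracking
            else:
                new[k + 1] += p * p_loss
        state = new
    return hit

def capital_after_losses(fraction, loss, k):
    """Share of the account left after k straight losses at a fixed fraction."""
    return (1 - fraction * loss) ** k

# The 50% accurate, 200% profit / 50% loss strategy from the comment.
print(expected_return(0.5, 2.0, 0.5))
print(prob_losing_streak(n_trades=200, streak=8, p_loss=0.5))
print(capital_after_losses(fraction=1.0, loss=0.5, k=8))
print(capital_after_losses(fraction=0.1, loss=0.5, k=8))
```

The point of the last two lines is that the same positive-expectation strategy leaves very different amounts of capital after an unlucky streak depending on the fraction risked per trade.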

4. Find a way to fit the data you encounter into a DB. Early on I had to pay 100 a month to get daily csvs for stock data. I wrote code that answered questions for me from the csvs. This was wasted time, because you can write SQL to answer so many questions. Keep this DB on a separate computer from the one you do ML dev on, because the computer that ML dev happens on inevitably gets wiped (it will happen to you).

Then for you it's a matter of just leveraging python+pandas etc. to code a solution that meets your criteria. There are three categories that you have to operate across: infrastructure + forecasting + trading strategy. When you see one of your model's predictions come true it really is a different feeling. But to ease my conscience I should warn you, if you are the curious type and you try this once, you will always be curious about it.

For timeseries data I'm currently using iextrading even though it has downsides (they only have data for trades that route through their exchange). I used to use kibot and alphavantage, scrape yahoo, download stock data csvs from ebay, and save etrade realtime quotes. For placing trades I am currently using alpaca. (I've used IB, etrade, and robinhood's private API before they blocked it.)
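As a rough illustration of "write SQL to answer so many questions", here is a sketch using Python's built-in `sqlite3` module. The table name `daily_prices`, its columns and the price rows are invented for the example, and an in-memory database stands in for the separate DB machine.

```python
import sqlite3

# In-memory DB for the sketch; a real setup would point at a separate server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_prices (symbol TEXT, day TEXT, close REAL, "
    "PRIMARY KEY (symbol, day))"
)

# Made-up rows standing in for the daily csvs.
rows = [
    ("AAA", "2019-01-02", 10.0),
    ("AAA", "2019-01-03", 11.0),
    ("AAA", "2019-01-04", 9.9),
    ("BBB", "2019-01-02", 50.0),
    ("BBB", "2019-01-03", 49.0),
    ("BBB", "2019-01-04", 51.45),
]
conn.executemany("INSERT INTO daily_prices VALUES (?, ?, ?)", rows)

# Question: which symbol had the worst single-day return, and when?
query = """
SELECT cur.symbol, cur.day, cur.close / prev.close - 1.0 AS ret
FROM daily_prices AS cur
JOIN daily_prices AS prev
  ON prev.symbol = cur.symbol
 AND prev.day = (SELECT MAX(day) FROM daily_prices
                 WHERE symbol = cur.symbol AND day < cur.day)
ORDER BY ret ASC
LIMIT 1
"""
worst = conn.execute(query).fetchone()
print(worst)
```

Once the data is loaded, a new question is one more query instead of another one-off csv script.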

I see you posted you are at break even since 2012. Most people wipe out in 3 years so good job.

I recommend trading manually to feel and listen to the market so that you can adapt your code to it. Also, maybe you are overthinking your strategy so don't hesitate to remove parameters.

My discord is open you can join it and see my trades and look to find any insights from what I enter.

Don't do this. It's the programmer's Moby Dick. You are better off self-learning stats/ML skills in your free time and joining a quant fund than trying to do it yourself.

Agreed. It's a goose chase, the house always wins and even a winning system works one week and not the next.

"Then, profits started decreasing and I decided to move on to other things and I lacked the motivation to go back into it."

Is this post about the one with decreasing profits, or a new one that is profitable?

I would love to try trading as a hobby with a little side money, but I would abhor a hobby that reduces to effectively buying the trader-feel-good experience, where you're essentially sponsoring incumbents as a fanboy chipping in his pocket money.

What I would require from a trading platform:

1) decentralized and permissionless
2) provably fair trading

With 'provably fair trading' I mean the protocol should be such that I can prove you are not simply held captive by an intermediary, in whatever shape or form. It should also be fair with respect to latency.

For example, consider a trading market where token X can be exchanged for token Y and vice versa. Each holder of X demands her minimum of Y per X, and each holder of Y demands his minimum of X per Y. What if everyone salt-hashed their demands and paid the market contract (proportional to how much they will actually be allowed to trade) to register their salted hash? When the round has closed, people reveal their salt and plaintext, and the incompatible trading offers get their money back (minus a usage fee perhaps). The compatible ones can have their trades go through at the rate of 'total compatible X offered' to 'total compatible Y offered' (or some variation thereof, say rewarding those that helped close the gap). In this way there is no high frequency trading, and you could have a family of such markets operating at different timescales...
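The salted-hash step described above is a commit-reveal scheme. A minimal sketch of just that part, using Python's standard `hashlib` and `secrets` modules, might look like the following. The order string format and the function names are hypothetical, and the matching and settlement logic of the market contract is left out entirely.

```python
import hashlib
import secrets

def commit(order):
    """Return (digest, salt). Only the digest is published during the round."""
    salt = secrets.token_hex(16)  # fixed-length random salt, kept private
    digest = hashlib.sha256((salt + order).encode("utf-8")).hexdigest()
    return digest, salt

def verify(digest, salt, order):
    """After the round closes, check a revealed salt and order against the digest."""
    recomputed = hashlib.sha256((salt + order).encode("utf-8")).hexdigest()
    return secrets.compare_digest(recomputed, digest)

# Hypothetical order text: sell 100 X for at least 2.5 Y per X.
order = "sell 100 X min 2.5 Y/X"
digest, salt = commit(order)

print(verify(digest, salt, order))                     # honest reveal
print(verify(digest, salt, "sell 100 X min 2.0 Y/X"))  # changed after the fact
```

The salt keeps others from guessing an order by hashing likely values, and the digest keeps the trader from changing the order once the other offers are revealed.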

I've never tried the AI trading path but I imagine that you can't get huge gains with public data, unless you find a way to extract "hidden" information by processing real time news.

I wonder nevertheless if there's a sweet spot where you can build a simple AI trading algorithm and get modest earnings from it.

I think the answer is yes & no. If you come up with a sufficiently clever strategy using public data that other people haven't thought to use it's definitely doable. For example, someone with a good understanding of meteorology would've had a significant advantage a few decades ago (though trading firms have since caught on). You wouldn't need a perfect data set if the strategy isn't being used.

In terms of strategies based purely on market data, you are definitely correct. Any publicly (freely/cheaply) available market data is low resolution, lacking the full data from any point in time, and generally based on poor approximations of the actual data (elsewhere in this thread someone mentions that IEX's data is based on trades that get routed through the IEX exchange, which obviously misses any data you could get from the markets that make up 99% of the volume, dark pools, etc.).

I think the "sweet spot" is simply coming up with a strategy that nobody else has thought about, or else executing a better-known strategy more effectively than other market participants. Both are hard, but somewhat in the realm of possibility. The problem is that many people think there's free money to be made without either of these.

For the last year or so I have been working on a ML-based trading system in the domain of crypto with two friends. I made more in 2 months than I used to in a year. This is after thousands of "full positions swings" and millions of trades (short and long). We are now experimenting with different classes of trading strategies to reduce risk.

We would like to find 1 or 2 more people to work on this project. We need people who can tolerate risk and are skilled at data engineering: data pipelines, psql, pandas, numpy, data visualisation, setting up servers. Ideally they would also be skilled at machine learning / deep learning and have tried their hand at trading systems. If interested, my email is in my about info.

"Actually, many months my PnL graph looked something like this: (this is generated to get a point across, but my real data looked extremely similar):"

I'd love to see the actual data

What's the real performance of the system so far?

Probably not very good. Voleon does all ML-based trading and what I've seen of their returns does not give me any confidence in ML-based trading having alpha. I would estimate that at best in a good year returns would be like 5% y/y in the long term, much less than the sustained ~7% that index funds offer especially when adjusting for risk. Just speculation but there's a lot of firms with much more capital, better tools, and teams of extremely intelligent people who have pretty poor returns because of how good the competition already is.

I think you're comparing apples to oranges here. These funds manage billions of dollars of client money, which forces them into highly liquid markets with scalable strategies. That's quite different from how individuals or smaller prop funds can operate, trading off capacity for higher returns by trading in less liquid markets or with strategies that are "not worth it" for large hedge funds. If you must manage billions of client money then you are right in terms of competition, but as someone who only trades his own capital, you can see a lot higher returns.

>These funds manage billions of dollars of client money, which forces them into highly liquid markets with scalable strategies.

Obviously this is true, but I think you're missing the point. Trading with ML on price data is a strategy that literally anyone can reproduce and, as is evident from reading the comments in this thread, is something that many people have tried to replicate. In that context, everyone using that strategy is effectively acting as a large fund. Further, a large fund or prop shop can deploy small-scale strategies; I think the limiting factor really tends to be leverage. But if they are just trying to make 5% returns for example, they can deploy a lot of small strategies that make ~5% returns. And that's not mentioning the countless tiny shops operating under the radar trading <10-50 million AUM (really, I think the average fund is much smaller than what you would imagine). What I'm getting at is that there are a lot more market players than the "big guys" and they will either have an equivalent strategy to you or will be better equipped to take advantage of that same alpha because of more capital/better data sources/smarter stats. With that in mind, it seems insane to suggest that you can find significant alpha in such low-hanging fruit.

Remember that you are trading during one of the longest bull markets in history. It's not hard to make good returns, but it is hard to analyze risk. There are a million and one ways to make 100% y/y, but a fraction of a percent of those will continue to work in the long-term. With a black-box model you cannot properly assess risk. Even with well-understood models, this is something that real industry players struggle with: backtrading alpha != simulation alpha != profitable alpha != long-term alpha.

Also considering you're the OP & are trying to argue in favor of this type of trading, it would be very informative to disclose what kinds of returns you actually made. It's hard to expect people to listen to your opinion in a game where everyone successful is motivated towards secrecy.

Agreed. Many smaller, yet successful Hedge Funds limit the capital they manage for this very reason. Some strategies just don't work at certain scale.

If the above is true, why would a fund not just allocate a small amount of resources to trade on OP's strategies? Either:

(1) OP's strategy performs worse than the alternative
(2) They already do this, and have resources that allow them to outperform OP at their own strategy

If the returns are really meaningful, i.e. better Sharpe ratio than just holding $SPY or some dead simple strategy like that, then (2) must be true at least _somewhere_.

Returns don't matter. People always compare strategies to the S&P 500's return, but it doesn't make any sense. In finance, strategies are compared on their Sharpe ratio, not on their returns. This is because leverage can be applied to make the returns better. For example, if you had a strategy that, unlevered, returned 5% with 2% volatility, that would be pretty amazing. If you wanted, you could lever up 4x and get around 20% return and 8% volatility (though, it's not that simple, since you are going to pay a volatility tax, but we'll ignore that).
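The leverage arithmetic in that example can be written out as a small sketch. It makes the same simplifications the comment does: borrowing is free, the risk-free rate is treated as zero, and the volatility drag on compounded returns is ignored. The function names are just for illustration.

```python
def lever(mean_return, volatility, leverage):
    """Scale return and volatility by a leverage factor.

    Assumes zero borrowing cost and ignores volatility drag.
    """
    return mean_return * leverage, volatility * leverage

def sharpe(mean_return, volatility):
    """Sharpe ratio with the risk-free rate assumed to be zero."""
    return mean_return / volatility

# The 5% return / 2% volatility strategy levered 4x.
levered_return, levered_vol = lever(0.05, 0.02, 4)
print(levered_return, levered_vol)
print(sharpe(0.05, 0.02), sharpe(levered_return, levered_vol))
```

Return and volatility both scale with leverage while their ratio stays the same, which is why the comparison is made on Sharpe ratio and not on raw return.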

Another thing to look at is correlation to the market. The less correlated to the market your strategy is, the more valuable it is. This is because investors like uncorrelated strategies. For example, let's say you have n strategies, each with a volatility of sigma and mean return of mu. Allocating all of your money to one strategy or two or all of them won't change your return, it will still be mu. But if the strategies are uncorrelated, and you equal weight each one, your vol will be sigma/sqrt(n) and your return will be mu. This is the essence of Modern Portfolio Theory (MPT): add as many uncorrelated assets as you can.
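The sigma/sqrt(n) claim follows from the variance of an equal-weighted sum. A small sketch, assuming every strategy has the same volatility and every pair has the same correlation `rho` (with `rho = 0` being the uncorrelated case from the comment); the function name and the 10% volatility are made up:

```python
import math

def equal_weight_vol(sigma, n, rho=0.0):
    """Volatility of an equal-weighted portfolio of n strategies.

    Each strategy has volatility sigma and every pair has correlation rho.
    Variance = sigma^2 * (1/n + (1 - 1/n) * rho).
    """
    return sigma * math.sqrt(1.0 / n + (1.0 - 1.0 / n) * rho)

for n in (1, 4, 16):
    print(n, equal_weight_vol(0.10, n), equal_weight_vol(0.10, n, rho=0.5))
```

With `rho = 0` the volatility falls as sigma/sqrt(n); with any positive correlation it levels off at sigma * sqrt(rho), so adding more strategies stops helping.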

In no particular order, here's a list of things that matter when evaluating an (equity) strategy: turnover, size of alpha being exploited, Sharpe ratio, correlation to market and sector, correlation to style factors (value, momentum, oil, etc), and net exposure (long or short).

I mentioned the idea of adjusting for risk in the GP post, though you are correct that I didn't call out any specific measure like the Sharpe ratio by name. If your risk-adjusted returns are worse than the S&P 500's then obviously leverage isn't going to fix that problem. Simply put, I don't think OP's strategy really has any of these desirable features like a good Sharpe, market neutrality, low exposure, etc. I think OP is just naively building a portfolio based on price predictions extrapolated from price data, and it happens to work in a bull market. Since OP probably doesn't have the resources to really evaluate risk (good simulation tools, good historical data sets, & even the industry know-how of how to look at risk) it seems rather meaningless to hear "I used to have very good returns."

Apologies if I misinterpreted your comment, just my thoughts when reading it.

I guess all I'm saying is that a strategy that has a worse Sharpe ratio than the market but zero correlation could still be valuable to a lot of investors, for the same reason that sometimes it makes sense to add an asset to your portfolio that lowers your expected return. It's possible that that one strategy with the worse Sharpe ratio, when combined with your pure beta investments, would yield a portfolio with a better Sharpe ratio than either allocation alone.
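To put numbers on that, here is a sketch with entirely made-up figures: a market holding with 7% return and 15% volatility, and an uncorrelated strategy with 3% return and 10% volatility. The risk-free rate is taken as zero and the 50/50 split is arbitrary.

```python
import math

def portfolio_sharpe(w, mu_a, vol_a, mu_b, vol_b, rho=0.0):
    """Sharpe ratio (risk-free rate assumed zero) of w in A and 1 - w in B."""
    mu = w * mu_a + (1 - w) * mu_b
    var = (
        (w * vol_a) ** 2
        + ((1 - w) * vol_b) ** 2
        + 2 * w * (1 - w) * rho * vol_a * vol_b
    )
    return mu / math.sqrt(var)

# Hypothetical inputs: A is the market, B is the uncorrelated strategy.
market = portfolio_sharpe(1.0, 0.07, 0.15, 0.03, 0.10)
strategy = portfolio_sharpe(0.0, 0.07, 0.15, 0.03, 0.10)
blend = portfolio_sharpe(0.5, 0.07, 0.15, 0.03, 0.10)
print(round(market, 3), round(strategy, 3), round(blend, 3))
```

With these inputs the blend has a lower expected return than the market alone but a higher Sharpe ratio than either piece, which is the effect described above.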

That's an interesting point that I didn't really consider. Thanks!

They don't do "all ML based" trading. By definitions of the smurf who wrote OP about "AI trading systems," they don't do ANY ML based trading.

Their returns kind of suck, but it's more to do with their trading frequency and correlations than anything else.

Just curious: have financial firms implemented something similar to this?

Modern finance is built on top of this type of technology. There are hundreds if not thousands of firms participating in "quantitative finance," attempting to use computers/statistics to predict markets. The vast majority of trades go through major practitioners of this exact idea.

To be honest, I don’t believe a word about the performance using AI, especially if the article doesn’t present the features and the NN architecture. It's always the question: would a super simple model perform the same way? And very often the answer is yes.

Lol this. I once built a model (btc) that assumed "we can't know the direction of the market so we might as well guess", after spending months reading papers and trying to be "clever".

It starts off picking a random market direction (up/down) and places a bid (sorry, I mean makes a trade). Then, based on lots of tuning/backtesting, it decided how long to be in position and what the stoploss is. I think in the end the most "profitable settings" were something like:

$profit_size = 0.38%
$stop_loss_size = 0.35%

Win-Continue-Direction = 3 rounds (after winning/losing, do we change direction?)

So in the end it was probably a Markov model with a random start, if we had to label it :)

Oh and for fun it would also "martingale for x rounds" :P

Worked quite well for 3-4 days and was fun implementing it while watching "Billions" on TV in the background :D
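A quick sanity check on settings like these, under the strong assumption that price is a driftless random walk whose moves are small relative to the targets: the chance of hitting the profit target before the stop is then stop / (profit + stop), and the expected result per trade before costs is zero. The function names and the 0.1% round-trip cost are invented for the sketch.

```python
def win_probability(take_profit, stop_loss):
    """Chance of reaching the profit target before the stop (driftless walk)."""
    return stop_loss / (take_profit + stop_loss)

def expected_pnl(take_profit, stop_loss, cost=0.0):
    """Expected result per trade, in the same units as the targets."""
    p = win_probability(take_profit, stop_loss)
    return p * take_profit - (1 - p) * stop_loss - cost

# Targets from the comment, in percent; the cost figure is hypothetical.
print(win_probability(0.38, 0.35))
print(expected_pnl(0.38, 0.35))
print(expected_pnl(0.38, 0.35, cost=0.1))
```

Under that assumption the random-entry bot breaks even before fees and loses the fee on average afterwards, so a few good days are easy to get by luck.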

Actually, I'm not saying that you can't be profitable. Just saying that a fancy NN won't work like magic in a noisy environment like financial time series. For other time series with periodic and other anomalies you can apply a NN and you may have an edge, BUT for those kinds of environments classical methods also work quite well.

trading != investing

or trading == investing (and changing your mind very often)

In some markets it is necessary to put the same length of fiber-optic cable between the colocated servers, so that being closer to the exchange's cabinet doesn't translate into an advantage. So obviously we're talking about extremely low latency, high-frequency trading. This carries a huge amount of prerequisites to even get started.

Not only that, but there are many different order types besides "buy at market price, sell at market price". Then there are options, short sales, and more.

It goes deep. People devote 30 years of their career to this. Read the author's experience as a kind of warning, if you will.

Agree to an extent, but not all money in quant finance is generated through HFT. Notably, I don't think funds like Rentec are really doing much to get low-latency [1]. Latency obviously does matter for any kind of quant trading, but to my understanding a good enough strategy + slippage models & the likes can overcome this.

[1] You can find a list of NYSE broker/dealers here: https://www.nyse.com/publicdocs/nyse/markets/nyse/members/NY... -- any firm where latency matters will need to be on this list to colo on the exchange.
