Using Reinforcement Learning in the Algorithmic Trading Problem (arxiv.org)
288 points by godelmachine on April 29, 2020 | hide | past | favorite | 135 comments

As someone who has written about this previously [0], worked briefly in HFT before, and read dozens of papers on the subject, I can say with very high confidence that the results are not to be trusted. This paper, just like pretty much any academic paper on the subject, ends with a backtest on historical data, not a real system.

Not only is it (very!) easy to overfit backtests (especially with so little data they are using here), but backtests are nothing like the real world. In the real world there are HFT traders front-running you, latency, jitter, fees, hidden order types, slippage, and a lot of other complexities that don't fit into a short HN post. Whenever you see a paper ending with a backtest you can already assume it's BS.

It's similar to training a robot in an extremely simplified 2D simulation environment without physics or other interactions, and then claiming one has built a real robot. A mistake many people make is believing that trading is all about AI. But in reality, the model often matters less than infrastructure/latency/system/data issues.

In addition to that, people who are actually "good" at trading don't publish papers, they silently make money. Papers are typically published by academics or students who have never built anything profitable but would like to put a paper on their resume. I have yet to see a single good academic paper about trading.

[0] https://www.tradientblog.com/2019/11/lessons-learned-buildin...

If you want to read useful academic papers about trading, there is one author in particular who is actually not bad - Zura Kakushadze. Most of his stuff is applicable to mid-frequency trading, not HFT. He worked at WorldQuant (a reputable trading firm), and the founder of WQ, Igor Tulchinsky, is a coauthor on one of his papers.

An example of a pretty interesting and accessible one is "101 Formulaic Alphas" [0].

[0] - https://arxiv.org/pdf/1601.00991.pdf

This paper is a hilarious dump of WQ's randomly generated formulas that (hopefully) happened to pass an in-sample test.

  Alpha#33: rank((-1 * ((1 - (open / close))^1)))
This formula trivially reduces to

  rank(open/close - 1)
which is an example of a mean-reversion strategy. But: 1) nobody bothered to simplify this formula, and 2) like any mean-reversion strategy, it is extremely difficult to trade.
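The reduction is easy to verify numerically. A quick sketch (the four-stock cross-section is hypothetical; `rank` here is a cross-sectional rank, as used throughout the alpha paper):

```python
import numpy as np

def rank(x):
    # cross-sectional rank, as used in the 101 Alphas formulas
    return np.argsort(np.argsort(x))

# hypothetical open/close prices for a four-stock cross-section
open_ = np.array([10.0, 20.0, 30.0, 40.0])
close = np.array([10.5, 19.0, 31.0, 39.0])

alpha33 = rank(-1 * ((1 - (open_ / close)) ** 1))
simplified = rank(open_ / close - 1)
print(np.array_equal(alpha33, simplified))   # True: the two forms rank identically
```

Since `-1 * (1 - x)` equals `x - 1` exactly (and `**1` is the identity), both forms produce the same ranks on any input.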

Why is mean reversion difficult to trade?

High turnover and high costs: you flip your position too often.

If the market flips against me, I just double my bet on the next play.

the market can stay irrational longer than you can stay solvent

Martingale is a sure-fire way to lose all your money rather quickly.

That's a great way to go broke very quickly.

Only if you don't have infinite money on the side, in which case it's guaranteed to be profitable

You also need the other side of your trades to have infinite money. But in any case, why are you trading at all if you already have infinite money?

Martingales with transaction costs are not profitable.
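A quick Monte Carlo sketch of why (fair coin-flip market, fixed per-trade fee; all parameters are made up): with a finite bankroll, the doubling scheme eventually hits a losing streak it cannot cover, and the fee drains it in the meantime.

```python
import random

def martingale_run(bankroll=1000.0, base_bet=1.0, fee=0.01,
                   p_win=0.5, max_rounds=100_000, seed=0):
    """Double the bet after each loss; a per-trade fee is charged every round."""
    rng = random.Random(seed)
    bet = base_bet
    for _ in range(max_rounds):
        if bet > bankroll:        # can't cover the next doubled bet: ruined
            return True
        bankroll -= fee           # transaction cost on every trade
        if rng.random() < p_win:
            bankroll += bet
            bet = base_bet        # win: recoup streak losses, reset to base bet
        else:
            bankroll -= bet
            bet *= 2              # loss: double down
    return False

ruins = sum(martingale_run(seed=s) for s in range(200))
print(f"ruined in {ruins}/200 simulations")
```

Nearly every run ends in ruin: the bankroll grows linearly between streaks, but the streak length needed to wipe it out only grows logarithmically.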

picking a random one out of the pile:

> Alpha#90: ((rank((close - ts_max(close, 4.66719)))^Ts_Rank(correlation(IndNeutralize(adv40, IndClass.subindustry), low, 5.38375), 3.21856)) * -1)

I wonder how these magic numbers get picked (4.66719, 5.38375, etc) -- I guess there is some optimization solver which attempts to find the most profitable variables for a given alpha formulation, but isn't this approach also very vulnerable to overfit?

Yup, it's probably just the output of an optimizer and then tested on held-out future data. Not overfitting is the key here and what's really hard. You need to be careful about the number of parameters and the amount of validation data you have.

These alphas will likely be only profitable for a short time period as long as the market data distribution (i.e. strategies of other market participants) doesn't change. So you would need to continually optimize and update them.

The way I think about it is that you are essentially finding the right parameters to "exploit" the combination of algorithms of all other participants, where algorithm could also be a human looking at charts and following certain rules, with a lot of random noise from retail traders thrown in.
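A toy demonstration of that overfitting risk, on synthetic data where no edge exists by construction (the "strategies" here are made up for illustration): the best of a hundred-odd parameter combinations still shows a positive in-sample Sharpe, purely from selection.

```python
import numpy as np

rng = np.random.default_rng(42)
returns = rng.normal(0, 0.01, 2000)          # pure noise: no edge exists by construction
train, test = returns[:1000], returns[1000:]

def strategy_pnl(r, lookback, direction):
    # toy signal: trade with (+1) or against (-1) the last `lookback`-bar move;
    # sig[i] uses r[i:i+lookback] and is applied to r[i+lookback] (no lookahead)
    sig = direction * np.sign(np.convolve(r, np.ones(lookback), "valid"))[:-1]
    return sig * r[lookback:]

def sharpe(pnl):
    return pnl.mean() / pnl.std() * np.sqrt(252)

# "optimize" lookback and direction on the training window
params = [(lb, d) for lb in range(2, 60) for d in (-1, 1)]
scores = {p: sharpe(strategy_pnl(train, *p)) for p in params}
best = max(scores, key=scores.get)
print(f"best in-sample Sharpe {scores[best]:.2f} with params {best}")
print(f"same params out-of-sample: {sharpe(strategy_pnl(test, *best)):.2f}")
```

The in-sample winner is always positive here (every strategy appears in both directions, so the max over the pool cannot be negative), while its out-of-sample Sharpe is just another noise draw.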

Seems kind of rudimentary. Namely

> (sign(delta(volume, 1)) * (-1 * delta(close, 1)))

That's crazy. Would be interesting to see WTF a "mega-alpha" actually does using these strategies.

I believe they may have used something along the lines of genetic programming to create this equation - not sure about the high-precision constants. The search space is compute-intensive. Many years back, I used that technique to generate a profitable strategy. These things work, and they differ depending on the timeframe/sampling, stock, trend and money management.

> In addition to that, people who are actually "good" at trading don't publish papers, they silently make money.

Well, that is mostly true. But never discount anything. There are people like me who used to love the data analysis and prediction part of these markets. I got hooked on the markets because of it. I was not interested in making money and naively thought my average pay was good enough. When I first built (or my machine built) a working strategy (in early 2008), I live-traded/tested it for a couple of months and told a few colleagues the details of the strategy - they did not take me seriously. This was even before I understood NNs or any of the scikit-learn tooling. I knew I wanted to get into financial markets - I went to a broker to sell the automated strategy while seeking a full-time job as an algo-trader - and they thought I was trying to scam them, even after seeing the contract notes. Plus, algo-trading had not picked up back then, and I only found out later about such scams. It took me 3 more years and a financial crisis to understand the value of making "much more than enough" money. And retrospectively I know those were just stupid attempts at trying to convince others and to give it away.

You make a good point. I've also gotten into trading because I enjoy the algorithmic and mathematical aspects, and I would love to share more of what has been working for me and write extensively about it. And there are probably more people like that out there. However, trading has such a bad reputation and uncertain future that I am not sure that's a good career move. I'm torn.

You're right that there are probably some gems and people writing up good posts and articles. However, 99% of what comes to my inbox, which is certain newsletters and arXiv subscriptions, is clearly BS. I'm particularly disappointed with arXiv/academia, because in other fields like biology and CS/ML/AI, published papers tend to be of higher quality than your average blog post. In trading the opposite seems to be true. Seeing a good trading paper on arXiv is incredibly rare. I would even go as far as saying that reddit is a significantly better source of information than arXiv for this field.

Have you tried quantpedia? https://quantpedia.com/

It's expensive but I find it a really good source for ideas.

So how should we evaluate the quality of a paper on trading AI? I mean the authors might not have access to real data, but their ideas might still be good.

There are some ML problems where it is fundamentally impossible to use historical data to make accurate forward-looking predictions, as the data is not IID. These fields require you to very carefully capture data on sub-optimal choices. In the case of trading this means making explicitly bad trading decisions some portion of the time, and teams that have done this at any scale are unlikely to share the data.

In the case of trading, any paper not tackling these issues head on is not likely to be useful.

I don't get it - if you have accurate historical data, how is this different from having access to current real-time data? Why can't you pretend you live 20 years in the past and use the data you have as if it were real-time?

Data distribution shift. The market changes over time, and your current data does not come from the same distribution as old data. That limits the amount of data you can use for training and testing, and you need to be very careful not to overfit. That's especially true for something like daily or hourly data - there isn't much of it to begin with, and you won't have much left if you restrict yourself to only a few weeks or months. Market data also has a low signal/noise ratio, so you need a good chunk of it to learn from.

As you go to shorter time scales you get more usable data, but then you also need to deal with other issues such as latencies/jitter, market impact, complex order types, order book queues, etc. It becomes a different game.
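A common way to cope with that distribution shift is walk-forward testing: re-fit on a rolling window, trade only the next short slice, then roll forward. A minimal sketch on synthetic data (the "model" here is a deliberately trivial placeholder):

```python
import numpy as np

def walk_forward(returns, window=250, horizon=50):
    """Re-fit on a rolling window, then trade only the next `horizon` bars."""
    pnl = []
    for start in range(0, len(returns) - window - horizon, horizon):
        fit = returns[start : start + window]
        live = returns[start + window : start + window + horizon]
        # toy "model": trade in the direction of the fitting window's mean return
        direction = np.sign(fit.mean())
        pnl.extend(direction * live)
    return np.array(pnl)

rng = np.random.default_rng(0)
r = rng.normal(0.0002, 0.01, 2000)   # synthetic daily returns with a small drift
pnl = walk_forward(r)
print(f"walk-forward bars traded: {len(pnl)}")
```

Each parameter set is only ever evaluated on data that comes after the window it was fit on, which is the closest a backtest can get to honest out-of-sample behavior under a shifting distribution.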

For one, because your trading existence in that universe would change the future, which you can't account for. Your activity influences the decisions of other HFTs in real time, whereas with a static history you're claiming to be able to trade without perturbing the markets.

Fundamentally, the issue is that in real time you may not be able to make the trade that your algorithm chose. You could get close if you had the actual book prices at any given time, but even then, you might lose to someone who is 1 millisecond faster. So no, backtesting cannot fully simulate reality. Interactive Brokers offers a simulated account where you can practice "live" trading, although it's still not the same, since there's no money involved. But if I see a paper tested on an IB simulated account I'll be very interested - and by then it'll already be too late.

Good point, but I was mainly thinking about making a single good decision to multiply your investment, not HFT. Like identifying that it was a good idea to invest in Tesla stock 7 years ago.

Papers on trading models that show feature importance (rather than backtest results) are more valuable.

>I'm particularly disappointed with arXiv/academia, because in other fields like biology and CS/ML/AI, published papers tend to be of higher quality than your average blog post.

You should really google up something called the Gell-Mann amnesia effect. 99.9% of everything is shit. Including biology, CS, ML and especially "AI."

Of course trading papers are even more universally shit, but once in a while someone publishes a risk factor that's non-obvious to me.

Thanks for the Gell-Mann amnesia effect.

Would you kindly give me a fair idea as to what’s a good amount of money to be made in this field?

As with most industries, it depends. But junior people typically make in the 200-500K range. Then as you gain experience, develop your own ideas/strategies and are able to manage risk appropriately, the sky is the limit. The closer you are to managing money that's being invested the more you make. If you can run a 1.5 - 2 Sharpe strategy and never dip below a ~5% drawdown, i.e. probably have substantial positive skew in returns, you can make in the millions or tens of millions at the right fund. Note that as the OP correctly alluded to, this is much more difficult to do live than in a backtest.
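For reference, the two figures quoted here - Sharpe ratio and max drawdown - are straightforward to compute from a daily return series. A minimal sketch (risk-free rate assumed zero; the return series is synthetic):

```python
import numpy as np

def sharpe_ratio(daily_returns, periods=252):
    # annualized Sharpe, with the risk-free rate assumed zero for simplicity
    return daily_returns.mean() / daily_returns.std() * np.sqrt(periods)

def max_drawdown(daily_returns):
    # largest peak-to-trough drop of the compounded equity curve
    equity = np.cumprod(1 + daily_returns)
    peaks = np.maximum.accumulate(equity)
    return ((peaks - equity) / peaks).max()

rng = np.random.default_rng(7)
r = rng.normal(0.001, 0.01, 252)    # hypothetical strategy: ~0.1% mean daily return
print(f"Sharpe: {sharpe_ratio(r):.2f}, max drawdown: {max_drawdown(r):.1%}")
```

A "1.5-2 Sharpe, under 5% drawdown" strategy is one where these two numbers stay in that range over live, not just backtested, returns.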

Is 200-500k still true? It used to be, but I think it has decreased significantly over the last decade. I'd say most junior people in this field are making about the same or less than software engineers these days.

But like you said, the range here is incredibly wide and largely depends on how well your strategies do and if you have your own desk/fund.

I don't think it is. I worked for one of the major HFTs and new graduates earned far less than that. In addition, the churn rate was high - I'd say most new-graduate hires didn't last more than 2 years, and what you learned in those 2 years was often of little use for other career paths.

Bonus distribution was reverse exponential. Like traditional consultancy partnerships, a small few of the old hands made serious money, but those at the "bottom" made OK money, and after a few years their FAANG-based contemporaries were doing better. Advancing up the ranks was not guaranteed even if you survived the frequent bloodlettings.

I wonder if this isn't a cause of the lack of progress in science. Why create wealth for all when you can acquire currency for yourself by managing other people's money?

It seems to me like those tragic stories of geniuses who died young. What could have been if their ideas had reached the world? But instead of dying, the geniuses got sequestered into finance and secrecy, volunteering to make no mark at all on the world in their passing.

++ this.

If they haven't tested this in actual trades and measured results, it's probably worthless. Even backtested strategies at actual firms decay (or don't work) when they go live. And those are places that invest in backtesting methodology (and are incentivized to get it right!).

Yes. To add to that, leakage of information is very hard to eliminate during back-testing. Even a fractional bit of information is already too much. In academic papers this is usually ignored.

> "good" at trading don't publish papers

I think this is a little unfair. I've seen high-quality papers from PhD students who then got hired by financial firms and were apparently very successful. Every good real-world AI system requires both good engineering and good science, and it's disingenuous to suggest that all science that isn't actively being applied yet is BS.

I don't claim that all the science that isn't actively being applied yet is BS, but this kind of science typically happens within trading firms, tested on real-world data, and is not being published on arXiv.

As a side note, what this specific paper did is neither novel nor innovative, so it's very fair to criticize it. A3C is 4 years old, and they just take it and run it on some data. It's like downloading a convnet and running it on MNIST. There have been hundreds of papers on RL + trading. I see them in my arXiv emails every other day and they all do the same thing.

I was referring to these statements:

> This paper, just like pretty much any academic paper on the subject, ends with a backtest on historical data, not a real system

> Whenever you see a paper ending with a backtest you can already assume it's BS.

I guess he was referring to the fact that though the people may be great, their published content in that one paper need not be so.

John von Neumann had a similar observation:


Edit: downvoters, care to elaborate?

> Not only is it (very!) easy to overfit backtests (especially with so little data they are using here), but backtests are nothing like the real world.

I know this, and I ran a company where people should know this, but so many people are so easily swayed by "authority"

like, so and so made trading programs for Investment Bank Co 20 years ago so you know their trading algorithm has to have merit

uh no, they are not retired, they are broke and can't even fund $10k into a trading account to try it

at this point all I would say is just smile and nod.

While it's so easy to dismiss someone's work as flawed (sure, the backtest is illusory, but do you have anything better?), which I think it may be, I always read it and try to understand what they're up to. Sure, academic folks may have no clue about market microstructure and other complexities, but if they can solve, or make some headway toward solving, the difficult problems in stochastic processes, they're already worth my effort.

I actually believe that trading is an interesting problem that should be studied more in Academia and Machine Learning. It has many aspects (sparse rewards, long-time horizons, simulation-to-real-world transfer, non-stationary data distributions, etc) that current ML algorithms struggle with.

Unfortunately it seems like most ML people are not really interested in trading, perhaps because it has such a bad reputation (which is IMO unjustified) - so they work on games instead :)

"most ML people are not really interested in trading"

You couldn't be more wrong on this. Stock market trading has the lowest barrier to entry of any endeavor. All you need is $1000 and a Robinhood account, which you can open on your phone in 5 minutes or less.

I've been following HN for a while; every time someone comes up with a trading algo or posts a link to one, there are hundreds of upvotes and lots of comments.

I think it's just because they want to publish, and as you say it's not easy to find a really good publicly available dataset or simulator. I think if these were publicly available and there were a Python package, people would rapidly get interested in RL for trading. (If it even works for trading - I don't know much about it, but maybe simpler techniques work best, in which case there would be little chance of producing a publishable paper.)

There are plenty of free datasets out there. You can get upwards of 10 years of daily OHLC stock data on Yahoo Finance. The amazing thing is YF has the S&P 500 index going back to 1927. Free!

Quandl has many free or low-cost stock market/commodity datasets.

I'm not sure what you mean by a "simulator". One of the greatest challenges in applying RL to the stock market is precisely that the market itself is not an MDP.

I don't think daily OHLCV data is a good data source. First of all, there's too little of it because of the data distribution shift over time. It's also driven significantly by outliers and events outside of the data (news, etc). There's so much noise in daily prices that most of the signal is drowned out (longer time horizons = more uncertainty). I don't believe you can find any edge looking at daily data. This kind of data would be the equivalent of what MNIST is in ML: nice for playing around, but nobody who is serious would use it for production or benchmarking, at least not by itself.

There is a good reason trading firms pay a lot of money (sometimes millions) for fine-grained historical data from exchanges. It's not only about speed. For interesting experiments you IMO need L2 or L3 order book data, ideally on second or sub-second scales. That's not HFT (which is nanos and micros), but somewhere in the "middle" - it's a different world from what you are talking about.

By simulators he means market simulators for L2/L3 data with a matching engine, latencies, queue positions, jitter, complex order types, etc. You can't simulate other market participants (at least not fully, but there are techniques to even estimate this based on live trading feedback), but there are still many things left that you can simulate in a realistic way during training and backtesting. Trading companies typically have their own high-performance simulators built in house. Some of these are incredibly complex. Good simulators can give you a huge edge and are absolutely necessary.
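For a sense of what the core of such a simulator looks like, here is a heavily simplified matching engine with price-time priority. Everything that makes real simulators hard - latency, queue-position modelling, jitter, exotic order types, market impact - is deliberately omitted:

```python
import heapq

class MatchingEngine:
    """Toy limit-order matching with price-time priority. Real simulators add
    latency, queue-position modelling, jitter, and exchange-specific order types."""
    def __init__(self):
        self.bids, self.asks = [], []   # heaps: bids keyed by -price, asks by +price
        self.trades = []
        self._ts = 0                    # arrival counter for time priority

    def submit(self, side, price, qty):
        self._ts += 1
        opp = self.asks if side == "buy" else self.bids
        while qty > 0 and opp:
            key, ts, resting_qty = opp[0]
            resting_price = key if side == "buy" else -key
            if (side == "buy" and price < resting_price) or \
               (side == "sell" and price > resting_price):
                break                                  # prices no longer cross
            fill = min(qty, resting_qty)
            self.trades.append((resting_price, fill))  # execute at the resting price
            qty -= fill
            if fill == resting_qty:
                heapq.heappop(opp)
            else:
                opp[0] = (key, ts, resting_qty - fill)  # key unchanged: heap stays valid
        if qty > 0:                                     # leftover rests in the book
            own = self.bids if side == "buy" else self.asks
            heapq.heappush(own, (-price if side == "buy" else price, self._ts, qty))

eng = MatchingEngine()
eng.submit("sell", 100.5, 10)
eng.submit("sell", 100.0, 5)
eng.submit("buy", 100.5, 8)     # takes 5 @ 100.0 first, then 3 @ 100.5
print(eng.trades)               # [(100.0, 5), (100.5, 3)]
```

Even this toy version shows why backtests mislead: the fill you get depends on what is resting in the book when your order arrives, which a bar-data backtest never models.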

What you said about daily data is precisely what makes stock mkt so interesting and challenging : nonstationarity.

"outliers and events outside of the data, news" : these are precisely the stuff your models need to learn, and the fact that you consider them noise tells me most folks have no clue how to predict these "noise".

I would actually say the opposite - it's much easier to get hold of a financial dataset than datasets in other fields. There are packages for easily downloading Yahoo Finance data, e.g. https://github.com/ranaroussi/yfinance

> people who are actually "good" at trading don't publish papers, they silently make money

I've long understood that this was true. It makes intuitive sense.

But are there any cases where it is not true?

Is it possible to "spread the wealth" when it comes to trading, or any money-making endeavor?

Or does it always reduce down to "I win only because you lose"?

I don't think it necessarily has to be true. I also built a profitable system and wrote about it, but I didn't share all the details. Not even close. There are just too many small details that must be "just right" that they would fill a whole book. It's kind of like building an operating system from scratch. It's not something you can put into a single post or paper. There isn't one "trick" that suddenly makes it all profitable - it's a combination of so many small details.

Then there is the cultural aspect. People who are working in trading are just not used to sharing openly. They don't write online, or anywhere. They are not even allowed to write due to their employers. And people who work in academia are naturally not working on "production" systems - their only job is to write, not to build. So you almost never see people in the intersection of: writes-online & understands-trading & is-not-in-academia

A famous example of a strategy that was published, and shared, whilst still profitable is that of Benjamin Graham; several of his students went on to be incredibly successful traders (we all know about Buffett, right?).

But that's an exception rather than a rule

It is an absurdity to believe there has never been a good trading strategy published in a paper.

The real value of publishing a trading strategy, though, is in signaling to future employers.

Ultimately, money is made by the ability to come up with new strategies. Any single strategy is only going to live for so long before it dies and is no longer profitable.

To spread one's wealth, one can donate to charities.

Opening up one's secrets of trading seems to only make sense if one has found deeper, more effective secrets, so that the old crop is not going to be seriously competitive, but a bit of good PR would come in handy.

Trading is inherently zero sum.

Zero-sum in wealth, but not zero-sum in utility. Otherwise, people wouldn't trade at all.

That's not true, it just needs to cost you more to not play than it does to play.

Something like:

    |            | Play | Don't Play |
    | Play       |   -5 |        +10 |
    | Don't Play |  -10 |          0 |

Your payoffs are not zero sum.

> In game theory and economic theory, a zero-sum game is a mathematical representation of a situation in which each participant's gain or loss of utility is exactly balanced by the losses or gains of the utility of the other participants.

Yes? The point is that no one playing has the highest payout for the group[1].

[1] You can change the +10 to +9 if you want to make it the absolute highest total payout.
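A quick check of the table above confirms the point (renaming "Don't Play" to "dont" for code; payoffs exactly as in the matrix, assuming a symmetric two-player game where the column player's payoff is the mirrored entry):

```python
# row player's payoff for (own move, opponent's move), from the matrix above
payoff = {("play", "play"): -5, ("play", "dont"): 10,
          ("dont", "play"): -10, ("dont", "dont"): 0}

cells = [("play", "play"), ("play", "dont"), ("dont", "play"), ("dont", "dont")]
# sum of both players' payoffs in each outcome; zero-sum requires all zeros
sums = {(a, b): payoff[(a, b)] + payoff[(b, a)] for a, b in cells}
zero_sum = all(s == 0 for s in sums.values())
print(sums)
print("zero-sum:", zero_sum)   # False: (play, play) sums to -10
```

The (play, play) outcome destroys 10 units of utility between the two players, so the game is not zero-sum by the quoted definition.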

Sorry I don’t follow. Can you elaborate?

Not always. People have different time horizons on the utility of money and non-linear outcomes on risk.

For example, a gold miner may sell gold futures to guarantee that he won't go out of business once the construction of the new gold mine is complete. There are many other examples.

That's only true in the sense of opportunity cost. I may buy something at $10 and sell it at $15 making a $5 profit. Then it may go to $20. Did I lose $5/share? Sure. But in reality I wasn't a "loser".

I find that in reality opportunity cost rarely matters.

If you buy something at $10 and sell it at $15, where are the $5 profit coming from? From the other market participants, e.g. someone selling to you for $10 and later buying it back for $15, losing $5 in the process. Your profit and their loss sum to zero, which is what "zero-sum" means.

It has absolutely nothing to do with opportunity cost, or whether you, personally, are a "loser". But if you're a "winner", someone else must be the "loser".

In this simple example, yes, but you are assuming that monetary value = utility. That's not always the case. People have all kinds of different incentives for participating in the markets.

Let's say I am a market maker offering to buy Apple shares at $99 and sell them at $100. Let's take an ex-Apple employee who owns some shares. He just had a family emergency and wants to liquidate his shares to get cash, and he needs it quickly. He doesn't care about paying a few dollars extra in exchange for a quick trade because he needs to pay a bill tomorrow. I buy his shares for $99. He is happy because he immediately got his cash.

On the other side, there is a retail investor doing long-term investing who wants to add Apple to their portfolio. They also don't care about a few cents because they're holding the stock for a decade and love the new CEO. They buy my Apple shares from me for $100. They are happy because I can guarantee them a stable price for a decent number of shares.

All participants are happy. I just made $1 from the spread for providing liquidity, the investor got the long-term investment they wanted, and the ex-Apple employee got his cash.

Sure, both sides of the market could have made more optimal trades if they had put in more effort and "optimized" their trades with algos and somehow skipped the middle-man, but they would've sacrificed convenience and time, which may be worth more to them than the little bit of extra $ they paid. Aren't we all winners?

When you go buy bananas in your grocery store you also don't complain about them taking a cut for providing liquidity. You don't say the farmer has "lost" money because the consumer paid more than what the farmer originally sold for to the grocery store. The farmer is happy because otherwise he may not have traded at all or his bananas may have gone bad (= needs to trade quickly). This is no different.

Asset values can just increase. Alice, who has $10 and 0 units, buys 10 units for $10 from Bob, who has 10 units and $0. Alice then sells 5 units back to Bob for $10. Alice now has 5 units (worth $10) and $10 in cash; Bob now has 5 units (worth $10). Total wealth in the system went from $20 to $30.

In addition, companies produce things, and some of that wealth gets returned to investors through dividends, interest (e.g. on bonds) and buybacks. In my example, let's say each unit generates $1 in dividends: now total wealth is $40(!), up from the starting $20 - $20 in cash (up from $10) and $20 worth of units (up from $10).

In fact, we see this growth everywhere around us as both the amount of people and the amount of goods and services per person is increasing!
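The arithmetic in the example above can be checked mechanically (marking wealth to the last traded price, and holding the unit price fixed through the dividend, exactly as the stylized example does):

```python
alice = {"cash": 10.0, "units": 0}
bob   = {"cash": 0.0, "units": 10}

price = 1.0                        # Alice pays $10 for 10 units
alice["cash"] -= 10; alice["units"] += 10
bob["cash"]   += 10; bob["units"]   -= 10

price = 2.0                        # Alice sells 5 units back for $10
alice["cash"] += 10; alice["units"] -= 5
bob["cash"]   -= 10; bob["units"]   += 5

def wealth(p):
    # cash plus units marked to the last traded price
    return p["cash"] + p["units"] * price

print(wealth(alice) + wealth(bob))   # 30.0: total wealth grew from 20 to 30

# a $1 dividend per unit adds cash (price held fixed, as in the example)
alice["cash"] += alice["units"]
bob["cash"]   += bob["units"]
print(wealth(alice) + wealth(bob))   # 40.0
```

Note that "wealth" here is mark-to-market: the extra $10 after the price move is unrealized until someone actually trades at that price.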

Market makers (market participants who stand ready to either buy or sell) are like used-car dealers. If you are in the market to buy or sell a car, you could choose to find a buyer or seller yourself, and you may well get a better price going that route. It will also usually require more work from you and take longer than if you just went to a car dealer. So that's the tradeoff: save time via an intermediary (who will likely profit from the transaction), or do more work yourself and possibly get a better price.

You didn't take two other very important factors into account: time and risk. Let's say you want to sell your used iPhone. You might get $150 if you wait around for the right buyer, but this might take a while, and even after waiting it is not guaranteed you'll find a buyer at that price. The price could even go down to $80. Alternatively you could go to a pawnshop and get $100 instantly. And if you are happy with that price you can get on with your life and focus on other things.

> Or does it always reduce down to "I win only because you lose"?

It is a zero sum game. Nobody is producing anything, therefore for one to win another must lose.

I think that's an overly simplistic view of things. The market is big and many participants trade at different frequencies. Large pension funds need liquidity to move big blocks of stock for their quarterly and monthly rebalances, and the big medium-term statistical arbitrage traders provide liquidity for them to do so. HFT players provide liquidity for the stat arb players. The classes of participants with different frequencies actually help one another, while there is competition for alpha among strategies with similar holding periods. Overall the system creates an extremely efficient and liquid mechanism for valuing and exchanging equity - the very system that empowers YCombinator and other venture investors to make VC investments knowing that their winners will eventually IPO or be bought by public companies.

I knew someone would come in with "liquidity".

Many HFT jump out when things get volatile, when liquidity is actually required.

Ultimately HFT is doing nothing of societal value, the race down to zero is never-ending and we are wasting huge amounts of resources on a totally pointless march towards zero. Exchanges should introduce random delays to allow market participants who really want to hedge / buy / sell, then we can shift some of the resources to the real world. The costs required to compete at the lowest latencies are large, and forcing small/medium players out the game, as the investment cost is large, which is also bad.

The system is hugely inefficient. The costs as latencies get lower are ever higher, for an extremely similar end result. The law of diminishing returns.

My initial comment was discussing speculative trading in general, but since you mostly brought up some common anti-HFT tropes I might as well address them.

> Many HFT jump out when things get volatile, when liquidity is actually required.

Do you have a citation on that? If you look at the preliminary Q1 results of Virtu Financial [0] (the only publicly traded HFT), they seem to be doing more trading than ever in these volatile markets.

> Ultimately HFT is doing nothing of societal value, the race down to zero is never-ending and we are wasting huge amounts of resources on a totally pointless march towards zero.

HFT is a mature industry. Latencies have mostly stabilized, and profitability is way down over the last few years. Many firms are merging/consolidating. So in the past few years society has actually been spending fewer resources on HFT - both financially and from a human capital standpoint - than it did in the past.

> Exchanges should introduce random delays to allow market participants who really want to hedge / buy / sell, then we can shift some of the resources to the real world.

IEX has been doing something fairly similar to that for a few years now. They have ~3% of US equities market share. People have the option of trading there, but they mostly choose not to.

> The system is hugely inefficient. The costs as latencies get lower are ever higher, for an extremely similar end result. The law of diminishing returns.

Due to consolidation, costs are actually decreasing. Could it be that the market is... working?

[0] - https://ir.virtu.com/press-releases/press-release-details/20...

> If you look the preliminary Q1 results of Virtu Financial [0] (only publicly traded HFT) they seem to be doing more trading than ever in these volatile markets.

Similar story from Flow Traders:


Everyone is; volumes are hugely up. The point about liquidity is about sudden market shifts, not a whole quarter!

Perhaps you don't follow the news? This was a quarter rich in sudden market shifts.

Thanks, I do follow the news. If you read the threads above again you will see that both posters are fully aware of elevated volume, and the distinction was between HFT melting away during short periods of vol and wider "liquidity" from HFT.

So what seemed like a quick drive by wasn't actually correct.

You were replied to, but I'm going to ask some questions of this moralizing.

> Many HFT jump out when things get volatile, when liquidity is actually required.

This feels almost like a "no true Scotsman" situation. Why is liquidity not "actually required" when volatility is low? Is it a moral obligation for any trader to catch a falling knife? I see this condition of "when liquidity is actually required", but I never understood why there was such a strong feeling for it. Why do you believe this?

> Ultimately HFT is doing nothing of societal value, the race down to zero is never-ending and we are wasting huge amounts of resources on a totally pointless march towards zero.

I don't know, I could probably take a similar view of so many jobs in tech. What does society really get from Snapchat, what do they get from HQ Trivia, what do they get from people making PowerPoint presentations with arrows that point to synergies? What's the point of any job with some amount of abstraction?

> Exchanges should introduce random delays to allow market participants who really want to hedge / buy / sell, then we can shift some of the resources to the real world.


> The system is hugely inefficient

Do you know how efficient the system was before HFT started up? And, do you know how many people were working in trading before, and how many are, for a similar fraction of stock volume?

> The law of diminishing returns.


> Do you know how efficient the system was before HFT started up? And, do you know how many people were working in trading before, and how many are, for a similar fraction of stock volume?

Again this weirdly mixes HFT with electronic automated trading, which I really don't think anyone in the domain would readily mix.

HFT by arbing over latency is entirely different to the automation of boring trader tasks that see less people employed to do the same thing in the front office.

I can't continue this more, it's just blind allegiance from people who are clearly not in the domain.

HFT != electronic trading

I have to disagree, I know that HFT != electronic trading.

HFT is also not equivalent to arbing over latency.

That's overly simplistic. While the overall system may be zero-sum over an infinitely long time horizon, this doesn't typically matter in practice. It can be positive sum for participants over some time horizon they care about.

For example, an HFT trader makes pennies from each trade by exploiting tiny price inefficiencies. He essentially takes money from a "stupid" retail investor who does not know how to optimize his trades. However, the retail investor may not actually care about optimizing trades and just wants to liquidate assets or make a long-term (10+ years) bet. He is totally fine with throwing away a few dollars because optimizing his trades through complex algorithms would be too much work. Here, both parties win: the HFT trader gets paid because he provides convenience, or liquidity, to the retail trader. The same would apply to any human market maker; it doesn't have to be HFT.

And yes, HFT liquidity may disappear during HUGE market movements due to risk, but it doesn't disappear as long as both parties get what they want and the risk is manageable, which is "most of the time". Of course, HFT has other issues such as the race to zero and unfair advantages for a few central players, and I don't want to defend HFT. But saying that "it's all zero sum" is not correct.

An analogy is your nearest grocery store. They're a market maker because they buy from the manufacturer and sell to the consumer and profit from the spread. Do you also argue that these are all zero-sum and we should cut them all out and connect all consumers and farmers directly? And their liquidity also disappears when black swans (corona) happen :)

Is that always true?

Melon Usk (say) wants to make cars, but he can’t pay for the factory himself, so he forms a company, sells shares in it, and uses the proceeds to build a factory. Now he and his shareholders can make cars, so the shares are worth more.

Who lost money?

This would only be true if it were a closed system. The central banks essentially magic money into existence and put it into the market through convoluted methods.

My understanding is that while all markets are not zero-sum, that high-frequency trade amongst trading firms approaches zero-sum.

I share your skepticism.

Now, since you appear to know about these things, among all the available papers/article/blogposts/books is there any that you would recommend as being less wrong than the rest? For example, a while ago I read this book [1], and it didn't seem so bad, but I'm not in the industry. Can you recommend anything, even with caveats?

[1] https://www.amazon.com/gp/product/B00BZ9WAVW/ref=dbs_a_def_r...

In general, books are a much better source of information than papers or blog posts when it comes to trading. I haven't read the one you posted, but a few I can recommend:

[0] is okay. I disagree with a lot in there, but it's pretty well written and one of the better books on the subject. [1] is very old, but it's one of my favorites. It's very mathematical, and the ideas still apply today. [2] is a good introductory overview.

[0] https://www.amazon.com/Advances-Financial-Machine-Learning-M...

[1] https://www.amazon.com/Introduction-High-Frequency-Finance-R...

[2] https://www.amazon.com/Trading-Exchanges-Microstructure-Prac...

I have your post saved and have gone through it many times, thanks for writing it - big fan!

As a student who is looking to get started with trading and enjoys the mathematical/analysis part of it, do you have advice on where to begin? I find very few resources in this area and it's very hard to get on this career path - my experience is on the ML side of things and I want to transition into trading. Any advice will be really helpful - thanks!

you might find the books I linked in this post helpful:


There are a few different types of roles in the quant world, and number of different types of funds:

alpha/signal research: apply quantitative methods to come up with profitable trading ideas and strategies. This is kind of like "Data Science" coming from tech - finding the insight in the data.

quant development: build the infrastructure for the data and strategies. This is kind of like "Data Engineering" coming from tech - a lot of ETL and general development work.

portfolio analytics/execution: figure out how to combine different alpha ideas into a portfolio that can be traded. Involves trading and monitoring the live portfolios.

risk management: Thinks of all the possible "risks" the portfolio can be exposed to and ensure they're properly addressed/hedged/accounted for.

This is a broad generalization which can vary greatly from place to place. Typically the smaller funds will have more blurred lines and lots of roles that involve doing multiple of the listed above. At the larger funds, the roles will typically be more well defined and segmented.

Lots of quant funds are happy to hire people with no finance/trading background if they're strong enough in other key areas. A lot of the "finance" specific stuff can be picked up on the job. Also, ML is quite in demand right now.

I also worked in HFT and have no idea what you mean when you say other HFT shops can front-run your orders?

To front run someone’s order you need to have advance information of their orders. Normally this means the front runner is operating as a broker. I can’t imagine any HFTs using other HFTs as brokers to forward their orders to the exchange?

If someone starts running this model couldn’t you just run the model yourself to predict the order?

I’m sure that there is market microstructure stuff/front running practicalities that would make this harder than it sounds but still you wouldn’t completely be in the dark.

But how would doing that constitute front running?

Front running has a specific definition in terms of market regulation (see my other comment) and what you describe is not front-running.

The distinction is between algorithmic traders - and HFT. HFT traders often find ways of making money from algorithmic traders. Especially if the algorthmic traders are doing things like VWAP.

I don't understand your point or how it explains how HFT companies can "front-run" other HFT companies?

Front-running is when someone with a fiduciary duty - typically a broker or dealer - takes an order from a client and then trades on their own book BEFORE executing the client's order knowing the effect of the clients order on the market and knowing that they can exploit this effect for their own benefit.

I know of no HFTs which have such a relationship with rival HFTs and can't even imagine such a relationship existing never mind it being a frequent cause of why strategies perform poorly for HFTs.

Front-running hasn't been a feature of markets for decades at least. Any sniff of front-running would have the SEC or CFTC fine your company into oblivion and possibly result in jail time or at the very least lifetime bans from the financial industry.

Ok, well let's say you're using an algorithm to trade, and an HFT firm identifies what your 'algorithm' is doing. They're going to front run you - whether that be using VWAP or flashing 10 lots every 30 seconds. And both of those absolutely happen. They're not going to literally 'know' what you're going to do, but some algos are pretty obvious and somewhat exploitable.

That's not front running. You can only front run an order if you're handling the order for a third party.

Being faster than someone else isn't "front running" them nor is spotting patterns in other participant's behaviour and exploiting those patterns. The definition of "front running" is reasonably specific: https://en.wikipedia.org/wiki/Front_running

This is a really valuable opinion, especially because it's easy to be misled about these things for those of us who are novices in HFT.

At the risk of digressing, might I ask if the dough to be minted is good in Quant/ HFT / Algorithmic trading?

Former hedge fund and HFT quant trader here. There's a lot of papers to be found claiming some sort of strategy. I don't want to go to cynicism immediately. But we'll get there:

- Trading isn't just about deciding what to buy and sell, the sexy part that everyone thinks is great. I even had colleagues who thought they were special because they worked closer to the strategies, which meant that certain less glamourous parts were neglected.

- Less glamourous parts like coding the software to read in the market data and send out orders.

- Less glamourous parts like schmoozing with brokers to get them to lower your costs.

- And maintaining infrastructure, which somehow people think should come as part of coding.

Now I'm not saying that RL won't help you. It's just that focusing on the "intelligent" part of the trading system tends to lead to disappointment, as you discover some unknown restrictions on your model that you hadn't thought of. Things like when you find out short selling was prohibited during the period that your model backtest was shorting.

My main red flags when reading papers are:

- Choosing a dataset from a small market. Basically any market that isn't the US or Western Europe large caps. You'll discover both price impact and high fees quite late in the game.

- Choosing a very small subset of the market. Smaller n, more noise and overfitting.

- Short periods. N again.

- Long intervals between decision making. N again again.

That's not to say there's nothing useful to be read though. You might be inspired by something you come across.

Yes, this.

The small subsets and super high Sharpe ratios look suspicious.

Further red flags in this particular case:

- Completely unclear what kind of data they're using. Are they assuming they can buy and sell one individual contract at the bid price each minute? Or did I miss some crucial information about bid-ask spreads?

- Abstract mentions a profit, not an information ratio/Sharpe ratio or anything similar.

- During training they need to tweak the reward function in order to not end up with "buy and hold"? How good is their strategy compared to buy and hold?

- Plots without proper labels.

An additional problem with this is that they use A3C here for trading. A3C is known to not be suitable for adversarial environments (e.g. board games, like Chess).

I wrote a paper that demonstrated that A3C is as exploitable as a uniform random strategy in board games (specifically, some poker variants): https://arxiv.org/abs/2004.09677

(Exploitable is a technical term that is defined in the paper; basically, it's "how much can someone who knows everything about your strategy beat you by?")

So I would be very surprised if this survives contact with other traders.

> A3C is known to not be suitable for adversarial environments

Interesting! What are the main papers in this area?

Any intuition why this is the case? is it because A2C generally results in brittle policies?

It’s mostly an issue that A2C isn’t designed for adversarial environments. It also doesn’t have any notion of hidden information, while other algorithms (e.g. CFR) explicitly handle this. There’s a well-known phenomenon of cycling, where agent A will beat agent B, which beats agent C, which beats agent A; A2C can exhibit this. Think of rock/paper/scissors: AlwaysRock beats AlwaysScissors, which beats AlwaysPaper. To avoid this, you typically need to do some sort of averaging.
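The cycling dynamic is easy to see in a toy best-response loop (this is a generic illustration of the dynamics, not A2C itself):

```python
# Best-response dynamics on rock/paper/scissors: each step, play the
# pure best response to the previous move. The iterates cycle forever,
# but their time-average converges to the uniform Nash equilibrium.
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

move = "rock"
counts = {"rock": 0, "paper": 0, "scissors": 0}
for _ in range(3000):
    counts[move] += 1
    move = BEATS[move]  # respond to the last move with what beats it

freqs = {m: c / 3000 for m, c in counts.items()}
# freqs is {1/3, 1/3, 1/3}: averaging recovers the equilibrium even
# though the raw play never settles down.
```

This is the intuition behind the "some sort of averaging" above: methods like fictitious play and CFR average over iterates rather than trusting the latest policy.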

The AlphaStar paper and blog post do a good job discussing these issues, as they had similar problems. I’d say that’s a great starting point (and then following their references).

Blog post:


Every time such a paper like this comes out, I have to ask myself: if they knew how to make money like this, why would they tell anyone?

It's a weird subject for academic study in the first place. Trying to devise trading strategies that beat the market is one of the few things that really is a zero-sum game. In the absence of independent scientific interest in optimizing these strategies, what's the point? You might as well study how to optimize strategies for ultimate frisbee or something.

Why do you think it's a weird subject of study, but games like Chess, Go, and Starcraft are not considered weird? Aren't both studied for purely their benchmark potential as opposed to the problem itself? Why are games widely accepted, but trading is "weird"?

It's probably best understood as either job audition, fund-raising, or mathematical entertainment.

Sure, I find mathematical finance absolutely fascinating. I just don't think it's worth putting a lot of research energy into. It's usually the boring stuff that matters.

Sometimes you come up with a strategy but you don't have the bankroll or infrastructure to execute it. It might help you get a job. Maybe the strategy works for a short time. If someone wants to pay you 400k to write algorithms, that might be the better option.

Because they're academics who choose to forego money in order to advance their career in academia. They could also generally double their income overnight by choosing to move to industry. Many do, most choose not to.

It looks like their code is assuming a fixed commission of 20 rubles, which is apparently equivalent to US$0.27. Does anyone know if that's realistic? https://github.com/evgps/a3c_trading/blob/master/configs.py#...

Depending on how liquid the market they studied is, code that assumes there is never any slippage may not be very realistic.

There's no comparison to a simple buy-and-hold strategy, which may be less interesting from a computer science perspective but is a good way to avoid spending lots of money on transaction costs.

(I once posted my own algorithmic trading project to HN that had very flawed, naive assumptions about what trades could be executed.)

IIRC MOEX is particularly expensive to trade on, and costs can be non-linear -- but something like 1bps of comms is a reasonable approximation as an upper bound. In the paper they claim a cost of 2.5RUB per transaction, not the 20 in the config file.

Edit: It looks like the comms are indeed around 10RUB per side (which is approx 1bps). https://www.moex.com/en/contract.aspx

If the model is trading single lots then this is a reasonable cost assumption, otherwise it isn't. The paper using 2.5RUB as costs is unreasonable.
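To see why the cost assumption matters so much, here is a quick break-even sketch (the notional is an illustrative round number, not taken from the paper):

```python
# How much price movement must a strategy capture just to cover fees?
# Illustrative: a contract trading around 100,000 RUB notional.
notional_rub = 100_000
fee_per_side_rub = 10            # ~1bps per side, per the fee estimate above
round_trip_fee = 2 * fee_per_side_rub

breakeven_bps = round_trip_fee / notional_rub * 10_000
# breakeven_bps == 2.0: the strategy must capture more than 2bps per
# round trip before it earns anything at all. Assuming 2.5 RUB instead
# of 10 per side makes that bar look 4x lower than it really is.
```

Any backtest whose average gross edge per trade is near this threshold flips from profitable to unprofitable the moment the true fee schedule is used.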

This is just fitting on noise. The vast majority of movements are random and no more predictable than a coin flip. Your job is first to extract that extremely weak signal, and only then train on it.

Try generating a time series in Excel with Brownian noise, watch as it is indistinguishable from price charts.
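The same experiment takes a few lines of Python (a sketch; drift, volatility, and starting price are arbitrary):

```python
import math
import random

# Simulate a geometric random walk: pure noise, no signal whatsoever.
random.seed(42)
price = 100.0
prices = [price]
for _ in range(1000):
    price *= math.exp(random.gauss(0.0, 0.01))  # ~1% per-step volatility
    prices.append(price)

# Plot `prices` and it looks just like a real chart, complete with
# "trends", "support levels", and "breakouts" -- all of it noise.
```

Any pattern-finding method pointed at this series will happily "discover" structure that, by construction, does not exist.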

What if someone did find a pattern in Brownian noise...

Anyone interested in delving into reinforcement learning should check out our book https://www.manning.com/books/deep-reinforcement-learning-in... - we cover actor critic (used in this paper) to multi-agent to relational/attention models

Thanks, looks very useful.

Since you seem to be industry practitioners: I moved away from RL 10 yrs back disillusioned with lack of real-world applicability. Has that changed significantly? The only major name I’ve heard of is Vowpal Wabbit. Maybe there are more applications being done in stealth. Any insight? Thanks

Honestly, not much has changed. The primary use cases continue to be things like robotics, playing games, finance etc. I’m interested in RL academically from a computational neuroscience standpoint (using RL to model cognition) but also as applied toward healthcare problems.

However, I don’t think the current limited use of RL is a permanent situation just that the most exciting uses of RL are extremely difficult problems that involve long-time horizons and planning. For example, RL could be used to automatically prove mathematical theorems which would be amazing. But it’s a really hard problem for various reasons. Still a lot of progress to be made.

You might be interested in the recently-launched Covariant (https://covariant.ai/), they apparently actually have systems in production. Pieter Abbeel is one of the founders and they have some pretty "heavy" investors, like Jeff Dean, Geoffrey Hinton, and Yann LeCun.

Thanks! OpenAI has made some progress on this problem too: https://news.ycombinator.com/item?id=21259765

Looks like these two places are on the cusp of a major breakthrough in RL/robotics!

i've been trading crypto in large volumes at high frequencies for quite some time now. my models were plain as yogurt feed-forward neural nets. i would engineer some dumb features, sample the data at random, assign the labels (that translate into trading decisions), and train the model. then push to prod, sit back, and relax while the balance grows like a mushroom cloud. just kidding, before that i would grow gray hair while backtesting, debugging issues, etc.

one of the hard problems was labeling the data. knowing that the price is going up 10 bps one minute from now, should i buy? maybe. but what if it's going to crash 100 bps right after this? probably should sell instead.
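one common (and still imperfect) way to frame that labeling problem is to require the move to hold over a longer horizon too. a hypothetical sketch, with made-up horizons and threshold:

```python
def label(prices, i, short_h=1, long_h=5, threshold=0.001):
    """Label bar i by its forward returns over two horizons.

    short_h/long_h/threshold are arbitrary illustration values.
    Returns "buy", "sell", or "hold".
    """
    now = prices[i]
    short_ret = prices[i + short_h] / now - 1
    long_ret = prices[i + long_h] / now - 1
    # Only buy if the move persists over the longer horizon too:
    # a 10bps pop followed by a 100bps crash should not be a "buy".
    if short_ret > threshold and long_ret > 0:
        return "buy"
    if short_ret < -threshold and long_ret < 0:
        return "sell"
    return "hold"
```

this still bakes in arbitrary choices (how long, what threshold), which is exactly the pain point rl promises to remove.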

reinforcement learning promises to eliminate the need to assign labels in the training data. the agent will try a bunch of different variants at random and eventually will choose the best one knowing the state of the world, i.e. the state of the markets. at training time i only need to feed it the features data. another benefit is that backtesting and model training are sort of fused into a single process. the rl model is optimizing pnl, not the label classification score (as in the nn model). with a proper train-test-validation split, the most performant rl model can go straight into production (helping me to keep some of my hair brown)

while all the bits and pieces seem straightforward i never managed to tune an rl model to work better in the backtest than the good old nn models. maybe i have never been closer to the gold vein, but for now, i abandoned my efforts to build a performant rl agent in favor of nn models.


Many people are criticising their backtest but I don't understand why. Their test data is sequential to their training data, considers time, and doesn't overlap. They can't overfit to their test data. In any other area of ML this would be an acceptable scheme, why is this unacceptable here?

What people are complaining about is not the overfitting, but the unrealistic assumptions in the backtest. In the real world there is slippage, latencies/jitter, special market open regimes, hidden orders, market impact, front-running, variable fees, and all kinds of other complexities. Their transaction costs are apparently also an unreasonable assumption. Sophisticated simulators used in professional trading firms can account for such things to some extent, but most academic papers conveniently ignore these complexities and just assume they can trade at whatever price the data tells them. It's completely unrealistic.

To answer your original question about overfitting, they can still overfit to test data by running a lot of experiments with different hyperparameters, architectures and parts of the data, and only report what has worked. There are also more complex ways that test data can leak into training data (see the book Advances in financial ML for a good overview). You can already see this is likely the case just from the variance in their results and trades. They also don't compare to baselines. It's not unlikely that the results are just random and they fail to report those experiments that didn't work. Of course, you cannot prove this without having an exact log of all things they ever did to the data. But again, that's not the main issue here.
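For reference, a leakage-aware walk-forward split looks roughly like this (a sketch; the window sizes are arbitrary):

```python
def walk_forward_splits(n, train_size, test_size, gap=0):
    """Yield (train_idx, test_idx) range pairs in strict time order.

    `gap` leaves an embargo between train and test windows to reduce
    leakage from labels that overlap the boundary.
    """
    start = 0
    while start + train_size + gap + test_size <= n:
        train = range(start, start + train_size)
        test = range(start + train_size + gap,
                     start + train_size + gap + test_size)
        yield train, test
        start += test_size

splits = list(walk_forward_splits(n=100, train_size=50, test_size=10, gap=5))
```

Even with a scheme like this, repeatedly re-running the whole pipeline with new hyperparameters and reporting only the best run silently overfits the test windows; the split alone doesn't save you.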

> They also don't compare to baselines

This is certainly worth criticising

> they can still overfit to test data by running a lot of experiments with different hyperparameters, architectures and parts of the data, and only report what has worked.

but this is a different accusation from accidentally overfitting or leaking, i.e. it would mean that they're dishonest and cherrypicked their data in such a way that it hides overfitting and leakage. This criticism can be levelled at every ML paper, but in this case they detail their architecture, provide the code, and provide a Jupyter notebook to let people try it themselves.

> just assume they can trade at whatever price the data tells them. It's completely unrealistic.

I think that this is a fair assumption for highly liquid markets and relatively small trades, and if it's a fair assumption then all of your criticisms (slippage etc.) don't apply to the extent that they'll break the approach. Also, if the approach works then trade size (fees aside) and being front-run also won't apply, because presumably large HFT firms can use it.

Overall I think your criticisms are valid, but imo they don't invalidate a promising approach, they're just the next thing to test.

All profitable automated trading strategies that I'm aware of target a specific inefficiency in the market. What is the inefficiency here? If you can't articulate the inefficiency, it's probably best not to employ the strategy.

Interesting, could you describe one inefficiency that was exploited in the past? I could imagine buying/selling due to spreads between exchanges but is there another not as obvious example?

I can describe one inefficiency from sports gambling. There is a famous NBA Gambler named Haralabos Voulgaris. He realized that the points total prediction for a game, let's say 100 points scored for Team A, was merely sliced in half to represent the half-time score. However, the pace of the first half is markedly different from the pace of the second half, thus points are scored at an uneven clip. He exploited that inefficiency for a while to great success.

Like sports gambling, a lot of the financial products we trade are obviously built by humans using rules, and arbitraging the intrinsic rules and regulations around said products. Think about Forex trading where you convert currency into currency. One of the key strategies is to find and identify brief negative cycles, for example, in the hope that converting US Dollars to Euros to Yen back to US Dollars leaves you with more dollars than you started out with.
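The FX cycle check is simple to express in code (the rates below are made-up illustration numbers, and real arbitrage would also have to clear fees and slippage):

```python
def cycle_return(path, rates):
    """Multiply conversion rates around a currency cycle.

    A result > 1 means the cycle ends with more than you started
    with (before fees) -- a negative cycle in log-rate terms.
    """
    r = 1.0
    for a, b in zip(path, path[1:]):
        r *= rates[(a, b)]
    return r

# Hypothetical rates chosen so the cycle is slightly profitable.
rates = {
    ("USD", "EUR"): 0.92,
    ("EUR", "JPY"): 163.0,
    ("JPY", "USD"): 0.0068,
}
r = cycle_return(["USD", "EUR", "JPY", "USD"], rates)
# r > 1: USD -> EUR -> JPY -> USD leaves you with more dollars.
```

In practice such cycles are found by running a negative-cycle search (e.g. Bellman-Ford on -log(rate) edge weights) over many currencies at once, and they close within moments of appearing.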

Problem is, there is competition, so any "feature" your ML has discovered will fade away as other MLs discover it too. So it has a tendency to become a completely featureless stochastic system. Temporary features can exist, though.

So many papers by "smart" people predicting the past

How do you propose that anyone evaluates their algorithm on the future?

start a fund with your ML approach - evaluate how much money you've made after 1m, 6m, 1y, 2y, 10y, etc.

Run algorithms live, check results after X time has gone by.

I’d like to know what Rentec does. Or whether Rentec even does HFT?

Human greediness is endless. Using AI for stock trading is just unethical and a waste of resources. We are sick.

AI has been used in stock trading for decades now. Also, why does it even matter? Why is it unethical to use AI for trading to begin with?
