In our case we were doing triangle trading between BTC/ETH/USDT pairs and had our buy/sell delay down to 3-7ms. At one point we were moving 0.3-0.7% of Binance’s daily volume.
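For readers unfamiliar with triangle trading, here is a minimal sketch of the check behind one USDT -> BTC -> ETH -> USDT loop. The function name, the quotes, and the flat 0.1% fee are all hypothetical and only illustrate the idea; nothing here is taken from the system described above.

```python
def triangle_edge(btc_usdt_ask, eth_btc_ask, eth_usdt_bid, fee=0.001):
    """Fractional profit of the loop USDT -> BTC -> ETH -> USDT.

    Every leg crosses the spread (buy at the ask, sell at the bid) and pays
    `fee` as a fraction of the amount received. All inputs are hypothetical.
    """
    keep = 1.0 - fee
    btc = (1.0 / btc_usdt_ask) * keep   # buy BTC with 1 USDT
    eth = (btc / eth_btc_ask) * keep    # buy ETH with that BTC
    usdt = (eth * eth_usdt_bid) * keep  # sell the ETH back to USDT
    return usdt - 1.0


# Hypothetical quotes in which ETH is slightly cheap in BTC terms.
edge = triangle_edge(btc_usdt_ask=10_000.0, eth_btc_ask=0.0200, eth_usdt_bid=201.0)
print(f"edge per loop: {edge:.4%}")
```

With three legs the fee is paid three times, so the mispricing has to exceed roughly three times the fee before the loop is worth sending.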
* Finding an objective point of truth for value when all of the currencies are floating is hard but vital to success. This was the hardest problem we encountered. We tried taking the realtime average of BTC and ETH across all exchanges, we tried tying it to the shortest route to USD, and several other routes... but ultimately this is where we ended up “losing” most of our alpha.
* Order books are seemingly simple but the devil is in the details. This especially matters for paper trading.
* Efficiently using API limits at exchanges is an optimization problem in and of itself.
* Our model was relatively simple, but we focused on speed and edge cases. For instance, Binance would rotate IPs on their load balancers, so we’d constantly check the latency of each open SSL connection and use the fastest. Further, we wouldn’t decode the buy response to plaintext; we’d just read the raw stream.
After several epic months our entire project fell apart after a cryptic phone call about “institutional access” that didn’t follow the 1s websocket update. The access was quite expensive and we said no to it, and shortly after all of our strategies went to crap.
Best we could tell someone was front running us due to an artificial delay for our account (delay between trades went to ~20ms up from our prior steady speed of 3-7ms) and/or a bunch of the trades in the orderbook were bogus.
Frustrated, we tried our strategy on another account, and the delay dropped again to our normal range and it was profitable again (the orderbooks were slightly different between bots!).
It was in that moment we realized playing in unregulated markets is not fun or something we wanted to continue to do. Intermediary risk was something we didn’t account for.
Further, we realized that there will always be a better resourced or more dedicated team willing to fight you for your alpha.
After months of effort and a ton of fun we decided it was best we went back and focused on a problem where we could build a long term competitive advantage.
Edit: typos and formatting
I didn't believe him at first, since the more likely problem was elsewhere, but then a few months later sub-penny trading in dark pools was all over the news. This was like 5 or 6 years ago. He's since moved on to other things, having come to a similar conclusion: that trying to play such a rigged game was futile.
* Did his levels actually get crossed? It could be that he never had the opportunity to get filled at the price he hoped for.
* What was his queue position? Stock exchanges tend to be price time ordered. If others had submitted orders at the same price before he had, they would have priority when getting filled.
* Apropos the prior point, was his broker actually submitting orders when he sent them? Some brokers may avoid showing deep out of the money orders, which could have affected his queue position.
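The queue position point can be sketched with a toy price-time priority level. This is an illustration under simplifying assumptions (a single price level, no cancels or modifies), and the class and order names are made up for the example.

```python
from collections import deque


class PriceLevel:
    """Resting orders at one price, filled strictly first in, first out."""

    def __init__(self):
        self.queue = deque()  # [order_id, remaining_qty], oldest first

    def add(self, order_id, qty):
        self.queue.append([order_id, qty])

    def shares_ahead(self, order_id):
        """Total quantity that must trade before `order_id` gets anything."""
        ahead = 0
        for oid, qty in self.queue:
            if oid == order_id:
                return ahead
            ahead += qty
        raise KeyError(order_id)

    def fill(self, qty):
        """Match an incoming order of size `qty`; return {order_id: filled}."""
        fills = {}
        while qty > 0 and self.queue:
            head = self.queue[0]
            take = min(qty, head[1])
            fills[head[0]] = take
            head[1] -= take
            qty -= take
            if head[1] == 0:
                self.queue.popleft()
        return fills


level = PriceLevel()
level.add("early_1", 500)
level.add("early_2", 300)
level.add("mine", 100)            # same price, but submitted last
ahead = level.shares_ahead("mine")
fills = level.fill(600)           # 600 shares trade at this price
```

Here the price was touched and 600 shares traded at it, yet the last order in the queue received nothing, which looks from the outside like "my level got crossed and I wasn't filled".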
> He finally concluded that some institutional traders must have access to sub-penny ordering, despite it being against regulation.
There's nothing wrong with this. Reg NMS Rule 612 permits sub-penny price improvement.
That sounds like a big deal. If this is repeatable, you should document it better. Unregulated doesn't mean a license to do blatantly illegal things. Crypto exchanges certainly get taken to court.
Sure it wasn't your book building algo and a snapshot retrieval race?
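To make the snapshot race concrete: a common pattern is to buffer incremental updates, fetch a snapshot, then replay only the updates that are newer than the snapshot. The sketch below assumes a hypothetical message shape with a monotonically increasing `seq` field and absolute quantities per price level; it is not any particular exchange's API.

```python
def build_book(snapshot, buffered_diffs):
    """Apply buffered diffs on top of a snapshot, refusing to continue on a gap.

    snapshot:       {"seq": int, "bids": {price: qty}, "asks": {price: qty}}
    buffered_diffs: list of {"seq", "side", "price", "qty"} in arrival order.
    The message shape is hypothetical.
    """
    book = {"bids": dict(snapshot["bids"]), "asks": dict(snapshot["asks"])}
    last_seq = snapshot["seq"]
    for diff in buffered_diffs:
        if diff["seq"] <= last_seq:
            continue  # already reflected in the snapshot; re-applying corrupts the book
        if diff["seq"] != last_seq + 1:
            raise ValueError(f"gap: expected {last_seq + 1}, got {diff['seq']}")
        side = book[diff["side"]]
        if diff["qty"] == 0:
            side.pop(diff["price"], None)      # quantity 0 removes the level
        else:
            side[diff["price"]] = diff["qty"]  # quantities are absolute, not deltas
        last_seq = diff["seq"]
    return book, last_seq


snapshot = {"seq": 102, "bids": {100.0: 5.0}, "asks": {101.0: 4.0}}
diffs = [
    {"seq": 101, "side": "bids", "price": 100.0, "qty": 7.0},  # older than snapshot
    {"seq": 102, "side": "asks", "price": 101.0, "qty": 4.0},  # same as snapshot
    {"seq": 103, "side": "bids", "price": 100.0, "qty": 2.0},
    {"seq": 104, "side": "asks", "price": 101.0, "qty": 0.0},
]
book, last_seq = build_book(snapshot, diffs)
```

Two bots that take their snapshots at different moments and skip either check can easily end up with slightly different books from the same feed.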
The less nefarious explanation is simple. You decide how likely it is: there is no mechanism making sure that IP packets sent to all data feed subscribers arrive at the same time. Exchanges of consequence distribute market data via UDP multicast, over physical links that are as identical as possible (think identical lengths of fibre).
Now if you're receiving JSON via Websocket and parsing it using an allocating parser and your NIC driver is in kernel space and you use a GC'd language and if the exchange loops through a list of TCP connections to send a message to them one at a time and there is jitter in packet delivery time in upstream hosts (and other internet weather) and and and ... you simply cannot expect identical order books at the submillisecond timescale.
A fortiori, throw half of that crap away and suppose they were using userspace NICs and no-alloc single threaded C++ that chills entirely in L1 cache. Still consuming TCP over the public internet.
We wrote support and they told us they would investigate. Never heard back.
This is why the whole $0 trading fee and Robinhood concern me. I'm paying for the trades and someone is still messing with me.
Edit: Fun fact, the founders of Robinhood worked at an HFT before starting Robinhood, which used to be shown on their LinkedIn until several years ago. Bad optics I suppose.
Some people have this notion that if their trade got matched against a HFT that they have somehow lost out. The exact opposite is true. If the party on the other side of your trade was a HFT, then that implies that all other parties were offering worse prices than the HFT was. If the HFT had not been there, you would have got a worse price (whether on the buy side or the sell side). The presence of high-frequency traders (or any traders, for that matter) reduces spreads and increases liquidity.
The only people who get worse deals as a result of high-frequency trading are people who would sell you the same stuff for more money.
The substance of your post is describing price discovery - price discovery isn't the issue here. The issue is that the price discovery is being done in milliseconds.
The substance of the complaint is the speed of the auction that discovers the price. The faster price discovery is being done, the fewer traders can participate. If a few milliseconds can change the value of a financial instrument then we have to accept that waiting a second or two will allow a lot more traders to reevaluate and offer a fair price.
The HF in HFT looks like a play to reduce the number of traders who can act on information, which means the buyer/seller is probably getting scalped. HFT traders are making money arbitraging the speed of information dissemination, which indicates that other traders would offer different (/better) prices if the market waited a half-second or so to let everyone gather all the relevant data.
The faster the price discovery, the shorter the validity time. This means a more accurate price.
> If a few milliseconds can change the value of a financial instrument then we have to accept that waiting a second or two will allow a lot more traders to reevaluate and offer a fair price.
You're making the mistake of assuming the price is static. If I have to quote you a price on an instrument that's valid for the next 5s, I have to be more conservative than if I'm quoting for the next 5ms. Which means you get a worse price.
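A rough way to put numbers on this, assuming the price follows a driftless random walk so that the typical move grows with the square root of the quote's lifetime. The volatility figure and the 2-sigma cushion are hypothetical, and the function name is made up for the example.

```python
import math


def half_spread_for_horizon(sigma_per_sqrt_second, horizon_seconds, z=2.0):
    """Half-spread covering a z-sigma move while the quote is live.

    Assumes a driftless random walk: the standard deviation of the price
    change over time T is sigma * sqrt(T). All inputs are hypothetical.
    """
    return z * sigma_per_sqrt_second * math.sqrt(horizon_seconds)


fast = half_spread_for_horizon(0.0002, 0.005)  # quote valid for 5 ms
slow = half_spread_for_horizon(0.0002, 5.0)    # quote valid for 5 s
ratio = slow / fast
print(f"a 5 s quote needs about {ratio:.1f}x the cushion of a 5 ms quote")
```

Under that assumption the ratio depends only on the two horizons, not on the volatility chosen.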
> The HF in HFT looks like a play to reduce the number of traders who can act on information, which means the buyer/seller is probably getting scalped. HFT traders are making money arbitraging the speed of information dissemination, which indicates that other traders would offer different (/better) prices if the market waited a half-second or so to let everyone gather all the relevant data.
There's literally nothing stopping or limiting the numbers of players operating at this speed. What the race actually results in is all market participants getting smaller spreads and better prices.
A continuous market price is an illusion; let's not forget that. Every market is made up of trades, which are discrete, and every price has an unstated amount of uncertainty, which may be large by any standards. You can't know from first principles whether a change in quoted price is even a change in the market, because it could be within the +/- range that's implicit. Apple is quoted to the nearest penny, at least, but the idea that the current price is accurate to within 0.004% is ridiculous.
"Fischer Black famously defined an efficient market as “one in which price is within a factor of 2 of value, i.e., the price is more than half of value and less than twice value,”
...I hadn't heard this before Matt Levine mentioned it, but I was like "yeah, obviously, why haven't most people gotten the message?"
"BNP Paribas...said [Saudi] Aramco was worth exactly $1.424394 trillion."
I wonder which particular millisecond that held for.
The market has a whole (mostly) unseen dimension other than uncertainty, which is depth. You can only buy or sell so many shares at the instantaneous market price. Further away, there may be orders, but the price of the whole company is going to be way outside of that. The shorter the timescale, the shallower the "market" so you can't just say we're making progress by doing things faster. It's like when research lasers are said to make unbelievable power, but it's like for a femtosecond or something. Liquidity, in my mind, requires depth, just as with water.
And to address your concerns about depth, I believe that's where market makers (which are related to HFT) come in.
Look at gas stations. If people are spending $1 driving around to save $0.10, then that's not good and at least public policy shouldn't encourage it.
I'm not quite getting this point. If I have a sell order at price x and it's filled by a HFT 1 second before someone with a slower algorithm, how does that result in a more "fair" price for me? If the HFT instead posts a buy order at an unfair price x-1, there's nothing stopping the slower traders from taking my sell order at x one second later.
HFT trading isn't about executing the same trade as someone else but a tiny margin faster. That wouldn't have any special impact on market spreads or liquidity, for example.
There aren't any complaints against the T in HFT; as traders they are helpful. The value questions are about the HF and whether it is a desirable part of the market or an unhelpful arbitrage opportunity created only by implementation details of the exchange.
Just imagine you want to exchange a currency because you go traveling and the exchange tells you "Sorry, nothing available right now, gotta come back in a few weeks". That's what would happen if there is no liquidity.
Not really. The spreads were terrible before HFT market making.
Speaking personally, I'd even be happy to wait several seconds to see if someone else is willing to pay a better price.
Why? If the HFT firm was willing to offer me $X 2 milliseconds ago they are probably still willing to offer $X now. It isn't like there has been time for anything to change; there are going to be short periods of time where there is literally no new information.
And they are just as likely to be offering me more now than less as conditions change.
The average person may be overly paranoid about HFT, but it doesn't make sense to say they benefit from it, because they are not going to be in a position where they benefit from an execution in a fraction of a second.
Price improvement of fractions of a penny has gotten silly too. It's easy to think of it as more significant than it is, until you figure it as a percentage (or the spread for that matter).
It's kind of like how ultra-sensitive people are to gas prices...
Smaller spreads lower costs for everyone: institutional, retail etc.
I used GainsKeeper:
I see they do crypto, but I've never used it for that.
For companies, they have entire departments that consolidate trades and construct massive Schedule Ds
I learned this because I had a six-figure capital loss on my spot crypto trades, and a slightly larger gain on unregulated futures which do not fall under capital gains. Capital losses can only offset up to $3k in other income, so I was terrified that I’d have to pay taxes on the massive “gain” without being able to count the losses against it. Fortunately, the straddle is the correct way to report this, allowing gains and losses to be matched across instruments that may otherwise fall into different income categories.
Indeed. This is the really difficult thing about the crypto space: winnings, if you can keep them. And you can't if the house is just going to front-run all your orders.
"we realized that there will always been a better resourced or more dedicated team willing to fight you for your alpha"
- that's part of what makes financial markets such a fun and interesting challenge but agreed, intermediary risk at the timescales you were operating at in this kind of unregulated market is real and not fun
I'd love to hear what problem you moved to that you believe you can build a long term competitive advantage on if you can talk about it?
Shortly after we realized we either should exit or hire an exec team to run the business. After a few months of executive search we found an offer we liked and took it.
These days I’m interested in using some heuristics based on public data to help consumers make informed decisions about nursing homes and in home health care. Still early on this project but lots of data and not many people looking to do good in that market. Seems like a great place I can add value.
I wouldn't be surprised, given that traditional HFT companies are building cryptotrading desks and they have a lot more capital to play with too.
The only way that is possible is if you gain privileged information about that person's orders before they actually hit the book. The most practical way to do this is to be the exchange.
Regulated financial exchanges basically do this already.
Well you can, as was discovered in the Haim Bodek HFT shit storm.
As somebody who still runs a profitable bot on Binance I find this hard to believe.
Also all order book related endpoints/streams are public, so queries/subscriptions are not tied to a specific account.
I pulled the plug. Tried to run it on Binance but the websocket only updated once a second, so there was way too much risk.
Most of what the author talked about, I learned the hard way.
I'm now at the point where I ran some tests (trading small amounts) live on binance and the results are positive: I do manage to make small profits, but more importantly, the recorded live trades reflect very closely the backtest trades (for a given period). I'm currently scaling up my model and adding better monitoring / reporting / CI.
I'd be happy to chat with anyone having done similar projects or willing to exchange ideas.
I've done a lot of work in the space and would love to chat - just emailed you :)
The kind of data augmentation I do is adding different candles sizes. I validate with 5m candles, but I train with 2,3,4,5,6,7m ones.
I also sample more recent data more frequently. I train jointly with ~22 symbols, but in each X with those symbols, I randomly set some to 0, some I invert their price, some I invert time-wise. This helps generalization for some reason. I tried many kinds of noise, but what I described above is what I found to work best in my case.
I have a more ambitious idea to generate synthetic data using self-play: have a bunch of agents trade against one another. This creates new price data I can train the agents with, and repeat (this self-play training scheme would be similar to what DeepMind did with AlphaGo/AlphaZero). The issue with it is the need to tune the parameters exactly so that the resulting synthetic data is realistic enough that I can transfer the agents to real data.
For example, during self-play, should you have only trading agents, or should you add "retail traders" that buy during bubbles, "normal buyers" that buy only below and sell only above certain prices, and institutional buyers that randomly move the price a lot in a given direction? This is a lot of parameters to get right, and it's an optimization problem on its own. You could treat this as a two-fold optimization problem such as in this paper: https://arxiv.org/pdf/1810.02513.pdf, but it gets tricky very fast.
What lowers my concerns is that such issues might come once the bot is profitable, and by then I might have found other ways to raise capital (I'm doing a trading bot as a way to raise capital to create an AI research lab). Being able to set up a profitable crypto trader is a good thing to add on a resume or for personal branding even if I shut it down at some point. I'll be in a very different position by this point so it's a bit premature to be concerned about that now, although it's still somewhat of a concern.
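A minimal sketch of the two augmentations described above: re-binning 1-minute candles into other sizes, and randomly zeroing, price-inverting, or time-reversing each symbol. Several things here are assumptions for illustration: "invert their price" is read as taking 1/price, each symbol is perturbed independently with probability `p` per transform, and the function names are made up.

```python
import random


def resample(candles, n):
    """Merge consecutive 1-minute (open, high, low, close) candles into n-minute ones."""
    out = []
    for i in range(0, len(candles) - n + 1, n):
        chunk = candles[i:i + n]
        out.append((
            chunk[0][0],               # open of the first candle
            max(c[1] for c in chunk),  # highest high
            min(c[2] for c in chunk),  # lowest low
            chunk[-1][3],              # close of the last candle
        ))
    return out


def augment(closes_by_symbol, rng, p=0.1):
    """Randomly zero, price-invert, or time-reverse each symbol's close series."""
    out = {}
    for symbol, closes in closes_by_symbol.items():
        r = rng.random()
        if r < p:
            out[symbol] = [0.0] * len(closes)        # drop the symbol
        elif r < 2 * p:
            out[symbol] = [1.0 / c for c in closes]  # assumed meaning of "invert price"
        elif r < 3 * p:
            out[symbol] = closes[::-1]               # reverse in time
        else:
            out[symbol] = list(closes)               # leave unchanged
    return out


one_minute = [(1.0, 2.0, 0.5, 1.5), (1.5, 3.0, 1.0, 2.0),
              (2.0, 2.5, 1.8, 2.2), (2.2, 2.4, 2.0, 2.1)]
two_minute = resample(one_minute, 2)
batch = augment({"BTCUSDT": [10.0, 11.0, 12.0]}, random.Random(0))
```

Training sets for the 2 to 7 minute sizes would then come from calling `resample` once per size on the same 1-minute history.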
On the positive side, I hope this scares the competition away.
Contrast this post with those you see from ML hobbyists who delve into medicine or fake news and produce useless results, a testament to their lack of domain-specific competency.
The successful application of ML requires a deep understanding of the domain it's being applied in.
(Yes, I know they do much more than ML, but still)
The historical record overlooks the people he hired who knew a thing or two about trading, while fixating on the team of NLP scientists he hired from IBM. Likewise Simons wasn't initially successful in the very, very early years. It wasn't until the late 80s that the Medallion firm really came into its own.
Funnily enough I think the ML hobbyist problem is most pervasive in the "predict the stock market" domain. There was a post on HN a few days ago that was overfitting the validation set and hand-waving away fees and spreads. The author concluded that "there was no subtle underlying pattern" because they failed to find one.
Maybe in a future post you could discuss the security and banking side of this in more detail? In the 6ish years I’ve played around with crypto trading (and I really mean play, nothing close to your level), I’ve had 2 exchanges hacked and lose all customer funds, another 2 had major security breaches causing days of downtime but recovered, and one site seized by the FBI.
Then there are the horror stories of banks freezing your account when you move funds in and out of exchanges. Luckily that hasn’t happened to me.
I bet you have some good stories and perspective on that side of it, I would love to hear it.
I'm also not trading much capital. Because the system is more on the HFT side, the actively traded capital isn't that high, and I don't care about losing it. Any profit I try to get out of the exchanges regularly. I wouldn't feel comfortable leaving large sums on those exchanges.
Also, when you said "market neutral", did you mean you also short (only a few pairs have margin on Binance, and it appeared recently)?
Counter-party risk always exists.
> Then there are the horror stories of banks freezing your account when you move funds in and out of exchanges.
Depends on the country. What happened to me is that a bank did not freeze my account. Instead, they simply reported it to the government, and asked AML questions regarding the transfer. The government, on the other hand, wanted me to provide bookkeeping records. Otherwise, they were going to assume that every transfer coming back from cryptocurrency exchange was pure profit.
Basically, I was not raided and my accounts were not frozen, but the government knows my wallet addresses (and I had to pay back 4 years' worth of cryptocurrency trading profits with interest applied, which also made me realize how little profit I had made in the end).
Extra warning: ensure your country allows individual cryptocurrency investors to deduct losses from winnings. Without such a law, if you win 100 dollars and then lose 100 dollars, you would still owe the government taxes while you are at 0. This is the case in surprisingly many countries.
There looks like a lot of overfitting the validation set going on in that post.
It's also a mistake to conclude that "there was no subtle underlying pattern" just because the author couldn't find one.
Throwing XGBoost at a bunch of technical indicators isn't gonna cut it but I have had some solid real-world success (as have several people I know) applying ensembles of deep learning models (with regime switching based on model residuals) to profit from "subtle underlying patterns".
Interestingly Benoit Mandelbrot talks about this in "the (mis)behaviour of markets" and explicitly calls it "market time"
Claiming a 4000% return while staying market neutral seems a little too good to be true.
First: those levels are insanely high, so the algo must be taking some absurd risks and have the worst Sharpe ratio, or be getting pretty close to 100% accuracy.
Second: if you can scale this across markets, and assuming the same return, that investment will turn into 12 billion in 4 years. I doubt that you'd write a blog post about it if you had found such a gold mine.
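The arithmetic behind that figure, as a quick check. It assumes a $5k starting stake and a 40x yearly multiple (i.e. a 3900% return, 5k turning into 200k), which are the figures used elsewhere in this thread, and that the same multiple repeats every year with everything reinvested.

```python
def compound(start, yearly_multiple, years):
    """Value after reinvesting everything at the same multiple each year."""
    return start * yearly_multiple ** years


after_one_year = compound(5_000, 40, 1)
after_four_years = compound(5_000, 40, 4)
print(f"year 1: ${after_one_year:,}  year 4: ${after_four_years:,}")
```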
See RenTech limiting the size of their Medallion fund because it was getting too large to scale....
Scalability and profitability are orthogonal. If it could scale indefinitely, you'd be right. But no trading strategy can scale indefinitely.
That doesn't say anything about whether or not it "works", and it's not a reason to be suspicious of the results in and of itself. All successful trading strategies are capacity constrained.
I would love to know how this fared recently in the large sell-off.
What he says about some markets possibly being predictable rings true to me. But the article was far from convincing that the BTC market is actually predictable.
The natural assumption should be that the author was in the right place at the right time. Although he went to great lengths, I'm not convinced this is anything other than luck.
But there are no pure market downturns. On a daily scale the market may be down, but that does not mean that on a millisecond or second-scale you will only see downward movement. There is just an overall downtrend, but there is still almost the same upward movement to make money. For HFT systems, it really doesn't matter if the market, on a daily scale, goes up or down. There is no difference.
In fact, on many markets the system does better in downward trends. Probably because there is more liquidity on that side of the book, a bias that may come from certain market participants.
I actually have no idea what my Sharpe ratio is, sorry. With the system constantly changing, data formats changing, exchange balance apis changing, and accurate monitoring already being a challenge in itself, it's very difficult to keep track of exact returns and historical data. I could probably calculate it if I spent several days (weeks?) trying to process all my historical trade data, but that would be a waste of time for me personally.
I very much look forward to the author's follow up post at the end of 2020 to see if another 5k turns into 200.
The number of monkeys required to match the author's results over a 12-month period is well over the number of atoms in the universe.
On the other hand, if we wanted to test his 3900% yearly return, we might assume that monkey returns are equal in distribution to Bitcoin's price and then test the hypothesis that he's a monkey via something like a paired t-test. The problem here is that we only have one data point so p-value is undefined, and due to high variance it would probably take about n=10 points to get something significant. The upside of this approach is that you can get a confidence interval for how much better he is than a monkey, instead of just a yes/no answer.
In any case, since the author has at least 365 data points, he probably has an extremely good idea of both a) whether he's a monkey, and b) how much better he is than a monkey.
I feel like this should be in bold, but either way, I love reading that in these posts. In every way, from research that confirms your models are correct to being able to trust real-time trades, you need a solid architecture. Remember, this isn't only true for trading; the same holds for tons of solutions to other problems. If comment readers have other examples, I'd love to hear them in responses.
Disclaimer: I built a similar system in the past, took some gains and then realised the above. I then quit to build a company.
At least that's my understanding based on conversations I've had, I've never traded equities.
To be competitive in US equities HFT, you need an FPGA with 40GbE ports hosted in a server (which needs to power and cool the FPGA, and deal with the less latency-sensitive bits of your system). You'll need some storage as well.
That server needs to be co-located with your target exchange(s) matching engines, and connected via 40GbE. You might additionally want remote market data via mm-wave microwave.
You can probably put together a basic but competitive hardware setup for $70k or so, if you ignore redundancy, and you only need to trade a single market. More realistically, you'll need at least two, plus shared storage, and probably more depending on what markets you intend to trade on.
Then you have monthly costs: colocation for the server(s) ($5k-ish+), port fees for the order entry ($500-ish), port fees for market data ($500-ish), physical connectivity fees ($20k-ish), cross connect fees for the connectivity ($500-ish), wireless connectivity fees, you might need roof access (more fees), market data fees (per exchange), memberships, and trading costs.
I haven't done this for a while, but it easily adds up to $100k per month or more.
So you need to be making quite a bit to pay off your infra, before you start thinking about profit. And your model will age pretty fast, so you'll want to be working on a few possible replacements concurrently.
It's a tough business.
I chased up some actual details from NASDAQ as an example:
"Liquidity in the BTC market"?
Jokes aside, it's actually something I am thinking about a lot. Such systems don't create value, but they are extremely intellectually interesting and I've learned a lot. You can say the same for many other projects, for example academic research in many fields. Most of it is just noise to promote the author and does not create value in the world. But it's intellectually interesting, so people work on it.
Other people write compilers for fun to learn something new without creating value. I don't think this is fundamentally different.
No. It's a financially sound way to use time, so people work on it. Anything can be intellectually interesting.
What kind of academic research does not create value? And if so, maybe it deserves to be criticized the same way.
I have friends that study social behavior of ants. Is there any way to apply that?
- The only part I didn't like in your article was how you described creating indicators as exploitation. The limit order book is public by design so all traders can look at it. People have the free choice to trade on a centralized exchange or not. This is a trade-off between revealing information and being able to trade quickly without calling all your friends asking if they want to buy some Bitcoin.
- I'm guessing you used data from other exchanges outside the one you were trading as indicators too. That's unquestionably good since your trading helped information propagate faster or more accurately than it would have otherwise.
- Markets are only zero-sum in isolation. Most participants derive utility from things outside short-term profit and loss. Maybe they trade to manage risk, to hedge, to gamble, have a longer time horizon than you, whatever. They just want to trade and get back to their lives. They don't want to waste time squeezing the last fraction of a basis point out of their fills. It's hard to believe, but they actually enjoy getting picked off, run over, paying too much spread, whatever things make you feel bad or indifferent about the service you provide.
I used to get filled making markets on Nasdaq (which pays resting orders a rebate, and charges crossers) when BX (which pays crossers a rebate) was at the same price, and could lay off the trade for an instant profit. The people who traded with me paid for the luxury of saying "fuck it, send it to good ol' Nasdaq." I used to think it was stupid of them, and from the perspective of a prop trader, it was mind numbingly stupid, but they probably had more productive things to do than read every exchange fee schedule or hook up to every small exchange.
- Providing liquidity has nothing to do with resting limit orders vs. crossing the spread. Providing liquidity is about taking risk off the hands of people that don't want it, and moving it across time to someone else who does. If you're market neutral, trade many round trips every day, and end relatively flat, you've played that intermediary role as a liquidity provider regardless of what order types you use.
- Crossing against mispriced orders is doing the world a favor. You're not the bad guy picking them off. If anything, they're the bad guy for holding the market at an incorrect price.
So maybe think of yourself as more of a service provider. Not only will you feel better, but viewing trading through that lens tends to make you a better trader. Strategies truly built around an exploitation mindset are fundamentally unsustainable, since you run out of people to exploit. Providing a service works forever.
FWIW, the rest of what you wrote is almost exactly how the pros do things. If you built this system yourself, you could make far more than 200k at a prop firm. If you're interested, reply with a throwaway and I can refer you to a friend who's still in the business.
Ultimately, for all investors, from the hedge funds down to the long sucker, it is all skimming for returns. HFT just skims from other skimmers' returns. Yes, those investments may have created value, but ultimately people invest because they want returns.
The money made will be used for more skimming but also some for investment in other possibly beneficial projects.
Extracting financial value via investment (skimming) may allow someone to start a company, support a community or family and have valuable time to make those things better. Just as an investment in a company/idea creates value in that company, the value gained from trading where the first step really is skimming, can be used to create value in the real world or even just time which may lead to more real actual value created.
> There may not be a person on the other end of the trade
uh... what? If the executed order fills, then there was someone on the other side of the trade.
Anyways, by providing liquidity in any market as a market maker, you are effectively aiding in the overall process of price discovery. HFT shops are generally market makers, although other strategies are also possible depending on how the operation intends to generate alpha. HFT MMs pretty much only make money by clipping spread, i.e. submitting paired buy and sell orders around the midprice with the expectation that they will make (ask-bid)/2 per fill on average. They then cancel these orders as the orderbook's structure changes and as prices and markets move, resubmitting at "better" (more favorable) levels.
It doesn't matter if people are buying or selling or both. HFT MMs provide a valuable service to financial market participants - if participating counterparties submit orders and cross an HFT's latest uncancelled order, it will fill, allowing market participants to quickly gain or lose exposure to their security or instrument of choice.
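The spread-clipping arithmetic can be sketched as follows. Prices are in cents to keep the numbers exact, the quotes are hypothetical, and the sketch ignores fees, rebates, and the risk of the price moving between the two fills.

```python
def edge_per_fill(bid_quote, ask_quote):
    """Expected capture per fill, measured against the midprice."""
    mid = (bid_quote + ask_quote) / 2
    # A buy fills below mid and a sell fills above mid by the same amount.
    return ask_quote - mid


def round_trip_pnl(bid_quote, ask_quote, qty):
    """Profit from buying `qty` at the bid quote and selling it at the ask quote."""
    return (ask_quote - bid_quote) * qty


bid, ask = 9_998, 10_002                 # hypothetical quotes, in cents
per_fill = edge_per_fill(bid, ask)       # (ask - bid) / 2
pnl = round_trip_pnl(bid, ask, qty=10)   # two fills of 10 units each
```

A completed round trip is two fills, so its profit is twice the per-fill edge times the quantity; the cancel-and-resubmit loop described above exists to keep both quotes centred on a midprice that keeps moving.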
There's no magic here and you seem really confused about the underlying dynamics of trading and market microstructure.
If anything, providing BTC liquidity might just be increasing net societal harm.
To create value, you need to give someone else money. If I wanted to build a better society, I'd make sure everyone had the best possible education. Paying for this would bankrupt me, maybe even bankrupt the entire country. But with every single person in the country walking around with a deep understanding of music, art, mathematics, engineering, and science... as a society, I'm sure we'd do great things. Value would be created in the very long term, but not for me, the potential investor.
Then on the other hand, we have things like automated trading. That boils down to asking a bunch of people "will you pay $5 for this $4 bitcoin?" Anyone that says "sure!" just gave you a dollar. Do that millions of times per second, and you remove as much value from the system as possible.
Markets promote value† creation, but they don't promote value creation, except insofar as the two concepts coincide. It is instructive to think about the circumstances under which the two concepts coincide, the extent to which those circumstances hold, and the extent to which we are moving towards or away from those circumstances.
For those interested in the topic, I've found the Lean Manufacturing literature helpful. They're usefully obsessed with creating customer value, meaning doing or making something valuable for their customer. Which has some of the problems you mentioned, in that traditional commerce mainly serves people in proportion to the money they have. But you can pretty easily extend the Lean approach to value to non-commercial situations.