
Lessons learned building an ML trading system - traK6Dcm
https://www.tradientblog.com/2019/11/lessons-learned-building-an-ml-trading-system-that-turned-5k-into-200k/
======
nickreese
After having spent an insane amount of time in late 2017/2018 building an HFT
bot for Binance I can say this is a pretty solid article.

In our case we were doing triangle trading between BTC/ETH/USDT pairs and had
our buy/sell delay down to 3-7ms. At one point we were moving 0.3-0.7% of
Binance’s daily volume.

A few notes:

* Finding an objective point of truth for value when all of the currencies are floating is hard but vital to success. This was the hardest problem we encountered. We tried taking the realtime average of BTC and ETH across all exchanges, we tried tying it to the shortest route to USD, and several other routes... but ultimately this is where we ended up “losing” most of our alpha.

* Order books are seemingly simple but the devil is in the details. This especially matters for paper trading.

* Efficiently using API limits at exchanges is an optimization problem in and of itself.

* Our model was relatively simple but we focused on speed and edge cases. For instance, Binance would rotate IPs on their load balancers and we’d constantly check the latency between each open SSL connection and use the fastest. Further, we wouldn’t decode the buy response to plaintext; we’d just read the raw stream.
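
For readers unfamiliar with triangle trading, a minimal sketch of the basic profitability check might look like the following. This is not the bot described above: the function name, the example prices, and the 0.1% per-trade fee are all assumptions made for illustration.

```python
def triangle_multiplier(btc_usdt_ask, eth_btc_ask, eth_usdt_bid, fee=0.001):
    """USDT returned per 1 USDT sent around the loop USDT -> BTC -> ETH -> USDT.

    Prices are hypothetical top-of-book quotes; fee is an assumed taker fee.
    """
    btc = (1.0 / btc_usdt_ask) * (1.0 - fee)   # buy BTC with USDT at the ask
    eth = (btc / eth_btc_ask) * (1.0 - fee)    # buy ETH with BTC at the ask
    usdt = (eth * eth_usdt_bid) * (1.0 - fee)  # sell ETH for USDT at the bid
    return usdt


# Made-up quotes: the loop only pays when the result exceeds 1.0 after fees.
profitable = triangle_multiplier(10_000.0, 0.02, 201.0)
unprofitable = triangle_multiplier(10_000.0, 0.02, 200.0)
print(profitable > 1.0, unprofitable > 1.0)
```

The sketch ignores order book depth, partial fills, and latency, which is where the details mentioned in the notes above come in.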

After several epic months our entire project fell apart after a cryptic phone
call about “institutional access” that didn’t follow the 1s websocket update.
The access was quite expensive, so we said no to it, and shortly after all of
our strategies went to crap.

Best we could tell someone was front running us due to an artificial delay for
our account (delay between trades went to ~20ms up from our prior steady speed
of 3-7ms) and/or a bunch of the trades in the orderbook were bogus.

Frustrated, we tried our strategy on another account and the delay dropped
again to our normal range and the strategy was profitable again (the
orderbooks were slightly different between bots!).

It was in that moment we realized playing in unregulated markets is not fun or
something we wanted to continue to do. Intermediary risk was something we
didn’t account for.

Further, we realized that there will always be a better resourced or more
dedicated team willing to fight you for your alpha.

After months of effort and a ton of fun we decided it was best we went back
and focused on a problem where we could build a long term competitive
advantage.

Edit: typos and formatting

~~~
criddell
What's the value of HFT? If exchanges were required to add a random delay to
every trade to work against high frequency traders, would anything of value be
lost?

~~~
traK6Dcm
Many HFT systems provide liquidity. If there is no liquidity, retail investors
like you or me cannot buy or sell.

Just imagine you want to exchange a currency because you go traveling and the
exchange tells you "Sorry, nothing available right now, gotta come back in a
few weeks". That's what would happen if there is no liquidity.

~~~
akra
I've always had questions about the liquidity on offer, though, having seen
participants flee the market in black swan events. HFT liquidity could be
illusory: it's only there when it's not really required, similar to how the
bank "only offers you a loan when you don't need one". Of course, a crisis is
exactly the point when liquidity is required; in normal times there's
sufficient liquidity from market participants. It's easy to offer liquidity in
normal times; harder to do so when no one else wants to offer it.

~~~
wellactually
Offering liquidity is not equivalent to offering free money. The offer comes
at a cost.

------
pinouchon
I spent the last year working fulltime on a system similar to the one
described here. I trade the top ~20 cryptos on binance. I use deep learning
models (combination of temporal, causal convnets and RNNs) with heavy data
augmentation. I built my own tooling for data collection, training,
backtesting and live deployment. Having a data engineer background coming into
this was hugely helpful: most of my time was spent manipulating data in some
way (and not playing around with the models). One of the most demanding parts
was estimating spread/slippage costs and including them in the loss function.
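
One common way to fold trading costs into a training objective is to charge a cost on every change in position. The sketch below is an assumption about what such a loss could look like, not the model described here; `net_pnl`, `cost_aware_loss`, and the flat `cost_per_turnover` parameter are hypothetical.

```python
def net_pnl(positions, returns, cost_per_turnover):
    """Profit of a position sequence after paying a cost on each rebalance.

    positions[i] is the position held over period i (e.g. -1, 0, +1),
    returns[i] is the asset return over that period, and cost_per_turnover
    is an assumed flat estimate of spread plus slippage per unit traded.
    """
    pnl = 0.0
    previous = 0.0  # start flat
    for position, period_return in zip(positions, returns):
        pnl -= cost_per_turnover * abs(position - previous)  # pay to rebalance
        pnl += position * period_return                      # earn the period return
        previous = position
    return pnl


def cost_aware_loss(positions, returns, cost_per_turnover):
    """Loss to minimize: the negative of the cost-adjusted profit."""
    return -net_pnl(positions, returns, cost_per_turnover)


example = net_pnl([1, 1, -1], [0.01, 0.02, -0.03], cost_per_turnover=0.001)
print(example)
```

In practice the cost term would depend on order size and book depth rather than being a constant, which is presumably why estimating it is so demanding.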

Most of what the author talked about, I learned the hard way.

I'm now at the point where I ran some tests (trading small amounts) live on
binance and the results are positive: I do manage to make small profits, but
more importantly, the recorded live trades reflect very closely the backtest
trades (for a given period). I'm currently scaling up my model and adding
better monitoring / reporting / CI.

I'd be happy to chat with anyone having done similar projects or willing to
exchange ideas.

~~~
alexcnwy
I'd love to hear more about what kind of data augmentation you're doing. A
friend of mine recently got a GAN to work for timeseries which is really
interesting.

I've done a lot of work in the space and would love to chat - just emailed you
:)

~~~
ghgr
Indeed GANs are showing very promising results. There's a series of blog posts
from Fernando de Meer which discuss this topic in a very approachable way.
[https://quantdare.com/generating-financial-series-with-gener...](https://quantdare.com/generating-financial-series-with-generative-adversarial-networks/)

~~~
alexcnwy
Very cool, thanks for the link!

------
thundergolfer
This post is an exemplar of the crucial relationship between domain-specific
knowledge and ML competency in the ML space. The bulk of the post is detailing
the tricky ins and outs of trading, and overall the author gives the
impression that they're broadly knowledgeable about stock markets.

Contrast this post with those you see with ML hobbyists who delve into
medicine or fake-news and produce useless results testament to their lack of
domain-specific competency.

~~~
jacquesm
Domain knowledge is essential to almost any project that aims for eventual
commercial success. It is quite rare that an outsider will come into a field,
apply some ML and make a killing.

~~~
deepnotderp
RenTec

(Yes, I know they do much more than ML, but still)

~~~
throwawaymath
That is not what Simons did to make Renaissance Technologies successful.
Simons cultivated domain knowledge long before he started a hedge fund,
because he had an interest in trading and gambling even as a professor. He
also hired people with financial experience.

The historical record overlooks the people he hired who knew a thing or two
about trading, while fixating on the team of NLP scientists he hired from IBM.
Likewise Simons wasn't initially successful in the very, very early years. It
wasn't until the late 80s that the Medallion fund really came into its own.

~~~
deepnotderp
Yes, I understand that the "financial naiveness" so to speak of RenTec is
overplayed in the media, but my point is that superior domain knowledge wasn't
what enabled Medallion to win. And to be fair, most of the early years were
before systematized quant trading.

------
gricardo99
Great post! Very refreshing to hear about a) the honest level of effort
involved in this type of endeavor, b) the amount of nonsense trading advice
out there.

Maybe in a future post you could discuss the security and banking side of this
in more detail? In the 6ish years I’ve played around with crypto trading (and
I really mean play, nothing close to your level), I’ve had 2 exchanges hacked
and lose all customer funds, another 2 had major security breaches causing
days of downtime but recovered, and one site seized by the FBI.

Then there are the horror stories of banks freezing your account when you move
funds in and out of exchanges. Luckily that hasn’t happened to me.

I bet you have some good stories and perspective on that side of it, I would
love to hear it.

~~~
traK6Dcm
Author here. Honestly, I don't have a good answer. I spread my capital across
enough exchanges so that if one runs away with it or gets hacked it doesn't ruin
me. It's just a risk I'm taking.

I'm also not trading much capital. Because the system is more on the HFT side,
the actively traded capital isn't that high, and I don't care about losing it.
Any profit I try to get out of the exchanges regularly. I wouldn't feel
comfortable leaving large sums on those exchanges.

~~~
account73466
Apart from Binance, who else do you trust? (Of course, overall no crypto
exchange can really be trusted.) I assume you trade alts vs (BTC or USDT).

Also, when you said "market neutral", did you mean you also short? (Only a few
pairs have margin on Binance, and it appeared only recently.)

------
mellosouls
I enjoyed reading this but here is a cautionary review of a project in the
same field:

[https://towardsdatascience.com/what-happened-when-i-tried-ma...](https://towardsdatascience.com/what-happened-when-i-tried-market-prediction-with-machine-learning-4108610b3422)

Discussed here:
[https://news.ycombinator.com/item?id=21624907](https://news.ycombinator.com/item?id=21624907)

~~~
alexcnwy
The only caution I took away from that post is that it's very easy to make
mistakes applying ML to financial markets if you don't know what you're doing.

There looks like a lot of overfitting the validation set going on in that
post.

It's also a mistake to conclude that "there was no subtle underlying pattern"
just because the author couldn't find one.

Throwing XGBoost at a bunch of technical indicators isn't gonna cut it but I
have had some solid real-world success (as have several people I know)
applying ensembles of deep learning models (with regime switching based on
model residuals) to profit from "subtle underlying patterns".

------
mtm7
I’m impressed with this system, but I’m even more impressed with the author’s
writing style. I’d love to see more technical posts written with this level of
clarity.

~~~
traK6Dcm
Thank you very much :) I'd love to write more, I just need to figure out a
good next topic.

------
dnautics
> For example, instead of defining a tick as 1 second, we could define it as
> 1.0 BTC traded...

Interestingly Benoit Mandelbrot talks about this in "the (mis)behaviour of
markets" and explicitly calls it "market time"
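
The quoted idea of a tick defined by traded volume rather than by wall-clock time can be sketched as follows. The `volume_bars` helper and the sample trades are hypothetical, and the article's own implementation may differ.

```python
def volume_bars(trades, bar_volume=1.0):
    """Group (price, size) trades into bars holding bar_volume units each."""
    eps = 1e-12  # tolerance for floating-point residue
    bars = []
    current = None  # the bar being filled, or None between bars
    filled = 0.0
    for price, size in trades:
        remaining = size
        while remaining > eps:
            if current is None:
                current = {"open": price, "high": price, "low": price, "close": price}
            current["high"] = max(current["high"], price)
            current["low"] = min(current["low"], price)
            current["close"] = price
            # A large trade may fill several bars, so take only what fits.
            take = min(remaining, bar_volume - filled)
            filled += take
            remaining -= take
            if filled >= bar_volume - eps:
                bars.append(current)
                current = None
                filled = 0.0
    return bars  # a trailing partial bar is dropped


trades = [(100.0, 0.5), (101.0, 0.5), (99.0, 2.0), (102.0, 0.25)]
bars = volume_bars(trades)
print(len(bars), bars[0])
```

With bars like these, a busy minute produces many bars and a quiet hour produces few, so each bar represents the same amount of market activity.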

------
d--b
Is anyone else suspicious about the results?

Claiming a 4000% return while staying market neutral seems a little too good
to be true.

First: those levels are insanely high, so the algo must either be taking some
absurd risks and have the worst Sharpe ratio, or be getting pretty close to
100% accurate.

Second: if you can scale this across markets, and assuming the same return,
that investment will turn into 12 billion in 4 years. I doubt that you'd
write a blog post about it if you had found such a gold mine.
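
The 12 billion figure follows from compounding the 40x multiple in the article's title (5k to 200k) four times, assuming that multiple is earned each year and the gains are fully reinvested. A quick check, where `compound` is a hypothetical helper:

```python
def compound(start, yearly_factor, years):
    """Value of `start` after multiplying by `yearly_factor` once per year."""
    return start * yearly_factor ** years


yearly_factor = 200_000 / 5_000          # 40x, from the article's title
value = compound(5_000, yearly_factor, 4)
print(f"{value:,.0f}")
```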

~~~
kungito
Isn't it the case that many of these strategies don't scale well? When you are
in low volume trading you are collecting all the best trades, but as soon as
you go 10x you are affecting the market way too much.

------
onlyrealcuzzo
Is this the author?

I would love to know how this fared recently in the large sell-off.

What he says about some markets possibly being predictable rings true to me.
But the article was far from convincing that the BTC market is actually
predictable.

The natural assumption should be that the author was in the right place at the
right time. Although he went to great lengths, I'm not convinced this is
anything other than luck.

~~~
Akababa
Through hypothesis testing you can estimate that the probability that this
was due to luck is very low. Assuming that a monkey would have a 50% chance of
profiting on a day, the chance of going a month without a losing day is less
than 1 in a billion.
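
That estimate can be checked directly, assuming independent days, a 50% chance of profit on each day, and a 30-day month (`prob_no_losing_day` is a hypothetical helper):

```python
def prob_no_losing_day(days, p_win=0.5):
    """Probability that `days` independent coin-flip days are all winners."""
    return p_win ** days


month = prob_no_losing_day(30)
print(month, month < 1e-9)
```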

~~~
symplee
How many monkey bots are flipping the coin daily? Selection bias just needs
one.

I very much look forward to the author's follow up post at the end of 2020 to
see if another 5k turns into 200.

~~~
Akababa
You have to make some basic assumptions to do any statistical inference,
because if you don't then literally anything can be explained by luck. For
example, even if the author did a follow-up post (which I'd love to see as
well!) every year for 30 years and made money every time, it could still be
"selection bias".

The number of monkeys required to match the author's results over a 12-month
period is well over the number of atoms in the universe.
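
Both sides of this exchange can be put into numbers. The sketch below assumes independent 50/50 days, takes roughly 10^80 as the usual rough estimate for the number of atoms in the universe, and treats "matching the results" as a streak with no losing day, which is a simplification; `prob_any_perfect_streak` is a hypothetical helper.

```python
import math


def prob_any_perfect_streak(n_bots, days, p_win=0.5):
    """Chance that at least one of n_bots coin-flip bots has no losing day."""
    p_streak = p_win ** days
    # 1 - (1 - p)^n, computed with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(n_bots * math.log1p(-p_streak))


# A billion bots over one month: a lucky survivor is more likely than not.
month = prob_any_perfect_streak(1e9, 30)
# As many bots as atoms in the universe over a full year: still negligible.
year = prob_any_perfect_streak(1e80, 365)
print(month, year)
```

So selection bias is a real concern over a single month, but under these assumptions it cannot explain a full year without a losing day.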

------
jackschultz
> The biggest edge probably comes from the effort put into building the
> infrastructure.

I feel like this should be in bold, but either way, I love reading that in
these posts. At every stage, from the research that confirms your models are
correct to being able to trust real-time trades, you need a solid
architecture. Remember that this isn't only true for trading; the same holds
for tons of solutions to other problems. If comment readers have other
examples, I'd love to hear them in responses.

------
latchkey
I've played with writing bots before and this post hits on so many of the edge
cases I personally ran into. I have never heard it this well explained before.
Brilliant.

------
mthoms
Fascinating post. There's just so much to digest here. Well, there goes the
rest of my day!

------
adamiscool8
It's interesting they suggest the higher the timeframe, the noisier the time
series, when to my understanding the opposite is typically found -- the lower
timeframes exhibit a more random walk and the higher timeframes exhibit
trending behavior.

------
fny
Can someone comment on how taxes are handled when automated trades are made
like this? It's something that seems wholly absent from the cost calculations.

~~~
traK6Dcm
Author here. Personally, I just don't. I tried doing it but it was too
complicated. So I end up just hiring a tax accountant specializing in crypto,
send them all the data I have, and pay a few $k. In case something goes wrong,
it's their fault and they take the risk.

~~~
ACow_Adonis
Not sure what country you're in, but that's not the way tax accountants work
in mine. You're still liable for mistakes/problems they make here :(

------
lorepieri
It is a big loss of your time, and nobody will give that time back to you. I
suggest you spend your time on non-zero-sum games, something that can create
value for you and society. Now that you have some savings you can definitely
afford it. The next best thing to never having done it is to quit now.

Disclaimer: I built a similar system in the past, took some gains and then
realised the above. I then quit to build a company.

~~~
lowracle
I have been working on such a system and it is NOT a huge loss of your time.
I've learned so many things in the past year, in market microstructure, in
networking (infrastructure, protocols), cloud computing, cloud management
(docker swarm, kubernetes), linux kernel bypass, distributed systems,
databases, and I've read hundreds of papers on neural networks, gaussian
processes, etc... If you are wondering if you should get into this, it is one
of the best learning experiences you will ever have.

------
jugg1es
It must be said that it is a lot easier to make money in a stock market that
has had low volatility and no significant, prolonged dip in the last 5+ years.
My own long-term investments have earned 20% return over the last 5 years with
zero trading. I realize that this article is specifically about crypto, but
trends in all markets are generally up across the board.

------
tatoalo
As someone who just finished a BSc in computer science and started a MSc in
Financial Technology and Computing this post is really interesting to me, keep
‘em coming :D!

------
KloudTrader
This is a really good post, thanks for sharing it. Algorithmic trading systems
vary a lot and every shop has its own way of doing things.

------
known
[https://archive.vn/iI8H1](https://archive.vn/iI8H1)

------
echelon
Can this same strategy be leveraged on zero-fee stock exchanges? Why is crypto
the target here?

~~~
traK6Dcm
Author here. Perhaps if you already have existing HFT infrastructure and
connections to efficiently trade on such exchanges. But such infra costs
millions. If you don't have this, you're probably at too large of a
disadvantage to find any alpha.

At least that's my understanding based on conversations I've had, I've never
traded equities.

~~~
__d
In a little more detail ...

To be competitive in US equities HFT, you need an FPGA with 40GbE ports hosted
in a server (which needs to power and cool the FPGA, and deal with the less
latency-sensitive bits of your system). You'll need some storage as well.

That server needs to be co-located with your target exchange(s) matching
engines, and connected via 40GbE. You might additionally want remote market
data via mm-wave microwave.

You can probably put together a basic but competitive hardware setup for $70k
or so, if you ignore redundancy, and you only need to trade a single market.
More realistically, you'll need at least two, plus shared storage, and
probably more depending on what markets you intend to trade on.

Then you have monthly costs: colocation for the server(s) ($5k-ish+), port
fees for the order entry ($500-ish), port fees for market data ($500-ish),
physical connectivity fees ($20k-ish), cross connect fees for the
connectivity ($500-ish), wireless connectivity fees, you might need roof
access (more fees), market data fees (per exchange), memberships, and trading
costs.

I haven't done this for a while, but it easily adds up to $100k per month or
more.

So you need to be making quite a bit to pay off your infra, before you start
thinking about profit. And your model will age pretty fast, so you'll want to
be working on a few possible replacements concurrently.

It's a tough business.

~~~
traK6Dcm
Thanks a lot for the detail. I've always wondered what the actual costs look
like.

~~~
__d
Thanks for posting Nick's article!

I chased up some actual details from NASDAQ as an example:

[http://www.nasdaqtrader.com/Trader.aspx?id=PriceListTrading2](http://www.nasdaqtrader.com/Trader.aspx?id=PriceListTrading2)

------
m3kw9
The only way to beat them is to go long, super long, the longer the better.
Machines don’t go long.

------
cco
And in the end, what value was created?

"Liquidity in the BTC market"?

~~~
traK6Dcm
Author here. Actually, the system is mostly taking liquidity from the market
so it's not even doing that :) Perhaps, "opportunity for others to create more
liquidity"

Jokes aside, it's actually something I am thinking about a lot. Such systems
don't create value, but they are extremely intellectually interesting and I've
learned a lot. You can say the same for many other projects, for example
academic research in many fields. Most of it is just noise to promote the
author and does not create value in the world. But it's intellectually
interesting, so people work on it.

Other people write compilers for fun to learn something new without creating
value. I don't think this is fundamentally different.

~~~
mrpopo
> But it's intellectually interesting, so people work on it.

No. It's a financially sound way to use time, so people work on it. Anything
can be intellectually interesting.

What kind of academic research does not create value? And if so, maybe it
deserves to be criticized the same way.

~~~
TomMarius
> What kind of academic research does not create value? And if so, maybe it
> deserves to be criticized the same way.

I have friends that study social behavior of ants. Is there any way to apply
that?

