
Building AI Trading Systems - dennybritz
https://dennybritz.com/blog/ai-trading/
======
henning
I tried doing some forecasting with various neural network models after
assembling what I thought was a good amount of forex data. The neural net (I
tried various architectures) couldn't do any better than chance. After playing
around with it and trying to double-check everything, that was as far as I
could get. This puts me ahead of most traders, since most of them lose money,
then quit.

This makes me wonder what kind of trading systems can actually have any kind
of edge, since some kind of autoregressive time series forecasting system
seems pretty unreliable.

On a more general note, how do you move beyond it being gambling? Just because
a system backtests well doesn't mean a phenomenon will continue to happen,
especially if your system will significantly impact the market you're in. If
you make a trend-following system, every time you trade, you're gambling that
the trend is more likely to continue than not. If you're right, you'll come
out ahead over many trades. But if, like most beginners, you don't have enough
capital to withstand drawdown, you won't be able to last long enough for
whatever phenomenon you've found to average out.

It takes a lot of time, effort and risk to do all this, so this is a
long-winded way of saying I don't think it's for me. If you build a SaaS product
and it fails, at least you can talk about what you learned from building it
and use that in future endeavors. If you lose money trading because your
algorithm doesn't work, what do you learn from that besides that your
algorithm doesn't work?

~~~
wavepruner
You need more data to input besides just the price time-series. Successful
human traders balance and synthesize a myriad of data sources to make
decisions.

I depend on an in-depth understanding of human psychology as one of my data
sources. You can't turn something like that into data and input to a model. It
is something learned through life experience and study.

~~~
akg_67
+1 Any trading strategy based only on price is a fool’s errand. The information
that impacts price needs to be included in the trading strategy. A lot of
short-term price movement is news driven, so unstructured text processing of
news, social media, and relevant documents will be a key component of such
trading strategies.

I am not as familiar with the forex market as with the equity market. But I
expect forex to be affected by political and economic news from the host
countries of the relevant currency pairs. All of this needs to be coded into
forex trading strategies.

If trading based on just price were that simple, everybody would be doing it
successfully.

~~~
smabie
No successful strategy ever has been based on price. Price isn't stationary so
you can't do anything with it. You need to be looking at the log returns.
Price is completely irrelevant, at least for equities.
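
For readers unfamiliar with the term, a log return is the natural log of the
ratio of two consecutive prices. A minimal sketch in plain Python; the price
list is made up for illustration and the function name is hypothetical:

```python
import math


def log_returns(prices):
    """Return ln(p[t] / p[t-1]) for each consecutive pair of prices."""
    return [math.log(later / earlier) for earlier, later in zip(prices, prices[1:])]


# Made-up closing prices, for illustration only.
prices = [100.0, 102.0, 101.0, 105.0]
rets = log_returns(prices)

# Log returns are additive over time: their sum equals ln(last / first).
total = sum(rets)
```

The additivity in the last line is one practical reason log returns are
convenient to work with; whether they are stationary enough for a given model
still has to be checked on real data.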

~~~
Jimmc414
Sorry, but that is wrong. All I use is price and time. See Elliott Wave Theory.
Most indicators are perfectly correlated with price, meaning that unless you
are an HFT you can't trade fast enough to act on them.

~~~
smabie
Elliott wave theory is quackery. Price is not suitable for any statistical
analysis.

~~~
Jimmc414
It has changed, and it is just observations of competing waves of pessimism and
optimism and the patterns they demonstrate. I'll put up a model trained on TQQQ
and SQQQ price and time data only. I will bet you 5k mine will beat yours
using whatever inputs you choose over any reasonable time you specify.

------
anonu
Just a reminder: nobody ever wrote about their super successful trading
strategy. It's just never happened. If you have the wherewithal to research and
build a trading system that works, then you're smart enough to know that the
moment you reveal your edge to the world, it disappears. Even if you don't
discuss the innards of your strategy, but you talk about your process or the
system your strategy is built on, you've revealed too much.

~~~
Erlich_Bachman
This is a simplified version of the truth. There is a lot of information that
you can safely share, because the number of people who will know where to look
for it, how to implement it, what to even do with it, and how not to make any
one of 100 possible stupid mistakes while implementing it is very low.

Case in point: Warren Buffett. All of his process is public knowledge; he
constantly writes and talks about it. And yet somehow it didn't make him lose
his edge.

~~~
defertoreptar
Not exactly. We know the broad strokes. He figures a price for a company by
estimating their lifetime earnings and then factoring in the time value of
money. He's talked about a mental checklist he goes through when looking at
quarterly reports, but he never talks about the exact details of this process.
Whether he's lost his edge, that's debatable too. The period since he last beat
the S&P is now stretching into the long term.

------
halfcat
I find most of this article to be “successful people can’t explain why they
are successful so they say a bunch of arbitrary things they’ve noticed”.

He found success pursuing relative advantages, infrastructure advantages, and
building custom tools from scratch.

But absolute vs relative advantages, plumbing together canned solutions vs
building your own from scratch, infrastructure-level advantages vs decision
making advantages...all of those contrasts exist in other businesses
everywhere. None of those are specific to trading.

> “in my experience, nothing beats learning by doing or finding a mentor”

This hits the nail on the head.

The best way to become a profitable trader is with a mentor, but it’s nearly
entirely luck. You drive an Uber or tend bar and happen to make friends with
someone successful who is willing to guide you. Trying to seek out a mentor
online is nearly impossible, as everyone who is findable and willing is almost
certainly a better marketer than trader.

The other way to become a profitable trader is to start trading with real
money. It’s amazing how quickly one can learn how to mend a boat, when the
boat starts sinking.

------
shoo
readers may also be interested in Benter's paper "Computer Based Horse Race
Handicapping and Wagering Systems: A Report" --
https://www.gwern.net/docs/statistics/decision/1994-benter.pdf

> This paper examines the elements necessary for a practical and successful
> computerized horse race handicapping and wagering system. Data requirements,
> handicapping model development, wagering strategy, and feasibility are
> addressed. A logit-based technique and a corresponding heuristic measure of
> improvement are described for combining a fundamental handicapping model
> with the public's implied probability estimates. The author reports
> significant positive results in five years of actual implementation of such
> a system. This result can be interpreted as evidence of inefficiency in
> pari-mutuel racetrack wagering. This paper aims to emphasize those aspects
> of computer handicapping which the author has found most important in
> practical application of such a system

Arguably the paper describes the state of the art from three decades ago,
applied to betting on Hong Kong horse races, not market price movements.
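
To make the abstract's "combining a fundamental handicapping model with the
public's implied probability estimates" a little more concrete, here is a
sketch of one multinomial-logit style combination. This is an assumption about
the general shape of such a technique, not a quote from the paper: the
function name, the weights alpha and beta, and all probabilities below are
made up, and in practice the weights would have to be fitted on past races.

```python
import math


def combine_probabilities(model_probs, public_probs, alpha, beta):
    """Blend two probability estimates per runner in log space, then renormalise.

    combined_i is proportional to exp(alpha * ln(model_i) + beta * ln(public_i)).
    """
    weights = [
        math.exp(alpha * math.log(f) + beta * math.log(p))
        for f, p in zip(model_probs, public_probs)
    ]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-horse race.
model_probs = [0.50, 0.30, 0.20]   # from a fundamental model (made up)
public_probs = [0.40, 0.40, 0.20]  # implied by the betting pool (made up)

combined = combine_probabilities(model_probs, public_probs, alpha=0.6, beta=0.4)
```

With alpha=1 and beta=0 the result is just the model's own estimate, and with
alpha=0 and beta=1 it is the public's, so the two weights express how much
each source is trusted.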

~~~
linus_torvalds
Yeah this is a great paper on the subject. Although horse betting is different
than financial markets due to the parimutuel system.

~~~
HighlandSpring
Does that mean you have to find a greater edge to cover the house edge?

------
rezahussain
Writing AI trading systems has been the coding I do for fun since 2012. I'm a
little under break-even so far but I keep at it because I find it so
interesting. Since I started, every single week I have learned a new way of
thinking about a problem I encountered or a new approach to problems that
still stand in my way.

Questions like, how do you choose a stoploss? Well, you can pick it
statistically based on history or you can use a supervised label. You can even
use stock A's calculated stoploss to pick the stoploss you use on stock B
because you found a condition under which those two stocks became almost
identically correlated. How do you want to pick the supervised label? You can
do spectral analysis to pick the stoploss too. You can use sentiment as a
stoploss, sourced from Google News, Twitter or StockTwits.

It doesn't have to be, 'well I measured the average profitable stoploss to use
over the last 10 years across all stocks and that isn't working so I quit'

Things like that, you get to fit the ideas together and then test them in the
real world.
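
As an illustration of "pick it statistically based on history", one possible
reading (an assumption, not necessarily what the commenter does) is to look at
how far past winning trades moved against you before recovering, and set the
stop just beyond most of them. The numbers and names below are hypothetical:

```python
def percentile(values, q):
    """Linearly interpolated percentile of `values`, with q in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)                       # index at or below the target position
    hi = min(lo + 1, len(ordered) - 1)  # next index, clamped to the end
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


# Made-up worst adverse move (in percent) seen during ten past winning trades.
adverse_moves_pct = [0.2, 0.5, 0.3, 1.1, 0.4, 0.8, 0.6, 0.9, 0.7, 1.0]

# A stop at the 90th percentile would have kept roughly 9 in 10 of those winners alive.
stop_loss_pct = percentile(adverse_moves_pct, 90)
```

The same helper could be fed history from a different, correlated stock, which
is the "stock A's stoploss on stock B" idea mentioned above.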

There are some things I would like to share.

1. Having a good forecast doesn't by itself translate into cash. It has to be
paired with a trading strategy. This is probably why the author thinks the
answer is RL, because coincidentally if you approach this problem with RL, it
does the forecasting + strategy.

2. I have measured a correlation between heavier processing (using a higher
big O) and better out-of-sample performance.

The criticisms with the NN approach like non stationary data have obvious
solutions that a 'by the book' trading approach + ml approach don't really
teach beginners so they dismiss it.

It is my belief right now that there are people who are prepping data from
sources like iextrading then using things like sagemaker to develop good
enough forecasting and combining it with a statistics+rules based trading
strategy to make living wages.

That said, I have 5k account size for my NN obsessions, and my 401k is 'by the
book'.

person_of_color is totally right when he says it is a Moby Dick of
programming.

~~~
discordance
It sounds like a lot of fun! I love the idea that there’s one metric ($) to
measure the effectiveness of your strategy/code.

Any recommendations or hints on where to get started (assuming I’m decent with
python/pandas etc)?

~~~
rezahussain
Yes!

1. I would start in the Numerai tournament. I did this for 3 years, after my
first two years on my own in the market. It's useful because they provide
ML-ready data, and you can iterate very quickly. If you do not have ML
experience, Numerai will teach you about many different types of overfitting
and the many correct and incorrect ways to deal with them. For example, some
ML people always apply dropout, but when you have a small signal to begin
with, dropout can drop out the signal, and then there is only noise left for
the model to fit, and of course it will then perform poorly. The other thing it
will help with is the hopelessness you will encounter from hitting a wall
(hitting a wall is common in ML, and should be expected): the scoreboard shows
individuals who have broken through that wall, so you know it is possible. I
stopped participating after stabilizing in the top 20 because they change the
format of the tournament every so often and I wanted my Saturdays back. You
don't need to reach the top 20; I hit a wall around rank 100 back when they
used actual bitcoin to pay people. You just need to do well in one of the
rounds where everyone else fails, so you can go through the process of 'what
did I do that I'm not aware of that made me succeed where everyone else
failed'.

2. Read Advances in Financial Machine Learning by Marcos Lopez de Prado. This
goes over the false assumptions that outsiders make, and then outlines rookie
mistakes (I made many of the mistakes described in the book, then read this
book when it came out). It will also break you out of the thinking that leads
to typical approaches and explain why it is unrealistic to expect them to work.

3. Become familiar with retail trader mistakes like overtrading, improper
sizing, and emotions, as well as the fact that you cannot rely upon regulating
bodies to prevent fraud from occurring; they only act after it has occurred.
(This is for scenarios where your model says short this stock, then you see
that the stock is fraudulent but it continues to exist.) Learning blackjack
probabilities + sizing helps with developing a strategy. Things like, do you
want a trading system that has 60% accuracy and 10% profit each time, or one
that has 45% accuracy but 200% profit each time? It's interesting because even
if you have a 50% accurate, 200% profit/50% loss strategy, you still need to
calculate the probability of seeing a run of consecutive losses long enough to
bankrupt you if you have the wrong size. In college, for me, this was covered
in the Discrete Math class.

After steps 1+2+3 I think people who have some level of control over their
emotions have the right foundation to code a system. There are people that
should not trade because they don't have the right personality profile.
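
The losing-streak calculation from step 3 can be sketched as follows. This
assumes independent trades with a fixed loss probability, which real trades
will not strictly satisfy; the function names and the example sizing are
hypothetical.

```python
import math


def prob_losing_streak(n_trades, streak_len, p_loss):
    """Probability of at least one run of `streak_len` straight losses in `n_trades`."""
    # state[j] = probability of sitting on a current losing run of length j
    # without having reached `streak_len` so far.
    state = [0.0] * streak_len
    state[0] = 1.0
    hit = 0.0
    for _ in range(n_trades):
        nxt = [0.0] * streak_len
        for j, pr in enumerate(state):
            nxt[0] += pr * (1.0 - p_loss)   # a win resets the run
            if j + 1 == streak_len:
                hit += pr * p_loss          # this loss completes the streak
            else:
                nxt[j + 1] += pr * p_loss   # this loss extends the run
        state = nxt
    return hit


def losses_to_threshold(position_fraction, loss_per_trade, floor_fraction):
    """Consecutive losses needed to shrink the account to `floor_fraction` of its size."""
    per_loss_multiplier = 1.0 - position_fraction * loss_per_trade
    if not 0.0 < per_loss_multiplier < 1.0:
        raise ValueError("each loss must shrink the account without wiping it out at once")
    return math.ceil(math.log(floor_fraction) / math.log(per_loss_multiplier))


# Hypothetical sizing: the whole account in every trade, 50% lost per losing
# trade, and "ruin" defined as being down to 10% of the starting account.
streak_needed = losses_to_threshold(1.0, 0.5, 0.1)
risk_over_200_trades = prob_losing_streak(200, streak_needed, 0.5)
```

Re-running the last two lines with a smaller `position_fraction` shows how
quickly the required streak grows, and how fast the chance of hitting it
falls, once the position size is reduced.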

4. Find a way to fit the data you encounter into a DB. Early on I had to pay
100 a month to get daily CSVs for stock data. I wrote code that answered
questions for me from the CSVs. This was wasted time, because you can write
SQL to answer so many questions. Keep this DB on a separate computer from the
one you do ML dev on, because the computer that ML dev happens on inevitably
gets wiped (it will happen to you).
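
As a small illustration of step 4, here is a sketch that loads CSV rows into
SQLite and answers a question with SQL instead of custom code. The table
layout, column names and data are made up; any SQL database would do, and an
in-memory database is used here only so the example is self-contained.

```python
import csv
import io
import sqlite3

# Made-up daily bars standing in for a downloaded CSV file.
CSV_TEXT = """symbol,day,close
AAA,2020-01-02,10.0
AAA,2020-01-03,12.0
BBB,2020-01-02,50.0
"""

# In practice this would be a database on a separate machine, not ":memory:".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_bars (symbol TEXT, day TEXT, close REAL)")

rows = [
    (r["symbol"], r["day"], float(r["close"]))
    for r in csv.DictReader(io.StringIO(CSV_TEXT))
]
conn.executemany("INSERT INTO daily_bars VALUES (?, ?, ?)", rows)

# "What is the highest close per symbol?" becomes a one-line query.
highest_close = conn.execute(
    "SELECT symbol, MAX(close) FROM daily_bars GROUP BY symbol ORDER BY symbol"
).fetchall()
```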

Then for you it's a matter of just leveraging python+pandas etc to code a
solution that meets your criteria. There are three categories that you have to
operate across: infrastructure + forecasting + trading strategy. When you see
one of your model's predictions come true it really is a different feeling. But
to ease my conscience I should warn you: if you are the curious type and you
try this once, you will always be curious about it.

For timeseries data I'm currently using iextrading even though it has
downsides (they only have data for trades that route through their exchange). I
used to use kibot, alphavantage, scrape yahoo, download stock data CSVs from
ebay, and save etrade realtime quotes. For placing trades I am currently using
alpaca. (I've used IB, etrade, and robinhood's private API before they blocked
it.)

~~~
marketgod
I see you posted you are at break even since 2012. Most people wipe out in 3
years so good job.

I recommend trading manually to feel and listen to the market so that you can
adapt your code to it. Also, maybe you are overthinking your strategy so don't
hesitate to remove parameters.

My Discord is open; you can join it, see my trades and look for any insights
in what I enter.

------
person_of_color
Don't do this. It's the programmer's Moby Dick. You are better off
self-learning stats/ML skills in your free time and joining a quant fund than
trying to do it yourself.

~~~
keyle
Agreed. It's a goose chase, the house always wins and even a winning system
works one week and not the next.

------
DoctorOetker
I would love to try trading as a hobby with a little side money, but I would
abhor a hobby that reduces to effectively buying the trader-feel-good
experience, where you're essentially sponsoring incumbents as a fanboy
chipping in his pocket money.

What I would require from a trading platform:

1) decentralized and permissionless
2) provably fair trading

With 'provably fair trading' I mean the protocol should be such that I can
prove you are not simply held captive by an intermediary, regardless of shape
or form. It should also be fair with respect to latency.

For example consider a trading market where token X can be exchanged for token
Y and vice versa. Each holder of X demands her minimum of Y per X, and each
holder of Y demands his minimum of X per Y. What if everyone salt-hashed
their demands, and paid the market contract (proportional to how much they
will actually be allowed to trade) to register their salted hash? When the
round has closed, people reveal their salt and plaintext, and the incompatible
trading offers get their money back (minus a usage fee perhaps). The
compatible ones can have their trades go through at the rate of 'total
compatible X offered' to 'total compatible Y offered' (or some variation
thereof, say rewarding those that helped close the gap). In this way there is
no high frequency trading, and you could have a family of such markets
operating at different timescales...
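
The salted-hash step described above is a commit-reveal scheme. A minimal
sketch of just that part, assuming SHA-256 and a 16-byte random salt (both are
choices made here for illustration); the order text format is made up, and the
fee handling and the clearing rule for compatible offers are deliberately left
out:

```python
import hashlib
import hmac
import secrets


def commit(order_text, salt):
    """Commitment published while the round is open: hash of salt + plaintext order."""
    return hashlib.sha256(salt + order_text.encode("utf-8")).hexdigest()


def verify(commitment, order_text, salt):
    """After the round closes, check a revealed (order, salt) pair against the commitment."""
    return hmac.compare_digest(commitment, commit(order_text, salt))


# During the round: only the commitment is registered with the market.
salt = secrets.token_bytes(16)
order = "sell 10 X, minimum 2.5 Y per X"   # hypothetical plaintext format
registered = commit(order, salt)

# After the round: the trader reveals salt and plaintext, and anyone can check them.
honest_reveal_ok = verify(registered, order, salt)
altered_reveal_ok = verify(registered, "sell 10 X, minimum 2.0 Y per X", salt)
```

Because nobody can read the orders until the round has closed, being faster
than other participants within a round gives no informational advantage, which
is the property the comment is after.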

------
mfalcon
I've never tried the AI trading path but I imagine that you can't get huge
gains with public data, unless you find a way to extract "hidden" information
by processing real time news.

I wonder nevertheless if there's a sweet spot where you can build a simple AI
trading algorithm and get modest earnings from it.

~~~
nv-vn
I think the answer is yes & no. If you come up with a sufficiently clever
strategy using public data that other people haven't thought to use, it's
definitely doable. For example, someone with a good understanding of
meteorology would've had a significant advantage a few decades ago (though
trading firms have since caught on). You wouldn't need a perfect data set if
the strategy isn't being used.

In terms of strategies based purely on market data, you are definitely
correct. Any publicly (freely/cheaply) available market data is low
resolution, lacking the full data from any point in time, and generally based
on poor approximations of the actual data (elsewhere in this thread someone
mentions that IEX's data is based on trades that get routed through the IEX
exchange, which obviously misses any data you could get from the markets that
make up 99% of the volume, dark pools, etc.).

I think the "sweet spot" is simply coming up with a strategy that nobody else
has thought about, or else executing a better-known strategy more effectively
than other market participants. Both are hard, but somewhat in the realm of
possibility. The problem is that many people think there's free money to be
made without either of these.

------
pinouchon
For the last year or so I have been working on an ML-based trading system in
the domain of crypto with two friends. I made more in 2 months than I used to
in a year. This is after thousands of "full positions swings" and millions of
trades (short and long). We are now experimenting with different classes of
trading strategies to reduce risk.

We would like to find 1 or 2 more people to work on this project. We need
people who can tolerate risk and are skilled at data engineering: data
pipelines, psql, pandas, numpy, data visualisation, setting up servers. Ideally
they would also be skilled at machine learning / deep learning and have tried
their hand at trading systems. If interested, my email is in my about info.

------
thedudeabides5
"Actually, many months my PnL graph looked something like this: (this is
generated to get a point across, but my real data looked extremely similar):"

I'd love to see the actual data

------
justicezyx
What's the real performance of the system so far?

~~~
nv-vn
Probably not very good. Voleon does all ML-based trading and what I've seen of
their returns does not give me any confidence in ML-based trading having
alpha. I would estimate that at best in a good year returns would be like 5%
y/y in the long term, much less than the sustained ~7% that index funds offer
especially when adjusting for risk. Just speculation but there's a lot of
firms with much more capital, better tools, and teams of extremely intelligent
people who have pretty poor returns because of how good the competition
already is.

~~~
dennybritz
I think you're comparing apples to oranges here. These funds manage billions
of dollars of client money, which forces them into highly liquid markets with
scalable strategies. That's quite different from how individuals or smaller
prop funds can operate, trading off capacity for higher returns by trading in
less liquid markets or with strategies that are "not worth it" for large hedge
funds. If you must manage billions of client money then you are right in terms
of competition, but as someone who only trades his own capital, you can see a
lot higher returns.

~~~
g10r
Agreed. Many smaller, yet successful Hedge Funds limit the capital they manage
for this very reason. Some strategies just don't work at certain scale.

~~~
nv-vn
If the above is true, why would a fund not just allocate a small amount of
resources to trade on OP's strategies? Either:

(1) OP's strategy performs worse than the alternative
(2) They already do this, and have resources that allow them to outperform OP
at their own strategy

If the returns are really meaningful, i.e. better Sharpe ratio than just
holding $SPY or some dead simple strategy like that, then (2) must be true at
least _somewhere_.
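
For reference, the Sharpe ratio mentioned here is the mean excess return
divided by the standard deviation of excess returns, usually annualised. A
minimal sketch, assuming daily returns, 252 trading days per year and a
risk-free rate of zero unless given; the return series is made up:

```python
import math
import statistics


def sharpe_ratio(returns, risk_free_per_period=0.0, periods_per_year=252):
    """Annualised Sharpe ratio from a list of per-period simple returns."""
    excess = [r - risk_free_per_period for r in returns]
    # Sample standard deviation; needs at least two observations.
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)


# Made-up daily returns for a strategy, for illustration only.
strategy_returns = [0.01, -0.01, 0.02, 0.0]
strategy_sharpe = sharpe_ratio(strategy_returns)
```

Comparing a strategy against "just holding $SPY" then means computing the same
number for SPY's returns over the same period and seeing which is higher.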

------
mraza007
Just curious: have financial firms implemented something similar to
this?

~~~
nv-vn
Modern finance is built on top of this type of technology. There are hundreds
if not thousands of firms participating in "quantitative finance," attempting
to use computers/statistics to predict markets. The vast majority of trades go
through major practitioners of this exact idea.

------
MichaelRazum
To be honest, I don’t believe a word about the performance using AI.
Especially if the article doesn’t present the features and the NN
architecture. It's always the question: would a super simple model perform the
same way? And very often the answer is yes.

~~~
rawoke083600
Lol this.. I once built a model (btc) that assumes this: "we can't know the
direction of the market so we might as well guess", after spending months
reading papers and trying to be "clever".

It starts off picking a random market direction (up/down) and places a bid
(sorry, I mean makes a trade). Then, based on lots of tuning/backtesting, it
decides how long to be in position and what the stoploss is.. I think in the
end the most "profitable settings" were something like:

$proft_size = 0.38%
$stop_loss_size = 0.35%

Win-Continue-Direction = 3 rounds (after winning/losing do we change
direction). So in the end it was probably a Markov model with a random start,
if we had to label it :)
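
The exit rule implied by those two settings can be sketched like this. It is a
guess at the mechanics, not the commenter's code: fees and slippage are
ignored, the function names are hypothetical, and the prices are made up.

```python
def run_trade(entry, direction, prices, take_profit=0.0038, stop_loss=0.0035):
    """Walk forward through `prices` and exit at the first level that is hit.

    direction is +1 for long and -1 for short. Returns the signed fractional
    result, or None if neither level is reached.
    """
    for price in prices:
        move = direction * (price - entry) / entry
        if move >= take_profit:
            return take_profit
        if move <= -stop_loss:
            return -stop_loss
    return None


def breakeven_win_rate(take_profit, stop_loss):
    """Win rate at which wins and losses cancel out, before any costs."""
    return stop_loss / (take_profit + stop_loss)


long_result = run_trade(100.0, +1, [100.1, 100.4])   # rises through the profit target
short_result = run_trade(100.0, -1, [100.2, 100.4])  # rises through the short's stop
needed = breakeven_win_rate(0.0038, 0.0035)          # a little under one half
```

With a target only slightly larger than the stop, a coin-flip entry needs to
win just under half the time to break even before costs, so trading fees
decide most of the outcome.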

Oh and for fun it would also "martingale for x rounds" :P

Worked quite well for 3-4 days and was fun implementing it while watching
"Billions" on TV in the background :D

~~~
MichaelRazum
Actually I'm not saying that you can't be profitable. Just saying that the
fancy NN won't work like magic in a noisy environment like financial time
series. For other time series with periodic and other anomalies you can apply
NN and you may have an edge - BUT for those kinds of environments classical
methods also work quite well.

------
known
trading != investing

------
linus_torvalds
"Then, profits started decreasing and I decided to move on to other things and
I lacked the motivation to go back into it."

Is this post about the one with decreasing profits, or a new one that is
profitable?

------
dilandau
In some markets it is necessary to put the same length of fiber-optic cable
between the colocated servers, so that being closer to the exchange's cabinet
doesn't translate into an advantage. So obviously we're talking about
extremely low latency, high-frequency trading. This carries a huge amount of
prerequisites to even get started.

Not only that, but there are many different order types besides "buy at
market price, sell at market price". Then there's options, short sales, and
more.

It goes deep. People devote 30 years of their career to this. Read the author's
experience as a kind of warning, if you will.

~~~
nv-vn
Agree to an extent, but not all money in quant finance is generated through
HFT. Notably, I don't think funds like Rentec are really doing much to get
low-latency [1]. Latency obviously does matter for any kind of quant trading,
but to my understanding a good enough strategy + slippage models and the like
can overcome this.

[1] You can find a list of NYSE broker/dealers here:
https://www.nyse.com/publicdocs/nyse/markets/nyse/members/NYSE_Member_Organizations.csv
-- any firm where latency matters will need to be on this list to colo on the
exchange.

