
Show HN: How I Used Machine Learning to Optimize My Trading Algorithm - taiboli
https://www.quantopian.com/posts/second-attempt-at-ml-stochastic-gradient-descent-method-using-hinge-loss-function
======
tokenadult
"Past performance does not guarantee future results" is still the operative
principle here. Data-mining discovers patterns, but it doesn't lead to deep
insight into causes, and markets are perturbed by many events that you don't
put into your training algorithm. "The market can remain irrational longer
than you can remain solvent" is still important investment advice.

~~~
ciferkey
The Keynes quote reminds me of LTCM (just finished reading When Genius
Failed).

------
tosseraccount
If someone really came up with some foolproof method of beating the market,
wouldn't they keep it secret?

Meanwhile, most folks should stick to asset class allocation and indexed funds
and ETFs.

~~~
minimax
Traders try to keep their successful strategies a secret, but most strategies
"work until they don't," meaning that the algos making money today are not
necessarily the same ones that were making money in 2012. Furthermore, as with
technology companies in general, traders and developers move around between
trading firms, and the ideas move with them.

------
yarou
I actually checked out your algorithm today during work (I work at a big bank,
aka dead end for a "technologist"). It's an intriguing concept, but I still
feel as though classifiers are a poor technique for P&L optimization for a
given portfolio. Out of curiosity, what type of data set are you using for
backtesting, and what time frame? It's not entirely clear that asset (i.e.
stock) prices follow Brownian motion (a Wiener process); rather, they may be
discontinuous.

------
ecopoesis
Great job. From 2007-12-01 to 2008-12-31 you only managed to lose 215.81%. The
benchmark only lost 37.04%.

~~~
wildwood
Where are you seeing that earlier data? I can only get the graphs to cover
2012 and 2013.

This is a good example of how deceptive a percentage-change-only chart can be,
without the absolute value of the portfolio also figuring in. If the system
has a max drawdown of 98%, then getting 200% returns after hitting that low
isn't going to do much good.
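
The arithmetic behind this can be sketched in a few lines. This is only an illustration: the $100 starting value is an assumption, and the 98% and 200% figures are the hypothetical ones from the comment above, not numbers taken from the backtest.

```python
def value_after(start, pct_changes):
    """Apply successive percentage changes to a starting portfolio value."""
    value = start
    for pct in pct_changes:
        value *= 1 + pct / 100.0
    return value


def gain_needed_to_recover(drawdown_pct):
    """Percentage gain needed to return to the previous peak after a drawdown."""
    remaining = 1 - drawdown_pct / 100.0
    return (1 / remaining - 1) * 100.0


# Assumed starting value of $100; lose 98%, then gain 200% on what is left.
final = value_after(100.0, [-98, 200])
recover = gain_needed_to_recover(98)

print("value after -98%% then +200%%: $%.2f" % final)
print("gain needed to recover from a 98%% drawdown: %.0f%%" % recover)
```

Starting from $100, the portfolio ends at about $6, and it would take a gain of roughly 4,900% from the low just to get back to the starting value.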

~~~
fawce
You can clone and run the algo yourself over any time period since 2002.

------
zaptheimpaler
What data did you use to train this? Because it looks like it might just be
overfitting the training data.

~~~
gknoy
How can one tell the difference?

~~~
rodrigtw
Typically, you would train the algorithm on one set of data and then test it
on another. So you might train it on data from FY 2010 and then test it out in
a simulation of FY 2011.

The fear is that if you train it on FY 2010 and then it does well in a
simulation of FY 2010, it might only be because it has stored some
representation of a record of FY 2010 which is extremely predictive of FY 2010
but doesn't generalize well to any other year. Testing the algorithm against a
simulation of FY 2011 would reveal this flaw.
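
As a rough sketch of that idea, the toy example below trains on one year and tests on the next. Everything in it is an assumption made for illustration: the data is synthetic coin-flip noise rather than market data, and `MemorizingModel` is a deliberately overfit stand-in, not the classifier used in the linked algorithm.

```python
import random


def split_by_year(rows, train_year, test_year):
    """Split rows into a training year and a separate, later test year."""
    train = [r for r in rows if r["year"] == train_year]
    test = [r for r in rows if r["year"] == test_year]
    return train, test


class MemorizingModel:
    """Deliberately overfit: stores every training example verbatim."""

    def fit(self, rows):
        self.table = {(r["year"], r["day"]): r["up"] for r in rows}

    def predict(self, row):
        # Unseen dates fall back to a fixed guess of "up".
        return self.table.get((row["year"], row["day"]), True)


def accuracy(model, rows):
    hits = sum(model.predict(r) == r["up"] for r in rows)
    return hits / len(rows)


# Synthetic data: 250 trading days per year, direction is pure noise.
rng = random.Random(0)
rows = [
    {"year": year, "day": day, "up": rng.random() < 0.5}
    for year in (2010, 2011)
    for day in range(250)
]

train, test = split_by_year(rows, 2010, 2011)
model = MemorizingModel()
model.fit(train)

in_sample = accuracy(model, train)      # scored on the data it memorized
out_of_sample = accuracy(model, test)   # scored on a year it never saw

print("FY 2010 (in-sample) accuracy:     %.2f" % in_sample)
print("FY 2011 (out-of-sample) accuracy: %.2f" % out_of_sample)
```

The model is perfect on the year it was trained on and no better than a coin flip on the following year, which is the gap an out-of-sample test is meant to expose.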

------
steven2012
The site is pretty slick, but the backtest I ran when I cloned the algo is
pretty slow. It's been running for about 10+ mins now and it's only 40% done.

As well, in the logs, when I see stuff like:

2012-05-31handle_data:35INFO -63.520880 shares of Security(6109) sold.

it doesn't really inspire a lot of confidence. What does it mean that
-63.520880 shares were sold? Does that mean they were bought? And the fact
that you are purchasing fractional shares also doesn't inspire a lot of
confidence.

~~~
jbredeche
(disclaimer: I work for Quantopian)

I can't speak for the author of the algo, but from the algo, it looks like the
relevant lines for your question are 34 (order(stock,indicator *
context.bet_amount)) and 35 (log.info("%f shares of %s sold."
%(context.bet_amount * indicator,stock))).

Our backtester (Zipline) will only order a whole number of shares, obviously.
If you pass it a fractional number, we take the floor:
[https://github.com/quantopian/zipline/blob/master/zipline/ge...](https://github.com/quantopian/zipline/blob/master/zipline/gens/tradesimulation.py#L217)

The log line you're seeing should probably flip the sign of the number of
shares before logging. order(sid(6109), -63) means sell 63 shares of security
6109. The log line is simply logging the negative value instead of the
positive one. Users can log anything they want in their backtest.
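
One way the log message could be made less confusing is sketched below. `describe_order` is a hypothetical helper, not part of Quantopian or Zipline; it only assumes the convention described above, that a negative amount means a sale.

```python
def describe_order(signed_shares, security):
    """Format a log message with a positive share count and the right verb.

    Assumes a negative amount means a sale and a positive amount a purchase.
    """
    action = "sold" if signed_shares < 0 else "bought"
    return "%f shares of %s %s." % (abs(signed_shares), security, action)


print(describe_order(-63.520880, "Security(6109)"))
print(describe_order(10, "Security(6109)"))
```

Inside the algo this would replace the hard-coded "sold" in the log call, so that purchases and sales are labeled separately and the count is always positive.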

As for the slow performance, apologies - being on HN has resulted in a lot of
people running this algo and while we're scaling up new servers, it's taking a
bit of time to distribute load.

thanks for using Quantopian!

[Edit - added source link to Zipline's order method]

------
eob
From the business standpoint, what are the risks/hurdles that need to be
overcome for Quantopian to offer an "Operationalize Algorithm" button that
starts running on real money?

Surely there could be some structure in which Quantopian gets institutional
trader status (or whatever the "trade for (virtually) free" status is), and
then passes off the low-cost trading to its users, for a fee.

~~~
dunster
There aren't a ton of hurdles left before we start offering "live trading" on
Quantopian. We have all the pieces, we just need to stitch them together. A
couple more months, I think.

In the beginning, at least, it will be leveraged through your existing
brokerage account. You're going to integrate Quantopian with your brokerage,
and Quantopian will place orders for you with your brokerage.

If we're as successful as we hope to be that will mean we're driving a lot of
trading volume. If you start driving enough trading volume, the exchanges
start to pay you rather than the other way around. It would be a pretty sweet
day if we can offer trading for free to our members and fund the company on
the exchange fees.

~~~
nolite
Which brokerages do you see yourself allowing? I'm thinking of opening one.

~~~
dunster
Interactive Brokers will be our first integration.

------
meson2k
Good try. Some observations:

\- The Sharpe ratio is very bad. Focus on improving it. Instead of looking for
spectacular gains, focus on solid growth.

\- Looking at the daily tick backtest, at many points the alpha is so negative
that the losses your algo incurs would make you delinquent. Again, focusing on
a better Sharpe ratio should help here. (RETURNS -72.64%)

\- Practical consideration: You look at every stock in data, and its
historical prices, on each tick. This could work perfectly well for
low-frequency trading, e.g. daily, but not at per-second ticks, because
computing time > transaction time, i.e. you would be acting on stale
inference.

\- Transaction costs?

PS: Pet peeve. Gradient descent is a heuristic at best and not true machine
learning :)

------
em70
That code is a recipe for getting hurt. Here are a few points:

\- Return is not everything. More informative performance metrics are Sharpe
ratio (a sort of reward/risk measure) and information ratio. Both of the above
have ridiculously low values in this case.

\- Another thing that matters is the distribution of returns. If you have
plotted this and still see nothing wrong, you are really better off doing
something else. With numbers like these, chances are you will be out of cash much
sooner than you will hit a good month. And even after you have hit a good
month, what happens when you hit a bad one?

\- Beta: essentially, when the algo does well, it is mostly because of
significant overexposure to the market. At this point, I would much rather
lever up and buy SPY than trade using this thing.

\- Predictability and risk management: ok, so you have tested this on
historical data. What are the cases in which this would misbehave? After all,
it is optimization, so there may be inputs for which this gives very
undesirable results. How would you notice? (hint: you have no risk management
in your code!)

The bottom line is that, if you ever want to put some money where your mouth
is, you would have way better chances at doing well if you learned some basic
finance rather than treating the markets as a black box (no matter how
creative you can be). At least, you will be able to evaluate appropriately
whether you are doing well or not.
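
For readers who have not met these metrics, here is a minimal sketch of how Sharpe ratio, information ratio, and beta can be estimated from daily returns. The return series are made-up numbers, the risk-free rate is assumed to be zero, and 252 trading days per year is a common convention rather than anything taken from the backtest.

```python
import math
import statistics

TRADING_DAYS = 252  # assumed annualization factor


def sharpe_ratio(returns, risk_free_daily=0.0):
    """Annualized mean excess return divided by its standard deviation."""
    excess = [r - risk_free_daily for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(TRADING_DAYS)


def information_ratio(returns, benchmark):
    """Annualized mean active return (vs. benchmark) over tracking error."""
    active = [r - b for r, b in zip(returns, benchmark)]
    return statistics.mean(active) / statistics.stdev(active) * math.sqrt(TRADING_DAYS)


def beta(returns, benchmark):
    """Sample covariance with the benchmark divided by benchmark variance."""
    mean_r = statistics.mean(returns)
    mean_b = statistics.mean(benchmark)
    cov = sum((r - mean_r) * (b - mean_b) for r, b in zip(returns, benchmark))
    cov /= len(returns) - 1
    return cov / statistics.variance(benchmark)


# Made-up daily returns. The "algo" is simply the benchmark levered 2x.
benchmark = [0.01, -0.02, 0.015, -0.005, 0.02]
algo = [2 * b for b in benchmark]

print("beta:             %.2f" % beta(algo, benchmark))
print("algo Sharpe:      %.2f" % sharpe_ratio(algo))
print("benchmark Sharpe: %.2f" % sharpe_ratio(benchmark))
print("information ratio: %.2f" % information_ratio(algo, benchmark))
```

In this toy case the levered series has a beta of 2 but exactly the same Sharpe ratio as the benchmark: leverage scales return and risk together, which is the sense in which market overexposure alone does not indicate skill.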

------
aneth4
Has anyone consistently beaten the market with a home grown trading algorithm?

I suspect it's possible but probably needs more signal than just stock price.

~~~
KMag
Given the number of people trying to beat the market with their personal
algos, at least a few are certain to beat the market for many years, even if
their strategies were no better than random.

Edit: I don't mean to suggest that it can't be done. I'm just suggesting that
the existence of people beating the market may not be a good indicator of your
ability to beat the market.
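
A back-of-the-envelope version of this argument, under assumed numbers: 100,000 independent traders, each with a 50% chance of beating the market in any given year. Both figures are illustrative assumptions, not data.

```python
def expected_lucky_streaks(num_traders, years, p_beat=0.5):
    """Expected number of traders who beat the market every year by chance."""
    return num_traders * p_beat ** years


def prob_at_least_one(num_traders, years, p_beat=0.5):
    """Probability that at least one trader has an unbroken streak by chance."""
    p_streak = p_beat ** years
    return 1 - (1 - p_streak) ** num_traders


expected = expected_lucky_streaks(100_000, 10)
at_least_one = prob_at_least_one(100_000, 10)

print("expected 10-year winners by luck alone: %.1f" % expected)
print("chance that at least one exists: %.6f" % at_least_one)
```

Under these assumptions, about 98 traders would show a perfect ten-year record with no edge at all, so a long winning streak on its own says little about whether a strategy works.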

------
nolite
Sticking around for a 97% drawdown? Sorry, doing this in real life would be
more stupid than anything.

~~~
unreal37
One point in the graph shows a -144% return. How do you even get a -144%
return? Debt?

~~~
dunster
Yeah. This algo is highly leveraged - like 15X. It's possible to really lose
your shirt if you trade this algo exactly.

Taibo's algo is interesting as a starting point. It's not one that you
just take off the shelf and start trading with. But, you can take it and learn
from it and develop an alternative strategy. Presumably one with less risk!
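
To see how leverage produces a return below -100%, here is a simplified sketch. It uses the 15X figure from the comment above and ignores financing costs and margin calls, both of which would matter in practice; the -9.6% move is chosen only because it reproduces the -144% figure, not because it appears in the backtest.

```python
def return_on_equity(asset_return_pct, leverage):
    """Return on the trader's own capital for a given move in the position.

    Simplified: ignores borrowing costs and assumes no margin call.
    """
    return asset_return_pct * leverage


def wipeout_move(leverage):
    """Percentage drop in the position that loses 100% of the equity."""
    return 100.0 / leverage


loss = return_on_equity(-9.6, 15)
wipeout = wipeout_move(15)

print("return on equity after a -9.6%% move at 15X: %.1f%%" % loss)
print("drop that wipes out all equity at 15X: %.2f%%" % wipeout)
```

At 15X, a drop of under 7% in the positions is enough to lose all of the equity, and anything beyond that is money owed to the lender.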

~~~
meson2k
Dunster, any clue if Quantopian supports option trading?

------
dcnstrct
Is it possible to bring in your own data sets? It seems the more interesting
algorithms would be those that merge the pure financial volume data with
outside sources like news, social data, weather patterns, or other predictors
relevant to individual stock prices.

I do data mining work (not in finance) and often the best signal is the one
missing. Often more prediction value is gained from additional feature
construction and the layering of more interesting data sources onto the
problem than simply a better algorithm on the data at hand.

------
nsxwolf
So, I run this, and it makes me a bunch of free money?

~~~
Hansi
No, look at the drawdowns.

------
karamazov
Quantopian is a great tool. I've played around with it, and all I can say is I
wish it had existed when I was day-trading.

