
Show HN: Bateman, a stock trading system I'm working on - henning
https://github.com/fearofcode/bateman
======
minimax
"And, in fact, every single one of our simulated trades was profitable, even
while the stock overall was going up and down."

This is a huge red flag with respect to the simulation results. You show some
trades like this one

> 2013-03-05 6:41,2013-03-05 7:15,14.47,14.48,LONG,5192,75145.75,52.85,100237.47

where you enter the trade and exit a penny higher. It sounds like you're just
looking at the trade print and assuming you can execute at that price with a
market order (or marketable limit order). Consider a stock at 14.47 bid x
14.48 ask. If I cross the spread to sell at 14.47 and then someone else
crosses the spread to buy at 14.48, you will see two trades at the two
different prices without the prices on the inside having changed. This is why
the midpoint between bid and ask is considered a more useful value than the
last trade price.

With the system you propose, you are a price taker. You are crossing the
spread with both your entering and exiting trades. Most of the trades you show
are for around 5000 shares. Assuming the spread is $0.01, you are going to
spend $100 just to get in and out of the position.
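
To make that arithmetic concrete, here is a minimal sketch comparing the fill the backtest seems to assume with what a price taker would get on the same 14.47 x 14.48 quote. The function name and the round 5000-share size are illustrative assumptions, not anything taken from Bateman:

```python
from decimal import Decimal

def round_trip_pnl(shares, buy_price, sell_price):
    """Gross P&L of buying and then selling `shares`, ignoring fees."""
    return shares * (sell_price - buy_price)

bid, ask = Decimal("14.47"), Decimal("14.48")
shares = 5000  # roughly the size of the trades shown

# What the backtest appears to assume: buy at the lower print, sell at the higher.
backtest_pnl = round_trip_pnl(shares, buy_price=bid, sell_price=ask)

# What a price taker gets on an unchanged quote: buy at the ask, sell at the bid.
taker_pnl = round_trip_pnl(shares, buy_price=ask, sell_price=bid)

# How far the realistic fill falls short of the simulated one.
gap = backtest_pnl - taker_pnl
print(backtest_pnl, taker_pnl, gap)
```

Measured against the optimistic backtest, the round trip comes out $100 worse; the trade itself loses the $0.01 spread on every share.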

I don't know what kind of data Google offers about the intraday state of the
order book, but I think you'll need to incorporate it into your backtesting in
order to get a better picture about the profitability of your strategy.

~~~
henning
Thanks for this. I hadn't looked at that carefully enough and need to think
more about it.

~~~
fawce
If you are up for porting Bateman to Python, over at
<https://www.quantopian.com> we let you backtest with high quality intraday
data for free. You can also reference our open-source backtesting engine,
<http://zipline.io>, to see how we handled modeling slippage and order
simulation.

------
niggler
" It produces profitable simulated results on historical data"

I understand this is an intellectual exercise, but for those considering going
into algorithmic trading, those words are dangerous:

\- what transaction fee model is being used? Almost all profitable day trading
strategies trade so often that the profits and adverse selection reserve are
decimated by commissions and taxes.

\- have you tried to approximate the presence of your own trade? For example,
if you sell a boatload of google shares, the price will start falling. Even
with more liquid issues like AA it doesn't take much (5K shares) to rock the
boat.

\- Have you considered the spread? It's unprofitable to quote a penny spread
on google or other high-dollar names (the SEC tax alone, roughly $25 per $1M
sold, doesn't allow for really profitable market making without at least 3
cent spreads).
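
As a rough illustration of the fee and spread points, the sketch below nets commissions and the SEC fee out of one long round trip. The $0.005 per-share commission is a hypothetical figure chosen only for illustration, the $25 per $1M sold is the rough number quoted above, and market impact and taxes are ignored entirely:

```python
def net_long_pnl(shares, entry, exit_price,
                 commission_per_share=0.005,   # hypothetical broker rate
                 sec_fee_per_million=25.0):    # rough figure from the comment
    """Net P&L of a long round trip after commissions and the SEC sale fee."""
    gross = shares * (exit_price - entry)
    commissions = 2 * shares * commission_per_share  # paid on entry and exit
    sale_proceeds = shares * exit_price
    sec_fee = sale_proceeds * sec_fee_per_million / 1_000_000
    return gross - commissions - sec_fee

# The one-penny winner shown earlier in the thread: 5192 shares, 14.47 -> 14.48.
result = net_long_pnl(5192, 14.47, 14.48)
print(round(result, 2))
```

Under these assumed costs the penny of gross profit per share is gone before the spread or market impact is even considered.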

There are many more questions, and for each question there are hedge funds and
prop shops that have lost significant amounts of money, or were driven out of
business, due to an oversight.

------
Choronzon
Henning, if you are going to do some computer optimisation, please read White's
reality check. I applaud your effort, but I think the trading strategy as
described might unfortunately be too simple, and historical optimisation can
make anything look profitable. In an ideal strategy the edge would be present
due to an extrapolation of platonistic assumptions, and optimisation would
further tune it rather than produce the profits by itself. Also, AAPL is a
lousy stock to simulate anything with apart from itself, as it's basically a
complete outlier in terms of statistical factors (standard deviation, exposure
to news, etc.).

~~~
cschmidt
Wow, I've never seen anyone else who knew about White's reality check.
Unfortunately, it is patented (US Patent 5,893,069). There's also a related
thing called the test for Superior Predictive Ability (SPA), by Peter Reinhard
Hansen, which is not patented (afaik). We liked it better for other reasons as
well.

~~~
Choronzon
I didn't know that it was patented, thanks for the info. There go my plans for
githubbing a small pandas (Python) based version of same. I'll also look into
the SPA now. People also need to consider the Kelly criterion, and abandoning
technical analysis as "not even wrong". Signal processing is TA for grownups,
but even then people have to bear in mind that the assumption of a linear
system (a default assumption for signal processing) is generally incorrect as
far as the market is concerned. I'm in the strange position of having a
potentially very viable trading model myself and no time or funds to trade
with as yet. Potentially a better position than the converse, however.

------
waterlesscloud
Someone should hire you just for your ability to write documentation for your
projects.

~~~
twfarland
Someone should hire you just for your ability to name your projects. "How'd a
nitwit like you get so tasteful?"

------
_this
Writing a framework for running trading strategies is certainly an interesting
idea. I too am dissatisfied with most commercial platforms due to lack of
features and flexibility. Unfortunately, there seems to be a lack of open
source code in this sector. Eclipse Trader looked kind of interesting but the
project appears to be dormant now. So expanding on this project could fill
that gap.

However, from experience developing and testing algorithmic trading systems I
can tell you that your strategy probably has some issues in its current form.
I haven't looked into the code but from your description it appears you
(correct me if I'm wrong): 1.) Pick a stock 2.) Use PSO to figure out the
parameters 3.) If profitable, run the strategy on the stock with the optimised
parameters

This means you're making a well-known error in the system development
community: curve fitting parameters to historical data. This'll look
very good in the simulations, but there is a high probability that it will
break down when trading it forward with real money, because it is optimised
for the past. This is why there are a couple of widely accepted best practices
when it comes to developing and testing trading systems.

First of all, your system should not have or need too many parameters. As a
rule of thumb a robust system shouldn't have more than a handful of parameters
and it should ideally show profits in simulations without a great deal of
optimisation on those. When optimising make sure that the optimised parameter
values are robust. This means that changing the value by a small increment
only changes the resulting performance of your system by a small margin
(somewhat analogous to numerical stability). If the performance changes by a
big margin, then those values aren't robust and should be discarded.
Furthermore, don't run optimisation on all of your historical data. Instead,
optimise on a portion of that data (the 'in-sample' data) and then test the
optimised parameter values on the more recent data you didn't optimise on
(the 'out-of-sample' data) and see if the performance of your system stays the
same or breaks down. Another popular approach is 'Walk forward optimisation'
[1] which takes the above one step further by repeatedly optimising and
forward-testing on your historical data to find robust parameter values.
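
A minimal sketch of how walk-forward windows could be generated, in plain Python. The function name and window sizes are made up for illustration and have nothing to do with Bateman's code; the optimiser (PSO or otherwise) would run on each in-sample range and be scored only on the out-of-sample range that follows it:

```python
def walk_forward_windows(n_bars, train_size, test_size):
    """Yield (in_sample, out_of_sample) index ranges that roll forward in time."""
    start = 0
    while start + train_size + test_size <= n_bars:
        in_sample = range(start, start + train_size)
        out_of_sample = range(start + train_size,
                              start + train_size + test_size)
        yield in_sample, out_of_sample
        start += test_size  # slide forward by one out-of-sample block

windows = list(walk_forward_windows(n_bars=10, train_size=4, test_size=2))
for in_sample, out_of_sample in windows:
    # Optimise on in_sample only, then evaluate untouched on out_of_sample.
    print(list(in_sample), "->", list(out_of_sample))
```

Every out-of-sample range starts strictly after its in-sample range ends, so no parameter is ever scored on data it was fitted to.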

Some other things to consider: You need to factor in transaction costs, spread
and slippage (the difference between the price you enter the order at and the
price at which you get the fill). Transaction costs are easy to determine.
Spread and slippage only apply when using market orders and can be reduced by
trading with limit orders if your system isn't negatively affected by this.
Trading with market orders in a fast-moving market may incur significant
slippage and there are predatory HF algos out there making money from screwing
you on your execution. To get a better sense of this, it is considered a best
practice to run your simulations on a lower timeframe than the one your system
is supposed to work on in order to eliminate inaccuracies in the results.
Ideally, you run simulations against unfiltered tick-by-tick data and
additionally use bid and ask data series to factor in the spread. This may,
however, be overkill and not needed for a system that runs on a daily
timeframe, but it may make all the difference for a faster system.

[1] <http://en.wikipedia.org/wiki/Walk_forward_optimization>

~~~
pja
This man speaks the truth.

Particularly running the optimise & test routine on chunks of past data to
test for robustness & over-fitting.

------
dunster
That's a very interesting system you've built.

The thing about backtesting a strategy is that it is very easy to make a
mistake in your backtester. Look ahead bias is the most common mistake.
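
A toy illustration of look-ahead bias, as a sketch with made-up prices and a made-up signal (nothing here comes from Bateman or zipline). The signal for bar t is computed from bar t's close, so a backtester that lets it trade bar t's own price move is using information it could not have had:

```python
def pnl(closes, signals, lag):
    """Sum of per-bar P&L where the position on bar t uses signals[t - lag]."""
    total = 0.0
    for t in range(1, len(closes)):
        src = t - lag
        position = signals[src] if src >= 0 else 0
        total += position * (closes[t] - closes[t - 1])
    return total

closes = [100.0, 101.0, 100.0, 102.0, 101.0]
# Signal on bar t: go long if bar t closed higher than bar t - 1.
signals = [0] + [1 if closes[t] > closes[t - 1] else 0
                 for t in range(1, len(closes))]

biased = pnl(closes, signals, lag=0)  # trades the move the signal was built from
honest = pnl(closes, signals, lag=1)  # trades only the following bar
print(biased, honest)
```

With lag 0 the strategy collects every up move and looks like a sure winner; with the signal delayed by one bar the same rule loses money on this series.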

Another challenge is the data. Are you testing against a history of stocks
that includes bankruptcies? If not you have survivorship bias.

I suggest you take a look at my website, www.quantopian.com. Look at our
open-sourced backtester, <https://github.com/quantopian/zipline>. Between the two
we can help you get past those two sources of error.

------
matt__ring
I trade real money on a similar system, but using genetic algorithms. I worked
on it for maybe 4 years before putting money on it in 2013. So far, I'm up
1.05% on $284K traded.

I wrote my own fairly pessimistic backtester. Also, I currently use the
generated models for buy signals, but I tend to sell earlier than they
dictate, because I find it hard to turn down even a modest profit.

~~~
objclxt
Out of interest, is that better or worse than you were expecting? You don't
say _when_ in 2013 you started, but since the start of the year the S&P is up
around 6%, the NASDAQ composite is up 4.5%, and the FTSE All Share is up a
similar amount.

Assuming you did put the money in in January you would have probably made more
profit by simply investing in an index tracker.

~~~
matt__ring
True, I'm losing to the market, since January. But 'the market' is
unpredictable, unrealized gains on paper. Unless you sell your index tracking
positions, you will, in the next couple years, see some nice unrealized losses
on paper.

My gains are realized. And fairly predictable.

Also, I only have about $55k in capital. But my rapid buy/sell cycle (not HFT)
allows me to keep 'reusing' it.

~~~
pja
So you're up 5% on capital employed. Well, if you can sustain that, you'll be
doing pretty well!
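
The 5% figure can be reproduced roughly as follows, assuming "up 1.05% on $284K traded" means profit as a fraction of total turnover and that the $55k of capital was fully deployed (both are readings of the comments above, not stated facts):

```python
traded = 284_000         # total dollar volume traded so far
gain_on_traded = 0.0105  # 1.05% profit on that volume
capital = 55_000         # capital being reused across trades

profit = traded * gain_on_traded
return_on_capital = profit / capital
print(round(profit), f"{return_on_capital:.1%}")
```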

------
plg
I've toyed with this kind of thing before and things can look good in
simulation, until I incorporated realistic effects of:

\- trading commissions & fees

\- capital gains taxes

\- currency exchange (for those of us "unamericans")

then all of a sudden blammo, I decided I'm better off putting "investment"
cash into my mortgage

~~~
jrockway
Might be a good idea if you are paying 12% interest on your mortgage, but not
such a good idea if you're paying 3%.

I don't know what country you're in, but if you want to invest in the US stock
market, I'm sure your broker sells an index fund with very low management
fees. That may be a better investment than putting everything into a single
home in the current market.

(In the US, mortgage interest is tax-deductible, making this strategy even
less of a good idea.)

~~~
plg
Mortgage interest not deductible here (Canada). Our current mortgage rate is
3.8% so given a marginal tax rate of 46% I would need to have a guaranteed
return of at least 8.3% just to match what I would save by "investing" in my
mortgage. You tell me where I can get a 8.3% return over the next 5 yrs,
guaranteed. PS here in Canada (maybe exception is Vancouver) our housing
market is v stable, in my City house prices did not even dip during 2008
crisis. Yay Canada!

------
fchollet
I hate you, because how am I going to find a name for my project now, after
you set the bar so high. Argh...

All jokes aside, this sounds promising. Will your framework be flexible enough
to allow for composing and running any ML/statistical algorithm?

------
jrvarela56
Sorry if I'm missing something key here, but why does the implementation of a
stock trading system involve a particular trading strategy?

------
spitfire
Whenever I hear of an amateur built trading system, the first question that
always comes to mind is: What's your risk and money management like?

It's easy to come up with an algorithm, but where the real pros win is with
risk management.

EDIT: This isn't meant to be negative at all. Keep up the good work, but know
that you're just starting down the rabbit hole now.

------
radikalus
Interesting practice project; I think the distinction between algorithmic
trading and HFT is largely one of semantics.

I think we just generally label the higher latency, lower frequency strategies
old fashioned 'stat-arb' -- this better connotes that the 'edge' from these
trades is derived out of superior mathematical modeling and not better
execution.

I highly recommend decoupling the simulation/back-testing framework from your
strategy work.

Building a reasonable simulator is no small task -- market data is often NOT
an accurate depiction of what is executable and there's a fair number of
corrections/assumptions that must be made to reflect this.

Either way, I've always liked PSO -- I highly recommend playing with DE
(differential evolution) as well. I've used DE to tune parameters for many
strategies with great success where SGD would have been, well, painful.

------
DrJokepu
Expect a letter from a lawyer representing DC Comics demanding that you rename
your product. An acquaintance of mine also launched a product (entirely
unrelated to comics and toys) with the same name recently and was asked by DC
Comics to change the name.

~~~
random42
I am curious: on what basis was he asked to change the name? As I understand
it, it is certainly not trademark infringement if the products don't belong to
the same business domain.

~~~
DrJokepu
Obviously their point was that Bateman was too close to Batman. Not being a
trademarks expert, I can't comment on whether it was a trademark infringement
or not, but that is beside the point. Do you want to spend time and money
litigating with DC Comics?

~~~
irahul
> Obviously their point was that Bateman was too close to Batman.

Their point is moot. DC Comics will have to go after people who own 'Patrick
Bateman'. This isn't infringement (nobody is going to confuse a trading
software named Bateman with a book/movie character); even if it were, the
owners of 'Patrick Bateman' and not DC Comics can claim infringement. At most,
he will have to remove the image of Christian Bale as Patrick Bateman, but I
doubt it will come to it.

Also, batman.js is in business, and DC Comics didn't go after them. On what
grounds did they go after your friend? Naming your software batman or
superman isn't infringing DC's trademarks.

------
pnathan
Spiffy! A buddy and I have been working on a similar thing (not open sourced
(yet?)) for a while off and on. The model we're going to shoot for is to look
at online opinion on the stock as a base for estimating rise/fall.

~~~
niggler
Derwent Capital Markets (website down now) tried to do this with twitter, but
ultimately collapsed:

<http://www.idsnews.com/news/story.aspx?id=80469>
<http://venturebeat.com/2012/05/28/twitter-fueled-hedge-fund-bit-the-dust-but-it-actually-worked/>

~~~
pnathan
Interesting. The venturebeat story suggests that the strategy works, but the
company's web presence is dead. Hmm.

------
trotsky
A practice that can improve results in intraday program trades is to only
trade with the market, or only trade the direction that coincides with your
meta analysis. In this case that might translate into only taking positions on
days the market futures are up (above a threshold) pre-market and/or only
trading tickers that you expect to rise over a 6-12 month timeframe.

While far from a silver bullet these can help you avoid buying positions when
the market is hopeless, or trying to play a statistically losing game. It
might be worth testing.

Make sure you include all trading fees and software license costs in your
models.

------
pg1337
This looks interesting, and I'm going to try back testing this.

I would suggest trying out-

1\. market neutral positions. E.g. if you go long AAPL, go short XLK (the tech
stock ETF).

2\. going long slightly out-of-the-money call options instead of stock (if the
options are liquid).

I mostly do stat arb, and for back-testing I tried Metatrader,
Quantopian and several other platforms and didn't think any of them were
suitable. FXCM's Strategy Trader is worth taking a look at. It can only trade
forex live through FXCM, but you can import CSV data and back test on whatever
you import.

------
lampooned
Does anyone think someone extending this to place trades via API would be that
horrible of an idea? Seems like it'd be interesting.

------
blt
Really good readme. Explains well without hype. I like the name too :)

------
yarou
Interesting project. I haven't had the time to look at your project in-depth,
but keep in mind most retail traders are at a disadvantage due to routing.
Most orders are routed through providers such as Getco or Knight Capital, and
often you are not aware of who your counterparty is. Therefore, any expected
P&L may vary by x amount of basis points, due to latency and liquidity.

~~~
3327
Alright, I don't know how many "professional" traders there are on HN, but I
spent a good part of my life doing prop algo-trading on Wall Street. The
assumption that most trading systems are in VBA is perhaps valid for small or
retail shops, but I can tell you that serious algo shops (trading desks, hedge
funds) run clusters of GPUs harnessing CUDA with MATLAB and R plugins, C++, etc.
When/if you try this with real money in the real world, you will encounter
something called "slippage" - slippage is the price the trade gets executed vs.
where you thought it would be. This is due to various factors like the bid-ask
spread, liquidity in that moment, order size, route, vol, etc.; you get the
idea. As you said, this is an algo system, and in your description you do
emphasize the difference between HF and algo. The strategy is simple and could
work as well as any other strategy, but any algo system like this needs lots of
capital to make a decent profit, and when lots of capital is in question it
needs to be redundant when things go haywire - and haywire go they shall. So
the key assumption seems to be:

>buy stocks that are going up intraday and sell them higher

To be blunt, it seems interesting, and I would encourage you to work on it
further because complex systems rarely work right. But also bear in mind that
it's not any different than a candlestick trading strategy or Ichimoku
clouds. The key is how fast you converge to usable parameters
(time, money, trades); if you can do this fast, the underlying strategy can be
a multitude of things. And that's my 2 cents...

~~~
minimax
"slippage is the price the trade gets executed vs. where you thought it would
be."

Anyone not using limit orders in an automated trading system deserves whatever
they get.

~~~
chollida1
That's common wisdom for the Jim Cramer crowd and the reason it's so well
known is that it's good advice for the common person who doesn't understand
the market.

Sometimes a trading system just needs to get an order done. Having too many
busted pairs can really cause your system to become ineffective fast.

Or put another way, limit orders are your best bet, unless they aren't :)

~~~
minimax
I'm not from the Jim Cramer crowd and it doesn't sound like you are either. If
you are sending orders at the best bid/offer and you can't get filled it's
probably because you're too slow. I suppose you could send a marketable limit
order in at a price deeper in the book and let the exchange match it at the
best price available, but that's still different from outright market (i.e. no
limit) orders.
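
A rough sketch of that difference, using a hypothetical three-level ask book (the prices, sizes and function name are invented for illustration; a real exchange matching engine is far more involved). A marketable limit buy takes liquidity from the best price upward but never pays more than its limit:

```python
def fill_buy_limit(asks, limit_price, quantity):
    """Match a buy limit order against asks sorted best (lowest) price first.

    Returns (fills, unfilled_quantity), where fills is a list of (price, size).
    """
    fills = []
    remaining = quantity
    for price, size in asks:
        if remaining == 0 or price > limit_price:
            break  # order done, or the next level is beyond our limit
        take = min(size, remaining)
        fills.append((price, take))
        remaining -= take
    return fills, remaining

asks = [(14.48, 300), (14.49, 500), (14.50, 1000)]  # hypothetical book
fills, unfilled = fill_buy_limit(asks, limit_price=14.49, quantity=1000)
print(fills, unfilled)
```

Here 800 shares fill at 14.48 and 14.49 and the last 200 stay unfilled rather than paying 14.50, which is what an outright market order would have done.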

------
notdrunkatall
Is it profitable?

