If you are up for porting Bateman to Python, over at https://www.quantopian.com we let you backtest with high-quality intraday data for free. You can also reference our open-source backtesting engine, http://zipline.io, to see how we handled modeling slippage and order simulation.
"It produces profitable simulated results on historical data"
I understand this is an intellectual exercise, but for those considering going into algorithmic trading, those words are dangerous:
- What transaction fee model is being used? Almost all profitable day trading strategies trade so often that the profits and adverse selection reserve are decimated by commissions and taxes.
- Have you tried to approximate the presence of your own trade? For example, if you sell a boatload of Google shares, the price will start falling. Even with more liquid issues like AA it doesn't take much (5K shares) to rock the boat.
- Have you considered the spread? It's unprofitable to quote a penny spread on Google or other high-dollar names (the SEC tax alone, roughly $25 per $1M sold, doesn't allow for really profitable market making without at least 3 cent spreads).
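To put rough numbers on the fee question, here is a minimal Python sketch of a round-trip cost model. The $0.005 per-share commission is purely an assumed placeholder (plug in your own broker's schedule), and the SEC fee uses the rough $25 per $1M sold figure quoted above; neither is meant as an authoritative rate.

```python
def round_trip_net_pnl(shares, entry_price, exit_price,
                       commission_per_share=0.005,  # assumed placeholder rate
                       sec_fee_per_million=25.0):   # rough figure quoted above
    """Net P&L of a long round trip after commissions and the sell-side SEC fee."""
    gross = shares * (exit_price - entry_price)
    commissions = 2 * shares * commission_per_share  # paid on entry and on exit
    # The SEC fee is charged on the dollar value of the sale only.
    sec_fee = shares * exit_price * sec_fee_per_million / 1_000_000
    return gross - commissions - sec_fee


# A one-cent gain on 5,000 shares, similar in size to the trades discussed here.
net = round_trip_net_pnl(5000, 14.47, 14.48)
print(f"net P&L after fees: {net:.2f}")
```

Under these assumed rates the $50 gross gain is wiped out by $50 of commissions, and the SEC fee pushes the trade slightly negative, before spread or market impact is even considered.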
There are many more questions, and for each question there are hedge funds and prop shops that have lost significant amounts of money, or were driven out of business, due to an oversight.
Henning,
If you are going to do some computer optimisation, please read White's Reality Check. I applaud your effort, but unfortunately I think the trading strategy as described might be too simple, and historical optimisation can make anything profitable. In an ideal strategy the edge would be present due to an extrapolation of platonistic assumptions, and optimisation would further tune this rather than produce profits in itself.
Also, AAPL is a lousy stock to simulate anything with apart from itself, as it's basically a complete outlier in terms of statistical factors (std deviation, exposure to news, etc.).
Wow, I've never seen anyone else who knew about White's reality check. Unfortunately, it is patented (US Patent 5,893,069). There's also a related thing called the test for Superior Predictive Ability (SPA), by Peter Reinhard Hansen, which is not patented (afaik). We liked it better for other reasons as well.
I didn't know that it was patented, thanks for the info. There go my plans for githubbing a small pandas (Python) based version of same. I'll also look into the SPA now. People also need to consider the Kelly criterion, and abandoning Technical Analysis as "not even wrong". Signal Processing is TA for grownups, but even then people have to bear in mind that the assumption of a linear system (a default assumption for signal processing) is generally incorrect as far as the market is concerned.
I'm in the strange position of having a potentially very viable trading model myself and no time or funds to trade with as yet. Potentially a better position than the converse, however.
Writing a framework for running trading strategies is certainly an interesting idea. I too am dissatisfied with most commercial platforms due to lack of features and flexibility. Unfortunately, there seems to be a lack of open source code in this sector. Eclipse Trader looked kind of interesting but the project appears to be dormant now. So expanding on this project could fill that gap.
However, from experience developing and testing algorithmic trading systems I can tell you that your strategy probably has some issues in its current form. I haven't looked into the code but from your description it appears you (correct me if I'm wrong):
1.) Pick a stock
2.) Use PSO to figure out the parameters
3.) If profitable, run the strategy on the stock with the optimised parameters
This means you're making a well-known error in the system development community, which is curve fitting parameters to historical data. This'll look very good in the simulations, but there is a high probability that it will break down when trading it forward with real money, because it is optimised for the past. This is why there are a couple of widely accepted best practices when it comes to developing and testing trading systems.
First of all, your system should not have or need too many parameters. As a rule of thumb, a robust system shouldn't have more than a handful of parameters, and it should ideally show profits in simulations without a great deal of optimisation on those. When optimising, make sure that the optimised parameter values are robust. This means that changing the value by a small increment only changes the resulting performance of your system by a small margin (somewhat analogous to numerical stability). If the performance changes by a big margin, then those values aren't robust and should be discarded. Furthermore, don't run optimisation on all of your historical data. Instead, optimise on a portion of that data (the 'in-sample' data) and then test the optimised parameter values on the more recent data you didn't optimise on (the 'out-of-sample' data) and see if the performance of your system stays the same or breaks down. Another popular approach is 'Walk forward optimisation' [1], which takes the above one step further by repeatedly optimising and forward-testing on your historical data to find robust parameter values.
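As a rough illustration of the in-sample / out-of-sample idea and of walk-forward testing, here is a minimal Python sketch. The function names, the toy `momentum_score` strategy and the window sizes are all made up for the example; the parameter search is a plain maximum over a few candidates, standing in for PSO or whatever optimiser you actually use.

```python
def walk_forward_windows(n_bars, in_sample, out_of_sample):
    """Yield (train, test) index ranges that roll forward through the history."""
    start = 0
    while start + in_sample + out_of_sample <= n_bars:
        train = range(start, start + in_sample)
        test = range(start + in_sample, start + in_sample + out_of_sample)
        yield train, test
        start += out_of_sample  # advance by one out-of-sample block


def momentum_score(prices, lookback):
    """Toy score: P&L of holding one share for a bar after a rise over `lookback` bars."""
    pnl = 0.0
    for t in range(lookback, len(prices) - 1):
        if prices[t] > prices[t - lookback]:
            pnl += prices[t + 1] - prices[t]
    return pnl


def walk_forward(prices, candidates, score, in_sample, out_of_sample):
    """Optimise on each in-sample window, then score that choice out of sample."""
    results = []
    for train, test in walk_forward_windows(len(prices), in_sample, out_of_sample):
        train_prices = [prices[i] for i in train]
        test_prices = [prices[i] for i in test]
        best = max(candidates, key=lambda p: score(train_prices, p))
        results.append((best, score(test_prices, best)))
    return results


# Hypothetical price series, just to exercise the code.
prices = [10.0, 10.1, 10.3, 10.2, 10.4, 10.6, 10.5, 10.4, 10.6, 10.9,
          10.8, 11.0, 11.1, 10.9, 11.2, 11.4, 11.3, 11.5, 11.4, 11.6]
for best, oos_pnl in walk_forward(prices, [1, 2, 3], momentum_score, 8, 4):
    print(f"chosen lookback={best}, out-of-sample P&L={oos_pnl:.2f}")
```

The number to look at is the out-of-sample P&L of each window, not the in-sample score the parameter was chosen on. If the chosen parameter jumps around from window to window, that is a hint it isn't robust.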
Some other things to consider: You need to factor in transaction costs, spread and slippage (the difference between the price you enter the order at and the price at which you get the fill). Transaction costs are easy to determine. Spread and slippage only apply when using market orders and can be reduced by trading with limit orders if your system isn't negatively affected by this. Trading with market orders in a fast-moving market may incur significant slippage, and there are predatory HF algos out there making money from screwing you on your execution. To get a better sense of this, it is considered a best practice to run your simulations on a lower timeframe than the one your system is supposed to work on in order to eliminate inaccuracies in the results. Ideally, you run simulations against unfiltered tick-by-tick data and additionally use bid and ask data series to factor in the spread. This may, however, be overkill and not needed for a system that runs on a daily timeframe, but it may make all the difference for a faster system.
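A minimal sketch of what using bid and ask series in a simulator might look like, in Python. The `Quote` class and both function names are invented for the example, and it assumes the order is small enough to fill entirely at the top of the book, which will not hold for larger orders or thin markets.

```python
from dataclasses import dataclass


@dataclass
class Quote:
    bid: float
    ask: float
    last: float  # price of the most recent trade print


def market_fill_price(side, quote):
    """Price a small market order would get: buys lift the ask, sells hit the bid."""
    if side == "buy":
        return quote.ask
    if side == "sell":
        return quote.bid
    raise ValueError(f"unknown side: {side!r}")


def slippage_vs_last(side, quote):
    """Per-share slippage relative to the last trade print (positive means worse)."""
    fill = market_fill_price(side, quote)
    return fill - quote.last if side == "buy" else quote.last - fill


# Hypothetical quote: the last print happened on the bid side.
q = Quote(bid=14.47, ask=14.48, last=14.47)
for side in ("buy", "sell"):
    print(side, market_fill_price(side, q), round(slippage_vs_last(side, q), 4))
```

A backtester that fills every order at `last` would miss the cent paid on the buy here. Filling at the quote instead is a cheap first step before modelling order size or queue position.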
The thing about backtesting a strategy is that it is very easy to make a mistake in your backtester. Look-ahead bias is the most common mistake.
Another challenge is the data. Are you testing against a history of stocks that includes bankruptcies? If not, you have survivorship bias.
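To show how small a look-ahead bug can be, here is a toy Python sketch. The "up bar" signal and the price list are made up for illustration; the only difference between the two modes is which bar the signal reads.

```python
def backtest(closes, lookahead=False):
    """Hold one share over bar t -> t+1 whenever the 'up bar' signal is on.

    With lookahead=False the signal uses only closes[t] and earlier.
    With lookahead=True it peeks at closes[t + 1], the classic bug.
    """
    pnl = 0.0
    for t in range(1, len(closes) - 1):
        if lookahead:
            signal = closes[t + 1] > closes[t]  # uses a price not yet known at t
        else:
            signal = closes[t] > closes[t - 1]  # uses only past and current data
        if signal:
            pnl += closes[t + 1] - closes[t]
    return pnl


# Hypothetical choppy price series.
closes = [10.0, 10.2, 10.1, 10.3, 10.2, 10.4, 10.3]
print("honest:  ", round(backtest(closes), 2))
print("peeking: ", round(backtest(closes, lookahead=True), 2))
```

The peeking version can never lose by construction, since it only holds over bars it already knows went up. In a real backtester the same bug usually hides in an off-by-one index or in an indicator computed on the bar's close and traded at the same bar's open.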
I suggest you take a look at my website, www.quantopian.com. Look at our open-sourced backtester, https://github.com/quantopian/zipline. Between the two we can help you get past those two sources of error.
I trade real money on a similar system, but using genetic algorithms. I worked on it for maybe 4 years before putting money on it in 2013. So far, I'm up 1.05% on $284K traded.
I wrote my own fairly pessimistic backtester. Also, I currently use the generated models for buy signals, but I tend to sell earlier than they dictate, because I find it hard to turn down even a modest profit.
Out of interest, is that better or worse than you were expecting? You don't say when in 2013 you started, but since the start of the year the S&P is up around 6%, the NASDAQ composite is up 4.5%, and the FTSE All Share is up a similar amount.
Assuming you did put the money in in January you would have probably made more profit by simply investing in an index tracker.
True, I'm losing to the market, since January. But 'the market' is unpredictable, unrealized gains on paper. Unless you sell your index tracking positions, you will, in the next couple years, see some nice unrealized losses on paper.
My gains are realized. And fairly predictable.
Also, I only have about $55k in capital. But my rapid buy/sell cycle (not HFT) allows me to keep 'reusing' it.
Might be a good idea if you are paying 12% interest on your mortgage, but not such a good idea if you're paying 3%.
I don't know what country you're in, but if you want to invest in the US stock market, I'm sure your broker sells an index fund with very low management fees. That may be a better investment than putting everything into a single home in the current market.
(In the US, mortgage interest is tax-deductible, making this strategy even less of a good idea.)
Mortgage interest is not deductible here (Canada). Our current mortgage rate is 3.8%, so given a marginal tax rate of 46% I would need a guaranteed return of about 7% (3.8% / (1 - 0.46)) just to match what I would save by "investing" in my mortgage. You tell me where I can get a 7% return over the next 5 yrs, guaranteed. PS here in Canada (maybe exception is Vancouver) our housing market is v stable; in my city house prices did not even dip during the 2008 crisis. Yay Canada!
You're quite right about trading fees and taxes, but what's that about being non-American? Practically all developed countries have a functioning stock exchange.
Interesting practice project; I think the distinction between algorithmic trading and HFT is largely one of semantics.
I think we just generally label the higher latency, lower frequency strategies old fashioned 'stat-arb' -- this better connotes that the 'edge' from these trades is derived out of superior mathematical modeling and not better execution.
I highly recommend decoupling the simulation/back-testing framework from your strategy work.
Building a reasonable simulator is no small task -- market data is often NOT an accurate depiction of what is executable and there's a fair number of corrections/assumptions that must be made to reflect this.
Either way, I've always liked PSO -- I highly recommend playing with DE (differential evolution) as well. I've used DE to tune parameters for many strategies with great success where SGD would have been, well, painful.
Expect a letter from a lawyer representing DC Comics demanding you to rename your product. An acquaintance of mine also launched a product (entirely unrelated to comics and toys) with the same name recently and was asked by DC Comics to change the name.
I am curious, on what basis was he asked to change the name? As per my understanding, it is certainly not a trademark infringement if it does not belong to the same business domain.
Obviously their point was that Bateman was too close to Batman. Not being a trademarks expert, I can't comment on whether it was a trademark infringement or not, but that is beside the point. Do you want to spend time and money litigating with DC Comics?
> Obviously their point was that Bateman was too close to Batman.
Their point is moot. DC Comics will have to go after people who own 'Patrick Bateman'. This isn't infringement (nobody is going to confuse trading software named Bateman with a book/movie character); even if it were, the owners of 'Patrick Bateman' and not DC Comics can claim infringement. At most, he will have to remove the image of Christian Bale as Patrick Bateman, but I doubt it will come to it.
Also, batman.js is in business, and DC Comics didn't go after them. On what grounds did they go after your friend? Naming your software batman or superman isn't infringing DC's trademarks.
Spiffy! A buddy and I have been working on a similar thing (not open sourced (yet?)) for a while off and on. The model we're going to shoot for is to look at online opinion on the stock as a base for estimating rise/fall.
Reminds me of an article I read a while back that pointed out a correlation between news about Anne Hathaway and an increase in Berkshire Hathaway stock.
Since stock prices are all about speculation and market psychology, it is an interesting approach. I'd be very interested to know how you implement it.
How do you define and track the "online" world though?
> How do you define and track the "online" world though?
Wink wink, nudge nudge, that's the secret sauce. ;-)
But more seriously, the plan of attack right now is just to scrape newsfeeds from news sources via rss.
While of course this is susceptible to bias in the media, my hypothesis is that so are the stocks. A rising tide lifts all ships as has been said: so long as you buy lowish and sell highish you'll do all right[1]
Other routes might be:
- twitter
- stock trading forums
- industry news sites
[1] bear markets demand some kind of shorting strategy I believe.
A practice that can improve results in intraday program trades is to only trade with the market, or only trade the direction that coincides with your meta analysis. In this case that might translate into only taking positions on days the market futures are up (above a threshold) pre-market and/or only trading tickers that you expect to rise over a 6-12 month timeframe.
While far from a silver bullet these can help you avoid buying positions when the market is hopeless, or trying to play a statistically losing game. It might be worth testing.
Make sure you include all trading fees and software license costs in your models.
This looks interesting, and I'm going to try back testing this.
I would suggest trying out:
1. market neutral positions. E.g., if you go long AAPL, go short XLK (the tech stock ETF).
2. going long slightly out of the money call options instead of stock. ( If the options are liquid)
I mostly do stat arb, and for back-testing I tried Metatrader, Quantopian and several other platforms and didn't think any of them were suitable. FXCM's Strategy Trader is worth taking a look at. It can only trade forex live through FXCM, but you can import CSV data and back test on whatever you import.
Interesting project. I haven't had the time to look at your project in-depth, but keep in mind most retail traders are at a disadvantage due to routing. Most orders are routed through providers such as Getco or Knight Capital, and often you are not aware of who your counterparty is. Therefore, any expected P&L may vary by x amount of basis points, due to latency and liquidity.
Alright, I don't know how many "professional" traders there are on HN, but I spent a good part of my life doing prop algo-trading on Wall Street.
The assumption that most trading systems are in VBA is perhaps valid for small or retail shops, but I can tell you that serious algo shops (trading desks, hedge funds) run clusters of GPUs harnessing CUDA with MATLAB and R plugins, C++, etc. When/if you try this with real money in the real world, you will encounter something called "slippage": the difference between the price the trade gets executed at and where you thought it would be. This is due to various factors like the bid-ask spread, liquidity in that moment, order size, route, vol, etc.; you get the idea. As you said, this is an algo system, and in your description you do emphasize the difference between HF and algo. The strategy is simple and could work as well as any other strategy, but any algo system like this needs lots of capital to make a decent profit, and when lots of capital is in question it needs to be redundant when things go haywire, and haywire go they shall. So the key assumption seems to be:
>buy stocks that are going up intraday and sell them higher
To be blunt, it seems interesting, and I would encourage you to work on it further, because complex systems rarely work right. But also bear in mind that it's not any different than a candlestick trading strategy, or Ichimoku clouds. The key is how fast you converge to usable parameters (time, money, trades); if you can do this fast, the underlying strategy can be a multitude of things, and that's my 2 cents...
That's common wisdom for the Jim Cramer crowd and the reason it's so well known is that it's good advice for the common person who doesn't understand the market.
Sometimes a trading system just needs to get an order done. Having too many busted pairs can really cause your system to become ineffective fast.
or put another way, limit orders are your best bet, unless they aren't:)
I'm not from the Jim Cramer crowd and it doesn't sound like you are either. If you are sending orders at the best bid/offer and you can't get filled it's probably because you're too slow. I suppose you could send a marketable limit order in at a price deeper in the book and let the exchange match it at the best price available, but that's still different from outright market (i.e. no limit) orders.
I would assume that if one would deploy a strategy live then it would be done with something like Interactive Brokers. They do allow directed orders although if you cancel them they are not cost effective.
This is a huge red flag with respect to the simulation results. You show some trades like this one:
> 2013-03-05 6:41,2013-03-05 7:15,14.47,14.48,LONG,5192,75145.75,52.85,100237.47
where you enter the trade and exit a penny higher. It sounds like you're just looking at the trade print and assuming you can execute at that price with a market order (or marketable limit order). Consider a stock at 14.47 bid x 14.48 ask. If I cross the spread to sell at 14.47 and then someone else crosses the spread to buy at 14.48, you will see two trades at the two different prices, without the prices on the inside having changed. This is why the midpoint between bid and ask is considered a more useful value than the last trade price.
With the system you propose, you are a price taker. You are crossing the spread with both your entering and exiting trades. Most of the trades you show are for around 5000 shares. Assuming the spread is $0.01, you are going to spend $100 just to get in and out of the position.
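For anyone who wants to plug in their own numbers, here is a small Python sketch of that arithmetic. The function name and the two benchmark conventions are my own framing: counting one full spread per leg (versus a trader who gets filled passively on both sides) gives the $100 above, while measuring against the midpoint gives half of that.

```python
def round_trip_spread_cost(shares, bid, ask, benchmark="passive"):
    """Cost of crossing the spread on both entry and exit, holding quotes fixed.

    benchmark="passive": versus being filled passively on both legs
                         (one full spread per share per leg).
    benchmark="mid":     versus trading at the midpoint on both legs
                         (half a spread per share per leg).
    """
    spread = ask - bid
    if benchmark == "passive":
        per_leg = spread
    elif benchmark == "mid":
        per_leg = spread / 2
    else:
        raise ValueError(f"unknown benchmark: {benchmark!r}")
    return 2 * shares * per_leg


for name in ("passive", "mid"):
    cost = round_trip_spread_cost(5000, 14.47, 14.48, benchmark=name)
    print(f"{name}: ${cost:.2f}")
```

Either way, the cost is on the same order as the one-cent gain the sample trade is counting as profit.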
I don't know what kind of data Google offers about the intraday state of the order book, but I think you'll need to incorporate it into your backtesting in order to get a better picture about the profitability of your strategy.