Introduction to Zipline: A Trading Library for Python (quantinsti.com)
321 points by kawera on July 25, 2016 | 150 comments



I've typed and deleted this post a few times trying to find a way that it doesn't sound kind of pompous, but if it helps save one person a lot of money then screw it, I'll sound pompous....

I get asked quite a bit on how to start doing algorithmic trading and the first thing I always tell people is don't.

I think I've said this many times now but the number of people who come at it with the thinking "I'm a computer scientist. I'll just fire up R or python and apply some machine learning to the markets and watch the money roll in" is staggering.

I mean each day 100's of Phd's start with clean market data, more data sources than you could possibly think of and statistical back testing systems that have 1000's of man hours put into them, trying to find a way to make money.

After all of that, if you really want to, I wrote this in response to an Ask HN a little while ago:

https://news.ycombinator.com/item?id=11352562

TL;DR - focus on time periods greater than a day

- expect to lose money

- expect to take a year to figure out some edge in the market

- most decent trading strategies that a normal person can use come from economic/market insights first and technology second.

The site: https://www.quantstart.com/ is also decent at bringing you up to speed on the math you'll need to know though I believe that the material there oversells how easy it is to find a decent trading strategy.


I mostly agree with this. Treating markets as an exercise purely in data science is a _bad idea_. Risking money without a solid understanding of market mechanics and trading conventions is a recipe for disaster. Taking a Bayesian view won't save you either - this is a land where some new "six sigma" event happens every six months.

That said, the same logic that holds for identifying profitable strategies within an institution holds for individuals: unless you have better gear, don't fish a crowded pond. As an individual, your small size is, in some regards, an advantage. Institutions routinely pass on strategies that don't have capacity (there's not enough liquidity to make the strategy's returns worthwhile relative to their trading level) or strategies that aren't quite up to their standards (but might be up to yours). There's also a huge class of strategies that have somewhat choppy but long run consistent returns. Traders and funds worried about MoM track records won't touch those.

Fees are the biggest barrier to entry. Institutions enjoy substantial discounts and an ability to amortize costs across a much wider base, making the strategy performance hurdle rate proportionally higher for smaller traders.


I'm curious what folks' thoughts are on these two platforms:

Interactive Brokers offers low-fee access to their platform (https://www.interactivebrokers.com/en/index.php?f=13869)

Quantopian (www.quantopian.com) gives you the ability to trade through Robinhood with long trades at 0 commission. There has been some skepticism in the Q forums about how good Robinhood's execution is (possibly the effect of a "you get what you pay for" attitude).

Trading real capital with algorithms is more difficult than it sounds. Mentally, if your algorithm is doing stock selection as well as trade execution, you no longer understand what you own. During a period of extended drawdown, you have to be mentally tough enough to believe that your algorithm has what it takes to dig itself out.

In a sense, it is like the start-up game. You have to believe you're doing something well enough that in the end you'll be right (and not bankrupt).


I use Interactive Brokers. They're probably the best retail brokerage out there. They're cheap -- the best rates on commissions and by far the best rates for margin. And they have a reasonably well-documented and supported API which you can write your programs against. The API is actually pretty much industry-standard so if you're interested in a third-party program, they probably support it.

If there's any downside, it's their data. You can't get that much of it historically (1 year max on the minute bars, far less on the second bars, and it takes forever to download it because of the throttling). It's also not...I'm going to say correct...historically. I mean, it is a correct record of what trades happened when. But it includes trades that you won't see when they happen, making it less useful if you're looking for a stream of events as they are happening. Also compared to a broker like Lightspeed (who I haven't tried) their data is expensive.

The software is a big mess, but it's usable. It's always in this weird state where it's 50-75% of the way to being completely awesome, but there are just a few things missing that stop it from being so. Also, if I have one nitpicky complaint, it's that I can't direct route complex options orders.


Use IQFeed or Quandl for data (I use IQFeed for quotes as well).


Sure but that's a big monthly expense.


My thoughts are that if you use either platform for anything other than entertainment you're a fool.

Seriously, here's your competition: https://en.wikipedia.org/wiki/Renaissance_Technologies How deluded do you have to be to think you have an edge over that?


This level of defeatism is the exact reason I got into algo trading in the markets.

People like you stay out.

More to the point, just because a genius mathematician and code breaker started a hedge fund it doesn't at all push out any of the little guys. The market is so large he can't possibly be trading all instruments at once, and "scaling" is a problem for huge hedge funds. Especially ones that have to answer to their shareholders. Even though this is a huge fund, it is highly inflexible.

What you're saying is akin to "Google already does email and they're loaded with geniuses, what makes you think you have an edge over that?"


In fairness, if someone told me they were going to make a killing on free webmail supported by ads and data mining, I think "How on earth do you intend to compete with Google?" would be a perfectly legitimate question.


In fairness, by that logic nothing would ever get done. Microsoft? You're never going to win over IBM. Xerox? You'll always be ancillary to Kodak. GM? Ford is already there. The Fuggers' banking company? Good luck against the Venetians and the Florentines. Same old story. At the end of the day, you either try something new or you don't.


Only if they don't have a good answer to the question.


> In fairness, by that logic nothing would ever get done.

Or it will get done by people who have thought long and hard about how they are going to compete with the giants, rather than some naïve sap.

The fact that the question is being asked does not imply that there are no good answers to it, but it is unlikely that someone who hasn't spent time considering how to improve on the incumbents is going to beat them.


Indeed, but nobody (certainly not me) was advocating not doing your homework or not evaluating your risk. Sometimes, however, things just happen; for example, James Simons stumbled upon the industry by chance. How many companies were motivated by killer instinct? Did Gates want to kill IBM? Did Xerox plan to kill Kodak (they didn't, but they serendipitously started a revolution)?


Dropbox vs OneDrive or Google Drive could be a better example here.


Comparative advantage. Even if they're absolutely better than you at every possible strategy, there are still going to be opportunities in markets that it isn't worth their time to invest in.


This isn't true. There are a non-zero number of traders who use IB and post positive returns.

For example Taaffeite Capital Management (who gained some publicity for good returns on Brexit) are a < $10mm fund who use IB[1].

Sure, bigger funds have better market access, and it will always be impossible to implement a high frequency trading approach. But these platforms are about as good as a small player can get.

[1] http://www.afr.com/personal-finance/shares/artificial-intell...


Dr. Lun has a PhD from MIT and is doing original AI research. If that's your profile then do what you want=)


There are plenty of MIT PhDs very successfully losing money.

I think this goes directly to the "if you use these platforms you are a fool" comment. It seems to me that some people who aren't fools use the platforms, and some who are, don't.

Also, this is HN. Pretty sure there is more than one MIT PhD reading this, and I know there is more than one doing original AI research. Brings this comment to mind: https://twitter.com/paulg/status/28911860225 (exact comment here: https://news.ycombinator.com/item?id=35079)


  "if you use these platforms you are a fool"

Hey, that's not fair, you're arguing against half my point!

I said if you use these platforms _and don't have a reason to think you've got an advantage_ you shouldn't be doing so.

Lun and his peers are an exception to this. If you want to make a second argument and say he shouldn't be trading go ahead (and I'll try to back you up), but my original point was about the 99.9% who aren't Lun and are clearly just fish.


Don't get impressed by credentials and buzzwords. A lot of people have had, and are having, success without credentials. The investing and trading domain is large enough to accommodate different approaches and a wide variety of skill sets. It is not the domain of a few chosen or privileged ones.


> Seriously, here's your competition: https://en.wikipedia.org/wiki/Renaissance_Technologies

Incorrect, for the reasons pointed out above: they're trading size and you're likely not. It's a completely different game for them, in a completely different field, due to their liquidity needs. They are not your competition.


This has been on my mind recently as someone who was in the front office for a few years (but not making trades) but is now on the outside.

    I mean each day 100's of Phd's start with clean market
    data, more data sources than you could possibly think of
    and statistical back testing systems that have 1000's of
    man hours put into them, trying to find a way to make money.

It seems to me that the Tiger Rule applies here. You don't have to outrun the tiger, you just have to outrun your buddy.

Is the choice really that one can be in the cohort you mention, or you can buy an index fund (or whatever the equivalent is in the market you're interested in)... and that's it?[1] Is the market so efficient that there is no middle ground where a smart and methodical person can make more money than the index plodders without being obliterated by the big players?

That seems really unlikely to me.

[1] (I don't think that's what you are saying... but you've given me a chance to try and express something I've been thinking about. Thank you for that)


Don't think in terms of there being one tiger that stops as soon as it gets something to eat. The market's more like a sea full of countless sharks, where all the sharks have to survive by eating other sharks.

You're not wrong to think that, if there's any way to consistently make money by trading, it's by exploiting inefficiencies in the market. But remember that those inefficiencies come from people. Are you confident enough in your skills, your education and your resources to be sure that you'll be one of the ones finding and exploiting inefficiencies, rather than one of the ones who's creating inefficiencies for others to find and exploit?


Why is it efficient for gains in the market to be won by only a few actors?


You're using the word "efficient" in a different sense from what it means in the phrase "efficient market".


Why do you think you deserve any of the gains?


You can make more money, you just have to take more risk, and it takes being quite knowledgeable, experienced, and careful to know what a good risk is and how to diversify or hedge it.


I run Quantopian's Lectures, which are also intended to teach the statistics behind trading. https://www.quantopian.com/lectures


Hi Delaney, you're my favorite quantopian. Just watched all the lectures and I think jupyter notebooks are the best way to learn combined with a lecture for nearly any programming related subject.


Thanks, happy they've been helpful. I agree that they're a super powerful teaching tool. We actually got the idea when seeing a prof at Harvard using them to teach the entire class: homeworks, lectures, everything.


These have been really great. Thanks for putting them together!


No problem, more on the way.


This was my experience as well. Doing "quant" trading on the side quickly becomes a second job, as you are always adjusting, training, simulating, and even just observing the models you're using.

It's not impossible to make money though. When it comes down to it, the trade-off for the hours you put in and the money you make only really pays off if you're trading with someone else's money, and lots of it.


As long as the target you're optimizing for is different than what those hundreds of PhDs (and most of the market) are optimizing for, then maybe you've got a chance. You also don't need to pay their salaries.

So I agree with regard to longer time periods. Most of those PhDs are probably trying to predict minute-to-minute moves, or daily moves. A lot of them would be out of a job if they lost money over 90-day periods.

Not sure how much of an edge can be gained over buying and holding some reasonably diverse equity portfolio. If anyone is thinking they'll beat some HFT hedge fund they're almost certainly not going to. If they're thinking of beating a reasonably diverse/optimized portfolio over longer terms that's maybe possible IMO but the difference isn't going to be huge. To amplify the difference requires taking on more risk, for example by leveraging, options etc.

If I want to decide between long-term investing in the US vs. Turkey, Greece, Brazil, Russia, or the UK, or gold :) it's not clear machine learning can give me useful insight. The historical data has its limits. Predicting geo-political processes and things like interest rates for the long term seems like a very difficult problem. I would suspect the machine answer to this question would be something like P/E is lower so it's a good investment but who knows. So I agree with the economic/market insight comment as well.


> I would suspect the machine answer to this question would be something like P/E is lower so it's a good investment but who knows. So I agree with the economic/market insight comment as well.

This is spot on (tried it once :).

ML doesn't really help much at the macro level. MPT, perhaps combined with some decent economic insight, is a lot more useful.


In a similar vein: https://xkcd.com/1570/


The rapid rise in popularity of algorithmic trading among everyday coders, the massive growth of /r/wallstreetbets, and the popularity of Quantopian tell me one thing: go long on retail brokers (AMTD, ETFC, IBKR).


That was my first thought. If I had the resources, I'd build up that subreddit. YouTube videos. HN posts. Some trading libraries on GitHub. Some forums. Promotional deals to start out.

Previously there were other pyramid schemes, penny stocks, FX trading forums and so on. As those went out of fashion there is a new cohort of young people with great imagination who think they can beat the stock market. Some surely see a great potential there ready to exploit. They'd be silly not to.

Or to put it more obviously: when everyone goes digging for gold, don't follow them, but start selling shovels. Then maybe take some profits and put some ads out in the saloons with stories about awesome finds of massive amounts of gold.


Massive growth of wallstreetbets? Come on, I've been subscribed to that subreddit since the beginning and it's largely a joke.


It's absolutely a joke but that doesn't mean it's not causing a lot of people to trade options


Random idea, but what if you could combine market data with news articles, and (obviously only for the past few years) data from blogs/reddit/hn/twitter/etc.? Sometimes there is fascinating insight to be learned in obscure places on the Internet (hn itself being a great example); I imagine a system that would collect and analyze that kind of information would be quite interesting, despite the huge amount of noise. Such a hypothetical system reminds me of this story: https://www.facebook.com/notes/robin-sloan/julie-rubicon/985...


Glib answer....

You'd get really mad at Anne Hathaway

http://www.theatlantic.com/technology/archive/2011/03/does-a...

Longer answer, this has been done for 10+ years. Bloomberg will sell you a sentiment annotated news feed for 5 figures a month if you'd like to try.


This has been done many, many times already. Zipline, which we are talking about here, was created by Quantopian, which actually provides an aggregated Twitter-harvesting data feed free for this year.

https://www.quantopian.com/data/psychsignal/stocktwits


Been done tons of times. Some of them make quite a bit of money, others not. The real game-changer is if you're effective at separating out the real signal from the noise (and the noise floor is high in these sorts of feeds).



You would probably have as much luck correlating features like whether your neighbor's kid was at the bus stop or had to run to catch it each day, and the number of times your neighbor complained about someone not cleaning up their dog's poop that week. Bullshit random edge case features don't get any better or worse if they come from the internet.

In regular trading (algorithmic or not) you really need some non-trivial insight that for some reason no one else will have.


> I mean each day 100's of Phd's...

Yes, but then... "Finance novice beats hedge fund pros, winning $100k in Quantopian trading contest"

https://pando.com/2015/03/05/finance-novice-beats-hedge-fund...

Thousands of brilliant people are out there undiscovered, and I think Quantopian does a great job leveling the playing field.


Novices sometimes beating pros is exactly what you'd expect if performance were due more to luck than to strategy.

If it was strongly strategy-dependent, best-information would win predictably and consistently.


Yup. Incidentally this is exactly the same effect that makes model selection within a portfolio challenging. Two worthwhile reads on that front: the multiple comparison problem [1] and regression towards the mean [2].

Regardless of your objective - hosting a contest like this or allocating to individual institutions and strategies - you'll always face this problem. Good allocation decisions are tactical, taking both the logic of the strategy and the broader portfolio into account. Bad allocation decisions chase returns.

[1] https://en.wikipedia.org/wiki/Multiple_comparisons_problem

[2] https://en.wikipedia.org/wiki/Regression_toward_the_mean
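
To make the selection effect concrete, here is a minimal simulation sketch in Python. It assumes every strategy has zero true edge and that daily returns are independent Gaussian noise; the number of strategies, the number of days, the volatility and the seed are all made-up values for illustration, not anything taken from the contest discussed above.

```python
import random
import statistics

def simulate_selection(n_strategies=200, n_days=250, daily_vol=0.01, seed=42):
    """Pick the best in-sample strategy among zero-edge strategies,
    then report how that same strategy does on fresh (out-of-sample) data."""
    rng = random.Random(seed)

    def mean_return():
        # Zero true edge: each daily return is pure noise centred on 0.
        return statistics.fmean(rng.gauss(0.0, daily_vol) for _ in range(n_days))

    in_sample = [mean_return() for _ in range(n_strategies)]
    out_sample = [mean_return() for _ in range(n_strategies)]

    # "Chasing returns": choose whichever strategy looked best in-sample.
    best = max(range(n_strategies), key=lambda i: in_sample[i])
    return in_sample[best], out_sample[best], statistics.fmean(in_sample)

best_in, best_out, avg_in = simulate_selection()
print(f"winner in-sample mean daily return: {best_in:.5f}")
print(f"same strategy out-of-sample:        {best_out:.5f}")
print(f"average across all strategies:      {avg_in:.5f}")
```

The in-sample winner looks good only because it is the maximum of many noisy draws; its out-of-sample figure is just another draw from the same zero-mean distribution.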


> Yes, but then... "Finance novice beats hedge fund pros, winning $100k in Quantopian trading contest"

Not sure if you know, but that winning algorithm was taken offline within one month for poor performance. It was probably just overoptimised on past data to win the competition and then failed miserably out in the real market.


Equally, there are many athletes who play at amateur/semi-professional level and suddenly they are top-level stars (e.g. Chris Smalling, Jamie Vardy). So yes, it's possible. It's just that the odds are not in your favor :) Should you keep chasing your childhood athlete dream because someone did it? Probably not :)

Same rules apply to your example.


I don't know anything about algorithmic trading, but I'm just wondering: are the people who do make lots of money out of it the ones who have servers close to the data source and do high-frequency trades?


You can make money by having better access like the HFT firms or by having data not widely available. You can also make money by applying well known principles more intelligently than others. This latter approach usually requires a lot of money. You can't afford retail brokerage costs when you're in a highly crowded and competitive trade.

The best way to make some money as a personal trader is to take advantage of the liquidity premium in one way or another. Because you're trading money in the 5 or 6 figures rather than the 7 or 8 figures, you can take advantage of smaller opportunities without thoroughly distorting the market with your own trades. These smaller trades usually require research and market insight rather than clever algorithms. These trades are usually on financial instruments that don't have a lot of easily accessible data to build an automated trade on top of.


> You can't afford retail brokerage costs when you're in a highly crowded and competitive trade.

This is very important. Most brokerages charge around $7 per trade, which makes high-volume trading very, very expensive and prohibitive.
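
As a rough worked example of that cost, the sketch below computes what flat per-trade commissions eat out of an account each year. The $7 figure is the one quoted above; the 500 trades per year and the account sizes are assumptions picked for illustration, and spread and slippage are ignored entirely.

```python
def annual_commission_drag(account_size, trades_per_year, commission_per_trade=7.0):
    """Fraction of the account spent on flat per-trade commissions in one year."""
    return trades_per_year * commission_per_trade / account_size

# Hypothetical account sizes, all making 500 trades a year.
for account in (10_000, 100_000, 1_000_000):
    drag = annual_commission_drag(account, trades_per_year=500)
    print(f"${account:>9,}: {drag:.2%} of the account per year")
```

Under these assumptions the $10,000 account gives up 35% a year to commissions, while the $1,000,000 account gives up 0.35%, which is the sense in which the hurdle is proportionally higher for small traders.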

Robinhood is an amazing alternative that charges nada for trades, and once they have an API[1], I think they'd be a great choice for small-time developers looking to do some (low-frequency) algorithmic trading.

[1] https://support.robinhood.com/hc/en-us/articles/210216823-Ro...


If you're serious about retail algotrading, IB (Interactive Brokers) is the best broker you can get.


Can you give an example of the kind of trade you can make in that kind of market? Doesn't matter if it's an old one, I'm just curious where to start looking.


Thanks! That's a very insightful reply


Algorithmic trading doesn't equal high frequency trading. People conflate these a lot even though they know better. High frequency trading requires algorithmic trading, but algorithmic trading can implement Warren Buffett's or Suze Orman's style if that's what you want.


Agreed, HFT has become a catch-all term for anyone trying to make a marketing/political statement about trading. Like the term "big data" - it's been used to describe so many things that it no longer describes anything.

Another thing that gets ignored is the difference between a trading strategy and an execution algorithm.


I wish I could get some REST API and websockets into the debt and credit default swaps markets: really inefficient markets that reveal a lot about sentiment.

don't need high frequency at all, just pretty average latency actually (for now anyway)

IB probably has this, not sure though. But those data costs are a huge deterrent for me!


CDS and debt are not really inefficient markets. Some bonds are less liquid than others but there is currently a rapid conversion to electronic trading going on. CDS is also a very mature market. Not saying that those markets don't contain signal but they are not very inefficient.


You're right, I would primarily like to see them more liquid so that there were more data points to extrapolate moves across different asset classes


What you really want is more transparent data. There are plenty of data points but they are not easily accessible.


and I think the lack of this perpetuates market inefficiencies. I think insights into CDS can be a leading indicator into equities and equity futures, yet CDS are restricted to OTC markets in the US

anyway, I'm sure this is its own discussion


CDS aren't OTC instruments. Swaps are now traded on SEFs like MarketAxess or TradeWeb. Just because you don't have access to the data, don't think that professionals don't. But this isn't OHLC data for listed equities. Expect to pay. Actually, high quality realtime equity data (ITCH, PITCH, OpenBook Ultra, etc.) costs real money as well.

Check out CDS and FI market data offerings: https://www.marketaxess.com/data/marketdata.php

But your idea is correct. CDS spreads can be predictive of dramatic shocks in an equity's price.


That is awesome

Yeah, I get the premise: professionals do this, pay a premium for that

Until they don't because someone provides the data for free or next to nothing

In this town we call it disruption


Who's going to provide that data for free when they can charge a LOT of money for access to it? Ever notice that free data on Google or Yahoo is delayed by 15 minutes?


I know what the exchanges charge for data access, and even if you pay it is in an antiquated format.

Tradier has been providing equity and option tick data for free for years in a clean RESTful API with the capability of websockets and streaming. So that's the answer to your question of who.

Like I said, in this town we call it disruption.


That's part of their business model. I have not used their data but I can tell you that all tick data is not created equal.

We are unlikely to ever see CDS data for free, since individuals cannot trade them: most retail investors are not Eligible Contract Participants. Some brokerages give away data to get people to trade with them, so there it's about customer acquisition.


I am a developer that has done a few projects on the tradier API. Agreed


I think DTCC has some freely available live CDS data but I forget where and don't know how the APIs are.


No. Compare any big bank's revenues from trading against the revenues of a public HFT firm like Virtu. No contest.


You're pretty much spot on sans the "clean market data" part. I work for a very large electronic trading firm (and have been in HFT professionally for the past 8.5-9 years of my career) that has a team dedicated to just grooming this data. It is a lot of work and it is noisy. Coming from some exchanges, it is often even wrong. Look at the mess the recent leap second made at some exchanges.


I think you misunderstood, or I wasn't clear:)

The reason they have clean data is that big firms have teams of people dedicated to just cleaning data.

The little guy has to spend time cleaning data even before he/she starts to compete.


Right, I'm saying I work at one of said firms. The data even then isn't clean, just cleaner. Perhaps I'm just too pedantic :)


I agree with you. I have put my nights and weekends into trying to find an edge and am still looking. As a retail trader, it is very difficult to find anything but random data all over. I hope I will prove myself wrong someday, but as of now, I think a retail trader finding an edge is nearly impossible.


I've had the same experience. Lots of people fooled by random returns think they have edges that don't exist.


Can't find the original comment, but somebody wrote here some time ago that some HFs let you run on their infrastructure (keeping 85/90% of any profits) if you can prove you have a valid strategy. Just out of curiosity, since I could not find any other information about this: is it a thing that actually happens?


There are places (like Quantopian, but also more professionally-oriented) that will not only provide infrastructure but also capital if you come to them with a trading algorithm that does well on back tests.

The next step down are firms that'll give you a platform including hardware and software for the "infrastructure" bits: often called an "algo container", but lots of brokers have an API you can use to avoid needing to write feed handlers, etc. eg. Pico.

Then there are heaps of providers who'll rent you servers, connectivity, rackspace, etc. You do all the software. Lucera is a trading-oriented "cloud" provider. OptionsIT or Fixnetics are infrastructure providers. Or you just go straight to the data centers -- any decent finance-oriented datacenter will have a POP for most of the venues, and you can just cross-connect.

No relationship with any of the firms named -- just examples off the top of my head.


Yep. I am currently doing this, but under an NDA, at an HFT firm.

80% of profits seems a bit rich, though. If they're supplying top spec colo servers, a super fast network to connect them, and people to stare at it, I'd say something a bit lower (but still pretty chunky) would make more sense.

I suppose the key for them is they have a bunch of existing infrastructure that pays for itself, so any extra cream is good. Also they have capital sitting around, so why not?

The strategies have to be more than unusually profitable, though. I don't think I've seen a down day in the 3 months I've been sitting here, so your average trend-following Sharpe of ~1.5 or so ain't gonna impress people.
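
For readers unfamiliar with the figure being quoted, here is a minimal sketch of how an annualized Sharpe ratio is commonly computed from daily returns. The 252 trading days per year, the zero risk-free rate and the sample returns are assumptions for illustration; firms differ in the exact conventions they use.

```python
import math
import statistics

def annualized_sharpe(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    """Annualized Sharpe ratio: mean excess return divided by its sample
    standard deviation, scaled by the square root of periods per year."""
    excess = [r - risk_free_daily for r in daily_returns]
    mean = statistics.fmean(excess)
    stdev = statistics.stdev(excess)  # sample standard deviation
    if stdev == 0:
        raise ValueError("returns have zero variance; Sharpe ratio is undefined")
    return mean / stdev * math.sqrt(periods_per_year)

# Hypothetical daily returns, purely for illustration.
example_returns = [0.004, -0.002, 0.003, 0.001, -0.001, 0.002, 0.000, 0.003]
print(round(annualized_sharpe(example_returns), 2))
```

A strategy with almost no down days has a small standard deviation relative to its mean, so its ratio comes out far above the ~1.5 mentioned for trend following.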


Yes it is. Small arrangements like this all over Chicago. Perhaps NY also.

The big issue isn't actually infra, it's capital. If you want to trade big you need a million bucks or something on hold with the exchange.

How do you borrow a mil to park at an exchange? The guys mentioned above.


Zipline (from OP's article) was created by a company that does something like that: https://www.quantopian.com/faq


Disclaimer(s): QuantStart.com founder here, background as a quant dev at a small fund.

I should probably nuance my statement that it is easy to find trading strategies by saying that it is easy to find new trading /ideas/. There are a huge number of freely available trading ideas on forums, pre-print servers (arXiv, SSRN), blogs etc. The trick is knowing how to implement them properly, accounting for any transaction costs and adjusting the parameters of the model. This is often where the stated performance falls down. It takes a lot of time to carry out this sort of research.

Long-term profitable strategies are tricky to find, due to the ever-present spectre of "alpha decay". This is where your strategy's edge is "arb'd out" - everyone else knows what you're doing and so there's no tradeable edge anymore. Hence it is necessary to have a portfolio of strategies and gradually phase out the ones that aren't doing well, and bring in new ones over time.

That being said there are a large number of trend following funds (known as Commodity Trading Advisors, or CTAs, in the industry) that all broadly do the same thing (follow "trends" in the commodity futures markets) and have great years every now and then. There are some well-known "retail" quant traders who do well by trend following, but it does require quite a bit of capital to trade in futures.

The philosophy that I do try to emphasise is to always be learning and researching new ideas. Also, as you mention, I'm pretty keen on discussing the math(s)/statistics aspect because once you have a solid math capability, it is easier to see where potential edges might exist and how to really assess whether it is a true "edge" or just a statistical anomaly.

I believe someone else in a grandchild comment below said that there are many areas that bigger quant funds won't touch because of institutional incentives. If you have $10bn assets under management (AUM), then you're not going to care about investing $100-200k, even if the returns are good, because it won't move the needle on your monthly reports.
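
A quick back-of-the-envelope sketch of that point. The $10bn AUM and the $200k position size come from the paragraph above; the 50% return and the $500k personal account are hypothetical numbers chosen only to show the scale difference.

```python
def impact_on_fund(position_size, position_return, fund_aum):
    """Contribution of one position's profit to the return of the whole fund."""
    return position_size * position_return / fund_aum

# $200k position with a hypothetical 50% return inside a $10bn fund.
fund_level = impact_on_fund(200_000, 0.50, 10_000_000_000)
print(f"fund-level return contribution: {fund_level:.4%}")

# The same trade inside a hypothetical $500k personal account.
personal = impact_on_fund(200_000, 0.50, 500_000)
print(f"personal-account contribution:  {personal:.1%}")
```

Even a very good outcome on the small position adds about a thousandth of a percent to the large fund, while the same profit is a 20% gain for the small account.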

The trick is to niche down into markets that you can spend a lot of time researching to find a distinct edge, that won't likely be touched by bigger funds. One area that is becoming interesting recently, due to the prevalence of satellite data/AI/deep learning-esque VC-backed startups, is building commodity supply/demand models. A good example is forecasting oil supply/demand by analysing large quantities of storage tank heights in global refineries [1].

Also, a small related-to-Zipline plug: I've recently started a free Python-based MIT-licensed open-source backtester [2], predominantly as a learning tool for programming and quant trading. There's about 4-5 of us working on it at the moment and it's in an early alpha stage, but we're always looking for people willing to help.

[1] - https://orbitalinsight.com/solutions/ [2] - https://github.com/mhallsmoore/qstrader/


You can have a good starting point here, if you are interested in Algorithmic Trading http://www.quantinsti.com/epat/


What do you think about cryptocurrency trading? Do you think the same warning applies, or is there an advantage because there are few players?


I never understood this. If it's possible to be a profitable independent day trader, and we know it is because many are, then it should be possible to code the rules you follow and become a profitable algo trader.


> If it's possible to be a profitable independent day trader, and we know it is because many are, then it should be possible to code the rules you follow and become a profitable algo trader.

There's actually a logical error here: if trading results were essentially random, a certain subset of traders (including day traders) would, at any given time, have profitable records. But you would not be able to derive (and, therefore, implement in code) any set of rules which would make automated trading profitable. You would just either be lucky, or not, as an algorithmic trader, just as you would as any other kind of trader.
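
To make the point concrete, here is a minimal sketch of that thought experiment. It assumes a deliberately crude model that is not from any real data: every trader makes one coin-flip bet per day, winning or losing one unit, with no skill at all. The function name and parameters are made up for illustration.

```python
import random


def share_of_profitable_traders(n_traders=2000, n_days=250, seed=0):
    """Fraction of zero-skill traders who end the period with a profit.

    Each day a trader wins or loses one unit with equal probability,
    so nobody has an edge and any profit is pure luck.
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    profitable = 0
    for _ in range(n_traders):
        pnl = 0
        for _ in range(n_days):
            pnl += 1 if rng.random() < 0.5 else -1
        if pnl > 0:
            profitable += 1
    return profitable / n_traders


print(share_of_profitable_traders())
```

Under these assumptions a bit under half of the traders finish the year ahead, even though there is no rule anyone could extract from the winners.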

Now, I'm not saying that trading profits are random, but the existence of some profitable traders does not mean that there are rules you can deduce from their behavior that will guarantee profitable trading when implemented by someone else (either in an automated system or otherwise.)


Agreed. Probably a large fraction of the 'profitable independent day traders' mentioned by the parent depend on luck. It's one thing to post a profitable month or year; a profitable decade is a whole different thing.

It's relatively easy to have a profitable period in a strong bull market. But that's irrelevant if an 'unexpected' event (people call such events unexpected despite the fact that they tend to happen regularly over time) such as the 2008 crisis completely wipes you out.


> If it's possible to be a profitable independent day trader, and we know it is because many are

You're forgetting about the many that aren't profitable. You're also forgetting about the majority that are less profitable than the market average. The latter is probably the easiest to overlook. Yes, they are technically profitable, but so are packaged portfolios of stock held for long periods of time. If you can't be more profitable than someone who puts in zero effort, then what's the point of putting in more work?


A lot of day trading is done based on intuition, which is difficult to translate to code. Also, keep in mind your survivorship bias... Most "day traders" lose money and quit the game.


There are lots of successful software developers too, but writing out a series of rules to automate writing code is not going to happen soon. Unless day trading is substantially easier than programming, an autonomous robotrader that can make money seems unlikely.


Leaving aside the question of how many profitable day traders there actually are, there are profitable professional poker players out there but no bot can compete with them. Now scale up the complexity of poker by at least an order of magnitude and you end up with the financial markets.

Sure, it is possible to program very niche behaviour, but we are nowhere near any sort of program that can act as a general "day trader".


The comments here are interesting... Looking at your question from a different angle, yes, of course it's possible to be a profitable algo trader: set your first investment to buy("SPY") and do nothing else for 20 years. What you are asking, though, depends entirely on your risk tolerance and what you are benchmarking your strategy against.


You vastly underestimate the complexity of a discretionary trader's intuition and experience. You cannot just replicate years of human experience with computer code so easily.


Indeed. Think about how many tens of millions of people drive every day... but a self-driving car has still proved elusive...


This statement is too general. You could have said the same thing about chess: there are chess Grandmasters who devote their lives to studying the game, yet computers play chess at a much higher level than any human.


Chess is rational, following an easily understood set of rules, and both players have perfect information. The big problem has always been analysing all future possibilities.

The stock markets are very far from a rational, perfect information game with simple rules.


If you honestly think that living a real human life, with all the concurrent decisions that are simultaneously and relentlessly made on a micro and macro level throughout every second, every day is the same as a single game of chess, then by all means, go trade the stock market and show us how it's done.


An organisation at the scale of IBM was able to create, after many attempts, and vast investment, a computer that can beat Grandmasters. That insight isn't useful to an individual trying to do the same.


There are many strategies that are difficult to encode into an algorithm. When I briefly did day trading independently, my most profitable strategies played on public perception and (over)reactions to news. Those things are very, very difficult to properly automate (though many companies try, with varying degrees of success).


> Those things are very, very difficult to properly automate (though many companies try, with varying degrees of success).

Case in point is the fluctuations in Warren Buffett's fund whenever Anne Hathaway is in the news.

http://ftalphaville.ft.com//2011/03/28/528481/for-the-bots-a...


It's not as simple as converting the rules followed by an independent day trader into algorithms. Most of them do not rely on technical analysis alone but also on fundamentals, which are a hard problem to decipher given all the hype around stocks in the news and institutional investors' analysts influencing media coverage through various sources.

The best way for you to start learning would be to get a grounding in how the market works. My suggestion would be https://www.amazon.com/dp/B000THOD1G/


No idea why you are being downvoted for expressing an opinion and asking for an explanation ...


It's a slightly different thing though, being an independent trader isn't the same as becoming an independent trader.

A trader that is independent may still have advantages that prevent some random programmer from bootstrapping his/her way to also being an independent trader.

Just thinking out loud though, I'm not a trader of any sort.


Hi, a shameless plug: I went to the Quantopian (the company that is behind Zipline and essentially uses Zipline as the core backend to their cloud platform) algo-trading hackathon two weekends ago and came up with this algo:

https://www.quantopian.com/posts/xiv-slash-vxx-pair-trade-1

Pair-trading VXX and XIV based on the StockTwits sentiments of the SPY at market open. The backtest did really well from 2011 to 2014, with a 1700-1800% return in 3 years, and was flat from 2014 to the present.

I'd really love it if people cloned the algo and improved upon it, and I'd like to see what ways they come up with to mitigate the drawdowns and improve the performance!


Everybody's trading nowadays.

How about just investing :-)

I.e. focus on periods longer than a year, which so few people/professional market participants do. And on actual businesses instead of the crazy antics of a line.

I wonder if you could use something like Zipline/Quantopian to screen huge amounts of consolidated balance sheets for markers of undervaluation. You could reject 1000s of companies and focus your “manual” vetting on the few that remain.

If you can find the dollar selling for half a dollar and you can understand why it's selling for that price (e.g. because the entire market is down), you may have identified a winner. Then all you need is a little guts and lots of patience. And a predefined set of criteria that you would constantly monitor to decide if your thesis is still valid.


> Everybody's trading nowadays.

Primarily an artifact of sustained positive returns recently and short-term memory. There used to be a saying: when your cab driver starts giving you stock tips, it is time to bail out of the market. When every Tom, Dick and Harry thinks they can beat the market, it's time to take a break.

> How about just investing :-)

This is the right way to go for the majority of your portfolio. Follow simple, tried and tested strategies: buy index funds/ETFs. The Bogleheads Wiki (https://www.bogleheads.org/wiki/Main_Page) is a good starting point.

If you are really interested in individual stock/investment picking, keep a very small portion of your portfolio as play money for such endeavors.

> I wonder if you could use something like Zipline/Quantopian to screen huge amounts of consolidated balance sheets for markers of undervaluation. You could reject 1000s of companies and focus your “manual” vetting on the few that remain.

I primarily use a similar methodology: automated filtering of stocks to find a few that I want to review further. It is not scalable. The majority of my time is spent on developing the strategy for filtering and selecting the stocks for review. I probably manually review 15-20 stocks a year (reading SEC filings for the company and its competitors, industry news, trade articles, analyzing financial statements, etc.) and invest in 3-6 stocks a year at most.
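
As a rough illustration of what such an automated first pass can look like, here is a minimal sketch. The field names, the thresholds and the four sample companies are all hypothetical placeholders, not real data and not a recommended set of criteria.

```python
from dataclasses import dataclass


@dataclass
class Fundamentals:
    ticker: str
    price: float
    book_value_per_share: float
    earnings_per_share: float
    debt_to_equity: float


def screen(universe, max_pb=1.0, max_pe=12.0, max_de=0.5):
    """Return (ticker, P/B, P/E) for companies passing every filter."""
    shortlist = []
    for f in universe:
        # Ratios are meaningless with non-positive book value or earnings.
        if f.book_value_per_share <= 0 or f.earnings_per_share <= 0:
            continue
        pb = f.price / f.book_value_per_share
        pe = f.price / f.earnings_per_share
        if pb <= max_pb and pe <= max_pe and f.debt_to_equity <= max_de:
            shortlist.append((f.ticker, round(pb, 2), round(pe, 2)))
    return shortlist


# Hypothetical companies, invented purely for the example.
universe = [
    Fundamentals("AAA", 10.0, 20.0, 2.0, 0.2),   # cheap on both ratios
    Fundamentals("BBB", 50.0, 10.0, 1.0, 0.1),   # price far above book
    Fundamentals("CCC", 8.0, 10.0, -0.5, 0.3),   # loss-making, skipped
    Fundamentals("DDD", 9.0, 10.0, 1.0, 0.9),    # too much debt
]

print(screen(universe))
```

The output of a pass like this is only a shortlist; the manual review of filings is still where the real work happens.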


How about no. You're entering a field where professionals working full time struggle to beat the market. What's to say that you, a folder of 10-Ks, and a copy of Ben Graham are going to beat them? You could probably spend a lifetime studying investing and still come up short, because you don't have the resources or mentoring that the pros have.


Well, definitely a valid response. But it still seems like a bigger problem to beat professional traders using Zipline than to spend time looking out for a good company that is being sold for a price that I really like.

In the latter case, investing and looking for value, it seems to me like an amateur can even give himself a little edge over the pros.

First, for most professional money managers it's a worse fate to miss out on some bull market than to go down together with all their colleagues.

Second, amateurs working with their own money are not evaluated every quarter or even every year. They can just wait and keep looking if unsure. There are no mandates or arbitrary limitations, so the amateurs are free to look for value wherever they can find it. They can look in places that would require more patience, or that have a bit less liquidity or some more volatility (because they will typically have less money to move in/out the stock).

Of course, the pro will definitely have benefits in terms of legislation, taxes and lower research costs relative to the amount of money being invested.

But in the end: does the extra information and non-GAAP stuff that the pros use help so much over common sense and a decent understanding of accounting? (That's a genuine question, not a statement.) In fact, does successful investing even involve outsmarting everyone else in the same way that trading does?


Wrong, that's all you need. In fact, according to Peter Lynch, you have a better chance of alpha as you don't have to deal with all the BS a PM at a big fund has to.


The markets that Peter Lynch traded and the markets today are completely different.


Sigh, trading vs. investing... Also if you make a statement like that I am sure you are unfamiliar with his thesis and methodology.


Professionals aren't playing the same game as amateurs are, so your logic is flawed.


I'm all for value investing (it's the only legitimate reason for stock to exist) but in evolutionary terms, this is hoping to be large and healthy enough to ignore parasites rather than resist them dogging your every step.


Actually, does anyone know where an amateur could buy/download quarterly balance sheets/income statements for the broad stock market universe?


All US public company quarterly financial reports (and much more) are available in raw form at the SEC EDGAR site (https://www.sec.gov/edgar/searchedgar/companysearch.html). Those reports include income statements, balance sheets, etc. But beware that companies will often file corrections later.

In addition, nearly all finance sites provide summaries of these reports for at least the last few quarters and last few annual reports. I like morningstar.com. But finance.yahoo.com and finance.google.com both work fine.

If you want a bunch in one shot and don't have the money for Bloomberg or CapitalIQ or whatever, I suggest quandl.com, which acts as a cut-rate data aggregator for financial data.


Thanks, quandl looks interesting. API and what not. Getting a few fundamental datasets probably costs less than subscribing to a couple of investment newsletters.


Also tiingo.com


Most professionals I've known use Capital IQ, which is expensive.

It's surprisingly tough to get broad, machine-readable market data for free but there are some cheaper options. Check this thread: https://www.reddit.com/r/SecurityAnalysis/comments/2ci5du/ca...

Or you could always scrape Yahoo Finance :)


Yeah but you don't want to spend most of your time writing and maintaining scraping code :-)


Yep, therein lies the classic problem: if you want something of value, you can either spend your time or spend your money.


You could always use Yahoo Finance or Marketwatch for a quick overview of the financial statements of publicly traded companies.


General question: how do you take this and interact directly with the market? Is there some sort of general, public API that you're making calls against? Where do you get an account for it, etc.? Or is this going through some firm that interfaces with the market?


Upvotes, but no comments :(

If anyone comes back to this ever, still interested in knowing!


I basically implemented this, and a lot of other features, for my own personal trading bot written in Python against the Cryptsy API. The idea was to make tons of small trades on alt coins throughout the day, constantly buying and selling on short crossovers, making fractions of a percent profit after fees. It turns into an up and down roller coaster: sometimes you're way ahead, and other times you lose it all. The biggest issue was the low volume on most of the alt coins. At the end of the day, like many others have said, it's mostly luck and you will lose money eventually, but it's an interesting learning exercise. The only real way to win is to have some insight into the market, not machines.
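
For readers who haven't seen one, a crossover rule of this kind can be sketched roughly as below. This is not the bot described above: the window lengths, the price series and the fee rate are made-up values, and the helper names are hypothetical.

```python
def moving_average(prices, window):
    """Trailing simple moving average; None until enough history exists."""
    return [
        sum(prices[i - window + 1:i + 1]) / window if i >= window - 1 else None
        for i in range(len(prices))
    ]


def crossover_signals(prices, fast=2, slow=4):
    """Emit (index, 'buy'/'sell') when the fast MA crosses the slow MA."""
    fast_ma = moving_average(prices, fast)
    slow_ma = moving_average(prices, slow)
    signals = []
    prev_diff = None
    for i, (f, s) in enumerate(zip(fast_ma, slow_ma)):
        if f is None or s is None:
            continue
        diff = f - s
        if prev_diff is not None:
            if prev_diff <= 0 < diff:
                signals.append((i, "buy"))    # fast MA moved above slow MA
            elif prev_diff >= 0 > diff:
                signals.append((i, "sell"))   # fast MA moved below slow MA
        prev_diff = diff
    return signals


def net_return(buy_price, sell_price, fee_rate):
    """Round-trip return after a proportional fee on both legs."""
    cost = buy_price * (1 + fee_rate)
    proceeds = sell_price * (1 - fee_rate)
    return (proceeds - cost) / cost


prices = [10, 10, 10, 10, 11, 12, 13, 12, 10, 8, 7]  # invented series
print(crossover_signals(prices))
print(net_return(100, 101, 0.005))
```

The second helper shows why fees matter so much for this style of trading: with a hypothetical 0.5% fee per leg, a 1% gross move already nets out to a small loss.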


I'm sitting at an HFT here, coding.

This zipline thing is quite interesting if you're new, but if you can code, I'm not sure what the advantage is. The idea of a backtest is quite simple, and you can easily fire up something like pandas to do it for you. The equity line is simply your positions x returns, minus costs. To determine your positions, you have to make sure you aren't looking at future prices, but apart from that you are flexible in doing whatever you like.
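
A minimal sketch of that idea, in plain Python to stay dependency-free (pandas would be the more natural tool). It assumes positions are expressed as weights, returns are simple and not compounded, and costs are a flat rate per unit of position traded; the function name and the sample numbers are made up.

```python
def equity_line(prices, positions, cost_rate=0.0):
    """Cumulative return curve: positions x returns, minus trading costs.

    positions[t] is decided with information up to prices[t], so it only
    earns the return from t to t+1. That lag is what keeps the backtest
    from peeking at future prices.
    """
    pnl = 0.0
    curve = [0.0]
    prev_pos = 0.0
    for t in range(len(prices) - 1):
        traded = abs(positions[t] - prev_pos)
        pnl -= traded * cost_rate                      # pay to change position
        next_return = prices[t + 1] / prices[t] - 1.0  # return realised after t
        pnl += positions[t] * next_return
        curve.append(pnl)
        prev_pos = positions[t]
    return curve


prices = [100.0, 110.0, 99.0]   # invented prices
positions = [1.0, -1.0, 0.0]    # long, then short; last entry never earns
print(equity_line(prices, positions))
print(equity_line(prices, positions, cost_rate=0.01))
```

Note that the final position is never applied, because there is no later price to earn a return on.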

And this was a question for me. Suppose I want to code a cross-sectional strategy. How would I do that in zipline? It seems to be the kind of thing that gives you one backtest for one time series. Perhaps I just haven't looked into it enough. When we backtest, often we want to do things across the ensemble. We also take positions in a whole universe of instruments, so the backtest needs to be a matrix, rather than just one column.
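
To show what I mean by the backtest being a matrix, here is a rough sketch in plain Python, with one row per date and one column per instrument. The weighting rule (demeaned returns, scaled to unit gross exposure) is just a hypothetical placeholder signal, and this says nothing about how Zipline itself would express it.

```python
def cross_sectional_weights(row):
    """Turn one date's returns {instrument: return} into weights.

    Each instrument is scored against the cross-sectional mean, then
    scaled so the absolute weights sum to 1 (long/short, net zero).
    """
    mean = sum(row.values()) / len(row)
    demeaned = {name: value - mean for name, value in row.items()}
    gross = sum(abs(v) for v in demeaned.values())
    if gross == 0:
        return {name: 0.0 for name in row}
    return {name: v / gross for name, v in demeaned.items()}


def backtest(returns_matrix):
    """Apply weights formed on date t-1 to the returns of date t."""
    weights = [cross_sectional_weights(row) for row in returns_matrix]
    pnl = []
    for t in range(1, len(returns_matrix)):
        today = returns_matrix[t]
        pnl.append(sum(weights[t - 1][name] * today[name] for name in today))
    return weights, pnl


# Invented two-day, three-instrument example.
returns_matrix = [
    {"A": 0.02, "B": 0.00, "C": -0.02},
    {"A": 0.01, "B": 0.00, "C": -0.01},
]
weights, pnl = backtest(returns_matrix)
print(weights[0], pnl)
```

The point is that every step operates across the whole universe at once, rather than on one time series at a time.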

Incidentally, the example strategy will work quite well for retail traders. You can add a bunch of futures together and get a Sharpe well over 1, basically what every CTA does but won't admit to. If you're wondering what all those PhDs do all day, it's adding capacity and researching minor improvements on that MA strategy. A colleague of mine worked at one of these brand names, and another friend owns one.

So, does that mean anyone can simply do this? Well, yes. But you'd have a lot of leg work to do, and you might get discouraged before you start. You need an account from someone like Interactive Brokers. You need a fair bit of money, or you'll have increment problems trading the large contracts. And you'll have to set up all the data feeds and look at it each day.


2 questions - what kind of leg work is involved, and how much is a fair bit of money? Is it possible with 250k of working capital?


Leg work:

- Getting the data into a shape that you can use. Normally a total PITA. For futures, you have to either stitch the contracts yourself, or get a pre-stitched series, which you have to take time to understand. Filtering it for weird data points.

- Writing the strategy / backtesting code. The fun part.

- Connecting to a broker. Gotta read API docs, test the functions, connect it to your code in a way that makes sense, and probably in a way that makes it easy to switch brokers. Test the price feed, write error handling code.

- Daily operations code. You'll need a daily process where you can see what's going on. Automated testing of the trade report for correctness. Notifications from brokers need responses, you need to post margin as well. Some kind of SMS or Whatsapp for when something is wrong. Holiday calendar.
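
To give a flavour of the stitching step, here is a minimal sketch of one common convention, difference back-adjustment: the older contract's prices are shifted by the price gap on the roll date so the joined series has no artificial jump. The dates, prices and function name are invented for illustration, and real data needs far more care (roll rules, missing days, bad ticks).

```python
def back_adjust(front, back, roll_date):
    """Stitch two contracts into one continuous series.

    front, back: dicts mapping ISO date strings to settlement prices.
    Both contracts must have a price on roll_date.
    """
    gap = back[roll_date] - front[roll_date]
    # Older contract before the roll, shifted onto the newer contract's level.
    stitched = [(d, p + gap) for d, p in sorted(front.items()) if d < roll_date]
    # Newer contract from the roll date onward, unadjusted.
    stitched += [(d, p) for d, p in sorted(back.items()) if d >= roll_date]
    return stitched


front = {"2016-03-01": 100.0, "2016-03-02": 101.0, "2016-03-03": 102.0}
back = {"2016-03-03": 105.0, "2016-03-04": 106.0}
for date, price in back_adjust(front, back, "2016-03-03"):
    print(date, price)
```

Day-to-day price changes are preserved across the roll, which is what a backtest on price differences needs; the adjusted historical levels themselves are no longer real traded prices.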

250k is not enough. Some of the futures contracts are quite large, and you won't be able to get the full benefit of diversification if you don't have a bunch of instruments to trade (look for a blog called Investment Idiocy, he recently talked about this). Above ~$3-5M, it isn't a problem and you can ignore it.
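
The increment problem can be sketched with some arithmetic. The contract sizes below are made-up round numbers, not real contract specifications, and the equal-weight target is only an assumption for the example.

```python
def contract_allocation(capital, target_weights, contract_notional):
    """Round each target exposure to whole contracts.

    Returns {name: (contracts, realised_weight)} so the realised weight
    can be compared with the target weight.
    """
    result = {}
    for name, weight in target_weights.items():
        target_notional = capital * weight
        contracts = round(target_notional / contract_notional[name])
        realised = contracts * contract_notional[name] / capital
        result[name] = (contracts, realised)
    return result


# Hypothetical notional value of one contract in each instrument.
contract_notional = {"X": 100_000, "Y": 60_000, "Z": 250_000}
target_weights = {"X": 1 / 3, "Y": 1 / 3, "Z": 1 / 3}

print(contract_allocation(250_000, target_weights, contract_notional))
print(contract_allocation(5_000_000, target_weights, contract_notional))
```

With the smaller account in this toy setup, the largest contract rounds to zero and the other two end up well away from their one-third targets, while the larger account lands close to target on all three.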


If you're talking about HFT and front running (which HFTs do whether you like it or not) a retail trader can't compete in that sector.

HFT operates on algorithms that mostly involve making money on the spread by running ahead of the brokers, buying the cheap stuff, and selling it to the broker who needs it. They don't trade on market microstructure, mostly because all of it starts to fall apart at the tick level.

250k is CERTAINLY enough to invest in futures contracts. You can do it with much less. Much much less. Futures are highly leveraged instruments. You can diversify by trading multiples of futures contracts (or e-minis depending on account size) because you're only required to post initial and maintenance margin.

You can easily blow up your account, but if you're just TRADING, 5000 is enough to start selling a few contracts. If you're looking to start building something sustainable, 10k is enough. But the more the better.


As a veteran of one algo shop, I have this to say:

Play with the data all you like. Don't try to trade on it if you don't really know what you're doing. (Or, just recklessly trade other people's money. It's fun.)

What you're seeing here is the "napsterization of finance." (Google it, it will lead you to the article I am almost plagiarizing).

Basically, the market at large puts together a pot of money (called "alpha", debatably). The better you are at trading, the more of that pot you get.

BUT this is not a zero sum game. It's worse.

If the markets are functioning properly, then the better you are at this, the bigger the share of the pot you get, AND the smaller the pot of money gets.

It used to be that middlemen like the NYSE stock market specialists made very large amounts of money doing what Homer Simpson automated with a drinky bird. Now, the algo shops have already shrunk that pot considerably. Good news for your pension fund. Bad news for you if you try this yourself. So don't.


For the past year I have been trying to learn more about trading, risk management, etc. There are so many stories about how the markets work and how to make money in them. You could spend your lifetime throwing money down a hole trying each one and probably do worse than random. I can't say enough good things about the perspective I have gained from just listening to good interviews of people that trade and manage funds for a living. Take a look at https://chatwithtraders.com/podcast/ and https://realvisiontv.com


Never trust stories about how to make money in the stock markets, unless said stories are told entirely in the past tense.

If someone really does have a successful trading strategy, the only way it makes economic sense to publish it is if they believe they can make more money by publishing it now (i.e. selling books, pageviews, whatever) than by using it to trade. Either that, or the algorithm is being described in sufficiently general terms that you're not actually given enough information to use it effectively.


Couldn't agree more.


Have you read Market Wizards by Jack Schwager? Fantastic book and it really goes deep with how these traders think about approaching and exiting a trade/market.


The problem is it's all hogwash due to survivorship bias; every single one of them could simply be lucky. This is literally no different than interviewing lottery winners and asking them how they chose their numbers. People who write books about the successful are conning you for cash, always. There is nothing to be learned from the habits of just successful people; other than misleading yourself with superstition.


I have looked at Zipline before, but it does not handle intraday trades, and it makes some guesses about when the trade executes during the "day", so you may not get the best price.

Running an algorithm for multi-day trades for more than a few months does not make sense given how the markets move, as certain events like "brexit", earnings, M&A, etc. affect stock prices.

If you are really interested in algorithmic trading, and you have programming experience, it's best to build your own backtesting system with intraday market data (pay for this).

This way you will know the ins and outs of a trading system.


Zipline dev here. Zipline happily works on minutely data (in fact, we recently dropped support for daily mode entirely on Quantopian, which is built on top of Zipline).

All the tutorials and examples for Zipline use daily data because there's no freely-available minutely data that we can distribute to our users.


Good to hear, I stopped using quantopian because of lack of intraday details.


;-) you ever review that PR I sent? No browsing HN on the Job! (I kid)


I looked at it briefly over the weekend and then got distracted trying to make numpy.isfinite() work on datetimes :(. It's still in the queue though! Feel encouraged to gently bump it if I don't get back to you in the next day or two.


For backtesting intraday trades, LEAN from QuantConnect is a better alternative.


Could you explain why you feel that it is better? What in particular does LEAN support that Zipline does not support?


Two features of LEAN that won me over: 1) LEAN supports finer-grained data such as minutes, seconds, and ticks. 2) LEAN was developed in C#, which is way faster than Python.


At first glance you would think that Python would be slower than C#; however, all of the real computation is happening in NumPy (C), pandas (Cython, which compiles to C), or in our own Cython. The extra overhead of the Python layer is small compared with the array computations happening in C or the IO of loading data.


Equities, FOREX, Futures, Options; tick, second, minute, hour and daily resolutions. Python, C# and F# backtesting. Dozens of models for improving the accuracy of your backtest.

Live trading on IB, Tradier, FXCM, Oanda and paper trading.

Local charting built in for desktop and backtesting.

Lots of tools provided for free data downloads to work with public free data libraries. lean.quantconnect.com

(I'm founder of QC :))


It is a little disingenuous to say that Zipline doesn't support Python; the short description on GitHub says: "Zipline, a Pythonic Algorithmic Trading Library". Zipline also supports equities at minute and daily frequencies. There is no charting built into Zipline itself, but tearsheets and graphs can be generated with pyfolio (a project by the same people as Zipline). Zipline also comes with the ability to pull pricing and splits data from Quandl and Yahoo.

I realize I can't win you over but I wanted to present a fair comparison for others ;)

Also, I work on zipline


Before you jump in to try to make your millions: a fairly well-known and accepted statistic is that at least 95% of day traders (using any method) lose money.

All the very successful day traders I know lost lots of money in the beginning before learning how to do it properly. You need to have a solid source of funds to fuel your learning, and tremendous patience.


I think Ruby needs to start broadening its appeal beyond Rails.


I use the Magic Formula[0] strategy because index funds are too boring for me. It's a value strategy and you have to hold for a year, but it's fun to see your stocks rise (and fall).

[0] https://www.magicformulainvesting.com/
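
For anyone curious, the ranking idea behind the strategy can be sketched roughly as follows: rank every company by earnings yield, rank again by return on capital, add the two ranks, and favour the lowest combined score. The tickers and ratios below are invented, and this sketch leaves out the universe filters and holding rules of the actual method.

```python
def combined_rank(companies):
    """Order tickers by the sum of their two ranks, best first.

    companies: dict mapping ticker -> (earnings_yield, return_on_capital).
    Rank 0 is the best value on each measure; ties on the combined
    score are broken alphabetically.
    """
    by_yield = sorted(companies, key=lambda t: companies[t][0], reverse=True)
    by_roc = sorted(companies, key=lambda t: companies[t][1], reverse=True)
    score = {t: by_yield.index(t) + by_roc.index(t) for t in companies}
    return sorted(companies, key=lambda t: (score[t], t))


# Hypothetical (earnings yield, return on capital) pairs.
companies = {
    "AAA": (0.05, 0.40),
    "BBB": (0.15, 0.10),
    "CCC": (0.10, 0.25),
    "DDD": (0.12, 0.30),
}
print(combined_rank(companies))
```

In this toy data the top spot goes to the company that is second on both measures, ahead of companies that are best on one and worst on the other.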


I just want to check my understanding of the algorithmic trading "world", so please do jump in.

Once upon a time (1986ish) the equities and bond trading world was run by humans talking to humans and agreeing deals, the prices then fed into computer systems and the exchanges passed the prices around to make things mostly fair.

Fair, of course, is relative: the ecosystem was very hierarchical, with major institutions at the top trading between each other at low fees, brokers feeding up into them, and retail shops feeding into major brokers. The customer got a raw deal, being charged heavy fees per transaction and getting a poor "spread".

Spread was where the major institutions made their money. Human traders effectively bought very low and sold very high - both because they were human and could not easily handle algorithms in their heads and because who was going to stop them? At the top of the hierarchy traders got to see both sides of every trade - they could net trades off one against the other to make deals with little risk. And if it was not visible in a fair exchange they had even more leverage.

Spreadsheets took off around now, making it possible for one trader to plan and monitor his trades and look really good to his boss.

And then it became obvious that having a human in the spreadsheet-to-trade loop was suboptimal. A human with a spreadsheet still needed to dial a phone, make a decision, go to the toilet. A perl script could outperform him.

And at the time the algorithms were simple. If Exxon's share price dropped then pretty obviously other oil companies would drop too, but so would say car company stocks, but maybe coal miner shares would go up. And that's just in LSE - the same goes for Hong Kong and Chicago. Those correlations I could work out in a perl script. (OK, 1980, maybe some Basic :-)

And so algo trading was feasible with really tiny hardware, because the correlations in the world markets were simple, and large. And so low latency trading started. Because if I can use my ZX Spectrum or my Commodore 64 to beat major traders to the punch, then all you need is a faster computer than the Commodore and you beat me to the punch. And so it goes.

Fast forward twenty years and

- the hierarchy of the past is mostly still in place. Retail shops pull in the customers' money and pass it upwards to brokers, who deal with traders at large banks. However, the traders are much reduced in number, and the volumes they do are orders of magnitude larger now.

- the spread has gone. Major institutions make money on tiny margins and tiny fees and just do vast vast volumes. Major FX desks will make maybe 10 USD on a billion dollars of Eurodollar trades (I think).

- the spread has gone for the algo traders. The reason PhDs are needed is because the correlations and arbitrage are all eaten up. The wins are few and far between and mostly need real world events (Brexit)

- this is generally good: there is more trade on open exchanges (good for everyone) and there are smaller spreads (good for customers). The breakneck automation is also good for contractors like me :-)

I'm not sure where I am going with this, to be honest, but mostly it's that I am sure Zipline is a good library, and that the core part is written in the way a proprietary engine would look if someone took a year to rewrite it, but the core tech will not give you any edge. That edge has gone. The correlations have gone except in esoteric areas.

If you want the edge, you need to be at the top of the tree again.



