
Introduction to Zipline: A Trading Library for Python - kawera
http://www.quantinsti.com/blog/introduction-zipline-python/
======
chollida1
I've typed and deleted this post a few times trying to find a way that it
doesn't sound kind of pompous but if it helps save one person a lot of money
then screw it, I'll sound pompous....

I get asked quite a bit about how to start doing algorithmic trading and the
first thing I always tell people is don't.

I think I've said this many times now but the number of people who come at it
with the thinking "I'm a computer scientist. I'll just fire up R or python and
apply some machine learning to the markets and watch the money roll in" is
staggering.

I mean each day hundreds of PhDs start with clean market data, more data sources
than you could possibly think of and statistical backtesting systems that
have thousands of man-hours put into them, trying to find a way to make money.

After all of that, if you really want to, I wrote this in response to an Ask
Hacker News a little while ago:

[https://news.ycombinator.com/item?id=11352562](https://news.ycombinator.com/item?id=11352562)

TL;DR:

\- focus on time periods greater than a day

\- expect to lose money

\- expect to take a year to figure out some edge in the market

\- most decent trading strategies that a normal person can use come from
economic/market insights first and technology second.

The site: [https://www.quantstart.com/](https://www.quantstart.com/) is also
decent at bringing you up to speed on the math you'll need to know though I
believe that the material there oversells how easy it is to find a decent
trading strategy.

~~~
canttestthis
I never understood this. If it's possible to be a profitable independent day
trader, and we know it is because many are, then it should be possible to code
the rules you follow and become a profitable algo trader.

~~~
hellofunk
You vastly underestimate the complexity of a discretionary trader's intuition
and experience. You cannot just replicate years of human experience with
computer code so easily.

~~~
selectron
This statement is too general. You could have said the same thing about chess,
there are chess Grandmasters who devote their lives to studying the game yet
computers play chess at a much higher level than any human.

~~~
semi-extrinsic
Chess is rational, following an easily understood set of rules, and both
players have perfect information. The big problem has always been analysing
all future possibilities.

The stock markets are very far from a rational, perfect information game with
simple rules.

------
cowmoo
Hi, a shameless plug: I went to the Quantopian (the company that is behind
Zipline and essentially uses Zipline as the core backend to their cloud
platform) algo-trading hackathon two weekends ago and came up with this algo:

[https://www.quantopian.com/posts/xiv-slash-vxx-pair-trade-1](https://www.quantopian.com/posts/xiv-slash-vxx-pair-trade-1)

Pair-trading VXX and XIV based on the StockTwits sentiment for SPY at
market open. The backtest did really well from 2011 to 2014, with a 1700-1800%
return in 3 years, and was flat from 2014 to the present.

I'd really love it if people could improve upon the algo, and to see what people
come up with when they clone it to mitigate the drawdowns and
improve the performance!

------
unknown_apostle
Everybody's trading nowadays.

How about just investing :-)

I.e. focus on periods longer than a year, which so few people/professional
market participants do. And on actual businesses instead of the crazy antics
of a line.

I wonder if you could use something like Zipline/Quantopian to screen huge
amounts of consolidated balance sheets for markers of undervaluation. You
could reject 1000s of companies and focus your “manual” vetting on the few
that remain.

If you can find the dollar selling for half a dollar and you can understand
why it's selling for that price (e.g. because the entire market is down), you
may have identified a winner. Then all you need is a little guts and lots of
patience. And a predefined set of criteria that you would constantly monitor to
decide if your thesis is still valid.
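A rough sketch of what such a balance-sheet screen could look like, in plain Python. Everything here is an assumption made for illustration: the field names, the tickers, and the thresholds are invented, and price-to-book at or below 0.5 is only a crude stand-in for "a dollar selling for half a dollar". It does not use the Zipline or Quantopian APIs.

```python
def screen_undervalued(companies, max_price_to_book=0.5, max_debt_to_equity=1.0):
    """Return (ticker, price_to_book) pairs that pass a crude value screen.

    companies: list of dicts with hypothetical keys
               ticker, market_cap, total_assets, total_liabilities
    """
    survivors = []
    for c in companies:
        book_equity = c["total_assets"] - c["total_liabilities"]
        if book_equity <= 0:
            continue  # negative book value: the ratios below are meaningless
        price_to_book = c["market_cap"] / book_equity
        debt_to_equity = c["total_liabilities"] / book_equity
        if price_to_book <= max_price_to_book and debt_to_equity <= max_debt_to_equity:
            survivors.append((c["ticker"], round(price_to_book, 2)))
    # Cheapest first, so manual vetting starts with the strongest candidates.
    return sorted(survivors, key=lambda pair: pair[1])


# Made-up data: four hypothetical companies, figures in millions.
universe = [
    {"ticker": "AAA", "market_cap": 250, "total_assets": 1000, "total_liabilities": 400},
    {"ticker": "BBB", "market_cap": 20, "total_assets": 500, "total_liabilities": 450},
    {"ticker": "CCC", "market_cap": 900, "total_assets": 800, "total_liabilities": 200},
    {"ticker": "DDD", "market_cap": 10, "total_assets": 300, "total_liabilities": 350},
]
shortlist = screen_undervalued(universe)
```

The screen only rejects; anything left on `shortlist` still needs the manual vetting described above.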

~~~
nether
How about no. You're entering a field where professionals working full time
struggle to beat the market. What's to say that you, a folder of 10-Ks, and a
copy of Ben Graham are going to beat them? You could probably spend a lifetime
studying investing and still come up short, because you don't have the
resources or mentoring that the pros have.

~~~
phyalow
Wrong, that's all you need. In fact, according to Peter Lynch you have a better
chance of alpha as you don't have to deal with all the bs a PM at a big fund
has to.

~~~
pfarnsworth
The markets that Peter Lynch traded and the markets today are completely
different.

~~~
phyalow
Sigh, trading vs. investing... Also if you make a statement like that I am
sure you are unfamiliar with his thesis and methodology.

------
shortstuffsushi
General question: how do you take this and interact directly with the market?
Is there some sort of general, public API that you're making calls against,
where do you get an account for it, etc? Or, is this going through some firm
that interfaces with the market?

~~~
shortstuffsushi
Upvotes, but no comments :(

If anyone comes back to this ever, still interested in knowing!

------
overcast
I basically implemented this, and a lot of other features for my own personal
trading bot against the Cryptsy API written in Python. The idea was to make
tons of small trades on alt coins throughout the day, constantly buying and
selling on short crossovers, making fractions of a percent profit after fees.
It turns into an up and down roller coaster, sometimes you're way ahead, and
other times you lose it all. The biggest issue was the low volume on most of
the alt coins. At the end of the day, like many others have said, it's mostly
luck, you will lose money eventually, but it's an interesting learning
exercise. The only real way to win, is to have some insight into the market,
not machines.
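For anyone curious what "buying and selling on short crossovers" for a fraction of a percent after fees looks like, here is a minimal sketch. It is not the bot described above: the window lengths, the prices, and the 0.25% per-side fee are all made-up assumptions, and no exchange API is involved.

```python
def moving_average(values, window):
    """Simple moving average; None until a full window is available."""
    return [
        sum(values[i - window + 1:i + 1]) / window if i >= window - 1 else None
        for i in range(len(values))
    ]


def net_return(entry_price, exit_price, fee_rate):
    """Round-trip return after paying fee_rate on both the buy and the sell."""
    return (exit_price / entry_price) * (1.0 - fee_rate) ** 2 - 1.0


def crossover_trades(prices, fast=2, slow=3, fee_rate=0.0025):
    """Long-only crossover: buy when the fast MA is above the slow MA, sell when not.

    Assumes fast <= slow. Trades are filled at the same bar's price that
    produced the signal, which is a simplification.
    """
    fast_ma = moving_average(prices, fast)
    slow_ma = moving_average(prices, slow)
    holding, entry, results = False, None, []
    for t in range(slow - 1, len(prices)):
        above = fast_ma[t] > slow_ma[t]
        if above and not holding:
            holding, entry = True, prices[t]
        elif not above and holding:
            results.append(net_return(entry, prices[t], fee_rate))
            holding = False
    return results


prices = [10, 10, 10, 11, 12, 11, 9, 9]  # made-up price path
trades = crossover_trades(prices)

# A 0.4% gross gain turns into a small loss once both fees are paid.
tiny_win_after_fees = net_return(100.0, 100.4, 0.0025)
```

The second example is the uncomfortable part: with fees charged on both sides, a trade has to clear roughly twice the fee rate before it makes anything at all.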

------
lordnacho
I'm sitting at an HFT here, coding.

This zipline thing is quite interesting if you're new, but if you can code,
I'm not sure what the advantage is. The idea of a backtest is quite simple,
and you can easily fire up something like pandas to do it for you. The equity
line is simply your positions x returns, minus costs. To determine your
positions, you have to make sure you aren't looking at future prices, but
apart from that you are flexible in doing whatever you like.
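A minimal sketch of that "positions x returns, minus costs" idea, written in plain Python to stay dependency-free. The function name and the cost model (a flat charge per unit of position change) are assumptions for illustration, not anything taken from Zipline.

```python
def equity_line(prices, signal, cost_rate=0.0):
    """Cumulative P&L (in return units) for a single instrument.

    prices: closing prices, one per bar
    signal: position chosen at each close (e.g. +1 long, 0 flat, -1 short)
    cost_rate: hypothetical cost charged per unit of position change
    """
    equity = []
    total = 0.0
    prev_position = 0.0
    for t in range(1, len(prices)):
        # The position earning the return from t-1 to t was decided at t-1,
        # so the signal never sees the price it is being paid on.
        position = signal[t - 1]
        bar_return = prices[t] / prices[t - 1] - 1.0
        cost = cost_rate * abs(position - prev_position)
        total += position * bar_return - cost
        equity.append(total)
        prev_position = position
    return equity


prices = [100.0, 110.0, 99.0, 108.9]
signal = [1, 1, 0, 0]  # long for two bars, then flat
curve = equity_line(prices, signal, cost_rate=0.001)
```

In pandas the same thing is roughly `positions.shift(1) * prices.pct_change()` minus a cost term. The `shift(1)` is what keeps future prices out of the position.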

And this was a question for me. Suppose I want to code a cross-sectional
strategy. How would I do that in zipline? It seems to be the kind of thing
that gives you one backtest for one time series. Perhaps I just haven't looked
into it enough. When we backtest, often we want to do things across the
ensemble. We also take positions in a whole universe of instruments, so the
backtest needs to be a matrix, rather than just one column.
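To make the "matrix rather than one column" point concrete, here is a small sketch of a cross-sectional backtest in plain Python. It says nothing about how Zipline handles this. The ranking rule (long the top half, short the bottom half, equal weights) and all the numbers are assumptions chosen to keep the example short.

```python
def cross_sectional_weights(scores):
    """Dollar-neutral weights for one date: long the top half, short the bottom half."""
    n = len(scores)
    half = n // 2
    order = sorted(range(n), key=lambda i: scores[i])  # worst score first
    weights = [0.0] * n
    for i in order[:half]:
        weights[i] = -1.0 / half
    for i in order[n - half:]:
        weights[i] = 1.0 / half
    return weights


def backtest_universe(returns, scores):
    """Per-period P&L of the whole book.

    returns[t][i] and scores[t][i] are dates-by-instruments matrices.
    Weights set from scores at t-1 earn the returns at t (no lookahead).
    """
    pnl = []
    for t in range(1, len(returns)):
        weights = cross_sectional_weights(scores[t - 1])
        pnl.append(sum(w * r for w, r in zip(weights, returns[t])))
    return pnl


# Made-up data: 3 dates x 4 instruments.
returns = [
    [0.00, 0.00, 0.00, 0.00],
    [0.02, -0.01, 0.00, 0.01],
    [-0.01, 0.03, 0.01, 0.00],
]
scores = [
    [3, 1, 2, 0],
    [0, 3, 1, 2],
    [1, 1, 1, 1],
]
pnl = backtest_universe(returns, scores)
```

The position for each instrument depends on every other instrument's score on that date, which is why a one-series-at-a-time backtest does not fit this kind of strategy.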

Incidentally, the example strategy will work quite well for retail traders.
You can add a bunch of futures together and get a Sharpe well over 1,
basically what every CTA does but won't admit to. If you're wondering what all
those PhDs do all day, it's adding capacity and researching minor improvements
on that MA strategy. A colleague of mine worked at one of these brand names,
and another friend owns one.

So, does that mean anyone can simply do this? Well, yes. But you'd have a lot
of leg work to do, and you might get discouraged before you start. You need an
account from someone like Interactive Brokers. You need a fair bit of money,
or you'll have increment problems trading the large contracts. And you'll have
to set up all the data feeds and look at it each day.

~~~
infinite8s
2 questions - what kind of leg work is involved, and how much is a fair bit of
money? Is it possible with 250k of working capital?

~~~
lordnacho
Leg work:

\- Getting the data into a shape that you can use. Normally a total PITA. For
futures, you have to either stitch the contracts yourself, or get a pre-
stitched series, which you have to take time to understand. Filtering it for
weird data points.

\- Writing the strategy / backtesting code. The fun part.

\- Connecting to a broker. Gotta read API docs, test the functions, connect it
to your code in a way that makes sense, and probably in a way that makes it
easy to switch brokers. Test the price feed, write error handling code.

\- Daily operations code. You'll need a daily process where you can see what's
going on. Automated testing of the trade report for correctness. Notifications
from brokers need responses, you need to post margin as well. Some kind of SMS
or Whatsapp for when something is wrong. Holiday calendar.
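As an illustration of the first item, stitching futures contracts, here is a minimal back-adjusted stitch in plain Python. This is one common convention among several, not necessarily the one any data vendor uses. The function name, the data layout, and the prices are assumptions.

```python
def stitch_back_adjusted(contracts, roll_dates):
    """Build one continuous series from consecutive futures contracts.

    contracts:  list of {date: price} dicts in expiry order
    roll_dates: roll_dates[k] is the date we switch from contract k to k+1;
                both contracts must have a price on that date
    Returns a list of (date, adjusted_price) tuples in date order.
    """
    series = []
    for k, contract in enumerate(contracts):
        start = roll_dates[k - 1] if k > 0 else None
        end = roll_dates[k] if k < len(roll_dates) else None
        if k > 0:
            # Price gap between new and old contract on the roll date.
            gap = contract[start] - contracts[k - 1][start]
            # Shift all earlier history so the roll shows no artificial jump.
            series = [(d, p + gap) for d, p in series]
        for d in sorted(contract):
            if (start is None or d > start) and (end is None or d <= end):
                series.append((d, contract[d]))
    return series


# Made-up prices; ISO date strings sort correctly as plain strings.
front = {"2016-01-04": 100, "2016-01-05": 101, "2016-01-06": 102}
back = {"2016-01-06": 105, "2016-01-07": 106}
continuous = stitch_back_adjusted([front, back], ["2016-01-06"])
```

The raw prices jump from 102 to 105 at the roll, which a backtest would book as a profit that never existed. After adjustment the day-to-day changes are preserved, but older price levels are shifted, which is part of what you have to understand before using a pre-stitched series.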

250k is not enough. Some of the futures contracts are quite large, and you
won't be able to get the full benefit of diversification if you don't have a
bunch of instruments to trade (look for a blog called Investment Idiocy, he
recently talked about this). Above ~$3-5M, it isn't a problem and you can
ignore it.

~~~
sldjfsdfkjsdfkj
If you're talking about HFT and front running (which HFTs do whether you like
it or not) a retail trader can't compete in that sector.

HFT operates on algorithms that mostly involve making money on the spread by
running ahead of the brokers, buying the cheap stuff, and selling it to the
broker who needs it. They don't trade on market microstructure, mostly because
all of it starts to fall apart at the tick level.

250k is CERTAINLY enough to invest in futures contracts. You can do it with
much less. Much much less. Futures are highly leveraged instruments. You can
diversify by trading multiples of futures contracts (or e-minis depending on
account size) because you're only required to post initial and maintenance
margin.

You can easily blow up your account, but if you're just TRADING, 5000 is enough
to start selling a few contracts. If you're looking to start building
something sustainable 10k is enough. But, the more the better.
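The margin arithmetic behind both "much less is enough" and "you can easily blow up your account" can be sketched like this. The contract price, multiplier, and margin figure are hypothetical round numbers, not the specification of any real contract.

```python
def futures_exposure(account_equity, contracts, price, multiplier, initial_margin):
    """Compare what you control (notional) with what you must post (margin)."""
    notional = contracts * price * multiplier
    margin_required = contracts * initial_margin
    return {
        "notional": notional,
        "margin_required": margin_required,
        "leverage": notional / account_equity,
        "can_open": margin_required <= account_equity,
    }


def loss_fraction(account_equity, notional, adverse_move):
    """Fraction of the account lost if the price moves against you by adverse_move."""
    return notional * adverse_move / account_equity


# Hypothetical: a 10k account, one contract at price 2000 with a 50x multiplier,
# and 5000 of initial margin per contract.
exposure = futures_exposure(10_000, 1, 2000, 50, 5_000)
hit = loss_fraction(10_000, exposure["notional"], 0.01)
```

With these numbers the margin fits in the account, but the position is ten times the account size, so a 1% move against it costs about 10% of the equity.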

------
ocschwar
As a veteran of one algo shop, I have this to say:

Play with the data all you like. Don't try to trade on it if you don't really
know what you're doing. (Or, just recklessly trade other people's money. It's
fun.)

What you're seeing here is the "napsterization of finance." (Google it, it
will lead you to the article I am almost plagiarizing).

Basically, the market at large puts together a pot of money (called "alpha",
debatably). The better you are at trading, the more of that pot you get.

BUT this is not a zero sum game. It's worse.

If the markets are functioning properly, then the better you are at this, the
bigger the share of the pot you get, AND the smaller the pot of money gets.

It used to be that middlemen like the NYSE stock market specialists made very
large amounts of money doing what Homer Simpson automated with a drinky bird.
Now, the algo shops have already shrunk that pot considerably. Good news for
your pension fund. Bad news for you if you try this yourself. So don't.

------
lootsauce
For the past year I have been trying to learn more about trading, risk
management, etc. There are so many stories about how the markets work and how
to make money in them. You could spend your lifetime throwing money down a
hole trying each one and probably do worse than random. I can't say enough
good things about the perspective I have gained from just listening to good
interviews of people that trade and manage funds for a living. Take a look at
[https://chatwithtraders.com/podcast/](https://chatwithtraders.com/podcast/)
and [https://realvisiontv.com](https://realvisiontv.com)

~~~
usefulcat
Never trust stories about how to make money in the stock markets, unless said
stories are told entirely in the past tense.

If someone really does have a successful trading strategy, the only way it
makes economic sense to publish it is if they believe they can make more money
by publishing it now (i.e. selling books, pageviews, whatever) than by using
it to trade. Either that, or the algorithm is being described in sufficiently
general terms that you're not actually given enough information to use it
effectively.

~~~
lootsauce
Couldn't agree more.

------
edsouza
I have looked at Zipline before, but it does not handle intraday trades, and
makes some guesses about when the trade executes during the "day", so you may not
get the best price.

Running an algorithm for multi-day trades for more than a few months does not
make sense given how the markets move, as certain events like Brexit, earnings,
M&A, etc. affect stock price.

If you are really interested in algorithmic trading, and you have programming
experience, it's best to build your own backtesting system with intraday
market data (pay for this).

This way you will know the ins and outs of a trading system.

~~~
ssanderson11235
Zipline dev here. Zipline happily works on minutely data (in fact, we recently
dropped support for daily mode entirely on Quantopian, which is built on top
of Zipline).

All the tutorials and examples for Zipline use daily data because there's no
freely-available minutely data that we can distribute to our users.

~~~
jjm
;-) you ever review that PR I sent? No browsing HN on the Job! (I kid)

~~~
scoutoss11235
I looked at it briefly over the weekend and then got distracted trying to make
numpy.isfinite() work on datetimes :(. It's still in the queue though! Feel
encouraged to gently bump it if I don't get back to you in the next day or
two.

------
hellofunk
Before you jump in and try to make your millions, a fairly well-known and accepted
statistic is that at least 95% of day traders (using any method) lose money.

All the very successful day traders I know lost lots of money in the beginning
before learning how to do it properly. You need to have a solid source of
funds to fuel your learning, and tremendous patience.

------
ksec
I think Ruby needs to start broadening its appeal beyond Rails.

------
seibelj
I use the Magic Formula[0] strategy because index funds are too boring for me.
It's a value strategy and you have to hold for a year, but it's fun to see
your stocks rise (and fall).

[0]
[https://www.magicformulainvesting.com/](https://www.magicformulainvesting.com/)

------
lifeisstillgood
I just want to check my understanding of the algorithmic trading "world", so
please do jump in.

Once upon a time (1986ish) the equities and bond trading world was run by
humans talking to humans and agreeing deals, the prices then fed into computer
systems and the exchanges passed the prices around to make things mostly fair.

Fair of course is relative, the ecosystem was very hierarchical, with major
institutions at the top, trading between each other at low fees, with brokers
feeding up into them and retail shops feeding into major brokers. The customer
got a raw deal, being charged heavy fees per transaction, and getting a poor
"spread".

Spread was where the major institutions made their money. Human traders
effectively bought very low and sold very high - both because they were human
and could not easily handle algorithms in their heads and because who was
going to stop them? At the top of the hierarchy traders got to see both sides
of every trade - they could net trades off one against the other to make deals
with little risk. And if it was not visible in a fair exchange they had even
more leverage.

Spreadsheets took off around now, making it possible for one trader to plan
and monitor his trades and look really good to his boss.

And then it became obvious that having a human in the spreadsheet-to-trade
loop was suboptimal. A human with a spreadsheet still needed to dial a phone,
make a decision, go to the toilet. A perl script could outperform him.

And at the time the algorithms were simple. If Exxon's share price dropped
then pretty obviously other oil companies would drop too, but so would say car
company stocks, but maybe coal miner shares would go up. And that's just in
LSE - the same goes for Hong Kong and Chicago. Those correlations I could work
out in a perl script. (OK, 1980, maybe some Basic :-)
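The kind of correlation check described here really is only a few lines. A sketch in Python rather than perl or Basic, with invented price series for two hypothetical oil stocks and one hypothetical coal miner:

```python
from math import sqrt


def simple_returns(prices):
    """Period-over-period returns from a list of prices."""
    return [b / a - 1.0 for a, b in zip(prices, prices[1:])]


def correlation(xs, ys):
    """Pearson correlation of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up prices: the second oil stock moves in lockstep with the first,
# the coal miner moves the other way.
oil_a = [100.0, 98.0, 99.0, 95.0]
oil_b = [50.0, 49.0, 49.5, 47.5]
coal = [20.0, 20.5, 20.3, 21.0]

oil_vs_oil = correlation(simple_returns(oil_a), simple_returns(oil_b))
oil_vs_coal = correlation(simple_returns(oil_a), simple_returns(coal))
```

Finding relationships like these is the easy part, which fits the argument here that the simple, large correlations were competed away long ago.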

And so algo trading was feasible with really tiny hardware - because the
correlations in the world markets were simple, and large. And so low latency
trading started. Because if I can use my ZX Spectrum or my Commodore 64 to
beat major traders to the punch, then all you need is a faster computer than
the commodore and you beat me to the punch. And so it goes.

Fast forward twenty years and

\- the hierarchy of the past is mostly still in place. Retail shops pull in
the customers' money, pass it upwards to brokers and they deal with traders at
large banks. However the traders are much reduced, the volumes they do are
orders of magnitude larger now.

\- the spread has gone. Major institutions make money on tiny margins and tiny
fees and just do vast vast volumes. Major FX desks will make maybe 10 USD on a
billion dollars of Eurodollar trades (I think).

\- the spread has gone for the algo traders. The reason PhDs are needed is
because the correlations and arbitrage are all eaten up. The wins are few and
far between and mostly need real world events (Brexit)

\- this is generally good, there is more trade on open exchanges (good for
everyone) and there are smaller spreads (good for customers). The breakneck
automation is good for contractors like me too :-)

I'm not sure where I am going with this to be honest - but mostly it's that I
am sure Zipline is a good library, that the core part is written in the way a
proprietary engine would look if someone took a year to rewrite it, but the
core tech will not give you any edge - that edge has gone. The correlations
have gone except in esoteric areas.

If you want the edge, you need to be at the top of the tree again.

