Note that you can't redistribute this data or generate income through it [0] per the Yahoo TOS [1].
Also note that Yahoo's data is somewhat unreliable [0, 2].
Unfortunately, options data is just hard to get access to due to the agreements forced on providers by the exchanges. If you want to build a trading system on it without an institutional budget, the best options I've found so far are QuantConnect [3], which has paid for it and lets you build your system on their platform, and OptionWorks [4], where you can pay $150 and download all the data. You should be able to get a live feed from your broker after that. If you want it for academic research, many universities and colleges have access to much higher quality data.
I would not recommend using the Yahoo or Google APIs to get option prices. Some of the bid/ask prices are completely wrong and some expiration dates are missing. I created an option scanner as a side project and found out about this bad data quality the hard way.
It's a good place to get solid data but it's quite expensive. EOD history for SPX alone is $250 apparently. All symbols are $2500 and my understanding is that this is just CBOE options? Is CME not in there?
iVolatility told me it would be ~$1000 for EOD chains + IV & Greeks since 1990 - just for SPX. CBOE is partners with iVolatility so I imagine their data is decent, but I can't bring myself to drop that much on it for a simple personal project.
How's the data quality of OptionWorks? Do they have the indices? I bought several years of EOD data for SPX from https://www.historicaloptiondata.com/ and found it to be of only moderate quality. (Charting some of the data on multiple dimensions makes the stuff that is padded stick out like a sore thumb... I ended up tossing about 20% of the data).
TDAM's API is a little weak on options. Although you can pull the entire option chain for almost any stock during trading hours, that's about all you can do. There's no historical data or realtime streaming, and there is a rate limit of some kind.
For stocks (not options), they took it to the next level, and you can stream price/volume data in pseudo-realtime. I think you can also pull EOD data on stocks (not options) going back pretty far.
Are option prices what I think of when I think "Stock Data"? I've been wanting to download and play around with processing "Stock Data" but I don't really know what I'm looking for. What metrics are stocks judged by? I'm assuming price is only one of many things.
> Are option prices what I think of when I think "Stock Data"?
You may be thinking of simply stock prices or stock fundamentals (income, expenditures, that kind of stuff) but you may also be thinking of other things like sentiment or analyst opinions. There's a huge range of things you can look at. Options prices are certainly one kind of stock data though. They give valuable information like implied volatility (how much the market at a particular time expects the stock to move over a particular future period).
If you're interested in stock data, Quandl (http://quandl.com) has a broad selection of the different kinds available (though you don't have to purchase there) and Quantopian (https://quantopian.com/) has some examples of how to use it.
To be clear though, options are not stocks.
A stock is a share of ownership of a company. You buy it, you sell it, that's it.
An option is a contract between you and another party that gives you the right but not the obligation to buy or sell a stock at a particular price for a certain amount of time.
Practical examples:
- I buy 10 AAPL stock at $100. I now own $1000 worth of Apple. AAPL's stock price rises to $110. I now own $1100 of AAPL stock, which I can sell and make $100.
- I buy a contract that gives me the right to purchase AAPL (an AAPL call option) for $100 (the "strike price" is $100) which expires in 30 days. This right is limited to 100 AAPL shares and I pay $500 for it. For the next 30 days, regardless of the price of AAPL stock, I can buy 100 AAPL for $100 each. If the price of AAPL drops, I lose the $500 I paid for the rights. If the price of AAPL rises to the $110 from earlier, I can either buy 100 shares of AAPL for $100 and sell them immediately for $110, making $1000 - $500 (for the contract) = $500, or I can simply sell the rights to someone else and make the profit more cleanly.
This isn't meant to be an exhaustive course on how each security works, just an illustration of exactly how different they are, yet affect each other.
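For anyone who wants to play with the numbers in the two examples above, here is a minimal sketch of the arithmetic. The function names are made up for illustration, and it ignores commissions, early exercise and the option of selling the contract before expiry.

```python
# Hypothetical helper names; the figures come from the AAPL examples above.

def stock_profit(shares, buy_price, sell_price):
    """Profit from buying shares outright and selling them later."""
    return shares * (sell_price - buy_price)


def long_call_profit_at_expiry(strike, premium_paid, price_at_expiry,
                               shares_per_contract=100):
    """Profit from holding one call contract until expiry.

    The contract is only worth exercising when the stock is above the
    strike; otherwise it expires worthless and the premium is lost.
    """
    exercise_value = max(price_at_expiry - strike, 0) * shares_per_contract
    return exercise_value - premium_paid


print(stock_profit(10, 100, 110))                 # 10 shares, $100 -> $110
print(long_call_profit_at_expiry(100, 500, 110))  # stock rises to $110
print(long_call_profit_at_expiry(100, 500, 95))   # stock drops to $95
```

With these inputs the call only breaks even if AAPL is at $105 at expiry, since the $500 premium is spread over 100 shares.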
I posted this here because after many years, it has proven difficult to get simple options prices. There seem to be a lot of paid services and misinformation.
This guy [1] has written a neat post on generating option data from historic raw data. Would be quite interesting to compare the results on 1-minute data compared to raw prices. It seems OK for longer holding periods, where you'd mostly look to use options as directional leverage on underlying large caps.
I've seen this too but I don't think it's particularly valuable, as it seems to base the prices on measured volatility, while the actual market prices them entirely differently. You can see in his chart at the bottom that the price diverges from real data by a full dollar at points, and that's just for SPY. I'd be interested to see how it goes for something like VIX.
It doesn't really work using VIX either. At the end of the day, using any constant across the options (measured volatility, VIX, etc.) you're assuming a flat volatility surface which you simply don't see in the real world.
A better approach might be to use some kind of avg volatility surface with VIX as a baseline, but even that leaves you with no sentiment.
For some strategies this might work well enough (e.g. a flat volatility surface implies a lot of 50/50 probabilities), but for any advanced historical analysis (which seems to be the scope of this post), you really need to have the price/IV of every individual option.
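To make the flat-surface point concrete, here is a rough sketch of what "generating" option prices from a single volatility number looks like. It assumes the textbook Black-Scholes formula for a European call with no dividends, which is my choice for illustration and not necessarily what the linked post uses. The spot, rate and volatility values are invented.

```python
import math


def norm_cdf(x):
    """Standard normal CDF, built from math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, years, rate, vol):
    """Black-Scholes price of a European call on a non-dividend stock."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (
        vol * math.sqrt(years)
    )
    d2 = d1 - vol * math.sqrt(years)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)


# A "flat surface": the same volatility is reused for every strike.
# Real chains typically quote a different implied vol per strike/expiry,
# which is exactly the information this shortcut throws away.
flat_vol = 0.20  # invented value, e.g. a measured volatility or a VIX level
for strike in (90, 100, 110):
    price = bs_call(spot=100, strike=strike, years=30 / 365, rate=0.01, vol=flat_vol)
    print(strike, round(price, 2))
```

Swapping in a per-strike volatility in place of `flat_vol` is the only way this kind of model can match a real chain, which brings you back to needing the real data.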
I tried using this and got really excited when I thought I found some options that were grossly mispriced (indicating that I could buy them, immediately exercise them, and make a hefty profit). Turns out that Yahoo just has bad data sometimes.
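A cheap filter can catch this kind of "free money" quote before it gets anyone excited. This is only a sketch: the function and its parameters are hypothetical, not part of any Yahoo API, and the rules are simple sanity checks that assume the option could be exercised immediately.

```python
def suspicious_quote_reasons(kind, strike, bid, ask, underlying_price):
    """Return a list of reasons why a single option quote looks wrong.

    kind is "call" or "put". An empty list means nothing obvious was found;
    it does not mean the quote is correct.
    """
    reasons = []
    if bid < 0 or ask <= 0:
        reasons.append("non-positive price")
    if bid > ask:
        reasons.append("crossed market (bid > ask)")

    # Value of exercising right now, per share.
    if kind == "call":
        intrinsic = max(underlying_price - strike, 0.0)
    else:
        intrinsic = max(strike - underlying_price, 0.0)

    # Buying at the ask and exercising immediately would be a sure profit,
    # which in practice usually means the quote is stale or bad.
    if ask < intrinsic:
        reasons.append("ask below intrinsic value")
    return reasons


print(suspicious_quote_reasons("call", 100, 4.0, 5.0, 110))
print(suspicious_quote_reasons("put", 100, 1.0, 1.2, 105))
```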
At QuantConnect we offer minute-level options trades & quotes for your backtesting. It's 400TB of data :D. (I'm the founder of QC.) To backtest it we run it on 5GHz water-cooled machines! We play with really fun toys :D
You can also use symbol=CMG170915P00580000 to get a specific strike and save a lot of bandwidth, unless you happen to need the entire series or all series/expirations
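That symbol looks like the usual OCC-style contract code: root ticker, expiry as YYMMDD, C or P, then the strike times 1000 padded to eight digits. A small sketch of decoding it, assuming that layout and a letters-only root (adjusted contracts can carry digits in the root, which this ignores). The function name is made up.

```python
import re
from datetime import date

# root (letters only here), YYMMDD, C/P, strike * 1000 in 8 digits
_OPTION_SYMBOL = re.compile(r"^([A-Z]{1,6})(\d{2})(\d{2})(\d{2})([CP])(\d{8})$")


def parse_option_symbol(symbol):
    """Split an OCC-style option symbol into its parts."""
    match = _OPTION_SYMBOL.match(symbol)
    if match is None:
        raise ValueError(f"not a recognised option symbol: {symbol!r}")
    root, yy, mm, dd, kind, strike = match.groups()
    return {
        "root": root,
        "expiration": date(2000 + int(yy), int(mm), int(dd)),
        "type": "call" if kind == "C" else "put",
        "strike": int(strike) / 1000,
    }


print(parse_option_symbol("CMG170915P00580000"))
```

So the example above is the CMG put expiring 2017-09-15 with a 580 strike.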
This should be fine for personal use though.
[0]: https://meumobi.github.io/stocks%20apis/2016/03/13/get-realt...
[1]: https://policies.yahoo.com/us/en/yahoo/terms/product-atos/ap...
[2]: https://seekingalpha.com/instablog/6052771-kevin-wrotenbery/...
[3]: https://quantconnect.com
[4]: https://www.quandl.com/databases/OWCS