Free API access is not 1 day but 24/7, at http://cryptomarketplot.com/api.json ; if you have traffic limits, check http://cryptomarketplot.com/api.json.br , added a few months ago but not shown on the main page (that was my doing: https://news.ycombinator.com/item?id=18653590 )
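If you want to try the compressed variant, here's a minimal sketch; it assumes the .br file is just the same JSON, brotli-compressed, and uses the third-party brotli package (not in the stdlib):

    import json
    import urllib.request

    import brotli  # pip install brotli; assumes plain brotli framing

    # Fetch the brotli-compressed variant to save bandwidth, then decode
    # it back into the same JSON document served at /api.json.
    with urllib.request.urlopen("http://cryptomarketplot.com/api.json.br") as resp:
        data = json.loads(brotli.decompress(resp.read()))
    print(len(data))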
The only drawback is the 1-minute resolution limit. Still good enough if you are not doing HFT.
If you are doing HFT, they had "all you can eat" packages: around $500/month for the BTCUSD pair on any 5 of the 75 supported exchanges, with different pricing for different latency offers (going all the way to colo!), using a custom client and software integration (for SLAs on latency targets).
I'd have to ask if they now provide historical data.
A few random questions:
1. Can you speak more to the "synchronized clock" part? Do you augment the feeds' timestamps with your own?
2. What kind of database did you choose for this? How much data are you managing?
Also (promise you'll give me a discount if I tell you this?), this seems underpriced. I haven't seen anyone sell truly high-resolution depth data going months back yet. For BitMEX, no less. But maybe I'm out of the loop. Anyway, congrats on the launch!
1. Yes, every message also gets a local timestamp (100 ns precision), and all exchange feeds run on a single VM host, hence "synchronized clock"; I could definitely use better wording there. I know alternative services have problems with this (out-of-sync timestamps for different symbols on a single exchange, for example), which is why it's mentioned.
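For illustration, a minimal sketch of the "local timestamp per message" idea in Python (field names are hypothetical; the actual capture is in .NET Core):

    import json
    import time

    def stamp(raw_message: bytes) -> dict:
        # Record the local receive time (nanosecond-resolution clock) next
        # to the exchange's own payload, so consumers can compare the two
        # and spot exchange-side timestamp skew.
        return {
            "local_ts_ns": time.time_ns(),
            "payload": json.loads(raw_message),
        }

    print(stamp(b'{"symbol": "XBTUSD", "price": 8000.5}'))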
2. It's all stored in Google Cloud Storage, plus Wasabi (an S3 alternative) as a backup, so no DB; one would be too expensive for what I wanted to achieve. Currently it's around 4-5 TB of compressed data (so 25-35 TB uncompressed; I'd need to check to be sure). Data filtering (by symbol and channel) is done on demand, and the API clients cache filtered data locally. All written from scratch in .NET Core, with Cloudflare Workers as the public-facing API auth and caching proxy.
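To make the "filter on demand, cache locally" flow concrete, a rough client-side sketch; the endpoint, query parameters, and gzip framing are all my assumptions, not the real API:

    import gzip
    import os
    import urllib.request

    CACHE_DIR = os.path.expanduser("~/.mdcache")  # hypothetical cache layout

    def fetch_day(exchange: str, date: str, symbol: str, channel: str) -> str:
        # Serve from the local cache if this slice was filtered once already;
        # otherwise hit the (hypothetical) API, which filters by symbol and
        # channel server-side and returns compressed newline-delimited JSON.
        path = os.path.join(CACHE_DIR, f"{exchange}_{date}_{symbol}_{channel}.jsonl")
        if not os.path.exists(path):
            url = (f"https://api.example.com/v1/{exchange}/{date}"
                   f"?symbol={symbol}&channel={channel}")
            os.makedirs(CACHE_DIR, exist_ok=True)
            with urllib.request.urlopen(url) as resp:
                raw = resp.read()
            with open(path, "wb") as out:
                out.write(gzip.decompress(raw))  # assumption: gzip, not zstd/brotli
        return path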
Indeed, the pricing is meant to be very affordable, as it's targeted at independent algo traders: without spending huge amounts of $$$ they can have good data to backtest on at a professional level (if you can call crypto trading professional, which some dispute). Happy to provide you with the discount; please get in touch with me via email if interested.
It is well known that exchanges like BitMEX suffer from particular latency issues. People pay very well to have those studied and mitigated.
Many approaches are possible. But running on a single machine, let alone a single VM, is very risky.
I worked for a company that offered multiple timestamps + hardware timestamping for accuracy.
A VM will introduce too much latency. Even very well configured, your jitter will exceed 100 ns by several orders of magnitude.
So be very careful about considering your extra timestamp as authoritative.
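That claim is easy to sanity-check yourself: read a high-resolution clock in a tight loop and look at the tail of the consecutive deltas. On a busy VM the p99.9 is typically microseconds to milliseconds, nowhere near 100 ns:

    import time

    # Consecutive clock reads; the gaps between them bound the timestamping
    # jitter you can hope for on this host.
    prev = time.perf_counter_ns()
    deltas = []
    for _ in range(100_000):
        now = time.perf_counter_ns()
        deltas.append(now - prev)
        prev = now
    deltas.sort()
    print("median:", deltas[len(deltas) // 2], "ns")
    print("p99.9: ", deltas[int(len(deltas) * 0.999)], "ns")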
You are spot on about the other challenge: the database. The person who focused on that achieved impressive results using a special hardware + software mix.
I appreciate you making this accessible to indie algo traders; it's definitely what we need. Will keep an eye on this.
Some hopefully not too harsh feedback:
You're capturing data in London. I know nothing about crypto markets, but they probably aren't all colocated in London, and your users won't be either. You should try to collect data at each source, synchronize it well, and let users adjust timings to suit their needs.
Data integrity is critical. You have incidentReports in your API, but I didn't see what goes in there. Ideally, make this machine-readable (begin/end timestamps for each incident interval), or flag to the user whether data is good or bad as they stream it.
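Something as simple as this (the shape is entirely hypothetical) would already let a backtest drop or flag affected intervals automatically:

    # Hypothetical machine-readable incident record: begin/end delimit the
    # interval of suspect data so a backtest can exclude it without a human
    # reading prose reports.
    incident = {
        "exchange": "bitmex",
        "symbols": ["XBTUSD"],  # or empty for "all symbols"
        "begin": "2019-06-04T09:12:31Z",
        "end": "2019-06-04T09:14:05Z",
        "kind": "data_gap",  # e.g. data_gap | degraded | clock_skew
        "details": "websocket reconnect; messages possibly missed",
    }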
To make this more useful as a product, consider building a normalization layer on top of what you have here. It's great that you provide the actual exchange messages for those who need them, but researchers often want to answer questions like "which market has the tightest average bid-ask spread over the past month?" without learning details of a dozen APIs and writing boilerplate code for each.
I'd suggest providing the user with a standardized object representing the limit order book for a market and ticker. Clients would subscribe to it and receive generic events like snapshot, order/price level added/deleted, trade, etc. As the data is being streamed, they could also access the current state of the book at each point in time through this object to get information like the best prices, size and number of orders at each price, spread, etc.
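A rough sketch of what such a normalized book object could look like (event names and fields are my invention, not the product's API):

    from dataclasses import dataclass, field

    @dataclass
    class OrderBook:
        # Normalized L2 state: price -> size per side, decoupled from any
        # single exchange's wire format.
        bids: dict = field(default_factory=dict)
        asks: dict = field(default_factory=dict)

        def apply(self, event: dict) -> None:
            if event["type"] == "snapshot":
                self.bids = dict(event["bids"])
                self.asks = dict(event["asks"])
                return
            side = self.bids if event["side"] == "bid" else self.asks
            if event["type"] == "set_level":     # level added or updated
                side[event["price"]] = event["size"]
            elif event["type"] == "delete_level":
                side.pop(event["price"], None)
            # "trade" events don't change resting state in this simple model

        def best_bid(self):
            return max(self.bids) if self.bids else None

        def best_ask(self):
            return min(self.asks) if self.asks else None

        def spread(self):
            bb, ba = self.best_bid(), self.best_ask()
            return None if bb is None or ba is None else ba - bb

Replaying events through apply() while sampling spread() at each step would answer the tightest-average-spread question in a few lines per exchange, instead of per-exchange boilerplate.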
Could you give a scenario where order book data at this granularity might come in handy, as opposed to, say, a single measure of liquidity (however that would be defined)? Thanks
Edit: Do you use FIX API for the exchanges that provide it?
But dedicated shops use custom-made formats, with custom-made software, often running on custom hardware.
If you want FIX, you may also need a low-latency feed and at least a custom client to which you just add your algorithms.
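For reference, the FIX wire format itself is trivial: tag=value pairs separated by SOH (\x01). The field values below are made up:

    # A (made-up) FIX market data snapshot: 35=W is the message type,
    # 55=symbol, 270=price, 271=size. Real messages repeat tags 270/271
    # once per book level, so you'd keep a list of pairs, not a dict.
    raw = b"8=FIX.4.4\x0135=W\x0155=XBTUSD\x01270=8000.5\x01271=120\x01"
    fields = dict(f.split(b"=", 1) for f in raw.strip(b"\x01").split(b"\x01"))
    print(fields[b"55"], fields[b"270"], fields[b"271"])

The parsing is never the hard part; it's the session layer, recovery, and latency engineering that the custom clients above exist for.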