Announcing Timescale Cloud (timescale.com)
89 points by kermatt on June 18, 2019 | 38 comments


I've been ingesting stock market trading data into MongoDB and using Metabase[1] to visualize it. It is essentially date, price, volume, ticker symbol, exchange. Around 400M documents so far.

Queries out of Metabase take upwards of 2-5 minutes to run, even for simple questions like:

   Plot the average price of Apple for the last 5 days grouping by minute.
Would Timescale Cloud be a better replacement in terms of performance? Is there a nice GUI visualization platform like Metabase for it?

[1] https://metabase.com/


We did benchmarks a while back comparing TimescaleDB to Mongo[1], and TimescaleDB was quite a bit better. So I think you'd definitely see much improved query time.

Also it appears that Metabase supports a PostgreSQL connection[2], so you could probably continue to use it.

[1] https://blog.timescale.com/how-to-store-time-series-data-mon...

[2] https://metabase.com/docs/latest/operations-guide/start.html...


MongoDB is not well suited for OLAP-style workloads - have you considered Yandex ClickHouse?


Hey, I was wondering how you ingest stock market data. What source do you use, and do you use a connector, or did you write one yourself?


I use Alpaca[1], specifically the Node.js `websocket.onStockTrades` method and a bit of custom JavaScript code.

[1] https://github.com/alpacahq/alpaca-trade-api-js


Very cool! I haven't been able to find a more real-time data source yet. Thanks for sharing!


Yeah - TimescaleDB comes with a time_bucket function that lets you group by minute, and you can add a WHERE clause that restricts the query to just the last 5 days (see the sketch below). You can build indexes that include the ticker, and also reorder data on disk to reduce how much of it you have to scan. So, TL;DR - you should definitely try it! I did some quick googling, and it looks like Metabase supports PostgreSQL, so it should work with Timescale. We would love to hear how it goes!
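Roughly something like this, assuming a hypertable called trades with time, symbol, and price columns (adjust to your own schema):

    -- average price of AAPL per minute over the last 5 days
    SELECT time_bucket('1 minute', time) AS minute,
           avg(price) AS avg_price
      FROM trades
     WHERE symbol = 'AAPL'
       AND time > now() - interval '5 days'
     GROUP BY minute
     ORDER BY minute;

    -- an index leading with the ticker helps queries that filter on it
    CREATE INDEX ON trades (symbol, time DESC);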


"Powered by Aiven" + nearly identical interface, so this is kind of a reselling arrangement? Aiven supports timescale on postgres, so what additional features does Timescale Cloud provide?


Yes, this is a partnership with Aiven.

Aiven Postgres supports the open-source version of TimescaleDB.

But Timescale Cloud is the only hosted service where you can get the full TimescaleDB experience, which includes our community and enterprise features - e.g. interpolation, data retention policies, continuous aggregates, and data reordering (a couple of these are sketched at the end of this comment). [1]

It also includes special machine types and plans, co-developed with Aiven, that are better suited for time-series data.

It is also the only hosted service directly staffed by the TimescaleDB development team.

[1] https://www.timescale.com/products
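
For a flavor of what a couple of those features look like in SQL (using a made-up trades table; syntax as of the current 1.x releases, so it may change in the future):

    -- continuous aggregate: maintain a 1-minute rollup of a raw trades hypertable
    CREATE VIEW trades_1min
      WITH (timescaledb.continuous) AS
    SELECT time_bucket('1 minute', time) AS bucket,
           symbol,
           avg(price) AS avg_price
      FROM trades
     GROUP BY bucket, symbol;

    -- data retention: drop raw chunks older than 90 days
    SELECT drop_chunks(interval '90 days', 'trades');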


One main difference in these sorts of arrangements is vertically-integrated support.

If your Aiven database is having Timescale architecture problems, your support contact is someone working for Aiven who would need to turn around and reach out to Timescale about the bug (or suggest that you do so.)

If your Timescale Cloud database is having Timescale architecture problems, your support contact is someone working at Timescale who can just call over the guy who wrote the code with the bug in it.

(On the other hand, if your problem is with the Aiven backing cluster, it'll presumably take slightly longer for Timescale Cloud to resolve, given that they'd have to bounce that request over to Aiven.)


In this case we (Timescale) are your main line of support. Our job is for you to have a 100% positive experience on Timescale Cloud. The buck, as it were, stops with us.


Is there any autoscaling or pay-for-what-you-use pricing? It's not 100% clear, but it looks like you essentially choose the instance type you want when using Timescale Cloud.

I understand why you might not want to call that out specifically in these promotional materials, but it's an important consideration when choosing which managed DB to use and when evaluating cost.

What specifically does this mean in practice - "Grow, shrink and migrate your workloads between configurations and plans with ease."


It's more like pay-for-what-you-use. You can check out the pricing calculator for more detail: https://www.timescale.com/cloud-pricing

Growing, shrinking, and migrating involve moving to a different instance type, so you do have to select one. That being said, there is very, very little downtime (on the order of 3-5 seconds while the DNS resolves).


Thanks for the clarifications, that's helpful.

I wouldn't call it pay-for-what-you-use unless the pricing varies with your actual usage instead of changing when you change plans.


Interesting point of view - it's certainly always a bit hard to find the right verbiage that everyone can understand, but hopefully this discussion clarified things!


Last time I used a traditional hosting provider, I could get a new bare metal server set up in under half an hour. I would hardly call them "pay what you use", even though I could start and stop servers and change the plan I'm on and still be only two to three times slower than doing the same on AWS.


Certainly - I've been seeing a bunch of usage-based pricing models that price on a different metric (like metrics per second), etc.

Regardless, with Timescale Cloud, if you get a machine, you pay the price for that machine for as long as you use it. So I guess to avoid confusion, we can call this just paying for the machine :)


By the way, I've recently started using TimescaleDB (past month or two) for processing cryptocurrency trading information and I'm liking it a lot so far. I love that I can use Postgres as normal, but have efficient time-based queries.

My first ever test query was to generate minutely OHLC+volume from time,price,quantity trades. It was pleasantly easy to do:

    select time_bucket('1 minutes', time) as minutely, 
           max(price) as high,
           min(price) as low,
           first(price, time) as open,
           last(price, time) as close,
           sum(quantity) as volume
      from trades
    group by minutely
    order by minutely;
https://gist.github.com/danielytics/e9b69933586e00732646e016...
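
(For anyone unfamiliar, first and last there are TimescaleDB aggregate functions that return the value at the earliest/latest time in each group.) The only Timescale-specific setup beforehand was turning the table into a hypertable, something like:

    -- plain Postgres table for raw trades
    CREATE TABLE trades (
        time     TIMESTAMPTZ NOT NULL,
        price    NUMERIC     NOT NULL,
        quantity NUMERIC     NOT NULL
    );

    -- convert it into a hypertable partitioned on the time column
    SELECT create_hypertable('trades', 'time');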


Plus, growing, shrinking, and migrating only require a few clicks.


How do you do live migrations? Do you shard, then suspend existing queries, and finally redirect?


We spin up a separate instance that matches the type you want to migrate to, restore a backup, and stream the WAL to catch it up. Then we redirect.


The good ol' event sourcing trick :D


This is cool - is there a guide to migrate from InfluxDB? Mostly interested in the cost savings.


We haven't done a formal price comparison, since it's a bit hard to compare apples to apples when the two databases are architected so differently. Definitely something we should consider doing! Thanks for the idea.

Migration wise, I would use Outflux for batch migration and Telegraf to support a live migration (https://docs.timescale.com/v1.3/tutorials/outflux) and (https://docs.timescale.com/v1.3/tutorials/telegraf-output-pl...).


We have a tool to migrate from Influx. https://docs.timescale.com/v1.3/tutorials/outflux


It seems disingenuous to call it the first. AWS has its own time-series DB. In terms of open source, Apache Druid has a managed cloud variant that imply.io runs.


It's the first multi-cloud one.

AWS's offering is obviously AWS-only.

Based on the Imply Cloud website, it looks like it too is just AWS.

Timescale Cloud gives you freedom to choose your own cloud provider, and lets you migrate between clouds with a few clicks.

I’m fairly certain we are the first to offer that, but I’m open to any counter examples.


Q: Regarding multi-cloud - say AWS has an outage, will Timescale Cloud fall back to GCP or Azure?

Can something like this be provided? Not sure if the network latency between different cloud providers would allow a multi-master replication scheme.


You select the public cloud vendor you want your machine spun up on. So no, if AWS has a full outage, it won't fall back to a different cloud. Failover is done at an availability zone level.

Since TimescaleDB is also open source, if you want that kind of replication scheme, you can always install it on VMs across clouds. However, as you rightly pointed out, network latency is a definite concern and affects what RPO and RTO are achievable.


One thing to add:

One thing Timescale Cloud does allow you to do is create asynchronous read replicas across different clouds and regions (with a couple clicks).

You can then "fork" a read replica (at any point in time) and make it a primary to start serving out of that cloud (again, with just a couple clicks).

That's not quite the same as automatic replication/failover between clouds, but it gets you pretty far there.


This can be done with an async-replication DB like ScyllaDB, or even a CP DB (just harder/slower).


Yeah! CockroachDB is also a really cool multi-cloud DB. That being said, it's really more for transactional workloads, and less purpose-built for time-series.

I guess there are always trade-offs in the software world.


Also, Amazon Timestream isn't usable yet; it's still in private preview.


How does this compare to Druid/imply.io?


I think the quickest comparison is SQL vs. NoSQL. We haven't done performance benchmarks against Druid yet, but we do know of several users who have switched because they want to use PostgreSQL instead.


Just announced today, here's the blog post: https://blog.timescale.com/timescale-cloud-first-fully-manag...


That gives more background info, so we've switched to it from https://www.timescale.com/cloud. Thanks!


Why would I pay to use this open source software? It doesn't make sense to me.



