Building a scalable time-series database on PostgreSQL (timescale.com)
305 points by RobAtticus on Apr 4, 2017 | 133 comments

While I appreciate PostgreSQL every day, am I the only one who thinks this is a rather bad idea? The row-based engine of PG is the antithesis of efficient storage and retrieval of time series with similar patterns, yielding almost no compression. Columnar storage should naturally be much better at that (BigQuery, Redshift, Citus), culminating in purpose-built stores like Influx, Prometheus or KDB. Prometheus for example manages to compress an average 64-bit FP sample including metadata to just 1.3-3.3 bytes depending on engine. As most DB stuff is I/O bound, that usually translates into at least an order of magnitude faster lookups.
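To make that compression intuition concrete, much of the win for metrics comes from delta-of-delta encoding of timestamps, the idea popularized by the Gorilla paper and used in Prometheus-style engines. A deliberately simplified Python sketch (nothing like the actual bit-level encoder, and assuming at least two samples):

```python
# Toy delta-of-delta encoding of timestamps -- a simplified sketch of the
# idea behind Gorilla-style compression, NOT the real variable-bit encoder.
def delta_of_delta(timestamps):
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return [timestamps[0], deltas[0]] + [
        b - a for a, b in zip(deltas, deltas[1:])
    ]

# Samples scraped every 10s: after two seed values, every entry is 0,
# which a bit-level encoder can store in roughly one bit each.
print(delta_of_delta([1000, 1010, 1020, 1030, 1040]))  # [1000, 10, 0, 0, 0]
```

Regularly spaced scrapes collapse to runs of zeros, which is why a 64-bit sample plus metadata can average a couple of bytes.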

Not all time series data is metrics (which is what Influx & Prometheus are adapted for). Any kind of audit log -- to support a customer service console, for example -- is going to be a time series of fairly complex, multi-field records. Orders (and order modifications), support requests, payouts and transfers... Most of business activity falls under the aegis of time series, nominally immutable, even when it's not numbers. Working with this kind of data is definitely helped by a comprehensive SQL dialect.

To take a contrary position: whose infrastructure is so large that a specialized SQL and storage engine (like Influx) is necessary? Actually not that many companies... so why does infrastructure always end up with these special snowflake versions of things?

Soo much this. I just worked on a project that sacrificed reliability, maintainability, and scalability to use a real-time database to deal with loads on the order of 70 values or 7 writes a second.

Granted this was an especially bad case because it was the shittiest real time database ever built.

Don't think we're going to disagree with this!

And because Timescale is an extension in Postgres, it can actually sit inside (and coexist with) your standard Postgres database if you want. Or, it can be deployed as a separate instance that can transparently JOIN data against your main DB via an FDW.

Companies don't dream of staying small, and they want solutions that scale with them. If they do end up growing, planning ahead with a purpose-built solution saves the time and resources otherwise spent paying down technical debt.

And many companies stay small or disappear because they wasted resources on problems they didn't have.

It's a risk either way.

But an equal risk? For most companies it isn't. There are more failed big dreamers than companies that missed the big time because they couldn't scale.

It's a good question, and I don't know where the quantitative data exists to answer it.

My personal preference is to hire experienced teammates who have solved similar problems in the past, because at most levels of scale (other than, say, Google's or Facebook's), few problems are truly novel. So it's reasonably possible to predict where the business will be WRT technological requirements in, say, 1 to 3 years, sometimes 5 if you're just getting started, and plan accordingly.

In the case of this particular problem (time series data), there are some pretty easy and scalable answers out there -- Datadog, SignalFx, Wavefront, Prometheus, etc. to start with. So it's not like you have to waste a bunch of opportunity cost going out and building a solution.

With reference to infrastructure in particular -- not all aspects of the business -- even very large companies rarely need specialized storage engines for logs or metrics. A thousand servers don't generate enough data for it to matter.

We've seen something of a convergence here with ELK -- people push their syslog into the same system they use to provide general text processing for the application, instead of having a separate/weird storage engine for infrastructure logs.

We do acknowledge that storage is not our strong suit, but we think the benefits of having SQL and being able to integrate easily with non-time-series data is a big win itself. Certainly if storage is an issue for you, TimescaleDB is probably not the right choice. But if it isn't, and you are doing more advanced queries (including JOINs or across multiple metrics), TimescaleDB provides a compelling solution in our view.

It's not just the storage itself (which is really cheap nowadays), but more of the fact that every byte read and written needs to go through the processor, pollutes RAM and poisons caches. Less storage usually translates into direct performance gains as well.

Also, if you find yourself JOINing timeseries on more attributes than the timeline itself, you should question whether you really have a solid use case for a timeseries DB.

That being said, always good to see competition in the market, especially if it's built on such a rock solid product and community.

I really like that Postgres is a "good enough" solution for almost every use case by now beyond relational data, be it document storage, full-text search, messaging -- or time series now. Nothing wrong with having less stack to worry about, especially for prototypes and small-scale projects!

I think it's pretty clear Postgres is getting to a situation where the storage format becomes the limiting factor for a lot of workloads, particularly various types of analytics. Time series are an example of yet another type of data that would benefit from a different type of storage. There were already some early patches to allow custom storage formats, so chances are we might see something in one of the next Postgres versions (not in 10, though).

> It's not just the storage itself (which is really cheap nowadays), but more of the fact that every byte read and written needs to go through the processor, pollutes RAM and poisons caches. Less storage usually translates into direct performance gains as well.

FWIW, I think that's currently a good chunk away from being a significant bottleneck in postgres' query processing. But we're chipping away at the bigger ones, so we'll probably get there at some point.

One more thing to add: Postgres index usage via index-only scans goes a long way toward mitigating the performance issues of wide rows (although, admittedly, not the disk-space issues). This allows good performance on columnar rollups.

You are mentioning implementation-specific issues that may have more than one solution.

If one goes to the heavy contenders in this space, e.g. Teradata, you may expect:

- DMA for data retrieval

- A suitable and linearly scalable network layer with Remote DMA

- Row, columnar and hybrid storage options

- Utilization of CPU vector options

- Etc.

The analytical database has become a commodity. I really like Postgres, but I would still do a very careful analysis of my business needs if I were to choose a DBMS when there is such a strong range of commercial options available.

All good points. And especially agree about your last paragraph -- another benefit I didn't highlight is anyone familiar with Postgres does not have to learn a new part of the stack.

I totally agree with all the points here.

Maybe Postgres needs pluggable storage engines [0].

I read Postgres 10 might have this, but looks like it will miss the deadline.

[0] https://www.pgcon.org/2016/schedule/events/920.en.html

Good time series performance is more than just using column-based storage. You also need a query language that can take advantage of this and the ordering guarantees it gives you. SQL, while it has tried to reinvent itself, is a very poor language for querying TS databases.

From personal experience, not sure I'd agree with that statement. SQL may be limiting for some time-series use cases, but for others it's quite rich and powerful. I won't pretend that SQL solves everyone's time-series problems, but we've found that it goes pretty far.

That said, we may have to get a little creative to support some specific use cases (e.g., bitemporal modeling). Still TBD.

Also, I agree that SQL isn't for everyone (the popularity of PromQL is evidence of that). But a lot of people have been using SQL for a while (personally, since 1999), and there is a rich ecosystem (clients, tools, visualization tools, etc.) built around it.

It most definitely is... LEAD and LAG are about all you get, and they are painfully slow. SQL was made to be order-agnostic, and attempts to make it more order-aware don't quite work. A good time series database is built on table order and lets you exploit it. SQL is abysmal for any time series work.

And temporal and bitemporal databases (despite the name) are orthogonal to the aggregation and windowing issues that make time series difficult in SQL or a row-oriented database. They are just a couple of timestamps and WHERE clauses to support point-in-time queries.

Maybe this is why so many time series databases fail. The people making them often don't seem to fundamentally understand the issues. Very few, such as Kx with KDB, seem to understand them.
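For reference, the LEAD/LAG constructs mentioned above look like this. A minimal sketch using SQLite from the Python stdlib (window functions require SQLite 3.25+; table and column names are invented):

```python
import sqlite3  # stdlib; window functions need SQLite >= 3.25

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (ts INTEGER, value REAL)")
conn.executemany("INSERT INTO metrics VALUES (?, ?)",
                 [(1, 10.0), (2, 12.5), (3, 11.0)])

# Per-sample delta via LAG over the time ordering -- one of the few
# order-aware constructs standard SQL provides.
rows = conn.execute("""
    SELECT ts, value - LAG(value) OVER (ORDER BY ts) AS delta
    FROM metrics ORDER BY ts
""").fetchall()
print(rows)  # [(1, None), (2, 2.5), (3, -1.5)]
```

Whether this is "enough" for time series work is exactly the disagreement in this thread: the ordering here is computed per query, not a property the storage engine can exploit.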

Pluggable storage will definitely miss PostgreSQL 10 since the feature freeze is later this week.

And there's not even a credible proof-of-concept patch.

AFAIK, cstore_fdw (their columnar storage implementation) is not widely adopted among Citus users because it's not real-time. They also rely on memory to be able to query the data efficiently. I believe that the tricky part for time-series databases is not the underlying storage format: if you store the data in a way that allows you to query a specific time period efficiently, that's usually enough for most cases.

If the data for that specific time period is not small, then you will hit some I/O problems. Creating the appropriate indexes may be efficient at first, but then you may need to switch to a columnar storage format if you really need it. If Timescale uses immutable shards that are basically PostgreSQL tables, you can easily re-write the historical shards periodically using a columnar storage format rather than the default row-oriented table format.

That's usually how most of the time-series databases work. The hot data is in row-oriented format and the system uses buffering to write the efficient columnar tables periodically under the hood.
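The hot-rows / cold-columns rotation described above can be sketched roughly like this (a toy illustration with invented names, not any particular database's internals):

```python
from collections import defaultdict

class HybridStore:
    """Toy sketch of the pattern above: recent writes land in a
    row-oriented buffer; once a time window closes, its rows are
    rewritten into column arrays."""
    def __init__(self):
        self.hot = []    # list of (ts, field dict) rows
        self.cold = {}   # window start ts -> dict of column arrays

    def write(self, ts, row):
        self.hot.append((ts, row))

    def rotate(self, before):
        """Move rows older than `before` into columnar form."""
        keep, old = [], []
        for ts, row in self.hot:
            (old if ts < before else keep).append((ts, row))
        self.hot = keep
        if not old:
            return
        cols = defaultdict(list)
        for ts, row in old:
            cols["ts"].append(ts)
            for k, v in row.items():
                cols[k].append(v)
        self.cold[cols["ts"][0]] = dict(cols)

store = HybridStore()
store.write(1, {"cpu": 0.5})
store.write(2, {"cpu": 0.7})
store.write(100, {"cpu": 0.2})
store.rotate(before=10)
print(store.cold)  # {1: {'ts': [1, 2], 'cpu': [0.5, 0.7]}}
print(store.hot)   # [(100, {'cpu': 0.2})]
```

The closed windows are then good candidates for the lightweight columnar compression schemes discussed elsewhere in the thread, while the hot buffer keeps ingest fast.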

I think these are good points about cstore_fdw and real-time analysis (although we don't have personal experience with this).

The usual thing that prevents indexes from scaling with large data is that inserts slow down as soon as the index BTrees can't fit into memory. TimescaleDB gets around this problem by auto-sizing of tables: we start new tables as old tables (and their indexes) grow so large that they can no longer fit in memory. This allows us to have tables that are not so big that inserts become slow but big enough that queries don't have to touch many tables (and are thus efficient).

However, as data sizes grow, you may want to convert data to column stores to save disk space, as you allude to. We are looking at how best to do this and the best "archive" format to use. Stay tuned.
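The auto-sizing idea described above can be sketched as a toy chunk router (thresholds and structure invented for illustration; the real system sizes chunks so tables and their indexes stay memory-resident):

```python
class ChunkRouter:
    """Toy version of auto-sized tables: inserts go to the current
    chunk until it reaches a size budget, then a fresh chunk (a new
    table with its own small index) is opened."""
    def __init__(self, max_rows=2):
        self.max_rows = max_rows
        self.chunks = [[]]

    def insert(self, row):
        if len(self.chunks[-1]) >= self.max_rows:
            self.chunks.append([])   # "CREATE TABLE chunk_N ..."
        self.chunks[-1].append(row)

r = ChunkRouter(max_rows=2)
for ts in range(5):
    r.insert((ts, "sample"))
print([len(c) for c in r.chunks])  # [2, 2, 1]
```

Because inserts only ever touch the newest chunk, its B-tree stays small and hot in cache, which is the property that keeps insert rates flat as total data grows.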

(Ozgun from Citus Data)

Your observations on cstore_fdw are accurate. I'm adding a few points about our learnings from cstore deployments in production.

From a use-case standpoint, we find that most cstore deployments use it for data archival and to a lesser extent data warehousing workloads on Postgres. Some of these deployments are fairly sizable (Fortune 1000) and they primarily benefit from compression that comes with cstore_fdw (3-10x).

Citus + cstore_fdw deployments exist, but they are fewer in number. This isn't because of an architectural issue related to cstore or Postgres FDW/extension APIs. Rather, making cstore work for real-time workloads requires notable implementation work. In particular:

(1) Real-time ingest & update / delete capabilities: Columnar stores have a harder time providing these features. The most practical way to achieve this is by partitioning your data on time, using a row store for the most recent data, and then rotating that data into a columnar store.

Users who use cstore for real-time workloads manually set up table rotation themselves. In fact, this type of integration between Citus (distributed db) and cstore_fdw (columnar storage) has been one of the most requested features: https://github.com/citusdata/citus/issues/183

(2) High-availability & disaster recovery: This requires that cstore integrates with PG's write-ahead logs. Fortunately, Postgres 9.6 makes this possible / pluggable and it's on cstore_fdw's immediate roadmap.

(3) Indexes for quick look-ups: cstore_fdw comes with built-in min/max indexes. This helps but doesn't provide quick look-ups across any field. We thought about integration with PG indexes -- one challenge there involves PG indexes' storage footprint and potential hit on compression ratios.

(4) Query planning and execution tuned for columnar stores: If your workload is bottlenecked on CPU (rather than disk), these types of improvements can help a lot. For now, we're happy to see the recent and upcoming analytical query performance improvements in Postgres 10 and beyond.

Also, putting cstore_fdw aside, I'm a huge fan of Timescale and the work they are doing! My favorite "feature" is that Timescale is a PostgreSQL extension. :)

So I don't know about the data layout here (and definitely agree with you re: column stores), but if you look at stuff like cstore_fdw[0], I think the data layout is also somewhat configurable under the hood. So "based on postgresql" doesn't necessarily mean "uses the same storage" - it could have its own columnar storage with the postgres query planning, connection management and other infrastructure on top, perhaps?

[0] https://github.com/citusdata/cstore_fdw

Currently we are using the default storage mechanism in Postgres, but have had discussions on alternatives and other ways to compress our storage footprint.

Have you tested running it on a FreeBSD box with ZFS? It has lz4 compression by default and makes such a great storage solution for PG. You get compression, snapshots, replication (not quite realtime but close), self-healing, etc. in a battle-hardened and easy-to-manage filesystem and storage manager. I've found you can't beat ZFS and PG for most applications. Edge cases exist, of course, everywhere.

Compression works well with column stores for different reasons - since all values in a column are of the same data type, there are way more opportunities for very lightweight and highly effective compression, like run-length encoding (a,a,a,a,a,a -> 6a), which helps a lot with sparse or duplicate columns; range compression (1,2,3,4,5 -> 1-5); dictionary compression (hippopotamus,potato,potato,hippopotamus -> 1,2,2,1 + a dictionary like {1:hippo,2:potato}); value scaling (203,207,202,201 -> 3,7,2,1 + base:200); bit arrays for booleans and bit vectors for low-cardinality columns; etc.

This saves on space but also often improves query latency, provided your execution engine supports queries on compressed columns - less data means more of it fits into cache, meaning fewer cache misses (i.e. expensive disk reads).

edit: So what I mean is that compressing entire files or disk blocks with LZ might not take advantage of the same properties and might not get you the same deal - here, often the in-memory data layout is also compressed, and queries execute efficiently on compressed data, whereas with a blindly compressed filesystem you'd ostensibly have to always decompress first.
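For concreteness, here are toy versions of three of the encodings mentioned above (simplified sketches, not how any particular column store implements them):

```python
def rle(col):
    """Run-length encode: a,a,a,b -> [(a, 3), (b, 1)]."""
    out = []
    for v in col:
        if out and out[-1][0] == v:
            out[-1] = (v, out[-1][1] + 1)
        else:
            out.append((v, 1))
    return out

def dict_encode(col):
    """Dictionary encode: repeated strings become small integer codes."""
    mapping = {}
    codes = [mapping.setdefault(v, len(mapping)) for v in col]
    return codes, {i: v for v, i in mapping.items()}

def frame_of_reference(col):
    """Value scaling: store small offsets from a common base."""
    base = min(col)
    return base, [v - base for v in col]

print(rle(["a", "a", "a", "b"]))                  # [('a', 3), ('b', 1)]
print(dict_encode(["hippo", "potato", "potato"]))  # ([0, 1, 1], {0: 'hippo', 1: 'potato'})
print(frame_of_reference([203, 207, 202, 201]))    # (201, [2, 6, 1, 0])
```

Each works precisely because a column holds values of one type with similar distributions, which is the property a row-oriented page can't offer.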

True. Actually, this execution efficiency is usually a far more important benefit of column stores than just saving disk space.

Sadly, PostgreSQL does not support that now, but hopefully it's not that far in the future. It's becoming a fairly regular topic in discussions with both customers and devs at conferences.

Thanks for the great suggestions.

We've certainly discussed running on ZFS internally, but haven't evaluated it yet. We're a bit uncomfortable dictating file system requirements to users, so we're looking to ultimately provide many of these things in the database instead.

Would welcome any testing/benchmarks, though! :)

Reminds me of IBM Informix TimeSeries which has been around for a while. This was IBM's attempt to bridge the gap between the relational model and the columnar format that is optimal for time series, except that they store the time:value arrays in blobs. Aside from compression benefits the columnar storage works really well for irregular time series. If the incoming data is irregularly spaced, row-based schema will have a certain degree of inefficiency due to the need to store NULLs each time one of the columns is missing a valid value.

That point about irregularly spaced (sparse) data is a very insightful observation. I'd just add that a user can to some extent address that through normalization, i.e., splitting incoming data across multiple TimescaleDB (hyper)tables, like in any SQL database. However, there are clear trade-offs here. The upside is that users can balance these trade-offs themselves.

Congrats on the release!

I'm in the process of migrating from InfluxDB to a combination of Kafka and TimescaleDB.

Has anyone beaten me to the punch with testing whether it works with [PipelineDB](https://www.pipelinedb.com/)? I can see that being a powerful combination, perhaps also tied to their forthcoming [Stride](https://www.stride.io/) analytics API.


Just so you're aware, upgrading between versions of TimescaleDB at this point is not fully supported. Meaning, we are still breaking some non-trivial things between releases and don't have upgrade scripts. I don't want to discourage your usage, but I'm letting you know about the potential pitfalls.

We hope to start supporting smoother upgrade transitions in the near future.

Out of interest, why are you moving away from InfluxDB? We're looking at moving to it....

We can't pull stats we need.

We tried: so many continuous query rollups... They eat into the ability to write data. At only 1000 writes/sec (artificially tripled due to splitting into multiple measurements to get around the cardinality problem) it was dropping writes, even on a dedicated 24-core, 500 GB RAM, SSD machine.

It can't scale when it's also being queried against.

Basically: their claimed performance looks great, but real-world use doesn't match up. They've pushed fixing cardinality back two quarters.

No reply on my github issue with broken top()

Compression is good... is about all I can say lol

Didn't want to make the title too long, but I should note this is an open-source database. I'm here if anyone has any questions.

Also, github if you just want to see the code: https://github.com/timescale/timescaledb/

What do you plan on doing for benchmarking? I don't expect you to get something like STAC done, but will you try to find general benchmarks that others use?

I've been working with TS databases for a long time now, and it never fails that every database vendor has benchmarks showing they are the best (to put that more bluntly: when you come out with your own benchmark suite and you are the fastest/smallest/bestest, I won't be surprised, or believe you).

I don't expect you to be the fastest when having a row-oriented architecture, and it would be an unfair comparison against the non-free databases, but I would like realistic numbers.

Actually, if you came out 2nd or 3rd against competitors' workloads, I would be far more impressed.

(But really? A row-oriented TS database?)

Influx has a framework[1] we are looking at utilizing, while adding other queries that they do not support (e.g., more complex predicates, non-time-based ordering, JOINs). We hope this will give a more accurate set of benchmarks for time-series workloads.

Definitely agree that it's sometimes hard to take numbers from a company at face value. As you said, we don't necessarily expect to show that we are the best at all types of queries (and will include query patterns where we may not be the best), but for those who need more SQL features and richer queries we expect Timescale to perform well. Additionally, working with people's existing stack and database is another key win.

[1] https://github.com/influxdata/influxdb-comparisons

That benchmark looks like it's made to hide inefficiencies, if I understand it correctly. Like, why is JSON serialization time not included? If you made the interface JSON over HTTP, you should have to live with the performance implications.

I notice they only compare against free versions. I wonder if they have tried against any of the commercial vendors? The commercial vendors right now tend to lead the FOSS pack by a very large gap.

We've been using this framework to avoid the appearance of crafting our own that cherrypicks only our strong suits. Don't disagree with comments that you should account for JSON serialization cost in your benchmarks, but that's a criticism of the suite, not Postgres/Timescale/our numbers.

To be clear: our tests are all using the standard PostgreSQL interface, not JSON-over-HTTP. And we don't exclude any SQL parsing time in the DB.

I tried to read it a little more, but it is incredibly confusing to me. Did you have issues with that too? I have access to a couple of commercial products, but just trying to find the basics to run is difficult. STAC was so much easier to understand.

Given dataset X, how fast does query Y return?

Wow, lots of very critical comments. I for one think this is a very good idea. I have a use case right now that pretty much fits the manifesto of Timescale. Having the power of SQL is very attractive. I'm looking to move my current setup to Timescale. Will let you guys know how it goes.

Hey, Timescale team member here. Thanks for the <3. Looking forward to hearing how it goes for you.

Would also love to learn more about your use case... If you feel inclined to share more, please shoot me an email andrew@timescale.com. Thanks again!


Full SQL interface

Scale-out distributed architecture

Auto partitions across space and time

Single-node and cluster optimizations

Complex query predicates

Distributed aggregations

Native JOINs with relational data

Labels and metrics support

Geo-spatial query support

Fine-grained access control

Reliable (active replication, backups)

Automated data retention policies

At my last company we did tens of billions of sensor events per day into Cassandra, and I thank god our engineering team was smart enough to spend manpower on product instead of writing yet another database.

One thing I'm curious about is what are the differences between an operational historian (like ABB, PI, Honeywell etc) and a time series database like Cassandra or the one in this article?

I work a step upward in the stack (analyzing the sensor data and building models based off of it - usually using SQL) so I'm not an expert on underlying technology.

Over the last 15 years or so my work (industrial plant) has been big on historians. We've moved on from in-house bespoke stuff from the 90's which, as I understood it, was basically a custom-written ring buffer that held 1-2 days of live data and would get downsampled into Oracle tables as 1-min/6-min/hourly averaged data every hour or so.

Prior to this in the 80's we used a database technology called "IMS" but I'm not as familiar with how this system worked.

Nowadays in all our newer systems all I hear from the low level guys is "OPC" (which I gather is a messaging format) and historians. Time series databases sound like they should fit in somewhere but I'm not sure where - given other people use them for logging high frequency sensor data sounds like they could be a solution.

To be clear - Cassandra is great for time series but it's not a "time series database" - it just solves all of the same problems (and then some), because it gives you natural replication, real HA, cross-WAN support, lets you cluster/order data on disk using Bigtable-like sorting, and scales over a thousand nodes per cluster.

For sensor data collection, it's very hard to beat. Lay something like spark or filodb on top for more flexibility and it's virtually unmatched.

How many machines did you use? Not everyone can afford a cluster - and I do not think a single Cassandra node could ingest 150,000 entries per second.

Why didn't you use Scylla, which is 3x to 10x faster than Cassandra?

A single cassandra node can probably do 100k writes/second.

We ran hundreds of machines (per cluster), and multiple clusters.

We don't use Scylla for dozens of reasons, including (but not limited to):

- The product was written before Scylla existed

- Our benchmarks didn't show Scylla as being 3x faster, let alone 10x faster

- Our machine count wasn't based on Cassandra limits, it was based on the amount of data we were comfortable storing on a given machine

- We don't throw shiny new tech into a critical stack just because someone says it's faster. I'd rather run code that's tested at scale. ONE of my Cassandra clusters there would have been 10x larger than any known Scylla deployment. At the same time, that deployment is at least 2 orders of magnitude smaller than other published Cassandra deployments (meaning largest known Cassandra deployments are 3 orders of magnitude larger than the largest known Scylla deployment)

The one thing I agree with the author on: databases aren't where you want to have hipster tech. You want boring things that work. For me, Cassandra is the boring thing that works.

I absolutely agree about hipster tech and databases, but I'm not so sure that Cassandra fits entirely into the "just works" category. Don't get me wrong, it's pretty neat and definitely more mature than stuff like Mongo or whatever other NoSQL-of-the-week, but it's not exactly at the level of PostgreSQL or even InnoDB/MySQL. It works, but it needs quite a bit of care and feeding by comparison.

Really Cassandra? No way does it compare to mature RDBMS's like Postgres, MySQL or SQL server.

So, good for your past company? What happens when your primary product is the database? More options are never a bad thing.

If there's a downside here, it's that this database is still a little too immature to have a great use-case.

More options dilute talent and stifle innovation - people spend time writing new tools instead of advancing state of the art

But by all means, it's not my money being burned, so I'm not losing sleep over it. I'll just sit here and giggle as people talk about scaling Postgres and in my mind all I hear is single point of failure masters and vacuum to prevent wraparound in high write workflows

Yes, but people work on what they choose to and it's not possible to say putting them all on the same team/project would move that forward - or that they would even want to do that.

Does it waste a certain amount of effort in duplicated work and products? Sure. But we also get new projects that may try new approaches, gain different experiences, and train new developers which all helps the market expand. Ultimately it's better to have yet another version than nothing at all.

Sounds like your main concerns are high availability (HA) and scaling, things Cassandra certainly does well. But if you are also interested in data exploration and complex queries, Cassandra might not be the best choice. I think there is certainly a part of the spectrum here that is not well addressed by something like Cassandra. A future clustering version of TimescaleDB will provide horizontal scale-out and solutions for HA, btw.

Also, regarding vacuuming, TimescaleDB does not suffer from it much under the normal workloads it targets because time-series data is largely immutable and deletions typically happen at the chunk level (i.e., dropping complete tables), avoiding any vacuuming during deletes and making wraparound vacuuming more efficient.

A future clustering version of TimescaleDB will reinvent problems already solved by Cassandra, Riak, and similar databases, and the level of effort in doing so will be far greater than the level of effort needed to add data exploration and complex queries on top of cassandra in the first place (that is: clustering is the hard part, and had this team spent time to bolt the analytics logic onto cassandra similar to what FiloDB did, they'd have the best of both worlds already).

Wraparound vacuuming still sucks for high write workloads. I've been there. I've fought that problem in a high-write-throughput-no-delete-immutable-workload. I've seen it in person. You're still writing a lot of frozen txids to disk. Your slaves are still going to get the WAL command in a single-threaded WAL sender and fall behind in replication as that vacuum runs. You're still going to have pain trying to create an HA setup.

Vacuuming has traditionally been a problem with large table sizes. In TimescaleDB we break up the tables so that they are smaller. That, combined with the new freeze map feature in Postgres (since PG 9.6: http://rhaas.blogspot.jp/2016/03/no-more-full-table-vacuums....) make vacuums not an issue for us. Certainly we've never seen this issue (even on default autovacuum settings) and we've tested some huge datasets.

If your main use case for time series is a graphite/statsd replacement, another PostgreSQL-based project worth looking at is tgres.



And Tgres requires no extensions to Postgres, you can use Postgres as-is (I know because I wrote it :) ).

I'd also add the link to the post explaining the principle of how time series is actually stored which makes it capable of receiving 150K+ data points per second on relatively small hardware:


Fundamentally though, what timescale does is different in that they store (as far as I can tell) every incoming data point, which is more of a "big data" problem, whereas Tgres is all about what you can do with Postgres if you make the series regular (i.e. evenly spaced) on the fly and store time series as round-robin archives of fixed resolution and span, so I don't think comparing Tgres and Timescale makes a lot of sense.

Not sure what you're trying to say about "requires no extensions"; Timescale is an extension that doesn't modify Postgres itself, similar to Citus, and very different from XL/XC, GreenPlum, or EnterpriseDB's proprietary offerings.

I simply mean that no CREATE EXTENSION is necessary.

This looks nice. I've had to roll my own PostgreSQL schema and custom functions a few times for timeseries data and if this prevents the need for that I'm impressed.

The author mentioned write performance, but didn't touch much on read performance. Are there any benchmarks you can share in that direction?

Also the author didn't touch much on the space requirements or the node splitting expectations.

Those two areas would be useful to explain in greater detail.

Thanks for the comments. I (not the author, but a co-worker) mentioned in another comment that we are working on more benchmarks that we hope to share soon, particularly in regards to read performance. We've seen some impressive speed-ups for aggregates grouped by time intervals, but we're still a bit early in our tuning on the query optimizations to give out more numbers. Again, hopefully in the next few weeks.

Space requirements can admittedly be a bit higher than some other time-series databases or column stores. Since disk is cheaper than memory, and with easy ways of doing retention, we think it's a worthwhile trade-off for people for now. But compression is something we will look at down the road. For our later benchmarks we will include memory/space comparisons to help people evaluate.

As for node splitting, our clustering solution (not yet released) is being worked on quite a bit as well. When it's closer to being ready we'll include our thoughts on how to best manage your partitions.

Are you considering configurable automated downsampling triggers for inserts over time buckets? Essentially these would be aggregates triggered to fire and insert the resulting values into another table of lower granularity after the time bucket has closed (1s, 10s, 1m, 5m, etc.)

This is a great usage idea and something that should be fairly easy once we support close and update triggers (on our list of todos).

And just to follow-on previous comment: Yes, we plan to offer such continuous aggregations, where data of lower granularity can then just be automatically written into a different hypertable.

This also plays nicely with configurable data retention policies if desired, e.g., only store raw data for X days, but more aggregated data for longer periods of time.
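Until those triggers land, such a rollup can be approximated with a periodic job along these lines (all table and column names here are hypothetical, not part of TimescaleDB):

```sql
-- Periodic rollup from a raw hypertable into a 1-minute-granularity table.
-- Only the most recently closed bucket is aggregated on each run.
INSERT INTO metrics_1m (bucket, device_id, avg_cpu, max_cpu)
SELECT date_trunc('minute', time) AS bucket,
       device_id,
       avg(cpu),
       max(cpu)
FROM metrics_raw
WHERE time >= date_trunc('minute', now()) - interval '1 minute'
  AND time <  date_trunc('minute', now())
GROUP BY 1, 2;
```

A retention policy could then drop raw chunks after a few days while keeping metrics_1m around much longer.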

I will be waiting for you to release your clustering solution. Will it be also included in the open-source version or are you planning to release it as part of your commercial service?

Clustering will be in the open-source version.

Great, thanks!

Most distributed SQL databases fall into 2 uses: large data-warehouse and scalable primary data store.

For DW needs (including time series), MemSQL is still the best option for distributed SQL unless you absolutely need postgres features. Proprietary w/mysql flavor but integrated columnstore + memory rowstore built for performance and fantastic ops management software makes it a great system. Also has interesting kafka and s3 ingest pipelines.

For an operational db, Citus is a great scale-out option for postgres. You can combine it with cstore to get columnar compression for DW use but at that point it's better to just use a more focused product. There's also CockroachDB which looks good (if it ever gets out of beta). I'd also recommend ScyllaDB + JanusGraph once both mature to get a scalable multi-master graph database if HA/reliability/replication are a big concern.

Timescale DB is interesting and it's always nice to see more options, but right now there are better data warehouse options already that work just fine for time series + SQL. A built-in columnstore would go a long way to making it competitive though.

What's the main difference compared to pg_partman? That provides parent and child tables, automated partitioning, and the user only queries the parent tables.


TimescaleDB does auto-sizing of the child tables (chunks) to keep write performance relatively constant while also making retention policy easy to do. That is, we make sure the chunks are aligned to a time unit (e.g. an hour or a day), so removing data means dropping whole tables rather than individual rows.

Additionally, TimescaleDB comes with optimizations on the query side specifically for time-series data.
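The retention point above is the difference between row-wise deletion and dropping whole chunks; in plain Postgres terms it looks roughly like this (the chunk table name is illustrative):

```sql
-- Row-wise retention: leaves dead tuples behind and churns indexes.
DELETE FROM metrics WHERE time < now() - interval '30 days';

-- Chunk-wise retention: each time interval lives in its own child table,
-- so expiring it is a near-instant catalog operation.
DROP TABLE metrics_chunk_2017_03_01;
```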

Ok, autosizing is missing in pg_partman. Retention works the same (just dropping children). I've read the paper, and there it became clearer to me.

So you distribute the child tables to several nodes of a server cluster.

Is network latency a problem? I guess one should colocate the servers in one location rather than spread it out?

How well does it work when nodes die?

Do you use query parallelization (available since 9.6 in vanilla) on a single node and across different nodes?

Yes, we distribute the child tables among the cluster. Our default distribution mode uses "sticky" partitioning, where a partition prefers to stay on the same node. This allows you to control data colocation via the partition key. Our clustered solution is not released yet, but we plan to handle node failures via regular Postgres STONITH mechanisms. Once node failure is detected, the system reconfigures which children to use.

Query parallelization works in single and multi-node cases.

And a clarification just in case this wasn't clear: We use constraint exclusion analysis to determine which of the "child tables" of the parent hypertable to query. So, when performing query parallelization, you aren't sending subqueries to chunks/nodes that will not match your SQL predicate (plus a bunch of other query optimizations we do with pushdown, etc.).

This is also true at write time: You can insert a batch of rows in a single insert, then the query planner will split them into sub-batches, which it then will insert into the appropriate chunk. So, you don't need to "pre-partition" your data before writing to the system (even in the forthcoming distributed version).
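The constraint exclusion mentioned above is a standard Postgres mechanism; a rough sketch of how it applies to time chunks (table names hypothetical):

```sql
-- Each chunk is a child table carrying a CHECK constraint on its time range.
CREATE TABLE metrics_chunk_1 (
  CHECK (time >= '2017-04-01' AND time < '2017-04-02')
) INHERITS (metrics);

-- With constraint exclusion enabled, a time predicate lets the planner
-- skip every chunk whose CHECK range cannot possibly match:
SET constraint_exclusion = partition;
EXPLAIN SELECT * FROM metrics WHERE time >= '2017-04-01 12:00:00';
```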

I've pinged someone on the team who's a bit more familiar with our clustering, so hopefully they can give you a more detailed response soon.

Loved this article, as a competing database company, they did a fantastic job relating to developers and being authentic! Great job, please keep this up, it will definitely make you a winner.

Key quote of interest: "135K writes per second per node" this is very good! Quite impressed.

Has anyone used gnocchi outside of OpenStack?

It can use postgresql as an index store.


Congrats on the launch!

I was investigating the same topic (PG based timeseries database) for a stock tick data project, would definitely give timescaledb a try.

Since financial data is mentioned in the blog, I'd be curious how it performed/scaled in practice.


We're still preparing some benchmark numbers/performance numbers that we will hopefully share in the coming weeks. We do have some write performance numbers in there as you can see. There is also a lot of churn at the moment as we're still in beta and refining some key features, so I don't want to speculate too much on how performance looks until after we get a few more of our query optimizations tuned.

So, having played/worked with a few time-series databases, this just doesn't seem to fit the picture we're used to. For applications having to store a set of well-defined time-series in a more optimal way, it looks great. As a generic time-series database on the other hand, this sounds like a maintenance nightmare.

I quite like PostgreSQL (and deploy it all the time), and I'm no fan of NoSQL stuff, which just means you don't have to properly analyze your database structure beforehand, but with time-series it's a different matter. The data you tend to send to generic time-series databases tends to be very unpredictable. I currently don't care what data is sent to Prometheus or Influx. This includes, but is not limited to, ZFS stats of our storage, system load, network traffic, VMware, nginx/haproxy, application usage and errors, ... I know that when I need it, I'll have it available and can try to correlate the data at any point in the future. In TimescaleDB it looks like I would have to pre-create tables fitting all those stats, which would make it an absolute pain in the ass.

As counter-intuitive as it sounds, to make this proper/flexible/usable in the real world, you should impose a fixed/limited database structure. The tgres project mentioned elsewhere in this thread seems a lot more useful for 'generic' cases. Maybe they can be combined in some way.

Also, another important thing in the time-series world is a REST API. Yes, you might want advanced SQL to query data, but no, you don't want that to insert it. Pretty much any application can do http(s) stuff without any additional libraries (except when you're talking about C/C++ things). Postgres on the other hand? Oh no, I need a JDBC driver for this Java app, I need an extra Python library for X, ... Not to mention you suddenly have to manage database pools in the applications and have something that is a potential bottleneck.

Then, no Grafana support is a no-go. It is the de-facto standard visualization tool these days.

What about retention policies, you don't need ms-accurate data-points from 2 years ago, when talking about many data-points, you might want to aggregate this. Though since the data set in TimescaleDB will be limited due to specific data-sets, this could be less of a problem.

While you do need to create tables for your stats, you are able to ALTER TABLE just as you can for normal Postgres without problem. It may not be as pain-free as NoSQL in that regard, but building tooling for this or automating this is certainly possible.

Alternatively, if you do want pure blob storage, Postgres's JSONB datatype works as well, although with some performance trade-offs. This can work quite well if you have some structured data that lives alongside other unstructured data. [1][2]
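A hybrid of the two approaches might look like this (the schema is just an illustration, not something TimescaleDB prescribes):

```sql
-- Well-known fields as real columns; unpredictable per-source metrics in JSONB.
CREATE TABLE ops_metrics (
  time   timestamptz NOT NULL,
  host   text        NOT NULL,
  source text        NOT NULL,  -- e.g. 'zfs', 'haproxy', 'snmp'
  fields jsonb       NOT NULL   -- arbitrary key/value payload
);

-- Querying a field nobody planned for up front:
SELECT time, (fields->>'arc_hit_ratio')::float AS arc_hit_ratio
FROM ops_metrics
WHERE source = 'zfs'
  AND time > now() - interval '1 hour';
```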

Building or putting a REST front-end in front of PostgreSQL should not really be an issue (we built one for our hosted version), and there are already a fair number of PostgreSQL clients/libs. That said, for ease of use we are already thinking of adding an HTTP interface.

For Grafana: We were actually working on our own connector to support PostgreSQL, but found out that one is already in the works by the Grafana team (which will work out of the box with Timescale, because each of our nodes looks like PostgreSQL to the outside world). This doesn't seem like it'll be an issue for very long.

We've mentioned retention policies elsewhere, but you do bring up a good point. Instead of dropping data, support for aggregating data in older chunks to a more coarse-grain resolution is something we are already looking into.

[1] https://www.citusdata.com/blog/2016/07/14/choosing-nosql-hst... [2] https://blog.heapanalytics.com/when-to-avoid-jsonb-in-a-post...

I think you're missing my point a bit. My view is maybe a bit limited to the 'ops' side of things, but creating and maintaining tables completely defeats how I'm currently using metrics, and I suspect this applies to most people using them when I look at the available ops-targeted metrics collector tools: Collectd, Telegraf, Intel's Snap, ... Going over the list of their available modules/plugins/sources should give you an idea of how realistic it is to maintain tables for all those metrics if you were planning on adding support for them.

The main reason time-series databases are 'hot' these days is because ops jumped on them, and I think understanding how metrics are being used there and for what purpose is the key here. My first reaction when something generates 'events' or data - whatever that might be - is simply to push them into a time-series database without thinking about them. Is it data from snmp (a switch, firewall, ...), generic machine stats, database stats, application metrics, ping statistics, time until an ssl certificate expires, ... you name it - I don't care, maybe it'll be useful, maybe not. I don't even dare to estimate the amount of different metrics I'm currently tracking.

Real-world example: did I think I had to know what the sizes of my different caches on my ZFS storage units were? Not at all, but Telegraf pushed them anyway - so whatever. They actually ended up being very useful for tracking slow-downs on our file shares caused by some rogue process scanning all the files on them, completely trashing the caches in the process. I had the data right there at my disposal, and I didn't really know the details of what exactly I was tracking until I had to take a closer look.

These use-cases are something TimescaleDB's approach on its own would be completely unsuitable for. As you mentioned, JSONB has performance trade-offs, and blindly using it for everything sounds like a recipe for disaster. It also only addresses one problem; complexity of use is another, and everyone having to define their own structure and insert queries is yet another: there is no standard. While there are quite a few time-series databases out there, most popular tools support the most important ones, but I don't see how you can add support for TimescaleDB since the data structure is completely undefined.

That's why I mentioned the tgres (which I had encountered before). Something like that, backed by a Postgres database with TimescaleDB's extension could be very interesting. Doing the 'generic' metrics through a simplified, well defined interface with some fixed table-structures, while still allowing way more powerful 'direct' metrics to address niche needs sounds very interesting to me.

Currently however, I only see TimescaleDB useful within one well-defined application, where it can be very valuable. In the grand scheme of 'ops' things however, it feels too limited and inflexible, and not really like a time-series database.

> For Grafana: We were actually working on our own connector to support PostgreSQL, but found out that one is already in the works by the Grafana team

Ah yeah, I forgot that the Postgres datasource type was added in Grafana, yes that would do.

The other thing is time-series-specific functions, but I'm not really familiar with all the stuff Postgres offers. I know it's a lot, but I couldn't find things like regression analysis/derivative functions. Or maybe my google-fu is failing. Providing metrics-specific functions in the Postgres extension should be possible though.

Something that might be very interesting would be combining https://www.torodb.com/ with Timescale. ToroDB is database agnostic (though they prefer Postgres). If you extracted the appropriate keys (including the timestamp) out I expect the combination would be very powerful.

How does this compare to say, Aerospike, or Honeycomb.io?

Honeycomb is also unapologetically a SaaS. We believe that - unless your company's core competency is, in fact, managing databases and a garden of myriad open-source monitoring tools - it makes sense for most people to outsource their observability solutions.

(We also don't currently support joins, while TimescaleDB's joins sound pretty dope :))

There are a few differences:

- Neither Aerospike nor Honeycomb.io supports full SQL queries; instead they support their own custom (and more limited) query formats

- Aerospike is not optimized for time-based queries and is more like a key-value store. You cannot get the same performance

- Honeycomb is a column store. Ours is built on Postgres and can work with your existing Postgres databases

As for performance, we are still working on gathering numbers that we hope to share soon comparing us against other solutions including Influx and similar.

Hi from Timescale, here are 6 insights we took away from this awesome discussion. Thank you!


This looks promising! Somewhat unrelated question, but has anyone had success using AWS DynamoDB as a time-series database? I'm generally curious on the case for a "managed" TSDB versus setting up your own open source one.

Regarding DynamoDB, it is a key/value store, so IMO not a very good fit. I suppose you could make it work for specific cases, but you will only be running into issues, for example (note that I'm more familiar with Riak, which is based on Amazon's Dynamo design, so it should have similar limitations):

You could use a timestamp as the key and put data as JSON, but now you need to make sure that all data arrives at the same time so it can be inserted once; otherwise you will be doing a series of updates.

For querying you are very limited, because you mainly want to query by the main key, which would be the time. You can get only one key at a time; you can't request a range of data or data fitting specific criteria (unless you use some extra indices, which come with their own issues). If you have holes between the keys you will need to find some way to make them predictable, because listing all keys is a very expensive operation that only makes sense when testing.

Expiration of old keys... generally it is not pretty either. Riak has the Bitcask backend, which can expire old data; that could help, although the expiration is more of a guaranteed time before which a given key is not purged. I'm not sure whether DynamoDB has this as well; Riak's LevelDB backend, for example, doesn't allow expiration.

There's no dedicated time series database[1] in AWS; the closest thing to it would be using a relational one.

[1] BTW: don't use the "TSDB" initials, because that's the name of another time series database that runs on top of Hadoop; it confused me a bit :)

We are trying to get our extension onto major cloud providers like Amazon RDS and similar, so that might be useful for your use case.

Very cool. The partitioning reminds me of the way VoltDB does it. By splitting it across a set of keys it's able to ensure that queries can be run concurrently on their own node.

Are you guys familiar with it and I am understanding this properly?

Don’t remember enough specifics of VoltDB’s partitioning/parallelization details (or its academic H-Store predecessor), but that wouldn’t surprise me.

VoltDB's focus on all in-memory processing and general OLTP workloads (so-called "NewSQL") targets a different point in the design/problem space though.

How does that differ from PostgreSQL's stock table inheritance (partition data on arbitrary rules, while letting you efficiently query the parent table using constraints)? The TimescaleDB approach apparently looks very similar.

We actually really like (and use) Postgres inheritance. Our extensions add functionality specifically geared for time-series workloads. Namely, we add:

- Automatic table-sizing. This includes dynamically creating new tables for new data, closing old tables as soon as they get too big, etc. (more challenging than meets the eye because of deadlocking issues, etc.).

- Related to above, optimized inserts to the correct chunks. In Postgres 9.6, this is slow using rules/triggers. In upcoming Postgres 10 the situation is supposed to be better (waiting for finalization of 10 to test). But in Postgres 10, auto-closing and table creation are still not supported, and implementing them would probably require the slow path. In contrast, our inserts are very fast.

- Hash-partitioning query pruning according to partition key

- (Time based pruning supported also, but that's standard Postgres)

- Time-based query optimizations. For example, we enable the planner to group by time-intervals using indexes on time-columns. Plain postgres can't support that because it does not know that truncating/rounding time is order-preserving. Other optimizations in active development.

- Our upcoming clustered version will include distributed query optimizations. In addition our scale-out design takes advantage of many workload features unique to time-series (see white-paper for more details)
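For context on the insert point above, the 9.6 "slow path" routes each row through a trigger roughly like this (table names illustrative):

```sql
-- Vanilla Postgres 9.6 partition routing: every inserted row takes a
-- PL/pgSQL round-trip through this trigger, which is the slow path.
CREATE FUNCTION metrics_insert_router() RETURNS trigger AS $$
BEGIN
  IF NEW.time >= '2017-04-01' AND NEW.time < '2017-04-02' THEN
    INSERT INTO metrics_2017_04_01 VALUES (NEW.*);
  ELSE
    INSERT INTO metrics_overflow VALUES (NEW.*);
  END IF;
  RETURN NULL;  -- row was handled by a child table, skip the parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER metrics_route
  BEFORE INSERT ON metrics
  FOR EACH ROW EXECUTE PROCEDURE metrics_insert_router();
```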

Out of curiosity, is it inheritance you like, or just the partitioning aspect of it?

I've actually used inheritance far more than partitioning, but I know of others that would love to see inheritance die.

Very interesting.

Can you speak as to the stability of your system for production usage?

We are currently in beta, and so I don't think production use would be a great idea at this point. We are looking to get things more stable relatively soon so that more adventurous users may try us out in non-critical-path scenarios.

Right now we're looking for feedback as people use it on their own machines or experiments. Don't want to mislead people into using this for something critical and having it not work.

Thanks for the clarification.

In the next few months, I'm going to be examining the feasibility of Postgres as a timeseries store, with an eye towards deployment onto RDS for production.

Might have some feedback then.

Best regards- this project looks like The Right Way Forward.

Thanks! Yes we've had talks about how to get our extension on RDS so hopefully by the time you're ready we'll have that done and ready for you.

I couldn't find the architectural diagram on how do you scale this across N nodes, can you elaborate on that? Is it similar to Postgres XL or Citus?

At a high-level, most distributed architectures take a similar approach: Spread partitions over multiple nodes, insert/query to any node, route (sub)requests to the appropriate nodes. The unique parts of Timescale is that, unlike Postgres XL or Citus, our design/optimizations all focus on time-series data.

So, as we've mentioned in some other replies, this means that our partitions/chunks are typically based on two-dimensions (time and a primary partition key), chunks across nodes are automatically created and closed (based both on time intervals and sizes), and there are a bunch of time-based distributed query optimizations and time-series-specific functions that we'll provide out of the box.

More information in whitepaper: http://www.timescale.com/papers/timescaledb.pdf

A single-node version is currently available for download. A clustered version is in the works.

Adding clustering on sounds like the hardest part.

It's definitely something we want to make sure we get right before releasing. We do think single node can work for a lot of use cases though, especially with our scalable insert/write rates and our time-series specific optimizations.

Are the hypertables effectively views with insert triggers on them?

Does Timescale have any baked in support bitemporal constructs?

Are temporal foreign keys supported?

It's a bit more complicated than that. At a high-level: views + auto-sizing of tables + fast insert path + time-based query optimizations + (upcoming) cluster support (+ more).

We do not currently support bitemporal constructs.

What's an example of a temporal foreign key query that you need?

If you have two logical views of information, say people and their jobs. Imagine the attributes of people change over time. Also the attributes of their jobs change over time. It's possible to delete a person record from the person table. What if a Job record references that person and it intersects with the period of time that has been removed from the person. Suddenly the data would be nonsensical. I don't ask this to be obtuse, but when I see joins supported in sql on a time oriented database, I assume some position is being taken on this.

Basically, does the database have an opinion on time oriented data consistency?
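The scenario in the question can be sketched with range types (hypothetical tables):

```sql
-- Each row records the period during which its attributes held.
CREATE TABLE person (
  person_id int,
  valid     tsrange,  -- validity period of this version of the person
  name      text
);

CREATE TABLE job (
  job_id    int,
  person_id int,
  valid     tsrange,
  title     text
);

-- A plain foreign key can't express "every job's valid period must be
-- covered by some person row's valid period", so deleting a person row
-- can leave jobs referencing a removed slice of time with no error raised.
```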

Ah, sounds like you are again thinking about bitemporal reasoning over relational data.

We are not really built for that use case. Instead of focusing on fundamentally relational data such as people or jobs (which you are free to store alongside our stuff, using plain postgres functionality), we store event or metrics data. This data can be easily joined against the relational data (e.g., using a person_id stored in both event and relational tables).
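A sketch of that pattern, with a hypothetical events hypertable and a plain relational table:

```sql
-- Event/metric rows live in the hypertable; slowly-changing attributes
-- live in an ordinary Postgres table, joined via a shared id.
SELECT p.name, avg(e.heart_rate) AS avg_hr
FROM events e
JOIN person p ON p.person_id = e.person_id
WHERE e.time > now() - interval '1 day'
GROUP BY p.name;
```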

I reread 'managing time in relational databases' like 3 times one summer and now every time I mess with time in databases I have a heart attack.

That makes sense, thanks for answering ;)

Joins against time series data sound really nice.

We like to think so! We provide some sample datasets for people to play with[1][2] that have relational "metadata" tables alongside a time-series table where you can do this.

One thing we'll note is that currently JOINs between two time-series tables (what we call 'hypertables') are not optimized, but we're working on it! :)

[1] http://docs.timescale.com/other-sample-datasets [2] http://docs.timescale.com/getting-started/tutorial

How does this compare to citusdb?

Author here. We actually know the Citus guys pretty well, and spent some time comparing notes last week at PGConf (Hi Ozgun!). We're quite friendly.

At a high-level, Citus is solving a very different problem: general analytics and transactions at scale, ideally for B2B applications. On the other hand, Timescale is laser focused on time-series workloads.

would be interesting to get that integrated with Prometheus remote storage.

We plan to add support through remote_read/remote_write, but it is still on our todo list. (And we're open to suggestions.)

Congrats on the launch!

How many dimensions does each metric support?

Not sure if this is what you're asking, but some monitoring focused systems have you associate each individual time-series metric with a bunch of metadata/labels (either encoding into a single string ("dc1.rack15.server46.port2") or in some json/array.

We expose a more traditional postgres schema, which is a different way of thinking about the data model if you come from a monitoring world.

So you can shove data into a postgres JSONB or array type if you want, but you might more commonly have all your data items stored as individual columns. And different/many metrics can be stored in a single row, so (1) you aren't paying for the label cost with each metric, (2) our query planner makes it efficient to support complex predicates on these metrics. Like:

SELECT date_trunc('minute', time) AS minute,
       COUNT(device_id)
FROM logs
WHERE cpu_level > 0.9
  AND free_mem < 1024
  AND time > now() - interval '24 hours'
GROUP BY minute
ORDER BY COUNT(device_id) DESC
LIMIT 25;

The real power of time series databases comes from the ability to aggregate metrics based on arbitrary dimensions. For example, to create a dashboard containing stack graphs of http requests summed across all servers of a particular application name, grouped by response code, one for every application name, then separately sectioned in the dashboard by region - but only for the servers in the production environment. And none of the dimension keys or values are known to the database in advance.

Datadog, SignalFx, Prometheus, and Wavefront all have this ability. It's now standard functionality for any reasonably advanced time series monitoring implementation.

The fact that your implementation happens to be backed by Postgres is interesting, but it's merely an implementation detail as far as power users are concerned.

You can query by arbitrary dimension (what we would think of as time aggregates). In the future, you'd be able to materialize to arbitrary dimensions with continuous queries.

How do you support arbitrary dimensions in a tabular database system without massively blowing up the number of tables, the number of columns, or number of sparse rows?

This kind of problem is really an OLAP cube problem, yet you're throwing an OLTP solution at it. I've seen it tried before at several sites and it's been abandoned each time for having crippling performance and management behaviors. What are you doing differently that will make this attempt successful?

Sorry, not sure I follow. Do you mean how many partitions are supported?

We allow up to two dimensions of partitioning: we always partition by time, and we allow you to specify another key on which to partition. Allowing multiple keys to partition by is probably far down the road.

As for the number of partitions on that non-time key: I don't think we have a limit (or it's something like 32,000).

Not sure if that answered your question.

Wow, cool!

How does this compare to Citus?

Hi, Author here. I just answered this question on another thread: https://news.ycombinator.com/item?id=14037105

But at a high-level Citus is solving a very different problem (general analytics and transactions at scale, ideally for B2B applications).

On the other hand, Timescale is super focused on time-series workloads.

I'm a novice regarding DBs. What's the significance of this?

Hi, I'm the author. There are a few things worth noting:

1. Time-series data (data collected over time, e.g., DevOps data, financial data, sensor data) is a growing (and lately, quite popular) type of "big data" thanks to a number of trends: more sources of data, cheaper storage, rise of a monitoring culture, IoT, etc.

2. One of the characteristics of time-series data is that it grows very, very quickly (e.g., when collecting data every few seconds), making it hard to use traditional relational databases for storage. In response, people have developed specialized time-series databases, which so far achieve scalability by sacrificing query power.

3. We just released a new time-series database that takes a different approach, achieving both scalability and query power. One of the benefits of our approach is that one can use normal SQL to run queries, something that was missing in prior time-series databases. Another benefit is that we are tightly integrated with PostgreSQL, one of the most popular, reliable, stalwarts of the database world. A lot of people find this approach useful.

If you collect any type of time-series data, then I welcome you to take a closer look. Happy to answer any other questions.

I really hate this style of writing. Why does it have to sound like every other hipster IT text?

- I'm not your padawan

- PostgreSQL is cool; why do you say it is boring?!

- Yes, awesome, some PhD people are vouching with their titles for a piece of software I might want to use in a production environment? Nope. Thanks for the effort, and I will watch it, but no.

I do like that you're doing this work, and thanks for it, but why not write a post that goes into details instead of this hipster pseudo-tl;dr text?

Sorry the tone wasn't to your liking. For a more academic voice (and deeper technical dive), you might enjoy our LaTeX-typeset paper instead:


Hey, thanks, I like that way more.

^ when boring is awesome
