Not a surprising result at all. Timestream is another case of AWS trying to make something for the sake of having it (see Kinesis instead of just doing Kafka, among many other products) instead of just adopting an industry standard. That isn't to say they don't do some good stuff, just that they also ship a lot of crap; there are very much two tiers of products in the AWS catalog.
I would eventually like to see a comparison against Druid for a more analysis-heavy workload, though.
Simple metrics workloads like what Timestream and Influx handle aren't really of interest when compared to Timescale's ability to do JOINs and use other SQL and PostgreSQL-specific functionality.
I think AWS is early to offer a not-yet-matured product just to get it in front of customers and start gathering real usage feedback. I’d consider Dynamo a fantastic tier-1 service, but it wasn’t always that way. This is in line with the Amazon philosophy in general.
As for “making something for the sake of having it”, I believe the key reason for building their own applications is that they can build it on the same multi-AZ incremental ledger/block store that underlies most services (first built for Aurora), or a variant thereof. They have some pretty stellar underlying tech built to be reliable at cloud scale, something you won’t get from running a service on your own EC2 fleet. The new stuff built on top of it just isn’t all that mature yet. That’s just my take, though. If you're interested, the deep-dive talk on Aurora from re:Invent last year is pretty cool.
Disclaimer: I work at Amazon, but have nothing to do with AWS or Timestream. In fact, I’ve been excited about Timestream for a while but found its initial performance lackluster for what I needed.
>I think AWS is early to offer a not-yet-matured product just to get it in front of customers and start gathering real usage feedback.
It's not just this. In many cases, the services launched are essentially MVPs that were only created to serve a handful of customers' needs. If a huge whale of a customer says they want hosted Jupyter notebooks on AWS (a theoretical example I picked because I saw someone criticize Sagemaker; I don't have any specific knowledge of Sagemaker), then AWS will create that for them. But AWS doesn't actually do one-off services for a specific customer, so they will also launch those hosted notebooks as a public service.
The result is of course that this service was built with one customer's use case in mind, so it might not serve other customers' use cases very well, which leads to some negative perception. But ideally AWS then goes and incorporates feedback into the service, turning it into a "tier 1" service rather than an MVP. It just takes time.
I used to work for one of AWS's biggest customers, and this was 100% how we operated: we told AWS what we wanted them to solve, and they would build it for us. Once we were done closed-beta testing it and getting changes in, it would get released as a public service a year or so later.
They are also incentivized to do it because it makes AWS sticky. The more of their services you use, the harder it is to migrate somewhere else. They don't need to put out quality services, they just have to put out enough that you become dependent on one or two of them and then you're stuck.
I remember the announcement of Sagemaker at an AWS summit, a moment of profound embarrassment - the whole thing was barely a Jupyter notebook glued to some EC2 stuff with duct tape.
You're right, they're definitely doing things for the sole purpose of locking you in.
You still have to compile your own Docker image and then "run it" from a Jupyter notebook if you are not using their Estimators (who uses those?). What is the point of just launching Docker images from Jupyter notebooks and looking at a different window to see the logs?
Serving is OK on Sagemaker, I will admit. As for the new ones (feature store, pipelines, and data wrangler), we will see; all rushed, some more than others.
Sometimes I'm not even sure they're re-inventing; sometimes they're just adding a vendor-lock-in layer on top of the service and making it run on Lambda so they can advertise it as "web-scale".
But why do they do it so badly? They seem to have the engineering muscle to put out a great reinvention of a document DB, a time-series DB, a graph DB... but each is relatively poor compared to the other services on AWS. Is it the combination of MVP culture and lock-in?
"The viability of our company, Timescale, is 100% dependent on the quality of TimescaleDB. If we build a sub-par product, we cease to exist. Amazon Timestream is just another of the 200+ services that Amazon is developing. Regardless of the quality of Amazon Timestream, that team will still be supported by the rest of Amazon’s business – and if the product gets shut down, that team will find homes elsewhere within the larger company."
Breadth versus depth seems to be the play outside of their absolute core offerings. It allows people to dip their toes into something with minimal commitment and by the time they realize the shortcomings they might conclude that migrating for those features just isn't worth it.
So this gives them a bit of lock-in. Come for the EC2, stay for the minimally viable queues, machine learning, containers, etc.
This exactly. They are just trying to get their foot in the door on literally any idea someone else is already pursuing, so that if it starts to make money they can put the juice on the engineering for it and take their market share.
The business model is solely focused on competing with and taking out other businesses, not actually providing any value. AWS is like a pack of sharks.
The value they sell is really operational convenience. Over time they have figured out how to run large services at scale, and monetize that. The features of those services are often weaker than competition, as noted in this thread.
> ...see Kinesis instead of just doing Kafka, among many other products
Kinesis was a solution to their in-house woes with metering and billing, which were absolutely drowning in data from internal services with myriad metering and billing items and rules; they had to do something about it themselves: https://gigaom.com/2014/03/20/why-amazon-built-its-data-stre...
BTW, Kinesis Data Firehose is a pretty good product.
It seems like AWS is just focusing on increasing its vendor lock-in, and it makes sense not just from the AWS point of view. If you're a developer at an enterprise company that has authorised AWS, the more databases they clone, however poorly, the more choice you have without having to get authorisation, and the cost doesn't matter to the developer since the enterprise company is paying for it.
I suspect there are many enterprises that are already so locked into AWS that it would take them a decade to move to a different provider.
I don't have much expertise with Timestream, so I can't comment on it. But there is nothing wrong with trying an alternate implementation. We're using Kinesis and it is a good fit in our stack, with a lot less overhead. They did bring Kafka (Amazon MSK) to the AWS service list later. It's up to the consumer to pick Kinesis or MSK.
One influential example: DynamoDB. The Dynamo paper inspired a number of NoSQL databases, including Apache Cassandra. If they had gone with the popular NoSQL database of the time, we might have missed some good contributions.
You can, but then you need to spend time setting up and maintaining the service, which many enterprises aren't willing to do. Also, while you may be authorised to use AWS tech, you may not be automatically authorised to use different tech. We were literally told to use MySQL instead of MongoDB solely because RDS was allowed and all services had to be managed by AWS.
I am not complaining; AWS doing vendor lock-in makes complete sense. I am just commenting that it is the main direction of their offering. And their vendor lock-in is aimed at the enterprises, where they can really run wild on it and get lots of money in return. Smaller organisations will generally never lock themselves in so much, because they won't build so many tightly coupled things; they just don't have the manpower.
And we in fact have fully open-source/free k8s helm charts to easily spin up TimescaleDB, including with HA and automated failure detection/failover and streaming replication for incremental backups:
In a recent webinar for capital markets, AWS didn't even try to advertise Timestream, they talked about the three most popular tsdb in finance and that's it.
In my opinion it would be very hard to justify using Timestream for any analysis-heavy workload, for at least three reasons:
1. Queries will need to touch a lot of data, which will cost a lot, and there is no ability to optimize the queries in any way (no EXPLAIN, no indexes, no downsampling)
2. No integration with data exploration and visualization tools
3. No ability to have non-timeseries data, or correlate any data in two different tables (no JOINs)
What about comparing TimescaleDB to VictoriaMetrics? There are some old benchmarks [1], but it would be great to see updated benchmarks as well for the latest TimescaleDB and VictoriaMetrics versions.
TimescaleDB inherits from Postgres a rich role-based access control, so all requests require permissions, and roles can be enforced at various levels (database, schema, table, or even row) with various privileges (SELECT, INSERT, DELETE, CREATE, etc).
Authentication to the database is typically governed by different security mechanisms, including certificate-based auth, password over SSL, etc.
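For illustration, a minimal sketch of what this looks like in plain Postgres (the role, schema, and table names here are hypothetical, not from the article):

```sql
-- Hypothetical read-only role for a dashboard service
CREATE ROLE dashboard_ro LOGIN PASSWORD 'changeme';

-- Privileges granted at schema and table level
GRANT USAGE ON SCHEMA metrics TO dashboard_ro;
GRANT SELECT ON ALL TABLES IN SCHEMA metrics TO dashboard_ro;

-- Row-level security narrows what a role can read within a table
ALTER TABLE metrics.readings ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON metrics.readings
    FOR SELECT TO dashboard_ro
    USING (tenant_id = current_setting('app.current_tenant')::bigint);
```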
How do you tell which service is in which tier, crap or good? What's the metric? I am not a direct user of AWS, so absent an informed technical judgment of "this is a deployment and renaming of that" (e.g., Kinesis for Kafka), how do you know what to avoid? In general, it seems like the value-add for service X is that its function is now integrated to some degree with the rest of AWS, so maybe it's based on effective feature set/use cases plus integration points. Alternatively, some function of throughput and unit cost.
If I were a SaaS competing with an AWS service, I would want to know how long before a given AWS service might become good enough to be an alternative for my customers. If you run with the idea that AWS offerings get better over time and eventually cross some threshold from "not good" to "good", maybe a crawl of the AWS announcements[0] in combination with a metric could provide a predictor of an AWS service's time to "good"...
Amazon designed a system that is perfectly horizontally scalable. The only issue is that they now need 33,000 machines to achieve the throughput of TimescaleDB running on a single node.
I would call the results this new school of software development produces horrifying, but I already get lynched every time I tell a startup that their horrifyingly complex, horizontally 'scalable' AWS setup that costs them $10k a month could be replaced by a handful of duct tape running on a Raspberry Pi.
The best part is how the scaling never really works, and just adding a load balancer and a second Raspberry Pi would've worked better.
"We didn't think we'd get that many users. I mean 10k! Whew! It just wasn't designed for this!"
"Oh. That sucks. My rPI is serving 2 million monthly visitors right now and I'm still waiting for CPU usage to dip into the double digits."
I'm not saying people should design their stuff to handle millions of users on underpowered hardware, but I am suggesting they not lose their sense of perspective while building their whatever.
Because if you do, you may end up thinking you did well when your DB maxes out at 500 inserts/second, when another DB running on a Nintendo DS might just outperform it.
Horizontally scalable systems are usually slower than single-node solutions because of the overhead of passing data between nodes. We at VictoriaMetrics see this clearly: the cluster version of VictoriaMetrics needs up to 2x more CPU and RAM than the single-node version to achieve the same performance.
I'm very partial towards Postgres, and consequently towards Timescale as one of its biggest shining stars. Combine that with a cynical view that this is essentially S-tier content marketing (in the tech world), and the numbers are still bonkers.
Might as well add on a bit here -- if you're into this sort of thing you might enjoy Timescale thrashing other purpose-built databases (which have since also improved so YMMV):
- Timescale vs Influx[0]
- Timescale vs Mongo[1]
- Timescale vs Cassandra[2]
These blog posts are all from the timescale side, but the numbers are undeniable and their methodology is open source -- not sure there's more you can ask. Cassandra probably scales better at FB/Twitter scale, but with the recent work done to improve Timescale's scaling features[3] and the permissive license (unless you're an Amazon) I'm not sure even that is a real benefit.
I'd love to see a response from Amazon -- if HN is so lucky to have a dev who worked on AWS Timestream that would be awesome.
I did some time-series data benchmarking recently. Most large data is "time series" data, but I will not digress on terminology right now. For the use case I was looking at, my results for InfluxDB did not qualitatively disagree with the above blog post. Timescale got better compression efficiency, faster query results, and more constrained memory usage, but lower ingest speed at high concurrencies, blocked by WAL insert locking and single-threaded compression. If InfluxDB's measurement-oriented data model and query interface are a good fit for your use case, and you don't have high-cardinality data, then it might be convenient to use it, but it's a terrible choice for anything outside its niche.
However, none of the databases tested above is anywhere close to the efficiency of a column-store database that can do vectorized execution over batches of rows. ClickHouse is a good example of one such database. For queries that have to sift through large amounts of data, either filtering or aggregating it, the performance difference is easily >10x. I was seeing aggregation performance above 2B rows/s, and that was I/O-throughput bound.
I know the first thing that shocked me was how they could get so close to something that was purpose built for time series. Even being within spitting distance is really great in my opinion for an off the shelf, general tool.
> However, none of the databases tested above is anywhere close to the efficiency of a column-store database that can do vectorized execution over batches of rows. ClickHouse is a good example of one such database. For queries that have to sift through large amounts of data, either filtering or aggregating it, the performance difference is easily >10x. I was seeing aggregation performance above 2B rows/s, and that was I/O-throughput bound.
Agreed -- OLTP databases (in the case of Timescale) and purpose-built timeseries-focused (but not necessarily analytics-focused) DBs are no match for a proper OLAP database.
Did you write about this anywhere? I'd love to read it. I've never had a real need for the kind of stuff that ClickHouse does, but it looks to be best in class among F/OSS OLAP DBs. Have you ever tried Druid?
I was really surprised at how well Timescale's compression worked. It was pretty much comparable to best-in-class column stores. Only the row-by-row query execution engine was holding it back. Perhaps something that future versions of Postgres can help with.
I haven't published the benchmarking results anywhere yet, but I will probably do some conference talks on it once that is a thing again.
I didn't look into Druid in detail, but I did try out Hive. Both look more suitable for cases where significant engineering effort goes into developing the data ingest and structuring pipeline. I wouldn't recommend either to a small team. With a measly triple-digit-TB database size, both seemed overkill.
> I was really surprised at how well Timescale's compression worked. It was pretty much comparable to best-in-class column stores. Only the row-by-row query execution engine was holding it back. Perhaps something that future versions of Postgres can help with.
I think zedstore[0] might be something that could help here. I've mentioned it in the past, but one of the best things about Postgres is its extensibility, and if Timescale rides that wave (and maybe contacts the zedstore folks to get this integration started early) it could be awesome.
> I didn't look into Druid in detail, but I did try out Hive. Both look more suitable for cases where significant engineering effort goes into developing the data ingest and structuring pipeline. I wouldn't recommend either to a small team. With a measly triple-digit-TB database size, both seemed overkill.
Thanks for this -- I haven't tried it yet at all but will try to remember this. ClickHouse was already first on my list for hobbyist->enterprise scalability but this cements it.
I'm out of my depth here (not a db guy but curious), so forgive the interruption, but have either of you messed with either SciDB or VoltDB? I have found them to be quite interesting especially given Stonebraker himself being involved in the design. Druid and timescale have also been on my radar too, and I think these comments have finally given me the push to play around with them in my sandbox.
VoltDB is about performance by computing entirely in memory and using a strict sharding architecture. It's unique in that you interact through stored procedures that are sharded along with the data, so each node processes completely in parallel without any side effects. It's more akin to other streaming/SQL processing engines like Spark than to a fully functional SQL database.
I highly recommend MemSQL (now called Singlestore) for a really polished distributed relational database that combines OLTP with OLAP columnstores.
I've never even heard of SciDB, thanks for the pointer. I have heard of VoltDB and the only thing I remembered was that it was in-memory...
I'm still not 100% sure what SciDB is for but it looks like none of the databases we're mentioning are directly comparable:
- SciDB for scientific computing (??)
- VoltDB for in memory + SQL
- Druid for OLAP queries
- TimescaleDB for row-based timeseries storage
Seems like all apples to oranges to me. I personally like TimescaleDB because it runs on Postgres and I can usually find a way to do all those other things in postgres relatively efficiently (memory requires some squinting while using UNLOGGED tables).
ClickHouse is a great OLAP database with outstanding performance! It can be used for collecting and querying observability data [1]. But it may be hard to properly design a ClickHouse database schema for storing general-purpose observability data. That's why we created VictoriaMetrics, a purpose-built time series database based on ClickHouse architecture ideas [2]. It just works out of the box, without the need to design a database schema, while providing outstanding performance [3].
We've made every effort to be open about all of the testing we do and many people outside of Timescale contribute to TSBS.
As an aside, we tried to solicit help/feedback in both Twitter and Reddit groups. One AWS employee reached out and offered to pass it to the team (I was upfront about what we were doing), but we never heard back.
I'm also super curious about AWS's response. Because these numbers - and the aspect of backups - mean that, barring Timescale just not doing the benchmarking right, Timestream isn't going to be an option.
I think those benchmarks vs TimescaleDB require an update; a lot has changed since the last test. However, even then VM's ingestion rate was 11 million/s on a 32-CPU instance, and this topic says TimescaleDB's ingestion rate is 3.1 million/s.
I enjoy trying to guess how cloud services like Amazon Timestream are internally built. I have already bet in the past that DocumentDB was built on top of (Aurora) PostgreSQL (I also knew this was possible because I founded https://torodb.com).
My bet for Timestream is that it is built on top of DynamoDB. There are many potential indicators (1KB writes, throttling) and some clear ones (pricing follows the exact same proportions up to the 4th decimal if you compare across regions, for example); it supports "unlimited" scalability, batching, etc. That reads are eventually consistent may be because they are computed over a GSI. It would be interesting if true, as it is cheaper for writes than DynamoDB (on demand, which is the model Timestream has).
Plus there is a reasonable amount of mindshare, and possibly a market opportunity, in offering time series on a serverless database like DynamoDB.
If this were true, it would also mean that Timestream, as hinted in the post, is more performant when accessed more in parallel (as DynamoDB itself is by design a massively parallel multi-tenant infrastructure).
This is a likely option, too. Your observations on pricing are keen.
The one thing that threw me was querying. The limitations felt more Athena-like than PartiQL. And the billing based on scans felt almost like Redshift Spectrum. I mean, S3 is infinitely scalable, right?
Note that DynamoDB's PartiQL support is not an architectural pattern but rather a "simple" translation layer on top of the current three read operations. It doesn't say much to me tbh.
I wouldn't say it's impossible to say. Maybe impossible with 100% certainty, obviously, but to me it's quite clear ;)
Timescale employee here. We generally update our benchmark blogs every year to keep them fresh. No doubt you'll see the updated results here on HN as well!
Yup, I would've kept it in the submission except the full title is 16 characters too long for the HN Title field. It felt weird to only include part of it, so I just chopped all of it off.
Oh yeah, sorry - I didn't intend it as a criticism! I think the HN headline should be concise; what you have was perfect. _Maybe_ adding the word "benchmarks" could have clarified, but you cover more than that in the post.
Disclosure: I work at AWS but not on Timestream. Opinions my own.
Unless I'm missing something this is not an apples to apples benchmark. TimescaleDB is running as a single node without any replication whereas Amazon Timestream is replicated[0] to three AWS Availability Zones for durability. I've only skimmed the TSBS[1] repo and the start/stop scripts for TimescaleDB. Can someone confirm this?
Our experiments all ran on cloud instances (DO and Timescale Forge) that use storage that is replicated across multiple AZs/racks for greater reliability and fault tolerance.
This contradicts what is in the blog post. From the machine configuration section:
> 1 remote client machine, 1 database server, both in the same cloud datacenter
> Disk Size: 4.8TB of disk in a raid0 configuration (EXT4 filesystem)
Both those statements lead me to believe it's a single server with locally attached SSDs in a RAID0. Which is it?
I know benchmarking is hard, and it's difficult to test certain aspects of Amazon Timestream due to it being a managed service, but I really think these details need to be firmed up to make sure you are comparing apples to apples. TimescaleDB seems like a cool product.
Another suggestion I'd make is to run the Amazon Timestream clients across multiple AZs if you aren't already. The blog post doesn't mention whether all the t3 instances are in the same AZ or not.
> Another suggestion I'd make is to run the Amazon Timestream clients across multiple AZs if you aren't already. The blog post doesn't mention whether all the t3 instances are in the same AZ or not.
They were all run in the same AZ. This brings up a good point, however, that we discussed internally when the first results came back. If there are tricks like this that might improve performance, it's not (currently) spelled out in the documentation so there's no way to know that. And we reached out for help in various forums with no response.
It's worth noting that since we performed this analysis, Amazon did release their own tooling for a similar benchmark and created a post[0]. Since neither it nor the tooling documentation[1] specifically spells out how many threads or instances they ran to achieve their results, it's hard to draw an apples-to-apples comparison. It does reveal that they used (had to use??) an m5.24xlarge instance (96 vCPU, 384GB) to run their tests. As discussed in the article, one much smaller t3 instance was able to insert >1 million metrics/second into TimescaleDB running in Timescale Forge.
Would that really explain the difference between 5 minutes on Timescale and 2 weeks on Timestream? Timeseries databases need the ability to ingest data rapidly.
Please see below for the correction... at some point today a poster said "~2 weeks" for Timestream, but that's incorrect. We ran the data load/ingest for ~2 days (40 hours).
Just want to make sure the right numbers are being used! Thanks!
Large numbers are unintuitive. If we're conservative and add only 1 millisecond of latency for the 3-AZ replication, on 410 million records that is about four and a half days if the client is not doing any batching.
TSBS did batch 100 metrics at a time, the max per request that Timestream allows. As we (and others) have suggested, this could easily be one of the reasons it's harder to ingest more quickly without significantly more threads. TimescaleDB (and other TSDBs) allow much larger batches.
OT: Every time I see Timescale on here I'm excited to use it. After some preliminary testing I always run into the same issue: can't create a hypertable if there's a unique index / primary key that doesn't include the 'time' field.
For example my data looks like (row_id, user_id, time, data), where row_id is a unique ID, and time is a non-unique time field. Timescale will refuse to create a hypertable like this because it causes partition issues.
You can create a hypertable with a UNIQUE composite key on (row_id, time).
Otherwise, you are correct in that we require your partitioning keys to be at least _part_ of unique constraints; otherwise, we'd need to build global indexes across all your chunks (which would inhibit scalability)...this would only be worse with multi-node =)
Why not create an index on user_id, time? Why do you need a unique reference back to the row? We find that they're often (though certainly not always) kind of meaningless and/or unnecessary for timeseries workloads.
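To make the composite-key suggestion above concrete, a minimal sketch using the example schema from this thread (the column types are my assumptions):

```sql
CREATE TABLE events (
    row_id  uuid        NOT NULL,
    user_id bigint      NOT NULL,
    time    timestamptz NOT NULL,
    data    jsonb,
    -- the partitioning column (time) must be part of any unique constraint
    UNIQUE (row_id, time)
);

-- partition the table by time
SELECT create_hypertable('events', 'time');

-- non-unique index serving the common user_id + time-range queries
CREATE INDEX ON events (user_id, time DESC);
```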
Yes that's what the reply suggested. However the business requirement is that the row_id should be unique, such that looking up a row by ID is guaranteed to have 0-1 results. A unique constraint on (row_id, time) doesn't satisfy that.
Sorry, I thought the previous reply suggested it was on (row_id, time). Anyway, I think the question is: why should row_id be unique? What looks it up by row_id? And can they look it up by user_id/time instead? It may be impossible. On the other hand, if row_id is really what you're searching by, not time, then just use our partitioning on row_id; assuming it's a bigint or something like that, you can just partition on that...
No problem, my fault: I misread your comment. `row_id` is a unique identifier for the row; for example, my API will need it when the user wants to delete a specific row. Since it's possible for multiple rows to have the same `time`, even for the same `user_id`, I cannot assume uniqueness there.
Partitioning on the row_id by making it an ordinal instead of a UUID could work; however, I feel I would be missing out on TimescaleDB's advantages for querying based on the `time` filter.
Consider that my main queries are:
* DELETE FROM t WHERE row_id = xyz AND user_id = xyz
* INSERT INTO t (...)
* SELECT FROM t WHERE user_id = xyz AND time > xyz and time < xyz
I mean, I'm not sure this is the place to get into detailed schema discussions, but I don't think I would use a row_id here at all. Certainly there's no need for it to be unique across all of the rows: you could make a unique constraint on (user_id, time, row_id) and then provide all three of them for the deletes. If the only reason is the deletes, then that should be fine. If there are other reasons, then you probably have a different unique constraint that has some real-world meaning (and that likely involves time).
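A sketch of that shape, reusing the hypothetical `events` table from above:

```sql
-- unique only in combination; time is included so each chunk can enforce it
ALTER TABLE events
    ADD CONSTRAINT events_uid_time_rid UNIQUE (user_id, time, row_id);

-- the delete supplies all three values the API already knows
DELETE FROM events
WHERE user_id = $1 AND time = $2 AND row_id = $3;
```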
If you take one thing away from this it's do not use AWS timestream, and if you are using it, get rid of it quick. It's unfit for purpose.
That first graph tells you everything you need to know. Time to insert a billion events:
TimescaleDB: 5 min
AWS Timestream: ~2 weeks!
I think the team at AWS that built it should be reassigned, and contractors or an A-team brought in to try and salvage it. Because they've built themselves a very expensive lemon, and they're trying to sell it to the world as a Cadillac. The fact that they did not detect this themselves before or since releasing it upon the world demonstrates to me that they're incompetent and have to be replaced.
> I think the team at AWS that built it should be reassigned, and contractors or an A-team brought in to try and salvage it
That's a knee-jerk assessment. You don't know the first thing about Timestream's development and we are only a few years into the development process of what I believe is a novel architecture for a timeseries database. Give it another couple of years and then we can take a look at how Timestream is doing.
One of Amazon's advantages is the ability to plan and execute on a longer time scale than their competitors. They can take approaches that take longer to bear fruit but are better in the long term.
Timestream is a scalable timeseries database with separated compute and storage, built for extremely high volumes (or that's my guess, given the architecture offloads cold data to magnetic storage). Timescale, on the other hand, seems to be timeseries functionality added to Postgres, which means they are probably going to hit scaling (perf/cost) issues once they need to work at petabyte scale. And my guess would be that their coupling to Postgres is going to make it painful to build for that scale. They also claim to offer separated compute and storage, but based on the pricing model it seems that you can change your CPU+storage configuration, not that it's an ephemeral design where you only pay for compute when it's needed - very different from Timestream's ephemeral pricing model. This is a pure guess, but I would imagine that Timescale being built on top of Postgres is going to make a truly serverless SaaS (i.e. pay for only what you use) very difficult to build.
Time will tell, but this is sort of as if MySQL had benchmarked writes against Snowflake 5 years ago and concluded that everyone at Snowflake should be fired.
Yeah, it's harsh. But we're talking 6000x worse ingest performance for a database whose whole raison d'être is ingesting high volumes of data; otherwise, just use an RDBMS.
That's like making a car with square wheels. The only logical thing for management to do to a team who delivered that is disband them - because something is horribly wrong.
Now, to be completely fair, they do get good ingest performance if you open thousands of connections to send the data over. So they can probably fix it. I still wouldn't touch it with a stick, though; that's not the only issue they have at present.
Yeah, differences like that are why I ditched MySQL years ago in favor of PostgreSQL.
I was using MySQL at the time, and did a performance comparison of how long it took each database to dump and load a snapshot of a 100GB database.
MySQL took 3 days.
PostgreSQL took 30 minutes.
I posted on a MySQL forum asking why the drastic difference, and they tried to hand-wave it away by saying I had a lot of indexes. Well yeah, databases have indexes. Should I not expect my MySQL database transfers to be quick just because the database has indexes? Features shouldn't cripple the application for routine tasks.
That's surprising to me since Postgres does less efficient writes compared to MySQL in order to optimize for read queries[1] - a sensible tradeoff since most of the time reads are more common than writes in OLTP workloads. TimescaleDB essentially solves this problem for the specific case of time-series inserts.
[In Postgres] if we have a table with a dozen indexes defined on it, an update to a field that is only covered by a single index must be propagated into all 12 indexes to reflect the ctid for the new row.[1]
The parent post didn't have enough detail, but I'd assume the dump would be using COPY for inserts. AFAIK when using COPY, the indexes are only built at the end of the command, not on each insert.
Indexes are built during the inserts both for COPY and for normal inserts. In general COPY is faster because many people use single-row inserts, which need to instantiate executor nodes and other bits for every row. I think there used to be some other differences, but in general we've seen that multi-row inserts are around as fast as COPY, and I'm honestly not sure which one our benchmarking tool uses. (I'm at Timescale, btw.)

Aside from that, the MySQL vs Postgres comment above actually has to do with updates, not inserts. The main difference there is that PG stores rows in the heap, whereas MySQL has primary keys that actually organize tables (plus things around the copy-on-write MVCC model). So, in some ways, PG should actually be faster for raw inserts, because you don't have to deal with as many page splits and other bits for organizing the table; things are just written to the heap (and indexes are secondary). We limit the overhead of building indexes by making sure our chunks are "right-sized", so that you don't end up swapping too much as you're doing inserts into recent data.
(Just noticed this was about dump/restore, during restores you can in fact write the data and then build the index in a separate operation and that can be faster, sorry for missing context)
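For reference, a minimal sketch of the two ingest paths being compared, assuming a hypothetical `conditions(time, device_id, temperature)` table:

```sql
-- multi-row INSERT: one statement, one executor setup for many rows
INSERT INTO conditions (time, device_id, temperature) VALUES
    ('2020-11-01 00:00:00+00', 1, 20.1),
    ('2020-11-01 00:00:10+00', 2, 21.4),
    ('2020-11-01 00:00:20+00', 3, 19.8);

-- COPY: the bulk path used by dump/restore tooling; from psql:
\copy conditions (time, device_id, temperature) FROM 'data.csv' WITH (FORMAT csv)
```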
For the record: in reviewing the HN conversation tonight I saw this and realized it was an incorrect quote of the article. Totally an honest mistake I'm sure, but I wanted to set the record straight.
We spent a little over a week working on Timestream benchmarking, trying different approaches to batching metrics for ingest, threading differently, running multiple EC2 instances, etc. to improve performance.
Once we felt like we were getting the best we could, we started our final ingest which we let run for nearly 40 hours (~2 days).
The other value is absolutely correct, however. We were able to ingest 1 billion metrics into Timescale in 5 minutes.
It's all detailed in the post (granted... it IS a detailed, long read ;-)
TL;DR: TimescaleDB was tested with a cloud setup in Digital Ocean, with separate client and server. We also ran a second, unpublished test against Timescale Forge from the same client(s) that we tested Timestream with. We did this second TimescaleDB test just to see if something was wrong with our EC2 instances or setup in some other way. So: a completely separate service offering with no VPC or anything, just a raw PSQL connection from client to server. The Timescale Forge instance was 1/4 the specs of the DO server from the published results (again, the intent wasn't to replicate the TimescaleDB tests all over again), and it still easily achieved 1.2 million/sec ingest from the same client computer where Timestream only achieved ~525 metrics/sec.
What an embarrassment for AWS: getting smoked by an open-source project. They should just fire the management team overseeing the Timestream project and provide managed TimescaleDB instances instead.
A decent amount of AWS product offerings get smoked by FLOSS counterparts. But being able to 1-click use something, not having to figure out how to deploy it, potentially getting commercial support for it etc. is awfully tempting even if the product is only 1/3rd as good.
Unfortunately that company will double down even if there's a hint of traction, and in 3 years it'll finally be decent and TimescaleDB's cloud offering will be in trouble.
They have a lot of money and a bit of time.
It is very embarrassing that Timestream was delayed for so long only to lose so badly.
For me the killer feature of TimescaleDB is that you can run it on your own server. Not all of us can run our services in the cloud. In my case I'm working at a particle accelerator, which is considered a nuclear facility and thus not allowed to be directly connected to the Internet.
One year ago I moved our old system to TimescaleDB, and I have been really happy since then, it's a really amazing product. Kudos to the development team!
No, I work at the IFMIF/EVEDA construction site in Rokkasho, Japan. It's a much smaller project, and right now we are only two software engineers here, with me the only one taking care of this kind of things.
Thank you for the links, I don't think we have the resources to implement something similar, but they are good food for thought.
That is an interesting test I had wanted to see for some time. When Timestream came out I was a bit surprised by its price. It might work well for an enterprise POC, but for larger use the cost-benefit is not appealing.
I played with it recently. Since it's a PostgreSQL extension, it was easy to implement a Rust client [1] to test ingestion-rate performance; on my laptop I was able to insert ~600K rows/sec.
The documentation on their website [2] is also very good.
It is a postgresql extension that you install on top of a normal postgresql server, so it is not worse in any way.
Timescale works by creating a 'hypertable', which is an aggregate of a lot of smaller 'chunk' tables. These chunk tables are automatically split by date or incrementing id. This means that for queries that specify IDs or a date range within a certain range, you only have to query results within a few chunks, instead of looking through all the contents of the entire 'hypertable.' [1]
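As a minimal sketch (the table and column names here are hypothetical), creating a hypertable looks like this:

```sql
CREATE TABLE conditions (
    time        timestamptz NOT NULL,
    device_id   integer,
    temperature double precision
);

-- convert to a hypertable chunked into 1-day intervals; a query with a
-- WHERE clause on time only touches the chunks covering that range
SELECT create_hypertable('conditions', 'time',
                         chunk_time_interval => INTERVAL '1 day');
```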
Timescale also offers some other things like compression which can save you up to ~96% disk space while also improving query performance in some cases. [2][3]
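A hedged sketch of enabling compression on the hypothetical `conditions` hypertable above (the policy function follows the TimescaleDB 2.0-era API):

```sql
-- columnar-style compression, segmented by device for better ratios
ALTER TABLE conditions SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id'
);

-- background job compresses chunks once they are older than 7 days
SELECT add_compression_policy('conditions', INTERVAL '7 days');
```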
It also has something they call 'continuous aggregates' [4], which are similar to postgresql's materialized views, but do not require manual refreshing - they instead update periodically through an automatic background job. There is also a feature which builds on this called 'realtime aggregates' that allows you to combine the data within a continuous aggregate with the raw data in the tables that has yet to be materialized.
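A sketch of a continuous aggregate over the same hypothetical table (2.0-style syntax; earlier versions used a slightly different form):

```sql
CREATE MATERIALIZED VIEW conditions_hourly
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', time) AS bucket,
       device_id,
       avg(temperature) AS avg_temp
FROM conditions
GROUP BY bucket, device_id;

-- background job keeps the view up to date without manual REFRESH
SELECT add_continuous_aggregate_policy('conditions_hourly',
    start_offset      => INTERVAL '1 day',
    end_offset        => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour');
```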
There are a lot more things besides that, but I think that's a decent overview of the major features it brings to the table. From a dev perspective these things all make the data and the database easier to work with (especially targeting timeseries data). There is an api reference [5] that has some of the other commands timescale adds, if you want to see some of the other things it can help you do.
There are two main things most developers will benefit from. The first is how we manage the automatic partitioning of your incoming data (hypertables), something which is non-trivial to do yourself even though other tools exist for it. And because we do it with a time-based focus, we can be really efficient and smart about it.
Second, we've improved the query planner in PostgreSQL around the parts that relate to querying time-based, partitioned data, and provided special time-based functions. These improvements help you efficiently query data that time-series applications most often need. A quick example is something like "LAST()", which retrieves the most recent value for a given time-range. There are ways in SQL to do something similar (LATERAL JOINs or CTEs for instance), but they're usually slower and bulkier to maintain. When dealing with time-series data, getting the most recent value for an object is usually what you're doing the most often.
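A sketch of that pattern, reusing the hypothetical `conditions` table from the earlier examples:

```sql
-- latest reading per device over the past day, without a LATERAL JOIN
SELECT device_id,
       last(temperature, time) AS latest_temp
FROM conditions
WHERE time > now() - INTERVAL '1 day'
GROUP BY device_id;
```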
When you add those two foundational features, everything else that @drpebcak mentioned becomes an amazing value-add that you just can't get elsewhere.
Back in 2015, I'd architected and deployed a system for a AAA game that handled 24B events/day at launch without breaking a sweat, and supported 200ms round-trip ingestion-to-aggregation SLAs with no windowing (the protocol and ingestion layer did most of the heavy lifting: sequentially ordered _guarantees_ on events, even under load balancing and connection migration, meant no need for windowed batch ordering)... but the scenario for which it was designed was cut, and we ended up using it for just 15m slices. :eyeroll:
Still, it was used by a dozen+ games, including a few more AAA titles, and still in use today, and portions of the tech have been cannibalized into other products. I still get the occasional inquiry about memory fencing or memory boundaries on Console X for the 5-15μs event generation API (improperly aligned memory could cause interlocked increment corruption!).
Annnyways:
I had an opportunity to chat with one of the founders of Snowflake in 2017? 2018? for a few hours. I tried to convey how imperative I felt true-realtime time-series engines would be moving forward, and the reception was rather lukewarm. If they had been as excited as I was, it would have been one of the few opportunities that could have pulled me away from my dream job.
I still feel the world will need this architecture, as we start moving towards more ML/AI driven decision making, and that the company which can get traction will be in a pivotal position moving forward.
Sometimes I wonder about feeling pressured to shift into Data & Applied Science to stay at that org (there just didn't seem to be vertical opportunities on the dev track). I excel in this job too, and I love what I work on... but dang, sometimes I feel the architect career path had even bigger impact potential. It was a fun couple of decades. :P
While the comparison seems terrible for Timestream, as a customer who does not want to manage my databases I would love a similar GCP option, if the product had better tradeoffs.
It's also interesting that Timescale attributes this AWS product to their own licensing. They had some much-discussed [0] developments on that front, and if that did in fact force AWS to build their own implementation, it seems like a win for open source, but not so much for serverless users.
Slightly off topic:
I have a side project that generates ~10 daily metrics for ~350 entities. Around 150k readings currently (the number of metrics gathered per entity per day has increased since I started).
I'm pretty interested in these timeseries databases, but I think my use case is still on the side of using MySQL/Postgres out of the box. Not to mention that I can get by with archiving data > 1 year old.
Does anyone have any quick checklist or metrics to make decisions like this? When does it make sense to evolve from a vanilla RDB to a timeseries one?
I'd just add that if you have any belief that your application will grow, and it's time-series in nature, there's usually only upside to starting with a time-series specific database now. The hoops you'll jump through as the application grows will often bring you more headache than you can imagine now.
Before coming to work at Timescale a few months ago, I spent 18 years managing products at two companies, both of which were time-based applications (utility billing/energy data & IIoT). In both cases, the app started small and everything seemed fine. We could usually get around performance issues with other hacks. But in both cases there was a tipping point, because the original database (one relational, one NoSQL) just wasn't designed for the challenge as the app scaled.
So whether you manage it yourself in a smaller environment for now or try something like Timescale Cloud, I can (almost) guarantee that you'll thank yourself in a year or two. ;-)
> Does anyone have any quick checklist or metrics to make decisions like this?
I'm not a database expert in any way, but I've had to evaluate database options for both existing and new applications, and my "methodology" is as follows:
- Understand the workload
Is the database workload read-intensive, write-intensive, or both? Does the application require strong consistency? Does it require replication? What kind of replication? Can writes be batched, or are they processed ad hoc? What kind of read operations will you do, and what are the desired response times for those operations (e.g., is 2s tolerable? 500ms? 20ms?)
- Scouting
Select a set of available options to benchmark using whatever criteria you may see fit, but following the previous step constraints.
- Benchmark
Create a somewhat simple model that fits your workload and models your real case, using the language you'll be using to build your application. Create benchmark code for the most critical/complex operations. Run it at least 5 times on hardware similar to what you expect production to be (this is important; e.g., storage latency, both memory and disk, varies wildly between "my laptop" and a cloud provider) and record the results. Be sure to keep an eye on memory consumption, CPU, and disk usage when running the benchmarks.
- Decide :)
Usual criteria:
* Price;
* Features (of course);
* Ease of deployment or cloud provider availability;
* Documentation quality;
* Driver quality for your application language (it may happen you already excluded a candidate on the previous step due to this);
* Query language familiarity;
* Performance;
* Adoption (translated to: can I expect a new developer to have basic skills with this tech?);
Thank you. At that price it's probably overkill for me, but I wish you had more info on your site that wasn't gated by a signup. How is the product managed in my GCP environment?
> Does anyone have any quick checklist or metrics to make decisions like this? When does it make sense to evolve from a vanilla RDB to a timeseries one?
10 daily metrics for 350 entities results in `10 * 350 = 3.5K` readings per day. This translates to `3.5K * 365 = 1.3M` readings per year. Such a workload can be easily handled by any DBMS out there. There is no need to search for a specialized time-series database for this workload.
The need for specialized TSDB solutions arises when the ingestion rate reaches a million readings per second and the number of readings stored in the database exceeds hundreds of billions or trillions. Such a workload cannot be handled by a general-purpose database, but it is easily handled by specialized databases such as VictoriaMetrics. Fun fact: it easily handles 10 trillion (i.e. 10K billion) readings in a single-node setup - see https://victoriametrics.github.io/CaseStudies.html#wixcom
Not to mention that Timestream's insert time range depends on the memory store configuration (https://aws.amazon.com/timestream/faq/), so if you need to insert data outside that window, it's a PITA.
Serious question: why doesn't AMZ just buy TimescaleDB instead of building their own? Time-series DBs in general seem to fit their vibe pretty well: it's new hot tech, it's usually deeply embedded into infrastructure, and it requires loads of bandwidth & storage.
Timescale isn't exactly huge yet, but their potential is enormous - as is the potential in building a competing product offered as a service. Unless Amazon wants to put at least 8 zeros in their offer, I can't see it happening, and at that point Amazon might as well hedge their bets and build their own offering (which is exactly what they're doing). At worst, their offering will be only _okay_, and at the same time Timescale is doing business with Amazon by hosting DaaS instances on their infrastructure, which I'm sure Amazon takes a handsome cut from.
It doesn't matter; look at Kinesis. People are willing to use it over Kafka (even though AWS has a hosted Kafka solution) just because of the perceived convenience and interconnectivity with other AWS services. If Timescale starts to make serious money, then AWS can use this to mobilize their engineering, make it fast, and take Timescale out. Amazon's goal is literally to be the only kid on the block for everything, forever. Normally that kind of competition would be good, but they are willing to cut prices so low and take massive losses for really long periods of time until all the competition dies out. Which isn't honest competition.
Amazon Kinesis Data Analytics is based on Apache Flink. One complaint might be the branding, but keep in mind that the Apache Foundation has a strict policy of enforcing trademarks. The Apache Foundation itself doesn't choose winners, and many equivalent projects live under the same umbrella, just like Flink and Kafka.
Performance is often transient, and it isn't the only criterion used when choosing a vendor. Some enterprise customers prefer a single vendor, i.e., "one throat to choke". Others have very specific constraints: greenfield decision-making is a luxury.
My impression is that Amazon responds to customer requests to solve specific problems and they make a decent effort to continuously improve the performance and other aspects of their services. At one point people said "no one gets fired for choosing IBM"; this switched to Microsoft and, rightly or wrongly, AWS now wears this crown.
> My impression is that Amazon responds to customer requests to solve specific problems and they make a decent effort to continuously improve the performance and other aspects of their services.
I get that, but I would have thought it would make more sense to approach TimescaleDB for a licensing deal. That way, right from day one they get a mature product with incredible features and performance. And they'd win developer mindshare by supporting OSS.
> I get that, but I would have thought it would make more sense to approach TimescaleDB for a licensing deal. That way, right from day one they get a mature product with incredible features and performance. And they'd win developer mindshare by supporting OSS.
In their mind, long term, that leaves money on the table. If they entered a licensing deal, they would have exposure risk if they ever wanted to replace it with an in-house product (and, given their corporate culture, an actual legal exposure, intentional or not, is a possibility).
Dying to use this on Google Cloud SQL but they just won't support the extension. There's an issue with upvotes to add support but still no official response as to whether support is coming... extremely frustrating
Even if Google were to add support for TimescaleDB, it would only be for the Apache-2 version of the database, which lacks many of its key features (compression, continuous aggregates, multi-node scale-out, data retention policies, the job scheduling framework, various analytical functions, etc.)
See this HN discussion about Timescale's "cloud protection" licensing [0].
Of course, Timescale Cloud is available across 20+ GCP regions =)
It all looks amazing and I want it, but I'm personally quite weak at database administration. I think I'll have to wait until there's a guide for migrating from a 9.6 Cloud SQL instance to Timescale Cloud without any downtime. I did a quick Google for "Cloud SQL migrate to Timescale"; there's a guide from Google themselves, but it's not without downtime.
I read that you can't easily migrate 9.6 to another version of PG via streaming replication until 10 :(
When you jumped between these AWS instance types, did you tune/monitor to see what resources were becoming constrained? Or can you provide some details on the network/VPC setup and changes to HTTP connection pools in the AWS SDK? Since the benchmark client is communicating with Timestream over HTTP through the AWS client, I'm wondering how much you can tune to eke out performance there. I've been burned so many times across AWS services by classes of error like this, so I'm just casually curious; not that anything was done "wrong".
- Community features: All features labeled with "community" on here: https://docs.timescale.com/latest/api. Only restriction is that you can't offer them as part of a Timescale-DBaaS.
- Open-source features: Everything else, licensed under Apache 2
Thank you for posting this benchmark, great addition to the TSDB community. We look forward to benchmarking our SQL time-series database QuestDB[1] versus both using the TSBS framework soon!
I looked at Timestream at my previous job and my research left me thinking, "Who uses this thing?". Expensive but most annoying of all, the only Grafana plugin was in private paid beta. It looks like they've finally open-sourced it.
AWS should set up a special enterprise agreement with companies like MongoDB and Timescale so that they could really host and resell the product. It would be good for users and license owners, and I guess for AWS too. GCP and Azure would soon follow.
The biggest pain I've got with my Postgres instances is upgrading them, and that's without pretty much any storage-backend extensions. I imagine using any extension is only going to make it harder.
Depends on what type of service you're trying... We do face a few hiccups now and then, but overall it's okay.
Rather, I would recommend going with an architecture that is not tightly coupled to a certain vendor and is built on open-source products.
We also prefer to buy services from companies who build OSS, as in the case of ELK, rather than hosting and doing it ourselves. We are more than happy to pay a bit more for the hard work their team has done.
That is not the mantra at our company. We want serverless and managed services whenever possible, because getting resources to set up and maintain anything is next to impossible. We are now trying to use some competing products to AWS products that are full SaaS, so theoretically great, but because it isn't tacked onto our AWS bill it is very hard to get approved, and we are waiting around for weeks turning into months.
I'm not saying you should not! But let's just say you want the ELK stack: you can try Elastic Cloud from Elastic, which is hosted on AWS, Azure, or GCP as a managed service, but the Elastic company manages it. That also helps those folks support themselves.
Timestream hasn't even really been generally released as best I can tell. We wanted to use it, went to TimescaleDB, and then ended up going to Druid to improve performance.
Seriously, If you need to do this at scale, use Druid. It's much more efficient for time-series data.
We've been hard at work on lots of great features for TimescaleDB 2.0, which should be GA in the next couple of weeks and which will make multi-node available for anyone to use!
There is a branch for PostgreSQL 13 support available for beta testing if you want to build it yourself. It's important to us and we'll focus on completing that integration soon!
AWS is no longer our friend. We need to start weaning ourselves off them. They are literally just looking at a list every day and saying: what company can we try to screw over today?
That's so true. For big corps it's just a game of money: a channel for more revenue, and a way for VPs of product to shine/brag about it in the next quarterly meeting.
It's not that; it's fine for there to be big corps, and for companies to play the game and get money. It's the way that Amazon chooses to play the game. They are unsportsmanlike, and sometimes downright cheaters.
> 224x cheaper if you’re self-managing TimescaleDB on a VM
I'm having a hard time understanding the cost comparison without details of the above. Are they saying that hosting your own cluster of timescaledb nodes within EC2 still comes in cheaper than timestream? This seems impossible, depending on the instance types of course.
1. To complete this benchmark workload, it took less than an hour in two different environments (Digital Ocean self-managed & Timescale Forge fully managed) to ingest 1 billion metrics and run all 30K queries. It took us a week of work (testing, modifying code to try and make Timestream better) to get 40% of the metrics into Timestream and then query it. 1 hour vs 7 days.
2. If you look at the bill/costs, the main driver was querying. We attempted to run the same 30K queries on less than half the data (410 million metrics) in Timestream and somehow scanned 21TB of data. I have no idea why, and there's nothing we could do to change it.
As a developer, that's going to be your biggest unknown. If you're ingesting millions or billions of metrics a day and querying it with a real application, you could really get hit with crazy query costs.
With a more traditional server architecture, at least you know your day-to-day costs and can set a known capacity to achieve the performance you need (or scale in understandable ways when you need it)
This depends on what AWS service we're talking about, doesn't it? For example a Kinesis stream compared to running a (say) 5 EC2 node Kafka cluster with zookeeper etc. Isn't it within the realm of possibility the managed service comes out cheaper?
"Amazon has a history of offering services that take advantage of the R&D efforts of others: for example, Amazon Elasticsearch Service, Amazon Managed Streaming for Apache Kafka,[...]"
And at the end of the article they promote how they themselves rely on other people's hard work:
"TimescaleDB uses a dramatically different design principle: build on PostgreSQL. As noted previously, this allows TimescaleDB to inherit over 25 years of dedicated engineering effort that the entire PostgreSQL community has done to build a rock-solid database that supports millions of applications worldwide."
In the end, it sounds like they are doing exactly what Amazon is doing with open-source.
There's a big difference -- that's how the PostgreSQL community works. 2ndQuadrant (now part of EDB), EDB, and Citus (now Microsoft) all add value to open-source Postgres and contribute back to the community by bringing new features, new life, and new use cases, and of course by committing changes upstream where possible. Timescale is actually on the more open side of that balance, with its licensing and the community-version feature matrix.
Also, in this case, Timescale actually has a pretty forgiving license[0] as long as you are not an add-nothing-aaS provider (perhaps more forgiving than it should be, which I've asked about before[1]). Even before that change was made, running just the community edition as an add-nothing-aaS provider would have been an improvement on the status quo, given how soundly it thrashed some other solutions in the past (ex. Influx[2]) and what you can do with it (promscale[3]).
I know it can't be all roses, nothing is, but I don't think they've put too many feet wrong so far.
[EDIT] - I should note that on the scale of "contributing" to Postgres, the scale heavily tips in favor of 2ndQuadrant, EDB, and Citus as obviously they have the most committers and core team members. All those companies are to be commended of course, they're making postgres work as businesses and keeping it free while also improving it.
This GitHub repository is just the database; what about the code that holds together their cloud resources? I was not able to find it.
They criticize AWS for making money on Elasticsearch, for example: AWS is "taking advantage of the R&D efforts of others". So Amazon is making money on a "serverless" / cloud experience. At the same time, it is known that Amazon contributes back to Elasticsearch [1]. In that regard, I find their business model really similar to the one AWS relies upon.
There's no way that Elastic would accept the Open Distro features as a contribution to Elasticsearch because they include inferior versions of features already available in the commercially-licensed version of Elasticsearch and, in the case of the Search Guard code in Open Distro, code that Elastic alleges was lifted from existing commercially-licensed Elasticsearch features. AWS knows all this but offers it anyway as a PR stunt so they can say that they at least attempted to make contributions to Elasticsearch.
Huh? If I want to run someone else's software on my cloud, how I want to do that depends on my cloud. I want docs and binaries, not a full cloud configuration.
Some people would want k8s, some docker swarm or whatever, some an aws config, others ansible, etc etc etc.
If I want a managed service, I _want_ to pay for that. The price includes people responding to pages and fixing problems, as well as fiddling with configs.
And I'd much rather be paying that money to a small (relatively) open-source company than a behemoth like AWS.
> If I want a managed service, I _want_ to pay for that. The price includes people responding to pages and fixing problems, as well as fiddling with configs.
Is this not what S3 is? You are free to use Elasticsearch and handle everything yourself, but if you want a managed service, you can use AWS. They are "attacking" the S3 offering, yet I still struggle to see any difference with them hosting Postgres.
> And I'd much rather be paying that money to a small (relatively) open-source company than a behemoth like AWS.
I am 100% with you on this; I do not want to defend AWS, nor do I want to promote their products. The vendor lock-in situation you are in when using AWS is pretty bad and quite scary... And yes, I agree that the way they monetize open-source software is questionable.
I think we (like many) stand on the shoulders of giants when it comes to software, but I’m not sure the comparison is quite apt.
Amazon primarily runs and monetizes closed-source, SaaS-only managed services.
TimescaleDB instead is implemented in the open as an extension to PostgreSQL, and enriches and benefits the broader PostgreSQL community by unlocking a new use case (time series). The Postgres extension framework exists very much for this purpose, for projects like TimescaleDB to contribute back without needing to “pollute” mainline with domain-specific features. Most of TimescaleDB’s code (and all development for the first few years) is Apache 2, and all features are free for anybody to self-manage.
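For anyone unfamiliar with the extension mechanism, here's a minimal sketch of what that looks like in practice (table, columns, and connection string are made up):

    import psycopg2

    # Assumes a PostgreSQL server with the timescaledb extension installed.
    conn = psycopg2.connect("postgresql://postgres:secret@localhost/tsdb")
    conn.autocommit = True
    cur = conn.cursor()

    # The extension loads into stock PostgreSQL...
    cur.execute("CREATE EXTENSION IF NOT EXISTS timescaledb")

    # ...and turns an ordinary table into a time-partitioned "hypertable"
    # that you keep querying with plain SQL.
    cur.execute("""
        CREATE TABLE conditions (
            time        TIMESTAMPTZ NOT NULL,
            device_id   TEXT,
            temperature DOUBLE PRECISION
        )
    """)
    cur.execute("SELECT create_hypertable('conditions', 'time')")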
Timescale is a freely available PostgreSQL extension built in the open, and anyone can contribute. That's one of the great things about PostgreSQL - it was built to be extensible and allow customization. YAY!
Again, you're free to use it for any project, all Community features, wherever you want. The only thing you can't do is run a DBaaS for TimescaleDB. Seems like a fair tradeoff, right?
This reads like a Postgres vs DynamoDB comparison, because that’s likely what it is.
The article makes a lot of points really well, and if I had 8 highly qualified people on my team with nothing else to do, working for free (or on salaries that were a rounding error in my actual and opportunity-cost budget), I'd certainly get them to learn and use Timescale. But I don't right now, so I'm going to use the option that gives me a time-series DB with three or four clicks, gets the job done, and charges me for deployment and maintenance amortised across thousands of other customers.
Having these arguments purely on cost of service terms is a very slippery slope. Yes, it’s less dollars paid in hardware if I put my life on hold and learn how to configure this. Or pay someone else to do it. Yes, it’s cheaper if I order parts on newegg, build a server, drive to a colo and install it in the rack, and then drive there again each time something breaks. And yes, it’s even cheaper if I run it off a Raspberry Pi duct taped under my desk. No one is disputing these things.
AWS and other cloud providers sell peace of mind, acquisition and procurement speed, professional maintenance and remediation, deep integration, faster development and testing, and infrastructure management - with servers thrown in for free. Timescale sells time series database software. This is not a valid comparison.
1. Timescale sells a fully managed service offering running on AWS, GCP, or Azure. So you can get started with 3-4 clicks and have a database running in 30 seconds. (We actually don't sell "software" at all - that's all free to download and run yourself.) We're fully aligned that cloud services often have lower TCO in the long run =)
2. Beyond operations, your team probably already knows how to use much of TimescaleDB if they know SQL and PostgreSQL. AWS Timestream, on the other hand, introduces a bunch more strange gotchas, as evidenced in the blog post (witness the weird SQL hoops that you need to jump through).
The managed cloud offering is great, so if there's a comparison happening, it'll be much more valid if you make it like-for-like. I'd be more interested to see what the numbers (performance and cost) would be if you ran the comparison with managed Timescale Cloud, on AWS, with EBS set up (noting the IOPS settings) at a comparable provisioned size with multi-AZ failover. Doing that and checking costs would be much fairer from a cost-per-metric point of view.
We report numbers in the blog post with our managed offering Timescale Forge, running on AWS with storage replicated across AZs and supporting automated failover.
Not sure about "comparably provisioned re: IOPS" given that you don't know about that at all with AWS Timestream, but our blog post reports the performance/cost of a suitably provisioned Timescale Forge instance (8vCPU / 1TB storage).
The 100GB ingest-then-query benchmark took far less than 1 hour with Timescale Forge @ $2.18/hour.
The "consumption based pricing" for AWS Timestream on the same benchmark came to $336.40 (roughly 154x more, even if you charge Forge for the full hour).
Pre-reading hypothesis: TimescaleDB is declared orders of magnitude faster because the benchmark is serving results they're computing at write time? Is it just like the ClickHouse benchmark from earlier, where they read from a `CREATE TABLE [...] ENGINE = AggregatingMergeTree`?
Post-reading:
"faster queries via continuous aggregates". So is this it? I couldn't find how tables / materialized views were created in the source though [1].
TimescaleDB is probably a very good product (and pg-compatible!), but producing such articles hiding the usage of a magic feature is sort of dishonest. Why not make an article directly on the power of the feature? It's hurting their brand reputation a bit.
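For context, a continuous aggregate (the "magic feature" in question) is declared roughly like this; a sketch with made-up table and column names:

    import psycopg2

    conn = psycopg2.connect("postgresql://postgres:secret@localhost/tsdb")
    conn.autocommit = True  # continuous aggregates can't be created inside a transaction
    cur = conn.cursor()

    # Incrementally precomputes hourly rollups as new rows arrive, so
    # queries against cpu_hourly never have to touch the raw data.
    cur.execute("""
        CREATE MATERIALIZED VIEW cpu_hourly
        WITH (timescaledb.continuous) AS
        SELECT time_bucket('1 hour', time) AS bucket,
               hostname,
               avg(usage_user) AS avg_usage
        FROM cpu
        GROUP BY bucket, hostname
    """)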
I'm sorry you feel like we were trying to be dishonest in the post. On the contrary, we put a lot of effort (and 7,000+ words) into trying to explain everything that we did - just as we've done with other benchmarks which others have linked to.
The TimescaleDB runs did not use continuous aggregates for these tests, only raw time-series data stored in hypertables.
For each database, we (and other contributors) do our best to use the features that take advantage of that DB. For TimescaleDB, a function like LAST() happens to be really powerful for most workloads and is really, really fast. That's not cheating, it's using the software properly! :-)
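For the curious, here's a sketch of the kind of query where LAST() shines (connection details and schema are made up, loosely mirroring the TSBS cpu table):

    import psycopg2

    conn = psycopg2.connect("postgresql://postgres:secret@localhost/tsdb")
    cur = conn.cursor()

    # last(value, time) returns the value at the most recent timestamp in
    # each group, with no self-join or window function needed.
    cur.execute("""
        SELECT time_bucket('5 minutes', time) AS bucket,
               hostname,
               last(usage_user, time) AS latest_usage
        FROM cpu
        WHERE time > now() - INTERVAL '1 hour'
        GROUP BY bucket, hostname
    """)
    for bucket, hostname, latest in cur.fetchall():
        print(bucket, hostname, latest)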
I've been reading about your columnar compression pipeline [1], and it sort of makes sense if the comparison is against a regular row-oriented DB. AWS Timestream must really be doing something wrong here, or serving an entirely different use case.
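(For reference, that compression is opt-in per hypertable; enabling it looks roughly like this, using the 2.0-style policy API, with the seven-day window being a made-up choice:)

    import psycopg2

    conn = psycopg2.connect("postgresql://postgres:secret@localhost/tsdb")
    conn.autocommit = True
    cur = conn.cursor()

    # Chunks older than the policy window get rewritten into a columnar
    # layout, segmented by hostname for better compression ratios and scans.
    cur.execute("""
        ALTER TABLE cpu SET (
            timescaledb.compress,
            timescaledb.compress_segmentby = 'hostname'
        )
    """)
    cur.execute("SELECT add_compression_policy('cpu', INTERVAL '7 days')")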
The 5-175x faster queries and 150x-220x lower cost I do get. But 6000x higher insert rates do not make sense to me. It is insane, and literally unbelievable to me.
Storage savings are at 96% for "IT metrics (DevOps dataset from TSBS)", i.e. roughly a 25x size reduction, so I'd expect something closer to a 25x higher insert rate. Where is the missing 240x (25 x 240 = 6000)? Is it distributed-replication overhead? Local vs. remote insertion? Bulk inserts vs. per-row?
Anyway, I wanted to thank you for your kind efforts in writing the blog post and providing answers here, and for the patience you show to the audience here, me included.
"This is made possible by the way Timestream is managing data: recent data is kept in memory and historical data is moved to cost-optimized storage based on a retention policy you define. All data is always automatically replicated across multiple availability zones (AZ) in the same AWS region. New data is written to the memory store, where data is replicated across three AZs before returning success of the operation. Data replication is quorum based such that the loss of nodes, or an entire AZ, does not disrupt durability or availability. In addition, data in the memory store is continuously backed up to Amazon Simple Storage Service (S3) as an extra precaution."
Feels like an apples-to-oranges comparison, as the consistency models are really different. Then again, I would definitely optimize for performance in most time-series use cases. Different products with different features baked in.
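(For reference, that memory-store/magnetic-store split is configured per table at creation time; a minimal boto3 sketch with made-up names and retention windows:)

    import boto3

    # Assumes AWS credentials are configured; names and windows are hypothetical.
    ts = boto3.client("timestream-write")

    ts.create_table(
        DatabaseName="benchmark",
        TableName="cpu",
        RetentionProperties={
            # Recent data stays in the memory store for fast queries...
            "MemoryStoreRetentionPeriodInHours": 24,
            # ...then ages out to cheaper magnetic storage.
            "MagneticStoreRetentionPeriodInDays": 30,
        },
    )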
We're happy for people to poke at this and help us improve. It's obviously hard to work at something for weeks, see the numbers (even knowing you really tried for days to move the needle), and then still publish numbers that seem impossible. And again, if you look at TSBS, this isn't the first time we've run benchmarks on other databases, so we were just as shocked, and we put extra effort into it.
In the end, if you read the article (and not just the headlines - not saying you are, but it's easy to see 6000x and latch onto it), the comparison is absolutely focused on this one, pretty straightforward use case (although we normally run 5 different scenarios):
From one client, given a specific kind of workload (100 hosts, 10 CPU metrics every 10 seconds for 30 days = ~1 billion metrics): how fast could we save the data? Most other time-series databases perform at least marginally well with that same setup... loading data with one client.
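Concretely, "one client" here means a single connection streaming batches, roughly like this sketch (file and table names are made up; as far as I know this mirrors the COPY path the TSBS loader uses for TimescaleDB):

    import psycopg2

    conn = psycopg2.connect("postgresql://postgres:secret@localhost/tsdb")
    with conn, conn.cursor() as cur, open("cpu.csv") as f:
        # One connection, one COPY stream: batched rows, no per-row
        # round trips, which most databases handle reasonably well.
        cur.copy_expert("COPY cpu FROM STDIN WITH (FORMAT csv)", f)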
But Timestream just doesn't seem set up to work that way. Some of the responses today imply that we would need really large clients with thousands of threads to get those speeds. And that might work if we kept going and spent more time and significantly more money. We just haven't ever had to do that before.
If your use case better aligns with what Timestream offers, then it might be a great product for you. Given some of the many other concerns we discovered along the way, it doesn't yet seem like the time to jump in.
This comment reads as pretty dishonest once you've actually read the article. The most striking part of the benchmarks was the absurdly abysmal insert rate for Timestream. Honestly, after seeing that, I think the team at AWS that built it should be reassigned and contractors or an A-team brought in to try to salvage it. Because they've built themselves a very expensive lemon and they're trying to sell it to the world as a Cadillac.