Launch HN: QuestDB (YC S20) – Fast open source time series database
357 points by bluestreak on July 28, 2020 | hide | past | favorite | 167 comments
Hey everyone, I’m Vlad and I co-founded QuestDB (https://questdb.io) with Nic and Tanc. QuestDB is an open source database for time series, events, and analytical workloads with a primary focus on performance (https://github.com/questdb/questdb).

It started in 2012 when an energy trading company hired me to rebuild their real-time vessel tracking system. Management wanted me to use a well-known XML database that they had just bought a license for. That option would have required taking production down for about a week just to ingest the data, and a week of downtime was not an option. With no more money to spend on software, I turned to alternatives such as OpenTSDB, but they were not a fit for our data model. There was no solution in sight to deliver the project.

Then, I stumbled upon Peter Lawrey’s Java Chronicle library [1]. It loaded the same data in 2 minutes instead of a week using memory-mapped files. Besides the performance aspect, I found it fascinating that such a simple method was solving multiple issues simultaneously: fast write, read can happen even before data is committed to disk, code interacts with memory rather than IO functions, no buffers to copy. Incidentally, this was my first exposure to zero-GC Java.
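The memory-mapped approach described above can be sketched in a few lines of plain Java. This is a toy illustration of the idea (not Java Chronicle's actual API): writes land in mapped memory with no write() syscalls or buffer copies, and are readable immediately, before anything is flushed to disk.

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MmapDemo {
    public static void main(String[] args) throws Exception {
        // Map a file directly into the address space: writes go to memory,
        // and the OS flushes pages to disk in the background.
        try (RandomAccessFile raf = new RandomAccessFile("ticks.bin", "rw");
             FileChannel ch = raf.getChannel()) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 1 << 20);
            for (int i = 0; i < 1000; i++) {
                buf.putLong(i);          // no write() syscall, no user-space buffer copy
            }
            buf.flip();
            long first = buf.getLong(); // readable immediately, even before any fsync
            System.out.println(first);
        }
    }
}
```

Note this still goes through NIO's MappedByteBuffer wrapper, which is exactly the per-page object overhead the next paragraph complains about.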

But there were several issues. First, at the time it didn’t look like the library was going to be maintained. Second, it used Java NIO instead of the OS API directly. This adds overhead, since it creates an individual object whose sole purpose is to hold a memory address for each memory page. Third, although the NIO allocation API was well documented, the release API was not: it was really easy to run out of memory and hard to manage memory page release. I decided to ditch the XML DB and started to write a custom storage engine in Java, similar to what Java Chronicle did. This engine used memory-mapped files, off-heap memory and a custom query system for geospatial time series. Implementing it was a refreshing experience. I learned more in a few weeks than in years on the job.

Throughout my career, I mostly worked at large companies where developers are “managed” via itemized tasks sent as tickets. There was no room for creativity or initiative. In fact, it was in one’s best interest to follow the ticket's exact instructions, even if it was complete nonsense. I had just been promoted to a managerial role and regretted it after a week. After so much time hoping for a promotion, I immediately wanted to go back to the technical side. I became obsessed with learning new stuff again, particularly in the high performance space.

With some money aside, I left my job and started to work on QuestDB solo. I used Java and a small C layer to interact directly with the OS API without passing through a selector API. Although existing OS API wrappers would have been easier to get started with, the overhead increases complexity and hurts performance. I also wanted the system to be completely GC-free. To do this, I had to build off-heap memory management myself and I could not use off-the-shelf libraries. I had to rewrite many of the standard ones over the years to avoid producing any garbage.
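The kind of GC-free, off-heap memory management mentioned above can be sketched with sun.misc.Unsafe. This is a hedged illustration of the general technique, not QuestDB's actual allocator: memory is requested straight from the OS, never seen by the garbage collector, and must be released by hand.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class OffHeapDemo {
    public static void main(String[] args) throws Exception {
        // Obtain the Unsafe instance via reflection (not exposed publicly).
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        long addr = unsafe.allocateMemory(8 * 1024); // 8 KiB, invisible to the GC
        unsafe.putLong(addr, 42L);                   // raw pointer arithmetic, no bounds checks
        System.out.println(unsafe.getLong(addr));
        unsafe.freeMemory(addr);                     // manual release - forget this and you leak
    }
}
```

The trade-off is clear in the last line: you win zero GC pauses, but every allocation becomes your problem to free, which is why the standard garbage-producing libraries had to be rewritten.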

As I had my first kid, I had to take contracting gigs to make ends meet over the following 6 years. All the stuff I had been learning boosted my confidence and I started performing well at interviews. This allowed me to get better-paying contracts, so I could take fewer jobs and free up more time to work on QuestDB while looking after my family. I would do research during the day and implement it into QuestDB at night. I was constantly looking for the next thing that would take performance closer to the limits of the hardware.

A year in, I realised that my initial design was actually flawed and had to be thrown away. It had no concept of separation between readers and writers and would thus allow dirty reads. Storage was not guaranteed to be contiguous, and pages could be of various non-64-bit-divisible sizes. It was also very cache-unfriendly, forcing the use of slow row-based reads instead of fast columnar and vectorized ones. Commits were slow, and as individual column files could be committed independently, they left the data open to corruption.

Although this was a setback, I got back to work. I wrote the new engine to allow atomic and durable multi-column commits, provide repeatable read isolation, and for commits to be instantaneous. To do this, I separated transaction files from the data files. This made it possible to commit multiple columns simultaneously as a simple update of the last committed row id. I also made storage dense by removing overlapping memory pages and writing data byte by byte over page edges.
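The commit-as-row-id idea can be sketched like this. All class and field names below are made up for illustration (this is not QuestDB's code): writers append freely, a commit just publishes a new row count, and readers never look past the last published count.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of commit-by-row-id: writers append to column storage
// freely; a commit publishes a new "last committed row" count atomically,
// and readers never read past it - giving repeatable reads for free.
public class TxnDemo {
    static class Table {
        final List<Long> priceColumn = new ArrayList<>(); // stands in for a column file
        volatile long committedRows = 0;                   // stands in for the txn file

        void append(long price) { priceColumn.add(price); }             // uncommitted write
        void commit()           { committedRows = priceColumn.size(); } // single atomic publish
        long readableRows()     { return committedRows; }               // readers stop here
    }

    public static void main(String[] args) {
        Table t = new Table();
        t.append(100); t.append(101);
        System.out.println(t.readableRows()); // appended but not committed
        t.commit();
        t.append(102);                        // new uncommitted row
        System.out.println(t.readableRows()); // readers still isolated from row 3
    }
}
```

Because the commit is one counter update, it covers every column at once - which is how multi-column commits become atomic and instantaneous.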

This new approach improved query performance. It made it easy to split data across worker threads and to optimise the CPU pipeline with prefetch. It unlocked column-based execution and additional virtual parallelism with SIMD instruction sets [2] thanks to Agner Fog’s Vector Class Library [3]. It made it possible to implement more recent innovations like our own version of Google SwissTable [4]. I published more details when we released a demo server a few weeks ago on ShowHN [5]. This demo is still available to try online with a pre-loaded dataset of 1.6 billion rows [6]. Although it was hard and discouraging at first, this rewrite turned out to be the second best thing that happened to QuestDB.

The best thing was that people started to contribute to the project. I am really humbled that Tanc and Nic left our previous employer to build QuestDB. A few months later, former colleagues of mine left their stable low-latency jobs at banks to join us. I take this as a huge responsibility and I don’t want to let these guys down. The amount of work ahead gives me headaches and goosebumps at the same time.

QuestDB is deployed in production, including at a large fintech company. We’ve been focusing on building a community to get our first users and gather as much feedback as possible.

Thank you for reading this story - I hope it was interesting. I would love to read your feedback on QuestDB and to answer questions.

[1] https://github.com/peter-lawrey/Java-Chronicle

[2] https://news.ycombinator.com/item?id=22803504

[3] https://www.agner.org/optimize/vectorclass.pdf

[4] https://github.com/questdb/questdb/blob/master/core/src/main...

[5] https://news.ycombinator.com/item?id=23616878

[6] http://try.questdb.io:9000/

mmap'd databases are really quick to implement. I implemented both row- and column-oriented databases. The traders and quants loved it - and adoption took off after we built a web interface that let you see a whole day and also zoom into exact trades, with 100ms load times for even the most heavily traded symbols.

The benefits of mmapping, and of POSIX filesystem atomicity guarantees in general, are quick implementation: you don't have to worry about buffer management. The filesystem and the disk's block-remapping layer (in SSDs, and now even HDDs) are radically more efficient when data is given to them in contiguous large chunks. This is difficult to control with mmap, where the OS may write out pages at its whim. Even with advanced Linux system calls like mremap and fallocate, which try to improve the complexity of changing mappings and layout in the filesystem, this lack of control over buffers will eventually bite you.

And then when you look at it, the kernel (with help from the processor TLB) has to maintain complex data-structures to represent the mappings and their dirty/clean states. Accessing memory is not O(1) even when it is in RAM. Making something better tuned to a database than the kernel page management is a significant hurdle but that's where there are opportunities.

Thank you for sharing! The core of memory management is abstracted away: all of the query execution logic is unaware of the source of the memory pointer. That said, we are still learning and really appreciate your feedback. There are some places where we could not beat Julia's aggregation, but the delta wasn't very big. This could have been down to mapped memory. We will definitely try things with direct memory too!

The databases I implemented experimented with various ways to compile queries. Turns out that the JVM can run quite fast. I feel like LLVM (Julia) is likely to be able to be better for throughput and definitely better for predictability of performance.

If you know layouts and sizes, then your generated code can run really fast - using SIMD and not checking bounds is a win.

Hugepages would greatly reduce pagetable bookkeeping, but obviously may magnify writes. Wish I could have tried that!

Our best performance currently is in C++ code, and LLVM is something we are considering for computing expressions, such as predicates and select-clause items. This is most likely to be way faster than what we can currently do in Java. What I would like to know is whether LLVM can optimize all the way to AVX-512?

We also need to experiment with hugepages. The beauty is that if read and write are separated - there is no issue with writes. They can still use 4k pages!

Please consider going through Andy Pavlo's superb video courses on building databases.


He discusses many common pitfalls and architectural decisions that impact database quality and performance, including mmap and why it's bad.

Absolutely love the story. TimescaleDB & InfluxDB have had a lot of posts on HN, so I'm sure others are wondering - how do we compare QuestDB to them? It sounds like performance is a big one, but I'm curious to hear your take on it.

As you said, performance is the main differentiator. We are orders of magnitude faster than TimescaleDB and InfluxDB on both data ingestion and querying. TimescaleDB relies on Postgres and has great SQL support. This is not the case for InfluxDB, and this is where QuestDB shines: we do not plan to move away from SQL; we are dedicated to bringing good support and some enhancements to make the query language as flexible and efficient as possible for our users.

Hi, TimescaleDB cofounder here. Nice to read about your journey in time-series data, and always welcome another database that can satisfy a specific type of developer needs.

I also commend you on your desire to rebuild everything Postgres offers from scratch. We took a different route by building on top of Postgres (which e.g. allowed us to launch with native replication, rock-solid reliability, window functions, geo spatial data, etc without sacrificing performance). But there are many ways up this mountain!

As a quick thing, however: While it’s not very representative of the workloads we typically see, I tried your simple 1B scan on a minimally-configured hypertable in TimescaleDB/PostgreSQL, and got results that were >12x faster on my 8-core laptop than what you were reporting on a 48-core AWS m5.metal instance.

I think the Hacker News community always appreciates transparency in benchmarking; looking forward to reading a follow up post where you share reproducible benchmarks where all databases are tuned equivalently.

I'm sure many folks would be really interested to see two things:

1. A blog post around a reproducible benchmark between QuestDB, TimescaleDB, and InfluxDB

2. A page, like questdb.io/quest-vs-timescale, that details the differences in side-by-side feature comparisons, kind of like this page: https://www.scylladb.com/lp/scylla-vs-cassandra/. Understandably, in the early days, this page will update frequently, but that level of transparency is really helpful to build trust with your users. Additionally, it'll help your less technical users to understand the differences, and it will be a sharable link for people to convince others & management that QuestDB is a good investment.

Perhaps the QuestDB team could add it to the Time Series Benchmarking Suite [1]? It currently supports benchmarking 9 databases including TimescaleDB and InfluxDB.

[1] https://github.com/timescale/tsbs

This is a great idea, we will have a look! It is good to see that the ecosystem is moving towards a normalized / "standard" benchmarking tool.

Love it! And agree. Hopefully this has the same broad effect that the YCSB benchmark and Jepsen testing had in levelling the playing field.

Wow! Nice. I am surprised neither Scylla nor KairosDB are on that list. I think you could run Scylla by itself (to compare with raw Cassandra) and also re-run with KairosDB running on top of Scylla and Cassandra to see what effects that has on performance. (Though of course, there are advantages to having KairosDB, too.)

Did you also compare to VictoriaMetrics?

Not yet - there is a benchmark vs ClickHouse that was done by one of their contributors, though; see below in the comments.

Do you do realtime streaming using SQL as well?

Over-the-network streaming is not yet available. Someone has mentioned Kafka support - how useful would it be to stream processed (aggregated) values and/or actual table changes?

Are there any performance comparisons to TimescaleDB and Influx that you can share? A blog post perhaps?

Hi there - co-founder of QuestDB here. The demo on our website hosts a 1.6-billion-row NYC taxi dataset, 10 years of weather data at around 30-minute resolution, and weekly gas prices over the last decade.

We've got examples of queries in the demo, and you can see the execution times there.

We posted a blog post some time ago comparing the ingestion speed of InfluxDB and QuestDB via the InfluxDB Line Protocol: https://questdb.io/blog/2019/12/19/lineprot

> We are orders of magnitude faster than TimescaleDB and InfluxDB

I think gp might be asking for a source for this claim.

I see execution times on the demo, but I'm not sure that's enough to say it's faster than Timescale.

(Hard to draw many meaningful conclusions from a single, extremely simple query without much explanation?)

The graph shows PostgreSQL taking a long time, but doesn't say anything about configuration or parallelization. PostgreSQL has been able to parallelize that type of query since 9.6, but I think they didn't use parallelization in these experiments with PostgreSQL, even though they used a bunch of parallel threads with QuestDB?

So would be good to know:

- What version of Postgres

- How many parallel workers for this query

- If employing JIT'ing the query

- If pre-warming the cache in PostgreSQL and configuring it to store fully in memory (since the benchmarks with QuestDB appeared to do two passes: first mmap into memory, then account only for the second pass over in-memory data).


Database benchmarking is pretty complex (and easy to bias), and most queries do not look like this toy one.

I agree that our blog post lacks details; here are some:

- PostgreSQL 12

- 12

- No

- We ran the test using the pg_prewarm [0] module, the difference was negligible

Regarding the "toy" query, the reason we are showcasing this instead of other more complex queries is because this is a simple, easily reproducible benchmark. It provides a point of reference for performance figures.

> Database benchmarking is pretty complex (and easy to bias), and most queries do not look like this toy one.

I would say that benchmarking is very hard. We tried not to perform a biased benchmark by running something that is not time-series-specific and does not put us at an advantage compared to what Postgres should do.

The takeaway from this is that configuration is important and we should expose it. The next benchmark we do will have an associated repository so people can review our config and point out any non-optimal items.

[0]: https://www.postgresql.org/docs/9.4/pgprewarm.html

I'd also be interested in hearing: when is QuestDB not a good choice? Are there use cases where TimescaleDB, InfluxDB, ClickHouse or something else is better suited?

Hard question to answer, because each solution is unique and has its own tradeoffs. Taking a step back, QuestDB is a less mature product than the ones mentioned, and therefore there are many features, integrations, etc. still to build on our side. This is a reflection of how long we have been around and the capital we have raised versus those companies, which are much larger in size.

If you are already using Postgres, then TimescaleDB is a natural fit - not having to deploy and manage a separate service is a real boon. You can also join with non-TimescaleDB tables, so if you need to combine time series data with regular relational data, that's another advantage.

OLTP is not a good fit - i.e., if your workload consists of INSERT/UPDATE/DELETE statements.

I see this as a very interesting project. I use ClickHouse as an OLAP database and I'm very happy with it. I can tell you the features that make me stick to it. If some day QuestDB offers them, I might explore the possibility of switching, but never before:

- very fast (I guess we're aligned here)

- real-time materialized views for aggregation functions (this is absolutely a killer feature; it makes being fast quite pointless if you don't have it)

- data warehouse features: I can join different data sources in one query. This allows me to join, for instance, my MySQL/MariaDB domain DB with it and produce very complete reports.

- Grafana plugin

- very easy to shard/scale at table level

- huge set of functions, from geo to URL, from ML to string manipulation

- dictionaries: I can load the MaxMind geo DB and do real-time localisation in queries

I might add some more once they come to my mind. Having said this, good job!!!

Thank you for the kind words and constructive feedback. We are here to build on feedback like this. Grafana plugin is coming soon.

Glad to be useful. On the other side, I can tell you that ClickHouse also misses a feature everyone in the community of users wishes for, which is automatic resharding when you add a new node (sort of what Elasticsearch does).

And before I forget, ClickHouse Kafka Engine is simply brilliant. The possibility of just publishing to Kafka and have your data not only inserted in your DB but also pre-processed is very powerful.

Let me know if I can help you with use cases we have.


This is incredibly useful, thank you! It would be awesome if we could chat more about your use cases at some point. Drop us a line on hello at questdb.io or join our slack. Whichever is easier for you.

Thanks for the helpful feedback! Feel free to reach out to chat more. I'm super interested in more feedback from you. davidgs(at)questdb(dot)io

I certainly will. Cheers

If you don’t mind sharing, what are the specs of your ClickHouse cluster including Zookeepers? Also, how large of a dataset are you working with?

Am I the only one that's like "wtf is a time-series database compared to a normal one?"

Here's a Medium post on Time Series Databases: https://medium.com/datadriveninvestor/what-are-time-series-d... But basically, they are databases where the time dimension is a critical aspect of the data. They handle never-ending streams of incoming time-stamped data. Think of streaming stock prices, or streaming temperature data from a sensor. You want to be able to query your data specifically in a time dimension -- let's see the temperature fluctuations over the past 24 hours, including the mean. Stuff like that.

A 'normal one' is typically used for things like transactional data which adds, deletes, and updates data among linked tables. While these transactions happen in time, the time component isn't necessarily a critical dimension of the data.

One differentiating feature is the "as of" join. You have records of the form (time, value), and you ask "what's the most recent value as of $time?" On a non-TS-oriented DBMS, this query is usually slow and hard to write. Window extensions to SQL can make it a little better, but you can assume that a proper TSDB answers this query x10 to x10,000 times faster on the same hardware, especially when done in bulk. E.g.: I have one million (time, bid_price) records and one million (time, transaction_price) records; for each transaction record, I want to know what the most recent bid price was at that time.

That's something kdb+ and ClickHouse do in milliseconds, and I assume QuestDB can too, though I didn't check.
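The reason time-sorted storage makes this cheap: an as-of join over time-sorted columns is a single linear merge pass, with no per-row search. A toy sketch of the technique (not any particular DB's implementation; it assumes at least one bid precedes the first trade):

```java
// For each trade, find the latest bid at or before the trade's timestamp,
// by walking both time-sorted arrays once with a single cursor.
public class AsofJoinDemo {
    public static void main(String[] args) {
        long[] bidTimes   = {1, 3, 5, 9};
        double[] bids     = {10.0, 10.2, 10.1, 10.5};
        long[] tradeTimes = {2, 5, 8};

        int b = 0; // bid cursor - only ever moves forward
        for (long tradeTime : tradeTimes) {
            // advance while the NEXT bid is still not after this trade
            while (b + 1 < bidTimes.length && bidTimes[b + 1] <= tradeTime) b++;
            System.out.println(bids[b]); // bid price "as of" the trade time
        }
    }
}
```

On a DBMS without time-ordered storage, each of the million lookups degenerates into a separate index probe or correlated subquery, which is where the x10 to x10,000 gap comes from.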

> On a non-TS oriented DBMS, this query is usually slow and hard to write.

I don't see why this is hard to write:

SELECT max(time) as most_recent_time, transaction_price INTO newtable FROM oldtable WHERE time > $my_datetime GROUP BY time;

As for being slow - can you provide some references about this being slow in non-time-series columnar DBMSes? MonetDB, Vectorwise, Vertica?

"most recent" is min(time) where time>=$my_datetime ;

I don't understand your query - max(time) .. GROUP BY time means that every record is its own group, so max(time)==time; also, you will have a result for every single point in time after the requested time.

There is no way in standard SQL (without window extensions) to do this with just one clause; you need the price associated with the min(time) associated with time>=$my_datetime ; You need either a subquery, a join, or some other correlated query. e.g.

SELECT time, price FROM trades WHERE time IN (SELECT min(time) FROM trades WHERE time>=$my_datetime)

And that's for one record; It gets more complicated when you try to do a batch.

> As for being slow - can you provide some references about this being slow in non-time-series columnar DBMSes? MonetDB, Vectorwise, Vertica?

I've timed kdb2 against Vertica a decade or so ago; kdb2 was about x10 faster. I didn't do a proper benchmark, but MonetDB was in the x10-x100 range slower than kdb2 at the time.

Ah yes, I should have grouped by the non-time key fields.

I still don't understand how that would give the right result.

It organises storage so that operations like "drop all entries from before X", "get entries between X and Y with tags A or B" are cheap and so that storing ~linearly increasing values is super efficient with stupid ratios like 1:800+ for DoubleDelta.

In general, imagine a problem space where you have millions (or much, much more) of time series, each potentially millions of points long (usually purely floats), and you want to perform time-series-specific operations: interpolate, extrapolate, moving avg, forecast, alert if unusual, plot, etc. Like, imagine AWS has 100s-1000s (or more) of metrics per "thing", and there's a very large number of things (EC2 instances, subnets, SageMaker instances, network switches, etc). Very specific data model, usually append-only, very specific read operations.
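The 1:800+ DoubleDelta ratio mentioned above comes from storing deltas of deltas, which for near-linearly increasing timestamps are almost always zero and pack into a handful of bits. An illustrative sketch of the encoding step (not a real codec - no bit packing here):

```java
// Why regular timestamps compress so well: the delta-of-deltas of an
// (almost) evenly spaced series is a stream of (almost) zeros.
public class DoubleDeltaDemo {
    public static void main(String[] args) {
        long[] ts = {1000, 2000, 3000, 4000, 5001, 6001}; // ~1s scrape interval, slight jitter
        long prev = ts[0], prevDelta = 0;
        for (int i = 1; i < ts.length; i++) {
            long delta = ts[i] - prev;
            System.out.println(delta - prevDelta); // the value a codec would actually store
            prevDelta = delta;
            prev = ts[i];
        }
    }
}
```

Only the first delta is "large"; everything after is 0 or +/-1, which is why a real codec can spend about a bit per timestamp instead of 64.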

Broadly, time-series DBs are optimized around the following:

- ingesting large amounts of data

- aggregating data around a "time" field (i.e. per minute, per hour, etc)

- updates are rare

- deletes are rare

By sacrificing update and delete functionality, time-series DBs can optimize creating and reading records.
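The "aggregating data around a time field" point above boils down to time-bucketing. A toy sketch of the per-minute case (illustrative only, not any TSDB's internals):

```java
import java.util.TreeMap;

// Time-bucket aggregation: truncate each timestamp to its bucket index
// (here, the minute) and aggregate the values that fall inside it.
public class BucketDemo {
    public static void main(String[] args) {
        long[] tsMillis = {0, 20_000, 61_000, 65_000, 130_000};
        double[] values = {1.0, 3.0, 10.0, 20.0, 5.0};

        TreeMap<Long, Double> sumPerMinute = new TreeMap<>();
        for (int i = 0; i < tsMillis.length; i++) {
            long bucket = tsMillis[i] / 60_000; // minute index
            sumPerMinute.merge(bucket, values[i], Double::sum);
        }
        sumPerMinute.forEach((minute, sum) -> System.out.println(minute + " " + sum));
    }
}
```

When data is stored in timestamp order, each bucket is a contiguous run of rows, so this aggregation is a single sequential scan - which is the core advantage over a heap-organized table.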

"normal" database does not preserve the order of data as it comes in. To get the data out you have to constantly rely on indexes or "order by" constantly to get chronological order back. Time series database should maintain the order of data and not rely on indexes to have data make sense again.

This is actually an underrated question.

Time-series databases offer better performance and usability for dealing with time-series data (think DevOps metrics, data from IoT devices, stock prices etc, anything where you're monitoring and analyzing how things change over time)

They allow you to answer questions where time is the main component of interest much more quickly and easily:

eg 1: IoT Sensors) Show me the average of temperature over all my devices over the past 3 days in 15 minute intervals

eg 2: Financial data) What's the price of stock X over the past 5 years

eg 3: DevOps data) What's the average memory and CPU used by all my servers of the past 5 mins

A normal database could be a purely relational database (e.g. Postgres) or a non-relational database (e.g. MongoDB). In both cases, while you could use these databases for time-series data, they tend to offer worse performance at scale and a worse experience for common tasks (e.g. real-time aggregations of data, data retention policies, etc.)

For more on time-series data and when you'd need a time-series database, check out: https://blog.timescale.com/blog/what-the-heck-is-time-series...

If you're comparing performance against a transaction-oriented DBMS like Postgres, rather than an analytics-oriented columnar DBMS like Actian Vector, MonetDB, HP Vertica etc., then of course you'll get bad performance. The former kind is typically up to 1000x slower than the latter on analytic queries.

> eg 2: Financial data) What's the price of stock X over the past 5 years

This is so incredibly frustratingly slow to pull on FactSet and Capital IQ, it makes me want to pull my hair every time I have to build line charts over time for a period greater than 2 years

plug, but our system provides very fast access to price, fundamentals, estimates, etc: https://factset.quantopian.com

Sounds like you need a time-seres database for those sorts of narrow and deep queries :)

What's difficult is to find a database that has good performance on both narrow and deep queries (e.g Price of stock X for past 5 years) as well as shallow and wide queries (e.g Price of all stocks in past 15mins)



Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.

I'm curious how QuestDB handles dimensions. OLAP support with a reasonably large number of dimensions and cardinality in the range of at least thousands is a must for a modern-day time-series database. Otherwise, what we get is only an incremental improvement over Graphite - a darling among startups, I understand, but a non-scalable, extremely hard-to-use time-series database nonetheless.

A common flaw I see in many time-series DBs is that they store one time series per combination of dimensions. As a result, any aggregation will result in scanning potentially millions of time series. If any time-series DB claims it is backed by a key-value store, say Cassandra, then the DB will have the aforementioned issue. For instance, Uber's M3 used to be backed by Cassandra and therefore would give this mysterious warning that an aggregation function exceeded the quota of 10,000 time series, even though from the user's point of view the function dealt with a single time series with a number of dimensions.

FYI, M3 is now backed by M3DB, a distributed, quorum read/write, replicated time-series columnar store specialized for realtime metrics. You can associate multiple values/time series with a single set of dimensions if you use Protobufs to write data; for more, see the storage engine documentation [0]. The current recommendation is not to limit your queries but to limit the global data queried per second [1] by a single DB node, using a limit on the number of datapoints (inferred from blocks of datapoints per series). M3DB also uses an inverted index built from mmap'd FST segments [2], similar to Apache Lucene and Elasticsearch, to make multi-dimensional searches on very large data sets fast (hundreds of trillions of datapoints, petabytes of data). This is a bit different from traditional columnar databases, which focus on column stores and are rarely accompanied by a full-text search inverted index.

[0]: https://docs.m3db.io/m3db/architecture/engine/

[1]: https://docs.m3db.io/operational_guide/resource_limits/

[2]: https://fosdem.org/2020/schedule/event/m3db/, https://fosdem.org/2020/schedule/event/m3db/attachments/audi... (PDF)

Recommended reading on FST for the curious: https://blog.burntsushi.net/transducers/

Thank you for mentioning that, Andrew's post is really fantastic covering many things altogether: fundamentals, data structure, real world impact and examples.

I love FSTs and similar structures, they're just such a cool idea.

Anyone know of other similarly interesting blog posts/articles about FSTs?

Thanks, @roskilli! Nice documentation.

We store "dimensions" as table columns with no artificial limits on column count. If you able to send all dimensions in the same message, they will be stored on one row of data. If dimensions are sent as separate messages, current implementation will store them on different rows. This will make columns sparse. We can change that if need be and "update" the same row as dimensions arrive as long as they have the same timestamp value.

There is also an option to store sets of dimensions in separate tables and combine them via asof/splice joins.

Thanks for the explanation.

Can you handle multiple time dimensions efficiently? We have 3 of them; can one get away without having to physically store "slices" on one of them?

If you can send all three in the same message (Influx Line Protocol, for example), we will store them as 3 columns in one table. Does this help?

> https://news.ycombinator.com/item?id=22803504

As I already said or rather asked there: Assume I already use Clickhouse for example. What are the benefits of QuestDB? Why should I use it instead?

Surely it's a good tech and competition is key. But what are the key points that should make me look into it? There is a lot of story about the making and such, but I don't see the "selling point".

Hey, one of the key differences here is that ClickHouse is owned by a large corporation, Yandex (the Google of Russia), and seems to drive its roadmap as a function of the company's needs. We are committed to our community and drive our roadmap based on their needs rather than having to fulfill the needs of a parent company.

Ultimately, as a result, we think QuestDB will be a better fit for the community. We acknowledge that ClickHouse has a lot more features as of now, being a more mature product.

Hi Vlad, this looks really interesting!

I really enjoyed reading the backstory and the founding dynamics from which QuestDB was born, and I think a lot of others in the YC community will as well.

Can you give some use cases or specific examples of why QuestDB is unique?

Thanks! What differentiates us from other time-series databases is performance, both for ingestion and queries. For example, we can ingest application performance metrics via the Influx Line Protocol and query them via SQL, and both should be faster than the incumbents.

I find your story very interesting, thank you for sharing that.

It also gives an interesting background as to why QuestDB is different from all the other competitors in the space.

Thank you for the kind words!

Congratulations on launching! It looks like a great product. Some technical questions which I didn’t see answered on my first glance:

(1) Is it a single-server only, or is it possible to store data replicated as well?

(2) I’m guessing that all the benchmarks were done with all the hot data paged into memory (correct?); what’s the performance once you hit the disk? How much memory do you recommend running with?

(3) How’s the durability? How often do you write to disk? How do you take backups? Do you support streaming backups? How fast/slow/big are snapshot backups?

thank you!

- replication is in the works, this is going to be both TCP and UDP based, column-first, very fast.

- yes, benchmarks are indeed done on a second pass over the mmapped pages. The first pass would trigger IO, which is OS-driven and dependent on disk speed. We've seen well over 1.5GB/s on disks that support this speed. Columns are mapped into memory separately and are lazily accessed, so the memory footprint depends on what data your SQL actually lifts. We go quite far to minimize false disk reads by working with row ids as much as possible. For example, 'order by' will need memory for 8 x row_count bytes in most cases.

- Durability is something we want the user to have control over. Under the hood we have these commit modes:


- NOSYNC: the OS flushes memory whenever it likes. That said, we use a sliding 16MB memory window when writing, and flushes are triggered by unmapping pages.
- ASYNC: we call msync(async)
- SYNC: we call msync(sync)
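For a feel of what those modes mean at the OS level, here is a minimal sketch using Python's mmap (QuestDB itself is Java/C++; the file name here is made up). Writes land in the page cache and are readable immediately; NOSYNC leaves flushing to the OS, while flush() issues a blocking msync:

```python
import mmap
import os
import tempfile

# Minimal sketch of the commit modes above, using Python's mmap
# (illustrative only; QuestDB itself is Java/C++).
path = os.path.join(tempfile.mkdtemp(), "column.d")
with open(path, "wb") as f:
    f.truncate(4096)              # pre-size, as a mapped column would be

with open(path, "r+b") as f:
    mm = mmap.mmap(f.fileno(), 4096)
    mm[0:5] = b"hello"            # write hits the page cache; readable now

    # NOSYNC: do nothing -- the OS flushes dirty pages whenever it likes,
    #         and unmapping (mm.close()) also queues writeback.
    # SYNC:   force pages to disk now; Python's flush() is a blocking msync.
    mm.flush()
    mm.close()

with open(path, "rb") as f:
    print(f.read(5))              # b'hello'
```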

Definitely enjoyed the story and I find the product interesting! I especially like the time-series aggregation clauses since it makes it easy to "think in SQL."

I was also going to ask about replication. Any idea when it's going to be done?

Oh and kudos for the witty (previous) company name: Appsicle, haha, love that.

Hi, I'm a QuestDB dev working on replication; we should have something working within a couple of months. If you have any questions, feel free to ask me.

Curious: What is your strategy on replication? Is it some form of synchronous replication or asynchronous (i.e. active/passive with potential for data loss in event of hard loss of primary)? Also curious why you might look at UDP replication given unless using a protocol like QUIC on top of it, UDP replication would be inherently lossy (i.e. not even eventually consistent).

The strategy is to multicast data to several nodes simultaneously. Data packets are sequenced to allow the receiver to identify data loss. When loss is detected, the receiver finds breathing space to send a NACK. The packet and the NACK identify the missing data chunk with O(1) complexity, and the sender then re-sends. Overall this method is lossless and avoids the overhead of contacting nodes individually and sending the same data over the network multiple times. This is useful in scenarios where several nodes participate in query execution and getting them up to date quickly is important.
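A toy sketch of that NACK scheme (an in-process simulation; all names are illustrative): packets carry sequence numbers, the receiver spots a gap, and a single NACK names exactly the missing range.

```python
def receive(stream, expected_next=0):
    """Consume (seq, payload) packets; return (data, nack_ranges).

    A gap between expected_next and an arriving seq is recorded as a
    single NACK naming the whole missing range -- O(1) to identify.
    """
    data, nacks = {}, []
    for seq, payload in stream:
        if seq > expected_next:
            nacks.append((expected_next, seq - 1))
        data[seq] = payload
        expected_next = max(expected_next, seq + 1)
    return data, nacks

# Sender multicasts seq 0..4, but packets 2 and 3 are dropped on the wire.
wire = [(0, b"a"), (1, b"b"), (4, b"e")]
data, nacks = receive(wire)
print(nacks)                 # [(2, 3)] -> one NACK covers the whole gap

# Sender re-sends only the missing range; the stream is then complete.
lo, hi = nacks[0]
data.update({s: bytes([ord("a") + s]) for s in range(lo, hi + 1)})
print(sorted(data))          # [0, 1, 2, 3, 4]
```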

This reminds me a bit of Aeron (https://github.com/real-logic/aeron) which is a reliable UDP uni/multicast transport library with built-in flow control. It's written in Java and seems to have superb performance (I haven't used it myself). Might be an interesting alternative if you don't want to write it all yourself.

Hi Vlad - your anecdote about ship tracking is interesting (my other startup is an AIS based dry freight trader). You must know the Vortexa guys given your BP background.

How does QuestDB differ from other timeseries/OLAP offerings? I'm not entirely clear.

Thank you, life is an interesting experience :) I used to work with Fabio, Vortexa's CEO, and had to turn down an offer to be their first employee in order to focus on QuestDB. They are an absolutely awesome bunch of guys and deserve every bit of their success!

What makes QuestDB different from other tools is the performance we aim to offer. We are completely open on how we achieve this performance and we serve community first and foremost.

This looks great, but more importantly good luck! There seems to be market need for this and it looks like a solid implementation at first glance. You're off to a good start. I hope you and your team are successful!



Also thank you for your awesome blog[0]! It's really the kind of technical gem I enjoy reading late at night :)

[0] https://questdb.io/blog

This is great! Quick question: would you mind sharing why you went with Java vs something perhaps more performant like all C/C++ or Rust? I'd suspect language familiarity (which is 100% ok).

Java was the starting point. Back in the day Rust wasn't a thing, and C++ projects were quite expensive to maintain. What Java gives us is IDE support, instant compilation time and super easy test coverage. For things that do require ultimate performance we use C/C++, though. These libraries are packaged with Java and transparent to the end user.

Makes sense, that's what I also guessed.

You may also want to check out NetData (a hugely popular OSS project) for ideas how to grow.

Congrats! I've been looking for a time series database but most of them seems to be in-memory nosql databases. QuestDB might be exactly what I need. I'll definitely give it a try soon!

It would then be in your interest to know ClickHouse. I recommend you have a look at it.

We had one of their contributors benchmark QuestDB versus ClickHouse recently; you can find the results here: https://github.com/questdb/questdb/issues/436

This came from a benchmark we had on our previous website versus them, summing 1 billion doubles.

Thank you! We have a quite active Slack and we try to listen to and help our community there. Feel free to join!

Stories like these help a product to get traction. Every founder/creator must come up with a story related to the product.


How does your performance compare to Atlas? [0]

[0] https://github.com/Netflix/atlas

I have not tried to benchmark Atlas, but I am not sure the result would be meaningful. Atlas is an in-memory database, QuestDB persists to disk; the two are not very comparable.

Atlas persists to disk too. Netflix stores trillions of data points in it.

It stores recent data in memory for increased performance, replicated across instances, and then persists to S3 for long-term storage, making aggregates queryable and full-resolution data available (with a delay) for a restore from storage.

Amazing story and congrats on all the progress!

Shameless plug: if you'd like to try it out in a production setting, we just created a one-click install for it:


There's an opportunity for a tool that combines this sort of technology in the backend with a spreadsheet-like GUI powered by formulas and all the user friendliness that comes with a non-programmer interface. Wall Street would forever be changed. Source: I'm one of the poor souls fighting my CPU and RAM to do the same thing with Excel and non-native add-ins by {FactSet, Capital IQ, Bloomberg}

This stuff

    SELECT * FROM balances
    LATEST BY balance_ccy, cust_id
    WHERE timestamp <= '2020-04-22T16:15:00.000Z'
    AND NOT inactive;
Makes me literally want to cry for knowing what is possible yet not being able to do this on my day job :(

We are working on building solid PostgreSQL support, insofar as allowing an ODBC driver to execute this type of query from Excel. This is work in progress with not that much left to do.

Awesome! I think about this almost on a daily basis, and could very well be wrong, but from my perspective think the killer feature is integrating the querying with the financial data providers I mentioned above so they could sell the whole thing as the final product to end users. (EDIT: from a reply to another comment, it seems like some people are onto the concept: https://factset.quantopian.com)

If you ever install FactSet for a trial period and try querying time series with even ~10,000+ data points, you'd be amazed at how long it takes, how sluggish it is and how often Excel crashes.

My real perspective is Microsoft should roll something similar out as part of Excel and also get in the business of providing the financial data as they continue the transition into services over products

Perhaps this is an aside, but the questdb.io website claims "postgres wire support", but doesn't clarify whether that's _full_ wire support, or a subset of postgresql (which would be understandable) and if so, what the subset is.

For Excel-like analysis with pivot tables, take a look at our https://www.seektable.com; it can connect to various DWs that are suitable for near real-time big-data analytics (like Redshift, BigQuery, Snowflake, ClickHouse). SeekTable can be deployed on-premises.

We can add a connector for QuestDB if someone is really interested in this.

The SQL explorer at http://try.questdb.io:9000/ is pretty slick – was that built in-house, or is it based on something that's open-source?

Thanks! This is something we built in house. In fact, ‘mpsq’ did all of that. All credit should go to him!

The database aside entirely, that story was a really fun read. Thanks for writing it up and sharing. Rooting for you!

I noticed there is "Clustering" mentioned under enterprise features, but I can't seem to find any references to it in the documentation. Is this something that will be strictly closed source?

There will be two different flavors of replication:

- TCP-based replication for WAN
- UDP-based replication for LAN and high-traffic environments

We are currently building the foundation elements of this replication, such as column-first and parallel writes. These will go into, and always be part of, QuestDB. TCP replication will go on top of this foundation and will also be part of QuestDB. UDP-based replication will be part of a different product we are building, which will be named Pulsar.

Thanks for your response! Last question...

Will the clustering target just replication (HA) or will it also target sharding for scaling out storage capacity?


Eventually both. We are starting with baby steps, e.g. getting data from A to B quickly and reliably. Replication/HA will be first, of course. Then we want to scale queries across multiple hosts. Since all nodes have the same data, they may as well all participate. Sharding will be last. We are thinking of taking the route of virtualizing tables. Each shard can be its own table, and the SQL optimiser can use them as partitions of a single virtual table. We already take a single table and partition it for execution, so sharding seems almost like a natural fit.

I am still hoping to see comparisons to VictoriaMetrics, which also shows much better performance than many other TSDBs. VictoriaMetrics is Prometheus-compatible, whereas QuestDB now supports Postgres compatibility. Both have compatibility with InfluxDB.

The Victoria Metrics story is somewhat similar where someone tried using Clickhouse for large time series data at work and was astonished at how much faster it was. He then made a reimplementation customized for time series data and the Prometheus ecosystem.

This is on the roadmap, we will work on integrating with https://github.com/timescale/tsbs, TSBS has Victoria Metrics support too.

Awesome! Could you share a bit about business model?

hi, co-founder of QuestDB here.

QuestDB is open source and therefore free for everybody to use. Another product using QuestDB as a library with features that are typically required for massive enterprise deployment will be distributed and sold to companies through a fully managed solution.

Our idea is to empower developers to solve their problems with QuestDB open source, and for those developers to then push the product within the organisation bottom up.

I'm not associated with QuestDB, but if it's anything like the other open-source startups I work with then the business model is probably selling a managed or hosted version of the DB with enterprise benefits like security compliance, SLAs, and engineering support. In that case the open-source DB will act as a driver of awareness and of demand for the commercial option.

Testing out the demo:

    SELECT * FROM trips WHERE tip_amount > 500 ORDER BY tip_amount DESC

Very interesting :-)

Thanks for trying the live demo. That's a very interesting result indeed. Btw, we are working on applying SIMD operations to filter queries (where clause), which will speed up the runtime of queries like that considerably.

For some reason this query is taking too long to execute. Not sure if I missed something.

When I ran it, it took about 20s total.

took almost a min for me

Some of those are absolutely monstrous tips!

Your story is very inspiring. I wish you all the best with this project.

Thanks for the kind words!

Maybe I'm out of the loop, but I noticed lately that a majority of show/launch hn posts I click on have text that is muted. I know this happens on down voted comments, but is this saying that people are down voting the post itself?

I've noticed this when I post "Ask HN" questions too - text seems to be muted immediately though, regardless of votes. It's kind of confusing since, as you say, elsewhere this means a comment has been downvoted.

I don’t think it’s being downvoted. Maybe it’s because it’s an actual post?

1. Does QuestDB support SQL sufficiently to run, say, the TPC-H analytics benchmark? (not a time series benchmark)

2. If so, can you give some performance figures for a single machine and reasonable scale factors (10 GB, 100 GB, 1000 GB)? Execution times for single queries are even more interesting than the overall random-queries-per-hour figure.

3. Can you link to a widely-used benchmark for analytic workloads on time series, which one can use to compare performance of various DBMSes? With SQL-based queries preferably.

Congrats on the launch!

One question, there are many open source database startups that make it easy to scale on the cloud. However, when you look into the offering, the scaling part is never actually open source and you end up paying for non open source stuff just like any other proprietary database. So I guess my question is, are you planning to go open core too or will you remain open source with some SaaS offering? Good luck to you!

QuestDB is open source and will remain so forever. You will be able to scale with it. Our commercial product, Pulsar, uses QuestDB as a library and will offer enterprise integration and monitoring features, which are typically required for massive enterprise deployment.

To answer your question, it will depend on how big a scale we are looking at. If you are a large company running QuestDB throughout the organization and need a specific feature set to do so, you will probably be looking to get our paid offering.

When I read Pulsar I think Apache Pulsar.

please try http instead of https

Can you talk about some of the ideal use cases for a time series db? Versus Postgres or a graph database.

Great question! Time series databases are a great solution for applications that need to process streams of data. IoT is a popular use case. DevOps and infrastructure monitoring applications as well. As has been mentioned in other comments here, there are a lot of use cases in financial services as well.

These are all applications where you’re dealing with streams of time-stamped data that needs to be ingested, stored, and queried in huge volumes.

Does it support some kind of compression? That's very important when storing billions of events.

Not yet, but this is on the roadmap. In the meantime you could use a filesystem that supports compression, such as ZFS or btrfs. The data is column-oriented, which means that compression would be super efficient.

Does postgres wire support mean QuestDB can be a drop-in replacement for a postgres database?

Is this common?

QuestDB Head of DevRel here ... Yes, it can be a replacement for Postgres, and it will be cheaper and faster. That being said, PGWire is still in alpha and is not 100% covered yet, so while migrating is possible, 100% PostgreSQL wire protocol compatibility is not there yet.

For traditional transactional RDBMS data, I don't think it's a very common choice. For time series data, QuestDB is by far the fastest of the Postgres-compatible SQL time series databases.

This is great if it works well. A drop-in replacement would be great for systems with DB abstraction, like Django or SQLAlchemy.

Yeah, I checked it out and wanted to use it, but a bunch of regular old SQL queries don't work. Please add support for the old-fashioned GROUP BY syntax! (This will be helpful for getting to a true drop-in replacement!)

QuestDB dev here. We added support for GROUP BY syntax in yesterday's release

How do you get the best performance out of QuestDB? Does it have to be on bare metal machines? Is there any performance benchmark of QuestDB running on bare metal vs. cloud instances (e.g. EC2 with EBS volumes) etc.?

QuestDB is quite intensive on cores. For example, it is not a good idea to put two threads on hyperthreaded cores that share the same physical core. It also isn't a good idea to put threads on cores that belong to different physical CPUs on a multi-socket server. With a bare metal box you have visibility into all of these conditions. On virtualized boxes, this depends on your luck, really. That said, if no one is hammering the same cores QuestDB is hammering, performance will be very similar to a bare metal box.
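On Linux you can check and enforce this yourself. A hedged sketch (the sysfs paths are Linux-specific, and the affinity call is illustrative, not something QuestDB requires):

```python
import os

def physical_cores():
    """Group logical CPUs by (package, core) via sysfs (Linux only)."""
    cores = {}
    for cpu in range(os.cpu_count()):
        base = f"/sys/devices/system/cpu/cpu{cpu}/topology"
        try:
            with open(f"{base}/physical_package_id") as f:
                pkg = int(f.read())
            with open(f"{base}/core_id") as f:
                core = int(f.read())
            cores.setdefault((pkg, core), []).append(cpu)
        except OSError:
            cores[(0, cpu)] = [cpu]   # no sysfs: assume no SMT siblings
    return cores

# Pick one logical CPU per physical core so two busy threads never share
# the same execution units, then pin this process to that set.
picks = {cpus[0] for cpus in physical_cores().values()}
if hasattr(os, "sched_setaffinity"):  # Linux only
    try:
        os.sched_setaffinity(0, picks)
    except OSError:
        pass                          # cgroup cpusets may restrict us
print(sorted(picks))
```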

https://questdb.io/docs/crudOperations has JS errors and is not loading / page not found.

Thanks for reporting this! This is an old link, please use https://questdb.io/docs/guide/crud instead. I am currently updating the README and removing all dead links.

Good luck. I have worked on a similar OS database engine for about a decade now. It is not bad, but I think consulting is a better way to get funds. Also, avoid "zero GC"; the JVM can be surprisingly good.

Will be in touch :)

Thanks Jan!

Do you measure performance vs k/shakti?

No one is allowed to benchmark them except them. :-)

Something is off with your website; I just see images: https://questdb.io/blog/2020/07/24/use-questdb-for-swag/

What browser are you using?

chrome on osx Version 84.0.4147.89 (Official Build) (64-bit)

Works fine in Safari; something is up with the dark theme.

I tried with the same setup and it works fine. I tried to disable JS too and it's OK. Could it be a rogue extension?

Ah yeah, you are right. I had a high-contrast extension messing with this page.

How do I join the Slack group? It says to request an invite from the workspace administrator.

If you click "Join Slack" at the top right of our website, you'll be joining our public channels!

Unable to join your slack here...Slack is stating: "Contact the workspace administrator for an invitation"

Loved the story and the product!

Great story! Thanks for sharing

Thanks Jerome!

Impressive. Can we talk?

Sure, could you join our Slack? Top right of our website: www.questdb.io

Unable to join your slack here...Slack is stating: "Contact the workspace administrator for an invitation"

Head of DevRel for QuestDB here drop me an email at davidgs(at)questdb(dot)io and I'll make sure you get invited.

great story! well done.

Any plans on integration with Apache Arrow?

It has been asked here: https://github.com/questdb/questdb/issues/261. Definitely on our roadmap. It would be good if you could share your story: why do you need Arrow?

No urgent reason. I've noticed a decent number of technologies have adopted it in some way or another. I could imagine it being useful for integrating QuestDB with existing internal systems which use Arrow as their in-memory/interchange format.

Appreciate the issue link :)

kudos @ launching, impressive

Thanks so much!

Can you add a tldr?

this. i'm sure qdb is a great product, but i can't even stomach reading long lines and walls of text...
