
Show HN: QuestDB – fast time series database, zero-GC Java - bluestreak
https://www.questdb.io/
======
bluestreak
Back in 2012, my former boss forced me to use MarkLogic as a database to
ingest and help display positional data in real time. Unable to get MarkLogic
to work without building fairly complex caches around it, I decided to look
around for a database that had the performance properties of a cache. With
none found (I missed Redis somehow) and no money to spend on KDB, I started
to cobble together file-based storage and a cache to work as a cohesive unit.
Let me tell you, I learnt a lot more in that one year than in my prior 15
years developing software.

Addicted to learning new things, I doubled-down on turning the storage+cache
idea into a functioning database, without realising how hard it was to turn
this idea into reality. Unwilling to give up, and fast forwarding to almost
2020, I'm asking you to provide feedback about what I've accomplished so far.
QuestDB is currently usable as a single node or embedded; it does not yet
support horizontal scalability (on our roadmap for 2020). Thank you for
your time, I look forward to hearing your honest feedback.

------
sirffuzzylogik
How does it compare to kdb performance wise?

~~~
bluestreak
We stay on par with KDB in most cases we tested, except for resampling of time
series (xbar in kdb), where we are much faster.
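
For readers unfamiliar with xbar: it rounds each timestamp down to the start
of a fixed-width bucket, and resampling then aggregates the values per
bucket. Below is a minimal Java sketch of that idea only. It is not how
QuestDB or kdb implement it, and the class and method names (ResampleSketch,
bucketStart, resampleSum) are made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: naive fixed-width resampling, not QuestDB's implementation.
final class ResampleSketch {

    // Round a timestamp down to the start of its bucket (what "width xbar ts" does).
    // floorDiv keeps the rounding direction correct for negative timestamps too.
    static long bucketStart(long timestamp, long width) {
        return Math.floorDiv(timestamp, width) * width;
    }

    // Sum the values falling into each bucket; result is ordered by bucket start.
    static Map<Long, Double> resampleSum(long[] timestamps, double[] values, long width) {
        Map<Long, Double> buckets = new TreeMap<>();
        for (int i = 0; i < timestamps.length; i++) {
            buckets.merge(bucketStart(timestamps[i], width), values[i], Double::sum);
        }
        return buckets;
    }

    public static void main(String[] args) {
        long[] ts = {0, 1500, 4999, 5000, 12000};   // e.g. milliseconds
        double[] v = {1, 2, 3, 4, 5};
        System.out.println(resampleSum(ts, v, 5000)); // 5-second buckets
    }
}
```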

------
woodyHood
Interesting project, how do you manage to achieve consistency of the data?

~~~
bluestreak
Each table has a transaction file (_txn). This file holds navigation pointers
for both the reader and the writer. We map this file into memory and read and
write that memory atomically without blocking. Data that is written but not
referred to from _txn is ignored by the reader. On the other hand, the
writer-related part of _txn has a "todo" list that is populated before the
writer attempts structure changes or a truncate. When something the writer
does fails, it attempts to undo it. When the undo also fails, it reports an
error and re-attempts the repair on restart.

A good example of that would be deleting a column or partition on Windows.
The writer tells the reader via the _txn file that the structure version has
changed and attempts to delete the files via the todo system. So the reader
will pick up the change even if the writer fails to delete the files. Hope
this all makes sense!
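
To make the "written but not referred to from _txn is ignored" part concrete,
here is a minimal Java sketch of the general technique, not QuestDB's actual
code. It assumes a made-up layout where the mapped file holds a single
committed row count at offset 0, and it uses an in-memory array in place of a
column file. The names TxnSketch, append, commit and sumVisible are
hypothetical.

```java
import java.io.IOException;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative only: publish committed rows through a memory-mapped counter.
final class TxnSketch {
    // Lets us do acquire/release accesses on longs inside a ByteBuffer.
    private static final VarHandle LONGS =
            MethodHandles.byteBufferViewVarHandle(long[].class, ByteOrder.nativeOrder());
    private static final int ROW_COUNT_OFFSET = 0; // assumed layout, not QuestDB's

    private final MappedByteBuffer txn; // stands in for the _txn file
    private final long[] column;        // stands in for a column file
    private long appended;              // writer-private, invisible to readers

    TxnSketch(Path txnFile, int capacity) throws IOException {
        try (FileChannel ch = FileChannel.open(txnFile, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The mapping stays valid after the channel is closed.
            txn = ch.map(FileChannel.MapMode.READ_WRITE, 0, Long.BYTES);
        }
        column = new long[capacity];
    }

    // Writer: put data in place; readers cannot see it yet.
    void append(long value) {
        column[(int) appended++] = value;
    }

    // Writer: publish everything appended so far with one release store.
    void commit() {
        LONGS.setRelease(txn, ROW_COUNT_OFFSET, appended);
    }

    // Reader: acquire load of the committed row count; no locks taken.
    long visibleRowCount() {
        return (long) LONGS.getAcquire(txn, ROW_COUNT_OFFSET);
    }

    // Reader: only touches rows below the committed count.
    long sumVisible() {
        long n = visibleRowCount();
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += column[i];
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        TxnSketch t = new TxnSketch(Files.createTempFile("txn", ".bin"), 8);
        t.append(10);
        t.append(20);
        System.out.println(t.visibleRowCount()); // nothing committed yet
        t.commit();
        System.out.println(t.sumVisible());      // both rows now visible
    }
}
```

If the writer dies between append and commit, the extra data is simply never
referenced, so a reader never sees a half-written row. The todo list and the
structure version described above are not modelled here.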

------
priezz
I understand that the choice was made 5 years ago, but why Java?

~~~
bluestreak
Choice of language was a product of my environment. The trading systems I
worked on at the time were in Java, and the people I had the pleasure to work
with were Java experts. That said, I have very few regrets. IDEs, debuggers,
profilers and test frameworks make it super easy to iterate. The Java VM and
JIT are state-of-the-art compared to anything else, and the assembly they
produce is absolutely great. The only thing I wish Java had is fast access to
C libraries. On this project I miss SSE or AVX based string comparison.

------
m3slo
How does it compare to existing time series databases such as InfluxDB,
Prometheus or even Graphite?

~~~
bluestreak
QuestDB has a relational data model supporting a larger range of data types.
We use SQL for all queries, have a SQL optimiser and provide useful error
reporting. All of it is accessible over both HTTP and the PostgreSQL wire
protocol.

From our testing, we are considerably faster at ingesting data than Influx. We
are faster for most queries and use much less CPU to achieve that. We have not
tested against Prometheus or Graphite yet.

On the other hand, we do not have high availability yet. This is on the
roadmap for next year.

------
dang
URL changed from https://github.com/questdb/questdb to the project site.

