
Time-Series Database Requirements
http://www.xaprb.com/blog/2014/06/08/time-series-database-requirements/
======
aaa667
Would be keen to know what people think about:
[https://github.com/ambiata/ivory](https://github.com/ambiata/ivory)

------
thrownaway2424
"Reads need to be fast, even though they are rare. There are generally two
approaches to dealing with this. The first is to write efficiently, so the
data isn’t read-optimized per-series on disk, and deploy massive amounts of
compute power in parallel for reads, scanning through all the data linearly.
The second is to pay a penalty on writes, so the data is tightly packed by
series and optimized for sequential reads of a series."

As the little girl says in the GIF: why don't we have both? Write to a write-
optimized store of limited size that requires full access during reads, and
re-write that into a read-optimized format hourly or daily. Because it's
limited in size, you won't care that the most recent data isn't very efficient
for reading, or isn't particularly compact.
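
A toy sketch of that hybrid in Python (all names hypothetical, not any particular TSDB): recent points land in an append-only buffer, and an hourly/daily compaction repacks them per series:

    from collections import defaultdict

    class HybridStore:
        def __init__(self):
            self.buffer = []                   # write-optimized: raw arrival order
            self.segments = defaultdict(list)  # read-optimized: packed per series

        def write(self, series, ts, value):
            self.buffer.append((series, ts, value))   # cheap append, no read work

        def read(self, series):
            # Recent data needs a scan of the (bounded) buffer...
            recent = sorted((t, v) for s, t, v in self.buffer if s == series)
            # ...while historical data is already packed and sorted per series.
            return self.segments[series] + recent

        def compact(self):
            # Run hourly/daily: pay the re-write cost once, off the hot path.
            for series, ts, value in self.buffer:
                self.segments[series].append((ts, value))
            for points in self.segments.values():
                points.sort()
            self.buffer.clear()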

------
rodionos
I'm one of the developers behind Axibase Time-Series Database which runs on
top of HBase. ATSD is two years into development and has a built-in rule
engine, forecasting, and visualization:
[http://axibase.com/products/axibase-time-series-database/visualization/](http://axibase.com/products/axibase-time-series-database/visualization/).
The rule engine allows you to write expressions such as
abs(forecast_deviation(avg())) > 2.0 to trigger url/email/command actions
if the sliding-window average is more than 2.0 sigmas away from a
Holt-Winters/ARIMA forecast.
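
In plain Python, the check boils down to something like this (a rough sketch, not ATSD's actual rule engine; forecast_mean and sigma stand in for the Holt-Winters/ARIMA model output):

    def should_trigger(window, forecast_mean, sigma, threshold=2.0):
        # Fire url/email/command actions when the sliding-window average
        # deviates from the forecast by more than `threshold` sigmas.
        avg = sum(window) / len(window)
        return abs(avg - forecast_mean) > threshold * sigma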

The license is commercial and there's a free CE version which can be scaled
vertically without any throughput constraints. Tags are supported for series
as well as for entities and metrics, to avoid storing long-term metadata such
as location, type, and category along with the data itself.

I wouldn't be surprised if the functional differences between TSDBs and
historians disappear in just a few years. Right now the historians are good at
compressing repetitive data at source and on disk which makes sense given
their heritage in archiving data from SCADA systems.

------
k1w1
I built exactly the database that you are describing here. Unfortunately it is
not open source. However, it is no secret that we used SQLite for the storage.
Inserts and queries in SQLite are fast, even for databases with billions of
rows. Deletes are very, very slow, so we created a new SQLite database for
each week, and simply deleted an entire database file when the retention
period expired. SQLite is ACID, supports all of the complex queries you could
imagine, and is easy to embed into the overall engine code base. We used
aggregate tables to display common aggregations of the data more efficiently
(e.g. graph a metric over the last day, or average a metric over the last
week). The aggregates were updated as data was inserted, so that real-time
views were always available.
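
A minimal sketch of that scheme (Python stdlib only; names are hypothetical): one SQLite file per ISO week, so expiring old data becomes a cheap file delete instead of a slow DELETE.

    import os
    import sqlite3
    import datetime

    def db_path(when):
        # One database file per ISO week; `when` is a date or datetime.
        year, week, _ = when.isocalendar()
        return "metrics-%04d-W%02d.db" % (year, week)

    def insert(ts, metric, value):
        # `ts` is a datetime.datetime for the datapoint.
        conn = sqlite3.connect(db_path(ts))
        conn.execute("CREATE TABLE IF NOT EXISTS points"
                     " (ts REAL, metric TEXT, value REAL)")
        conn.execute("INSERT INTO points VALUES (?, ?, ?)",
                     (ts.timestamp(), metric, value))
        conn.commit()
        conn.close()

    def expire(retention_weeks):
        # Retention is a whole-file delete, not a row-by-row DELETE.
        cutoff = db_path(datetime.date.today()
                         - datetime.timedelta(weeks=retention_weeks))
        for name in os.listdir("."):
            if name.startswith("metrics-") and name.endswith(".db") and name < cutoff:
                os.remove(name)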

~~~
biokoda
Sounds like you implemented something similar to
[http://www.actordb.com/](http://www.actordb.com/)

------
damian2000
Anyone heard of Operational Data Historians, like OSIsoft's PI, Honeywell's
PHD, or GE's Proficy? They're expensive software suites that offer strong
real-time capability plus historical access. They're usually used in process
control/operations at factories and manufacturing plants. Each item being
measured is called a tag.

Just thought I'd throw this out there since it's a specialised area that not
many people know about. I've done some work with them, writing adaptors to a
time series data visualisation product.

[http://en.wikipedia.org/wiki/Operational_historian](http://en.wikipedia.org/wiki/Operational_historian)

[http://en.wikipedia.org/wiki/OSIsoft](http://en.wikipedia.org/wiki/OSIsoft)

[https://www.honeywellprocess.com/en-US/training/programs/advanced-applications/Pages/uniformance-phd.aspx](https://www.honeywellprocess.com/en-US/training/programs/advanced-applications/Pages/uniformance-phd.aspx)

[http://www.geautomation.com/products/proficy-historian](http://www.geautomation.com/products/proficy-historian)

On the topic of Historians vs Relational Databases, there's a blog post about
it here:

[https://osipi.wordpress.com/2010/07/05/relational-database-vs-process-historian-for-process-data-use-both/](https://osipi.wordpress.com/2010/07/05/relational-database-vs-process-historian-for-process-data-use-both/)

Admittedly this is by the developer, OSIsoft, so it may be biased, but their
points seem valid, especially the swinging door algorithm reference and the
fact that historians are far more efficient in storage.

~~~
snowwindwaves
I just installed and threw 200 tags at the Rockwell FactoryTalk-branded
version of OSIsoft's PI. I always wanted to work with it, but unfortunately a
"historian" was just one of many deliverables, so I only got to do the bare
minimum that ticked the box. Retrieving data from the historian is noticeably
faster than from the Rockwell FactoryTalk SE HMI binary datalog files, though.

------
welder
This sounds like a job for
[http://www.aerospike.com/](http://www.aerospike.com/)

~~~
nivertech
How would you model timeseries on top of Aerospike?

Using 1KB chunks of datapoints as simple K/V?

Or using Large Data Types [1] like Large Stack [2]?

    
    
        Large Stack

        A Large Stack collection is naturally aligned with time series data
        because the stack preserves the insert order. Stacks provide Last In
        First Out (LIFO) order, so it's a convenient mechanism for tracking
        "recent" data.

        Usage examples include:

            Show me the last N events
            Search the last N actions with a filter
            Show me all documents in reverse insert order

[1]
[https://www.aerospike.com/docs/guide/ldt.html](https://www.aerospike.com/docs/guide/ldt.html)

[2]
[https://www.aerospike.com/docs/client/java/usage/ldt/ldt.html](https://www.aerospike.com/docs/client/java/usage/ldt/ldt.html)
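
For the first option, a sketch of what chunked K/V might look like with the Aerospike Python client (the namespace/set/bin names here are made up; list_append appends server-side, avoiding a read-modify-write race):

    import aerospike

    client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()

    def bucket_key(series, ts):
        hour = ts - (ts % 3600)                # one record per series-hour
        return ("test", "tsdata", "%s:%d" % (series, hour))

    def append_point(series, ts, value):
        # Server-side list append; no client-side read-modify-write.
        client.list_append(bucket_key(series, ts), "points", [ts, value])

    def read_hour(series, ts):
        _, _, record = client.get(bucket_key(series, ts))
        return record["points"]                # [[ts, value], ...] for that hour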

------
obstinate
Multi-dimensionality optional/drawback? Strongly disagree.

This may be my experience as a Googler talking, but I also somewhat disagree
with the notion that the data can't fit in memory. I operate a X0,000 task
service, and our monitoring data could fit into memory in a large server, if
need be.

Of course, we don't keep per-task data permanently; that would be prohibitive
at a 5s monitoring interval like the one we use, even if it were put on disk.
Instead, we accomplish what I described by aggregating away some dimensions,
mostly task number, and then holding the aggregated series in memory for fast
queries. There are some nuances, particularly around having foresight about
the cases where you _do_ want to see individual tasks.
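
For illustration, aggregating away the task dimension can be as simple as this (a toy sketch, not our actual pipeline):

    from collections import defaultdict

    def aggregate_away_task(points):
        # points: iterable of (metric, task, ts, value) tuples
        rolled = defaultdict(float)
        for metric, _task, ts, value in points:   # drop the task dimension
            rolled[(metric, ts)] += value         # e.g. sum across all tasks
        return rolled   # far fewer series; small enough to hold in RAM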

But suffice it to say that this person's experience does not match mine in
terms of what I need from a TSDB. Perhaps his ops background comes from a
different set of needs than mine, but if you're building a TSDB for many
customers, I wouldn't take this list as gospel.

~~~
devdas
If I need a few terabytes of data to be stored reliably enough that I can lose
a node (or two), support mostly writes and some reads, for a duration of a few
years (at least three), would you still suggest storing this in RAM?

~~~
obstinate
My team keeps replicated copies of the data in RAM, replacing nodes that get
lost. The data is also echoed to disks, of course.

------
doodlebugging
This might not be what he's looking for but it is one way to manage huge
amounts of data:

[HDF5](http://www.hdfgroup.org/)

~~~
rspeer
HDF5 is mostly a storage format. It's not what he's looking for because it
doesn't meet the requirements of a database.

For example, only one process at a time can open an HDF5 file for writing.

------
sethev
I'm kind of curious about the no "tagging" point. Won't there always be some
data that doesn't fit the [timestamp double] format?

~~~
misframer
We don't use tags, but we basically encode tags into the metric name of our
time series. An example would be "host.queries.c.1374c6821ead6f47.tput". This
tells you the category, type of query, query ID, and the metric associated
with it. In this case, it's the time series for the throughput of the query
with ID 1374c6821ead6f47.

~~~
philsnow
If you have multiple tags, do you canonicalize the ordering of the metric
name? If so, where does that canonicalization happen (on the metric
origination side or on the server/collection side)? It would seem to be a pain
if you end up with metrics named

    host.queries.type=GET.result=200.tput

and

    host.queries.result=200.type=GET.tput

~~~
misframer
Good question. We do have a specific ordering, and the canonicalization
happens when the metrics are generated. Again, we don't really treat them as
tags on a specific metric, but rather as unique metrics. Having a different
name means it's a different metric.
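
In other words, something like this sketch (names follow the examples above):

    def metric_name(prefix, tags, suffix):
        # Sort tags at generation time so each tag set maps to exactly one name.
        parts = ["%s=%s" % (k, v) for k, v in sorted(tags.items())]
        return ".".join([prefix] + parts + [suffix])

    # Both orderings yield "host.queries.result=200.type=GET.tput":
    metric_name("host.queries", {"type": "GET", "result": "200"}, "tput")
    metric_name("host.queries", {"result": "200", "type": "GET"}, "tput")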

------
manigandham
Recently came across Prometheus ([http://prometheus.io](http://prometheus.io))

There's also OpenTSDB ([http://opentsdb.net](http://opentsdb.net)) that's been
around for a while.

~~~
otterley
Prometheus has no scaling model at this time; sharding is currently described
as an exercise for the user. That surprises me, if they're actually still
using it at SoundCloud.

~~~
Filligree
Why do you need scaling? A single machine should be enough to monitor
thousands of others.

Or, in short, if you do need scaling you probably can afford to implement it.

~~~
devdas
Because you end up being write-limited, and there is only so much you can do
with SSDs.

------
Rapzid
I would actually be very interested in hearing more about this MySQL
implementation.

~~~
misframer
I work with Baron, the author. He gave a talk recently at SCaLE[0] about our
time series storage using MySQL. The slides[1] are also available in case you
don't want to watch the entire talk.

[0] [http://youtu.be/EoUfkkrIbPg](http://youtu.be/EoUfkkrIbPg)

[1] [http://www.slideshare.net/vividcortex/scaling-vividortexs-big-data-systems-on-mysql](http://www.slideshare.net/vividcortex/scaling-vividortexs-big-data-systems-on-mysql)

~~~
Rapzid
I'm wondering whether you've considered buffering the metrics into a memory
table so they can participate in queries, and then letting MySQL flush them to
durable tables at regular intervals?

~~~
misframer
My personal opinion is that it's another layer of complexity without much to
gain. It's something else to manage (operational overhead), and something else
to interact with (programmer overhead).

MySQL by itself is plenty fast for us. As Baron mentioned, we don't even run
the database servers at full capacity.

------
digitalzombie
Cassandra seems like a good fit.

Writes are faster than reads, it's an AP system, and you shouldn't update it
frequently unless you want tombstone hell. There's also TTL support.

Are there any cons to using Cassandra as a time series database? I'd like to
hear them.

The biggest thing with Cassandra is that you should know your queries before
you do your data modeling.
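
For example, a typical query-first time series model (a sketch using the DataStax Python driver; the keyspace and names are made up) partitions by (metric, day) to bound partition sizes, clusters by timestamp for range scans, and uses TTL for retention:

    import datetime
    from cassandra.cluster import Cluster

    session = Cluster(["127.0.0.1"]).connect("metrics_ks")
    session.execute("""
        CREATE TABLE IF NOT EXISTS points (
            metric text, day date, ts timestamp, value double,
            PRIMARY KEY ((metric, day), ts)
        ) WITH CLUSTERING ORDER BY (ts DESC)
    """)
    session.execute(
        "INSERT INTO points (metric, day, ts, value)"
        " VALUES (%s, %s, %s, %s) USING TTL 2592000",   # 30-day retention
        ("cpu.user", datetime.date(2014, 6, 8),
         datetime.datetime(2014, 6, 8, 12, 0), 0.42))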

~~~
kodablah
I too believe it is the best fit, but the "aggregate functions" requirement is
what gets most people. The use of counter columns is very limiting, and many
engineers don't want to struggle with storing state during streaming writes to
precalculate the aggregates on write. Engineers also tend not to want to do
large reads to rebuild large aggregate values on small data changes.

It is what we use, and we use Spark Streaming for the rollups. We also
evaluated Influx, OpenTSDB, and Druid. So long as you know the exact read
patterns for your client, I think Cassandra is definitely the best fit for
most things.

~~~
salex89
Did you try out the Cassandra-backed KairosDB? I'm very interested in the
results if you did.

~~~
kodablah
I did not. The Cassandra schema appears similar. We had enough custom needs
for how we aggregate data (e.g. EWMA) that we probably needed to do this
ourselves anyway.
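
For reference, the EWMA fold itself is tiny (a sketch in plain Python, not our Spark job); the hard part is keeping the previous value around between streaming batches:

    def ewma_update(prev, value, alpha=0.2):
        if prev is None:
            return value                            # seed with first observation
        return alpha * value + (1 - alpha) * prev   # recent points weigh more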

------
hartror
This smells like a job for Apache Kafka [1]. I've yet to use it personally,
but its feature set appears to hit the mark, though it lacks SQL. The
application described sounds like it uses something similar to event sourcing
[2], which people have used Kafka for successfully. If you're not familiar
with Kafka, there is a very good interview with Jun Rao [3] on SE Radio.

[1] [http://kafka.apache.org/](http://kafka.apache.org/)

[2]
[http://martinfowler.com/eaaDev/EventSourcing.html](http://martinfowler.com/eaaDev/EventSourcing.html)

[3] [http://www.se-radio.net/2015/02/episode-219-apache-kafka-with-jun-rao/](http://www.se-radio.net/2015/02/episode-219-apache-kafka-with-jun-rao/)

~~~
misframer
Kafka is great, but it doesn't solve the problem the author describes since
you can't specify an index. Data arrives in (timestamp, metric) order, but it
needs to be indexed by (metric, timestamp).
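
To illustrate, with SQLite standing in for any indexed store (hypothetical schema):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE points (ts REAL, metric TEXT, value REAL)")
    # Data arrives in (timestamp, metric) order, but reads need this index:
    conn.execute("CREATE INDEX points_by_series ON points (metric, ts)")
    # One series over a time range becomes an index range scan, not a full scan:
    conn.execute("SELECT ts, value FROM points"
                 " WHERE metric = ? AND ts BETWEEN ? AND ?",
                 ("host.queries.tput", 1402185600, 1402272000))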

~~~
jat850
Would Apache Samza fold in here better, perhaps?

~~~
misframer
I don't know much about Samza, but I don't think stream processors are what we
(I work with the author) are looking for. We don't really have a lot of
"stream processing" to do, and aggregate functions are usually computed on
demand. Also, you still have to put the results from the stream processor
somewhere, right? Putting them back into Kafka is something people do, but we
still need indexing capabilities. As the post mentions, we use MySQL for time
series storage, but we also use Kafka in front as a durable log.

~~~
jat850
You're welcome to email me if you like - it's in my profile. Kafka and Samza
are intended (generally speaking) to go hand in hand. Samza is a re-imagined
datastore that Kafka can shuttle data into. I've been investigating Samza
quite heavily specifically for time series data storage. I'd be happy to share
thoughts.

~~~
andrioni
While I'm not using Samza, Spark Streaming also works pretty nicely in this
case, although it is not so focused on keeping state (it can, though, using
the checkpointing system and the `updateStateByKey` transformation) and thus
might not be perfectly stable if you need to handle failures without
reprocessing.
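
A minimal `updateStateByKey` sketch (assumes a socket stream of "metric value" lines; the host, port, and checkpoint directory are placeholders):

    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext(appName="ts-rollup")
    ssc = StreamingContext(sc, 5)                # 5-second micro-batches
    ssc.checkpoint("/tmp/ts-checkpoints")        # required for stateful ops

    def update(new_values, running):
        return (running or 0.0) + sum(new_values)   # running sum per metric

    lines = ssc.socketTextStream("localhost", 9999)
    pairs = lines.map(lambda l: (l.split()[0], float(l.split()[1])))
    pairs.updateStateByKey(update).pprint()

    ssc.start()
    ssc.awaitTermination()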

------
nimish
Druid is quite a good one

~~~
zodvik
How has your experience been using Druid? It seems like a good fit for our
use cases, but there doesn't seem to be much of a community around it.

