
Ask HN: What are recommended tools for OLAP for event data with durations - ninjakeyboard
We need to query "activities" that exist over a period of time. E.g., consider a user tracking their time online: if we want to track efficiency and show trends over time of working vs. messing about on distracting sites, which datastores are good for this? I was looking at Druid https://en.wikipedia.org/wiki/Druid_(open-source_data_store)
and Elasticsearch. Elasticsearch is a full-text search engine first and happens to be able to do some of this, but it can't really compute derived values from the data. E.g., if I capture productivity % for 10-minute buckets and then also want productivity % for 20-minute buckets and hour buckets, and to show trends over time, I need to recalculate the productivity % at each granularity. I'm wondering if there is a good set of tools for doing this. This isn't my forte, so I'd really appreciate some direction.
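
To make the re-bucketing problem concrete, here is a minimal Python sketch, independent of any particular datastore. It assumes each activity is a hypothetical `(start, end, is_productive)` tuple with times in seconds; the names `bucketize` and `productivity_pct` are made up for illustration. Each activity is split at bucket boundaries, and productive and total seconds are accumulated per bucket, so the percentage can be computed at any bucket width from the same raw events.

```python
from collections import defaultdict

def bucketize(events, bucket_seconds):
    """Split (start, end, is_productive) intervals across fixed-width buckets.

    Returns {bucket_start: [productive_seconds, total_seconds]}.
    """
    totals = defaultdict(lambda: [0, 0])
    for start, end, productive in events:
        t = start
        while t < end:
            bucket_start = t - t % bucket_seconds
            # Clip the activity at the end of the current bucket.
            seg_end = min(end, bucket_start + bucket_seconds)
            span = seg_end - t
            totals[bucket_start][1] += span
            if productive:
                totals[bucket_start][0] += span
            t = seg_end
    return dict(totals)

def productivity_pct(totals):
    """Turn per-bucket [productive, total] seconds into percentages."""
    return {b: 100.0 * p / n for b, (p, n) in sorted(totals.items()) if n}

# Hypothetical sample data: times are seconds from an arbitrary origin.
events = [
    (0, 900, True),      # 15 min working
    (900, 1200, False),  # 5 min on a distracting site
    (1200, 2400, True),  # 20 min working
]

ten = productivity_pct(bucketize(events, 600))      # 10-minute buckets
twenty = productivity_pct(bucketize(events, 1200))  # 20-minute buckets
print(ten)     # {0: 100.0, 600: 50.0, 1200: 100.0, 1800: 100.0}
print(twenty)  # {0: 75.0, 1200: 100.0}
```

The same events give 50% for the second 10-minute bucket and 75% for the first 20-minute bucket, which is the recalculation described above.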
======
seektable
Take a look at Yandex ClickHouse, an open-source, append-only analytical
database. It offers excellent performance for OLAP queries over data such as
event logs, and its SQL dialect includes many specialized functions for
metrics calculation.

------
mattbillenstein
How much data? Probably plays a big part in picking a solution.

I've most recently been using BigQuery for stuff like this -- streaming the
data into BQ, running rollups using SQL there and either saving those results
into new tables or pulling back the result set and inserting it into another
database.
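
As a rough illustration of what such a rollup computes (plain Python rather than BigQuery SQL, with a hypothetical `rollup` helper and made-up numbers): if the fine-grained table stores productive and total seconds per bucket instead of a percentage, coarser buckets can be derived by summing those two columns and dividing afterwards. Averaging the stored percentages would weight a sparse bucket the same as a full one.

```python
from collections import defaultdict

def rollup(fine, coarse_seconds):
    """Sum (productive, total) seconds from fine buckets into coarser ones."""
    out = defaultdict(lambda: [0, 0])
    for bucket_start, (productive, total) in fine.items():
        key = bucket_start - bucket_start % coarse_seconds
        out[key][0] += productive
        out[key][1] += total
    return {k: tuple(v) for k, v in out.items()}

# Hypothetical 10-minute rows: bucket_start -> (productive_s, total_s).
fine = {
    0: (600, 600),  # fully tracked, all productive (100%)
    600: (0, 200),  # only 200 s tracked, none productive (0%)
}

coarse = rollup(fine, 1200)  # 20-minute buckets
pct = 100.0 * coarse[0][0] / coarse[0][1]
print(coarse)  # {0: (600, 800)}
print(pct)     # 75.0, whereas averaging the two percentages would give 50.0
```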

