I'm Mike Schroll, CTO of DNSFilter - Happy to answer any questions about our exp...

camel_gopher · on March 7, 2018

You mentioned 150M queries per day. Are those inserts or reads? That works out to about 1,700 per second, which to be honest doesn't seem like a lot for time series metrics ingest. I would expect a single node on most TSDBs to handle at least ~100 times that.

Can you talk about the data you were ingesting? Was it numerics, text, or something else?

(Disclosure: I work for TSDB provider IRONdb, http://irondb.io)

LogicX · on March 8, 2018

Hi there -- We are currently ingesting 150M queries/day, though they're not evenly distributed; it probably peaks around 3,000 qps.

Agreed that it's not too much yet. I think what kills us is having to maintain rollup tables, and then query against both the raw data and the rollup data for our customer analytics dashboard.

Each record has 22 fields: a combination of date/time, strings, and ints.
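
For anyone checking the numbers in this exchange, here is a quick back-of-the-envelope sketch in Python. The two inputs are the figures quoted in the comments above (150M/day and a rough 3,000 qps peak); the variable names are just illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

daily_queries = 150_000_000  # daily volume quoted in the thread
peak_qps = 3_000             # rough peak estimate quoted in the thread

# Average rate if the load were spread evenly over the day.
average_qps = daily_queries / SECONDS_PER_DAY

# How much higher the estimated peak is than that average.
peak_to_average = peak_qps / average_qps

print(f"average: {average_qps:,.0f} qps")
print(f"peak/average: {peak_to_average:.2f}x")
```

So the "about 1,700 per second" figure is the daily average (roughly 1,736), and the estimated peak is a bit under twice that.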

ryanworl · on March 8, 2018

I know I mentioned this in another comment, but you really should check out ClickHouse. They have a table engine for exactly this purpose.

You create a table with the raw logs, then a materialized view (or another actual table) which declaratively does the rollup for you in real time.

https://clickhouse.yandex/docs/en/table_engines/aggregatingm...

A full example: https://www.altinity.com/blog/2017/7/10/clickhouse-aggregate...
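
To make the raw-table-plus-rollup idea concrete, here is a minimal conceptual sketch in plain Python. It is not ClickHouse code and does not use any ClickHouse API; it only illustrates the pattern of updating an aggregate at insert time so the dashboard never has to scan the raw rows. The three-column row layout, the hourly bucket, and the names `insert_row`, `hourly_rollup`, and `queries_per_hour` are all assumptions made up for the illustration (the real records reportedly have 22 fields).

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw log: one tuple per DNS query (timestamp, customer_id, domain).
raw_log = []

# Hypothetical rollup: (hour bucket, customer_id) -> number of queries.
hourly_rollup = defaultdict(int)


def insert_row(ts, customer_id, domain):
    """Append to the raw log and update the rollup in the same step.

    This mimics, in spirit, what a materialized view does on insert:
    the aggregate is maintained incrementally rather than recomputed.
    """
    raw_log.append((ts, customer_id, domain))
    hour = ts.replace(minute=0, second=0, microsecond=0)
    hourly_rollup[(hour, customer_id)] += 1


def queries_per_hour(customer_id):
    """Dashboard-style read that touches only the rollup, not the raw log."""
    return sorted(
        (hour, count)
        for (hour, cid), count in hourly_rollup.items()
        if cid == customer_id
    )


insert_row(datetime(2018, 3, 8, 10, 5), 1, "example.com")
insert_row(datetime(2018, 3, 8, 10, 45), 1, "example.org")
insert_row(datetime(2018, 3, 8, 11, 2), 1, "example.com")
insert_row(datetime(2018, 3, 8, 10, 30), 2, "example.net")

for hour, count in queries_per_hour(1):
    print(hour.isoformat(), count)
```

In a real deployment the database would maintain the aggregate for you; the point of the sketch is only that reads hit the small rollup while the raw rows stay available for ad hoc queries.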