
Ask HN: Perfect tooling for high-availability statistics? - vernondegoede1
I’m working at a financial company that has 80K+ active clients throughout Europe. We have a dashboard that allows our users to see financial statistics. A couple of years ago, we introduced ElasticSearch, which indexes all payments. While this was initially just used for transaction searching & filtering, at some point we decided to also use it for transaction statistics.

Given our current scale (hundreds of millions of payments), however, we’re running into performance issues with our current ElasticSearch setup. While it still works perfectly for filtering, we realized that ElasticSearch may not be the right tool for statistics, as it’s becoming too resource intensive. We’re currently looking into alternatives, but we’re not sure what to use.

What tools would you suggest for this?
======
bckygldstn
Analytics workloads are often a good fit for columnar databases. Popular
examples are Redshift (AWS), Vertica (enterprise), and Clickhouse (open
source).

Columnar databases are awesome: for the right kinds of task the speedup can be
multiple orders of magnitude. They excel at filtering and aggregating a subset
of columns, storing sparse or slowly-changing data, and timeseries operations.
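
To make the column-vs-row point concrete, here is a toy Python sketch. It is not how any of the databases above are implemented; the field names (`client`, `currency`, `amount`, `note`) are made up for illustration. It only shows why an aggregate over two columns can skip everything else, and why rebuilding a single row is the expensive direction.

```python
# Toy illustration only: the same payments stored row-wise and column-wise.
# All field names here are hypothetical.
rows = [
    {"id": 1, "client": "a", "currency": "EUR", "amount": 10.0, "note": "x"},
    {"id": 2, "client": "b", "currency": "USD", "amount": 5.0, "note": "y"},
    {"id": 3, "client": "a", "currency": "EUR", "amount": 2.5, "note": "z"},
]

# Column layout: one list per field instead of one dict per payment.
columns = {key: [row[key] for row in rows] for key in rows[0]}


def sum_eur_row_store(rows):
    # Every full row is visited, including fields the query never uses.
    return sum(r["amount"] for r in rows if r["currency"] == "EUR")


def sum_eur_column_store(columns):
    # Only the two columns the query needs are read.
    pairs = zip(columns["amount"], columns["currency"])
    return sum(amount for amount, currency in pairs if currency == "EUR")


def read_row_column_store(columns, i):
    # The flip side: rebuilding one row has to touch every column.
    return {key: values[i] for key, values in columns.items()}
```

Both sum functions return the same total; the difference in a real system is how much data has to be read from disk to get there.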

Of course there are always tradeoffs. Columnar databases tend to suck at
reading individual rows, reading a large number of columns, and heavy writes.

Another option is of course to put a caching layer between the dashboard and
ElasticSearch, and precompute common queries e.g. daily.
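
A minimal sketch of the precompute idea, assuming payments can be reduced to a day, a currency, and an amount (hypothetical field names, not your schema). A scheduled job would build the rollup, and the dashboard would read from it instead of hitting ElasticSearch for every request.

```python
from collections import defaultdict
from datetime import date


def precompute_daily_totals(payments):
    """Roll raw payments up into one total per (day, currency)."""
    totals = defaultdict(float)
    for p in payments:
        totals[(p["day"], p["currency"])] += p["amount"]
    return dict(totals)


def total_for_range(cache, currency, start, end):
    """Answer a dashboard query from the rollup instead of the raw index."""
    return sum(
        value
        for (day, cur), value in cache.items()
        if cur == currency and start <= day <= end
    )


# Hypothetical sample data standing in for the payments index.
payments = [
    {"day": date(2020, 1, 1), "currency": "EUR", "amount": 10.0},
    {"day": date(2020, 1, 1), "currency": "EUR", "amount": 5.0},
    {"day": date(2020, 1, 2), "currency": "EUR", "amount": 2.5},
    {"day": date(2020, 1, 2), "currency": "USD", "amount": 7.0},
]
cache = precompute_daily_totals(payments)
```

The rollup grows with days times currencies rather than with the number of payments, which is why it stays cheap to query. The cost is staleness: anything newer than the last rollup run is missing until the next one.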

Feel free to message me if you want to chat, email in profile.

------
hodgesrm
ClickHouse is a common replacement for ElasticSearch in cases where data
consists of structured records. ContentSquare publicly reported an 11x
decrease in cost and 10x faster 99th percentile queries after migration. [1]
Others have seen similar results.

[1] https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup30/contentsquare.pdf

Disclaimer: some of the "others" are customers of my employer Altinity, which
offers support for ClickHouse.

------
snikolaev
Do you need full-text search? I think the answer depends on that, as there are
only a few open source technologies that do full-text search (ElasticSearch,
SOLR, Manticore Search, and a couple of others) and LOTS of others that don't
but are much better at pure analytics. If you don't need it, Clickhouse should
be a good fit.

------
verdverm
Time series DB maybe?

------
amypinka
Kdb

