
Does anyone know of any other prod-ready Elasticsearch alternatives? I'm working on a logging-infrastructure project for shipping syslog, and it seems no one these days just uses plain central syslog anymore; ES is the standard, but it seems bloated.

I've been tempted to just ship straight to a DB and skip all these crazy shippers and parsers and all the other middlemen in the equation.

Also, why has no product unified monitoring and logging? AFAIK that's what makes Splunk worth it, if you have the budget (I don't).




Grafana's new Loki [1] looks very promising. It bills itself as "Prometheus for logging". It doesn't index the text of messages, but it does allow fast filtering predicates on labels and time.

[1] https://grafana.com/loki
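To make the "labels, not full-text" model concrete, here's a minimal sketch of building a query against Loki's HTTP range-query endpoint. The endpoint path and the example selector are illustrative; check them against your Loki version's API docs.

```python
from urllib.parse import urlencode

def loki_query_url(base, selector, start_ns, end_ns):
    """Build a Loki range-query URL for a label selector.

    Loki selects streams by labels (like Prometheus), then greps the
    matching log lines; it does not full-text index message bodies.
    """
    # /loki/api/v1/query_range is the path in current Loki releases;
    # adjust if your version exposes a different API prefix.
    params = urlencode({
        "query": selector,   # e.g. '{job="syslog"} |= "error"'
        "start": start_ns,   # nanosecond timestamps
        "end": end_ns,
    })
    return f"{base}/loki/api/v1/query_range?{params}"

url = loki_query_url("http://localhost:3100",
                     '{job="syslog"} |= "error"', 0, 1)
```

The cheap part is the label match (`{job="syslog"}`); the `|= "error"` filter is a scan over only the matching streams.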


ES is not "bloated"; it does require decent hardware, but so does anything else that's taking in gigabytes of data an hour and making it all searchable and indexed. Why take on some weird, less-supported mechanism because of a lack of confidence in Elasticsearch?

> I've been tempted to just ship straight to a dB and skip all these crazy shippers and parsers and all the other middle men in the equation.

You need the parsers. You want to find a needle in a haystack? Good luck without having things broken down into proper fields with metadata.
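A tiny sketch of what "broken down into proper fields" buys you: even a rough RFC 3164-style syslog parser turns a raw line into queryable metadata. (Hypothetical regex for illustration; real shippers handle far more edge cases.)

```python
import re

# Rough RFC 3164 shape: "<PRI>Mmm dd hh:mm:ss host tag[pid]: message".
# A real shipper handles many more variants; this just shows why
# structured fields beat grepping raw text.
SYSLOG_RE = re.compile(
    r"<(?P<pri>\d+)>"
    r"(?P<timestamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[\w\-/\.]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)"
)

def parse_syslog(line):
    m = SYSLOG_RE.match(line)
    if not m:
        return None
    fields = m.groupdict()
    # PRI encodes facility and severity in one integer.
    fields["facility"], fields["severity"] = divmod(int(fields["pri"]), 8)
    return fields

rec = parse_syslog("<34>Jan  5 10:15:32 host1 sshd[212]: Failed password for root")
```

Once `host`, `tag`, and `severity` are real fields, "all critical sshd events on host1 last night" is a filter, not a grep.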

You need the shippers. Elastic Beats has full backpressure support so that when your cluster is busy, it can intelligently back off. Otherwise, you'll drop logs, or overwhelm the system to the point of uselessness....

> why has no product unified monitoring and logging?

Metricbeat from Elastic, with Grafana pointed at Elasticsearch, gets you better dashboards and alerting.

Please don't reinvent the wheel on this one. Deploying ELK + Beats + Grafana is not that hard, there's tons of documentation, and it is a very stable product.


Back when I worked at Google, the standard log processing tool was Dremel. You could get exactly the same thing by shipping your logs to BigQuery.

I haven't checked, but I bet it's cheaper than ES for data that's mostly cold, like logs. You will need a separate monitoring solution though.


If you want streaming inserts to BQ, those become the biggest cost. Dataflow could be used to turn inserts into batches and to gather interesting metrics that you don't want to hit BQ for, but I don't think anyone has open-sourced anything in this space. I've implemented streaming inserts to BQ for logs "at scale" and it was still at least an order of magnitude cheaper than Splunk. Happy to talk via email.
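The cost-saving idea above, batching rows instead of streaming them one at a time, can be sketched like this. `flush_fn` stands in for whatever actually loads the batch (e.g. a BigQuery load job or `insert_rows_json` call); the class name and thresholds are illustrative.

```python
import time

class LogBatcher:
    """Buffer log rows and flush in batches, so each load/insert call
    carries many rows instead of one row per streaming insert."""

    def __init__(self, flush_fn, max_rows=500, max_age_s=5.0):
        self.flush_fn = flush_fn    # called with a list of rows
        self.max_rows = max_rows
        self.max_age_s = max_age_s
        self.rows = []
        self.oldest = None          # monotonic time of first buffered row

    def add(self, row):
        if self.oldest is None:
            self.oldest = time.monotonic()
        self.rows.append(row)
        # Flush on size or on age, whichever trips first.
        if (len(self.rows) >= self.max_rows
                or time.monotonic() - self.oldest >= self.max_age_s):
            self.flush()

    def flush(self):
        if self.rows:
            self.flush_fn(self.rows)
            self.rows, self.oldest = [], None

batches = []
b = LogBatcher(batches.append, max_rows=3)
for i in range(7):
    b.add({"line": i})
b.flush()  # drain the remainder
```

The age bound keeps tail latency predictable when traffic is light; the size bound keeps per-call overhead amortized when it's heavy.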


What was the monitoring solution in the end? Where were the BQ results going?


It was generic log aggregation used mostly for incident response and forensics, as well as some offline metrics. There were a bunch of metrics being created (in about 3 different ways) on-box with parsing, which we were looking at moving into the log-processing stream. We had a chat bot people could use to run common queries, as well as standard SQL interaction via UI and API, auth'd by Google IAM.


We use Spark on HDFS.


IMO it is not as widespread because it isn't a good practice. Health monitoring, metrics, and logging are orthogonal and really should be handled separately. Monitoring makes sure everything is working properly, metrics are about understanding how it is being used, and logging is about peering inside at what is happening. Conflating them hinders their application and makes them less useful.


I like the separation of concerns. Do you have an example where conflating these leads to issues?


When a customer complains that something's not right, and you check your logs, and they've been spewing millions of alarming messages for hours that you wish you had seen before the customer noticed, that's when you wish the programmer who wrote those log lines had used the health monitoring framework instead of the logging framework.
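The difference in code is small but the operational effect is large. A toy sketch (the dict counter stands in for a real metrics client like prometheus_client or statsd; names are illustrative):

```python
import logging

logger = logging.getLogger("payments")

# Toy stand-in for a real metrics client; the point is that an
# alerting system watches this counter and pages on a threshold,
# instead of someone grepping log volume after the fact.
ERROR_COUNT = {"payment_failures": 0}

def handle_payment_error(exc):
    # Log line: for the human reconstructing the story later.
    logger.error("payment failed: %s", exc)
    # Metric: for the alerting system watching a threshold now.
    ERROR_COUNT["payment_failures"] += 1

for _ in range(3):
    handle_payment_error(RuntimeError("card declined"))
```

If the error path only logs, the millions of alarming messages sit unread; if it also bumps a counter, the alert fires at failure number fifty, not at the customer complaint.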


> health monitoring framework instead of the logging framework.

What is the difference?

Can't you just extract metrics from logs via ES? Logstash even has a prebuilt JAVASTACKTRACEPART pattern for Java exceptions.


That's user/dev error, not a fault of the system, nor a problem with handling it all in the same framework.

Metrics can always be generated from logs, especially structured payloads.
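For structured payloads this really is just an aggregation over fields. A minimal sketch (field names are illustrative):

```python
import json
from collections import Counter

# Structured log lines as newline-delimited JSON, e.g. from a shipper.
lines = [
    '{"service": "api", "level": "error", "msg": "timeout"}',
    '{"service": "api", "level": "info",  "msg": "ok"}',
    '{"service": "db",  "level": "error", "msg": "deadlock"}',
]

# A metric derived from logs: error count per service.
# No grok patterns or full-text search needed once fields exist.
errors = Counter(
    rec["service"]
    for rec in map(json.loads, lines)
    if rec["level"] == "error"
)
```

Divide by a time window and you have an error rate; the same aggregation is what an ES terms query or a SQL GROUP BY would do server-side.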


You might take a look at https://blevesearch.com/, written in Go.


If you're looking for a SaaS solution, check out DataDog. They offer monitoring and logging (and a bit more) as a packaged service.


As of 6.5, the Elastic Stack ships with a logging app and an infrastructure monitoring app out of the box in Kibana. They are both new, so expect a bunch of new features in 6.6 onward.

The docs have more info about both: https://www.elastic.co/guide/en/infrastructure/guide/current...

Disclaimer: I work on Kibana at Elastic


Clickhouse is a good (self hosted) alternative to Elasticsearch for log storage: it saves a lot of space due to better compression, it supports sql (with regex search instead of useless by-word indexing), and ingestion speed is great.


> (with regex search instead of useless by-word indexing)

Perhaps I misunderstand your situation, but I don't see any "CREATE INDEX" available in Clickhouse, and thus won't "SELECT * FROM logs WHERE match(message, '(?i)error.*database')" require a full column-scan (including, as you mentioned, decompressing it)? Versus the very idea of an indexer like ES is "give me all documents that have the token 'ERROR' and the token 'database'" which need not tablescan anything

I only learned about the project 9 minutes ago so any experiences you can share about the actual performance of those queries would be enlightening -- maybe it's so fast that my concern isn't relevant


Clickhouse is designed for full table scans. It allows one index per table, usually a compound key including the date as the leftmost part of the key. This allows it to eliminate blocks of data that don’t contain relevant time ranges. It is also a column store, so the data being read is only the columns used in the query.

If your query is conceptually linearly scalable, Clickhouse is also linearly scalable. Per-core performance is also pretty good (tens of millions of rows per second on good hardware with simple queries, which most log-aggregation queries are).
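Putting the two points together, a date-leading key plus `match()` over one column, a query against a hypothetical `logs` table might be built like this. Table and column names are assumptions for illustration; a real client would POST the string to ClickHouse's HTTP interface (port 8123 by default).

```python
from datetime import date

def build_log_query(day: date, pattern: str) -> str:
    """Query against a hypothetical `logs` table whose primary key
    starts with `date`.

    Filtering on the leading key column lets ClickHouse skip whole
    data blocks; match() then regex-scans only the surviving values
    of the single `message` column (column store: other columns are
    never read).
    """
    return (
        "SELECT timestamp, host, message FROM logs "
        f"WHERE date = '{day.isoformat()}' "
        f"AND match(message, '{pattern}') "
        "ORDER BY timestamp"
    )

sql = build_log_query(date(2019, 1, 15), "(?i)error.*database")
# e.g. urllib.request.urlopen("http://localhost:8123/", data=sql.encode())
```

So the earlier concern about full column scans is real, but the scan is bounded by the date range and by reading only one column, which is why it stays fast in practice.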


Clickhouse (like any other SQL DB) would work great if you could chop up your log files into fields and store one type per DB. Elasticsearch is great for this because you don't have to worry about the schema; with ClickHouse you will... unless you use two arrays: one for the field name, and one for the field value.

If you value being able to store arbitrary log files, ClickHouse is not for you. If you want to build your system to generate tables on the fly- ClickHouse might work.

See: https://github.com/flant/loghouse
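The "two arrays" trick amounts to flattening an arbitrary JSON record into parallel name/value arrays so one fixed table schema can hold any log shape. A minimal sketch (the loghouse-style column names mentioned in the comment are an assumption):

```python
def to_kv_arrays(record, prefix=""):
    """Flatten an arbitrary JSON-ish dict into parallel key/value
    arrays, the layout a fixed ClickHouse table with e.g.
    `names Array(String), values Array(String)` columns can hold
    regardless of the log's shape."""
    keys, values = [], []
    for k, v in record.items():
        name = f"{prefix}{k}"
        if isinstance(v, dict):
            # Recurse into nested objects, dotting the path.
            sub_k, sub_v = to_kv_arrays(v, prefix=f"{name}.")
            keys += sub_k
            values += sub_v
        else:
            keys.append(name)
            values.append(str(v))
    return keys, values

keys, values = to_kv_arrays(
    {"level": "error", "http": {"status": 500, "path": "/login"}}
)
```

The price is that everything is stringly typed and queries go through array functions instead of plain columns, which is the schema-flexibility trade-off the comment is pointing at.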


You can use materialized views in ClickHouse to simulate secondary indexes. See https://www.percona.com/blog/2019/01/14/should-you-use-click... for an example of this usage. It's about half-way through the article.

Disclaimer: I work for Altinity which is commercializing ClickHouse.


Yes. The TICK stack.

Namely InfluxDB and friends


Since when can you send logs (not just time series) to Influx? That would certainly be news to me, as a huge TICK stack user.


You can send logs to Influx no problem. Telegraf (the collection agent from the Influx team) has a logparser input plugin that can parse logs via patterns; it understands Apache logs by default. The catch is that there is no full-text search. It's still useful, as you can quickly find logs to view by searching the fields that have been parsed.


That's really funny. My employer is one of the larger users of InfluxDB I'm aware of, so much so that we had to write some software[1] to overcome scaling limitations (in synthetic benchmarks it held up just fine at 500k metrics per second), and I didn't know this. Thanks for the cluebat! TIL!

[1] https://jumptrading.github.io/influxdb/metrics/monitoring/de...


Try https://getseq.net/

It was built for .NET devs using Serilog but can accept structured logs/json over HTTP.
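Seq's ingestion format is compact newline-delimited JSON events (CLEF), where reserved fields start with `@` and everything else becomes a searchable property. A hedged sketch; field names follow the CLEF convention, but check the endpoint and format against your Seq version before relying on this.

```python
import json
from datetime import datetime, timezone

def clef_event(template, level="Information", **props):
    """One event in Seq's compact JSON (CLEF) shape: '@'-prefixed
    fields are reserved (@t timestamp, @mt message template, @l
    level); all other keys become structured, searchable properties."""
    event = {
        "@t": datetime.now(timezone.utc).isoformat(),
        "@mt": template,
        "@l": level,
    }
    event.update(props)
    return json.dumps(event)

# Newline-delimited events, POSTed to something like
# http://seq-host:5341/api/events/raw (endpoint is an assumption).
payload = "\n".join([
    clef_event("User {user} logged in", user="alice"),
    clef_event("Disk {disk} nearly full", level="Warning", disk="/dev/sda1"),
])
```

Because `{user}` stays a template rather than being baked into the string, Seq can index `user` as a first-class property, which is the structured-logging payoff.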


Loki from Grafana.



Check out Vespa: https://vespa.ai


> Also, why has no product unified monitoring and logging?

In my experience SumoLogic is excellent for this.


just ship to spark?



