
Ask HN: Splunk vs. ELK vs. others? - jharohit
I have used Splunk extensively in a previous corporate banking tech job. Now, with my own startup, I am wondering which way to go. ELK and alternatives like Prometheus have gotten much better over time.

Can anyone recommend whether the cost of Splunk is worth it for a startup, or whether it's better to invest time in an open-source solution?
======
dfboyd
Splunk and Sumologic both have query languages, store the raw log lines as-is,
and let you parse and analyze them later. ELK doesn't work that way: you have
to configure it to understand the logs before it can index and store them, and
there is no query language for re-parsing and analyzing them afterward -- all
you can do is search on the fields you've already indexed. ELK can't, for
instance, store 20 GB of logs today with no parsing applied, and then let you
parse out key-value pairs and graph all the latency numbers next week.
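The store-raw-now, parse-later idea can be sketched in a few lines of Python (the log format, service name, and field names here are invented for illustration):

```python
import re

# Raw log lines stored as-is (schema-on-read): nothing was parsed at ingest.
raw_logs = [
    "2024-01-05T10:00:01Z svc=checkout latency_ms=120 status=200",
    "2024-01-05T10:00:02Z svc=checkout latency_ms=340 status=200",
    "2024-01-05T10:00:03Z svc=search latency_ms=45 status=500",
]

def extract_fields(line):
    """Pull key=value pairs out of a raw line at query time."""
    return dict(re.findall(r"(\w+)=(\S+)", line))

# "Next week" we decide we care about checkout latency:
latencies = [
    int(f["latency_ms"])
    for f in map(extract_fields, raw_logs)
    if f.get("svc") == "checkout"
]
print(sum(latencies) / len(latencies))  # 230.0
```

Splunk's SPL does this parse-at-query-time step for you; with ELK you'd have had to define those fields at ingest.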

If you have more money than time, use Splunk.

If you are on AWS, dump your logs into Redshift -- you already know SQL, you
don't have to learn another query language, and it's straightforward to set up
extract/transform/load jobs. If you're on GCP, I'd investigate BigQuery before
bothering with Splunk or Sumologic.
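Redshift and BigQuery aren't something you can run locally, but the workflow -- load raw log records into a table, then explore with plain SQL -- can be sketched with Python's built-in sqlite3 as a stand-in (table name, fields, and values are made up):

```python
import json
import sqlite3

# Local stand-in for Redshift/BigQuery: load JSON log lines into a
# table, then explore them with ordinary SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (ts TEXT, service TEXT, latency_ms INTEGER)")

log_lines = [
    '{"ts": "2024-01-05T10:00:01Z", "service": "api", "latency_ms": 120}',
    '{"ts": "2024-01-05T10:00:02Z", "service": "api", "latency_ms": 80}',
    '{"ts": "2024-01-05T10:00:03Z", "service": "worker", "latency_ms": 300}',
]
rows = [(r["ts"], r["service"], r["latency_ms"])
        for r in map(json.loads, log_lines)]
conn.executemany("INSERT INTO logs VALUES (?, ?, ?)", rows)

# The same shape of query you'd write in Redshift or BigQuery:
for service, avg_ms in conn.execute(
        "SELECT service, AVG(latency_ms) FROM logs "
        "GROUP BY service ORDER BY service"):
    print(service, avg_ms)  # api 100.0, then worker 300.0
```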

~~~
scapecast
If you are a start-up, I second the approach of dumping logs into Redshift.
Lots of start-ups are doing exactly that (including yours truly).

Few things to keep in mind when you do that:

Create separate schemas for your raw data and your analysis. Dump the logs
into a "raw schema", run your aggregations, and write the results to a "data
schema". Once your data grows and you're running more reports, it will make
your life much easier.

Separate your users from the start. Create one user each for your data loads,
your aggregations, and your ad-hoc analysis. As you ramp up query volume, the
separation of concerns will make it easier to use the Redshift workload
manager and keep concurrency high.
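A rough sketch of that setup in Redshift-style SQL (schema names, user names, and passwords are placeholders, not a tested script):

```sql
-- Separate schemas: raw logs land in one, aggregated results go in the other.
CREATE SCHEMA raw_data;
CREATE SCHEMA analytics;

-- One user per concern, so the workload manager can tell them apart.
CREATE USER loader      PASSWORD '...';  -- data loads
CREATE USER transformer PASSWORD '...';  -- aggregations
CREATE USER analyst     PASSWORD '...';  -- ad-hoc analysis

GRANT ALL   ON SCHEMA raw_data  TO loader;
GRANT USAGE ON SCHEMA raw_data  TO transformer;
GRANT ALL   ON SCHEMA analytics TO transformer;
GRANT USAGE ON SCHEMA analytics TO analyst;
```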

Ping me if you have more questions about the set-up. lars @ intermix dot io

------
aprdm
We have been using the ELK stack and so far we are very happy with it.

ELK handles both centralized logging and service metrics for us.

Usually our services log to syslog, which ends up in a file like
/var/log/$service.log, and then we use Filebeat (also from Elastic) to ship
that file to an ELK server.
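A minimal filebeat.yml along those lines might look like this (hostname and paths are placeholders; check the Filebeat docs for your version):

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/*.log

# Ship to Logstash on the ELK server, which parses and
# forwards to Elasticsearch.
output.logstash:
  hosts: ["elk.internal:5044"]
```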

------
mattbillenstein
Use Fluentd to parse the logs to JSON, then the BigQuery sink to put the data
in there. Then just do exploration on that data in SQL.
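Fluentd itself is configured in its own DSL, but the parsing step it performs -- raw lines in, newline-delimited JSON out, the format BigQuery load jobs accept -- can be sketched in Python (the access-log format and field names below are invented):

```python
import json
import re

# Turn raw access-log-style lines into newline-delimited JSON,
# the way a Fluentd tail + parse + BigQuery pipeline would.
LINE_RE = re.compile(r"(?P<ts>\S+) (?P<method>\S+) (?P<path>\S+) (?P<status>\d+)")

def to_ndjson(lines):
    out = []
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            rec = m.groupdict()
            rec["status"] = int(rec["status"])
            out.append(json.dumps(rec))
    return "\n".join(out)

raw = ["2024-01-05T10:00:01Z GET /api/users 200",
       "2024-01-05T10:00:02Z POST /api/orders 500"]
print(to_ndjson(raw))
```

Each output line is one JSON record, ready for a BigQuery load job or streaming insert.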

------
iamdave
I've been in an ELK shop before, and will definitely be putting one up first
chance I get with my new gig.

