
Show HN: Dendrite, a golang log parser - fizx
https://github.com/onemorecloud/dendrite
======
fizx
Author here. I wrote this to power a distributed tail of all of our logs on
websolr.com. This lets us do more advanced monitoring, and we'll start
exposing logs and insights from logs to users directly pretty soon. Golang was
a bunch of fun for this project, and lets us ship cross-platform binaries with
no dependencies, which makes our lives easier on the deploy side.

------
conorh
I notice you are using the Go Regexp library for extracting log data. Have you
noticed any performance issues with the regexps, or any bump in performance
between 1.0 and 1.1? It appears to be one of the parts of Go that doesn't
perform very well yet, for example -
[http://benchmarksgame.alioth.debian.org/u64q/benchmark.php?t...](http://benchmarksgame.alioth.debian.org/u64q/benchmark.php?test=all&lang=go&lang2=java&data=u64q)

~~~
fizx
Regexps are slow-ish. Several MB/second/core for applying the pattern in
[https://github.com/onemorecloud/dendrite/blob/master/cookboo...](https://github.com/onemorecloud/dendrite/blob/master/cookbook/solr.yaml),
vs pcre at dozens of MB/second/core. We'll probably put pcre into Dendrite
sooner or later.

Dendrite is actually a port of an unreleased library I wrote in C with pcre.
For us, the slowdown was well worth the productivity we got from moving to
Go: at our log velocity of hundreds of lines per second per box, it still
looks like 0% CPU in top.
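
For readers wondering what the regexp extraction step amounts to, here is a
minimal sketch using Go's standard `regexp` package with named capture groups.
The pattern, the sample line, and the `parseLine` helper are assumptions made
up for illustration; they are not taken from Dendrite or from solr.yaml.

```go
package main

import (
	"fmt"
	"regexp"
)

// linePattern is a hypothetical pattern for lines such as
// "2013-05-20 12:00:01 INFO request took 42ms".
// It is compiled once, since compiling per line would be wasteful.
var linePattern = regexp.MustCompile(
	`^(?P<date>\S+) (?P<time>\S+) (?P<level>[A-Z]+) (?P<message>.*)$`)

// parseLine returns the named capture groups as a map,
// or nil if the line does not match the pattern.
func parseLine(line string) map[string]string {
	m := linePattern.FindStringSubmatch(line)
	if m == nil {
		return nil
	}
	fields := make(map[string]string)
	for i, name := range linePattern.SubexpNames() {
		if name != "" { // index 0 and unnamed groups have an empty name
			fields[name] = m[i]
		}
	}
	return fields
}

func main() {
	fmt.Println(parseLine("2013-05-20 12:00:01 INFO request took 42ms"))
}
```

Swapping in a PCRE binding would presumably only change how `linePattern` is
built and matched; the map-building loop would stay much the same.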

~~~
voidlogic
This is a near drop-in replacement for "regexp" that uses PCRE:
<https://github.com/glenn-brown/golang-pkg-pcre>

~~~
fizx
I'm pretty interested, but most concerned about keeping cross-compilation
intact for ease-of-install as the community grows. Sooner or later, we'll have
the resources to do cross-compilation and pcre. If you want to jump on the
google group or github and convince me otherwise, I'm happy to listen.

------
daemon13
This is an excellent start! I have a few questions:

Why did you use yaml for configs? Yaml is/can be kind of fragile (whitespace,
etc). Did you consider something more practical, such as the nginx config
format?

How do I set up a log collector? For example, I need to collect logs from
several instances. I could not find this in the cookbook.

How do you separate logs for different hosts/applications?

How do you ship logs? UDP? TCP? TLS?

How does this compare to Logstash?

Thank you!

~~~
fizx
> Why did you use yaml for configs? Yaml is/can be kind of fragile
> (whitespace, etc). Did you consider something more practical, such as the
> nginx config format?

Yaml is the most common reasonably suitable format for things that look like
config files. It's familiar to more developers than nginx config, so we hope
more people can contribute.

> How do I set up a log collector? For example, I need to collect logs from
> several instances. I could not find this in the cookbook.

Did you look at the tutorial?
[https://github.com/onemorecloud/dendrite/blob/master/tutoria...](https://github.com/onemorecloud/dendrite/blob/master/tutorial.md)

> How do you separate logs for different hosts/applications?

Right now, the output map has a few automatically inserted keys: the hostname,
the shipping timestamp, and the name of the application. This isn't terribly
well-documented.
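
As a rough illustration of what automatically inserted keys could look like,
here is a small Go sketch. The key names (`hostname`, `shipped_at`, `app`) and
the `annotate` helper are hypothetical; Dendrite's real key names are not
given in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"time"
)

// annotate copies the parsed fields and adds three metadata keys.
// The key names are assumptions for illustration only.
func annotate(fields map[string]string, app, host, shippedAt string) map[string]string {
	out := make(map[string]string, len(fields)+3)
	for k, v := range fields {
		out[k] = v
	}
	out["hostname"] = host
	out["shipped_at"] = shippedAt
	out["app"] = app
	return out
}

func main() {
	host, err := os.Hostname()
	if err != nil {
		host = "unknown"
	}
	now := time.Now().UTC().Format(time.RFC3339)
	rec := annotate(map[string]string{"level": "INFO"}, "solr", host, now)
	b, err := json.Marshal(rec)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(b))
}
```

A downstream store can then filter or group on those keys to separate hosts
and applications.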

> How do you ship logs? UDP? TCP? TLS?

Yes ;)

> How does this compare to Logstash?

Dendrite isn't trying to be Logstash. I'd like dendrite to be the agent you
use with logstash, Graylog, papertrail, or whatever. I think the agnostic
nature of dendrite will be a big win, as dev/ops people can persist their data
to multiple stores, or swap out stores more easily. The application we wrote
Dendrite for currently persists logs to three different stores.

~~~
daemon13
>> How do you ship logs? UDP? TCP? TLS?

> Yes ;)

ok, missed it, checked again. UDP, TCP - yes. TLS - not yet?

I checked the tutorial, nothing there. Since most [semi]serious projects have
at least a few servers, I suggest adding a separate example, something like:

1. We have 5 servers. And we collect logs to server 6.
2. These are settings you need to use for 5 servers.
3. And these are settings you need to use for the collector
   [IP:Port:TCPwithTLS].

> Dendrite scrapes your existing logs

Does Dendrite tail? Can Dendrite consume/scrape some existing/old log files?

BTW, what was your actual max throughput? Did you use some internal queue in
the design, like rsyslog does?

> Dendrite isn't trying to be Logstash. I'd like dendrite to be the agent you
> use with logstash, Graylog, papertrail, or whatever.

I am confused here a bit. So Dendrite is like StatsD on steroids?

~~~
fizx
> ok, missed it, checked again. UDP, TCP - yes. TLS - not yet?

tcp + tls is a one-liner I haven't added

> Does Dendrite tail? Can Dendrite consume/scrape some existing/old log files?

Yes, dendrite tails/follows, with optional backfilling.

> Did you use some internal queue in the design, like rsyslog is doing?

Since you're using existing logs, all we do is maintain a pointer into the
logs on disk. When we start consuming more transient data (e.g. perhaps we
poll jmx for you), we'll have to add the queue.

> I am confused here a bit. So Dendrite is like StatsD on steroids?

Dendrite is "tail -f *.log | convert_to_json.pl | nc" on steroids.

------
zdw
Could someone who has run this compare it to other log tailer/shipper programs
like lumberjack (<https://github.com/jordansissel/lumberjack>)?

~~~
fizx
Lumberjack seems protocol-specific
([https://github.com/jordansissel/lumberjack#future-protocol-d...](https://github.com/jordansissel/lumberjack#future-protocol-discussion)),
whereas Dendrite aims to be agnostic.

Most of the other log tailers/shippers are in Java, so untenable for
RAM-constrained environments. Dendrite also has an emphasis on structuring and
re-emitting logs in e.g. JSON, rather than, say, matching specific lines and
sending email when you see them, or just forwarding all logs in their original
format.

------
AYBABTME
Please don't kill me but Go is the name of the language.

~~~
kintamanimatt
Golang is the colloquial name given to the language because Go is the most
ridiculous and unsearchable name. Try searching HN for "Go" and you'll see the
results are garbage.

~~~
yebyen
Has anyone really been far even as decided to use even go want to do look more
like?

~~~
kintamanimatt
Is there something wrong with what I wrote?

~~~
yebyen
Memes that should be on the front page when you search for the word "go" on
any search engine website

~~~
kintamanimatt
This makes no sense.

~~~
yebyen
well never mind then, if you won't even plug the phrase into Google to get a
laugh, I can't help you!

