

Introducing Heka - crankycoder1975
http://blog.mozilla.org/services/2013/04/30/introducing-heka/
Mozilla Services announces Heka - a tool for high performance data gathering, analysis, monitoring, and reporting.
======
snaky
Uh, yet another collector/grapher. That's nice but..

We have _tons_ of collectors. And tons of graphers. What we don't have is a
little bit of smarts in those tools: the ability to predict and the ability to
react.

Predict. We have had the Holt-Winters forecasting algorithm implemented in
RRDTool since 2005, plus a couple of papers.

React. I'm not talking about 'fix it automagically'. But everyone wants to
know 'wtf was that peak on this graph last night?'. Usually you never know,
except in the simplest cases, because you cannot collect everything about
everything all the time. But a monitoring system could enable 'collect
everything we can' for a short period of time when it detects _something_:
something wrong or something _strange_, something out of the pattern. Has
anybody heard of a system with something like that?
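
A minimal sketch of the 'detect something strange, then collect everything for a while' idea, in Python. It is not Holt-Winters: a much simpler exponentially weighted moving average with a deviation band is swapped in to keep it short. `EwmaDetector`, `verbose_windows`, and every parameter value are hypothetical names and guesses, not part of RRDTool, Heka, or any other tool mentioned in this thread.

```python
from dataclasses import dataclass


@dataclass
class EwmaDetector:
    """Flags samples that fall far outside a smoothed baseline."""
    alpha: float = 0.3      # smoothing factor for the baseline (assumed value)
    threshold: float = 3.0  # allowed error, in multiples of the smoothed error
    warmup: int = 5         # samples to observe before flagging anything
    mean: float = 0.0
    dev: float = 0.0
    seen: int = 0

    def update(self, value: float) -> bool:
        """Feed one sample; return True if it looks out of pattern."""
        if self.seen == 0:
            self.mean = value
            self.seen = 1
            return False
        error = abs(value - self.mean)
        limit = self.threshold * max(self.dev, 1e-9)
        anomalous = self.seen >= self.warmup and error > limit
        # Update the baseline only after judging the sample against it.
        self.mean += self.alpha * (value - self.mean)
        self.dev += self.alpha * (error - self.dev)
        self.seen += 1
        return anomalous


def verbose_windows(samples, detector, window=3):
    """Return the sample indices at which 'collect everything' would be on."""
    active_until = -1
    active = []
    for i, value in enumerate(samples):
        if detector.update(value):
            # Keep detailed collection on for this sample and the next `window`.
            active_until = i + window
        if i <= active_until:
            active.append(i)
    return active
```

In a real collector the `active` list would be replaced by toggling whatever expensive probes are available; the point is only that the trigger is a deviation from the learned pattern rather than a fixed limit.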

~~~
aba_sababa
We're working on it here at Etsy :)

It'll be released in a week or two. In the meantime, I've been speaking about
it:
[http://devslovebacon.com/conferences/bacon-2013/talks/bring-...](http://devslovebacon.com/conferences/bacon-2013/talks/bring-the-noise-continuously-deploying-under-a-hailstorm-of-metrics)

~~~
EFruit
Is what you're referring to something like Growl that pops up and says 'Hey,
metrics.sessions.active just dropped by 70%', or something pre-configured to
spin up additional VMs/instances/dynos when some metrics misbehave? TL;DR: How
autonomous is it?

------
berkay
Having implemented similar solutions, it's clear to me that the developers did
their homework and designed accordingly. I find myself agreeing with almost
every decision I could see.

\- Go: light, no dependencies. This is key. If you've ever deployed something
in a non-homogeneous environment with 100s/1000s of servers, you know the
pain.

\- Plugin system: Only way to scale the development of the solution

\- Lua for plugins: Yes! Language is not important, but not having to stop and
restart the application for changes in logic, etc. is essential.

\- Routing. Sounds great, can't wait to take a deeper look.

Kudos to devs. Nicely done!

~~~
Sven7
Not very familiar with Go; could someone please elaborate on what "Go: no
dependencies" means? Thanks

~~~
berkay
Basically it means that you can have statically linked executables that do not
have dependencies on other libraries, and can be deployed by simply copying
the files.

Alternatives written in scripting languages like Ruby, Python, etc. require
runtimes and libraries, and it can get quite complicated to deploy them
(especially if the environment has lots of different OS versions), keep track
of all dependencies, and deal with conflicts with other apps that may require
different versions of the runtimes and libraries. So for operational reasons
it's very appealing to have a distributable binary that works without having
to worry about what the prerequisites are and whether it would impact anything
else on the server.

------
grosskur
I've been experimenting with a different metrics toolchain of shh +
log-shuttle + l2met recently (also written in Go):

<https://github.com/freeformz/shh>

<https://github.com/ryandotsmith/log-shuttle>

<https://github.com/ryandotsmith/l2met>

shh can be extended with custom pollers written in Go, but focuses on
collecting system-level metrics. log-shuttle is a general-purpose tool for
shipping logs over HTTP. l2met receives logs over HTTP and can be extended
with custom outlets written in Go, but requires log statements in a specific
format ("measure.db.latency=20" or "measure=db.latency val=20").
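
As a rough illustration of the two line shapes quoted above, here is a small Python sketch that extracts a name and value from either one. The regular expressions and the `parse_measure` helper are assumptions made for this example; l2met's real parser may accept more (or different) syntax.

```python
import re

# Hypothetical patterns for the two shapes quoted in the comment:
#   measure.db.latency=20
#   measure=db.latency val=20
NUMBER = r"(-?\d+(?:\.\d+)?)"
DOTTED = re.compile(r"\bmeasure\.([\w.\-]+)=" + NUMBER)
KEYED = re.compile(r"\bmeasure=([\w.\-]+)\b.*?\bval=" + NUMBER)


def parse_measure(line):
    """Return (name, value) for a measurement line, or None if none matches."""
    for pattern in (DOTTED, KEYED):
        match = pattern.search(line)
        if match:
            return match.group(1), float(match.group(2))
    return None
```

Anything that does not match either shape is ignored, so ordinary log lines can pass through the same stream untouched.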

It's great to see so many new tools in this space. Previously I had a bunch of
one-off "carbonize" scripts running out of cron, each collecting a specific
kind of metric and sending it to Graphite or statsd. This worked OK but
required quite a bit of code to get things done. Heka's plugin system looks
like a nice way to structure things.
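
For readers who have not written one of these one-off scripts, the wire formats involved are tiny. The Python sketch below formats a sample for Graphite's plaintext protocol (`path value timestamp` plus a newline) and for statsd (`name:value|type`). The helper names are made up for this example, and the host and port defaults are the conventional statsd ones rather than anything taken from this thread.

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    """Format one sample for Graphite's plaintext protocol."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"


def statsd_line(name, value, kind="g"):
    """Format one statsd datagram; kind is c (counter), ms (timer) or g (gauge)."""
    return f"{name}:{value}|{kind}"


def send_statsd(line, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send; host and port are assumed defaults."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))
```

A cron-driven script then boils down to measuring something and calling, say, `send_statsd(statsd_line("db.latency", 20, "ms"))`; the repetitive part is everything around that call, which is where a plugin system helps.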

------
themgt
Very interesting. Does this fit in conceptually with circus at all? It seems
like there's a fair amount of overlap between the process/HTTP management done
by circus and this stats/data collection/analysis (specifically the hekad
agent in the architecture diagram):
[http://heka-docs.readthedocs.org/en/latest/architecture/inde...](http://heka-docs.readthedocs.org/en/latest/architecture/index.html)

I'm curious if Mozilla is using these two tools in combination internally, and
what that architecture looks like.

<https://github.com/mozilla-services/circus>

~~~
nonsequitarian
Heka and Circus have very different goals. Heka is about generic data
gathering, processing, and routing. Circus is about task and process
monitoring and management. There's some overlap in that Circus needs to gather
and process a bit of data to do its job, and it would certainly be possible to
use Heka as a part of that, but we're not doing so. We'll probably use Circus
to manage at least a few of our `hekad` processes, though.

------
dkhenry
Off the top of my head, this is a reimplementation of the following:

* SNMP
* CollectD
* Carbon
* JMX
* WMI
* CMIP

And a whole host of other proprietary transports. So it's cool and looks
awesome, but what does it give me that the entirety of other monitoring
protocols doesn't?

~~~
crankycoder1975
One of the driving motivations was simplicity for developers and getting a
reasonable out-of-the-box experience.

This comes from a couple things.

Go compiles to a single statically linked binary, so you don't have to worry
about having dozens of "the right" libraries installed on your machine. Grab
the heka binary and run with it.

This greatly eases our operations work as we have fewer dependency conflicts
to deal with when we push things to production.

~~~
mapleoin
That doesn't make a lot of sense. You don't have to write monitoring software
from scratch just because you want statically compiled, bundled libraries.
You can do that with any programming language.

~~~
tptacek
How do you run Python, Java, Perl, Ruby, or any JVM language without an
installed runtime?

~~~
coldtea
Quite easily. All offer options for building standalone programs that don't
need a pre-installed runtime.

You just copy them to some directory, run them and they work.

And some of them even support building native binaries (e.g., Java through gcc).

~~~
tptacek
Virtually nobody in practice uses any of these.† Java binaries are in practice
JVM bytecode in classfiles. Python programs are run by the Python interpreter.

Go compiles to native code. Not only do you not need a preinstalled Go runtime
on a target system, but there's very little advantage to even having one. The
normal way of installing a Golang program is simply to copy the binary and run
it. That's powerfully simpler than most other modern programming languages,
with the obvious exception(s) of C/C++/ObjC.

† _Commenter downthread says the same thing, but let me add that we look at
other people's Python/Java/Ruby programs professionally, and I can't recall a
single client ever doing anything like this._

~~~
lucian1900
With the gigantic disadvantage of security updates requiring recompiling
everything :(

~~~
marshray
I've never seen an organization that didn't do a full rebuild of every build
product contained in each release anyway. Usually it's just faster and less
error-prone to do a full rebuild than to recompile the minimal set of source
files and relink.

~~~
lucian1900
If one uses dynamic linking, one can use (some) system-provided libraries,
which will get security updates in the usual manner.

~~~
marshray
It looks like Go supports dynamic linking to "system" libraries. At least on
MS Windows, this
[https://code.google.com/p/go/codesearch#go/src/cmd/dist/wind...](https://code.google.com/p/go/codesearch#go/src/cmd/dist/windows.c&q=win32&sq=package:go&dr=C&l=118)
call to FormatMessageW
[http://msdn.microsoft.com/en-us/library/windows/desktop/ms67...](http://msdn.microsoft.com/en-us/library/windows/desktop/ms679351\(v=vs.85\).aspx)
would be to an implementation in Kernel32.dll that would receive security
updates.

On Linux there's a large gray area for things like libexpat.so.1 that may or
may not be linked dynamically. But libc is LGPL, so I expect it too would be
linked dynamically.

------
judofyr
Seems similar to Riemann: <http://riemann.io/>

~~~
buro9
Riemann is top of our list to be implemented in an environment made up of
PostgreSQL, Memcached, Python (Django) and Go.

But as Hekad has plugins for all of the above (except Go, but I'm sure it's
possible):
[http://heka-docs.readthedocs.org/en/latest/architecture/inde...](http://heka-docs.readthedocs.org/en/latest/architecture/index.html)

Well, I guess we'll now be evaluating whether Hekad looks like it might be a
more promising fit.

I particularly like the bullet points on aggregation counters, filters and
transformations. We'll have to see how they work in practice though. The docs
are very pretty, but as is usual with early releases it seems a little
difficult to picture the whole and how it will actually work in practice from
the soup of detail that Sphinx spits out.

~~~
nonsequitarian
Unfortunately, the diagram you linked to is misleading; we don't have all of
those plugins built yet. We've got a (quite rich) Python client
(<https://github.com/mozilla-services/heka-py>) and a (rudimentary) Go client
(<https://github.com/mozilla-services/heka/tree/master/client>) but Memcached
and PostgreSQL connectors aren't yet in place. Fleshing out our plugin set
(especially the inputs) is one of our highest priorities, so those should be
coming Real Soon Now(tm). Contributions welcome!

------
samatman
with reference to the name: "and I _do_ live in South Berkeley / North
Oakland…"

I had a feeling. We may hope your code is hella tight...

~~~
samatman
I get it, no jokes on HN. This was a quote from the developer, who has, I
suppose, a better sense of humor than y'all.

~~~
fixxer
I didn't know we could say "y'all" on HN, either.

~~~
mindcrime
Why wouldn't y'all use "y'all" on HN? It's a perfectly cromulent word.

~~~
fixxer
Sure it is, Ms. Hoover.

------
zobzu
I see it more as a syslog replacement. It does a lot more than syslog of
course, but it doesn't do what "collectd" and whatever else does. Heka seems
to "just" do logging/routing/etc. and be extremely fast and reliable doing so.
And it has no dependencies and a small footprint.

Which is what syslog can't do.

~~~
nonsequitarian
Actually, there is a fair amount of overlap between collectd and Heka. And
Heka provides mechanisms for in-flight data processing and graphing output, in
addition to logging and routing. But you're also right that Heka might in some
cases also be used in place of syslog.

Of course, both syslog and collectd have been around and battle-hardened for
many(!) years, whereas this first Heka release is being called "0.2-beta-1"
for a reason. I wouldn't rush into replacing _any_ mission-critical
infrastructure just yet. ;)

------
keyle
Off topic but I did smile to see it's written in Go, and not Rust. I guess
Rust isn't there yet.

~~~
coldtea
What's to smile about?

Nobody ever claimed that Rust is "there yet".

The core Rust developers all say that Rust is still in flux, and that a stable
version is still many months in the future, possibly 2014. And they advise not
to use it in production.

------
kevinmeredith
Is this similar to piwik?

~~~
nonsequitarian
Not really. Piwik is focused on web traffic analytics. You could build a
piwik-type system using Heka, but Heka is a lower level tool w/ a wider focus.

------
doun
Seems that this does NOT support Windows?

~~~
trink
Windows support will be added.
<https://github.com/mozilla-services/heka/issues/145>

