
Heroic: Spotify's time series database for monitoring - mhausenblas
http://spotify.github.io/heroic/
======
jamestnz
I'll admit I found it a little tricky to orient myself within the Heroic docs.
There are pages describing the high-level architecture and the installation
process, and each aspect of the configuration has its own doc page and example.
These are all fine, but I came away wishing there was also some kind of
super-high-level introduction along the lines of: "What exactly is this thing,
what was the motivation for its development, who is it for, and what can it do?"

It turns out these do exist! But only as blog posts that you need to go
searching for:

1\. Monitoring at Spotify: The story so far
[https://labs.spotify.com/2015/11/16/monitoring-at-spotify-the-story-so-far/],
in which they describe wanting to move from an approach based around
monitoring discrete hosts to one where they could think in terms of the health
of services across the entire infrastructure. It discusses various
design/architecture decisions they took, specifically in terms of supporting
_alerting_ and _graphing_ services.

2\. Monitoring at Spotify: Introducing Heroic
[https://labs.spotify.com/2015/11/17/monitoring-at-spotify-introducing-heroic/],
in which they discuss the "federation" features of Heroic, how it manages the
collection of metric data from the hosts, why they used Elasticsearch, and how
they mitigate its known issues.

------
je42
Found this on the architecture page:

> Elasticsearch has proven to have fairly significant stability concerns, but
> heroic uses it in a way so it acts as a non-primary storage and can rapidly
> be rebuilt.

What are these concerns specifically? Is there a good write-up?

~~~
quacker
In this blog post [1], they link to one of Aphyr's "Call me maybe" posts:
https://aphyr.com/posts/323-call-me-maybe-elasticsearch-1-5-0

1: https://labs.spotify.com/2015/11/17/monitoring-at-spotify-introducing-heroic/

~~~
je42
Thanks!

