
Prometheus: A Next-Generation Monitoring System [video] - discordianfish
https://www.usenix.org/conference/srecon15europe/program/presentation/rabenstein
======
wiremine
We started using Prometheus for a Go-based app in production a few weeks ago.
So far, so good.

They recommend starting with PromDash [1], their Rails-based visualization
tool, but I didn't find the console-based templates [2] very difficult to dive
into (although more examples would be handy).

Worth checking out, IMHO.

[1]
[http://prometheus.io/docs/visualization/promdash/](http://prometheus.io/docs/visualization/promdash/)

[2]
[http://prometheus.io/docs/visualization/consoles/](http://prometheus.io/docs/visualization/consoles/)

~~~
bbrazil
[https://github.com/prometheus/prometheus/tree/master/console...](https://github.com/prometheus/prometheus/tree/master/consoles)
has all the public examples for console templates.

------
XorNot
I've been slowly deploying this for server and network monitoring. The
combination of Go's fat binaries and the exposition format have made it
ridiculously easy to get started with (and almost immediately produced
insights - PromDash became popular the moment people saw it working).
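
For readers who haven't seen the exposition format mentioned above, here is a minimal Python sketch of what a scraped counter looks like as plain text. The function name `render_counter` and the sample metric are hypothetical, and label-value escaping is left out for brevity, so treat it as an illustration rather than a client library.

```python
def render_counter(name, help_text, samples):
    """Render one counter in the Prometheus text exposition format.

    `samples` maps a tuple of (label_name, label_value) pairs to a number.
    Assumption: label values contain no quotes, backslashes, or newlines,
    so no escaping is done here.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in sorted(samples.items()):
        if labels:
            label_str = ",".join(f'{k}="{v}"' for k, v in labels)
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_counter(
        "http_requests_total",
        "Total HTTP requests.",
        {(("method", "get"),): 3, (("method", "post"),): 1},
    ), end="")
```

Anything that can print lines like these over HTTP can be scraped, which is a large part of why getting started takes so little effort.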

------
fapjacks
We are using Prometheus as the central part of our logging and alert
infrastructure. We have collectd inside individual docker containers feeding
data to Prometheus via collectd_exporter. I hate yak shaving and there's been
surprisingly little of it with Prometheus so far.

------
siliconc0w
Pull model kills it for me - I know I can use the 'push gateway' and set up a
group of 'exporters', but then you've just killed the 'easy to deploy'
advantage.

I still think something like Riemann/InfluxDB/Grafana is the best bet for a
roll-your-own solution. Riemann is a really flexible
router/aggregator/alerter with realtime metric display, and it can throw
everything over to InfluxDB/Grafana for dashboarding/time series analysis.

------
jbrantly
This looks really cool, but I'm always instinctively turned off by the pull
model.

- Assuming metrics are stored in memory, they'll be lost if the app or server
recycles. Of course, a push model should probably batch so there's probably
some kind of overlap here.

- I've heard pull is good for detecting if a server is down. The flip side is
no automatic discovery and dynamic scaling where a server going down might not
necessarily be a bad thing.

- Security. Not super thrilled with exposing something that says "Hey come
get my data".

~~~
bbrazil
> - Assuming metrics are stored in memory, they'll be lost if the app or
> server recycles.

Similarly, with push they're lost if the server dies before sending the
metrics, so that's a draw.

> The flip side is no automatic discovery and dynamic scaling where a server
> going down might not necessarily be a bad thing.

That's more about top-down vs. bottom-up target discovery than the scraping
itself; see
[http://www.boxever.com/push-vs-pull-for-monitoring](http://www.boxever.com/push-vs-pull-for-monitoring)

As of Prometheus 0.14.0, which was released earlier in the week, Prometheus
supports target discovery.
[http://prometheus.io/blog/2015/06/01/advanced-service-discov...](http://prometheus.io/blog/2015/06/01/advanced-service-discovery/)
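
A rough Python sketch of the top-down idea: the monitoring side takes the list of targets it expects (from whatever service discovery is in use) and compares it with the targets that actually answered the last scrape. The function name `classify_targets` and the addresses are made up for illustration, and this is not how Prometheus itself is implemented.

```python
def classify_targets(expected, scraped_ok):
    """Compare discovered targets with those that answered a scrape.

    expected   -- targets the discovery source says should exist
    scraped_ok -- targets that responded successfully to the last scrape
    """
    expected = set(expected)
    scraped_ok = set(scraped_ok)
    return {
        "up": sorted(expected & scraped_ok),
        # Expected but silent: worth alerting on.
        "down": sorted(expected - scraped_ok),
        # Answered but not in discovery: stale config or a rogue instance.
        "unexpected": sorted(scraped_ok - expected),
    }


if __name__ == "__main__":
    print(classify_targets(
        expected=["a:9100", "b:9100"],
        scraped_ok=["a:9100", "c:9100"],
    ))
```

When a server is scaled down on purpose, it disappears from `expected` as well, so it never shows up as "down". That is the sense in which the question is about discovery rather than about push vs. pull.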

> Security. Not super thrilled with exposing something that says "Hey come get
> my data".

That's a fair concern, though I'd expect your metric endpoint not to be
internet accessible (and some of the push options will have an HTTP endpoint
for debugging).
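
As a hedged illustration of keeping a metrics endpoint off public interfaces, here is a small standard-library Python sketch that serves `/metrics` bound to loopback by default. The names `start_metrics_server`, `scrape` and `app_requests_total` are hypothetical, and a real deployment would more likely bind to an internal interface or sit behind a firewall.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory counter; it resets whenever the process restarts.
REQUESTS_TOTAL = 0


def render_metrics():
    """Return the counter in the Prometheus text exposition format."""
    return (
        "# HELP app_requests_total Requests handled since process start.\n"
        "# TYPE app_requests_total counter\n"
        f"app_requests_total {REQUESTS_TOTAL}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start_metrics_server(host="127.0.0.1", port=0):
    """Serve /metrics in a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def scrape(server):
    """Fetch /metrics once, the way a pull-based collector would."""
    host, port = server.server_address[:2]
    conn = HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/metrics")
        return conn.getresponse().read().decode("utf-8")
    finally:
        conn.close()


if __name__ == "__main__":
    srv = start_metrics_server()
    print(scrape(srv), end="")
    srv.shutdown()
    srv.server_close()
```

With the loopback default, only processes on the same host can reach the endpoint. Passing an internal address as `host` would open it to a collector elsewhere on the private network.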

