
InfluxDB now supports Prometheus remote reads and writes natively - pauldix
https://www.influxdata.com/blog/influxdb-now-supports-prometheus-remote-read-write-natively/
======
101km
I often encounter a lot of confusion about push vs. pull and whether you
should pick Prometheus or Influx.

Prometheus comes along every so often and scrapes metrics your program exposes
via a very simple API. This has the advantage that your code doesn't need to
know about some endpoint of some cluster, it just needs to buffer up and
expose some info it knows about itself from the recent past.
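
To make the pull model concrete, here is a minimal sketch of a program exposing its own counters for scraping. It uses only the Python standard library and emits the Prometheus text exposition format. The metric name `myapp_requests_total`, the helpers `inc` and `render_metrics`, and port 8000 are all hypothetical; a real service would more likely use an official client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import threading

# Hypothetical in-process counters; the program bumps these as it works.
_counters = {"myapp_requests_total": 0}
_lock = threading.Lock()


def inc(name, amount=1):
    """Increment a counter; the program never needs to know who scrapes it."""
    with _lock:
        _counters[name] = _counters.get(name, 0) + amount


def render_metrics():
    """Render all counters in the Prometheus text exposition format."""
    with _lock:
        items = sorted(_counters.items())
    lines = []
    for name, value in items:
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering scrapes on /metrics (port is an assumption)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` in a background thread and pointing a scrape job at `http://<host>:8000/metrics` would be enough for a quick experiment.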

You don't need to maintain a cluster of Prometheus either, you can just run
more than one for redundancy (kind of like an active-active) - it is meant for
relatively ephemeral information, it is efficient, and one big Prometheus node
will probably do you fine.

Where and how to store compacted historical metrics (coarser resolution, but
still interesting data that you might not want to throw away) has been an
open question. Sinking them into Influx could be a good answer, so this is
welcome news.

~~~
pstuart
Conversely, one has to tell the collector about a new device to poll. Being
able to fire off a UDP packet with the metric and move on seems like it would
be the most lightweight approach.
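
As a rough sketch of the fire-and-forget push described here, the snippet below formats one point in InfluxDB line protocol and sends it in a single UDP datagram. The helper names `to_line` and `send_udp` are hypothetical, port 8089 is assumed to match the server's UDP listener, and the formatter skips escaping and only handles simple names with numeric fields.

```python
import socket
import time


def to_line(measurement, fields, tags=None, ts_ns=None):
    """Format one point as InfluxDB line protocol.

    Simplified: no escaping, numeric field values only (written as floats).
    """
    tag_part = "".join(f",{k}={v}" for k, v in sorted((tags or {}).items()))
    field_part = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    ts = time.time_ns() if ts_ns is None else ts_ns
    return f"{measurement}{tag_part} {field_part} {ts}"


def send_udp(line, host="127.0.0.1", port=8089):
    """Send one datagram: no connection, no acknowledgement, no retry."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))
```

For example, `send_udp(to_line("cpu", {"usage": 0.64}, tags={"host": "web01"}))` returns immediately whether or not anything is listening, which is both the appeal and the risk.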

~~~
sytse
Two things to consider:

1\. UDP packets are lossy. We had the UDP buffers of our Influx server fill
up, and it took us a long time to detect that we were dropping packets.

2\. Many people want to detect when they are not getting data from an
endpoint. Polling is a great way to quickly detect that an endpoint is down.
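
The second point can be sketched in a few lines: with polling, a failed scrape is itself the "down" signal. The function name `is_up` and the two-second timeout are assumptions for illustration, not how any particular collector implements it.

```python
import urllib.request


def is_up(url, timeout=2.0):
    """Return True if the endpoint answers the poll with HTTP 200 in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, refused connections and socket timeouts
        # all derive from OSError, so any of them counts as "down".
        return False
```

A push-only setup has no equivalent signal: silence could mean the endpoint is down, idle, or that its packets were dropped on the way.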

~~~
pstuart
For argument's sake (abuse is down the hall):

1\. My understanding is that UDP being lossy typically refers to loss in
transit, but your example is an endpoint failure.

2\. Since the whole point of metrics is to keep track of operations, shouldn't
the monitoring of the metrics themselves alert on anomalies?

------
scrollaway
Congrats! Well done, seriously. I love it when things like this happen;
settling on Prometheus as a standard will open access to so much more tooling.

(Which is also why I'm super excited to use timescale
[http://www.timescale.com] once grafana gets
support for postgres!)

~~~
pauldix
Thanks! Shared tooling is exactly why we're doing it. If we can make it easier
for our users and potentially easier for some Prometheus users, it's a win.
That's one takeaway I have from the Graphite project and community. So much
tooling was built up on that standard over time and much of it was very
useful. Iterating and narrowing in on a standard will get us to that same spot
so I think it's great.

------
qrpike
Still paid only for anything other than single node? We switched to Cassandra
for that reason.

~~~
pkaye
Is whatever product you are working on paid or free?

~~~
qrpike
We offer free real-time forex streams ( polygon.io ). We are required to
charge for equities because the actual exchanges charge us fees per use.
Otherwise I would love it to all be free.

------
_jezell_
Influx guys are always doing great things. Nice that they see that Prom has
become the standard way of doing things and embraced it.

~~~
ben_jones
In my experience they're also very responsive and friendly on community
channels such as slack.

------
leowinterde
Great, if you guys bring clustering back to the open source version I would be
thrilled.

~~~
pauldix
Have to balance the needs of funding the business so we can continue the OSS
work we do. It's something we frequently consider. If we can figure out a way
to do it while still maintaining a healthy business, we'll be doing it.
Basically, it's an open problem. I talked about it earlier this year:

[https://www.influxdata.com/blog/the-open-source-database-bus...](https://www.influxdata.com/blog/the-open-source-database-business-model-is-under-siege/)

~~~
pstuart
The HA relay seemed like a nice compromise, but the repo appears to be no
longer maintained.

------
SEJeff
Thanks for adding this, it should help quite a bit with our kubernetes
deployment

~~~
Diederich
If I may ask, can you expand on this?

~~~
SEJeff
$EMPLOYER has a very sizable influx deployment and pushes the limits of the
software (bursts of ~500k metrics per second in some cases). We are also in
the process of moving virtually everything (including kapacitor, influx, etc)
onto Kubernetes. That said, everything in the entire kubernetes ecosystem
already has prometheus exporter support built in. Also, Tectonic (from
CoreOS) has the most wonderful prometheus operator
[https://coreos.com/blog/the-prometheus-operator.html](https://coreos.com/blog/the-prometheus-operator.html)
which lets each team _trivially_ spin up prometheus to monitor all of their
apps. It
also lets each team spin up their own alertmanager to send alerts, with
kubernetes guaranteeing the high availability.

Why is this useful? Prometheus is wonderful for ephemeral application state
and monitoring, but isn't really meant for storing metrics longer term.
Sometimes you want to look at the same metrics over a year or so. This is what
Influx is built for. So you have prometheus for collecting metrics and
monitoring your cluster, then you have it reading and writing to Influx. This
is literally the best of both worlds. This will be a big deal going forward
for our team.
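
As a hedged illustration of that wiring, the sketch below renders the `remote_write` and `remote_read` stanzas that would go into `prometheus.yml`. The endpoint paths `/api/v1/prom/write` and `/api/v1/prom/read` are the ones documented for InfluxDB 1.x as far as I know, so check them against your version. The helper name `prom_remote_config`, the host `localhost:8086`, and the database name `prometheus` are assumptions.

```python
from urllib.parse import urlencode


def prom_remote_config(influx_base="http://localhost:8086", db="prometheus"):
    """Render a prometheus.yml fragment pointing remote read/write at InfluxDB.

    Host and database name are placeholders; the database must already exist.
    """
    query = urlencode({"db": db})
    write_url = f"{influx_base}/api/v1/prom/write?{query}"
    read_url = f"{influx_base}/api/v1/prom/read?{query}"
    return (
        "remote_write:\n"
        f'  - url: "{write_url}"\n'
        "remote_read:\n"
        f'  - url: "{read_url}"\n'
    )
```

`print(prom_remote_config())` gives a fragment to paste into the Prometheus config. Prometheus keeps scraping and alerting as before, while samples are also shipped to Influx for the longer term.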

Does this make sense?

~~~
Diederich
Yes, thank you!

------
549362-30499
What advantage do either of these data stores have over Cassandra?

~~~
BurritoAlPastor
I'm curious what you think InfluxDB and Prometheus are; neither has much in
common with Cassandra. To my ears, you might as well be asking what advantages
nginx has over the JVM.

~~~
qrpike
Both are open source, distributed ( must pay for influxdb ) databases. If you
set up Cassandra's schemas correctly it can be good for time series. Influx is
more suited to time series, however it's not 100% free and open source.

~~~
SEJeff
Cassandra is an excellent distributed database with scaling and high
availability built in. It is also a bit difficult operationally to manage and
a bit slow. It can be used for time series, but so can Postgresql or mysql,
and they aren't very good for it either.

Influx is a time series database where the way the bits on disk are stored is
written in "time series" order, so you can do certain types of operations
literally orders of magnitude faster than you can on a more generic datastore
(such as cassandra, mongo, mysql, postgresql, etc). The clustering bits in
Influx are enterprise only, but influx (non-clustered) is entirely open
source.
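
A toy illustration of why time-ordered layout helps, not a description of Influx's actual storage engine: if the points of a series are kept sorted by timestamp, a time-range query becomes two binary searches plus one contiguous slice instead of a scan. The `Series` class is hypothetical.

```python
from bisect import bisect_left, bisect_right


class Series:
    """Points for one series, kept sorted by timestamp."""

    def __init__(self):
        self.times = []
        self.values = []

    def append(self, ts, value):
        # Assumes in-order arrival, the common case for metrics.
        if self.times and ts < self.times[-1]:
            raise ValueError("out-of-order point")
        self.times.append(ts)
        self.values.append(value)

    def range(self, start, end):
        """Return (ts, value) pairs with start <= ts <= end."""
        lo = bisect_left(self.times, start)
        hi = bisect_right(self.times, end)
        return list(zip(self.times[lo:hi], self.values[lo:hi]))
```

A general-purpose store that orders rows by primary key or insertion order has to lean on a secondary index, or scan, to answer the same query.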

~~~
qrpike
I'm not sure I would call Cassandra slow. If the schemas are done well, it can
be quite good for time series. Obviously this depends on the type of time
series you're writing/querying.

Our biggest goal was writes & uptime. We sometimes do over 150k writes/sec. We
also needed it to be up and accepting writes even if one node goes down.

We regularly take nodes offline for updates/etc and cassandra never misses a
beat.

We ~really~ wanted to use influxdb, but as a startup we couldn't justify the
cost/benefit over Cassandra since we have 8 nodes for the DB. I just went to
the influx site to try to find the pricing again and it seems to be hidden now
:/

EDIT: As a PS, just remember every one of the influxdb benchmarks ( that I've
come across ) is single node. Cassandra is meant to be horizontally scalable.
Testing a single node Cassandra is like testing a racecar on your driveway...

~~~
SEJeff
And our influx setup does bursts of 500k writes per second with a lot less
operational overhead than Cassandra. For time series data, a general purpose
database is always going to be slower for both reads and writes. The data on
disk and in memory is simply laid out differently.

For an excellent academic example, see this paper on Facebook's Gorilla
in-memory TSDB:

[http://www.vldb.org/pvldb/vol8/p1816-teller.pdf](http://www.vldb.org/pvldb/vol8/p1816-teller.pdf)
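
One idea from that paper is delta-of-delta encoding of timestamps: metrics usually arrive at a fixed interval, so the difference between consecutive deltas is mostly zero and compresses very well. The sketch below shows only the arithmetic, with hypothetical function names. The variable-length bit packing the paper layers on top is left out.

```python
def delta_of_delta(timestamps):
    """Encode timestamps as [first, delta-of-deltas...].

    The first delta is measured against an implicit previous delta of 0.
    """
    if not timestamps:
        return []
    out = [timestamps[0]]
    prev_delta = 0
    for prev, cur in zip(timestamps, timestamps[1:]):
        delta = cur - prev
        out.append(delta - prev_delta)
        prev_delta = delta
    return out


def decode(encoded):
    """Invert delta_of_delta, recovering the original timestamps."""
    if not encoded:
        return []
    out = [encoded[0]]
    delta = 0
    for dod in encoded[1:]:
        delta += dod
        out.append(out[-1] + delta)
    return out
```

With a steady 10-second interval, almost every encoded value after the first two is 0, which is the kind of regularity a general-purpose row format does not exploit.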

