OpenTelemetry Is Too Complicated, VictoriaMetrics Says

tazu · 2024-04-05T09:41:22 1712310082

Mirrors my experience implementing OpenTelemetry in Go apps. The opentelemetry-go library is 90k SLOC... So 50% of Redis?

I have a lot of respect for Aliaksandr Valialkin[1]. We use his FastHTTP and Quicktemplate libraries every day on dozens of projects and they purr.

[1]: https://github.com/valyala

drcongo · 2024-04-05T11:06:11 1712315171

Have you tried VictoriaMetrics? First time I've come across it and the GitHub README certainly sounds compelling.

Sphax · 2024-04-05T12:25:33 1712319933

We've been using VictoriaMetrics at work since 2019, I think. The tl;dr is that it works great.

When we migrated to it we replaced 6 Prometheus servers with a single VictoriaMetrics server; it was crazy efficient. We ran a single node for all our metrics for 3 years while the number of data points kept growing. Today we run a 7-node cluster.

I have no idea whether Prometheus could serve our workload as efficiently today, but for reference, a provider wanted to quote us something like 300k€ per year to serve the same metrics; our setup doesn't cost anywhere near that.

drcongo · 2024-04-05T13:47:02 1712324822

Perfect, thanks!

tazu · 2024-04-05T11:58:35 1712318315

I've just been using ClickHouse as a dumb JSON log store. But I'm thinking about doing more granular stuff with VictoriaMetrics.

valyala · 2024-04-12T08:02:16 1712908936

Take a look at VictoriaLogs then [1]. It is ready to accept and query large volumes of logs without needing any configuration to achieve high efficiency (e.g. low RAM usage and low disk space usage). See how it compares to ClickHouse for logs [2].

[1] https://docs.victoriametrics.com/victorialogs/

[2] https://docs.victoriametrics.com/victorialogs/faq/#what-is-t...

hosh · 2024-04-05T10:05:19 1712311519

The article specifically talks about the metrics part of OTel. Having read a bit of the script, I agree, it is very bloated. Statsd is a very mature and lean protocol, even with the Datadog extensions that add tagging. It is a de facto open standard that predates OTel. It may well be that people who want a lean open standard will stick with statsd or its extended versions for collecting metrics.
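To give a sense of how lean that is, here is a minimal Go sketch of the statsd wire format with the DogStatsD-style tag suffix. The function name `formatStatsdLine` and the metric and tag names are made up for illustration; this only builds the text line and does not send it anywhere (in practice it would usually go out over UDP).

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// formatStatsdLine renders one metric as a plain-text statsd line:
//   name:value|type
// When tags are given, it appends the DogStatsD tag extension:
//   name:value|type|#key:value,key:value
// The function name is hypothetical; it is not part of any statsd client library.
func formatStatsdLine(name string, value float64, metricType string, tags map[string]string) string {
	var b strings.Builder
	b.WriteString(name)
	b.WriteString(":")
	// 'f' with precision -1 prints the shortest exact decimal form, no exponent.
	b.WriteString(strconv.FormatFloat(value, 'f', -1, 64))
	b.WriteString("|")
	b.WriteString(metricType)

	if len(tags) > 0 {
		// Sort keys so the output is deterministic (Go map order is random).
		keys := make([]string, 0, len(tags))
		for k := range tags {
			keys = append(keys, k)
		}
		sort.Strings(keys)

		pairs := make([]string, 0, len(keys))
		for _, k := range keys {
			pairs = append(pairs, k+":"+tags[k])
		}
		b.WriteString("|#")
		b.WriteString(strings.Join(pairs, ","))
	}
	return b.String()
}

func main() {
	// "c" is the statsd counter type; "g" would be a gauge.
	line := formatStatsdLine("http.requests", 1, "c", map[string]string{
		"route": "/login",
		"env":   "prod",
	})
	fmt.Println(line) // prints "http.requests:1|c|#env:prod,route:/login"
}
```

The whole protocol for a tagged counter fits in one short line of text, which is the contrast with the OTel metrics model being drawn here.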

The tracing component is fine. It looks a little bloated, but it also supports distributed tracing, and it defines schemas and semantics for trace attributes. I can look that up instead of figuring that stuff out on my own.

As for the rumors about Datadog bloating it up to make it fail … their official Node.js SDK pulls in OTel tracing under the hood, and the library interface was reimplemented to incorporate that. The agent you install will accept OTel traces. It seems to me that this is a path towards letting people transition to OTel.

Datadog’s strategy, at least to an outside observer like me, builds a moat that relies on synergies and on products that are more useful when you can ingest from more sources. The features they have been adding in the past two years (since the 1.0 release of OTel) include appsec scanning, infrastructure anomaly scanning, security anomaly scanning, SQL database analysis (what pganalyze does for just Postgres), streaming data pipeline (such as Kafka) tracing and analysis, CI/CD integration, and cloud cost monitoring and analysis. Those all cost money.

The lock-in happens when you sign an annual contract because the on-demand costs are a noticeable percentage of your cloud costs. Looking at their whole platform and their business model, OTel looks to me like the way they can reduce switching costs and then lock you in with their other products after you switch.

Already__Taken · 2024-04-05T09:40:38 1712310038

Wasn't Datadog accused of "helping" the project to the point of ensuring their solution was simpler?

formerotel · 2024-04-06T04:13:31 1712376811

I don't think Datadog did much in terms of specification work; other maintainers were quite active in bringing in the feature creep. AFAICT, it came from an assumption that Google's practices apply to the rest of the world.

Anyway, complicated protocols benefit only the vendors, since they make it harder for other vendors (in this case VictoriaMetrics) to come in. Datadog is still popular because their instrumentation libraries are much better than most in OTel, not because of the protocol, which I suspect is now OTel anyway.

bosky101 · 2024-04-05T09:40:47 1712310047

tl;dr:

The OTel standard is mature, but the open source implementation is too bulky. So they wrote their own library.

jbiggley · 2024-04-05T10:33:01 1712313181

This is the right answer. Standards and recommendations are good. Let vendors and individual companies implement them as needed.