End to end monitoring with minimal effort

cassianoleal · on May 15, 2022

Actual title: End-to-End Monitoring with Grafana Cloud with Minimal Effort

gravypod · on May 15, 2022

Grafana Cloud is a great product but the pricing scheme really scared me away.

With 10 physical servers, 10 services, 10 pods each, and 100 metrics (counters, histograms, etc.), you'll be spending $728/month on Grafana Cloud.

The equivalent from AWS managed Prometheus is ~$300/month in my experience. Hosting your own is also far, far cheaper.

tootie · on May 15, 2022

I'm so dubious of any commercial product with a free tier. They exist as a sales funnel, not a gift to users. A few products (Sentry is a good example in this space) offer an open source, self-hosted option. The software is free and you take on the operational cost yourself. If I'm an enterprise, I assume I'm paying thousands a month and just do a cost-benefit analysis on whatever is in the market, knowing that developer hours are wildly expensive.

CGamesPlay · on May 15, 2022

Grafana is also an open source, self-hostable product, so I don’t think there’s anything underhanded going on, even if the paid tiers on Grafana Cloud are expensive.

throwaway787544 · on May 15, 2022

The case mentioned was 'just monitoring a couple microservices'. If you want minimal effort E2E monitoring, create an AWS Lambda scheduled job with your test and have AWS SNS email you when the Lambda fails. It's reliable, cheap, and significantly less complex than something like Prometheus or Grafana. For much larger or more complicated cases, try the route in the article.
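
A rough sketch of what that Lambda could look like, in Python. This is one possible shape, not a reference setup: the environment variable names (CHECK_URL, ALERT_TOPIC_ARN) and the helper names (check_endpoint, run_check) are made up for illustration, and it publishes to SNS directly from the handler rather than going through an alarm. The schedule itself (e.g. an EventBridge rule) and the email subscription on the SNS topic would be configured separately and are not shown.

```python
import urllib.request


def check_endpoint(url, timeout=5.0, opener=urllib.request.urlopen):
    """Run one HTTP check and return (ok, detail)."""
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except Exception as exc:  # timeouts, DNS failures, HTTP 4xx/5xx, etc.
        return False, f"{url}: {exc!r}"
    if status != 200:
        return False, f"{url}: unexpected status {status}"
    return True, f"{url}: OK"


def run_check(url, notify, check=check_endpoint):
    """Run the check and call notify(detail) only when it fails."""
    ok, detail = check(url)
    if not ok:
        notify(detail)
    return {"ok": ok, "detail": detail}


def handler(event, context):
    """Lambda entry point, meant to be invoked on a schedule."""
    import os
    import boto3  # imported here so the helpers above work without it

    sns = boto3.client("sns")
    topic_arn = os.environ["ALERT_TOPIC_ARN"]  # hypothetical variable name

    def notify(message):
        sns.publish(
            TopicArn=topic_arn,
            Subject="E2E check failed",
            Message=message,
        )

    # CHECK_URL is also a hypothetical variable name.
    return run_check(os.environ["CHECK_URL"], notify)
```

Keeping the check and the notification as plain functions means the logic can be tried locally with a fake notify before anything is deployed.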

drpyser22 · on May 15, 2022

Monitoring != Alerting on critical failures. It's worthwhile to have a real-time data feed for useful metrics (e.g. avg load, latency, requests, etc.) not just to alert on critical failures, but to understand the overall behavior of your system, and to predict potential failures based on the cumulative trend over some window. Not sure how a single failure on a single test communicated via email can be similarly useful.
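
To make the "predict from the trend over a window" part concrete, here is a small Python sketch under my own assumptions: samples are (timestamp_seconds, value) pairs from some window, a straight-line fit is good enough, and the function names (linear_trend, seconds_until) are invented for this example. Prometheus has predict_linear for this kind of extrapolation; the code below is only a standalone illustration of the idea.

```python
def linear_trend(samples):
    """Least-squares slope and intercept for (t, value) samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("timestamps must not all be equal")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    slope = cov / var_t
    return slope, mean_v - slope * mean_t


def seconds_until(samples, limit):
    """Estimate seconds after the last sample until the fitted line
    reaches limit. Returns None if the trend is flat or falling."""
    slope, intercept = linear_trend(samples)
    if slope <= 0:
        return None
    crossing_time = (limit - intercept) / slope
    last_t = samples[-1][0]
    return max(0.0, crossing_time - last_t)


# Disk usage in percent, sampled once a minute (made-up numbers).
window = [(0, 50.0), (60, 52.0), (120, 54.0), (180, 56.0)]
eta = seconds_until(window, limit=90.0)
print(f"about {eta:.0f}s until 90% at the current rate")
```

A single pass/fail email cannot give you that lead time; a windowed series can.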

I guess it's a matter of what standard "minimal monitoring" entails.

throwaway787544 · on May 17, 2022

True, though "end to end monitoring" usually refers to a service check or synthetics rather than metrics. I think before saddling myself with Prometheus I'd first try throwing together collectd, Telegraf, or some other minimal agent, and send it all to some backend storage with a short lifecycle.

deepsun · on May 15, 2022

> I strongly recommend to use "Prometheus and friends", that is Prometheus, Alertmanager, Thanos and Grafana.

Thanos is super-hard to maintain tho, so many moving parts.