
Fail at Scale and Controlling Queue Delay - Hansi
http://blog.acolyer.org/2015/11/19/fail-at-scale-controlling-queue-delay/
======
encoderer
This is pretty great stuff. At a more human scale, our saas monitoring tool
handles peaks of ~100req/sec that are written to SQS. The daemon that
evaluates rules and triggers alerts has a queue health check integrated. It
will pause itself when the queue backs up and if the issue persists it sends a
page. Features like are just a few lines of code and have helped us squash
false positive alerts.

