As a customer who was affected by this incident, I'm really happy with how Cloudflare handled it in the end - transparently and with clear communication about how things will be improved in the future. Thanks again @jgrahamc for the initial help that got it sorted out quickly.
"We are also committed to explaining to our users in plain language what is permitted under self-service plans."
I am excited to see what they revise it to. In the past, we've only really had the terms themselves (which were vague) and various social media posts by Cloudflare employees (including by jgrahamc on HN itself), which made it tricky to convince people that some use cases, like caching R2, were allowed, especially when some automated systems went against that. The CF Dev platform is great, and having clear terms will make it easier to convince people to use it.
In particular, if you claim that R2 can be used to store and retrieve any blobs of data regardless of content type, but that only applies via the public bucket URLs and not via Workers etc., it results in a complicated mess where you get pulled into a service that over-promises and under-delivers (or worse, starts treating your paying customers like hostile entities).
Nice blog post, but it clearly doesn't solve the main issue. The problem is not an outage caused by human or automation error. That happens and will happen again. It's all about support. He didn't get help through regular support channels; he had to use "special HN support". How many cases have been ignored because someone didn't know to make a little fuss on social media?
An engineer put a throttle in place when they shouldn't have, and not only did Cloudflare take responsibility for that employee's actions, they made a public blog post about it. They didn't even bury the lede (I'm sure that was an interesting meeting). Unusually good transparency, I'm impressed.
This is related to https://news.ycombinator.com/item?id=34639212