As a customer who was affected by this incident, I'm really happy with how Cloudflare handled it in the end - transparently and with clear communication about how things will be improved in the future. Thanks again @jgrahamc for the initial help that got it sorted out quickly.
"We are also committed to explaining to our users in plain language what is permitted under self-service plans."
I am excited to see what they revise it to. In the past, we've only really had the terms themselves (which were vague) and various social media posts by Cloudflare employees (including by jgrahamc on HN itself), which made it tricky to convince people that some use cases, like caching R2, were allowed, especially when some automated systems went against that. The CF Dev platform is great, and having clear terms will make it easier to convince people to use it.
In particular, if you claim that R2 can be used to store and retrieve any blobs of data regardless of content type, but that only applies via the public bucket URLs and not via Workers etc., it results in a complicated mess where you get pulled into a service that over-promises and under-delivers (or worse, starts treating your paying customers like hostile entities).
Nice blog post, but it clearly doesn't solve the main issue. The problem is not an outage caused by human or automation error. That happens and will happen again. It's all about support. He didn't get help through regular support channels; he had to use "special HN support". How many cases have been ignored because someone didn't know to make a little fuss on social media?
An engineer put a throttle in place when they shouldn't have, and not only did Cloudflare take responsibility for that employee's actions, they made a public blog post about it. They didn't even bury the lede (I'm sure that was an interesting meeting). Unusually good transparency, I'm impressed.
This is related to https://news.ycombinator.com/item?id=34639212