[dupe] Fastly CDN is down (affecting Reddit, GitHub, SO, ...) (fastly.com)
192 points by plasma 5 months ago | 48 comments

I didn't know so many sites were depending on Fastly. Stack Overflow, GitHub, reddit, ..., even pip is unavailable. My development workflow is completely janked up. It is a bit scary that we are putting so many eggs in one basket.

I use uBlock Origin with default third-party blocking, so I see such trends happening as I browse. The strong consolidation on fastly feels like a recent development (last year, at a guess). Before that it was edgekey for a while.

Maybe we need a crawler that keeps track of the hidden centralisation of the web. Or is someone doing that already?
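A minimal sketch of the classification step such a crawler might use, assuming it only inspects response headers. The name `guess_cdn` and the heuristics are purely illustrative, not an existing tool: Cloudflare responses usually carry `CF-RAY`, CloudFront ones `X-Amz-Cf-Id`, and Fastly ones tend to have an `X-Served-By` value starting with `cache-`. A real tracker would also want DNS/CNAME checks.

```python
from collections import Counter
from typing import Iterable, Mapping, Optional


def guess_cdn(headers: Mapping[str, str]) -> Optional[str]:
    """Best-effort guess at the CDN behind a response (heuristic, not exhaustive)."""
    # Header names are case-insensitive, so normalise before matching.
    h = {k.lower(): v.lower() for k, v in headers.items()}
    if "cf-ray" in h or h.get("server", "") == "cloudflare":
        return "cloudflare"
    if "x-amz-cf-id" in h:
        return "cloudfront"
    if h.get("x-served-by", "").startswith("cache-"):
        return "fastly"
    return None


def tally(responses: Iterable[Mapping[str, str]]) -> Counter:
    """Count how many crawled responses map to each guessed CDN."""
    return Counter(guess_cdn(r) or "unknown" for r in responses)
```

Feeding `tally` the headers from a periodic crawl of popular sites would give a rough view of how many of them sit behind each provider over time.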

Cloudflare, AWS, fastly, etc. It’s not quite all eggs in one basket, but the number of baskets is small, and the number of eggs in them is growing. This is a timely reminder of that.

> It is a bit scary that we are putting so many eggs in one basket.

We said the same when AWS was down, and we are saying it again now that the Fastly CDN is down. It turns out the web as a whole is not putting all its eggs in one basket; rather, each organization is putting all of its own offerings in one basket (a single cloud provider). That seems to be the core issue IMO.

A client of mine once told me "We are self-hosting our CI/CD servers because when problems occur, I can tell my employees to stay at work until it is fixed. I cannot tell GitHub to stay at work until they fix their issue."

That seems like a weird way of measuring success (person-hours rather than some goal/outcome-oriented metrics like uptime).

I don't think that's what was intended. It reads to me as though there's a greater perception of control (employees directed to work) over achieving the goal (restoring service) during an outage. Outages happen regardless of uptime guarantees; it's just a matter of how often and how long they last.

That's how I understood it as well, and it's a pattern I've seen in (bad) decision making: While a stakeholder might feel more "in control" by being able to look over the shoulder of an on-call engineer, asking somebody to reboot the database, etc., this very often does not translate into higher performance metrics.

Of course, the opposite would be outsourcing all responsibilities, so there obviously is an art to balancing these two extremes.

Not only that, it's just odd logic. I'm going to prevent my employees from domain-specific value-add by forcing them to overwork on generic tooling? What?

Terraform registry too AFAICT.

95k websites use them according to BuiltWith, lots of big names there!

The most popular websites are the ones that need to use CDNs. There are about 4 CDNs of note, so when one is down there's a massive knock-on effect.

At work we have fastly, cloudflare, and some others. Fastly has been removed from the pool so we're back up. Fortunately (or rather thanks to planning) we didn't rely on a cloud service to do that though (say a password stored in cloud-based Bitwarden).
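For anyone wondering what "removed from the pool" can look like, here is a minimal sketch, assuming each provider has some health probe and that traffic is steered only to providers that pass it. `CdnPool` and the probe callables are hypothetical names for illustration; a real setup would typically do this at the DNS or load-balancer layer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CdnPool:
    """Hypothetical pool of CDN providers that traffic can be steered across."""

    # Maps a provider name to a probe that returns True when it is healthy.
    probes: Dict[str, Callable[[], bool]]
    active: List[str] = field(default_factory=list)

    def refresh(self) -> List[str]:
        """Re-run every probe and keep only the providers that pass."""
        healthy = []
        for name, probe in self.probes.items():
            try:
                ok = probe()
            except Exception:
                # A probe that blows up counts as an unhealthy provider.
                ok = False
            if ok:
                healthy.append(name)
        # If nothing is healthy, keep the previous set rather than serve nothing.
        if healthy:
            self.active = healthy
        return self.active


pool = CdnPool(probes={
    "fastly": lambda: False,      # simulated outage
    "cloudflare": lambda: True,
    "other": lambda: True,
})
print(pool.refresh())
```

The important design point from the comment above is that whatever runs `refresh` (and whatever credentials it needs) should not itself depend on the provider that is down.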

ah yes, this explains why XKCD is also down.

And Noita wiki!

Status page is at https://status.fastly.com which just updated

Hugs to the team responding (as a fellow on-call member!)

Post should be updated to link to this page instead

Wow, this is nuts. Reddit was down, shrugged it off since they don't have the best reliability.

Jumped onto GH pages, and ended up with a 502, so knew it was time to jump onto HN.

I didn't realise it was such a popular CDN these days.

I forgive them because it's Guru Mediation time. In this hectic harsh world, even computers need a moment sometimes.

Yesterday I read about that error for the first time in the article about the TV guide channel, and now here it is again!

I used to live in fear and respect of that error. Nothing was more dreaded to an Amiga owner in the 80-90s than that flashing red rect of the Guru Meditating....

I think the internet just found its single point of failure.

It is funny how ubiquitous Fastly is on the internet. It seems like they run the whole internet.

I get the feeling Cloudflare is bigger, might just be me.

https://www.vimeo.com/ too, including their api.

In Italy we say "Mal comune mezzo gaudio", equivalent to the English "A trouble shared is a trouble halved".

I don’t know the technical issues but it doesn’t seem prudent for a business to have a single point of failure.

Depends on the cost of doubling up and the cost of the outage. If you're looking at half an hour once a decade versus triple the development complexity, you eat the outage.

Sometimes you just have to deal with it. If your site sits behind CloudFlare you'd probably be better off just weathering the storm.

The decentralized web is coming

Like the year of the Linux desktop?

Some argue it's already been there for a while [0]

[0] https://staltz.com/the-web-began-dying-in-2014-heres-how.htm...


Error 503 Service Unavailable

Service Unavailable Guru Mediation:

Details: cache-lcy19242-LCY 1623146958 1812545299

Varnish cache server
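The Details line in that error can be picked apart a little. A rough sketch, assuming the three fields are the cache node name, a Unix timestamp, and some request identifier; that layout is a guess from the shape of the line, not documented behaviour, and `parse_details` is a hypothetical helper.

```python
from datetime import datetime, timezone
from typing import Dict


def parse_details(line: str) -> Dict[str, str]:
    """Split a 'Details: <node> <timestamp> <id>' line into labelled parts."""
    # Drop an optional 'Details:' prefix, then split on whitespace.
    node, ts, request_id = line.split(":", 1)[-1].split()
    # Assumption: the trailing token of the node name is the POP code.
    pop = node.rsplit("-", 1)[-1]
    # Assumption: the second field is seconds since the Unix epoch.
    when = datetime.fromtimestamp(int(ts), tz=timezone.utc)
    return {
        "node": node,
        "pop": pop,
        "time_utc": when.isoformat(),
        "request_id": request_id,
    }


info = parse_details("Details: cache-lcy19242-LCY 1623146958 1812545299")
print(info["pop"], info["time_utc"])
```

Read this way, the timestamp comes out as 2021-06-08 10:09:18 UTC, and the node name ends in LCY.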

This isn’t the first time I’ve seen this level of outage from Fastly.

When were the previous ones? I've been aware of Cloudflare and AWS outages as we use them, but thought that up until today Fastly hadn't had a massive outage.

Confucius say: move Fastly, and break things.

We are using Firebase hosting and apparently it is affected as well.

It took me a while to figure out that we are impacted too by this.

PyPI is also not working... breaking builds

Just taking a break

Looks like Fastly engineers fixed the problem... Slowly.

Also SMH.com.au

Swarm by Foursquare seems to be down, too

imgix is also having a major outage

even cnn.com is unavailable!

This is exactly why IPFS is needed! Decentralized CDN.

Last I checked, IPFS still leaks the IPs of hosters, so it's a no-go for a CDN if an attacker can just DDoS the target to take them offline (and oops, then IPFS can't help if new content isn't being pushed into some provider).

IPFS needs anonymous peering or it's not going to be a decentralized CDN, only a decentralized load balancer for static assets.

Done by the hedgies to stop amc squeeze.

plz go back to reddit when it's online

