
This appears to be a routing problem. All our systems are running normally but traffic isn't getting to us for a portion of our domains.

1128 UTC update Looks like we're dealing with a route leak and we're talking directly with the leaker and Level3 at the moment.

1131 UTC update Just to be clear this isn't affecting all our traffic or all our domains or all countries. A portion of traffic isn't hitting Cloudflare. Looks to be about an aggregate 10% drop in traffic to us.

1134 UTC update We are now certain we are dealing with a route leak.

@dang etc.: could someone update the title to reflect the status page "Route Leak Impacting Cloudflare"

1147 UTC update Staring at internal graphs, it looks like global traffic is now at 97% of expected, so the impact is lessening.

1204 UTC update This leak is more widespread than just Cloudflare.

1208 UTC update Amazon Web Services now reporting external networking problem https://status.aws.amazon.com/

1230 UTC update We are working with networks around the world and are observing network routes for Google and AWS being leaked as well.

1239 UTC update Traffic levels are returning to normal.




Thanks for the updates. I wish I could get this information somewhere other than hacker news though. :(


The team is updating the status page but not with granular detail because they'd have to spend time discussing what to say. I'm giving you the blow by blow.


I have a year-old startup and this is the first major Internet outage we've had to deal with... was really awesome to have your play-by-play and definitely changed our incident response (for the better!). Thank you so much.


Thanks for the extra detail here


You're the best, we really appreciate it!


HN, the place to be if you’re anything in the tech community.


I’m cross posting to NANOG


We use downdetector.com because status pages tend to take up to an hour or so to update, if they ever do.


1042 UTC First alert of global traffic problem

1057 UTC Internal group chat room up and running

1102 UTC Status page updated

So, first alert to status page was 20 minutes.
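For anyone who wants to check the arithmetic, here is a minimal Python sketch that derives the gaps from the three times quoted above. Only the UTC times come from the thread; the variable names and the helper `minutes_between` are made up for illustration, and no calendar date is assumed beyond whatever default the parser fills in.

```python
from datetime import datetime

# Times quoted in the comment above (UTC). No date is given in the thread,
# so the parser's default date is used; only the differences matter.
FMT = "%H%M"
first_alert = datetime.strptime("1042", FMT)
chat_room_up = datetime.strptime("1057", FMT)
status_page = datetime.strptime("1102", FMT)


def minutes_between(start, end):
    """Whole minutes elapsed from start to end."""
    return int((end - start).total_seconds() // 60)


alert_to_chat = minutes_between(first_alert, chat_room_up)
alert_to_status = minutes_between(first_alert, status_page)
print(alert_to_chat, alert_to_status)
```

So the chat room came up 15 minutes after the first alert, and the status page followed 5 minutes after that.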


Are you depending on the leaker to fix the issue on their side? What happens in case of non-cooperative or non-responsive leaker?


It's a chain. You first contact the leaker and their upstream, and if that doesn't work, then their upstream's upstream, and so on.

At some point you reach a company that's large enough that they must cooperate because they want to remain in the business of being an actual responsible ISP.

And then there's Verizon, who can safely ignore any ISP etiquette because they have a de-facto monopoly.


In this case Verizon seems to have absolutely no functioning NOC at night (if even during the day).


Having had to deal with Verizon, I can vouch that this is true.


It's sort of a network of trust thing. Every time this happens, everyone has to scramble to add route filters to ignore the leaked route on all their routers, and in parallel they try to contact the leaker (and the leaker's upstream) to get them to fix it as well.

https://www.noction.com/blog/bgp-hijacking
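To make the "add route filters" step concrete, here is a rough Python sketch of one common approach: drop any received route whose AS path passes through the leaking network. This is an illustration only, not any vendor's configuration syntax. The prefixes are documentation ranges, the AS numbers are documentation ASNs, and `LEAKER_ASN`, `passes_filter`, and the `routes` list are all hypothetical.

```python
# Hypothetical leaker, using an AS number reserved for documentation.
LEAKER_ASN = 64511


def passes_filter(as_path, blocked_asns):
    """Accept a route only if no AS on its path is in the blocked set."""
    return not any(asn in blocked_asns for asn in as_path)


# Made-up announcements: prefix plus AS path (nearest neighbour first,
# origin last). The second one transits the hypothetical leaker.
routes = [
    {"prefix": "192.0.2.0/24", "as_path": [64500, 64496]},
    {"prefix": "192.0.2.0/25", "as_path": [64501, 64511, 64496]},
    {"prefix": "198.51.100.0/24", "as_path": [64502, 64497]},
]

accepted = [r for r in routes if passes_filter(r["as_path"], {LEAKER_ASN})]
accepted_prefixes = [r["prefix"] for r in accepted]
print(accepted_prefixes)
```

On a real router this would be an AS-path filter or policy applied to each BGP session, which is why it has to be rolled out to every router by every affected network.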


The upstream provider, if cooperative, could filter out their announcement as a quick fix. It's surprising it happened though; most upstreams already have filters in place.
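The kind of upstream filter meant here can be sketched in Python as a per-customer prefix allowlist with a maximum prefix length. This is a sketch under assumptions, not how any particular provider does it: the table `CUSTOMER_PREFIXES`, the function `upstream_accepts`, the customer AS number, and the prefixes are all hypothetical (documentation ranges).

```python
import ipaddress

# Hypothetical allowlist: for each customer ASN, the blocks that customer
# may announce and the longest prefix length accepted inside each block.
CUSTOMER_PREFIXES = {
    64511: [(ipaddress.ip_network("203.0.113.0/24"), 24)],
}


def upstream_accepts(customer_asn, announced):
    """True if the announced prefix falls inside the customer's allowlist."""
    net = ipaddress.ip_network(announced)
    for allowed, max_len in CUSTOMER_PREFIXES.get(customer_asn, []):
        if (
            net.version == allowed.version
            and net.subnet_of(allowed)
            and net.prefixlen <= max_len
        ):
            return True
    return False


print(upstream_accepts(64511, "203.0.113.0/24"))  # the customer's own block
print(upstream_accepts(64511, "192.0.2.0/25"))    # someone else's space
```

With a filter like this on the customer session, routes for address space the customer does not hold never leave the customer's edge, whatever the customer's own routers announce.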


I guess we are in that 3% then!

But 50% of our traffic has gone!

Hopefully you are still working on it!


We're definitely still working on it. Sorry you're affected by this. We're talking with the network providers involved. If anyone from the Verizon NOC is online... call me!


I've had FiOS issues in PA since around 6:30AM impacting a decent fraction of sites. YouTube was down for about 5 minutes. Slack still very flaky. A client who hosts a 5-figure number of domains on Cloudflare still unavailable to me, but the pagers didn't start going off so presumably it is isolated. Thanks for all the updates.


Another PA Fios user, same here. 8.8.8.8 is mostly ok, 1.1.1.1, intermittent high latency in general, cloudflarestatus.com not loading.


Confirming widespread FiOS issues in NYC as well, not limited to CloudFlare IPs.


If I had to guess, the leak was probably for a huge range, maybe a /4 or something. Verizon is also notoriously bad about dealing with BGP stuff, so I wouldn't be surprised if they have particularly bad filtering.
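For a sense of scale on that guess (this says nothing about what was actually leaked), a short Python sketch of how much address space a single IPv4 /4 covers. The block `16.0.0.0/4` is an arbitrary example picked only for the arithmetic.

```python
import ipaddress

# Arbitrary example block; any IPv4 /4 has the same size.
block = ipaddress.ip_network("16.0.0.0/4")

total_ipv4 = 2 ** 32              # size of the whole IPv4 address space
addresses = block.num_addresses   # 2 ** (32 - 4)
share = addresses / total_ipv4    # fraction of IPv4 covered by one /4

print(addresses, share)
```

That is 268,435,456 addresses, or one sixteenth of all IPv4 space.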


Same FIOS issues for me. Any idea how this is all related?


Having connectivity issues with Verizon FIOS in Massachusetts this morning as well


We're also seeing 60%+ of our traffic missing. It intermittently comes back up.



