Route Leak Impacting Cloudflare (cloudflarestatus.com)
313 points by xPaw 27 days ago | 157 comments



https://news.ycombinator.com/item?id=20267790 is a more recent thread on this.


This appears to be a routing problem. All our systems are running normally but traffic isn't getting to us for a portion of our domains.

1128 UTC update Looks like we're dealing with a route leak and we're talking directly with the leaker and Level3 at the moment.

1131 UTC update Just to be clear this isn't affecting all our traffic or all our domains or all countries. A portion of traffic isn't hitting Cloudflare. Looks to be about an aggregate 10% drop in traffic to us.

1134 UTC update We are now certain we are dealing with a route leak.

@dang etc.: could someone update the title to reflect the status page "Route Leak Impacting Cloudflare"

1147 UTC update Staring at internal graphs, it looks like global traffic is now at 97% of expected, so the impact is lessening.

1204 UTC update This leak is more widespread than just Cloudflare.

1208 UTC update Amazon Web Services now reporting external networking problem https://status.aws.amazon.com/

1230 UTC update We are working with networks around the world and are observing network routes for Google and AWS being leaked as well.

1239 UTC update Traffic levels are returning to normal.


Thanks for the updates. I wish I could get this information somewhere other than hacker news though. :(


The team is updating the status page but not with granular detail because they'd have to spend time discussing what to say. I'm giving you the blow by blow.


I have a year-old startup and this is the first major Internet outage we've had to deal with... was really awesome to have your play-by-play and definitely changed our incident response (for the better!). Thank you so much.


Thanks for the extra detail here


You're the best, we really appreciate it!


HN, the place to be if you’re anything in the tech community.


I’m cross posting to NANOG


We use downdetector.com because status pages tend to take up to an hour or so to update, if they ever do.


1042 UTC First alert of global traffic problem
1057 UTC Internal group chat room up and running
1102 UTC Status page updated

So, first alert to status page was 20 minutes.


Are you depending on the leaker to fix the issue on their side? What happens in case of non-cooperative or non-responsive leaker?


It's a chain. You first contact the leaker and their upstream; if that doesn't work, then their upstream's upstream, and so on.

At some point you reach a company that's large enough that they must cooperate, because they want to stay in the business of being an actual responsible ISP.

And then there's Verizon, who can safely ignore any ISP etiquette because they have a de facto monopoly.


In this case Verizon seems to have absolutely no functioning NOC at night (if even during the day).


Having had to deal with Verizon, I can vouch that this is true.


It's sort of a network-of-trust thing: every time this happens, everyone has to scramble to add route filters on all their routers to ignore the leaked route, and in parallel they try to contact the leaker (and their upstream providers) to get them to fix it as well.

https://www.noction.com/blog/bgp-hijacking
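
As a rough sketch, a per-peer prefix filter conceptually boils down to something like this (the peer and its allowed prefixes are hypothetical, just to illustrate the idea):

    # Hypothetical per-peer prefix filter: accept only announcements that fall
    # inside the prefixes we expect this particular peer to originate.
    import ipaddress

    ALLOWED_FROM_PEER = [ipaddress.ip_network("192.92.159.0/24")]  # what this peer should announce

    def accept(announced_prefix: str) -> bool:
        net = ipaddress.ip_network(announced_prefix)
        return any(net.subnet_of(allowed) for allowed in ALLOWED_FROM_PEER)

    print(accept("192.92.159.0/24"))  # True  - the peer's own space
    print(accept("1.1.1.0/24"))       # False - a leaked more-specific, drop it

Real routers express this as prefix lists / route maps, but the accept/reject decision is the same shape.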


The upstream provider, if cooperative, could filter out their announcement as a quick fix. It's surprising it happened, though; most upstreams already have filters in place.


I guess we are in that 3% then!

But 50% of our traffic has gone!

Hopefully you are still working on it!


We're definitely still working on it. Sorry you're affected by this. We're talking with the network providers involved. If anyone from the Verizon NOC is online... call me!


I've had FiOS issues in PA since around 6:30AM impacting a decent fraction of sites. YouTube was down for about 5 minutes. Slack still very flaky. A client who hosts a 5-figure number of domains on Cloudflare still unavailable to me, but the pagers didn't start going off so presumably it is isolated. Thanks for all the updates.


Another PA FiOS user, same here. 8.8.8.8 is mostly OK; 1.1.1.1, intermittent high latency in general; cloudflarestatus.com not loading.


Confirming widespread FiOS issues in NYC as well, not limited to CloudFlare IPs.


If I had to guess, the leak was probably for a huge range, maybe a /4 or something. Verizon is also notoriously bad about dealing with BGP stuff, so I wouldn't be surprised if they have particularly bad filtering.


Same FIOS issues for me. Any idea how this is all related?


Having connectivity issues with Verizon FIOS in Massachusetts this morning as well


We're also seeing 60%+ of our traffic missing. It intermittently comes back up.


There's one thing I don't understand about all this: it looks like Allegheny Technologies Incorporated (AS396531, a suspected original leaker) was originally announcing 192.92.159.0/24.

How the heck did their peers not manage to filter a sudden announcement for a range big enough to snag both 8.8.8.8 and 1.1.1.1? Do upstreams really allow a tiny /24 AS to randomly announce a /4 and get away with it? Or am I misunderstanding something fundamental about how BGP routes are allowed to propagate?


Leaking a /4 into BGP would do basically nothing unless the originator was originally advertising a /4. IP forwarding is based on the longest-prefix match. Since allocations are sized from /8 to /24, anybody actually advertising their space would not get hijacked by a /4. The leaker would just get traffic destined toward non-advertised networks.
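
Roughly, in code - a minimal sketch of longest-prefix-match selection using Python's stdlib ipaddress module (the routes below are made up purely to illustrate the point):

    # Longest-prefix match: the most specific matching route wins, so a huge
    # leaked covering prefix can't beat legitimately advertised /24s.
    import ipaddress

    routes = {
        ipaddress.ip_network("0.0.0.0/4"):  "leaker",      # hypothetical leaked covering prefix
        ipaddress.ip_network("1.1.1.0/24"): "cloudflare",  # legitimate, more specific
        ipaddress.ip_network("8.8.8.0/24"): "google",      # legitimate, more specific
    }

    def best_route(ip: str) -> str:
        addr = ipaddress.ip_address(ip)
        matches = [net for net in routes if addr in net]
        return routes[max(matches, key=lambda net: net.prefixlen)]

    print(best_route("1.1.1.1"))  # cloudflare - the /24 beats the /4
    print(best_route("2.3.4.5"))  # leaker - only otherwise-unadvertised space falls through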


Then my next question is: If they didn't leak a massive range, then why was it a big problem? I assume if they leaked a bad /24 it surely wouldn't be enough to take down Cloudflare and Google for everyone... no? Did they just leak tons of bad /24s or was it something else?


My understanding is they had an optimizer that broke the /4 down into /24s, and those got announced.
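
For illustration, carving a covering prefix into /24 more-specifics is trivial; the /20 below is just an example block, not the actual leaked range:

    # Optimizer-style split: one aggregate becomes many more-specific /24s.
    # Because they are longer prefixes, routers that accept them prefer them
    # over the legitimate aggregate regardless of AS path.
    import ipaddress

    covering = ipaddress.ip_network("104.16.0.0/20")       # example aggregate
    more_specifics = list(covering.subnets(new_prefix=24))

    print(len(more_specifics))   # 16 /24s out of one /20
    print(more_specifics[0])     # 104.16.0.0/24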


Aha! That was the missing piece in my understanding, it all makes sense now! <3 You're the only person out of the ~5 people I asked who explained that bit.


The more specific the prefix I announce, the more it gets preferred and spread. I.e., if I announced the whole range as /32s, it would probably go through and all sites would be down. But under normal circumstances an upstream provider would filter it, since it's sloppy not to.


This is the problem with BGP


“AS396531 "Allegheny Technologies Incorporated" is leaking a better-reachable route for AS13335 "Cloudflare, Inc." towards AS701 "Verizon Business/UUnet" explaining the current LSE going on.”

https://twitter.com/OhNoItsFusl/status/1143117619106652160


> AS396531 - Allegheny Technologies Incorporated

That appears to be a steel/alloys company. Why are they operating BGP equipment?


Any company which operates large factories probably has its own ASN and runs its own networks. Everything's gotta be internet-enabled these days, and at a certain scale it becomes cost-effective.


Why not? Pretty much everyone that needs a redundant internet connection (dual ISP) does it.


> Why not?

It seems silly to me that an end-user company that doesn't provide any network services and only has a 256-address block has the ability to break a significant portion of the internet with a configuration mistake. There are several ways to set up dual ISPs and routing that don't involve such risk.


You need BGP and provider independent space for your two ISPs to both announce your space. What's the alternative approach?


Don't rely on a single IP routing through multiple ISPs, use DNS.


What? This statement makes no sense from a networking perspective.

This issue still exists if you break up your IP space; it just makes it far harder to manage.


Most places that are just doing it for that reason won't be advertising anything other than their own /24 or whatever though. You have to fuck up pretty spectacularly (and have your upstream providers do the same) to be able to accomplish what has happened here.


Pittsburgh is a town that used to be run by steel. Even a few decades after that dominance, this particular company is still a $4B one on the S&P 400. I'm pretty sure this is the company my grandfather worked at for decades; his brother did too, from high school to retirement (except during World War II). They apparently significantly polluted the air in the high school district next to mine ten years ago.

It doesn't surprise me at all that they are still a part of infrastructure, somehow.


They aren't the original leaker.

Update: sorry, I may have been wrong. Hard to see clearly in the fog of BGP.


Can you elaborate?


Seconded. I've received notices from multiple carriers that ASN 396531 is the root cause of the leak.


Final update from me. This was a widespread problem that affected 2,400 networks (representing 20,000 prefixes) including us, Amazon, Linode, Google, Facebook and others.

https://twitter.com/bgpmon/status/1143149817473847296



Great article! A couple missing periods at the ends of paragraphs FYI.

I'm curious why so much of this lies on Verizon's shoulders. Couldn't DQE and Allegheny have implemented the exact same best practices that Verizon should have, so it never leaked to Verizon's level? And to the extent non-Verizon subscribers were affected, couldn't their ISPs have implemented the same best practices in distrusting Verizon? Is Verizon directly responsible for routing that much of global traffic?


I'm not very knowledgeable on network routing, so be warned.

But I think at some point a network peering with Verizon trusts it to route things, i.e., if I as an ISP always go through Verizon to deliver traffic to Cloudflare, then the route it takes is out of my hands.

As for downstreams adding mitigation, ideally this would happen, but I would think you should place blame in proportion to resources and criticality. A ten-person ISP won't necessarily do everything right, and it shouldn't matter much whether they do, since they're a small part of the internet.


Thank you!


As an aside, there is something to be said about the size of Cloudflare when global network problems are reported as being Cloudflare issues.


I’m pretty sure 1.1.1.1 for DNS is impacted by this. Initially I thought my WiFi was having issues this morning, until I realized it must be DNS; after switching 1.1.1.1 out, everything besides the Cloudflare sites is normal again.


What's weird is that 8.8.8.8 is also intermittently down for me. Are other people having issues with Google DNS too?

https://i.imgur.com/3ySmVLW.png


Hmm. That explains the inexplicable behavior to multiple domains, some being CF and some not.

I've been seeing it for about 10h now.

Update for datapoint: I'm in Bloomington, IN, on ATT DSL.


Google rate limits ICMP to 8.8.8.8. It’s not meant to be used as your personal “is the internet up” test.


He probably pinged it because DNS wasn't working. It is meant to be his personal DNS resolver.


I use my own server or 1.1.1.1 for uptime checks, but 8.8.8.8 was my DNS fallback when 1.1.1.1 went down, which then meant I had no DNS working at all, which is why I noticed and tried pinging them.
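
If you want a connectivity check that doesn't depend on ICMP at all, a rough sketch is to issue a real DNS query against each resolver instead of pinging it. This assumes the third-party dnspython package (2.x API); the test hostname and timeout are arbitrary choices:

    # Check resolver health with an actual DNS query rather than ping.
    import dns.resolver

    def resolver_ok(server: str, name: str = "example.com", timeout: float = 2.0) -> bool:
        r = dns.resolver.Resolver(configure=False)
        r.nameservers = [server]
        r.lifetime = timeout
        try:
            r.resolve(name, "A")
            return True
        except Exception:
            return False

    for server in ("1.1.1.1", "8.8.8.8"):
        print(server, "OK" if resolver_ok(server) else "FAILED")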


Seriously? If true that's an awfully quick bait and switch, even for Google.


How is it bait and switch?

8.8.8.8 was never marketed as a "ping me to see if the Internet is up" service, as far as I know. Just as a fast, public DNS server.


An important use of well-known, easy-to-type IP addresses is when you're mucking around to figure out if your upstream network isn't working. I could see it if they attempted to set a new standard by just not responding to ICMP at all (although turning around an ICMP echo takes less work than a DNS lookup...), but responding intermittently is actively harmful.


You haven't identified the "bait" bit of the bait and switch. At no point has Google promised to respond to pings on 8.8.8.8, nor are they obliged to ever do so. Rejecting ICMP isn't "a new standard".


The promise is implicit when competing for mindshare with 4.2.2.2. Typing an IP address into a router setup is quite infrequent, compared to "let's check connectivity by ping x.x.x.x". Setting expectations that 8.8.8.8 can fill this role is the bait.

As I said, it's much easier to respond to a ping than even a cached DNS query. Or it would also be consistent to simply never respond to ping.

Now obviously in the modern "you get nothing for nothing" world, Google is able to violate whatever expectations they'd like. But "rate limiting" in a way that makes basic ping(8)s look flaky, especially on a service that will be used for debugging, is downright nasty and deserves to be shouted from the rooftops (iff it's true).


> The promise is implicit when competing for mindshare with 4.2.2.2. Typing an IP address into a router setup is quite infrequent, compared to "let's check connectivity by ping x.x.x.x". Setting expectations that 8.8.8.8 can fill this role is the bait.

4.2.2.2 is not even meant to be used as a public DNS server (and has sometimes hijacked DNS requests at times to remind people of that). So it's weird to use 4.2.2.2 to criticize Google for blocking ICMP on their actually-public DNS server.


Sure, that's Level 3's official position. Unofficially, everyone uses it and there is clearly someone inside making the deliberate decision to keep it publicly available. https://www.tummy.com/articles/famous-dns-server/

As I said, the crux of the problem isn't Google's "blocking", but rather making it intermittent. Obviously it's well within their rights to play whatever games they want - drop every other packet, vary the latency based on your IP, duplicate packets, or make it appear some queue occasionally holds your packets for 3 seconds. It's also within their rights to redirect all DNS lookups to an April Fool's page. And to do any of this selectively based on how many different Google services you use.

But that is not what any user expects, and in the end that's all protocols are - expectations. To me, the pushback I've gotten here fits right in with Surveillance Valley's general attitude of shirking responsibility with some fine print disclaimer, knowing full well what the constructive situation is. "I'm just going to go like this [spinning arms], it's not my fault if you walk into me".

If you can't see how people would expect to be able to reliably ping 8.8.8.8, or how intermittently dropping pings causes confusion (as in the original comment above), then I can't help you.


there are lots of services that are available to the public, but intended only for a specific set of people. if you go to the local supermarket and take a few dozen bags without buying anything, that's immoral and illegal. nobody will stop you from stealing the 1 cent bags, but that doesn't mean that it's OK. in this case, they have specifically put up signs saying "bags for paying customers only". if you continue to regularly go in and take bags without paying, that is theft, both legally and morally.

your argument boils down to "it is convenient for me, and I see other people stealing bags too".


What in ze hell?

1. It is straightforward to restrict a DNS server so that it only answers specific networks. This doesn't even need to be close to comprehensive to get the message across. Level 3's (née BBN's) intent is to continue to respond to the wider Internet community, regardless of what their ambient PR says. Likely for similar reasons that they run a looking glass.

2. The frequency and magnitude of your scenario makes it a straw man. A more worthwhile example is someone using a business's bathroom without buying anything. Yet most places don't really care as in the end it balances out, and we're all humans that have needs that can't be fully met by commercial provisions. The major concern is people who mess up the bathroom, paying or not.

3. While a common touchstone, theft does not apply, as nothing has been taken. Perhaps unjust enrichment. But given that anybody using 4.2.2.2 to answer production DNS queries is actually harming themselves with additional latency more than anything is "taken" from Level3, that's a stretch too.

Have we really become so full of corporate bullshit that we're stuck analyzing things in its myopic paradigm? I thought this was Hacker News?

PS I notice 77.77.77.77 also responds to pings and DNS queries. Should I expect to get a bill for their services? Because I'd much rather just relish the feeling of a fleeting shared purpose with someone halfway around the world in a vastly different culture.


Hi, Shachar from Peer5 here; we're operating a MultiCDN. Cloudflare is actually one of the best-performing CDNs. All CDNs encounter issues, small to big - that's why using multiple providers and intelligently routing between them is critical for high resilience.


Right now we're seeing issues in the following ASNs: AS9541, AS59257, AS38264, AS132165, AS23888, AS55714, AS45773, AS45669, AS9260, AS58895, AS17557, AS38547, AS38193, AS135407, AS23966, AS7590, AS136525.


Are you seeing ASN 396531 as the original leaker?


The main internet and phone service provider of the Netherlands is down. Even the emergency number (112, our equivalent of 911) is down. Almost everyone is unreachable. The whole telephone network is disrupted.

I wonder if it's related to this? It does say this kind of BGP thing can be a deliberate malicious attack. Perhaps this? https://en.wikipedia.org/wiki/BGP_hijacking


Oh, and the country's train and public transport infrastructure is experiencing some major problems too due to the phone service outage.


You have to wonder if these outages aren't the result of hostile states laying the groundwork and testing the viability of certain attacks.


Heh I think based on BGP's track-record, if a state-level actor wanted to mess up everyone's BGP routes they wouldn't have to try very hard...


@dang etc. Be good if someone changed the title here. 2,400 networks were affected (including parts of Cloudflare, Google, Amazon, Linode, Facebook, ...).


This seems like a partial outage, likely region-based. We have a large number of sites routed through Cloudflare and I can access all of them from home, but our HTTP monitoring software reports the sites as down.


Unless it’s a total coincidence, this looks like it is affecting some Amazon services, including SageMaker notebooks and Echo devices.


"Cloudflare is observing network performance issues." not just performance issues, our entire website is unavailable because of it.

edit: availability has been alternating between available and unavailable


It's not their fault as accidents happen to everyone. You should be prepared for such a scenario and for many others too.


You are right, accidents happen to anyone. I cannot really be prepared for Cloudflare to go down though. What are my alternatives? Turn it off and route traffic to our servers directly? The DNS propagation takes longer than it just took for our website to be available again.


It shouldn't - Cloudflare keeps the TTL for their cache-enabled records very low (like 300 seconds).

If you just log in to Cloudflare and click the "orange cloud" icon on the DNS tab, which points the domain back directly to your origin, you'll see the site up within a couple minutes.
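
If you'd rather script that than click, something along these lines should work against the v4 API - zone ID, record ID and token are placeholders, and I'm assuming the dns_records endpoint accepts a PATCH with just the proxied flag:

    # Flip a record from proxied ("orange cloud") to DNS-only via the Cloudflare v4 API.
    import requests

    API = "https://api.cloudflare.com/client/v4"
    ZONE_ID = "your-zone-id"      # placeholder
    RECORD_ID = "your-record-id"  # placeholder
    TOKEN = "your-api-token"      # placeholder

    resp = requests.patch(
        f"{API}/zones/{ZONE_ID}/dns_records/{RECORD_ID}",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"proxied": False},  # set True to turn the orange cloud back on
        timeout=10,
    )
    resp.raise_for_status()
    print(resp.json()["result"]["proxied"])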


Is 300 very low? I've occasionally seen 60 in the wild.


It's very low compared to 24 hours, which is what used to be the most common setting, and that (among other factors) was a big part of the "DNS propagation takes forever" mentality.
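
A quick way to see the TTL your resolver is actually handing back (assumes the third-party dnspython package; the domain is a placeholder):

    import dns.resolver

    answer = dns.resolver.resolve("example.com", "A")
    print("TTL seconds:", answer.rrset.ttl)  # cached TTL, counts down at your resolver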


Same here.


It’s definitely all countries, just for a specific range of anycast IPs.

Our CloudFlare stuff isn’t even pingable. Sometimes it’ll return an echo from a far away DC.

It’s been like this for over an hour now and your status page doesn’t even acknowledge it apart from “Network performance issues”.


It was updated a few minutes ago confirming that it's a route leak.


Yeah but they just wasted an hour of everyone’s lives trying to figure out WTF was going on at 3:34am.

(The average CF user has no idea what a route leak is, tbh.)


"Everyone" speak for yourself, middle of the workday here :P


One of the weirder leaks I've seen, 8.8.8.8 and 1.1.1.1 are both down for me, but everything else is working fine.


8.8.8.8 is google’s DNS, though.


Exactly, that's why it's weird. It would have to be a huge range leaked to get both 8.8.8.8 and 1.1.1.1; surprising that a peer didn't filter it before it worked its way up the chain.


What is a route leak?


The Border Gateway Protocol is what network providers use to announce which IP ranges they can route traffic for. The problem is that it's almost totally unauthenticated, so rogue ISPs and network operators can suddenly take over parts of the internet by "leaking" routes for ranges they shouldn't be able to control.

They do this by announcing something like "send me all traffic for 1.1.1.0 - 1.1.1.255", and if their peers don't verify it, they'll just start routing that traffic to them. Peer by peer, the route then propagates to a larger portion of the internet, and as routers learn the new bad route, more of the traffic to those IPs gets sent to the incorrect network.



I realized that it would be an issue like this when downforeveryoneorjustme.com didn't load either.


Isn't HN on Cloudflare? How are we reading about a CF outage on a site that runs behind CF?


The most surefire way to know if a site is behind Cloudflare (orange cloud is on) is by hitting /cdn-cgi/trace (e.g. https://news.ycombinator.com/cdn-cgi/trace), which is the debug output from Cloudflare’s HTTP server. There’s no way to my knowledge that route can be disabled or overridden.

Anyway, no, HN is not on Cloudflare, at least at the moment.
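
A rough sketch of scripting that check (assumes the requests package; the colo/ip keys shown are what the trace endpoint typically returns as key=value lines):

    # Probe a host's /cdn-cgi/trace to see whether it is served through Cloudflare.
    from typing import Optional
    import requests

    def cloudflare_trace(host: str) -> Optional[dict]:
        try:
            resp = requests.get(f"https://{host}/cdn-cgi/trace", timeout=5)
        except requests.RequestException:
            return None
        if resp.status_code != 200:
            return None
        return dict(line.split("=", 1) for line in resp.text.splitlines() if "=" in line)

    trace = cloudflare_trace("news.ycombinator.com")
    print("behind Cloudflare:", trace is not None)
    if trace:
        print("edge location:", trace.get("colo"), "client IP:", trace.get("ip"))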



That's because it's cdn-cgi and not cgi-bin.


HN hasn't been on Cloudflare for almost a year now. Straight to their origin server it seems.



I've seen the "checking your browser" message on HN quite a few times, especially when I use a VPN, and I'm pretty certain it's the CF one.

Maybe it's turned on selectively for high traffic / bot situations?



I just put in my own Cloudflare-enabled website and it came back negative as well.


FWIW, this site doesn't seem accurate. I used it against my site that is definitely using cloudflare and it incorrectly reported it as not :p


For our sites behind Cloudflare, it seems to be a routing issue (among other possible things) - some visitors can still access the sites, others can't. Traffic went down roughly 60%.


Seems to be intermittent, and perhaps dependent on which POP you are hitting. I have been getting PagerDuty alerts that are flapping between triggered and resolved.


We have several websites running behind Cloudflare. Only one out of our 8 or so seems affected.


This is affecting far more than just Cloudflare.


I was going back and forth about whether to post this; we're clearly experiencing some sort of partial outage. New Relic Synthetics shows everything offline (except Sydney, for whatever reason), but the actual webserver logs indicate healthy traffic and the sites load when I visit them. "Cloudflare is observing" is pretty opaque.


We're seeing the same situation with our alerts.


The current IT stack needs a do-over. These outages already happen by accident often because of human error. Imagine the damage a state actor could inflict by targeting these large data centers. I hope that some of the newer decentralized cloud startups like Dfinity or Storj take over.


One of the reasons we're pushing: https://blog.cloudflare.com/rpki/


>The current IT stack needs a do-over.

"The network is unreliable" is a rule of thumb that was drilled into my head in network programming class.

It always has been, it always will be. Doesn't matter if it's the internet or the link between your computer and a device sitting on your desk. And it doesn't matter what the tech is.

Making the internet more resilient only increases the severity of the failure when organizations that don't understand the risk they're taking on experience network outages.

The network is unreliable.


It’s caused my site monitoring via PagerDuty to go insane, with texts sent every few minutes.


Seems to be BGP/routing related, some networks can access CloudFlare networks normally.


We've been evaluating Cloudflare mainly for doing failovers faster than DNS. This morning I ran some tests to generate graphs to show the typical delay incurred in preparation for a show-and-tell with some key people.

I started seeing delays of up to 300 seconds! At best there was a 1 second delay. I wondered if I was going to have to present "Why we've decided not to go with Cloudflare!"

Any longtime Cloudflare users comment on how rare an event this sort of thing is? It seems rare from eyeballing the recent alert history.


Things like this are not unique to CF and actually originate from outside their network. It does happen every once in a while, but I have far more confidence in CF's ability to resolve it than my own. They have the clout in the industry, the connections and the expertise to deal with this kind of thing. I've been with CF since late 2011 and am quite satisfied with their services.


That's a good point, and I must admit I didn't know what a route leak was or that it could inflict this kind of damage. I appreciate now it's not CloudFlare's fault, and my hat is off to the CTO for posting more detail here.

On the plus side, I did get to test the "Pause CloudFlare" button in a real-world scenario!


> Any longtime Cloudflare users comment on how rare an event this sort of thing is?

I've been a Cloudflare user for 7+ years and a Cloudflare Enterprise user for 2 years. Before joining Enterprise, Cloudflare would suffer some kind of global or localized network outage (that impacted our operation) about once or twice a year. Most localized ones don't really get reflected on the status page properly. After joining Enterprise, this is actually the first observed incident we've encountered so far.

Though it might not be a Cloudflare-only thing because funny thing is... Verizon Fios is also down for everyone I've talked to this morning.


Longtime Cloudflare user. This isn't specific to Cloudflare, but common sense would be to always have a backup. For example, for my sites I've got Cloudflare in front, but in the background I'm caching all my content and pushing it to BunnyCDN, so if I need to fail over, I can safely fail over out of the network to a live cache (each request re-populates the cache in a background job).

It's saved me lots of time and energy.


I had to switch off 1.1.1.1 for the first time because of this. I can't speak to their enterprise stuff, but if the DNS resolver is a good test, this is the first event I've hit since launch.


Interesting, isn't it, that when it's a US-based steel plant, it's a route leak. When it's China Telecom, it's a route hijack.

The description of it as a leak AFAICT seems to be due to CF getting first dibs on the announcement[†] and positioning it as such. However, I firmly believe that had the general tech press gotten ahead of it first, it still would be treated much more generously than we treat China leaks.

[†] grin


How would that kind of disruption happen? Someone else also anycasting the CF IP addresses?


BGP route/prefix leaks. BGP is the protocol that deals with routing across the various internet backbones (known in the protocol as "autonomous systems", identified by an AS number).

On that protocol, the various systems broadcast what prefixes they can route, which then affects the rest of the networks' routing decisions.

By error or malice, a system can report a prefix it cannot or should not route, causing other systems to start routing traffic across it. This will either just cause weird routes (such as ones going through certain suspicious countries), cause poor performance for the affected traffic, or break connectivity entirely.


At 3-4 major leaks per year it seems like we should probably fix BGP one of these days...


We'll get right on that after everyone has IPv6 deployed.


I think the pressure to implement better ROA checking and something like RPKI will be much stronger than the push to IPv6 was if we keep having major route leaks multiple times a year.

Eventually some governments will have to get involved...


The way I understand it, it's not BGP; it's mostly human error, or malicious intent.

The protocol is fine.


An unauthenticated protocol that allows unsigned routes to be blindly accepted is not a good protocol; that's why Cloudflare has been pushing RPKI for a while: https://blog.cloudflare.com/rpki/ https://blog.cloudflare.com/rpki-details/


It has authentication and requires explicit configuration to form a neighbor relationship.

BGP was designed for operators to implement a routing policy. In most implementations it allows everything by default with no modifications to route metadata, so if you do not set up your policy correctly you'll have issues like this.


It has authentication for only one hop; if routes propagated all the way up the chain with signatures, it would be much easier to block/limit bad AS behavior.


Your peering relationship is only for one hop. What it lacks is prefix/path validation, not authentication.


But authentication of every advertised range all the way up the chain would allow upstream providers to easily differentiate valid large prefix announcements that were done intentionally (e.g. big ISP announcing some routes) from crazy nonsense done by an unknown party that isn't a big ISP. We definitely need prefix filtering, but there needs to be some easily verifiable source of identity tied to each announcement to be able to automate the process of accepting and rejecting large prefix announcements.
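
For the curious, RPKI-style route origin validation conceptually reduces to this kind of check - a rough sketch with a hard-coded, hypothetical ROA list (a real validator works from signed objects fetched from the RPKI repositories):

    # Validate an announcement against ROAs: prefix covered, length within
    # maxLength, and origin ASN matching.
    import ipaddress
    from dataclasses import dataclass

    @dataclass
    class ROA:
        prefix: ipaddress.IPv4Network
        max_length: int
        origin_asn: int

    roas = [ROA(ipaddress.ip_network("1.1.1.0/24"), 24, 13335)]  # example ROA

    def validate(prefix: str, origin_asn: int) -> str:
        net = ipaddress.ip_network(prefix)
        covering = [r for r in roas if net.subnet_of(r.prefix)]
        if not covering:
            return "unknown"  # no ROA covers this prefix
        for roa in covering:
            if net.prefixlen <= roa.max_length and origin_asn == roa.origin_asn:
                return "valid"
        return "invalid"      # covered, but wrong origin or too specific

    print(validate("1.1.1.0/24", 13335))   # valid
    print(validate("1.1.1.0/24", 396531))  # invalid: wrong origin AS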


A protocol that allows "human error" or "malicious intent" to take down entire swathes of the internet due to being entirely unauthenticated, is not "fine".

What you are describing is a protocol problem.


Malicious intent could likely be mitigated with cryptography, e.g., by requiring that a prefix announcement carry a signature from the prefix's owner.

Such a system would also contain human error to a smaller set of possible faults.


No system failure is ever "human error". It's faulty system design.


Experiencing around 60% traffic drop on customer's sites.


Hey CloudFlare: this page is dependent on ajax.googleapis.com, and if js is disabled, googletagmanager.com. (Also, weirdly, they still have a link to Google+ posts?)


Does anyone know which global sites were or still are unavailable because of the Cloudflare crash? Maybe some media sites?


WPEngine is down in all regions https://wpenginestatus.com


Discord for one


Disabling HTTP proxy and leaving "DNS only" option in CloudFlare DNS settings solved the problem for us.


not very safe for some users.


Doesn't appear to work for us


This morning I'm finding out just how many of our supporting services rely on Cloudflare as well.


Seeing ~60% drop in traffic here.


Is the specific IP range of the leak known?


[flagged]


> Crimeflare

Care to explain that one?


There is a site called Crimeflare that can explain it as well as it can be explained.


While I agree with the general sentiment - and I've certainly publicly and loudly expressed it in the past - this particular incident can't actually be blamed on that.

It's a route leak, which can affect any arbitrary amount of ISPs, because the BGP protocol is totally unauthenticated.


I came here to comment the same thing. Cloudflare is too big.


Does anyone know which global sites were unavailable because of the Cloudflare crash?


It's not really fair to call it a Cloudflare crash.


You're not going to be able to get a solid list; this is a different category of problem than something like Cloudbleed, and even then the list wasn't solid. This issue is affecting AWS, Cloudflare, Cloudflare DNS, Google DNS, and the tens of thousands of other services that depend on them, but it's region-specific and will break different things for different users as the leak propagates.


One source: https://twitter.com/atoonk/status/1143143943531454464

    90  AS 13335  Cloudflare, Inc.
    18  AS 7018   AT&T Services, Inc.
     8  AS 63949  Linode, LLC
     8  AS 2828   MCI Communications Services, Inc. d/b/a Verizon Business
     6  AS 26769  Bandcon
     6  AS 16509  Amazon.com, Inc.
     4  AS 6428   CDM
     4  AS 2914   NTT America, Inc.
     2  AS 9808   Guangdong Mobile Communication Co.Ltd.
     2  AS 6939   Hurricane Electric LLC
     2  AS 62904  Eonix Corporation
     2  AS 55081  24 SHELLS
     2  AS 54113  Fastly
     2  AS 46606  Unified Layer
     2  AS 45899  VNPT Corp
     2  AS 4246   New Jersey Institute of Technology
     2  AS 3257   GTT Communications Inc.
     2  AS 27695  EDATEL S.A. E.S.P
     2  AS 22781  Strong Technology, LLC.
     2  AS 20473  Choopa, LLC
     2  AS 16625  Akamai Technologies, Inc.
     2  AS 12129  123.Net, Inc.


I think it's like 2.4k ASNs at this point, each with 10s-1000s of IPs, I guess you can make a list from that but it's going to be as unreliable as the Cloudbleed list was. Also not always easy to do reverse hostname lookups from the IPs to see the site names.



