Google, Microsoft Bing, Yahoo, DDG, Baidu, Yandex, and more. The caches other than Google were quick to clear and we've not been able to find active data on them any longer. We have a team that is continuing to search these and other potential caches online and our support team has been briefed to forward any reports immediately to this team.

I agree it's troubling that Google is taking so long. We were working with them to coordinate disclosure after their caches were cleared. While I am thankful to the Project Zero team for their informing us of the issue quickly, I'm troubled that they went ahead with disclosure before Google crawl team could complete the refresh of their own cache. We have continued to escalate this within Google to get the crawl team to prioritize the clearing of their caches as that is the highest priority remaining remediation step.

Thousands of years from now, when biological life on this planet is all but extinct and superintelligent AI evolving at incomprehensible rates roam the planet, new pieces of the great PII pollution incident that CloudFlare vomited across the internet are still going to be discovered on a daily basis.

I was expecting this:

Thousands of years from now, when biological life on this planet is all but extinct and superintelligent AI evolving at incomprehensible rates roam the planet, taviso will still be finding 0-days impacting billions of machines on an hourly basis.

Be glad that Google is employing him and not some random intelligence agency.

I have huge respect for taviso and his team. Their track record in security work is so impressive. They are without a doubt extremely capable.

However, I am always wondering: are they really globally unique in their work and skill? So that they are really the ones finding all the security holes before anyone else does because they are just so much better (and/or with better infrastructure) than anyone else? Or is it more likely that on a global scale there are other teams who at least come close regarding skill and resources, but who are employed by actors less willing to share what they found?

I really do hope Tavis is a once-in-a-lifetime genius when it comes to vulnerability research!

One of the big controversies in the infosec world is people who sell 0-day exploits to "security companies." Some go for tens of thousands of dollars. Ranty Ben talked about how some people live off this type of income, when it came up in a panel discussion at Ruxcon 2012.

No, he is definitely not alone. Some of them work for other security companies or for antivirus companies, and some of them sell the vulnerabilities they find.

What's funny is he kinda just stumbled upon this bug accidentally while making queries.

If I were just casually googling two weeks ago and came across a leaked cloudflare session in the middle of my search results I think I would have vomited all over my desk immediately. Dude must have been sweating bullets and trembling as he reached out on twitter for a contact, not knowing yet how bad this was or for just how long it's been going on.

I believe the 2009 Yahoo-Bing agreement is still in force, where Bing provides search results on Yahoo.com:


I know the search I performed now on Yahoo states "Powered by Bing™" at the bottom.

Yeah, I thought that could be it as well, but this was at the bottom of the Yahoo result:

<!-- fe072.syc.search.gq1.yahoo.com Sat Feb 25 03:58:27 UTC 2017 -->

Given they are identical results it's pretty clear it must be a shared index I suppose, that or the leaked memory was cached.

Yahoo provides a front end to the search results, Bing provides the crawl/search/archives.

What the hell does Yahoo even do anymore? Just email? Or is that just a proxy to hotmail?

Finance, News, Mail, Fantasy Sports, etc to name a few where they are still in the top three of the category.

Yahoo was never really a search company (even at its founding, it was a "directory", not a "search" engine). Sure, they pretended fairly well from 2004ish (following their move off Google results) to 2009 (when they did the Bing deal), but the company never really nailed search or, more importantly, search monetization, despite acquiring one of the first great search engines (Altavista) and the actual inventor of the tech Google stole for its cash cow Adwords (Overture).

Isn't Yahoo search just a frontend to bing nowadays?

Some IPv6 internal connections, some websocket connections to gateway.discord.gg, rewrite rules for fruityfifty.com's AMP pages, and some internal domain `prox96.39.187.9cf-connecting-ip.com`.

And some sketchy internal variables: `log_only_china`, `http_not_in_china`, `baidu_dns_test`, and `better_tor`.

Exactly, it looks like the people doing the cleanup have so far only looked for the most obvious matches (just searching for the Cloudflare unique strings). There's surely more where "only" the user data was leaked and is still in the caches.
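
To make that concrete, here is a rough sketch of why a marker-string search only catches the obvious cases. The marker list and the sample pages are made up for illustration (only `cf-connecting-ip` shows up in this thread; `cf-ray` is my assumption of another likely marker), and this is not how any search engine actually did its cleanup.

```python
import re

# Marker strings are assumptions for illustration; only "cf-connecting-ip"
# appears in this thread. A real cleanup would need a much longer list.
MARKERS = ["cf-connecting-ip", "cf-ray"]
MARKER_RE = re.compile("|".join(re.escape(m) for m in MARKERS), re.IGNORECASE)


def find_obvious_leaks(pages):
    """Return URLs of cached pages whose body contains a known marker string."""
    return [url for url, body in pages.items() if MARKER_RE.search(body)]


# Hypothetical cached pages: the second one carries leaked cookies but no marker.
cached = {
    "http://a.example/1": "<html>hi</html> CF-Connecting-IP: 203.0.113.7 Cookie: session=abc",
    "http://b.example/2": "<html>hi</html> Cookie: session=xyz; user=bob",
    "http://c.example/3": "<html>just a normal page</html>",
}

flagged = find_obvious_leaks(cached)
print(flagged)
```

The second page is exactly the case described above: user data with no Cloudflare-specific string next to it, so a marker search never flags it.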

The event where one line of buggy code ('==' instead of '>=') creates global consequences, affecting millions, is a great illustration of the perils of monoculture.
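
For anyone wondering how a single comparison operator can do this: as I understand Cloudflare's write-up, the generated parser tested whether its pointer was exactly equal to the end of the buffer, and a `>=` test would have caught the overrun. Below is a toy simulation of that class of bug, not the real code. The names `memory`, `pe`, and `steps` are mine, and the "jump over the end" step is hardcoded.

```python
def bytes_read(memory, pe, steps, use_equality):
    """Simulate a parser walking `memory` with index p; our buffer ends at pe.

    `steps` says how far p advances after each read. Returns the indices read.
    """
    p = 0
    seen = []
    for step in steps:
        if p >= len(memory):  # hard stop so the simulation itself stays in bounds
            break
        seen.append(p)
        p += step
        at_end = (p == pe) if use_equality else (p >= pe)
        if at_end:
            break
    return seen


memory = "REQ!" + "SECRET"        # our 4-byte buffer, then a neighbour's data
pe = 4                            # index one past the end of our buffer
steps = [1, 1, 3, 1, 1, 1, 1, 1]  # the third step jumps from 2 straight to 5

buggy = "".join(memory[i] for i in bytes_read(memory, pe, steps, use_equality=True))
fixed = "".join(memory[i] for i in bytes_read(memory, pe, steps, use_equality=False))
print(buggy)  # REQECRET
print(fixed)  # REQ
```

With the equality test, the pointer never lands exactly on `pe`, so the loop keeps reading the neighbour's bytes. With `>=` it stops as soon as it is at or past the end.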

And monoculture is the elephant in the room most pretend not to see. The current engineering ideology (it is ideology, not technology) of sycophancy towards big and rich companies, and popular software stacks, is sickening.

How about clearing all the cache? (Or at least everything created the last few months.)

I've never seen anyone suggest it. I suppose it cannot or should not be done for some reason?

You are asking for petabytes of data to be deleted. Some parties are interested in owning such data.

The real problem is going to be where history matters and you can't delete - for example archive.org and httparchive.org. There is no way to reproduce the content in the archive obviously, so no one will be deleting it. The only way is to start a massive (and I mean MASSIVE) sanitization project...

Or clearing all the cached copies of Cloudflare's websites. I think that's doable.

At this moment the problem is not on Cloudflare's side. Search engines crawled tons of data with leaked information, so even if Cloudflare drops its caches, the data is already on 3rd party servers (search engines, crawlers, agencies).

That's why he asked that the caches of all Cloudflare sites are dropped, not by Cloudflare but by these 3rd parties.

That might work. If said 3rd parties were interested in helping. Most of them might be but it just takes one party refusing to help and then you've still got the data out there.

No, I meant: get a list of all domains using Cloudflare, and get those removed from the crawlers' caches.

Offtopic: "with all due respect" is often followed by words void of respect.

He is British. "With all due respect" means no respect is due. I don't think it's possible to show less respect while appearing polite. In other words, them's fighting words.


This is perfectly fine if the amount of respect due is sufficiently low.

Given the answers that Cloudflare is giving, I'd say it's quickly approaching zero.

Ha! Excellent point!

Incredible. Are they really trying to pin it on Google? Yes, clearing cache would probably remove some part of the information from public sources. But you can never clear all cache world-wide. Nor can you rely that the part that was removed was really removed before being copied elsewhere.

The way I see it, the time given by Project Zero was sufficient to close the loophole; it was not meant to give them a chance to clear caches world-wide. They have a PR disaster on their hands, but blaming Google won't help with it.

You really have to see this to really grasp the severity of the bug.

The scope of this is unreal on so many levels.

20 hours since this post and these entries are still up ...

Can anyone provide some context, please?

For anyone being linked directly to the post: the link back to the parent page is right on top: https://news.ycombinator.com/item?id=13718752

You can also click on "parent", and repeat as necessary.

The bottom of the file has contents from another connection. Notably

    Host gateway.discord.gg

After 16 hours, those cached pages are still up...

While it is good that you discovered leaked content is still out in the wild, your tone is somewhat condescending and rude. No need for it.

You might not know the history here. Tavis works at Google and discovered the bug. He was extremely helpful and has gone out of his way to help Cloudflare do disaster mitigation, working long hours throughout last weekend and this week.

He discovered one of the worst private information leaks in the history of the internet, and for that, he won the highest reward in their bug bounty: a Cloudflare t-shirt.

They also tried to delay disclosure and wouldn't send him drafts of their disclosure blog post, which, when finally published, significantly downplayed the impact of the leak.

Now, here's the CEO of Cloudflare making it sound like Google was somehow being uncooperative, and also claiming that there's no more leaked private information in the Bing caches.

Wrong and wrong. I'd be annoyed, too.


Read the full timeline here: https://bugs.chromium.org/p/project-zero/issues/detail?id=11...

I think this is a one-sided view of what really happened.

I can see a whole team at Cloudflare panicking, trying to solve the issue, trying to communicate with big crawlers trying to evict all of the bad cache they have while trying to craft a blogpost that would save them from a PR catastrophe.

All the while Taviso is just becoming more and more aggressive to get the story out there. 6 freaking days.

Short disclosure timelines are not fun.

There was no panic. I was woken at 0126 UTC the day Tavis got in contact. The immediate priority was to shut off the leak, but the larger impact was obvious.

Two questions came to mind: "how do we clean up search engine caches?" (Tavis helped with Google), and "has anyone actively exploited this in the past?"

Internally, I prioritized clean up because we knew that this would become public at some point and I felt we had a duty of care to clean up the mess to protect people.

> "has anyone actively exploited this in the past?"

Has this question been answered yet?

We're continuing to look for any evidence of exploitation. So far I've seen nothing to indicate exploitation.

>> "has anyone actively exploited this in the past?"

Wouldn't your team still have to decide how to deal with this even after some specific well-known caches have been cleared? I mean, there's no guarantee that someone hasn't already collected all this data and won't use it to target those Cloudflare customer sites. Are you planning to ask all your customers to reset all their access credentials and other secrets?

Google Project Zero has two standard disclosure deadlines: 90 days for normal 0days, and 7 days for vulnerabilities that are actively being exploited or otherwise already victimizing people.

There are very good reasons to enforce clear rules like this.

Cloudbleed obviously falls into the second category.

Legally, there's nothing stopping researchers from simply publishing a vulnerability as soon as they find it. The fact that they give the vendor a heads-up at all is a courtesy to the vendor and to their clients.

> The fact that they give the vendor a heads-up at all is a courtesy to the vendor and to their clients.

It is the norm, and it is called responsible disclosure. You're trying to do the least harm, and the least harm is a balance between giving the developers some time to develop a fix and getting the news out there so that customers, and customers of customers, are aware of the issue.

With all due respect, they should suffer a PR catastrophe.

In this case I feel your comment is misdirected. Cloudflare was condescending in their own post above, the one he was replying to. "I agree it's troubling that Google is taking so long" is a slap in the face to a team that has had to spend a week cleaning up a mess they didn't make. It is absolutely ridiculous that they are shitting on the team that discovered this bug in the first place, and to top it all off they're shitting all over the community as a whole while they downplay and walk the line between blatantly lying and just plain old misleading people.

I would be pretty mad if a website that I was supposed to trust with my data made an untrue statement about how something was taken care of when it was not, and then published details of the bug while the cached data is still out in the wild and now exploitable by any hacker who was living under a rock during the past few months.

Actually I proxy two of my profitable startup frontend sites with CloudFlare, so I am affected (not really), but giving them the benefit of the doubt as they run a great service and these things happen.

They are well past deserving the benefit of the doubt.

I would also advise you to notify your cloud-based services' customers how they might be affected (yes, really). Trust erosion tends to be contagious.

Agreed. The condescending downplaying tones displayed just aren't acceptable.

We only host our static corporate sites (not apps) and furthermore never used CF email obfuscation, server-side excludes or automatic https rewrites thus not vulnerable.


I think you have misunderstood the issue. Just because YOU did not use those services does not mean your data was not leaked. It means that other people's data was not leaked on YOUR site, but YOUR data could be leaked on other sites that were using these services.

> We only host our static corporate sites (not apps)

If this part is true, they're not vulnerable. Only data that was sent to CloudFlare's nginx proxy could have leaked, so if they only proxy their static content, then that's the only content that would leak.

The rest of their comment gives the wrong impression though, yeah.

> Only data that was sent to CloudFlare's nginx proxy could have leaked, so if they only proxy their static content, then that's the only content that would leak.

The way it worked, the bug also leaked data sent by the visitors of these "static sites": IP addresses, cookies, visited pages, etc.

Thanks for clarifying. You are absolutely right.

So far as I know, nothing like this thing has ever happened at any CDN ever before.

There have definitely been incidents where CDNs mixed up content (of the same type) between customers. Not exactly like this, but close.

I find it troubling that the CEO of Cloudflare would attempt to deflect their culpability for a bug this serious onto Google for not cleaning up Cloudflare's mess fast enough.

Don't use CF, and after seeing behavior like this, don't think I will.

On a personal note, I agree with you.

Before Let's Encrypt was available for public use (beta), CF provided "MITM" https for everyone: just use CF and they would issue you a certificate and serve https for you. So I tried that with my personal website.

But then I found out that they replace a lot of my HTML, resulting in mixed content on the https version they served. This is the support ticket I filed with them:

  On wang.yuxuan.org, the css file is served as:

  <link rel="stylesheet" title="Default" href="inc/style.css" type="text/css" />

  Via cloudflare, it becomes:

  <link rel="stylesheet" title="Default" href="http://wang.yuxuan.org/inc/A.style.css.pagespeed.cf.5Dzr782jVo.css" type="text/css"/>

  This won't work with your free https, as it's mixed content.

  Please change it from http:// to //. Thanks.

  There should be more similar cases.

But CF just refused to fix that. Their official answer was that I should hardcode https. That's bad because I only have https with them, so it would break as soon as I leave them (I guess that makes sense to them).
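
If you want to check your own pages for this, here is a minimal sketch using only Python's standard library. It flags subresources hardcoded to `http://` and shows the protocol-relative rewrite asked for in the ticket. The class and function names are mine, and it only looks at `href`/`src` attributes, so treat it as a starting point and not a complete mixed-content scanner.

```python
from html.parser import HTMLParser


class MixedContentFinder(HTMLParser):
    """Collect subresource URLs that are hardcoded to plain http://."""

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":  # links are navigation, not subresources
            return
        for name, value in attrs:
            if name in ("href", "src") and value and value.lower().startswith("http://"):
                self.insecure.append(value)


def to_protocol_relative(url):
    """Rewrite http://host/path to //host/path so it follows the page's scheme."""
    return url[len("http:"):] if url.lower().startswith("http://") else url


# The two <link> tags from the support ticket above.
original = '<link rel="stylesheet" title="Default" href="inc/style.css" type="text/css" />'
rewritten = ('<link rel="stylesheet" title="Default" '
             'href="http://wang.yuxuan.org/inc/A.style.css.pagespeed.cf.5Dzr782jVo.css" '
             'type="text/css"/>')

finder = MixedContentFinder()
finder.feed(original + rewritten)
print(finder.insecure)
print([to_protocol_relative(u) for u in finder.insecure])
```

Only the rewritten tag is flagged, since the original relative `href` inherits whatever scheme the page was loaded over.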

Luckily I have Let's Encrypt now and no longer need them.

Well, the CEO does have beef with Google: https://blog.cloudflare.com/post-mortem-todays-attack-appare...

This led to Cloudflare refusing to implement support for Google Authenticator for 4 years.

lol, really? Google authenticator is just TOTP - it's an open standard. That seems childish.
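
To back up the "it's just TOTP" point, the whole algorithm (RFC 6238 on top of RFC 4226) fits in a few lines of standard-library Python. This is a sketch and not anyone's production code: real authenticator apps usually take the shared secret as a base32 string, which I skip here by passing raw bytes.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute a TOTP code (HMAC-SHA1 variant) for the given Unix time."""
    counter = unix_time // step                      # number of time steps elapsed
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                          # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# RFC 6238 test vector: ASCII secret "12345678901234567890", T = 59 seconds.
print(totp(b"12345678901234567890", 59, digits=8))  # 94287082
```

Any server that can do HMAC-SHA1 can verify these codes, so supporting the app does not require anything from Google.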

Also, the notion that the CEO of an internet company would have a "beef with Google" is pretty funny.

This comment greatly lowers my respect for Cloudflare.

Bugs happen to us all; how you deal with this is what counts, and wilful, blatant lying in a transparent attempt to deflect blame from where it belongs (Cloudflare) onto the team that saved your bacon?

I've recommended Cloudflare in the past, and I was planning, with some reservations, to continue to do so even after disclosure of this issue. But seeing this comment? I don't see how I can continue.

(For the sake of maximum clarity, I take issue: 1) With the attempt at suggesting the main issue is in clearing caches, not in the leak itself. It doesn't matter how fast you close the barn door after the horse is gone and the barn has burned down. 2) With the blatantly false claim that non-Google caches have been cleared, or were faster to clear than Google's. Cloudflare should know, better than anyone, the massive scope of this leak, and the fact that NO search engine's cache has been or could be cleared of this leak. If you find yourself in a situation so bad you feel like you need to misdirect attention to someone else, and it turns out no one else is actually doing anything so you have to lie about that...maybe you should just shut up and stop digging?)

Hey! Don't keep the horse locked in if the barn is burning!

> I agree it's troubling that Google is taking so long.

Google has absolutely no obligation to clean up after your mess.

You should be grateful for any help they and other search engines give you.

You're right, I guess. (Disclaimer: Not affiliated with any company affected / involved)

But I still find it troubling. Is it their mess? No. Does it affect a lot of people negatively? Yes. I expect Google to clean this up because they're decent human beings. It's troubling because it's not just Cloudflare's mess at this point.

It reminds me of the humorous response to "Am I my brother's keeper?", which is "You're your brother's brother"

Google cleaning this up is going to take a ton of man-hours, which will cost a LOT of money. How much money is Google obligated to spend to help a competitor who fucked up? Are they supposed to just drop everything else and make this the top priority?

I don't see this as them helping a competitor. The damage has been done (in terms of customer relations).

I view leaving up the cached copy of leaked data as being a jerk move, not towards Cloudflare, but towards anyone whose data was leaked.

This is an opportunity for Google to show what they do with rather sensitive data leaks: do they leave them up or scrub them?

Had damage from the leak already been done (to those whose data it was)? Probably. Even taking that into account, I think Google search comes off as a jerk in this situation.

I feel like you are operating under the assumption that deleting this leaked data is trivial, that they just have to hit a delete button and the data is gone.

This is not the case; it is not obvious, trivial, or easy to delete the leaked data. It is not simple to find it all. This is not like they are being given a URL and being asked to clear the cached version of it; they are being asked to search through millions of pages for possibly leaked content.

I despise the way you've dealt with this issue with as much dishonesty as you thought you could get away with.

I will be migrating away from your service first thing Monday. I will not use your services again and will ensure that my clients and colleagues are informed of your horrific business practices now and in the future.

Next time, beware of parsers. Or formally verify them :)


(disclaimer: co-author)

For those who haven't been following along, this is the CEO of CloudFlare lying in a way that misrepresents a major problem CloudFlare created. Additionally, they are trying to blame parts of this problem on those that told them about the problem they created.

At least tell me they got their t-shirts lol.

>I'm troubled that they went ahead with disclosure before Google crawl team could complete the refresh of their own cache.

It sounded like they (cf) were under a lot of pressure to disclose ASAP from project zero and their 7 day requirement...

eastdakota is one of the cloudflare guys, so "they" in that sentence can only refer to Google (see also the previous paragraph/sentences, where eastdakota used "we" for cloudflare).

He's the CEO

With something this drastic, 7 days was generous.

>> We have continued to escalate this within Google to get the crawl team to prioritize the clearing of their caches as that is the highest priority remaining remediation step.

If you are using the same attitude with their team as you use in this comment, I'm pretty sure they will be thrilled to set aside all their regular work and help you clean up an enormous mess created by a bug in your service.

Oh wow, taking a shit on Google after they helped you by reporting a critical flaw in your infrastructure.

I'm no longer using CF for my own projects, but you've just cemented my decision that none of my clients will either.
