Date: 2017-02-24
The caches other than Google were quick to clear and we've not been able to find active data on them any longer.
...
I agree it's troubling that Google is taking so long.
The leaked information is hard to pinpoint in general, let alone amongst indexes containing billions of pages.
I can understand the frustration - this is a major issue for Cloudflare and it's in everyone's best interests for the cached data to disappear - but it's not easy, and they shouldn't suggest that it is (or incorrectly claim that "The leaked memory has been purged with the help of the search engines", as their blog post states).
This is a burden that Cloudflare has placed on the internet community.
Each of those indexes - Google, Microsoft Bing, Yahoo, DDG, Baidu, Yandex, ... - has to fix a complicated problem not of its own creation.
They don't really have a choice either, given that the leak contains personally identifiable information - it really is a special sort of hell they've unleashed.
Having previously been part of Common Crawl and knowing many people at Internet Archive, I'm personally slighted. I'm sure it's hellish for the commercial indexes above to handle this properly, let alone for non-profits with limited manpower.
Flushing everything from a domain isn't a solution - that'd mean deleting history. For Common Crawl or Internet Archive, that's directly against their purpose.
eastdakota 19 hours ago (Cloudflare CEO)
>Google, Microsoft Bing, Yahoo, DDG, Baidu, Yandex, and more. The caches other than Google were quick to clear and we've not been able to find active data on them any longer. We have a team that is continuing to search these and other potential caches online and our support team has been briefed to forward any reports immediately to this team.
>I agree it's troubling that Google is taking so long. We were working with them to coordinate disclosure after their caches were cleared. While I am thankful to the Project Zero team for informing us of the issue quickly, I'm troubled that they went ahead with disclosure before the Google crawl team could complete the refresh of their own cache. We have continued to escalate this within Google to get the crawl team to prioritize the clearing of their caches, as that is the highest-priority remaining remediation step.
taviso 6 hours ago (Tavis Ormandy)
>Matthew, with all due respect, you don't know what you're talking about.
(Bunch of Bing Links)
Not as simple as you thought?
But their response here is embarrassingly bad. They're blaming Google? And totally downplaying the issue. I really didn't expect this from them. Zero self-awareness, or they believe they can just pretend it's not real and it'll go away.
If you find some samples with domain names / unique identifiers of domains (e.g. X-Uber-...) you are welcome to contribute to the list: https://github.com/Dorian/doma/blob/master/_data/cloudbleed....
At this point if you don't consider all data that was sent or received by CloudFlare during the "weaponized" window compromised, you're lying to yourself.
What about sites using Full SSL (a certificate between Cloudflare and the user, and a certificate between Cloudflare and the server)? Could any information from SSL pages have been leaked?
Not exactly breaking news. At some point, maybe people will realise that CF is actively making the internet worse and less secure, and that it should be treated as nothing more than a wart to be removed.
Either we can search for obvious strings like X-Uber-* and try to scrub them one by one, or we can just nuke the caches for all the domains that turned on the problematic features (Scrape Shield, etc.) anytime between last September and last weekend. Cloudflare should supply the full list to all the known search engines including the Internet Archive. Anything less than that is gross negligence.
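For what it's worth, the "search for obvious strings" option could look roughly like the sketch below. It is only an illustration: the `X-Uber-` prefix is the one mentioned in this thread, while the `MARKERS` list and the `find_markers` helper are made-up names, and a real scan would need far more patterns than anyone can enumerate by hand, which is why this approach only scrubs the leaks you already know how to recognise.

```python
import re

# Hypothetical marker patterns. "X-Uber-" is the prefix mentioned in the
# thread; any further patterns would have to be collected from real samples.
MARKERS = [re.compile(r"X-Uber-[A-Za-z0-9-]+")]


def find_markers(text, patterns=MARKERS):
    """Return every substring of `text` that matches one of the marker patterns."""
    hits = []
    for pattern in patterns:
        hits.extend(match.group(0) for match in pattern.finditer(text))
    return hits
```

A cached page would be flagged for removal whenever `find_markers(page_text)` returns a non-empty list.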
If Cloudflare doesn't want to (or cannot) supply the full list of affected domains, an alternative would be to nuke the caches for all the domains that resolved to a Cloudflare IP [1] anytime between last September and last weekend. I'm pretty sure that Google and Bing can compile this information from their records. They might also be able to tell, even without Cloudflare's cooperation, which of those websites used the problematic features.
[1] https://www.cloudflare.com/ips/
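A minimal sketch of the "resolved to a Cloudflare IP" check, assuming a search engine keeps some record of (domain, IP) pairs from its crawls. The `resolution_log` format and the helper names are hypothetical, and the ranges listed are only an illustrative subset; the full, current list should be loaded from the page in [1].

```python
import ipaddress

# Illustrative subset only (assumption: may be incomplete or outdated).
# The authoritative list is published at https://www.cloudflare.com/ips/
SAMPLE_CLOUDFLARE_RANGES = [
    "173.245.48.0/20",
    "103.21.244.0/22",
    "198.41.128.0/17",
    "162.158.0.0/15",
]


def parse_ranges(cidrs):
    """Turn CIDR strings into network objects."""
    return [ipaddress.ip_network(cidr) for cidr in cidrs]


def in_ranges(ip, networks):
    """True if `ip` falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


def domains_to_purge(resolution_log, networks):
    """Given (domain, ip) pairs from a hypothetical crawl log, return the
    sorted set of domains that ever resolved into one of the networks."""
    return sorted({domain for domain, ip in resolution_log if in_ranges(ip, networks)})
```

This only identifies domains that sat behind Cloudflare; it says nothing about which of them had the problematic features enabled, so it over-approximates the affected set.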