Hacker News

I'd really like to see Cloudflare spend more time discussing how they've quantified the leak here.

What would you like to see? The SAFE_CHAR logging allowed us to get data on the rate, which is how I got the percentage-of-requests figure.

How many different sites? Your team sent a list to Tavis's team. How many entries were on the list?

We identified 3,438 unique domains. I'm not sure if those were all sent to Tavis because we were only sending him things that we wanted purged.

3,438 domains which someone could have queried, but potentially data from any site which had "recently" passed through Cloudflare would be exposed in response, right? Purging those results helps with search engines, but a hypothetical malicious secret crawler would still potentially have data from any site.

It doesn't have to be a secret crawler, just one that wasn't contacted by Cloudflare (I didn't see any non-US search providers mentioned).

In other words, Baidu are currently sitting on a treasure trove of keys and passwords.

Possibly not; Baidu and Cloudflare have a well-documented long-term partnership.

Maybe there's much more to worry about in Baidu's particular not-so-well-documented but longer-term partnerships.

Oh, absolutely. Baidu's relationship with their host nation should be a source of concern for us all. I've heard some interesting and unusual stories.

But they're probably aware of this issue and know enough to go looking to purge their caches.

Or Baidu know enough to not purge their caches. Think of the amount of tangible gratitude that their host nation would show them for access to some potentially tasty information....

Swap Baidu for Google or Microsoft in that sentence and it still has the same problems. Every government three-letter agency has a vested interest in the secrets.

Whether you believe it or not, there is actually a tangible difference between the relationships US corporations have with the USG vs other nations and their corporate entities.

They're not all 3 letters. (e.g. GCHQ, ASIO, CSIS, DGSI, etc.)

It's an expression.

Well, purge their public cache, after taking a private dump and supplying it to those who would find value in such a thing.

>"I've heard some interesting and unusual stories."

Do you care to share or elaborate on this?

They're not my stories to share, I'm afraid.


I wonder if archive.org or archive.is have anything cached...

archive.is was marked red, meaning it uses Cloudflare....


The concern isn't that they use Cloudflare. The concern is that they're spidering the Internet, and therefore might be storing cached data that Cloudflare leaked.

While the Internet Archive / Wayback Machine does spider, I think archive.is only archives a site "on demand".

Yes but with all the people and even automated 3rd-party scripts making use of archive.is, it is practically a spider.

No TLS on this site?


Have you asked them for an ETA on your shirt?

You know a company isn't serious about security when their top security bounty is a t-shirt. Instagram has a better policy, for God's sake.

Instagram has been part of Facebook for over four years, so they are covered by the Facebook Bug Bounty: https://www.facebook.com/whitehat

I'd love to see some evidence that big bounties correspond to more exploits being found. In my experience, they tend to result in an increasing amount of crap for your security team to sort through.

Plenty of companies that are serious about security don't do bounties. They're a real pain to administer, apparently.

I'd expect a company that can MITM a good chunk of the Internet to incur that pain in exchange for all the money customers pay them.

fuck :(

Indeed, this is the point in the comment thread where you get the feeling the internet is broken.

What I'm wondering: how many fuckups like this need to happen for website owners to realize that uber-centralization of vital online infrastructure is a bad idea?

But I guess there is really no incentive for anyone in particular to do anything about this, because it provides a kind of perverted safety in numbers. "It's not just our website that had this issue, it's, like, everyone's shared problem." The same principle applies to uber-hosting providers like AWS and Azure, as well as those creepy worldwide CDNs.

Interestingly, it seems this is one of the cases where using a smaller provider with the same issue would really make you better off (relatively speaking) because there would be fewer servers leaking your data.

Offer to fix DDoS attacks as cheaply as Cloudflare does and people will move away. It's a big problem, and the general consensus is, "just use Cloudflare to fix your DDoS problem!"

You might as well scrap http entirely, with or without the "s".

The web simply doesn't scale. The only way to fix DDoS reliably is peer-to-peer protocols. Which hardly ever happens because our moronic ISPs believed nobody needed upload. Or even a public IP address.

As someone who has been involved in a number of moronic ISP designs, operations, and build-outs: asymmetric access networks are designed that way due to actual traffic patterns and physical-medium constraints.

you can argue "if everything was symmetric, then traffic patterns would be different" and you might be right, but that's not how the market went or how the "internet" started.

The client-server paradigm drove traffic patterns, and there was never any market demand for, or advantage in, ignoring it.

That's not how the market went because the market is often moronic. Case in point: QWERTY. (Why QWERTY is actually the best layout ever is left as an exercise to the occasional extremist libertarian)

Yes, traffic patterns at the time were heavily slanted towards downloads. I know about copper wires and how download and upload limit each other. Still, setting that situation in stone was very limiting. It's a self-fulfilling prophecy.

You don't want to host your server at home because you don't have upload. The ISP sees nobody has servers at home so they conclude nobody needs upload. Peer-to-peer file sharing and distribution is slower than YouTube because nobody has any upload. Therefore everybody uses YouTube, and the ISP concludes nobody uses peer-to-peer distribution networks.

And so on and so forth. It's the same trend that effectively forbade people from sending e-mail from home (they have to ask a big provider such as Gmail to do it for them, with MITM spying and advertisement), or the rise of ISP-level NAT instead of giving everyone a public IPv6 address like they all deserve (including on mobile).

There is a point where you have to realise the internet is increasingly centralised at every level because powerful special interests want it to be that way.

Regulation is what we need. Net neutrality is a start. Next in line should be mandated symmetric bandwidth, no ISP-wide firewall (the local router can have safe default settings), public IP (v4 or v6) for everyone, and no restriction on usage patterns (the ISP should not be allowed to forbid servers). Ultimately, our freedom of expression and freedom of information depends on this. They are messing with human rights.

> Peer-to-peer file sharing and distribution is slower than YouTube because nobody has any upload.

And because IP multicast doesn't work over the internet. If it did, even if merely to some limited extent, some asymmetries would be far easier to stomach.

> you can argue "if everything was symmetric, then traffic patterns would be different" and you might be right, but that's not how the market went or how the "internet" started.

It may not have been how the market went but it definitely was how the internet got started.

You say this as I look at my positively anemic upstream that makes browsing even simple Nagios pages painfully slow, and my ISP that doesn't offer anything substantively better without a massive increase in monthly costs.

The traffic patterns for higher upstream aren't there because they can't be there.

Decentralisation doesn't do a whole lot better. Just think about MTA or DNS vulnerabilities, for a start.

Or look at how many websites are still vulnerable to Heartbleed.

The Internet will remain periodically broken until we put a cost metric on the breaking (and working) times.

Which means any user who has used any service which uses CloudFlare, right? At least in theory.

How can I find out which services I have accounts with are using Cloudflare? Or better, which have been using Cloudflare in recent months? Assume I have a list of domains where I have accounts.

We're compiling a list of affected domains using several scrapers here:


I ranked your list of Cloudflare-using domains by their Alexa rank.

Sharing here in case anyone else finds it useful

(warning - it's 1.1MiB gzipped / 2.4MiB uncompressed)


Any domains outside the top 1 million are omitted.
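The join itself is simple. Here's a minimal sketch of how such a ranking could be produced with the stdlib; the file names are hypothetical, and the Alexa file is assumed to be the usual "rank,domain" CSV:

```python
# Sketch: rank a list of Cloudflare-using domains by Alexa position.
# File names are hypothetical: "cloudflare_domains.txt" holds one domain
# per line, "alexa-top-1m.csv" holds "rank,domain" rows.
import csv

def rank_domains(domains_path, alexa_csv_path):
    with open(alexa_csv_path, newline="") as f:
        alexa = {domain: int(rank) for rank, domain in csv.reader(f)}
    with open(domains_path) as f:
        domains = {line.strip() for line in f if line.strip()}
    # Domains outside the top 1 million simply aren't in the CSV,
    # so they drop out of the result here too.
    return sorted((alexa[d], d) for d in domains if d in alexa)
```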

Hacked this together to determine which ones out of the list are potentially using Cloudflare reverse proxies. You could also send an HTTP request to them and look for the cloudflare-nginx Server header.
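A minimal sketch of that header check, assuming the edge still identifies itself as "cloudflare-nginx" the way it did at the time (the function names here are my own):

```python
# Sketch: detect Cloudflare by the Server response header. At the time
# of this bug, Cloudflare's edge identified itself as "cloudflare-nginx".
import http.client

def looks_like_cloudflare(server_header):
    # Pure string check, so it can be tested without network access.
    return (server_header or "").lower().startswith("cloudflare")

def served_by_cloudflare(domain, timeout=5):
    """HEAD the domain over HTTPS and inspect its Server header."""
    try:
        conn = http.client.HTTPSConnection(domain, timeout=timeout)
        conn.request("HEAD", "/")
        server = conn.getresponse().getheader("Server", "")
        conn.close()
    except OSError:
        return False  # unreachable hosts are treated as non-Cloudflare
    return looks_like_cloudflare(server)
```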


You can check IP whois records, but it'll be very hard to be 100% sure about any of them. For example, one of the examples from the bug report is Uber, which doesn't use Cloudflare for its home page but apparently does for one of its internal API endpoints.
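One way to approximate the IP check is to test resolved addresses against Cloudflare's published ranges (https://www.cloudflare.com/ips-v4). This is a sketch, not a definitive test; only two example networks from that list are hard-coded, and as the Uber example shows, a domain's front page can resolve elsewhere while some endpoints still sit behind Cloudflare:

```python
# Sketch: guess whether a host is behind Cloudflare by its IP address.
# Cloudflare publishes its ranges at https://www.cloudflare.com/ips-v4;
# only two example networks are hard-coded here, not the full list.
import ipaddress
import socket

CLOUDFLARE_V4 = [ipaddress.ip_network(n)
                 for n in ("104.16.0.0/12", "172.64.0.0/13")]

def ip_in_cloudflare(ip):
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in CLOUDFLARE_V4)

def domain_on_cloudflare(domain):
    # Resolves only one A record, so it can miss multi-homed setups.
    try:
        return ip_in_cloudflare(socket.gethostbyname(domain))
    except OSError:
        return False
```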

There is a Chrome extension named "claire"[1] which tells you if a site uses Cloudflare or not, but I'm not sure about other browsers (Firefox or others).

[1]: https://chrome.google.com/webstore/detail/claire/fgbpcgddpmj...

For Firefox, I just made this: https://github.com/traktofon/cf-detect

At this point, I would just start rolling everything. (And I have.)

[edit: correction]

No. 3,438 domains were configured to expose this, and were potentially queried and logged by a far greater number of people. And still other data (anything that passed through Cloudflare for months) could be exposed.

Potentially huge amounts of stuff might be exposed, but I have some assurances that "the practical impact is low" from someone I trust, so I think it's just a lot of random data. I'd still rotate all credentials which passed through Cloudflare in the past N months (and if I were a big consumer site NOT on Cloudflare, I might change end user passwords anyway, due to re-use), but I don't think it will be the end of the world.

It may seem like a nightmare Internet data security scenario, but it looks like Tavis is going to get a free t-shirt out of the deal, so let's just call it a wash.

What anomalies would be apparent in your logs if someone malicious had discovered this flaw and used it to generate a large corpus of leaked HTTP content?

That's also what I'm interested in. There's a lot of talk about the sites that had the features enabled that allowed the data to escape, but it's the sites that were co-existing with those that were in danger.

In terms of the caching, knowing the broken sites tells you where to look in the caches after the fact, but do you have any idea whose data was leaked? Presumably 2 consecutive requests to the same malformed page could/would leak different data.

> Presumably 2 consecutive requests to the same malformed page could/would leak different data.

Wouldn't the second request be served from the CDN cache? Since for Cloudflare that particular page is a valid cached page, it would send you that same page on the second request.

Only if the leaked memory is in the response before the response is cached.

I don't know enough about the layers in the Cloudflare system to say. Does it only apply to cached pages? What about HTTPS? They would have the SSL termination first and then these errant servers behind that; none of those pages would be cached, right?

Cloudflare doesn't cache HTML pages by default.

It seems to me you'd have to know, at a minimum:

1. every tag pattern that triggers the bug(s)

2. which broken pages with that pattern were requested at an abnormally high frequency or had an unusually short TTL (or some other useful heuristic)

3. on which servers, and at what time, in order to tell

4. whose data lived on the same servers at the same time as those broken pages

to even begin to estimate the scope of the leak. and that doesn't even help you find who planted the bad seeds.

Here's a question your blog post doesn't answer but should, right now:

Exactly which search engines and cache providers did you work with to scrub leaked data?

Also, have you worked with any search engine to notify affected customers?

E.g., right now there is an easily found Google-cached page with OAuth tokens for a very popular fitness wearable's Android API endpoints.

Are you guys planning to release the list so we can all change our passwords on affected services? Or are you planning on letting those services handle the communication?

That list contains domains where the bug was triggered. The information exposed through the bug though can be from any domain that uses Cloudflare.

So: all services that have one or more domains served through Cloudflare may be affected.

The consensus seems to be that no one discovered this before now, and no bad guys have been scraping this leak for valuable data (passwords, OAuth tokens, PII, other secrets). But the data was still saved all over the world in web caches, so the bad guys are now probably after those. Though I don't know how much 'useful' data they would be able to extract, or what the risks for an average internet user are.

> The consensus seems to be that no one discovered this before now, and no bad guys have been scraping this leak for valuable data (passwords, OAuth tokens, PII, other secrets).

This is literally as bad as it gets; anyone trying to palliate the situation has something to sell you. You'd have to be an idiot to think that $organization (public, private, or shadow) doesn't have automated systems to check for something as stupidly simple as this by querying resources at random intervals and searching for artifacts.

Someone found it. Probably more than one someone. Denial won't help.

Ah, gotcha. Thanks for explaining!

Four other people I know and I all happened to get our reddit accounts temporarily locked due to a "possible compromise" in the past week or so, which has never happened to any of us before. Anyone else?

That would be unrelated to this. We haven't taken any action on any accounts because of this issue and have no plans to, as we (reddit.com) were unaffected.

Happened to me as well. If it's not related to CloudBleed, can you tell us specifically what happened? It's making me not trust Reddit.

If anything, it should make you trust reddit more! I don't know the exact details as to why your account may have been locked, but generally it will be because we're being proactive and have some signal that your account is using a weak or reused password.

Why was reddit on the list of affected sites, and how do you know reddit wasn't affected?

My reddit password failed a week ago, and I had to do an email reset. And I use a password manager.

In that case I'm even more inclined to think it might be because of Cloudbleed.

I've compiled a list of 7,385,121 domains that use Cloudflare here: https://github.com/pirate/sites-using-cloudflare

This list is misguided. It's just a dump of sites using Cloudflare's DNS, a hugely popular and (mostly) free service. The vulnerability only affected customers using Cloudflare's paid SSL proxy (CDN) service. The latter is a much smaller subset. Even then, only a subset of the SSL proxy users, those with certain options enabled that caused traffic to go through a vulnerable parser, were really impacted. I'm not sure a list as broad as this is helpful.

At least some of this is incorrect. The issue is NOT the pages running through the parser — the issue is the traffic running through the same nginx instance as vulnerable pages.

You are right in that other sites are affected but only the sites running through the parser would have leaked content in their cached pages.

This is not correct in my understanding: The sites with certain options enabled produced the erroneous behavior, but the data that would get leaked through this behavior could be from any site that uses Cloudflare SSL (as this requires Cloudflare to tunnel SSL traffic through their servers, decrypt it and re-encrypt it with their wildcard certificate). So if I understand correctly anyone using the (free) Cloudflare SSL service in combination with their DNS is affected.

I was wrong about the nature of the proxy issue, but right about DNS-only customers. Customers using only the free DNS service were not impacted by this at all, because traffic never flowed through the proxies.

Ah yes, sure if you only use DNS then your data never touches a CloudFlare server. Lucky you ;)

(whoops forgot to remove dupes, it's only 4,287,625) https://github.com/pirate/sites-using-cloudflare/raw/master/...

If I'm understanding correctly, that list would include not only the 3,438 domains with content that triggered the bug, but every Cloudflare customer between 2016-09-22 and 2017-02-18.

Can we trust it was only those domains?

Not really. If a site is using Cloudflare protection for only some of its subdomains, it does not show up on this list even if the site itself is in the Alexa top 10k.

And of course all other sites that are not in the Alexa top 10k are not in this list (unless they are on some of the other lists used; you can see the source lists in the README of the GitHub repo).

No. Only Cloudflare customers using a subset of features of the SSL proxy service are impacted.

Cloudflare has a lot of customers who only use the free DNS service, for example.

Careful. It appears that any Cloudflare client who was sending HTTP/S traffic through their proxies is affected. A small subset of their customers had the specific problem that triggered the bug, but once triggered, the bug disclosed secrets from all their web customers.

You're not exposed if you never sent traffic through their proxies; for instance, if you somehow only used them for DNS.

I suspect there are a large number of Cloudflare customers that only use their DNS. I have a couple of domains in this category.

The DNS service is essentially free. It's an upgrade from most registrars' built-in DNS. It's a pretty robust solution, really -- global footprint, DNSSEC, fully working IPv6, etc.

My point is, the actual number of impacted customers was much smaller than the entire set of Cloudflare customers. There are lists in this thread that still reference hundreds of thousands (millions?) of sites, and that's just wrong.

(I agree on your first point though; I was confused about the nature of the proxy bug at first).

What I find remarkable is that the owners of those sites weren't ever aware of this issue. If customers were receiving random chunks of raw nginx memory embedded in pages on my site, I'd probably have heard about it from someone sooner, surely?

I guess there is a long tail of pages on the internet whose primary purpose is to be crawled by google and serve as search landing pages - but again, if I had a bug in the HTML in one of my SEO pages that caused googlebot to see it as full of nonsense, I'd see that in my analytics because a page full of uninitialized nginx memory is not going to be an effective pagerank booster.

Perhaps as a follow-up to this bug, you can write a temporary rule to log the domain of any HTTP responses with malformed HTML that would have triggered a memory leak. That way you can patch the bug immediately, and observe future traffic to find the domains that were most likely affected by the bug when it was running.
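A rough sketch of such a rule, assuming (per Cloudflare's postmortem) that the trigger was a page ending in an unterminated tag/attribute such as "<script type=" at the end of the buffer; the regex and function name here are illustrative only:

```python
# Sketch of a logging rule for the trigger condition. Per Cloudflare's
# postmortem, the leak fired on pages ending with an unterminated
# attribute (e.g. "<script type="), so this flags any response body
# whose final tag never closes; the proxy would then log that domain.
import re

# An opening tag near the end of the body with no ">" before end-of-input.
UNTERMINATED_TAG = re.compile(r"<[a-zA-Z][^>]*\Z")

def would_have_triggered(body):
    return UNTERMINATED_TAG.search(body) is not None
```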

Or is the problem that one domain can trigger the memory leak, and another (unpredictable) domain is the "victim" that has its data dumped from memory?

I believe that's the real issue. Any data from any Cloudflare site may have been leaked. Those domains allow Google etc. to know which pages in their cache may contain leaked info; unfortunately, the info itself could be from any request that travelled through Cloudflare's servers.

Yes, the victim can be a different site. Cloudflare's post mentions this: "Because Cloudflare operates a large, shared infrastructure an HTTP request to a Cloudflare web site that was vulnerable to this problem could reveal information about an unrelated other Cloudflare site." https://blog.cloudflare.com/incident-report-on-memory-leak-c...
