I actually forgot to consider this angle, as the discussion was centred on archive.is. I'd imagine 188.8.131.52 has very negative consequences for Netflix local caches, for example. This is where your local homegrown provider gets to save significant $$$$ on interconnect/bandwidth costs by hosting a local Netflix cache through their appliance, and you get to benefit from the latest shows being locally cached and delivered at maximum speed, instead of being hauled across the whole internet each time for each device.
If you're using 184.108.40.206, you're basically not only making sure that your internet will be much slower due to suboptimal CDN performance by any CDN other than Cloudflare CDN, but that you're also going to needlessly be running extra simultaneous streams of your favourite shows instead of fetching a local copy from your own ISP, increasing the cost of bandwidth transit to your ISP.
And, remember, you don't actually get any extra privacy by bypassing ECS in the first place, because your exact full IP address will have to be used to establish any subsequent TCP or UDP connections to make those requests for actual content in any case. You're basically breaking the whole internet by using 220.127.116.11, all for no real benefit! It's worse than we initially thought!
Is it possible for Netflix to use anycast?
For their appliances they can advertise to an ISP's routers via (i/e)BGP or OSPF / IS-IS to keep traffic internal, but have a fallback of having a presence in various IXPs.
Isn't this how Cloudflare works, anycast?
Basically, it would seem that Cloudflare is trying to close the performance gap by artificially limiting the performance potential of alternative CDN providers to match their own levels.
You've got it wrong. Netflix saves significant $$$ by not paying the provider for unthrottled transit. Your mental model is not how it works in practice these days.
DNS isn't the only way to handle that. If it becomes an issue for Netflix, they can handle this situation another way and it will work just fine.
Both make implicit assumptions. One assumes the worst of Cloudflare and thinks “what’s the worst reason Cloudflare could have for doing this. How do they profit off this?” And the other assumes that Cloudflare has good intentions.
Neither answer is technically wrong. Both flow logically from their initial assumptions. But it shows how different our conclusions can be depending on where our initial biases lie. For the person who believes the first answer and says “prove to me that Cloudflare isn’t doing something nefarious”, it’s not possible. The analysis is correct and can’t be challenged unless the initial assumption is challenged. And for people who strongly believe that Cloudflare has bad intentions, nothing can be done to change their mind.
In this example it’s Cloudflare but it applies to any person or organisation that we feel strongly about.
You can only hold companies accountable to the law, to explicit written promises, and to legally binding agreements.
Currently the price companies pay for privacy violations is low. If a company like Cloudflare writes down all its privacy promises in a legally binding manner, and accepts legal and financial liability above the norm for intentionally breaking that contract, it can increase trust.
Companies can do much more than they do now. They can offer explicit bounties for whistleblowers who reveal privacy violations. They can hire trusted third parties to do privacy audits and handle whistleblowing.
Looks like they are.
Employees blowing the whistle internally, or externally? If they want to encourage employees to blow the whistle externally, they could put a carve out for that in their NDA.
All software seems to need that nowadays.
Of course, in that case you can't put surprising terms into the agreement if they are disadvantageous to the user. Courts don't see that a meeting of the minds took place. https://en.wikipedia.org/wiki/Meeting_of_the_minds
This court decision doesn’t mean “no rules for scrapers”, rather it means “different rules for scrapers, independent of any site-specific TOS”. Or did I misunderstand the decision?
Web scraper as a consumer use is hard to argue.
Neither answer may look technically wrong, but only one reflects what is actually happening here. That we don't know which exactly based on that specific data doesn't mean that both are equally valid.
The sentiment of the red message is great though.
So, yes, good observation.
And while the second answer is a statement, not an analysis, the rest of what I said holds. You will only accept their statement as the truth if you assume good intent on their part.
Cloudflare has repeatedly said that while they operate for profit, they take the long term view. By doing the right thing now, by being privacy focussed, they will be profitable for decades to come. This seems logical to me, which makes the second answer more believable.
Pepperidge Farm remembers.
While at the same time working to preserve people's privacy with things like giving out SSL for free, pushing for eSNI, running a public DoH server, building a service that makes sure all data from your phone to us is encrypted etc. etc.
Likewise for 18.104.22.168 — when taking into consideration the local caching appliances that the ISPs have invested in, the lack of ECS would make the clients go all the way through the internet for the same content that's already cached locally by the ISP for users of all other decent resolvers — this will only contribute to increased costs for the individual ISPs, extra latency for users, and more competitive advantage of your products due to you diminishing the technological advantages of your competitors, without regard to the actual user experience of the users, or the reliability and scaling of the internet infrastructure at large.
Not to mention that such Netflix/YouTube usage, when going directly through transit providers and through the whole internet, would also subject the users to a greater chance of surveillance at large compared to users of resolvers that would access local copies on the caching appliance.
One of the better alternatives is to get a VPN you trust that puts multiple users behind the same IP address and then operate your own recursive DNS from behind there. The VPN service itself could still log your queries, but at least they have plenty of competitors, and you chose one you trust, right? Or if you don't want to trust any one party, use Tor.
I trust Cloudflare much more than I trust any ISP I've had to deal with, including American ISPs when I lived there. I trust Google much more than any ISP, and I'm not particularly charitable towards Google.
Centralized DoH isn't perfect, but it's better than the status quo. The SNI hole is shrinking. My threat model does not include defending against the Mossad doing Mossad things with my email^H^H^H^H^HDNS.
And Warp+ aims to be about that plus performance.
If you want to be totally anonymous on the Internet then I recommend you use Tor. If you just use a VPN then you may hide your IP address from sites you visit but there are tons of other fingerprinting techniques that can be used.
In the case of DNS, information about the subnet, the query, etc. is sent around unencrypted.
One is open to eavesdropping, the other is not.
That doesn't sound too bad, privacy-wise.
EDIT: I mean if you were to map all US IPs to a single canonical IP, for instance.
To eavesdrop on Warp you'd need to do it all over the world, capture encrypted traffic and then try to correlate traffic. If your threat model is a global adversary capable of doing that correlation and you don't want sites to know your IP, then use Tor.
No, they can sit near your 22.214.171.124 servers and catch all incoming and outgoing traffic, watching connections to your 126.96.36.199 servers that initiate DNS queries and actual outgoing queries that 188.8.131.52 makes to authoritative servers and responses too.
With EDNS sitting in front of the authoritative server of the site this actor is trying to monitor.
The latter is easier than the former.
The market rate for standard SSL certs is zero.
You are saying the accessor function getX(), which returns a value of X but which you don't trust because you think it's giving you crap, should not be treated any differently depending on whether getX() even has access to X or has absolutely no such access. (For example, if the value of X isn't even on the same network partition as the getX() function you don't trust.)
You're saying if you don't trust it, it doesn't matter if the function itself even has access to X or doesn't.
In one sense that might be true, but in another sense that seems silly. If getX() itself has access to X, you can try to determine whether it is giving it to you. If getX() doesn't have any access to X, then it doesn't really matter what it's doing; its process is irrelevant.
So to me there's a huge material difference. We can try to judge the process by which getX() returned Cloudflare's motivations. What steps did it perform to return that value? What's the code? Etc.
Huge difference. That knowledge is somewhere in the company.
You do not really address the fact that they are not required to say the truth, or that when the truth is harmful for their public image they are directly incentivised to not speak the truth. The only way you do address this is by saying that this is something that needs investigating. I would posit that the grandparent has done this already, and come to the sensible conclusion: There is less reason to trust someone incentivised to lie than there is to trust someone who knows nothing.
Aside from that trust, we have to evaluate the validity of statements. Given prior knowledge, for Cloudflare in the bad case the likelihood of a valid statement approaches zero. For the random person yelling things as they pop into their mind, it is completely unknown.
It's not so pure. For example an outsider here on HN who says "A close relative of mine works at cloudflare on the team that made this decision, and he confided in me..." -- then again you have to somehow judge if this is true or not, but it is worth treating it differently from someone writing "I don't have any insider information and this is pure speculation, but maybe..."
I mean it just doesn't make sense to treat these cases as exactly the same. I wanted to give another example. Say you don't trust the GPS coordinates you're being given when you make an API call on a device.
Would it make sense to treat it exactly the same as making the API call on a device that doesn't even have a GPS module, such as a microcontroller without GPS or wifi/cellular access or anything that can be a proxy for GPS?
If there's a physical module and you don't trust the output, at least you can investigate. It doesn't make sense to treat it exactly the same as if the information isn't even on the same device.
It depends on the details of the process that's giving you the output you don't trust. What's the process by which getX() returns its output? What's the process by which Cloudflare employees make statements about their motivations (which they do have access to)?
These are questions we can investigate. If we find that the statements are written by a PR agency that hasn't even stepped inside their building and has no contact with the teams it's lying on behalf of, that's a possible result too. But it's worth looking into.
If your site depends on a DNS extension that's only 3.5 years old (and designed to be optional), I think it's fair to say your site is just offline for some users due to a config mistake.
You're free to set up your servers however you like, but there's wisdom in Postel's law.
This statement is based upon a terrible misunderstanding of Postel's
robustness principle. I knew Jon Postel. He was quite unhappy with
how his robustness principle was abused to cover up non-compliant
behavior, and to criticize compliant software.
Jon's principle could perhaps be more accurately stated as "in general,
only a subset of a protocol is actually used in real life. So, you should
be conservative and only generate that subset. However, you should also
be liberal and accept everything that the protocol permits, even if it
appears that nobody will ever use it."
Further discussion on the topic:
For the lazy like me: robustness principle, aka Postel's law
Thank you for the reference. I learned something today!
> A flaw can become entrenched as a de facto standard. Any implementation of the protocol is required to replicate the aberrant behavior, or it is not interoperable. This is both a consequence of applying Postel's advice, and a product of a natural reluctance to avoid fatal error conditions.
> I think it's fair to say your site is just offline for some users due to a config mistake.
Archive.is is not making an accidental mistake. Archive.is is behaving very intentionally. They've said so on Twitter. And I believe profmonocle agrees with me on that point.
Just to give you more insight: Google knows which IP address I am using Gmail from. If I use 184.108.40.206, they know what other content I am looking for and which websites I visit, and can tie that to my account. If I use something like Cloudflare, which does not expose my IP (or range), then I achieve more privacy. I could use my local DNS server (like I do at home), but I travel a lot.
In this case the "misconfiguration" is actually for privacy, and archive.is could live with that just like other sites do, but they intentionally screw with Cloudflare (aka the users who have 220.127.116.11 as their resolver).
The exact same command fails when sent from Cloudflare's datacenters, but succeeds when sent from DigitalOcean:
Two more sources:
End users switching to Cloudflare's DNS endpoint are doing so because they feel the DNS provider is both faster and more secure.
They rightly made the decision NOT to pass on the end user's IP information to the upstream DNS server. I agree with this decision and they are acting in my best interests in doing so. To draw some kind of nefarious intention from this is absurd.
Until Cloudflare are proven to be nefarious actors, I'll continue to use their service.
In this instance, the upstream DNS server and the resultant HTTP server are operated by the same organisation. Cloudflare have opted to not provide the /24 (or /56 if IPv6) network that the original DNS request came from, in the DNS request. Your computer will then provide the /32 (or /128 if IPv6) that your request is coming from when you connect to the HTTP server.
What privacy win have you gained by Cloudflare not providing that information in this instance?
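To make the /24 versus /32 comparison concrete, here is a minimal sketch using Python's standard `ipaddress` module. The function name `ecs_prefix` and the sample addresses (documentation ranges) are made up for illustration; the /24 and /56 prefix lengths are simply the ones mentioned above, not something read from any resolver's configuration.

```python
import ipaddress

def ecs_prefix(client_ip: str) -> str:
    """Return the truncated network a resolver could forward via ECS (hypothetical helper)."""
    addr = ipaddress.ip_address(client_ip)
    # Prefix lengths as quoted in the comment above: /24 for IPv4, /56 for IPv6.
    prefix_len = 24 if addr.version == 4 else 56
    # strict=False zeroes the host bits instead of raising an error.
    network = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
    return str(network)

client = "203.0.113.77"  # example address from a documentation range
print("ECS would carry: ", ecs_prefix(client))
print("HTTP server sees:", client + "/32")
```

The DNS query would expose only the truncated network, while the later HTTP connection necessarily exposes the full address to whoever runs the server.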
In this particular case, it's one CDN taking issue with another CDN only. No other DNS providers appear to be impacted.
'Feel' being the keyword. Faster, generally yes. More secure, not well defined and users are generally wrong.
> nefarious intention
I don't believe I've heard any complaints of nefarious intent.
But let's be clear, this advantages Cloudflare over other CDNs. That they treat the DNS data very well does not mean they won't have an incident. As well, they are more of a target due to the concentration.
> Until Cloudflare are proven to be nefarious actors,
Nefarious wrt whom? For end-users taken individually, I agree, I don't see and it's hard for me to imagine mal intent.
But IMHO they are bad for the Internet. I mean, more power to them and were I a leader there I'd press the same agenda, but as a 3rd party, the way I see it is that in 10 years they are going to be an anti-power much like Google is. Addiction to their services will allow them to trample over what's good for all.
What I dislike most about them is that they promote themselves as purely a force for good. Except for a few PMs and execs I'm 100% sure they believe it. But it's a disservice to never discuss the negative aspects of any of their services. And woe to anyone who does.
As for proven nefarious deeds, do you not consider "banning" sites from using CF nefarious? What if they take it to the next step now, and stop providing DNS for those sites? Given their stated reason for bans, yes it could happen. Why must you wait until they prove to be nefarious? The concentration of power per se is a bad thing.
Maybe there's a big backstory here, but without context that seems passive-aggressive and quite random?
I’m also surprised that traffic from Cloudflare DNS users caused any significant problem. Was it really that much traffic?
It's not. The proof is that CDNs existed long before edns-client-subnet was introduced. All it does is allow the CDN's DNS servers to return the most optimal A/AAAA records for the client. But the worst that should happen without it is you get sent to a more distant CDN server, and the content loads more slowly.
The fact that archive.is somehow suffers without this feature (which, btw, wasn't standardized until 2016) suggests they're doing something really, really odd. If I were them, I'd focus on making my system more robust, rather than demanding the rest of the Internet adopt a relatively young, optional DNS extension.
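A rough sketch of the behaviour described above, assuming a GeoDNS-style authoritative server that picks a cache from a lookup table. The table, the addresses (all documentation ranges), and the name `pick_pop` are invented for illustration and do not describe any real CDN. The point is only that a missing ECS hint should degrade to "a more distant server", not to "no answer".

```python
import ipaddress
from typing import Optional

# Hypothetical mapping from client networks to the nearest cache address.
POPS = {
    ipaddress.ip_network("198.51.100.0/24"): "192.0.2.10",
    ipaddress.ip_network("203.0.113.0/24"): "192.0.2.20",
}
DEFAULT_POP = "192.0.2.99"  # distant, but still a working server

def pick_pop(resolver_ip: str, ecs_subnet: Optional[str] = None) -> str:
    """Choose a cache using the ECS subnet if present, else the resolver's own address."""
    hint = ipaddress.ip_network(ecs_subnet if ecs_subnet else resolver_ip)
    for net, pop in POPS.items():
        # subnet_of() requires both networks to be the same IP version.
        if hint.version == net.version and hint.subnet_of(net):
            return pop
    # No match: answer with a less optimal server rather than refusing.
    return DEFAULT_POP

print(pick_pop("203.0.113.5", "198.51.100.0/24"))  # ECS present: client's network wins
print(pick_pop("203.0.113.5"))                     # no ECS: resolver's location is used
print(pick_pop("192.0.2.1"))                       # unknown location: default server
```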
Here's an interesting thought — if it's so bad for privacy and isn't necessary for a CDN, does Cloudflare the CDN simply disregard ECS when receiving requests from DNS.Google, or do they take it into account?
If archive.is thinks that Internet standards should be adopted so quickly, it's weird that they don't support IPv6 considering it's been a standard since 1998!
Obviously I'm kidding, but only kind of. When it comes to insisting on adopting new standards, edns-client-subnet is a weird hill to die on, especially considering it was always meant to be optional.
> does Cloudflare the CDN simply disregard ECS when receiving requests from DNS.Google, or do they take it into account?
I don't think they have a reason to use it because they use TCP anycast. Looking at https://cachecheck.opendns.com/ they seem to return the same IPs regardless of geography.
* Yes, if you're running a local resolver for your LAN, or have a website on a single server, of course ECS should be optional.
* If you're running a CDN (and archive.today does), or if you're running a public resolver at 100+ POPs, then, no, ECS is not meant to be optional.
i.e it's not "(...CDN...) then ECS should not be optional"
I don't understand that for various reasons.
1) Privacy is already lost here. If I shout my mobile number on a train with you that's full of people, everyone knows my phone number. If you choose to keep it / use it to call me tomorrow doesn't matter.
2) If Cloudflare can make _better_ decisions based on the information shared by Google, why shouldn't they? As long as it is optional and they don't take their ball and go home^W^W^W^W^W^Wreply with 127.0.0.3 in cases where you don't provide it.
Google isn't the internet, you know?
Just because it can be bad for privacy doesn't mean you can't use it for good. The feature exists for a good reason, and it's valid. That doesn't change the fact that it can be used for bad reasons too, which is why you want to remove it. In the meantime, there's no reason not to use it for good while it's still there.
Anycast IP is very expensive, unfortunately. Just getting a /22 has been expensive for years, and is now also getting difficult as well. It is beyond the reach of smaller companies.
GeoDNS is extremely cheap in comparison. You can run distributed services using GeoDNS for low latency on multiple continents on a hobby budget these days.
Anycast is technically better in many ways (the combination of anycast and geoDNS is better again), but anycast is so expensive that smaller operators just can't use it.
These days, smaller operators can use Cloudflare for their CDN, and the suspicious mind might think that suits Cloudflare just fine. But that doesn't really help for low-latency interactive services, or non-HTTP services.
> I’m also surprised that traffic from Cloudflare DNS users caused any significant problem.
Maybe the problem isn't amount of traffic, but rather that the site doesn't want to gain a reputation as slow (and therefore incompetently administered, and offputting to use) when everyone running Firefox switches over to 18.104.22.168 DoH automatically.
> Absence of EDNS and massive mismatch (not only on AS/Country, but even on the continent level) of where DNS and related HTTP requests come from causes so many troubles so I consider EDNS-less requests from Cloudflare as invalid.
Does anyone know what they could mean here? I get that having more open connections and slow requests is not great, but there are popular attacks that people will try against them anyway. They already have to handle pathological cases of slow requests, so handling some small number of slower clients shouldn't be an issue.
Or are they talking about some other problem?
Just try one of the Akamai endpoints to test it (e.g. media.steampowered.com).
For me 22.214.171.124 serves Akamai Singapore IPs, while 126.96.36.199 serves IPs of my ISP's Akamai cache in Sri Lanka.
If your ISP has a bad route to 188.8.131.52, this just gets worse.
Internet protocols were designed to be redundant and resilient, so that things still work when things break and traffic takes other paths. When people do shit like this, we get a less reliable, less functional internet. Demanding to know the exact subnet a request originated from, and returning incorrect results when that information is not given, seems to me a thoroughly hostile behavior on the part of archive.is.
How many users are explicitly choosing that? How many users are actually choosing something very different, and this is an unintended consequence of their choice, that they would otherwise be unaware of if not for this provider taking a stand?
Not sending anything at all doesn't solve any of this. If a message were shown explaining the situation, sure, but archive.is's solution doesn't answer your question at all.
In what case would some extra delay be worse than no access at all?
Seems pretty anti-competitive if Cloudflare's DNS stops Akamai's local caching at your ISP from working, no?
We really don't know how the site works in the backend. So I guess the admin did not want to spend time fixing issues Cloudflare created.
But that's the thing, Cloudflare didn't really create any issues. If I live in the US and I decide to use some random public DNS server in Australia, it will be an unpleasant setup, but it's a perfectly valid one.
There's no rule that your DNS server must be on the same network as you, or send your subnet if it isn't. When that's the case it allows for some nice performance optimizations (i.e. sending you to a closer cache). But it's just that: an optimization. If your service is completely unreachable without performance optimizations, you've created a very fragile service.
It's the default configuration. 99% of internet users follow this configuration (at least, until web browsers start shipping DoH as a default). It's honestly a fairly reasonable assumption to make.
Could you please rephrase your argument? 100% serious, I'd like to know what point you're making.
Pre-emptively: because whatever DNS server you are using already knows your IP address, regardless whether it's the first query for the site itself, or subsequent queries for site-related additional resources.
If I go to a page that links to a bunch of sketchy websites, I don't want my IP (and thus, identity) tied to those sketchy websites just because I hovered my mouse over the links.
Doesn't the browser's internal resolver use an external recursive server (either the host's configured ones or browser-determined ones)? Chrome does, AFAICT. As opposed to being a recursive resolver itself, it just implements a caching stub resolver.
The remote DNS host for sketchy-service.com doesn't see your IP address, they see the recursive server's address.
The fact that I subsequently connect to another place over HTTP or some other protocol is distinct from telling a DNS authority who is asking a question about a domain name. The article implies "it's the same leakage", but it isn't: different people get told.
I don't have a good sense of this, but people I trust say a surprisingly small collection of information identifies you to a specific level. The same /24 is at most 256 addresses if there isn't a CGN. More to the point, if your /24 identifies your economy, you're now subject to IPR limits and can be told different things.
So some of the objection to ECS is rooted in opposition to regional IPR. Netflix. Sub-optimal CDN delivery (to one person) is wall avoidance (to another).
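As a rough sanity check on the size of a /24, the crowd that an ECS prefix hides you in can be counted with Python's standard `ipaddress` module. This is only a sketch: the prefixes are documentation ranges picked for illustration, and it counts addresses rather than people, so CGN or dynamic addressing would change the real number.

```python
import ipaddress

def addresses_in(prefix: str) -> int:
    """Number of addresses covered by a prefix (hypothetical helper)."""
    return ipaddress.ip_network(prefix).num_addresses

# An IPv4 /24 spans 256 addresses; the network and broadcast addresses
# are conventionally not assigned to hosts, leaving 254.
print(addresses_in("203.0.113.0/24"))

# An IPv6 /56 is typically one subscriber's delegation, yet spans 2**72 addresses.
print(addresses_in("2001:db8::/56"))
```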
So, in reality, the extra privacy gained from not doing ECS is hardly something with a measurable effect, because this information HAS to leak in any case. Even if you make DNS encrypted, even if you employ encrypted TLSv1.3 SNI, the IP addresses will still leak, and with much higher precision anyway. So, this we-don't-do-ECS-because-privacy is a rather pointless statement in the end.
The main reason that Cloudflare wouldn't share this info is to prevent competitors like Akamai from operating a CDN as good as theirs.
It looks more like sabotaging competition than increasing privacy.
Exactly. Their own answers in the threads over here at HN are basically admitting as much: they claim to be working on solutions alternative to ECS, because Google and some others have more PoPs than Cloudflare does. They're obviously using this as a competitive advantage to slow down competing CDNs. And no one's talking about it!
Does CF DNS not use qname minimization? That would reduce the association between subnets and names looked up.
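For anyone unfamiliar with qname minimisation, here is a simplified sketch of the idea: the resolver reveals only as many labels as each level of the hierarchy needs. The function name is made up, and real resolvers follow actual zone cuts (and typically ask for NS records at the intermediate steps) rather than blindly stepping one label at a time, so treat this as an illustration only.

```python
from typing import List

def minimised_queries(qname: str) -> List[str]:
    """Names a minimising resolver would ask for, from the root downwards (simplified)."""
    labels = qname.rstrip(".").split(".")
    # Start with the rightmost label and add one label per step.
    return [".".join(labels[i:]) + "." for i in range(len(labels) - 1, -1, -1)]

for step in minimised_queries("www.example.com"):
    print(step)
```

With this scheme the root servers are only asked about `com.`, and the `com.` servers only about `example.com.`; the full name is sent only to the servers for `example.com.` itself.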
Many browsers prefetch DNS for links on webpages these days. So it’s entirely possible and even common that you may query DNS for sites you never visit, which would indeed be a privacy leak.
Secondly, many sites have their DNS hosted elsewhere so it may not be the same people you are leaking the information to.
Thirdly, if the DNS query is transmitted to the site’s DNS servers in plain text (which most DNS is), then despite eSNI etc anyone who has access to the wire traffic along the route from the DNS proxy to the site’s DNS servers (which is probably different from the route your own traffic takes to their servers) can see which site you wanted to access.
If the answer to these questions is no, then Cloudflare’s reasons for blocking ECS (ie privacy) carry weight. Otherwise no.
> More generally, does 184.108.40.206 in any way treat Cloudflare’s own nameservers in a special way and send it information that it doesn’t send to others?
ECS doesn't even forward the IP, only the /24.
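To show what "only the /24" looks like on the wire, here is a small sketch that builds the ECS option bytes as laid out in RFC 7871 (option code 8, address family, source prefix length, scope prefix length, then the truncated address). The function name and the sample address are assumptions for illustration; this builds only the option, not a full DNS message.

```python
import ipaddress
import struct

def encode_ecs_option(client_ip: str, source_prefix_len: int = 24) -> bytes:
    """Build the EDNS Client Subnet option for a client address (illustrative helper)."""
    network = ipaddress.ip_network(f"{client_ip}/{source_prefix_len}", strict=False)
    family = 1 if network.version == 4 else 2  # 1 = IPv4, 2 = IPv6
    # Only the octets covered by the prefix are sent; host bits never leave the resolver.
    n_octets = (source_prefix_len + 7) // 8
    address = network.network_address.packed[:n_octets]
    # Scope prefix length is 0 in queries.
    payload = struct.pack("!HBB", family, source_prefix_len, 0) + address
    # Option code 8 identifies ECS, followed by the payload length.
    return struct.pack("!HH", 8, len(payload)) + payload

print(encode_ecs_option("203.0.113.77").hex())
```

For the sample address, the last three bytes of the option are `cb 00 71` (203.0.113); the final octet, 77, is simply not present.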
If several bigger CDNs like Akamai or SoftLayer start treating requests from 220.127.116.11 without EDNS as invalid and block them, Cloudflare won't be able to just say that it's their own problem.
They do resolve archive.is. But archive.is's DNS servers have been configured to return bogus answers to queries from Cloudflare's servers.
> Archive.is does not block all requests lacking EDNS. They specifically block requests coming from Cloudflare's datacenters.
Even if they eventually make DNS encrypted, even if encrypted TLSv1.3 SNI works properly (and both of these are pretty big ifs, BTW), the IP addresses will still leak, always, and with much higher precision anyway. So, this we-don't-do-ECS-because-privacy is hardly a rational statement on Cloudflare's part in the end: it merely breaks the performance of their competitor CDNs without any real privacy angle.
Whether you think that's enough to care about or not it's very different than the picture you painted.