[dupe] Why does 1.1.1.1 not resolve archive.is? (stackexchange.com)
313 points by stargrave 12 days ago | 157 comments





Previous discussion concerning this, which includes replies from Cloudflare: https://news.ycombinator.com/item?id=19828317

> We are working with the small number of networks with a higher network/ISP density than Cloudflare (e.g., Netflix, Facebook, Google/YouTube) to come up with an EDNS IP Subnet alternative that gets them the information they need for geolocation targeting without risking user privacy and security. Those conversations have been productive and are ongoing. If archive.is has suggestions along these lines, we’d be happy to consider them.

I actually forgot to consider this angle, as the discussion was centred on archive.is. I'd imagine 1.1.1.1 has very negative consequences for Netflix local caches, for example. This is where your local homegrown provider gets to save significant $$$$ in interconnect/bandwidth costs by hosting a local Netflix cache appliance, and you get to benefit from the latest shows being locally cached and delivered at maximum speed, instead of being hauled across the whole internet each time for each device.

If you're using 1.1.1.1, you're not only ensuring that your internet will be much slower due to suboptimal performance from any CDN other than Cloudflare's, but you're also needlessly pulling extra streams of your favourite shows across transit links instead of fetching a local copy from your own ISP, increasing your ISP's bandwidth transit costs.

And, remember, you don't actually get any extra privacy by bypassing ECS in the first place, because your exact full IP address will have to be used to establish any subsequent TCP or UDP connections to make those requests for actual content in any case. You're basically breaking the whole internet by using 1.1.1.1, all for no real benefit! It's worse than we initially thought!


> I'd imagine 1.1.1.1 has very negative consequences for Netflix local caches, for example.

Is it possible for Netflix to use anycast?

For their appliances they can advertise to an ISP's routers via (i/e)BGP or OSPF / IS-IS to keep traffic internal, but have a fallback of having a presence in various IXPs.

Isn't this how Cloudflare works, anycast?


Yes Netflix is anycasted at the edge. Has been for years.

Just because a given CDN uses anycast doesn't mean that they don't also use ECS. In fact, the Cloudflare CEO's own wording seems to suggest that all of the mentioned providers still need ECS even though they do run anycast.

Basically, it would seem that Cloudflare is trying to close the performance gap by artificially limiting the performance potential of alternative CDN providers to match their own levels.


> This is where your local homegrown provider gets to save significant $$$$

You've got it wrong. Netflix saves significant $$$ by not paying the provider for unthrottled transit. Your mental model is not how it works in practice these days.


> I'd imagine 1.1.1.1 has very negative consequences for Netflix local caches, for example.

DNS isn't the only way to handle that. If it becomes an issue for Netflix, they can use another way to handle the situation and it will work just fine.


The author of the post specifically addresses these replies from Cloudflare.

This link and the two answers within demonstrate something important, broader than the DNS related issue at hand.

Both make implicit assumptions. One assumes the worst of Cloudflare and thinks "what's the worst reason Cloudflare could have for doing this? How do they profit off this?" And the other assumes that Cloudflare has good intentions.

Neither answer is technically wrong. Both flow logically from their initial assumptions. But it shows how different our conclusions can be depending on where our initial biases lie. For the person who believes the first answer and says “prove to me that Cloudflare isn’t doing something nefarious”, it’s not possible. The analysis is correct and can’t be challenged unless the initial assumption is challenged. And for people who strongly believe that Cloudflare has bad intentions, nothing can be done to change their mind.

In this example it’s Cloudflare but it applies to any person or organisation that we feel strongly about.


You have to look at Cloudflare user agreement's and texts that describe their relationship to their customers. https://www.cloudflare.com/privacypolicy/ and https://developers.cloudflare.com/1.1.1.1/commitment-to-priv...

You can only hold companies accountable under the law and to explicit written promises and legally binding agreements.

Currently the price companies pay for privacy violations is low. If a company like Cloudflare writes down all its privacy promises in a legally binding manner and accepts legal and financial liability above the norm for intentionally breaking the contract, that can increase trust.

Companies can do much more than they do now. They can put up explicit bounties for whistleblowing on them and revealing privacy violations. They can hire trusted third parties to do privacy audits and handle whistleblowing.


> And we wanted to put our money where our mouth was, so we committed to retaining KPMG, the well-respected auditing firm, to audit our practices annually and publish a public report confirming we're doing what we said we would.

Looks like they are. https://blog.cloudflare.com/announcing-1111/


> They can put up explicit bounties for whistleblowing on them and revealing privacy violations.

Employees blowing the whistle internally, or externally? If they want to encourage employees to blow the whistle externally, they could put a carve out for that in their NDA.


> They can hire trusted third parties to do privacy audits and handle whistleblowing.

All software seems to need that nowadays.


None of that is legally binding if you don't have a contract with them.

Click-wrap agreement and browse-wrap agreement are both contracts.

https://en.wikipedia.org/wiki/Browse_wrap

https://en.wikipedia.org/wiki/Clickwrap


But you must actually “click” or “browse” for it to be enforceable, right?

Not necessarily. There can be implied consent.

Of course, in that case you can't put surprising terms into the agreement if they are disadvantageous to the user. Courts don't see that a meeting of the minds took place. https://en.wikipedia.org/wiki/Meeting_of_the_minds


Sure, if you’re actually visiting the site. But (at least in the US) didn’t the recent LinkedIn case find that if my scraper pulls public data off your site, the TOS doesn’t apply?

This court decision doesn’t mean “no rules for scrapers”, rather it means “different rules for scrapers, independent of any site-specific TOS”. Or did I misunderstand the decision?


Consumer law applies for consumer users and has different protections than other users have.

It's hard to argue that web scraping is a consumer use.


It would be a lot harder for Cloudflare to argue that some clause in the contract is non-binding, given that they provided the contract in the first place, than for the consumer on the other end who just clicked "OK" on a button to agree to it.

This response reminds me of this meme:

https://randysrandom.com/wp-content/uploads/right-wrong.jpg

Neither answer may look technically wrong, but only one reflects what is actually happening here. The fact that we can't tell which one from the available data doesn't mean that both are equally valid.


That meme’s sort of funny in a meta way because presumably the person who actually drew the number (the comic artist) drew it as both a 9 and a 6, so in that one particular case they actually are both right.

The sentiment of the red message is great though.


Yes and if we really really cared about this, we’d launch an investigation. But we don’t care that much. We just want to scroll this thread a bit, come to a conclusion about Cloudflare and move on to the next HN thread. Given that we’re going to spend only a couple of minutes on this, it’s easy to figure out which answer we’re going to agree with - the one that confirms our previous belief about Cloudflare.

The second one is not an assumption, it's Cloudflare's official position. For a person who is against Cloudflare, I feel like this would only serve to reinforce the confirmation bias as there's seemingly no person except a Cloudflare employee willing to step up and defend the action.

So, yes, good observation.


Arguably, no one except a Cloudflare employee could know the reason why they took this decision. A random person speculating “maybe they did this for privacy reasons” doesn’t strike me as better than Cloudflare saying “we did this for privacy reasons”.

And while the second answer is a statement, not an analysis, the rest of what I said holds. You will only accept their statement as the truth if you assume good intent of them.


You are assuming companies make decisions for a single reason.

Nah, I will assume it as truth only when it makes sense.

And whether it makes sense depends on your initial bias. Make sense?

Corporations operate for profit.

And so you accept the first answer. That’s fine.

Cloudflare has repeatedly said that while they operate for profit, they take the long term view. By doing the right thing now, by being privacy focussed, they will be profitable for decades to come. This seems logical to me, which makes the second answer more believable.


Do you remember "don't be evil"?

Pepperidge Farms remembers.


Indeed - but there are other ways to make money than selling your personal information to the highest advertising bidder.

Such as, in Cloudflare's case, selling our service (the DDoS protection, the caching, the firewalling etc.) to companies that pay for that service because it helps them.

While at the same time working to preserve people's privacy with things like giving out SSL for free, pushing for eSNI, running a public DoH server, building a service that makes sure all data from your phone to us is encrypted etc. etc.


It's been shown that Cloudflare's DoH service is much ado about nothing, and is actually worse for privacy, not better:

* https://news.ycombinator.com/item?id=21071022

Likewise for 1.1.1.1: when you take into consideration the local caching appliances that ISPs have invested in, the lack of ECS makes clients go all the way across the internet for content that's already cached locally by the ISP for users of all other decent resolvers. This only contributes to increased costs for individual ISPs, extra latency for users, and more competitive advantage for your own products, because you're diminishing the technological advantages of your competitors without regard to the actual user experience, or the reliability and scaling of the internet infrastructure at large.

Not to mention that such Netflix/YouTube traffic, when it goes directly through transit providers and across the whole internet, also subjects users to a greater chance of surveillance at large, compared to users of resolvers that would hit local copies on the caching appliance.


Except in the US the ISPs are some of the biggest surveillance organizations themselves. They are also highly monopolized, so most people in the US are on one of a very small number of ISPs.

Which is a good argument for not using your ISP's DNS either, but those are not the only two options.

One of the better alternatives is to get a VPN you trust that puts multiple users behind the same IP address and then operate your own recursive DNS from behind there. The VPN service itself could still log your queries, but at least they have plenty of competitors, and you chose one you trust, right? Or if you don't want to trust any one party, use Tor.


It has been argued, I wouldn't say that it has been shown. Both my ISPs operate a DNS blacklist. So did my previous ISPs, in the country I previously lived in. And in a third country, where I was on holiday. ISPs even in the USA are gnashing their teeth at the prospect of losing visibility into DNS. Why would they care if they weren't using that data? Why do they need a subscriber -> [domain] mapping? Routing tables don't care about domain names. Edge caching of web content doesn't work with https. I might care about DNS caching if the ISPs hadn't demonstrated time and again that they will abuse my privacy for a buck, after I've already paid them for the privilege.

I trust Cloudflare much more than I trust any ISP I've had to deal with, including American ISPs when I lived there. I trust Google much more than any ISP, and I'm not particularly charitable towards Google.

Centralized DoH isn't perfect, but it's better than the status quo. The SNI hole is shrinking. My threat model does not include defending against the Mossad doing Mossad things with my email^H^H^H^H^HDNS[1].

[1] https://www.usenix.org/system/files/1401_08-12_mickens.pdf


If you're trying to preserve people's privacy, why doesn't the 1.1.1.1 VPN service also mask originating IP?

Warp isn't trying to "hide your IP from the sites you are visiting". It's there to help prevent intermediaries from observing your traffic. A huge percentage of the web is still unencrypted HTTP.

And Warp+ aims to be about that plus performance.

If you want to be totally anonymous on the Internet then I recommend you use Tor. If you just use a VPN then you may hide your IP address from sites you visit but there are tons of other fingerprinting techniques that can be used.


I understand all that, and you didn't answer my question. Why do you push the narrative that 1.1.1.1 DNS resolver protects user privacy (by hiding originating IP / subnet) whereas 1.1.1.1 VPN gladly reveals that data? In both cases, the destination is hidden to any eavesdroppers, but in the latter case (VPN) the source IP is visible to the destination website, whereas you keep insisting how vital it is to hide source IP in the former case (DNS).

In the case of Warp, we add the connecting IP information as a header to the HTTP request for sites on Cloudflare. This will typically be inside TLS to the origin server, and so the source IP information will be encrypted and only visible to the web site being visited.

In the case of DNS information about the subnet, the query etc. is sent around unencrypted.

One is open to eavesdropping, the other is not.


On Twitter [0], they claimed the main thing they were after is a very rough geolocation with the DNS request. Country level, or at least continent level. So they can respond with a nearby data center.

That doesn't sound too bad, privacy-wise.

EDIT: I mean if you were to map all US IPs to a single canonical IP, for instance.

[0] https://twitter.com/archiveis/status/1018691421182791680


Someone capable of eavesdropping on that query is sure as hell capable of eavesdropping on incoming connections to 1.1.1.1, where they can see the actual IP address that initiated the query. There is no way to justify this as a privacy feature. Well, unless people don't understand enough to believe you.

Not really. An eavesdropper can sit in front of the authoritative server for a site and eavesdrop on all the DNS queries with EDNS information. That's one place they need to be.

To eavesdrop on Warp you'd need to do it all over the world, capture encrypted traffic and then try to correlate traffic. If your threat model is a global adversary capable of doing that correlation and you don't want sites to know your IP, then use Tor.


> An eavesdropper can sit in front of the authoritative server for a site and eavesdrop on all the DNS queries with EDNS information.

No, they can sit near your 1.1.1.1 servers and catch all incoming and outgoing traffic, watching connections to your 1.1.1.1 servers that initiate DNS queries and actual outgoing queries that 1.1.1.1 makes to authoritative servers and responses too.


So if we're talking just about unencrypted DNS to 1.1.1.1 then you're assuming an entity capable of sitting in front of us in 194 cities worldwide.

vs

With EDNS sitting in front of the authoritative server of the site this actor is trying to monitor.

The latter is easier than the former.


In the latter case it's just as easy to catch real IP addresses by sitting in front of authoritative DNS servers and actual servers those DNS records point to. As I said, you just can't justify it as a privacy feature. It does nothing significant in any threat model.

Ok, that makes more sense. So you're basically worried about the unencrypted connection between Cloudflare and the authoritative DNS server. Initially I understood that you were worried about leaking IPs to the authoritative DNS server itself.

"like giving out SSL for free"

The market rate for standard SSL certs is zero.


It wasn't in 2014 (https://blog.cloudflare.com/introducing-universal-ssl/) when we launched it.

>A random person speculating “maybe they did this for privacy reasons” doesn’t strike me as better than Cloudflare saying “we did this for privacy reasons”.

You are saying that an accessor function getX(), which returns a value of X but which you don't trust (you think it's giving you garbage), should not be treated any differently depending on whether getX() even has access to X or has no such access at all. (For example, if the value of X isn't even on the same network partition as the getX() you don't trust.)

You're saying if you don't trust it, it doesn't matter if the function itself even has access to X or doesn't.

In one sense that might be true, but in another sense that seems silly. If getX itself has access to X, you can try to determine whether it is giving it to you. If getX doesn't have any access to X, then it doesn't really matter what it's doing; its process is irrelevant.

So to me there's a huge material difference. We can try to judge the process by which getX() returned Cloudflare's motivations. What steps did it perform to return that value? What's the code? etc.

Huge difference. That knowledge is somewhere in the company.


To summarise, your position is that what Cloudflare says is more trustworthy because they know the truth.

You do not really address the fact that they are not required to say the truth, or that when the truth is harmful for their public image they are directly incentivised to not speak the truth. The only way you do address this is by saying that this is something that needs investigating. I would posit that the grandparent has done this already, and come to the sensible conclusion: There is less reason to trust someone incentivised to lie than there is to trust someone who knows nothing.

Aside from that trust, we have to evaluate the validity of statements. Given prior knowledge, for Cloudflare in the bad case the likelihood of a valid statement approaches zero. For the random person yelling things as they pop into their mind, it is completely unknown.


My position is only that there's a difference. In some sense we could treat it as though it's just garbage, but it's worth investigating. For example if it was written by an outside PR person who has no access to the people who made the decision, that also changes things.

It's not so pure. For example an outsider here on HN who says "A close relative of mine works at cloudflare on the team that made this decision, and he confided in me..." -- then again you have to somehow judge if this is true or not, but it is worth treating it differently from someone writing "I don't have any insider information and this is pure speculation, but maybe..."

I mean it just doesn't make sense to treat these cases as exactly the same. I wanted to give another example: say you don't trust the GPS coordinates you're being given when you make an API call on a device.

Would it make sense to treat it exactly the same as making the API call on a device that doesn't even have a GPS module, such as a microcontroller without GPS or wifi/cellular access or anything that could serve as a proxy for GPS?

If there's a physical module and you don't trust the output, at least you can investigate. It doesn't make sense to treat it exactly the same as if the information isn't even on the same device.

It depends on the details of the process that's giving you the output you don't trust. What's the process by which getX returns its output? What's the process by which Cloudflare employees make statements about their motivations (which they do have access to)?

These are questions we can investigate. If we find that the statements were written by a PR agency that hasn't even stepped into the building and has no contact with the teams it's speaking on behalf of, that's a possible result too. But it's worth looking into.


Your GPS/Microcontroller example is not very relevant, as those are chips that malfunction in measurable ways. We were discussing people. Fundamentally I think we just disagree about how much the word of a corporation can (and should) be trusted. That's okay, and to my view unlikely to change with extended discussion.

> I consider EDNS-less requests from Cloudflare as invalid.

If your site depends on a DNS extension that's only 3.5 years old (and designed to be optional), I think it's fair to say your site is just offline for some users due to a config mistake.

You're free to set up your servers however you like, but there's wisdom in Postel's law.


Another interpretation of the Law by Mark Crispin, father of IMAP:

  This statement is based upon a terrible misunderstanding of Postel's
  robustness principle. I knew Jon Postel. He was quite unhappy with
  how his robustness principle was abused to cover up non-compliant
  behavior, and to criticize compliant software.

  Jon's principle could perhaps be more accurately stated as "in general,
  only a subset of a protocol is actually used in real life. So, you should
  be conservative and only generate that subset. However, you should also
  be liberal and accept everything that the protocol permits, even if it
  appears that nobody will ever use it."
* https://groups.google.com/d/msg/comp.mail.pine/E5ojND1L4u8/i...

Further discussion on the topic:

* https://news.ycombinator.com/item?id=9824638


> there's wisdom in Postel's law.

For the lazy like me: robustness principle, aka Postel's law

https://en.wikipedia.org/wiki/Robustness_principle

Thank you for the reference. I learned something today!


Thanks for the link, for which there's the counter-argument, "The Harmful Consequences of the Robustness Principle" [0]:

> A flaw can become entrenched as a de facto standard. Any implementation of the protocol is required to replicate the aberrant behavior, or it is not interoperable. This is both a consequence of applying Postel's advice, and a product of a natural reluctance to avoid fatal error conditions.

[0] https://tools.ietf.org/html/draft-iab-protocol-maintenance-0...


Archive.is does not block all requests lacking EDNS. They specifically block requests coming from Cloudflare's datacenters. Cloudflare is not accidentally misconfiguring their EDNS, Cloudflare is intentionally not sending EDNS.

They’re intentionally not sending an optional extension, that seems .. fair honestly.

The EDNS-Client-Subnet extension was not meant to be optional for folks running a CDN or a huge public resolver across 100+ POPs.

"Was not meant" means nothing. It's specified as optional because it's an extension mechanism.

The "misconfiguration" he's talking about is on archive.is' part. Their configuration expects some specific server to have an optional functionality enabled, which it doesn't.

Sorry, I don't understand. I was referring to this quote:

> I think it's fair to say your site is just offline for some users due to a config mistake.

Archive.is is not making an accidental mistake. Archive.is is behaving very intentionally. They've said so on Twitter. And I believe profmonocle agrees with me on that point.


And Cloudflare would happily talk to archive.is to come up with a solution.

And I agree with that as a Cloudflare customer. In fact if this was a paid feature I would pay for it.

Just to give you more insight: Google knows which IP address I am using Gmail from. If I use 8.8.8.8 they know what other content I am looking for and which websites I visit, and can tie that to my account. If I use something like Cloudflare, which does not expose my IP (or range), then I achieve more privacy. I could use my local DNS server (like I do at home) but I travel a lot.

In this case "misconfiguration" is actually for privacy and archive.is could live with that just like other sites but they intentionally screw with Cloudflare (aka the users who has 1.1.1.1 as the resolver).


Do you have a source for this?

Sources for archive.is blocking Cloudflare's datacenters:

The exact same command fails when sent from Cloudflare's datacenters, but succeeds when sent from DigitalOcean:

https://community.cloudflare.com/t/archive-is-error-1001/182...

Two more sources:

https://news.ycombinator.com/item?id=19830258

https://news.ycombinator.com/item?id=19829036
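
(A minimal client-side way to see the symptom, though not the datacenter-level test from the links above, is to compare what different public resolvers return for archive.is. This is only a sketch and assumes the dnspython library; the 127.0.0.3 result is what the thread reports for 1.1.1.1, not something guaranteed to reproduce.)

  import dns.message
  import dns.query

  for resolver in ("1.1.1.1", "8.8.8.8"):
      query = dns.message.make_query("archive.is", "A")
      reply = dns.query.udp(query, resolver, timeout=5)
      # Flatten the answer section into readable strings (A records, possibly CNAMEs).
      answers = [rdata.to_text() for rrset in reply.answer for rdata in rrset]
      print(resolver, answers)
  # At the time of this thread, 1.1.1.1 reportedly got back 127.0.0.3,
  # while other resolvers received the site's real addresses.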



I've seen that, it doesn't really clarify whether the block singles out cloudflare in particular, or whether cloudflare is the only (significant) DNS resolver that the block happens to affect.


I really don't see this as a problem of Cloudflare.

End users switching to Cloudflare's DNS endpoint are doing so because they feel the DNS provider is both faster and more secure.

They rightly made the decision NOT to pass on the end user's IP information to the upstream DNS server. I agree with this decision and they are acting in my best interests in doing so. To draw some kind of nefarious intention from this is absurd.

Until Cloudflare are proven to be nefarious actors, I'll continue to use their service.


> They rightly made the decision NOT to pass on the end user's IP information to the upstream DNS server. I agree with this decision and they are acting in my best interests in doing so. To draw some kind of nefarious intention from this is absurd.

In this instance, the upstream DNS server and the resultant HTTP server are operated by the same organisation. Cloudflare have opted to not provide the /24 (or /56 if IPv6) network that the original DNS request came from, in the DNS request. Your computer will then provide the /32 (or /128 if IPv6) that your request is coming from when you connect to the HTTP server.

What privacy win have you gained by Cloudflare not providing that information in this instance?


In this particular case, you're right. But as a general principle DNS is not necessarily owned by the same organisation as hosts the website.

Correct. It's also worth noting that as a general principle, the DNS server making the request on behalf of the user is hosted in the same network as the user, and not an external third party.

In this particular case, it's one CDN taking issue with another CDN only. No other DNS providers appear to be impacted.


> they feel the DNS provider is both faster and more secure.

'Feel' being the keyword. Faster, generally yes. More secure, not well defined and users are generally wrong.

> nefarious intention

I don't believe I've heard any complaints of nefarious intent.

But let's be clear, this advantages Cloudflare over other CDNs. That they treat the DNS data very well does not mean they won't have an incident. As well, they are more of a target due to the concentration.

> Until Cloudflare are proven to be nefarious actors,

Nefarious wrt whom? For end-users taken individually, I agree: I don't see, and it's hard for me to imagine, malicious intent.

But IMHO they are bad for the Internet. I mean, more power to them and were I a leader there I'd press the same agenda, but as a 3rd party, the way I see it is that in 10 years they are going to be an anti-power much like Google is. Addiction to their services will allow them to trample over what's good for all.

What I dislike most about them is that they promote themselves as purely a force for good. Except for a few PMs and execs I'm 100% sure they believe it. But it's a disservice to never discuss the negative aspects of any of their services. And woe to anyone who does.

As for proven nefarious deeds, do you not consider "banning" sites from using CF nefarious? What if they take it to the next step now, and stop providing DNS for those sites? Given their stated reason for bans, yes it could happen. Why must you wait until they prove to be nefarious? The concentration of power per se is a bad thing.


That looks like anticompetitive behavior from Cloudflare, so it's their problem also. Cloudflare could send ECS when the nameserver and the actual server are run by the same party, but they don't.

Well, yeah. This was one of the significant reasons against Cloudflare's DoH too. They want all Firefox users to use their DNS resolver, depriving competing DNS-based CDNs (which is most CDNs) of the ability to pick good nodes for Firefox users. I've been thinking of blacklisting Cloudflare completely on all of my servers just for that. And it seems Firefox would even be able to detect that and fall back to proper DNS for such domains.

I'm with some of the people on Twitter: It seems weird (to put it mildly) to just blackhole your own site with no explanation whatsoever to the end-user. For everyone on 1.1.1.1 archive.is will now be "down" and they're none the wiser.

Maybe there's a big backstory here, but without context that seems passive-aggressive and quite random?


What's especially weird is that they're returning "127.0.0.3" to Cloudflare's DNS, rather than a DNS SERVFAIL or REFUSED error. On most systems that will cause a connection refused error or a TCP timeout. I would assume that was a network issue on their end, not a DNS problem.

SERVFAIL or REFUSED is also not helpful to the end user. They should return the IP of a host serving a static single-page website explaining the issue.

REFUSED will trigger a lookup on the next DNS server in the list, which may not be Cloudflare, instead of guaranteeing the user can't go to the real page.

Indeed. This is the first I'd heard of this situation. I'd previously just assumed archive.is was a shonky service that didn't work properly. Hadn't connected it with my use of 1.1.1.1

I am no expert by any means. However, I strongly suspect EDNS is not actually needed to run a CDN. There are a lot of approaches to balancing load and distributing traffic. An example of another approach would be using anycast IPs.

I’m also surprised that traffic from Cloudflare DNS users caused any significant problem. Was it really that much traffic?


> However, I strongly suspect EDNS is not actually needed to run a CDN.

It's not. The proof is that CDNs existed long before edns-client-subnet was introduced. All it does is allow the CDN's DNS servers to return the most optimal A/AAAA records for the client. But the worst that should happen without it is you get sent to a more distant CDN server, and the content loads more slowly.

The fact that archive.is somehow suffers without this feature (which, btw, wasn't standardized until 2016) suggests they're doing something really, really odd. If I were them, I'd focus on making my system more robust, rather than demanding the rest of the Internet adopt a relatively young, optional DNS extension.


Per https://serverfault.com/a/560059/110020, Google's 8.8.8.8 has had support for `edns0-client-subnet` since at least 2013, so even if it was only standardised in 2016, it's been a de facto standard for quite a while, especially in internet-technology years.

Here's an interesting thought — if it's so bad for privacy and isn't necessary for a CDN, does Cloudflare the CDN simply disregard ECS when receiving requests from DNS.Google, or do they take it into account?


> even if it was only standardised in 2016, it's been a de facto standard for quite a while, especially in internet-technology years.

If archive.is thinks that Internet standards should be adopted so quickly, it's weird that they don't support IPv6 considering it's been a standard since 1998!

Obviously I'm kidding, but only kind of. When it comes to insisting on adopting new standards, edns-client-subnet is a weird hill to die on, especially considering it was always meant to be optional.

> does Cloudflare the CDN simply disregard ECS when receiving requests from DNS.Google, or do they take it into account?

I don't think they have a reason to use it because they use TCP anycast. Looking at https://cachecheck.opendns.com/ they seem to return the same IPs regardless of geography.


When you talk about ECS being optional, you also have to keep the context in mind.

* Yes, if you're running a local resolver for your LAN, or have a website on a single server, of course ECS should be optional.

* If you're running a CDN (and archive.today does), or if you're running a public resolver at 100+ POPs, then, no, ECS is not meant to be optional.


"not meant to be optional" is surely a suggestion and not a requirement?

i.e. it's not "(...CDN...) then ECS should not be optional"


> if it's so bad for privacy and isn't necessary for a CDN, does Cloudflare the CDN simply disregard ECS when receiving requests from DNS.Google, or do they take it into account?

I don't understand that for various reasons.

1) Privacy is already lost here. If I shout my mobile number on a train full of people with you on it, everyone knows my phone number. Whether you choose to keep it / use it to call me tomorrow doesn't matter.

2) If Cloudflare can make _better_ decisions based on the information shared by Google, why shouldn't they? As long as it is optional and they don't take their ball and go home^W^W^W^W^W^Wreply with 127.0.0.3 in cases where you don't provide it.


1) Probabilities. No one on the train is likely to keep that information. Unless of course AT&T runs the train, in which case they tell you they will record everything you say and use it for marketing or whatever other purposes.

> de facto standard

Google isn't the internet, you know?


> If it's so bad for privacy and isn't necessary for a CDN, does Cloudflare the CDN simply disregard ECS when receiving requests from DNS.Google, or do they take it into account?

Just because it can be bad for privacy doesn't mean you can't use it for good. The feature exists for a good reason and it's valid; that doesn't change the fact that it can be used for bad reasons too, which is why you might want to remove it. In the meantime, there's no reason not to use it for good purposes while it's still there.


EDNS Client Subnet has existed since large public DNS servers appeared. Google implemented it very early on 8.8.8.8 (DNS operators had to ask Google to enable it when querying their authoritative servers) because it is needed to correctly operate a CDN.

I know why it exists, and it's nice to have, but what I'm saying is there's no reason a site should completely fail to load without it. The worst case should be you just get routed to a more distant cache, and the site is slower. The same as what used to happen before edns-client-subnet existed.

> An example of another approach would be using anycast IPs.

Anycast IP is very expensive, unfortunately. Just getting a /22 has been expensive for years, and is now getting difficult as well. It is beyond the reach of smaller companies.

GeoDNS is extremely cheap in comparison. You can run distributed services using GeoDNS for low latency on multiple continents on a hobby budget these days.

Anycast is technically better in many ways (the combination of anycast and geoDNS is better again), but anycast is so expensive that smaller operators just can't use it.

These days, smaller operators can use Cloudflare for their CDN, and the suspicious mind might think that suits Cloudflare just fine. But that doesn't really help for low-latency interactive services, or non-HTTP services.

> I’m also surprised that traffic from Cloudflare DNS users caused any significant problem.

Maybe the problem isn't the amount of traffic, but rather that the site doesn't want to gain a reputation as slow (and therefore incompetently administered, and off-putting to use) when everyone running Firefox switches over to 1.1.1.1 DoH automatically.
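
(To make the GeoDNS approach described a couple of comments up concrete, here's a toy sketch of the server-side decision: map the querying subnet to a region and answer with that region's PoP, falling back to a default when no client subnet is available. All prefixes, regions, and addresses below are invented for illustration.)

  import ipaddress

  # Hypothetical mapping from client prefixes to regions, and regions to PoP addresses.
  REGION_BY_PREFIX = {
      ipaddress.ip_network("203.0.113.0/24"): "ap-southeast",
      ipaddress.ip_network("198.51.100.0/24"): "eu-west",
  }
  POP_ADDRESS = {
      "ap-southeast": "192.0.2.10",
      "eu-west": "192.0.2.20",
      "default": "192.0.2.30",
  }

  def pick_a_record(client_subnet):
      """Return the PoP address to answer with, given an ECS subnet string or None."""
      if client_subnet is not None:
          net = ipaddress.ip_network(client_subnet, strict=False)
          for prefix, region in REGION_BY_PREFIX.items():
              if net.subnet_of(prefix):
                  return POP_ADDRESS[region]
      # No ECS (or unknown prefix): the answer still works, just possibly farther away.
      return POP_ADDRESS["default"]

  print(pick_a_record("203.0.113.0/24"))  # 192.0.2.10 (nearby PoP)
  print(pick_a_record(None))              # 192.0.2.30 (generic fallback)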


FWIW, archive.is being unreachable under Cloudflare DNS predates Firefox’s plans IIRC.

You could also in theory use the originating IP of the DNS requests themselves. But Cloudflare messes that up as well:

https://twitter.com/archiveis/status/1018691421182791680

> Absence of EDNS and massive mismatch (not only on AS/Country, but even on the continent level) of where DNS and related HTTP requests come from causes so many troubles so I consider EDNS-less requests from Cloudflare as invalid.


> massive mismatch (...) of where DNS and related HTTP requests come from causes so many troubles

Does anyone know what they could mean here? I get that having more open connections and slow requests is not great, and there are popular attacks people will try against them along those lines, but they already have to handle pathological cases of slow requests, so handling a small number of slower clients shouldn't be an issue.

Or are they talking about some other problem?


They are talking about geo load balancing via DNS. [1]

Just try one of the Akamai endpoints to test it (e.g. media.steampowered.com).

For me, 1.1.1.1 serves Akamai Singapore IPs, while 8.8.8.8 serves IPs of my ISP's Akamai cache in Sri Lanka.

If your ISP has a bad route to 1.1.1.1, this just gets worse.

[1] https://en.wikipedia.org/wiki/GeoDNS
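
(If you want to reproduce that kind of test from a single machine, one way is to hand-craft queries with different EDNS Client Subnet options and compare the answers. A sketch using dnspython, assuming the resolver you ask honours client-supplied ECS, which 8.8.8.8 is generally said to do; the prefixes below are documentation ranges, so substitute real prefixes from different regions to see diverging answers.)

  import dns.edns
  import dns.message
  import dns.query

  NAME = "media.steampowered.com"  # an Akamai-backed name, as suggested above

  for subnet in ("203.0.113.0", "198.51.100.0"):  # placeholder /24s
      ecs = dns.edns.ECSOption(subnet, 24)  # pretend the client sits in this /24
      query = dns.message.make_query(NAME, "A", use_edns=0, options=[ecs])
      reply = dns.query.udp(query, "8.8.8.8", timeout=5)
      print(subnet, [rdata.to_text() for rrset in reply.answer for rdata in rrset])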


Blocking a user because the site might load more slowly for them doesn't make any sense to me. If the user is choosing to use a DNS server that returns sub-optimal CDN IPs, isn't that their problem?

This kind of blows my mind, and I'm surprised that everyone seems to be focused on conspiracy theories about Cloudflare instead of on the apparent situation: archive.is is intentionally breaking fundamental behavior of the internet because they aren't getting information they want from Cloudflare.

Internet protocols were designed to be redundant and resilient, so that things still work when things break and traffic takes other paths. When people do shit like this, we get a less reliable, less functional internet. Demanding to know the exact subnet a request originated from, and returning incorrect results when that information is not given, seems to me a thoroughly hostile behavior on the part of archive.is.


> If the user is choosing to use a DNS server that returns sub-optimal CDN IPs

How many users are explicitly choosing that? How many users are actually choosing something very different, and this is an unintended consequence of their choice, that they would otherwise be unaware of if not for this provider taking a stand?


> How many users are explicitly choosing that? How many users are actually choosing something very different, and this is an unintended consequence of their choice, that they would otherwise be unaware of if not for this provider taking a stand?

Not sending anything back at all doesn't solve any of this. If a message were shown explaining the situation, sure, but archive.is's solution doesn't answer your question at all.


Yeah, but... why does it matter? They're not some massive retailer where every ms potentially translates to some proportion of lost sales that add up to a significant number. They're serving archived pages.

In what case would some extra delay be worse than no access at all?


> Yeah, but... why does it matter?

Seems pretty anti-competitive if Cloudflare's DNS stops Akamai's local caching at your ISP from working, no?


Akamai caching wouldn't stop working. Depending on how it works, you'd either hit the cache/edge in a different country, or a local one with a matching BGP route anyway. There's nothing anti-competitive here.

In the post, archive.is says that it caused "many troubles".

We really don't know how the site works on the backend, so I guess the admin did not want to spend time fixing issues Cloudflare created.


> issues Cloudflare created.

But that's the thing, Cloudflare didn't really create any issues. If I live in the US and I decide to use some random public DNS server in Australia, it will be an unpleasant setup, but it's a perfectly valid one.

There's no rule that your DNS server must be on the same network as you, or send your subnet if it isn't. When that's the case it allows for some nice performance optimizations (i.e. sending you to a closer cache). But it's just that - an optimization. If your service is completely unreachable without performance optimizations, you've created a very fragile service.


> There's no rule that your DNS server must be on the same network as you, or send your subnet if it isn't.

It's the default configuration. 99% of internet users follow this configuration (at least, until web browsers start shipping DoH as a default). It's honestly a fairly reasonable assumption to make.


Can't you argue the inverse as well? Cloudflare isn't sending EDNS Client Subnet to any other authoritative name servers either; is anyone else having problems? Or are 99% of people working just fine without this optional EDNS information? So isn't it archive.is who is the 1% not following the standard configuration, which is to still resolve correctly even without the optional EDNS information? Sure, it might not be the best possible answer for a client, but you can still return an answer.

It's an even more reasonable assumption for CDN latency-minimisation geo-IP/DNS purposes. Even if it's not on the same network, your DNS server is usually on the same continent!

That was my original question - if it's not about slow requests, what's the reason?

A lot of folks here seem to be saying "if you're going to make a DNS query, you're going to make an HTTP request to that host anyway," which is simply untrue. Hell, you can add an HTML tag to your page to prefetch DNS queries. Browsers prefetch DNS just for hovering your mouse over a link or typing something into your address bar (without actually navigating). Should some DNS server know your IP address just because you moved your mouse over a link? IMO, no.

I don't understand what side you're taking here.

Please can you rephrase your argument. 100% serious, I'd like to know what point you're making.

Pre-emptively: because whatever DNS server you are using already knows your IP address, regardless of whether it's the first query for the site itself or subsequent queries for site-related additional resources.


My argument is that the "hiding your IP is pointless because the third party will get your IP anyway" is a nonsense argument. The DNS query being prefetched may have nothing to do with the current site you're on.

If I go to a page that links to a bunch of sketchy websites, I don't want my IP (and thus, identity) tied to those sketchy websites just because I hovered my mouse over the links.


If you hover your mouse over an ad linking to sketchy-service.com ... then the remote dns host for sketchy-service.com now has your IP address.

That seems neither here-nor-there for the 1.1.1.1 service.

Doesn't the browser's internal resolver use an external recursive server (either the host's configured ones or browser-determined ones)? Chrome does, AFAICT. As opposed to being a recursive resolver itself, it just implements a caching stub resolver.

The remote DNS host for sketchy-service.com doesn't see your IP address, they see the recursive server's address.


Browsers can also prefetch pages under some circumstances (I'm not sure of the details). In that case, the web server for sketchy-service.com now has your exact IP address (vs. the truncated address encouraged by this extension). In Firefox this can be prevented with:

  network.dns.disablePrefetch True
  network.prefetch-next False


ECS is not equivalent to 'send the IP' but is revealing.

The fact that I subsequently connect to another place over HTTP or some other protocol is distinct from telling a DNS authority who is asking a question about a domain name. The article implies "it's the same leakage" but it isn't: different parties get told.


What's the actual meaningful difference, though? ECS is limited to a /24 anyways, so, it doesn't even reveal the exact IP address in any case.

Disclaimer: I work on 1.1.1.1. You might not consider your /24 as personally identifying, but others might. The original RFC discusses these problems fairly well (https://tools.ietf.org/html/rfc7871, Privacy notice and privacy considerations). Frank Denis also wrote a good summary on ECS (https://00f.net/2013/08/07/edns-client-subnet/).

There's a multitude of ways to fix this: use a whitelist of nameservers to send ECS to, to avoid spraying the source prefix everywhere; encrypt the whitelisted connections; or aggregate the source prefix into a largest covering server scope (e.g. if the client is in a /24 but the nameserver serves the same answer for the /16, then using any address in the /16 would do). We're evaluating all of them as there are different trade-offs (see https://blog.mozilla.org/futurereleases/2019/04/02/dns-over-...).
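
(A toy illustration of the "largest covering scope" idea from the comment above: instead of always forwarding the client's /24, the resolver forwards the widest prefix that, as far as it knows, still yields the same answer from that nameserver. The scope cache and its contents here are entirely made up for the sketch.)

  import ipaddress

  # Hypothetical cache: widest prefix length known to yield the same answer, per nameserver.
  SCOPE_CACHE = {"ns1.example-cdn.net": 16}

  def ecs_prefix_for(client_ip, nameserver, default_len=24):
      """Return the subnet string to put in the outgoing ECS option for this nameserver."""
      prefix_len = min(default_len, SCOPE_CACHE.get(nameserver, default_len))
      network = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
      return str(network)

  print(ecs_prefix_for("198.51.100.37", "ns1.example-cdn.net"))  # 198.51.0.0/16 (widened)
  print(ecs_prefix_for("198.51.100.37", "ns.other.example"))     # 198.51.100.0/24 (default)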

I haven't really looked into EDNS, but can't you send a fake EDNS subnet that points to a Cloudflare PoP close to the user (thus giving them a Cloudflare address)?

How about looking up the client's AS via BGP or whois and broadening the scope so that it matches its net block? Then if a CDN peering with a particular ISP wants more granular DNS load balancing they could ask the ISP to announce their routes by region or something like that.

What do you say to the very often heard criticism that the exact IP address will be leaked the moment the user establishes a TCP connection to the domain they just looked up?

Hi, I answered it in another comment below.

That's a good question. How you feel about third parties knowing what endpoints you go to depends on what endpoints you're going to, and why. In some economies, it's hugely informative. In many cases it's explicitly what BI is: knowing what you do, and when you do it.

I don't have a good sense of this, but people I trust say a surprisingly small collection of information identifies you to a specific level. The same /24 is only about 255 people if there isn't a CGN. More to the point, if your /24 identifies your economy, you're now subject to IPR limits and can be told different things.

So some of the objection to ECS is rooted in opposition to regional IPR (Netflix). Sub-optimal CDN delivery (to one person) is wall avoidance (to another).


What exactly do you use DNS for? Most folks use it to resolve a domain name so that they can make an HTTP and/or HTTPS request from the very same IP address over the very same internet connection. Surprise: these subsequent HTTP/HTTPS requests carry your complete identifying information, down to the very specific /32 IPv4 or /128 IPv6 address uniquely assigned to yourself.

So, in reality, the extra privacy gained from not doing ECS is hardly something with a measurable effect, because this information HAS to leak in any case. Even if you make DNS encrypted, even if you employ encrypted TLSv1.3 SNI, the IP addresses will still leak, and with much higher precision anyways. So, this we-don't-do-ECS-because-privacy is a rather pointless statement in the end.


+1 @cnst, what privacy concerns are you all afraid of with ECS? archive.is will be informed that someone around IP range A.B.x.x is trying to reach its website, only to see a connection from A.B.C.D a few seconds later.

The main reason Cloudflare won't share this info is to prevent competitors like Akamai from operating a CDN as good as theirs. It looks more like sabotaging competition than increasing privacy.


> The main reason Cloudflare won't share this info is to prevent competitors like Akamai from operating a CDN as good as theirs. It looks more like sabotaging competition than increasing privacy.

Exactly. Their own answers in the threads over here at HN are basically admitting as much — they claim to be working on solutions alternative to ECS, because Google and some others have more PoPs than Cloudflare does. They're obviously using this as a competitive advantage to slow down competing CDNs. And no one's talking about it!


In practice this is one of the most common techniques used to deanonymize proxy users.

How? If your client is leaking DNS outside of the confines of the proxy you're using, you've got bigger problems than ECS.

I don't disagree about DNS leaks being a separate problem, that doesn't change anything about what I said though.

Even if ECS only reveals your /24, immediately afterwards you're going to connect to the service with your own IP, so Eve can correlate the pair of domain name and /24 from the ECS request with the source IP from the TCP connection to match your IP with the domain name you're navigating to.

This is not the privacy concern; check out https://tools.ietf.org/html/rfc7871#section-11.1, which discusses it. Yes, if you open a connection to the target IP, then all transit networks between the client and the target IP (including the target itself) know who is talking. These are on-path parties. The main privacy issue with ECS is not this, but that it shares the client's subnet with potentially every nameserver on the referral path (including transit networks between the recursive and the nameserver), for every name the client looks up (even when the nameserver might not support ECS). The client is also not in control of the prefix length: /24 for IPv4 is the recommended default, but the recursive may use however much it wants and there's no way to prove to the client that it didn't. Opt-out is also difficult (afaik only getdns and Firefox clients support an opt-out).

> but that it shares client's subnet with potentially every nameserver on the referral path [...] for every name client looks up

Does CF DNS not use qname minimization? That would reduce the association between subnets and names looked up.
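
(For anyone unfamiliar: qname minimization means the resolver asks each zone in the delegation chain only for the next label, rather than sending the full name everywhere. A tiny illustration of the sequence of names a minimizing resolver would use, just to make the idea concrete:)

  labels = "www.archive.is".rstrip(".").split(".")
  # The root is asked about "is.", the .is servers about "archive.is.", and so on.
  minimized = [".".join(labels[-i:]) + "." for i in range(1, len(labels) + 1)]
  print(minimized)  # ['is.', 'archive.is.', 'www.archive.is.']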


I don’t understand the privacy reason. If I am querying for domain x, why does it matter that domain x’s DNS servers know what IP I am querying them from? I am going to hit their web server directly with that very same IP in a few milliseconds anyway.

There are a few reasons. Here are three I can think of off the top of my head:

Many browsers prefetch DNS for links on webpages these days. So it’s entirely possible and even common that you may query DNS for sites you never visit, which would indeed be a privacy leak.

Secondly, many sites have their DNS hosted elsewhere so it may not be the same people you are leaking the information to.

Thirdly, if the DNS query is transmitted to the site’s DNS servers in plain text (which most DNS is), then despite eSNI etc anyone who has access to the wire traffic along the route from the DNS proxy to the site’s DNS servers (which is probably different from the route your own traffic takes to their servers) can see which site you wanted to access.


Does 1.1.1.1 send ECS info to Cloudflare’s own nameservers? More generally, does 1.1.1.1 in any way treat Cloudflare’s own nameservers in a special way and send it information that it doesn’t send to others?

If the answer to these questions is no, then Cloudflare’s reasons for blocking ECS (ie privacy) carry weight. Otherwise no.


> Does 1.1.1.1 send ECS info to Cloudflare’s own nameservers?

No

> More generally, does 1.1.1.1 in any way treat Cloudflare’s own nameservers in a special way and send it information that it doesn’t send to others?

No


When using 1.1.1.1 as resolver Cloudflare has your full IP. So they can fully track you.

ECS doesn't even forward the IP, only the /24.


Not sure why this is a link to stackexchange as the second answer is lifted from the previous HN discussion on the topic

https://news.ycombinator.com/item?id=19828317


I think archive.is's decision is very interesting. 1) They attracted a lot of attention; 2) They showed a way to push back against a Cloudflare business practice that abuses their service.

If several bigger CDNs like Akamai or SoftLayer were to consider requests from 1.1.1.1 without EDNS as invalid and block them, Cloudflare wouldn't be able to just say that it's their own problem.


I'm using Cloudflare's DoH service built into Firefox and archive.is is resolvable.

I think that's because Firefox falls back to the system resolver if something funky happens. I'm using it strictly over DoH and it still isn't working for me.

Is adding "62.192.168.106 archive.is" to /etc/hosts a work around?

I added "217.79.184.91 archive.is" to /etc/hosts and did "sudo service systemd-resolved restart".

It works.


Roads...? Where we're going, we don't need roads.

This is the reason I stopped using 1.1.1.1.

A very silly one at that, given that their reasons for not resolving archive.is are quite rational and, on the contrary, make me want to swap Google's DNS servers for theirs.

> their reasons for not resolving archive.is

They do resolve archive.is. But archive.is's DNS servers have been configured to return bogus answers to queries from Cloudflare's servers.


Not Cloudflare's servers in particular. They demand EDNS (optional by design), which Cloudflare does not support due to privacy risks.

From https://news.ycombinator.com/item?id=21155852

> Archive.is does not block all requests lacking EDNS. They specifically block requests coming from Cloudflare's datacenters.


Wow, what a bunch of not-so-smart people.

What exactly do you use DNS for? If it's to subsequently make a HTTP and/or HTTPS request, then your full IP address (and not just a /24 subnet) will be leaked to the very same parties anyways.

Even if they eventually make DNS encrypted, even if encrypted TLSv1.3 SNI works properly (and both of these are pretty big ifs, BTW), the IP addresses will still leak, always, and with much higher precision anyways. So, this we-don't-do-ECS-because-privacy is hardly a rational statement on Cloudflare's part in the end — it merely breaks the performance of their competitor CDNs without any real privacy angle.


DNS isn't always run by the place where the site is hosted, and until the other two are implemented, everyone along the lookup path can also see where you are going. Increasingly, a destination IP is becoming less of a hint about what you are browsing to.

Whether or not you think that's enough to care about, it's very different from the picture you painted.


Quad9 has a variant, 9.9.9.11, that works well with CDNs.


