Does Cloudflare's 1.1.1.1 DNS Block Archive.is? (2019) (jarv.is)
181 points by jahnu on Sept 11, 2021 | 116 comments



Out of curiosity - not defending the behavior - what kind of problems could omitting EDNS cause? What is the steelman case for Archive.is here?

The author says Archive.is's claim that it causes problems is "questionable", but he doesn't mention what those purported problems are or address why they're illegitimate, so it's hard to evaluate whether that's accurate.


Archive.is uses ECS (EDNS Client Subnet, which sends the client IP's /24 to the authoritative nameserver) for geo-based load balancing. The problem is that all IPs in a /24 are highly likely to belong to the same city for residential connections, so plugging it into a geoip service is likely to show the actual city & state that a request originates from (the entire point of ECS).
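To make the /24 point concrete, here is a minimal sketch of the truncation a resolver applies before sending ECS. The function name `ecs_subnet` and the sample addresses (taken from a documentation-only range) are assumptions for illustration, not anything a real resolver exposes.

```python
import ipaddress


def ecs_subnet(client_ip: str) -> str:
    """Return the /24 network that would be reported for an IPv4 client."""
    # strict=False accepts a host address and zeroes the host bits for us.
    network = ipaddress.ip_network(f"{client_ip}/24", strict=False)
    return str(network)


# Two different hosts in the same /24 are indistinguishable to the
# authoritative side, but a geoip lookup on the /24 still finds the city.
print(ecs_subnet("203.0.113.77"))
print(ecs_subnet("203.0.113.200"))
```

Both calls yield the same network, which is why the subnet hides the individual host while still leaking the approximate location.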

https://twitter.com/archiveis/status/1018691421182791680 (screenshot: https://aws1.discourse-cdn.com/cloudflare/original/3X/8/2/82... )


But when the user goes to use the IP address they got back, even more detailed information is going to be given to the endpoint; I can see this maybe being a benefit for TXT records or something?

Hiding ECS from DNS queries seems to mostly just further create imbalances between companies that can afford routing at the IP level over companies that want to do cheaper routing at the DNS level.

(And like, if you attempt to directly mitigate the final IP problem by using a VPN or CG-NAT or something, that same solution will work for the DNS resolver, so I really am seeing no benefit.)


You can still do routing at DNS-level as long as you have a less dense infrastructure than Cloudflare.

>1.1.1.1 is delivered across Cloudflare’s entire network that today spans 180 cities. We publish the geolocation information of the IPs that we query from. That allows any network with less density than we have to properly return DNS-targeted results.


>> 1.1.1.1 is delivered across Cloudflare’s entire network that today spans 180 cities. We publish the geolocation information of the IPs that we query from. That allows any network with less density than we have to properly return DNS-targeted results.

Cloudflare makes an exception to this rule for Archive.{today,is,...} domains. All queries for these domains come from Amazon EC2 in the U.S., not the 180 edges of Cloudflare. This was on blog.archive.today. Why? Who knows. But the decision to break up is made by both parties, not just the archive.


Source?


https://blog.archive.today/post/623568857709395968/i-from-th...

There was another answer, which I could not find quickly, where what is called "another free dns service" here was named as Amazon.


Amazon doesn't make any sense as the "another free DNS service" since they're described as free and "much smaller than Cloudflare", and Amazon is neither of those things.

And in any case, if I assume that Cloudflare is indeed proxying all DNS queries for Archive.today through some shitty EC2 instance that causes Archive.today to not have any geo information, it's a completely self-inflicted wound. They could've gotten geographical data from 1.1.1.1 to the accuracy of the edge nodes but decided to just outright block 1.1.1.1.


To add, I can't find any other evidence of this. This community post was posted on the same day as that blog entry, and archive.is still isn't loading: https://community.cloudflare.com/t/getting-servfail-for-some...


Amazon is indeed not "another free DNS service"; those must have been at different points in time. Having overloaded a "free DNS service", they launched an EC2 instance, or vice versa.

> They could've gotten geographical data from 1.1.1.1 to the accuracy of the edge nodes but decided to just outright block 1.1.1.1.

Yes.

But they cannot go back to getting information from the edges simply by unblocking.

Since CloudFlare does not send the queries from the edges.

So there is a deadlock atm.


To add, apparently another reason is that he believes using Cloudflare as your recursive resolver could lead to phishing[0]:

> the same entity which answers your DNS queries is able to issue SSL certs for any domain, so using CloudFlare DNS you never know whether you access the original website or a fishing one

Generally this is protected via certificate transparency+CAA records. If CF's CA were to issue a bad certificate, it'd be blocked by the browser and, should it get out, jeopardize the entire company, likely DigiCert as well given they cross-signed Cloudflare's issuing CA.

0: https://blog.archive.today/post/634795612966125568/when-will...


I don't fully understand how archive.is operates. They don't remove copyrighted content (which I like, since it provides a useful service), they must have probably terabytes upon terabytes of data in some datacenter somewhere, yet they never seem to be shut down by the govt or their datacenter/cloud provider. Am I just naive to be surprised by this? How does all this work exactly?


It’s pretty shadowy for sure.

There’s basically no information on the web site about the company, how they operate, who finances them, what their privacy policy is, or even how to contact them. Their “blog” is an anonymous Tumblr site.


It’s easy, you just make yourself unavailable for contact, there’s plenty of providers that don’t care what you do. It’ll take years to get banned.

The internet is global, it’s a choice to apply US law like the dmca. You can also choose not to.


They’re in russia so dgaf about this. The insistence on tracking and absence of https when served from inside russia kind of implies all sorts of things. Use archive.org


I will continue to use them because things get taken down from archive.org sometimes, or at any rate far more often than with archive.is. I consider it a bug when anything disappears from an archive for any reason.


It’s apparently self funded by him. I don’t know how he doesn’t get shut down but I don’t know how all the DMCA avoiding pirated streaming sites don’t either, or how sci hub isn’t domain blocked in the US, or why millions of copyrighted books aren’t DMCAed from libgen.


Fair use applies i think.


Things on archive.org get DMCA'ed all the time


I'm almost certain the actual reason is producing all the infrastructure to DMCA content on Google/Facebook/Twitter/Youtube and other "mainstream" Web 2.0 platforms is expensive.

Copyright enforcement is a lot stricter on Youtube, than say, Reddit video.

It's all down to the engagement of the IP owners.

My guess is they just haven't gotten around to chasing Archive.is down that hard.


It’s a value proposition. Archive.is doesn’t host Hollywood blockbusters. If it is hard to take them down, nobody is going to do that as long as they don’t cause too much real damage.


I don't think retaining and publishing complete copies of copyrighted works falls under even the most generous interpretation of fair use.


It should, but I don't think it does in its current form.


There have been at least two past threads about this:

Tell HN: Unexpected errors with Archive.is on Cloudflare 1.1.1.1 DNS - https://news.ycombinator.com/item?id=23315640 - May 2020 (10 comments)

Tell HN: Archive.is inaccessible via Cloudflare DNS (1.1.1.1) - https://news.ycombinator.com/item?id=19828317 - May 2019 (197 comments)

as well as god knows how many comments...


In case you use pihole and want to use cloudflare for everything but archive.is, you can create the following file:

cat /etc/dnsmasq.d/02-archive.is.conf server=/archive.is/8.8.8.8 server=/archive.is/8.8.4.4 server=/archive.li/8.8.8.8 server=/archive.li/8.8.4.4 server=/archive.to/8.8.8.8 server=/archive.to/8.8.4.4


Fixed formatting, sorry.

  cat -p /etc/dnsmasq.d/02-archive.is.conf
  server=/archive.is/8.8.8.8
  server=/archive.is/8.8.4.4
  server=/archive.li/8.8.8.8
  server=/archive.li/8.8.4.4
  server=/archive.to/8.8.8.8
  server=/archive.to/8.8.4.4


There are quite a few different TLDs that the site uses (for resilience I presume). .vn is another, .today may still be in use.


Similar for dnscrypt-proxy: add "forwarding_rules = '/etc/dnscrypt-proxy/forwarding_rules.txt'" to /etc/dnscrypt-proxy/dnscrypt-proxy.toml and then populate the forwarding_rules.txt with lines like "archive.is 8.8.8.8".


I never even thought of configuring my pihole to allow this. Thanks! Much appreciated!!


> I wrote the following reply to Matthew, praising his team's focus on the big picture

Okay great, but:

> 1.1.1.1 is delivered across Cloudflare’s entire network that today spans 180 cities. We publish the geolocation information of the IPs that we query from. That allows any network with less density than we have to properly return DNS-targeted results.

> massive mismatch (not only on AS/Country, but even on the continent level) of where DNS and related HTTP requests come

The problem isn't really EDNS. And someone is either lying or very incorrect.

This should be resolvable. The two sides don't want incompatible things. Has there been zero progress since?


>> 1.1.1.1 is delivered across Cloudflare’s entire network that today spans 180 cities. We publish the geolocation information of the IPs that we query from. That allows any network with less density than we have to properly return DNS-targeted results.

Cloudflare makes an exception to this rule for Archive.{today,is,...} domains. All requests for these domains come from Amazon EC2 in the U.S., not the 180 edges of Cloudflare. This was on blog.archive.today. Why? Who knows. But the decision to break up is made by both parties, not just the archive.

Source https://blog.archive.today/post/623568857709395968/i-from-th...


There are a lot of emotions in the comments here today. CloudFlare provides a clear response and it has merit. Archive.is surely is not 100% reliant on this single mechanism to load share or determine correct routing to cache locations. I agree with the poster: I can't see a reason why they would block this via Cloudflare when so many other mechanisms exist that they should already be deploying to satisfy their requirements across multiple layers of the stack. Edit: The position makes or made no sense and smells fishy.


archive.is has some sketchy tracking built into it.

I still use (the discontinued) umatrix with firefox and any archive.is request makes a lot of specific-to-you tracking pixels. like *.pixel.archive.is that drill down to your browser address


It's important to remember, at least for corporate environments, that EDNS Client Subnet is important when working with services such as Exchange Online where the local resolver is what determines your EXO Front Door. If you're using a service like 1.1.1.1, you may be routed to an incorrect Front Door causing increased latency (primarily with search and archive mailboxes which aren't cached).

Quad9 does have a service which provides EDNS Client Subnet support, should you want to leverage it.


Cloudflare has worked with providers to make sure they can efficiently route. If you find a case where this isn’t true, please let us know.


Cloudflare DNS does not route efficiently with AWS CloudFront anycast DNS. I tracked down insanely slow `rustup update` downloads to incorrect selection of ideal routes to the AWS resources caused by using CF to resolve the DNS. Switching to a different resolver that works with anycast and EDNS fixed it.

CF saying “we break standard DNS geo routing but work with providers to route things right” isn’t very inspiring.


> Cloudflare DNS does not route efficiently with AWS CloudFront anycast DNS. I tracked down insanely slow `rustup update` downloads to incorrect selection of ideal routes to the AWS resources caused by using CF to resolve the DNS.

Please send me details (silverlock at cloudflare) here - AWS has our geofeed.

If you can include resolution details - e.g. dig @1.1.1.1 <cloudfront-host> +nsid - with the incorrect CF results, we can provide them to AWS.

Folks did geo-routing with DNS long before ECS was included, and there’s a privacy trade-off to be had. We’re exploring ways to make this better but there is no free lunch.


Thanks for providing your info. I stopped using CF for resolution because of this almost two years ago; I don’t have a reason to think the situation changed but if I get a chance I can try to reproduce it and get back to you.


I have Quad9's DNSCrypt configured with ECS on my Pi-Hole and it returns the IPs for archive.is and archive.today. Just tested it.


> Sure, it's annoying that I'll need to use a VPN or change my DNS resolvers to use a pretty slick (and otherwise convenient) website archiver.

You can alternatively look up the IP address using something other than Cloudflare DNS and add entries to your /etc/hosts file for archive.is and archive.today.


I had a similar issue with VoWiFi on my network due to EDNS, and it's rightly pointed out by msilverlock in the forum.

  # dig vowifi.jio.com @1.1.1.1 A
  ;; ANSWER SECTION:
  vowifi.jio.com. 5 IN A 49.45.63.1
  vowifi.jio.com. 5 IN A 49.45.63.2
  ;; SERVER: 1.1.1.1#53(1.1.1.1)

  # dig vowifi.jio.com @8.8.8.8 A
  ;; ANSWER SECTION:
  vowifi.jio.com. 4 IN A 49.44.59.36
  vowifi.jio.com. 4 IN A 49.44.59.38
  ;; SERVER: 8.8.8.8#53(8.8.8.8)

https://community.cloudflare.com/t/vowifi-issues-due-to-poss...

As the article links to and says, it's "privacy versus convenience", and I am happy that CloudFlare chose the former.


Those million Firefox users who chose Cloudflare DNS must know that they chose "privacy versus convenience" too and be happy with that.


There is so much I do not know about dns.

I thought: I ask dns server about domain, they return an IP address. I connect to IP address and they in turn can see mine.

So why does Cloudflare need to a) query the domain for an IP address on my behalf? Can’t they just do it on their own behalf and cache the results? b) Why do they need to hide my IP address information from the domain? Aren’t I going to visit the destination regardless?


some people use the IP from the DNS query to return a server closer to you

that's pretty much it


(2019)


I noticed a few days ago that it didn't work for me, but used a site up/down checker that also said it was down. Figured they were having issues and didn't think any more about it. I just added a static entry to my pihole, so it works for me now.


i have no trouble resolving www.archive.is via 1.1.1.1 at all...

however, i noticed archive.is has a CNAME record pointing to www.archive.is, while CNAME RRs on apex domains are usually not allowed in DNS... what makes this even more interesting: i only see the CNAME RR when querying via 1.1.1.1 and not when querying the authoritative servers for archive.is (EDIT: while repeating the query via 1.1.1.1 i also saw both an A and a CNAME record for archive.is in the response :S)

maybe the initial issue is just gone already? considering this is apparently happening in 2019 that does not seem too unlikely...


It doesn't. At least I can view it while using 1.1.1.1


Archive.is is unironically one of the most important websites in the world. I hope this mess gets fixed but I am not holding my breath because we have been in the same position for years now.

Interesting read on the probable owner of the site : https://webapps.stackexchange.com/a/149405


That reads a lot like doxxing; if someone isn't open about their identity, they don't want it out, and doing sleuthing work like this (or linking to it) can be considered doxxing.

If archive.is hosts content that has been removed due to oppressive regimes' policies (including western ones), exposing their identity may put them at risk.


I think the question asked on SE "On which country are the creators and servers of archive.today / archive.is based?" incurs not a 'who is he', but 'should I trust them based on their national allegiance'. A similar idea could be presented of large Twitter misinformation accounts that have influenced the 2016-2020 (and future) elections - they're not open about their identity, but the actions they're doing most people would disagree with, so most would decide that it is morally justifiable to go digging for clues to find the source of the misinformation.

For archive.is, it's lower-stakes, but you might not be able to trust the site as an authoritative source in $x years should (for example) their home government take it over and strategically modify archives for their own purposes.


The operator of archive.is is circumventing copyright law in close to every country on earth, including all the democratic ones. Its unique selling point is that they do not comply with site owners' requests not to archive content or to delete content archived in the past.

While that doesn't exclude them from the protection of law, my conviction is slightly weaker when it comes to arbitrary standards of behaviour people on reddit invented. How many pages do they happily serve that contain private information long deleted from the actual websites? When they mutter under their breath, "information wants to be free" (as they are wont to do, at least how I imagine it), does their definition of information include their identity?

(I'm slightly irritated by the "research" in that post, though... I really don't need Wikipedia to believe that -vich is a jewish name. And jewish names of Ukrainian/Russian origin are certainly not specific to that location today. I bet there are more people with that last name in Florida than in all of Eastern Europe combined)


Archiving the Internet is not stealing the history books. It’s writing them.


I don't think we need metaphors to grasp what it is. Its importance is so obvious, even the people that wrote copyright law created an exemption for it.

That exemption includes an opt-out provision. And while I could see how ignoring such requests could be in the public interest in some cases, ignoring them wholesale is fundamentally incompatible with any view of morality that condemns "doxing".


I condemn doxing in principle. But if it’s out there once, it’s out there. To try to stuff the genie back only harms those who lack the information, regardless of intent.

I understand you don’t care for metaphors but I can’t help wondering who you mean by “we”? Perhaps “we” are not the intended audience. Please let “we” know “we” are free to ignore.


I find this highly implausible, all of the accounts archive.is is "logged into" would have to be put there in a very explicit manner. I'd assume that all of the accounts are fake or appropriated accounts.

For example @volth on Github - as a person - is still around in other places, so I'm guessing that account was stolen and they don't have a way to get it back.


That would be security by obscurity, something which the creator of such an important website does not have the privilege of relying on.


The parent comment is pointing out the moral/etiquette/guideline issue, not making a judgement about security posture.


"Archive.is is unironically one of the most important websites in the world"

Are you sure you're not confusing it with the internet archive https://www.archive.org/


I am not talking about archive.org

Archive.is is faster and does not respect robots.txt. It is recommended by Wikipedia and is widely used by journalists worldwide.


Both sites are important, used by Wikipedia editors, and used by journalists worldwide.


Everything on archive.is is on archive.org. I would say 99% of stuff.

If archive.is goes down we have webcitation.org, etched.page, ghostarchive.org, webrecorder.net, etc....


archive.org obeys the robots.txt exclusion but archive.today doesn't. This means that many websites (like 4chan) cannot be archived with archive.org.


That's true, and the same goes for webcitation.org, which doesn't obey robots.txt either.

4chan has archives dedicated for that site anyway (Warosu, etc.).


>Everything on archive.is is on archive.org. I would say 99% of stuff.

Given how a LOT of the stuff today is behind paywalls and Archive.today breaks through most of them and Archive.org doesn't, your "99%" figure is way, way off when it comes to popular stuff.

Anyway, I donate to both Archive.today and Archive.org. They're extremely valuable to me. I feel like Archive.today is in a dire situation when it comes to funding so I donate more than double to them each month.

If you're able, please donate to these sites. They are running on fumes. And take a look at my profile for a list of other orgs to donate to.


I should have said 99% of stuff that is at risk of being lost forever. Paywalled content from major news companies isn't going anywhere anytime soon.


The LinkedIn profile does not exist anymore. But

"Bachelor of Engineering Bachelor at the Humboldt University of Berlin."

This sounds fishy. I am not sure that you can get an Engineering degree at this University.


Currently only B.A. and B.Sc. as per their studying guide.

I don't know whether in the past there was some way to obtain said degree.

Edit: Bachelor's degrees other than those two are incredibly rare in Germany. For full universities (not universities of applied science) doubly so.


Is “Informatics” a Bachelor of Engineering?

https://www.informatik.hu-berlin.de/de/studium/Master


Informatics is roughly what's called computer science in the US.


no in germany informatics is bachelor of science


I can't resolve with either 8.8.8.8 or 1.1.1.1.


I switched Chrome to use secure DNS with neither Croogle or CloudBlare (used OpenDNS) and now it works fine. Fuck the megacorps.


OpenDNS is owned by Cisco; it's not exactly a mom-and-pop operation.


amazing how cloudflare has framed this anticompetitive move as a privacy thing.

it doesn't matter if your dns resolver leaks part of your ip address to archive.is's dns servers when you're about to connect to archive.is from your ip address anyway. the only thing dropping the edns client subnet does is prevent services you use from giving you a server that's closer to you when you do the dns lookup. this performance issue, of course, does not affect sites using cloudflare.


Just so we’re on the same page: Cloudflare decided globally not to include client IP in the EDNS data. Then archive.is decided to block Cloudflare’s resolvers from getting accurate records for their site.

To circumvent this, Cloudflare would have to reverse their global stance or make a special exception to satisfy archive.is.

It’s unclear how we could draw “anticompetitive” from this.


Understood, but why? Privacy is not an acceptable answer for the reasons OP stated. If Cloudflare gave a coherent, understandable reason, I'd probably be more on their side.

"Trust us, our network is big enough it will route right" is neither a good answer nor true.


Privacy isn’t an absolute pass/fail. Giving authoritative nameservers my IP via EDNS leaks my IP. Sure, other things also leak my IP, but that doesn’t mean we should throw in the towel and accept any new way to leak user data.

In many cases, DNS logs aren’t going to the same place as web server logs, so this keeps my data in fewer log files owned by fewer people.


It isn’t the actual IP, it is the subnet. Leaks some info, but unless you own the entire subnet it won’t give up your identity.

https://en.wikipedia.org/wiki/EDNS_Client_Subnet


The entire point of ECS is to give the location, not the actual origin IP, which might be something you'd like to avoid giving away. The main point is that every resolver or network switch in the chain gets the ECS and would be able to combine it with the domain being requested. If you don't only visit Facebook/Google, your ipv4 /24 in combination with some obscure domain only you visit is very likely to give up your identity should an IX or resolver be watching for requests to such domain.


I understand that point, to an extent. I mean, your TCP connection in the next step hits how many switches on the way? With those, both your actual IP and therefore your location could be determined. Trying to hide the subnet from just a resolver seems...small in the grand scheme.

And if that's your goal, why not proxy your dns requests? I'd surely have a VPN or at least DNS proxy if my threat model were that which you're trying to avoid.


Sure, that is true. However, the person I responded to said that EDNS would give the authoritative server your IP address, which isn't true.


He didn't mean anticompetitive towards Archive.is, he meant towards all content providers in general. By making them all less capable of delivering low-latency content, it makes Cloudflare appear better by comparison. Not sure how likely that would be but I'm pretty sure that was OP's meaning.


Cloudflare (Matthew Prince personally, here on Hacker News a few months ago) said that they do reverse their global stance for Netflix and some other megacorps.

So this is a super-premium feature unavailable to small players.

CloudFlare just changed how DNS behaved and charges corps to make it work as it worked before CloudFlare entered the stage.


Do you have a citation for that? Sourcing from https://news.ycombinator.com/item?id=19828702 , they don’t reverse their global stance for large providers. Their stance is ~”Including client IP via EDNS violates our goal of maximizing user data privacy”, and what they’re working on with other large-scale providers is a way to improve geo-resolution without weakening user privacy.


Exactly on your link, just ctrl-F for "Netflix":

"We are working with the small number of networks with a higher network/ISP density than Cloudflare (e.g., Netflix, Facebook, Google/YouTube) to come up with an EDNS IP Subnet alternative that gets them the information they need for geolocation".

Well, I might be inaccurate in saying "exactly the same protocol as before", but it is clear that what was available to every webmaster via EDNS is now available only to members of a closed club, via good old EDNS or a proprietary alternative. The latter is more likely, not because of privacy-caring, but because they could now charge for it as a license fee for using a private protocol.


EDNS is an optional field. Client subnet is an optional part of that optional field. It’s relatively new compared to DNS as a whole, and most “webmasters” don’t make active use of it.

The quote you pulled is about Cloudflare’s efforts to build a better standard. They’re talking to the people with the expertise and interest to build that standard. You’ve inferred “proprietary” and “closed club”, and a ton of motive besides, and you’ve copy-pasted that speculation as if it were fact into multiple comment trees.


1. EDNS is needless when you are using your provider's DNS. It is needed for public DNS servers. So it is optional, as it is needless most of the time. Before the launch of Cloudflare DNS, the biggest public DNS service was Google's, who developed and implemented EDNS. Then comes Cloudflare and "the people with the expertise and interest" to rethink that.

2. I assume that commercial companies are here to make money, not "a better future" (besides the better future for the shareholders). If they implement something, the first question is how do they make money with it.


I’m not going to debate your stance on how you assess someone’s motivations, but it does seem like you shouldn’t attempt to present your speculation as fact.


I think they mean they're working on an alternative standard, not anywhere near "we give you an API to match DNS requests to origin city". These talks might have been as simple as "we'll give you [and everyone] geoip information for the datacenters we request from based on IP, and you can load balance off that".


I do not think it matters much whether the standard is good old EDNS or something new, for example supplying the city name in text form instead of hiding the last bits of the IP as EDNS does.

Google's 8.8.8.8 provides the client IP via EDNS to every webmaster, zeroing at least 8 bits for privacy; it was made with privacy in mind too. The privacy could be tuned by zeroing 10+ instead of 8+ bits, etc. There is nothing wrong with EDNS and privacy that would require abandoning EDNS on privacy grounds.
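For anyone curious what "zeroing 8 bits" versus "zeroing 10 bits" looks like on the wire, here is a rough sketch of the Client Subnet option as laid out in RFC 7871 (option code 8, then family, source prefix length, scope prefix length, and the truncated address). It covers IPv4 only; the function name `build_ecs_option` and the sample address are made up for illustration, and this is not how any particular resolver implements it.

```python
import ipaddress
import struct

ECS_OPTION_CODE = 8  # EDNS option code for Client Subnet (RFC 7871)
FAMILY_IPV4 = 1      # address family number for IPv4


def build_ecs_option(client_ip: str, source_prefix: int = 24) -> bytes:
    """Encode an IPv4 Client Subnet option for a query (hypothetical helper)."""
    # Zero every bit beyond the chosen prefix length.
    network = ipaddress.ip_network(f"{client_ip}/{source_prefix}", strict=False)
    # Only as many address bytes as the prefix needs are sent.
    addr_len = (source_prefix + 7) // 8
    address = network.network_address.packed[:addr_len]
    # family (2 bytes), source prefix length, scope prefix length (0 in queries)
    payload = struct.pack("!HBB", FAMILY_IPV4, source_prefix, 0) + address
    # option code (2 bytes), option length (2 bytes), then the payload
    return struct.pack("!HH", ECS_OPTION_CODE, len(payload)) + payload


print(build_ecs_option("203.0.113.77", 24).hex())  # 0008000700011800cb0071
print(build_ecs_option("203.0.113.77", 22).hex())  # 0008000700011600cb0070
```

With a /24 the last octet is dropped entirely; with a /22 two more bits of the third octet are zeroed as well, so the only knob is how coarse a subnet the resolver chooses to reveal.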

And Google provides that FOR FREE. To everyone.

How can I - as webmaster - get similar info from 1.1.1.1? Not being a Silicon Valley megacorp.


Again, you keep presenting this as something Cloudflare provides to “megacorps” for money. There’s no evidence this is the case, it’s just your speculation.

I’m really sorry that you somehow depend heavily on EDNS Client Subnets, a feature that was only standardized 5 years ago. But it’s optional, per the spec, and Cloudflare has published their rationale for not enabling it on their resolvers.


Please, tell me - not a megacorp webmaster - how can I opt in to the Cloudflare program available to Facebook/Netflix, to get what is available freely as the source IP of the UDP packet in the absence of planet-wide public resolvers and what Google gives for free trying to mitigate the inconvenience caused by the planet-wide resolver.

Indeed, my comments about possible motivation are speculation, but I do understand why webmasters block CloudFlare DNS.

I wonder why there are so few of them.


“We publish the geolocation information of the IPs that we query from”, from the linked comment above. They publish the same info to you and Netflix and me and Amazon.

You keep presenting a difference between what “you” get and what a “megacorp” gets, without any evidence that they’re getting something different from you. You also sidestep here into a complaint against “planet wide resolvers”. To a rounding error, nobody is running their own recursive resolvers. Everybody uses either their ISP’s DNS provider or one provided by a large network entity, virtually all of which are companies. This has been the case for decades. So anybody relying on the source IP of the UDP packet is just out of luck, and has always been out of luck. It’s clear you wish this wasn’t the case, but Cloudflare and Google aren’t really changing the game here, and they don’t owe you optional features because you really want to see user IP data.


I guess you just do not understand what EDNS is, and why it is optional and why its optional-ness is not a pro-CloudFlare argument.

It is very simple:

Query(source IP is an ISP in Paris, no EDNS): gimme IP of "website.com"

WebsiteComDNS: IP of the server closest to Paris

Query(source IP is Google, no-EDNS): gimme IP of "website.com"

WebsiteComDNS: Hm, it is likely Google Cloud, or GoogleBot, answer with IP of own server on Google Cloud

Query(source IP is Google, EDNS: I am acting on behalf of a user in Paris): gimme IP of "website.com"

WebsiteComDNS: IP of the server closest to Paris

Query(source IP is Cloudflare, no-EDNS): gimme IP of "website.com"

WebsiteComDNS: where the fuck is CloudFlare? Africa? answer with something random


I appreciate that you’ve moved from assuming what Cloudflare is doing to assuming what I understand. I think this thread has run its course.


I concluded that from sentences like "they don’t owe you optional features because you really want to see user IP data" which reveal a misunderstanding of who is sending queries and who decides what to answer to those queries.

From your text it looks like webmasters are sending requests to CloudFlare to get user's IP.

This is totally wrong.

It is CloudFlare that wants to see the server IP, and in the query it has to explain how they will use this info: to which region they will forward my server IP.

That is what EDNS-client-IP is for.

If the requester refuses to explain what they need the server IP address for (and their goal cannot be derived from the source IP of the UDP packet, like in the case of local ISP resolvers), they may be denied the privilege of the honor of receiving responses.


> to which region they will forward my server IP.

They're not forwarding it at all. A request from LA will come from the LAX Cloudflare DC, and thus plugging in the requesting IP address into some geoip service will show Los Angeles, California. All you have to do to get this working is to fall back to the incoming IP if ECS is absent.
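The fallback described here could be sketched roughly like this on the authoritative side. Everything in it is hypothetical: the `GEO_TABLE` contents, the region labels, and the function names are stand-ins for whatever geoip database and server pool a site actually uses.

```python
import ipaddress
from typing import Optional

# Hypothetical geoip table; a real deployment would use a geoip database
# plus any geolocation feed the resolver operator publishes.
GEO_TABLE = {
    ipaddress.ip_network("198.51.100.0/24"): "paris",       # an ISP's customers
    ipaddress.ip_network("192.0.2.0/24"): "los-angeles",    # a resolver's egress range
}
DEFAULT_REGION = "default"


def lookup_region(address: str) -> Optional[str]:
    """Return the region for an address, or None if it is not in the table."""
    ip = ipaddress.ip_address(address)
    for network, region in GEO_TABLE.items():
        if ip in network:
            return region
    return None


def region_for_query(source_ip: str, ecs_subnet: Optional[str] = None) -> str:
    """Prefer the ECS subnet when present, else fall back to the source IP."""
    if ecs_subnet is not None:
        network = ipaddress.ip_network(ecs_subnet, strict=False)
        region = lookup_region(str(network.network_address))
        if region is not None:
            return region
    # No usable ECS: geolocate the resolver's own address instead.
    return lookup_region(source_ip) or DEFAULT_REGION


print(region_for_query("192.0.2.53"))                       # resolver without ECS
print(region_for_query("203.0.113.9", "198.51.100.0/24"))   # resolver sending ECS
```

Whether the resolver's egress address is a good enough proxy for the user's location is exactly what the two sides in this thread disagree about; the sketch only shows that the fallback itself is mechanically simple.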

Or time travel to 2010 and try to respond to DNS queries while no servers are sending ECS.
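A minimal sketch of that fallback, assuming a toy GeoIP table: the prefixes are documentation ranges (RFC 5737), and the city names and the `locate` / `query_origin` helpers are made up for the example. A real nameserver would consult an actual GeoIP database.

```python
import ipaddress
from typing import Optional

# Hypothetical GeoIP table built from documentation prefixes (RFC 5737).
GEOIP = {
    ipaddress.ip_network("192.0.2.0/24"): "Paris",
    ipaddress.ip_network("198.51.100.0/24"): "Los Angeles",
}


def locate(address: str) -> Optional[str]:
    """Return the city for an address, or None if the table has no match."""
    ip = ipaddress.ip_address(address)
    for network, city in GEOIP.items():
        if ip in network:
            return city
    return None


def query_origin(source_ip: str, ecs_subnet: Optional[str] = None) -> Optional[str]:
    """Use the ECS subnet when present, else fall back to the packet's source IP."""
    if ecs_subnet is not None:
        # Look up the first address of the announced client subnet.
        network = ipaddress.ip_network(ecs_subnet, strict=False)
        return locate(str(network.network_address))
    # No ECS in the query: geolocate the resolver that sent it.
    return locate(source_ip)
```

With this table, a query arriving from 198.51.100.7 without ECS resolves to the resolver's own location, and the same query carrying `192.0.2.0/24` resolves to the client's.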


> They're not forwarding it at all.

They indeed are, "for your privacy".

And our topic started exactly out of this:

From: https://webapps.stackexchange.com/questions/135222/why-does-...

``` Official Statement

archive.today had this to say about the issue:

https://twitter.com/archiveis/status/1017902875949793285

    2018-07-13T1545: yes, unlike other public DNS services, 1.1.1.1 does not support EDNS Client Subnet
https://twitter.com/archiveis/status/1018691421182791680

    2018-07-15T1958: "Having to do" is not so direct here. Absence of EDNS and massive mismatch (not only on AS/Country, but even on the continent level) of where DNS and related HTTP requests come from causes so many troubles so I consider EDNS-less requests from Cloudflare as invalid.
 
```

> Or time travel to 2010 and try to respond to DNS queries while no servers are sending ECS.

That is exactly what `archive.{*}` does.

It responds to

[+] requests from IPs with geo-information (as in 2010, and that still seems to be most requests)

[+] AND to requests from public global resolvers with EDNS, which supply information about the region to which the server IP will be forwarded (as in 2015)

[-] But not to requests from a public global resolver which conceals the source region (as a single privacy-minded megacorp does in 2019)


> You keep presenting a difference between what “you” get and what a “megacorp” gets, without any evidence that they’re getting something different from you.

I read it in Matthew Prince's sentence above

>

> You also sidestep here into a complaint against “planet wide resolvers”. To a rounding error, nobody is running their own recursive resolvers. Everybody uses either their ISP’s DNS provider or one provided by a large network entity, virtually all of which are companies. This has been the case for decades. So anybody relying on the source IP of the UDP packet is just out of luck, and has always been out of luck.

1. It is not a sidestep. It is my main point. EDNS-client-ip makes sense only for planetwide resolvers, and it is "optional" only because of that. EDNS-client-ip was designed especially for Google DNS. When you use the recursive DNS of your ISP in your city, the source of the UDP packet is in your city. When Google zeroes 8 bits of the IP, the EDNS-client-ip is still in your city. There is no need to know your exact IP to select the best server for you. CloudFlare refuses to disclose even that approximate location, which gives their anycast CDN an advantage.

2. There are no "decades" in what is a 5-year history. There are only two points on this timeline: the first is the launch of Google DNS, which introduced EDNS-client-IP to mitigate the inconvenience it caused webmasters; the second is the launch of Cloudflare DNS, and you know the story. The rest (Quad9, ...) are negligible. Yandex DNS might be comparably big and, like CloudFlare, it does not send EDNS-client-ip, not out of any privacy stance (my speculation: just out of laziness), but it is regional, so all requests from there can be safely rounded to Moscow. So we can consider there are only three cases out there: Google, CloudFlare (commonly referred to as "planetwide resolvers"), and all the rest are regional businesses, whose very network ownership discloses location.
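To make the "zeroes 8 bits" remark in point 1 concrete, here is a small sketch using Python's standard `ipaddress` module. The helper name `ecs_prefix` and the sample address (a documentation range) are assumptions for illustration; the /24 default matches the prefix length discussed in this thread.

```python
import ipaddress


def ecs_prefix(client_ip: str, prefix_len: int = 24) -> str:
    """Truncate a client address to the subnet a resolver would announce."""
    # strict=False lets ip_network() zero the host bits instead of raising.
    network = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
    return str(network)


print(ecs_prefix("192.0.2.57"))  # 192.0.2.0/24
```

Every client in the same /24 produces the same string, so the nameserver sees the neighbourhood of the client rather than the exact address.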

>

> It’s clear you wish this wasn’t the case, but Cloudflare and Google aren’t really changing the game here,

This is ridiculous. I will know the IP from the HTTP logs a few milliseconds later; we are talking about getting the origin city from the DNS query in order to answer with the IP of the nearest HTTP server.

>

> and they don’t owe you optional features because you really want to see user IP data.

So webmasters do not owe CloudFlare an answer when it wants to see server IP data, ok?

The divorce of indie webmasters from CloudFlare DNS is very natural; I just wonder why it is not massive.


Likely because DNS worked just fine without EDNS client IP (and indeed EDNS) for decades. For example, I remember the use of the 4.2.2.2 server, which was globally accessible but US-based. The responses though were 100% usable wherever you were on the planet. Equally, a national ISP running DNS servers would get you a country at most; a /24 gives you city or better location, carrier-grade NAT aside. Latency between the client and server may be slightly higher, but that's the end user's problem and not an issue for the site. In any case, it sounds like the Cloudflare source IPs for recursive DNS lookups are locatable via GeoIP, so I fail to see the problem.


Exactly.


No I didn’t.


EDNS is an optional feature in general. Client subnet is even more optional.

There may not be a whole lot of private information in the client subnet, especially since it seems likely that after querying for an A/AAAA record, a client would then send a packet to (one of) the resulting IP(s) and reveal their address, but it's not required to pass it on, and it seems better to reduce potentially private information passed on.


Can we not call literally everything "anticompetitive"? archive.is isn't a competitor of Cloudflare. Cloudflare doesn't treat them differently from any other site, they're not doing anything "to keep them down", their DNS product just has a focus that isn't compatible with archive.is' hunger for data.

That you might connect to archive.is directly isn't of any concern. You might also not do that, and they've decided that leaking data about the user isn't what they want to do.

It's not anticompetitive. It's not evil.


> Cloudflare doesn't treat them differently from any other site

Did we read the same article? Cloudflare is treating them, and anybody else that makes the same choices wrt EDNS, differently from the rest of the Internet.


Cloudflare treats everybody the same: they never include client subnet in the EDNS field.

Archive.is is manually having their nameservers respond w/ junk records when queried by Cloudflare’s resolvers.


I read the article, I'm pretty sure you are mistaken.


archive.is doesn't compete with cloudflare, but they (or other websites) might want to spend money on improving their performance. cloudflare's dns resolver being popular makes one non-cloudflare option for improving website performance less appealing.


Archive.is used to geoblock all Finnish IP addresses at one point because of some alleged dispute with the Finnish government. As far as I remember, he got a takedown request from the Finnish government and then had some incident at the Finnish border, saying that they were linked (as in, the company in question could order the Finnish government to harass him).

Works now though but it gives me doubts about the management of the site.


The people that run the Pirate Bay went to prison for operating that site, despite not hosting any illegal content there themselves. Archive.is faces a similar impedance mismatch with what different people and courts believe about intellectual property.

When's the last time you went to prison for something you believe in?


What does the Pirate Bay have to do with anything? Who went to prison here?


The point is more about what could happen when breaking laws in other countries than what did happen in this particular case.


Tbc, I haven't, but I'm also not the one throwing shade at the site's management.


I have long hated Cloudflare, so it's hard to be on their side here. They MitM large parts of the web and often trap you in long or even infinite loops with their horrendous checker that some sites unfortunately use. It's especially bad on less common browsers. In two cases I had to spoof my useragent or it would literally never pass the check, locking me out of several sites.

Even if the block was without reason, I would've thought "fair enough, Cloudflare sucks". It's weird to me how many people willingly use Google or Cloudflare DNS. I would have to be in quite the pinch to rely on either, like a site I need is blocked and I happen to remember their simple IPs, and also I can't use something like an ssh tunnel instead for some reason.


CloudFlare DNS is not the part that gives you the annoying anti bot checks. Those are from website operators who have opted in to the CDN and anti-DDoS services that CloudFlare offers.



