
A Chrome feature is creating load on global root DNS servers - BerislavLopac
https://arstechnica.com/gadgets/2020/08/a-chrome-feature-is-creating-enormous-load-on-global-root-dns-servers/
======
donor20
This issue only exists because, without Chrome's work here, virtually every ISP
/ WiFi provider etc. started hijacking DNS queries to pump trash ads down
your throat.

THANK you Google for stopping this total abuse of internet standards (very
annoying in some cases where there's no UI as well).

The irony of Verisign complaining about this is incredible actually - they
were the no. 1 abuser of this in the past! They go on about interception being
the exception these days. That's actually because Chrome put a stake right
through the heart of Verisign's efforts to intercept.

~~~
donor20
Separate question - if Google did what Mozilla did and wrapped DNS in HTTPS
and ran it to 8.8.8.8 (allowing the user to pick other resolvers) - would that
be better?

It feels like folks complain - you are sending too much traffic. If Google
said - we can handle it directly (they obviously can - the bandwidth YouTube
alone uses has got to be orders of magnitude larger) - people would complain -
no, no, send us the traffic.

Google already runs 8.8.8.8 and encourages adoption of it I think - maybe it
can handle the billion requests per day? Or maybe they can help Verisign scale
their infra?

~~~
firloop
Google running the DNS resolver wouldn’t fix this problem; due to the nature
of the requests that Chrome is making, the lookups would have to hit the root
server because they wouldn’t be cached anywhere else.

~~~
donor20
I'm suggesting let google run some root resolvers.

Not to rain on the sob story, but Netflix, Amazon and Google probably pump out
orders of magnitude more bytes than these requests.

It's axiomatic. If my session makes three 512-byte requests when I start
browsing, and then I watch 10 YouTube videos at 2K, then a Netflix movie at
4K, how in the WORLD is the bandwidth from the three 512-byte requests
crushing infrastructure?

And if it is, let google or amazon or someone with a clue run the infra.
Seriously - this makes no sense that THIS is the bandwidth killer out there.
Apple updates - sure, those are monsters and a ton of people get them. DNS
packets are pretty small by contrast.

~~~
sterlind
Bandwidth isn't the issue, it's the dynamic nature of the content and the
global consistency needed between DNS servers. Videos are static content; they
don't get updated so you don't need any schemes to check for updates.

It's certainly possible to design DNS to scale, it's just that they may be
overwhelmed by QPS right now because they made different trade-offs.

~~~
donor20
We are talking 15k qps. A single AMD machine should be able to handle much
higher rates - millions. Yes, there is overhead, but my point was, if Verisign
or whoever is in charge can't run these root servers to handle 15k qps for the
entire global internet then let Google do it.

These are ridiculously low numbers, and DNS is EASIER, not harder, than video.
Google is doing full live streaming, multi-res / multi-format delivery with
chat across a ton of platforms.

And no - root DNS does not change at a high rate of speed. It's a 2MB file you
can download yourself. .com stays .com for a LONG time and the TTLs are going
to be in hours and days.
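
For what it's worth, grabbing that file is trivial - a minimal sketch, assuming InterNIC still publishes the root zone at this well-known URL:

    import urllib.request

    # Well-known location of the published root zone (assumption: still current).
    url = "https://www.internic.net/domain/root.zone"
    with urllib.request.urlopen(url) as resp:
        data = resp.read()
    print(f"root zone is {len(data) / 1e6:.1f} MB")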

------
jiggawatts
In a previous YCNews discussion of this topic, I worked out based on public
Root DNS statistics that this traffic amounts to at most a few gigabits spread
across _hundreds_ of DNS servers, each of which has at least a gigabit
connection.
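
Rough arithmetic behind that estimate, using assumed round numbers rather than Verisign's published figures:

    # Assumed round numbers, not actual root server statistics:
    queries_per_day = 60e9        # Chrome probe queries hitting the roots per day
    bytes_per_exchange = 200      # query + NXDOMAIN response on the wire
    gbps = queries_per_day * bytes_per_exchange * 8 / 86_400 / 1e9
    print(f"{gbps:.1f} Gbit/s aggregate")          # ~1.1 Gbit/s
    print(f"{gbps * 1000 / 300:.0f} Mbit/s each")  # spread over ~300 instances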

Notice that all of the breathless articles going on about how evil Google is
are using percentages, not absolute numbers?

This is because journalists don't do journalism any more, they just put their
name on corporate PR that is intended as a weapon in a war for control.

~~~
x0
You know, that actually makes sense to me. You'd want a lot of margin. If they
were running the root DNS servers and were hitting like 80% of their bandwidth
on a normal day, I'd be worried.

~~~
GauntletWizard
We're not talking about 80%. We're talking about 8%. The size of the traffic
they are talking about is low enough that I, personally, would consider paying
for it. It is a couple thousand dollars a month to run this "abused"
infrastructure. If they were doing it as a thankless task for free, it would
make sense to be upset about it; but they're not. They're doing it for a cut
of every single .com purchase on the whole web.

------
kevin_b_er
Ironic that this is Verisign complaining, considering the very idea of
capturing nonexistent domains and replacing them with spam was a Verisign
concept from 2003 called Site Finder. All of .com and .net were wildcarded at
the TLD level.

ISPs just took it as a great idea. Now we have to build countermeasures
against it.

[https://en.wikipedia.org/wiki/Site_Finder](https://en.wikipedia.org/wiki/Site_Finder)

------
manigandham
So half of the traffic is Chrome requests, but how much _capacity_ is all of
the traffic actually using?

Are the root servers 90% idle or are they overloaded? That changes everything.
Also servers are fast, bandwidth is cheap, and DNS is lightweight. How is this
really a major resource issue?

------
lilSebastian
This topic has been posted several times in the last few days

~~~
jwilk
[https://news.ycombinator.com/item?id=24231857](https://news.ycombinator.com/item?id=24231857)
(4 days ago, 207 comments)

Anything else?

------
creeble
What does Firefox do to prevent the problem chromium is trying to solve?

~~~
stefan_
They turned on DNS over HTTPS by default and send your queries to Cloudflare:

[https://blog.mozilla.org/blog/2020/02/25/firefox-continues-push-to-bring-dns-over-https-by-default-for-us-users/](https://blog.mozilla.org/blog/2020/02/25/firefox-continues-push-to-bring-dns-over-https-by-default-for-us-users/)

~~~
stuartd
You’re seemingly pushing your own agenda here?

As the article itself says:

> Are other approaches feasible? For example, Firefox’s captive portal test
> uses delegated namespace probe queries, directing them away from the root
> servers towards the browser’s infrastructure.

~~~
stefan_
In your agenda to expose my agenda in linking a Firefox blog post, you have
missed that this isn't even about captive portal detection. Captive portals
are easily detected by trying to load a known response page; captive portals
don't fake that because they want to be detected so that the browser redirects
you to their portal.

This is about malicious DNS resolvers that don't return NXDOMAIN for non-
existent domains but instead send you to an ISP advertisement page. This
messes with the omnibox. For all other domains, they resolve just fine. These
resolvers are inclined to evade detection, e.g. if browsers checked a static
list of domains, they would just return NXDOMAIN for only those.
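
A minimal sketch of what such a probe looks like (not Chrome's actual code), using the system resolver:

    import random
    import socket
    import string

    # Resolve a random, almost-certainly-nonexistent name through the system
    # resolver; an honest resolver fails with NXDOMAIN, a hijacking one hands
    # back an IP pointing at its ad page.
    probe = "".join(random.choices(string.ascii_lowercase, k=12))
    try:
        addr = socket.gethostbyname(probe)
        print(f"{probe} resolved to {addr}: NXDOMAIN is being intercepted")
    except socket.gaierror:
        print(f"{probe} does not resolve: resolver looks honest")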

~~~
carlhjerpe
I can't even remember the last time I got an advertisement page. Must've been
about 10 years ago, though for the last 5 I've been using third-party DNS. I
don't know any Swedish ISP that does this. I'm using ISP DNS on my phone.

~~~
stefan_
I assume at some point the calculus flipped for ISPs: don't annoy customers
with low-quality ads and push them to cut you out, but instead operate a real
DNS resolver so you can still sell their real-time browsing history to
advertisers. Part of that was the Google and Cloudflare marketing campaigns
for their public resolvers, which gave people a ready alternative.

------
toast0
Is the root zone AXFR-able? Maybe recursive servers should just periodically
do that, and then they wouldn't need to forward these queries.

~~~
JdeBP
Yes, it is, and that's in fact one popular way to set up a private root
content DNS server for a resolving proxy DNS server to consult. I do it on my
machines.

* [https://github.com/jdebp/nosh/blob/9a113ec2c2e58679bab8d8b34...](https://github.com/jdebp/nosh/blob/9a113ec2c2e58679bab8d8b34413de4acab377f2/source/convert/tinydns.do#L77)
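
For anyone who wants to try it without that tooling, here's a minimal sketch, assuming the dnspython package and that ICANN's public zone-transfer host lax.xfr.dns.icann.org still permits AXFR of the root zone:

    import socket
    import dns.query
    import dns.zone

    # ICANN's public zone-transfer service for the root zone (assumption:
    # still open for AXFR).
    server = socket.gethostbyname("lax.xfr.dns.icann.org")
    root = dns.zone.from_xfr(dns.query.xfr(server, "."))
    print(len(root.nodes), "names in the root zone")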

------
Animats
The root servers may have to delay NXDOMAIN replies. Those are rare for legit
requests.

------
lilyball
I don’t understand why it has to randomly generate new domains each time.
Surely all that matters is that it’s testing for domains that are
statistically unlikely to exist, right? Why not seed the RNG with something
like the current date, such that the domains are still random, but all
instances of Chrome generate the same domains every day? That way the NXDOMAIN
results should be able to be cached by upstream resolvers, thus significantly
reducing the load on the root servers.
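
Something along these lines - a rough sketch of the idea, not anything Chrome actually does today:

    import hashlib
    from datetime import datetime, timezone

    def daily_probe_names(n=3, length=10):
        # Same names for every client on a given UTC day, so the NXDOMAIN
        # answers are cacheable by resolvers.
        day = datetime.now(timezone.utc).date().isoformat()
        names = []
        for i in range(n):
            digest = hashlib.sha256(f"{day}-{i}".encode()).hexdigest()
            # map hex digits onto letters so the label looks like a hostname
            names.append("".join(chr(ord("a") + int(c, 16) % 26) for c in digest[:length]))
        return names

    print(daily_probe_names())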

~~~
JoshTriplett
If they were in any way predictable, people would register them, or the broken
DNS servers would handle those domains properly while still giving broken
results for others.

~~~
1f60c
That shouldn't be a problem: I don't think it's possible to make qwajuixk
resolve to anything on the public internet. I bet they could hardcode a list
of domains, change it up every couple of releases, and that'd stop 80% of the
rogue sysadmins who are doing this.

~~~
theonemind
I've mostly seen DNS hijacking from large ISPs. Probably hardly any sysadmin
anywhere wants to implement DNS hijacking, but they have corporate overlords
who tell them to do it. The first big ISP would get it working pretty quickly
no matter what Google did; then other big companies would see it and tell
their sysadmins to figure it out, since the other company is already doing it.

I bet there's a decent amount of cash in DNS hijacking for big players, so you
should probably think of it like the hypothetical cryptography attacker. You
should assume they will know everything about your method but still can't beat
it. If you can tell them what you're doing and they can beat it, they will.
They wouldn't have done it in the first place without motivation.

~~~
SAI_Peregrinus
Send one request for such a random domain over DoH or DoT to some DNS server
you control for the purpose (Google can easily set such a thing up). Ensure
the response is NXDOMAIN. If it's not, generate a new random domain and retry.

Send a second request for the same domain via the system DNS. If it's not
NXDOMAIN, it's hijacking unknown DNS requests.

One request that might hit the root instead of 3.
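
Roughly, as a sketch - using Google's public DoH JSON endpoint as the trusted resolver (just one possible choice) and the OS resolver as the one under test:

    import json
    import random
    import socket
    import string
    import urllib.request

    probe = "".join(random.choices(string.ascii_lowercase, k=12))

    # Step 1: confirm over DoH that the random name really is NXDOMAIN (Status 3).
    with urllib.request.urlopen(f"https://dns.google/resolve?name={probe}&type=A") as resp:
        doh_status = json.load(resp)["Status"]

    if doh_status == 3:
        # Step 2: ask the system resolver for the same name; getting an answer
        # means it is hijacking unknown-domain lookups.
        try:
            socket.gethostbyname(probe)
            print("system resolver is intercepting NXDOMAIN")
        except socket.gaierror:
            print("system resolver returned NXDOMAIN as expected")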

------
cordite
I've been using google DNS, and cloudflare DNS (as of late) since I found out
about this ISP behavior.

It's unacceptable, it's unhelpful, and I see it as a vector for getting
untrustworthy software onto less savvy users' hardware.

------
cmroanirgo
Previous discussion (5 days prior)

[https://news.ycombinator.com/item?id=24231857](https://news.ycombinator.com/item?id=24231857)

------
t-writescode
What does this do to users of PiHole?

------
tobyhinloopen
Interesting problem

------
adrr
Just require HTTPS already.

~~~
godman_8
HTTPS doesn’t protect against DNS or SNI interception

~~~
adrr
If there’s no valid cert for the domain, you can’t 301/302 redirect over an
HTTPS connection. SNI just lets the middleman peek at the host name; it
doesn’t allow you to redirect. DNS interception lets the attackers change the
IP, but HTTPS still needs a valid cert for the host.

~~~
stordoff
Incidentally, this affects how my ISP implements its domain blocks. Non-HTTPS
sites are 307 redirected to [https://assets.virginmedia.com/site-blocked.html](https://assets.virginmedia.com/site-blocked.html), whereas
connections to HTTPS sites are just halted (IIRC, it sends a TCP reset, but
it's a while since I looked into it).

~~~
amiga-workbench
That reminds me, I turned this off the other day and it still hasn't taken
effect yet.
[https://my.virginmedia.com/advancederrorsearch/](https://my.virginmedia.com/advancederrorsearch/)

It's a good thing DoH is now in the stable version of RouterOS; I've just
switched it on.

~~~
JdeBP
You don't need DoH to avoid that. Just set up and use your own resolving proxy
DNS servers and don't use the Virgin Media ones at 194.168.4.100 and
194.168.8.100 either directly or indirectly. (This means not forwarding to
those servers, and not letting your DHCP client, in any of your machines,
configure your system to use them when it gets a lease.)

Here is the difference between using Virgin Media's resolving proxy DNS
servers and using your own:

    
    
        % # http://jdebp.uk./Softwares/djbwares/guide/commands/dnsqrx.xml
        %
        % dnsqrx a z398nvfsd0098u3qwtltnk. 194.168.4.100
        1 z398nvfsd0098u3qwtltnk:
        56 bytes, 1+1+0+0 records, response, noerror
        query: 1 z398nvfsd0098u3qwtltnk
        answer: z398nvfsd0098u3qwtltnk 0 A 92.242.132.24
        %
        %
        % dnsqr a z398nvfsd0098u3qwtltnk.
        1 z398nvfsd0098u3qwtltnk:
        40 bytes, 1+0+0+0 records, response, authoritative, nxdomain
        query: 1 z398nvfsd0098u3qwtltnk
        %
    

That last query didn't even escape the machine in my case. I run a private
root content DNS server on every machine if possible. The query here got
answered by a tinydns instance listening on 127.53.0.1:

    
    
        % tail -n 1 /var/log/sv/tinydns/current | tai64nlocal
        2020-08-26 09:37:44.305619726 7f000001:cfd9:f163 + 0001 z398nvfsd0098u3qwtltnk
        %
    

No-one gets to complain about (minuscule) extra load on the root content DNS
server except me. It's my server on my machine. (-:

~~~
amiga-workbench
I've just realised my mistake, I needed to turn "use-peer-dns" off.

I did have my router set up to cache DNS requests to Google and had updated
the DHCP config so everything would use my router for resolution, but Virgin's
DNS servers were still being pulled through as a "dynamic server" according to
the RouterOS panel.

I thought Virgin was fiddling with my DNS requests even though I was sending
them to a different provider.

