

Chromium DNS performance - afimrishi
https://plus.google.com/103382935642834907366/posts/FKot8mghkok

======
spaznode
I hate articles I can't read on my mobile iOS device due to endless redirects.
Esp when they sound like interesting articles. Wtf google + team?

~~~
projct
It seems to work fine here (iPhone 4, iOS 5.1), and I've not had a problem
with g+ on 4.3.x or 5.0.[01], AT&T and various Wi-Fi networks...

Is it really endlessly redirecting, or perhaps timing out and trying to
reload? I had that happen once when latency was extreme.

EDIT: here is the text for those who are having trouble.

Host resolution accounts for a significant portion of the networking component
of page load time. As such, Chromium developers are always looking to optimize
it. Here I present the current considerations of host resolution in Chromium,
and look forward to the IPv4+IPv6 dual stack world and what that entails for
browsers (and what Chromium is doing about it). I finish off with a
presentation of some of the latest data we're operating on.

Currently, Chromium uses getaddrinfo() to ask the OS to resolve a host. This
is a cross-platform, blocking API that abstracts the complicated host
resolution (a minimal call is sketched after this list). There are a number
of advantages to using this API:

* Correctness - it handles all the complicated rules of hostname lookup
  correctly. It understands /etc/hosts, non-DNS namespaces like NetBIOS/WINS,
  etc. Re-implementing this behavior would be difficult.
* OS caching - we get to share the OS host cache with other applications.
  Note that this doesn't exist by default on Linux systems.
* Less code - having to maintain code sucks. It leads to code/binary bloat
  and endless bugs for corner cases and OS-specific issues. And it takes
  engineering time.
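For concreteness, here's roughly what that call looks like: a minimal sketch
using the POSIX headers, not Chromium's actual code:

    // Minimal sketch of a blocking getaddrinfo() lookup (POSIX headers;
    // Windows uses <ws2tcpip.h>). One call hides /etc/hosts, DNS,
    // NetBIOS/WINS, and the OS cache behind a single interface.
    #include <netdb.h>
    #include <sys/socket.h>
    #include <cstdio>

    int main() {
      struct addrinfo hints = {};
      hints.ai_family = AF_UNSPEC;     // ask for both IPv4 and IPv6
      hints.ai_socktype = SOCK_STREAM;

      struct addrinfo* results = nullptr;
      int rv = getaddrinfo("www.example.com", "80", &hints, &results);
      if (rv != 0) {
        std::fprintf(stderr, "resolve failed: %s\n", gai_strerror(rv));
        return 1;
      }
      for (struct addrinfo* ai = results; ai != nullptr; ai = ai->ai_next) {
        // ai->ai_addr is a sockaddr ready to hand to connect().
        std::printf("got an address, family=%d\n", ai->ai_family);
      }
      freeaddrinfo(results);
      return 0;
    }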

There are many disadvantages to using this API:

* Blocking - we need to use unjoined worker threads so we don't block
  critical threads (a toy version of this dispatch is sketched after this
  list).
* Performance - we can't optimize the host resolution process, since it's
  hidden behind the getaddrinfo() call. There are lots of optimization
  opportunities we miss.
* Application caching - we can't tell how long to cache a DNS record for in
  the application, since we don't get TTLs from getaddrinfo(). We try to be
  safe by only caching for a minute.
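To illustrate the blocking point: since getaddrinfo() can stall for seconds,
the call has to be shipped off to a worker thread that nobody waits on. A toy
sketch of the idea, with a hypothetical callback API rather than Chromium's
actual HostResolver:

    // Sketch: run blocking getaddrinfo() on a detached ("unjoined")
    // thread so a slow resolution can't hang the critical threads.
    // ResolveInBackground is a hypothetical helper, not Chromium code.
    #include <netdb.h>
    #include <functional>
    #include <string>
    #include <thread>

    void ResolveInBackground(std::string host,
                             std::function<void(int)> on_done) {
      std::thread([host = std::move(host), on_done = std::move(on_done)] {
        struct addrinfo hints = {};
        hints.ai_family = AF_UNSPEC;
        struct addrinfo* results = nullptr;
        int rv = getaddrinfo(host.c_str(), nullptr, &hints, &results);
        if (rv == 0) freeaddrinfo(results);
        on_done(rv);  // real code would marshal this back to the IO thread
      }).detach();    // unjoined: if the call hangs, nothing else blocks
    }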

As noted previously, host resolution is a substantial portion of networking
time, so we try to start it as soon as possible. Our network predictor
subsystem learns to predict network requests and may initiate host resolution
prefetches. While this provides substantial performance benefits, it can also
lead to a horrible user experience when it overloads upstream
resolvers/devices. For example, check out
<https://code.google.com/p/chromium/issues/detail?id=3041> and
<https://code.google.com/p/chromium/issues/detail?id=12754>, where users
report "internet loss" that seems correlated with DNS prefetching. The problem
is that Chromium is issuing too many DNS queries in a short period of time,
which may overload upstream devices (typically cheap NAT routers), which then
enter an anti-DoS mode and ignore DNS queries for a period of time. Because
getaddrinfo() failure timeouts are excessively high (and vary by platform),
this results in Chromium appearing to "hang" while displaying "Resolving
host…" in the status bar.

Our first step to combat this was setting limits on the number of outstanding
getaddrinfo() calls. Currently, the limit is 8. However, it's still possible
to overload upstream devices. Worse, when that happens, we now have this
limit of 8 jobs, so even if we cancel the navigations, wait a short while for
the NAT router to exit its anti-DoS mode, and then try to browse to other
pages, we can't issue any more host resolutions until one of the 8
outstanding getaddrinfo() calls times out. We identified this problem in
<http://code.google.com/p/chromium/issues/detail?id=73327> and implemented
backoff behavior to recover from this situation (the limit-plus-backoff
accounting is sketched below). The important thing to note, though, is that
existing network devices somewhat limit how aggressively we can do DNS
lookups for performance reasons.
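A rough sketch of that limit-plus-backoff accounting; the 3-timeouts trigger
and 5s backoff are invented for illustration, not Chromium's actual values:

    // Sketch: cap in-flight lookups at 8 and back off when the upstream
    // resolver looks wedged. Trigger and backoff durations are made up.
    #include <chrono>

    class ResolverThrottle {
     public:
      bool CanStartJob() const {
        return in_flight_ < kMaxInFlight &&
               std::chrono::steady_clock::now() >= backoff_until_;
      }
      void OnJobStarted() { ++in_flight_; }
      void OnJobFinished(bool timed_out) {
        --in_flight_;
        if (!timed_out) {
          consecutive_timeouts_ = 0;
        } else if (++consecutive_timeouts_ >= 3) {
          // Probably a NAT router in anti-DoS mode: stop hammering it
          // so it can recover, instead of waiting on 8 stuck calls.
          backoff_until_ = std::chrono::steady_clock::now() +
                           std::chrono::seconds(5);
          consecutive_timeouts_ = 0;
        }
      }

     private:
      static constexpr int kMaxInFlight = 8;
      int in_flight_ = 0;
      int consecutive_timeouts_ = 0;
      std::chrono::steady_clock::time_point backoff_until_{};
    };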

Prior to last year's World IPv6 Day, we implemented Happy Eyeballs in Chromium
(<http://codereview.chromium.org/7029049>) to mitigate the problem of broken
dual stack implementations on the web. In particular, when a hostname has both
IPv6 and IPv4 addresses and the first address is IPv6, we connect() to the
IPv6 address, but start a 300ms timer so we can fall back quickly to a
connect() to IPv4. We use a fast fallback system rather than simply racing
both IPv6 and IPv4 in order to give an edge to IPv6, to compensate for the
likelihood that the IPv6 path is initially not as fast as IPv4, since we want
to encourage people to switch to IPv6. Note that this only addresses the case
where the actual TCP connect() to an IPv6 address fails because the IPv6
pathway is broken (be it at an intermediary or the origin server).

The getaddrinfo() call itself is often slower
(<http://www.belshe.com/2011/06/15/ipv6-dns-lookup-times/>) in the dual stack
IPv4+IPv6 world due to serialization of the DNS queries
(<http://www.potaroo.net/ispcol/2011-12/esotropia.html>). Usually the OS's
getaddrinfo() implementation will issue the AAAA query first, and only once it
completes, issue the A query. OS X Lion (when using CFSocketStream) changes
this: it issues the A and AAAA queries in parallel, which is generally
superior in terms of performance. However, remember that this doubles the
number of DNS queries issued in a period of time, so we are more likely to
hit the upstream device DNS overload situation again, since the Chromium host
resolution limit is per getaddrinfo() call and not per DNS query (that's an
OS-specific implementation detail) and we don't pace. Also, it strictly
prefers whatever is fastest, so it doesn't help incentivize IPv6 adoption.
Note that Chromium will disable dual stack support if it is unnecessary (no
IPv6 interfaces). But if you have both IPv4 and IPv6 interfaces, your slower
getaddrinfo() calls are probably slowing down your browsing experience.
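For those curious what parallel issuance looks like below getaddrinfo(),
here's a sketch using libresolv's res_query() (link with -lresolv; a real
stub resolver would speak the DNS wire format over its own socket):

    // Sketch: issue the A and AAAA queries in parallel rather than
    // serially. res_nquery() is the strictly thread-safe variant; this
    // is just to show the shape of the optimization.
    #include <netinet/in.h>
    #include <arpa/nameser.h>
    #include <resolv.h>
    #include <cstdio>
    #include <thread>

    void Query(const char* host, int type, const char* label) {
      unsigned char answer[NS_PACKETSZ];
      int len = res_query(host, ns_c_in, type, answer, sizeof(answer));
      std::printf("%s query %s (%d bytes)\n", label,
                  len < 0 ? "failed" : "succeeded", len);
    }

    int main() {
      res_init();
      const char* host = "www.example.com";
      // Serial would be: Query(AAAA), wait, then Query(A). Parallel:
      std::thread aaaa(Query, host, ns_t_aaaa, "AAAA");
      std::thread a(Query, host, ns_t_a, "A");
      aaaa.join();
      a.join();
      return 0;
    }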

So what is Chromium doing about the various issues? After a long time of being
unwilling to sink our time into it, we're going to implement our own DNS stub
resolver. It sucks that we have to write this code and get it to work on all
platforms and lose OS caching, but the browser is an increasingly important
application, almost its own OS (especially with ChromiumOS), so it makes sense
to do it. We've got an experimental DNS stub resolver implemented and are
working on implementing features and flushing out bugs with it right now
(--enable-async-dns).

Once we have a functionally correct DNS stub resolver, we can begin playing
with optimizations. One of the obvious things we'll look at is parallelizing A
and AAAA queries. Also, this obviously gives us the option of beginning a TCP
connect() once we have a DNS response, be it A or AAAA. There are some
subtleties here, because there's the performance vs IPv6 incentive tradeoff.
On the one hand, the most performant choice would be to do a TCP connect()
immediately upon receipt of the A or AAAA response. On the other hand, we want
to incentivize folks to adopt IPv6, so even if the IPv4 pathway may be faster,
we debatably ought to give IPv6 a slight handicap (a timer) here to help with
adoption. When striking this balance, we also need to be careful not to issue
too many DNS queries at the same time and overload cheap NAT routers. Clearly
we'll have to do some experimentation and watch our bug tracker to make sure
things work reasonably. We should switch our job accounting to a per-DNS-query
basis, rather than per hostname, to better account for parallel A/AAAA
queries. Also, now that we can detect these NAT router overload situations,
we should replace the fixed concurrency limit of 8 queries with a dynamically
determined one, so users aren't limited by engineering for the
lowest-common-denominator hardware.
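One way the performance-vs-IPv6 tradeoff could look in code; a sketch with an
invented grace period, purely illustrative:

    // Sketch: connect as soon as either DNS answer arrives, but if the
    // A record wins the race while the AAAA query is still pending,
    // hold off briefly to give IPv6 a head start. The 50ms handicap is
    // a made-up number, not a Chromium value.
    #include <chrono>
    #include <cstdio>
    #include <thread>

    constexpr auto kV6Handicap = std::chrono::milliseconds(50);

    enum class Family { kIPv4, kIPv6 };

    void OnFirstDnsAnswer(Family family, bool aaaa_still_pending) {
      if (family == Family::kIPv6) {
        std::printf("AAAA arrived first: connect() over IPv6 now\n");
      } else if (aaaa_still_pending) {
        // Grace period: a timer in real code, a sleep in this toy.
        std::this_thread::sleep_for(kV6Handicap);
        std::printf("grace period over: connect() over IPv4\n");
      } else {
        std::printf("IPv4-only host: connect() immediately\n");
      }
    }

    int main() {
      OnFirstDnsAnswer(Family::kIPv4, /*aaaa_still_pending=*/true);
      return 0;
    }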

There are a bunch of other ideas we're kicking around, although we're
undecided if/when we want to pursue them. We obviously want to experiment with
revisiting DNS retransmission timers, failovers to different servers / name
suffixes, etc. Also, we'd like to experiment with connecting to different IPs
in the DNS response(s), in order to figure out which ones are faster (lower
RTTs), so we can prefer connections to those IP addresses. This may cause some
compatibility issues, so we need to be careful, but oftentimes RTT may vary
greatly among IPs in the DNS response(s), especially between A and AAAA
records. Also, now that we have TTLs, if we find cached DNS entries that are
about to expire, but are likely to be used again, then we can reissue the DNS
query to prevent expiration.
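The TTL-refresh idea might look something like this; the 10-second threshold
and hit counter are invented for illustration:

    // Sketch: if a cached DNS record is close to expiry but has been
    // used recently, reissue the query before it lapses so the next
    // navigation never waits on resolution. Thresholds are made up.
    #include <chrono>
    #include <cstdio>

    using Clock = std::chrono::steady_clock;

    struct DnsCacheEntry {
      Clock::time_point expires;  // derived from the record's TTL
      int recent_hits;            // crude "likely to be used again" signal
    };

    bool ShouldRefresh(const DnsCacheEntry& e, Clock::time_point now) {
      return e.recent_hits > 0 &&
             e.expires - now < std::chrono::seconds(10);
    }

    int main() {
      DnsCacheEntry e{Clock::now() + std::chrono::seconds(5), 3};
      std::printf("reissue query early? %s\n",
                  ShouldRefresh(e, Clock::now()) ? "yes" : "no");
      return 0;
    }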

OK, now that we've explained some of the tradeoffs we're looking at, we can
delve into some numbers to assist in our understanding of how important these
issues are. First off, how long does a getaddrinfo() call take? Well, that's
complicated. Chromium issues a number of speculative host resolutions for non-
existent hosts in order to detect domain hijacking. We also speculatively
resolve tokens typed into the omnibox in case they're real hosts. Also, there
will be a number of considerations with platform and IPv4/IPv6. Note that all
data I present here are based on Chrome 17 on a single day in March (sorry, we
have some internal issues with our metrics analysis dashboard, otherwise I'd
get a longer sample). So take it with many grains of salt (especially Linux,
which has a far smaller population), but it's still hopefully largely
relevant. I'll update again later with way more samples.

getaddrinfo() invocations for non-speculative requests, with successful
resolution:

* Win: mean - 644ms. 10% in <= 1ms, 25% in <= 12ms, 50% in <= 43ms, 75% in
  <= 119ms, 90% in <= 372ms. Note, there's an upward blip of 1.45% of samples
  completing in around 1s (95.90 percentile), due to the Windows DNS
  retransmission timer.
* Mac: mean - 230ms. 10% in <= 0ms, 25% in <= 5ms, 50% in <= 28ms, 75% in
  <= 67ms, 90% in <= 279ms. Note, there's an upward blip of 2.11% of samples
  completing in around 300ms (91.51 percentile), and another of 1.07% at 1s
  (97.36 percentile), indicating retransmission timers around these intervals.
* Linux: mean - 293ms. 10% in <= 2ms, 25% in <= 12ms, 50% in <= 37ms, 75% in
  <= 89ms, 90% in <= 279ms. Note, there's an upward blip of 1.81% of samples
  completing in around 4250-4900ms (99.26 percentile).

I'm quite disappointed in Linux's retransmission timer here, as a 4s hang in
navigation on 1.81% of URL fetches is a pretty horrible user experience if you
ask me. Dude, almost 3/4 of the remaining samples in the distribution are
covered by the retransmission, retransmit earlier already! Also, it's not
immediately clear from the percentiles I presented, but Windows and Mac beat
Linux on low latency responses, most likely due to DNS caching being off by
default in Linux distributions. Remember that this sample of getaddrinfo()
calls ignores failed resolutions, and only covers calls that either weren't
predicted by our network predictor or whose DNS prefetch did not complete
early enough to fully hide the DNS latency.

getaddrinfo() invocations with successful resolution, AF_INET (IPv4 only):

* Win: mean - 443ms. 10% in <= 0ms, 25% in <= 5ms, 50% in <= 37ms, 75% in
  <= 103ms, 90% in <= 322ms. 1.36% of samples completing around 1s (96.65
  percentile).
* Mac: mean - 181ms. 10% in <= 0ms, 25% in <= 3ms, 50% in <= 24ms, 75% in
  <= 58ms, 90% in <= 182ms. 0.89% of samples completing around 1s (97.97
  percentile).
* Linux: mean - 243ms. 10% in <= 2ms, 25% in <= 12ms, 50% in <= 32ms, 75% in
  <= 89ms, 90% in <= 242ms. 1.50% of samples completing around 4250-4900ms
  (99.47 percentile).

getaddrinfo() invocations with successful resolution, AF_UNSPEC (IPv4+IPv6*):

* Win: mean - 363ms. 10% in <= 1ms, 25% in <= 8ms, 50% in <= 37ms, 75% in
  <= 103ms, 90% in <= 279ms. 1.06% of samples completing around 1s (96.97
  percentile).
* Mac: mean - 266ms. 10% in <= 0ms, 25% in <= 6ms, 50% in <= 37ms, 75% in
  <= 279ms, 90% in <= 372ms. 9.77!!!!!!!% of samples completing around 300ms
  (83.79 percentile), 5.74!!!!!!% within 322-372ms (89.53 percentile).
* Linux: mean - 359ms. 10% in <= 1ms, 25% in <= 12ms, 50% in <= 50ms, 75% in
  <= 119ms, 90% in <= 322ms. 1.31% of samples completing around 4250-4900ms
  (98.79 percentile).

AF_INET:AF_UNSPEC ratios (W/M/L): 1.09/7.05/26.6

There's a lot of stuff going on here. First, we have to study how Chromium
chooses whether to use AF_INET or AF_UNSPEC. For details, please refer to
<http://code.google.com/p/chromium/source/search?q=IPv6Supported&origq=IPv6Supported&btnG=Search+Trunk>.
Basically, we run a series of basic local tests (creating sockets,
availability of an IPv6-enabled interface, etc) to see if we can support IPv6.
It's pretty fascinating to see that, despite our efforts on Windows to
identify this, basically 50% of the population still seems to support IPv6,
while the proportion of Mac and Linux users is way, way lower. The local tests
are designed to be a bit conservative, so I think it's more likely that the
Windows IPv6 probing is overly conservative. So there's a lot more conflation
in the AF_INET vs AF_UNSPEC results for Windows. Indeed, it's unclear to me
whether Chromium's dual stack implementation is slower at all on Windows,
although I suspect that's mostly due to our insufficient IPv6 support probing.
On Mac and Linux, dual stack support is markedly slower than IPv4-only
support. It's quite incredible seeing the ~300ms retransmission timer on Mac
having such a HUGE effect. On dual stack Mac, this retransmission seems to
help 10-15% of the time!!! TODO(willchan): run manual tests on Mac versions
to study the retransmission…is the OS doing the AAAA query first and then
falling back to A in 300ms?

Roughly speaking, the times of resolution successes exhibit a gaussian
distribution, other than at the low end (due to caching in the DNS hierarchy)
and at the retransmission timeouts. Resolution failures, on the other hand,
are distinctly not gaussian. On Windows, we see 27.66% finish within 3ms,
slowly decreasing through 9ms (35.74 percentile), then a spike between
10-24ms (49.80 percentile), another spike around 300ms (58.8 percentile), a
huge spike in the 1798-2394ms buckets (24% in these buckets, at the 84.5
percentile), and one last spike between 10048-13386ms (8.6% in these buckets,
98.76 percentile). Mac and Linux distributions differ greatly from the
Windows distribution, to a degree that I don't believe can be explained by
differences in population; it is most likely due to OS differences. Their
distributions start very high in the low-latency buckets, decreasing into
some very long tails, with much more muted spikes at the retransmission
timers. On Mac, 70% of resolution failures complete in <=2ms. There's a small
(1%) spike in the 762-1013ms bucket (93.4 percentile), a larger (2.63%) spike
in the ~27425ms bucket (98.6 percentile), and a last 1% spike at the ~56188ms
bucket (99.87 percentile). Linux is basically the same as Mac here, although
a lower percentile (61.6) finishes in <=2ms and the failure/retransmission
timeouts differ. #chromium #dns #ipv6

~~~
spaznode
Try logging out and then reading. I'm on iPhone 4 / iOS 5.1. Endless redirects
for me, all the way down..

------
icebraining
It's unbelievable how kitchen-sink distros like Ubuntu and their derivatives
don't install dnsmasq by default.

------
JBiserkov
>But if you have both IPv4 and IPv6 interfaces, your slower getaddrinfo()
calls are probably slowing down your browsing experience.

 _Disables IPv6 interfaces._

-IPv6 adoption is too low!!

------
nikcub
Interesting topic, but a great example of why Google+ makes a terrible
blogging platform - the formatting and link-wrapping make it impossible to
read in places

------
davidu
So much smoke. I like how they mask a crystal-clear strategy with
barely-relevant DNS statistics and data in an attempt to obscure what's
happening.

My prediction from 2 years ago, and again a couple weeks ago, rings true now:
<http://www.forbes.com/sites/eliseackerman/2012/02/25/a-closer-look-at-google-public-dns/>

 _"I’ll reiterate my view that I think Google controlling search, the browser,
and the network or DNS layer is a dangerous trifecta that the consumer will
probably be best served avoiding. I’m sure we’ll find out soon enough."_

~~~
simoncion
It seems that you don't know what a stub resolver is.

A stub resolver is a piece of software which uses the DNS (and other) servers
listed in the system configuration to resolve names to addresses. The stub
resolver may use a variety of strategies when deciding which of the configured
servers to use, how to time the queries, and what to cache; but stub resolvers
respect the system admin's wishes WRT what name resolution servers to talk to.

See: <https://tools.ietf.org/html/rfc1123#page-74> and:
<https://tools.ietf.org/html/rfc4033#section-7>
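To make the distinction concrete, here's a minimal sketch of what "uses the
servers listed in the system configuration" means, assuming a
resolv.conf-style file (the POSIX convention); parsing only, no DNS wire
format:

    // Sketch: a stub resolver recurses nowhere; it reads the system
    // configuration and forwards queries to the server named there.
    #include <cstdio>
    #include <fstream>
    #include <sstream>
    #include <string>

    std::string FirstConfiguredNameserver(const std::string& path) {
      std::ifstream conf(path);
      std::string line;
      while (std::getline(conf, line)) {
        std::istringstream iss(line);
        std::string key, value;
        if (iss >> key >> value && key == "nameserver") return value;
      }
      return "";
    }

    int main() {
      // A stub resolver sends its DNS queries to this address, thereby
      // respecting the admin's configuration.
      std::printf("forward queries to: %s\n",
                  FirstConfiguredNameserver("/etc/resolv.conf").c_str());
      return 0;
    }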

~~~
davidu
There is no connection between a stub resolver as Chrome is implementing it,
and the system-indicated DNS server.

This, as an aside, is what's wrong with HN. People cite RFCs or other data
points as if they validate their point, when in fact, they do no such thing.
You can run 10 stub resolvers on your system, each talking to a different DNS
server. A stub simply indicates that it lacks the fortitude for full-blown
root-down DNS resolution and validation. It might even still have a cache of
sorts. That's it.

Why my post got marked down, I'll never know.

~~~
simoncion
Firstly, I'm _not_ making a point; I'm giving you information that your
vague, hand-wavy rant suggests you don't have. Secondly, I point to the RFCs
to indicate that my description of stub resolvers isn't just my opinion, but
is somewhat in line with the community definition.

I never said that one could not have multiple stub resolvers installed on a
single system. I also never intended to imply this. There's no reason to
believe that the Chromium stub resolver will do anything but intelligent
caching and fetching. Certainly, there's nothing in the linked Google Plus
post to make me believe otherwise. Mr. Chan mentions that writing a browser-
specific stub resolver is really the wrong thing to do, but it's something
that they really need to do to get their target DNS query performance and to
detect and work around certain kinds of really shitty breakage.

Why did you get downvoted? Perhaps because you said

 _I like how they mask very crystal-clear strategy..._

when the strategy is anything but clear to anyone but you. By way of
explanation, you point to a multi-page article about you and OpenDNS in which
-among many other things- you say "[Google has] a separate privacy policy for
Google DNS, and I’m sure they are hypersensitive about privacy concerns, so I
wouldn’t be too paranoid [about the possibility of DNS query logging and data
mining].". I'm not sure what strategy it is that you're worked up about, but
it would be really nice if you'd come out and say it, rather than being all
oblique and mysterious.

Additionally, it would seem that you're the CEO or President or something of
OpenDNS? It's exceedingly poor form to say vague, FUDdy-smelling things about
companies that compete with your core business. If you're going to say
something, man up and say it; don't make others feel around to maybe discover
a hint of what your point was.

