
DNS Cookies – Identify Related Network Flows - codezero
http://dnscookie.com
======
codezero
Someone linked this on a great thread about how dns can leak info.

[https://news.ycombinator.com/item?id=19828769](https://news.ycombinator.com/item?id=19828769)

The parent thread is really interesting too.

[https://news.ycombinator.com/item?id=19828702](https://news.ycombinator.com/item?id=19828702)

------
DanielDent
It's fun when you check the front page of HN and see your work :).

~~~
codezero
I'm bummed it didn't stay there longer, not just for my own karma :)

------
ble52
OK, now please tell me how do I block it?

~~~
DanielDent
- Never allow any part of the computing systems you use to cache anything.

- Insist that everything in your life exist in a state of being functionally
pure & stateless.

- Eliminate access to all sources of timing data.

- Make sure that all tasks are completed in a pre-determined fixed amount of
time regardless of resource contention.

There are so many different side channel attacks, and the computing primitives
& API choices we have been making for years make it challenging to build
secure systems.

Caches are very deeply embedded in the culture of how computing is done.
Making tasks take longer than strictly necessary to avoid leaking information
goes against our instincts to optimize system performance.

It's going to take a lot of work and cost a lot of money to get software to a
point where we aren't playing whack-a-mole with side channels.

More pragmatically, the current implementation of this technique can be dealt
with by being very conscious of how much data your DNS resolver(s) are
leaking, and of how large the anonymity set of your resolver's userbase is.

If you limit DNS cache times and use blinding computation techniques to limit
the identity information your DNS resolver has or retains about you, then DNS
cookies can be largely mitigated. If you have faith that 1.1.1.1 is operated
in the manner that Cloudflare claims, the measures they have taken go a long
way to making DNS cookies unusable.
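
A toy illustration of the "limit DNS cache times" idea (Python; `MAX_TTL`, the cache shape, and all names are assumptions for the sketch, not how any particular resolver works): capping every TTL bounds how long a planted answer can persist in the cache between visits.

```python
import time

MAX_TTL = 60  # seconds; an illustrative cap, not a recommended value

class TTLCappingCache:
    """Toy DNS cache that enforces min(record TTL, MAX_TTL), so an
    answer planted for tracking cannot linger in the cache across
    long gaps between visits."""

    def __init__(self, now=time.monotonic):
        self.now = now        # injectable clock, handy for testing
        self.entries = {}     # name -> (answer, absolute expiry time)

    def put(self, name, answer, ttl):
        self.entries[name] = (answer, self.now() + min(ttl, MAX_TTL))

    def get(self, name):
        entry = self.entries.get(name)
        if entry is None or self.now() > entry[1]:
            return None       # missing or expired: forces a fresh lookup
        return entry[0]
```

A real resolver would also need to re-resolve on expiry, but the cap alone already limits how far apart two visits can be and still be linked through cache state.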

I also pointed out some additional specific mitigations when I reported this
issue to the Chromium team in October 2015:

[https://bugs.chromium.org/p/chromium/issues/detail?id=546733](https://bugs.chromium.org/p/chromium/issues/detail?id=546733)

~~~
feanaro
What if we designed the resolver to fetch many responses with caching
disabled and then cache _all_ of them? In essence, force it to give you as
many cookies as your desired anonymity set size, and then sample this local
store of cookies when calculating the response for the end client.
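
A minimal sketch of that idea (Python; `upstream_query` is a hypothetical stand-in for a real no-cache DNS lookup, and `pool_size` plays the role of the desired anonymity-set size):

```python
import random

class SamplingResolverCache:
    """Instead of caching a single answer per name, fetch a pool of
    answers up front (ignoring cache hints) and serve a random member
    of the pool to each client query."""

    def __init__(self, upstream_query, pool_size=16):
        self.upstream_query = upstream_query
        self.pool_size = pool_size
        self.pools = {}  # name -> list of cached answers

    def resolve(self, name):
        pool = self.pools.get(name)
        if pool is None:
            # Fetch many independent answers so a tracking server cannot
            # tell which single answer "stuck" in the cache.
            pool = [self.upstream_query(name) for _ in range(self.pool_size)]
            self.pools[name] = pool
        return random.choice(pool)
```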

~~~
DanielDent
This would make it harder to build a fingerprint, especially if responses were
sampled from a number of independent sources.

The next logical step in the arms race would likely involve fingerprinting
systems using more bits than strictly necessary, and using error correcting
codes - i.e. treat the sampling as "noise" to be overcome.
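
To make that arms-race step concrete, here is a toy sketch (Python; all names and parameters are illustrative) of a tracker using the simplest error-correcting code, a repetition code: each ID bit is stored in several tracker-controlled answers, and a majority vote recovers it despite the resolver's sampling.

```python
import random

def encode_repetition(bits, r=5):
    """Store each ID bit in r tracker-controlled answers. A repetition
    code is the weakest ECC; a real tracker could use something stronger."""
    return [b for b in bits for _ in range(r)]

def decode_majority(noisy, r=5):
    """Recover each bit by majority vote over its r copies."""
    return [1 if sum(noisy[i:i + r]) > r // 2 else 0
            for i in range(0, len(noisy), r)]

def channel(encoded, p=0.2, seed=0):
    """Model the resolver's sampling as a noisy channel: each stored
    copy is independently replaced by a random answer with probability p."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) if rng.random() < p else b for b in encoded]
```

With enough redundancy the tracker wins this game, which is the point being made: sampling raises the tracker's cost but does not close the channel.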

It seems both more straightforward and more effective to build recursion paths
that you can trust aren't doing any intentional or unintentional caching.

This of course means the performance benefits of caching go away. That has
been a recurring theme in computing lately (e.g. CPU speculative-execution
leaks such as Meltdown).

A recursor could be built which only uses each query response once, with
prefetching used to reduce the performance impact.

However, the mere fact prefetched responses exist would also leak data.

~~~
feanaro
> It seems both more straightforward and more effective to build recursion
> paths that you can trust aren't doing any intentional or unintentional
> caching.

I agree, but as you say, that will take quite some work and time to happen and
will be costly. I was thinking of this as a possible temporary mitigation
which would retain some benefits of caching. If it was made adaptive[1], it
would also have the nice side-effect of being more resource intensive for
those servers that attempt to use tracking.

[1] i.e. only fetch many responses if they appear to vary while doing a
smaller number of "probing" requests. Continue fetching more responses for
your local sample until they stop varying with some degree of confidence.
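
One way the adaptive scheme in [1] could be sketched (Python; `query` is a hypothetical no-cache DNS lookup, and "stop varying with some degree of confidence" is approximated crudely by waiting for several consecutive already-seen answers):

```python
def adaptive_fetch(query, probes=3, max_fetches=32):
    """Probe a name a few times; if the answers all agree, treat it as an
    ordinary record. If they vary (possible per-user tracking), keep
    fetching until `probes` consecutive answers repeat ones already seen,
    building a pool to sample from locally."""
    seen = set()
    answers = [query() for _ in range(probes)]
    seen.update(answers)
    if len(seen) == 1:
        return answers  # stable answer: cache and serve as usual
    repeats = 0
    while repeats < probes and len(answers) < max_fetches:
        a = query()
        repeats = repeats + 1 if a in seen else 0
        seen.add(a)
        answers.append(a)
    return answers  # varying answers: a pool to sample responses from
```

Note how this makes tracking more expensive for the server: a name that varies triggers many extra upstream fetches, while an honest static record costs only the initial probes.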

~~~
DanielDent
It would be difficult to differentiate between responses that vary due to load
balancing and responses that vary due to active fingerprinting.

Even when a site only has a single physical location, load balancing might be
done in part by having DNS randomly return one of many valid IP addresses.
E.g. this is a behaviour supported by Amazon's Route53.

Larger sites frequently use a combination of anycast and DNS based routing to
get packets to the closest POP. This introduces both (1) difficulty
identifying when fingerprinting is occurring, and (2) still more opportunities
for fingerprinting.

Most users will find it impossible to control which POP their packets get
routed towards. For someone doing fingerprinting, it could be a very useful
signal.

~~~
feanaro
Yes, but variation due to such load balancing would surely be limited in
entropy in practical non-tracking scenarios?

Approaching from the other end, it points towards anycast itself (and similar
techniques) being incompatible with hard tracking resistance.

I'm glad to see that Firefox containers already mitigate this by using a
separate DNS cache for each container.

------
phicoh
In general, publicly visible DNS cookies are set by a DNS recursive resolver.
Typically, multiple IP addresses share one recursive resolver. So it seems to
me that a DNS cookie has strictly less information than an IP address.

------
sairamkunala
If this fingerprint persists across, say, an HTTP proxy (or a VPN/Tor
connection) and a regular network, it could be a way to track users,
especially for ad networks.

------
SimeVidas
ELI5 description?

~~~
bluejekyll
The abstract from RFC 7873:

—— DNS Cookies are a lightweight DNS transaction security mechanism that
provides limited protection to DNS servers and clients against a variety of
increasingly common denial-of-service and amplification/forgery or cache
poisoning attacks by off-path attackers. DNS Cookies are tolerant of NAT,
NAT-PT (Network Address Translation - Protocol Translation), and anycast and
can be incrementally deployed. (Since DNS Cookies are only returned to the IP
address from which they were originally received, they cannot be used to
generally track Internet users.) ——

At the end of the day, the data can really be any 8-byte value for the
client part and up to 32 bytes for the server section, which you could
technically use to store anything you want (or the upstream resolver could).
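
For reference, the RFC 7873 cookie travels in an EDNS(0) OPT record as option code 10. A minimal sketch of packing just that option's bytes (Python; this is only the option body, not a full DNS message):

```python
import struct

EDNS_COOKIE_OPTION_CODE = 10  # assigned to COOKIE by RFC 7873

def pack_cookie_option(client_cookie: bytes, server_cookie: bytes = b"") -> bytes:
    """Pack a COOKIE option for an OPT record: 16-bit option code and
    16-bit option length, then an 8-byte client cookie optionally
    followed by an 8-to-32-byte server cookie."""
    if len(client_cookie) != 8:
        raise ValueError("client cookie must be exactly 8 bytes")
    if server_cookie and not 8 <= len(server_cookie) <= 32:
        raise ValueError("server cookie must be 8 to 32 bytes")
    data = client_cookie + server_cookie
    return struct.pack("!HH", EDNS_COOKIE_OPTION_CODE, len(data)) + data
```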

The linked article talks about using it for tracking users, which the abstract
ironically says isn’t generally possible.

~~~
DanielDent
I published dnscookie.com in late 2015. I googled "dns cookies" and a few
other terms, was surprised that the terminology appeared unused, and it
seemed suitable for the concept I was describing.

In May 2016, RFC 7873 was published which also uses the term "DNS cookies".

These two things share a name but have different meanings. The naming
collision is an unfortunate coincidence.

~~~
bluejekyll
Oh! Wow! That was a huge assumption on my part.

I had assumed it was the aforementioned RFC. That explains why the site makes
no mention of EDNS or OPT records.

TIL...

