
Passive Fingerprinting of HTTP/2 Clients [pdf] - lainon
https://www.blackhat.com/docs/eu-17/materials/eu-17-Shuster-Passive-Fingerprinting-Of-HTTP2-Clients-wp.pdf
======
feelin_googley
If I were an online advertiser or a company selling to online advertisers, I
would be happy with these findings.

If I were a user concerned with privacy and tracking I might wonder why the
authors of the HTTP/2 standard did not make any efforts to address this type
of fingerprinting.

~~~
jhgg
Because doing so is not something that the authors of the HTTP/2 standard need
to do.

This fingerprinting is just sniffing the settings negotiated between the
client and server, and comparing them to a known list of preset settings for a
given browser. I think allowing these settings to vary is generally a good
idea, as the best values for these settings can't really be determined ahead
of time without trying a bunch of different configurations and collecting
real-world data.

Whenever there is variation in configuration options between implementations,
fingerprinting will be possible. These configuration options are fundamental
to how HTTP/2 works. Different vendors may arrive at different configurations
based on what they believe to be the "best" set of options to use.

If I were an online advertiser, or a company selling to online advertisers,
these findings would be mostly irrelevant to me (outside of anti-spam/abuse),
as for the 99.99999% use case, the user agent string is probably correct.

It's worth mentioning that there is not enough entropy in these fingerprints
for them to serve as a "super cookie"; rather, they can only determine whether
or not a user-agent string is probably spoofed, by comparing the settings
negotiated by the client to the settings that the claimed user agent is known
to negotiate. This is a smart approach to detecting a spoofed user agent,
which has its uses in anti-spam/abuse, but a fairly sophisticated adversary
could also just have their client send the same settings and packet sequences
as the user agent it is trying to spoof.
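The check described above can be sketched as a simple table lookup. Note that the browser names and SETTINGS values below are invented for illustration; real fingerprints would have to be collected empirically per browser version:

```python
# Sketch of the spoof-detection idea: compare the SETTINGS values a client
# sent on its HTTP/2 connection against the values its User-Agent string
# implies it should send. The preset table is ILLUSTRATIVE ONLY -- these
# are not actual browser fingerprints.
KNOWN_SETTINGS = {
    # (HEADER_TABLE_SIZE, INITIAL_WINDOW_SIZE, MAX_CONCURRENT_STREAMS)
    "example-browser-a": (65536, 6291456, 1000),
    "example-browser-b": (4096, 131072, 100),
}

def looks_spoofed(claimed_agent, observed_settings):
    """Return True if the observed SETTINGS don't match what the
    claimed user agent is known to negotiate."""
    expected = KNOWN_SETTINGS.get(claimed_agent)
    if expected is None:
        return False  # unknown agent: nothing to compare against
    return tuple(observed_settings) != expected

# A client claiming to be browser A but negotiating browser B's settings:
print(looks_spoofed("example-browser-a", (4096, 131072, 100)))  # True
```

As the comment notes, this only catches unsophisticated spoofing: an adversary who also replays browser A's settings would pass the check.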

------
cm2187
Maybe I missed something but it seems to only fingerprint the browser/OS type,
not the individual user. It's not like a super cookie.

~~~
yborg
Yeah, this seemed more like a bunch of guys at Akamai padding out their
resumes. Their use cases seemed pretty weak, and with all the other
information browsers leak, being able to get platform type "with increased
confidence" seems like a tiny achievement.

~~~
buro9
This approach is common and obvious, and importantly it isn't about privacy vs
advertising.

What it is about is things like click fraud, content scraping, and general
classification of bots and other actors, so that customers can more
confidently express their wishes.

Header ordering, H/2 options, and even capabilities via headers and the values
they hold... these are just signals. When combined with other signals they
allow for better classification and better control.

Is this really Googlebot? Well, ASNs aside, only if it has these traits. But
Googlebot is the easiest, what about the SEO and SEM bots? What about the
numerous bots that claim to be Baidu but really are not? What of the spam bots
that claim to be Firefox and are little more than C# clients?

Classification of traffic based on all available signals is what this is
about. The more signals, and the more that cannot be controlled by a user of a
client, the more confidence in the classification.

About the only advertising use here is that it would be easier to spot click
fraud when one knows the client is spoofed.
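The combine-many-signals idea can be sketched as a toy weighted score. The signal names and weights here are made up for illustration; a real classifier would be tuned on observed traffic:

```python
# Toy sketch of multi-signal classification: each passive signal
# (H/2 SETTINGS mismatch, header ordering, ASN consistency, ...)
# contributes a weight, and the total drives the bot/not-bot decision.
# Signal names and weights are invented for illustration.
WEIGHTS = {
    "settings_mismatch": 0.5,      # H/2 SETTINGS don't match claimed UA
    "header_order_mismatch": 0.3,  # header ordering atypical for claimed UA
    "asn_mismatch": 0.4,           # e.g. "Googlebot" from a non-Google ASN
}

def suspicion_score(fired_signals):
    """Sum the weights of the signals that fired."""
    return sum(WEIGHTS[name] for name in fired_signals if name in WEIGHTS)

# A "Googlebot" with the wrong SETTINGS coming from the wrong ASN:
score = suspicion_score(["settings_mismatch", "asn_mismatch"])
print(score >= 0.8)  # True: confidently classified as a fake
```

The point is the one made above: any single signal is weak, but signals a client can't easily control compound into a confident classification.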

------
BillinghamJ
Interesting. Perhaps adding some random variance to the settings, within the
range common browsers use, would help prevent this?

Assuming the settings are pretty arbitrary the vast majority of the time and
don't need to be set to anything particularly specific, then when specific
values actually are needed, they should blend in fairly well.
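The proposed countermeasure might look something like this sketch, where each SETTINGS value is drawn from a range spanning what common implementations send. The ranges here are hypothetical:

```python
import random

# Sketch of the randomization idea: instead of one fixed, identifying
# preset, draw each HTTP/2 SETTINGS value from a range covering the
# values common implementations use. Ranges are hypothetical.
SETTING_RANGES = {
    "HEADER_TABLE_SIZE": (4096, 65536),
    "INITIAL_WINDOW_SIZE": (65535, 6291456),
    "MAX_CONCURRENT_STREAMS": (100, 1000),
}

def randomized_settings(rng=random):
    """Pick each setting uniformly from its plausible range, so the
    combination no longer identifies a particular implementation."""
    return {name: rng.randint(lo, hi)
            for name, (lo, hi) in SETTING_RANGES.items()}

print(randomized_settings())
```

One caveat: per-connection randomization hides *which* browser you run, but a value drawn outside every known browser's range could itself become a signal, so the ranges would need to track real-world presets.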

------
billyhoffman
I can see why people would love to read a paper about passively fingerprinting
individual users. However, passive HTTP/1.1 fingerprinting of servers and
clients is a valuable and well-researched topic (at BlackHat, no less).
Finding ways to extend that to H2 is helpful to the industry as a whole.

Don't discard the work because it's not a "super cookie".

------
monkpit
The PDF is unreadable for me on iPhone/Safari. Zooming in is just blurry.

~~~
cpach
Works fine here. I’d kill Safari and try again.

