

User-Agent detection, history and checklist - rnyman
https://hacks.mozilla.org/2013/09/user-agent-detection-history-and-checklist/

======
w00kie
It was 1999, I was a freshman at university, doing my first internship at a
small hosting company. I was asked to develop a live web analytics tool that
the company could offer to their customers. I made a 1px transparent image in
PHP that logged the IP, referrer, user-agent, etc. in MySQL and showed the
live stats in a nice table in the backend with a list of the last 50 visitors.

This was one of my first projects in PHP, and I was naive: I did not know that
you should never trust user input, and I had not heard of htmlentities().

Cue a couple of weeks later, I get a frantic call from my manager in the
middle of the night. One of the customers is making a huge scene that someone
hacked his website to put pornography on it. It turns out someone had changed
their user-agent to <img src="http://goatse.cx/hello.jpg"> and visited
one of our websites...

~~~
mcherm
Brilliant hack! (And technically it's your fault - as you said - for failing
to properly escape the user-provided string before displaying it. But you were
young and we forgive you.) Thanks for sharing.
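
For anyone wondering what the missing step looks like, here is a minimal sketch in JavaScript (the original tool was PHP, where htmlentities() does this job). The names `escapeHtml` and `renderVisitorRow` are hypothetical, made up for illustration, and this is not the code from the story:

```javascript
// Hypothetical helper: replace the characters that are significant in HTML
// so a logged header is displayed as text instead of being parsed as markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;") // must run first so later entities are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical renderer for one row of the "last 50 visitors" table.
// Every field that came from the request is escaped before it is output.
function renderVisitorRow(visitor) {
  return "<tr><td>" + escapeHtml(visitor.ip) + "</td><td>" +
    escapeHtml(visitor.userAgent) + "</td></tr>";
}
```

With this in place, a user-agent containing an <img> tag would show up in the stats table as literal text rather than as an image.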

------
junto
Using the _User-Agent_ in combination with the _Accept-Language_ header is
probably enough to finger-print specific users that have their traffic exiting
a VPN.

Assuming the NSA are monitoring the traffic exiting VPN exit points, one minor
identity leak by a user (e.g. visiting a website that reveals their identity,
such as their Facebook profile) would allow the NSA to backtrack through any
stored traffic with those exact headers.

e.g. NSA db search:

    
    
      -- Search for 'Junto' traffic exiting VyprVPN London
      SELECT [Url], [Timestamp], [Headers], [Ip], [Body], [Response]
      FROM HttpTrafficLog
      WHERE
        -- long literals are concatenated so they contain no line breaks
        UserAgent = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 ' +
                    '(KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36'
      AND
        AcceptLanguage = 'en-GB,en;q=0.8,de-DE;q=0.6,de;q=0.4,' +
                         'zh-CN;q=0.2,zh;q=0.2,en-US;q=0.2'
    

The user agent leaks far too much in my opinion, and no, I do not have
British-German-Chinese heritage.
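
The same back-tracking idea can be sketched outside a database. This is only an illustration under assumptions: the entry fields `userAgent` and `acceptLanguage` and the function names are hypothetical, chosen to mirror the columns in the query above.

```javascript
// Hypothetical: build a quasi-identifier from the two headers.
// A newline separator keeps the two values from running together.
function fingerprintKey(entry) {
  return entry.userAgent + "\n" + entry.acceptLanguage;
}

// Given one request whose owner is known, return every logged request
// that carries exactly the same pair of headers.
function backtrack(log, identified) {
  const key = fingerprintKey(identified);
  return log.filter((entry) => fingerprintKey(entry) === key);
}
```

The more unusual the Accept-Language list, the fewer unrelated users share the key, which is why a default `en` value is much less identifying than a long custom list.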

~~~
Nick_C
I agree it leaks far too much, especially for niche operating systems like
Linux. For Firefox, I use the "Modify Headers" add-on to remove the extra
info, so UserAgent=Mozilla/Firefox and AcceptLanguage=en.

http://www.garethhunt.com/modifyheaders

------
derefr
Is there any reason, from the _user's_ perspective, to still send a
User-Agent string at all in modern browsers? What is it useful for, besides making
clear that you aren't a (self-identified) robot or script?

~~~
talmand
Well, here's one potential reason.

I work on a site that involves fancy uses of video and modals and other
things.

For some reason, a recent update to Firefox broke some of the functionality.
Testing showed that it involved Firefox on OS X and Flash video. We are using
an HTML5 video library with a Flash fallback on the site. Almost all browsers,
including Firefox on Windows, use HTML5 video on the site and not Flash.

Therefore, I had to use the user-agent to tell whether a visitor was using
Firefox and then check whether Flash content was in use. If so, I forked the
functionality to fix the issue for those people. Everyone else got the
original code.

Not an elegant solution, but for that edge case it worked. Without knowing the
exact browser, I couldn't have fixed the problem for those users.
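
Roughly, the fork looked like the sketch below. This is a simplified illustration, not the production code; `isFirefox` and `chooseVideoSetup` are hypothetical names, and the Flash check is assumed to come from the video library.

```javascript
// Hypothetical check: does the UA string claim to be Firefox?
// Firefox UA strings end in a "Firefox/<version>" token.
function isFirefox(userAgent) {
  return /Firefox\/\d+/.test(userAgent);
}

// Hypothetical fork: apply the workaround only when the sniffed browser is
// Firefox AND the library fell back to Flash; everyone else keeps the
// original code path.
function chooseVideoSetup(userAgent, usesFlashFallback) {
  return isFirefox(userAgent) && usesFlashFallback ? "workaround" : "original";
}
```

Because Firefox on Windows plays HTML5 video on the site, the Flash condition already excludes it, so the check does not need to look at the operating system.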

There's also a need to know in terms of analytics what browsers are being used
to view your sites so you can know what level of functionality you can
reasonably expect from your visitors. That is, until all browsers in use
render everything, and behave, exactly the same way.

~~~
derefr
Rather than being sent to the server, it might be okay in the first case if
the UA was just exposed through the DOM as a property of the window.renderer;
that way, the page, after delivery, could actively decide to apply a shim, but
the initial content would be the same for everyone.

For the second case, you can at least do JS feature detection and send back
the results with AJAX. That seems like the "public API" way of getting to know
your users' capabilities, rather than trying to infer them from their UA
strings.
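
A minimal sketch of that second idea, under assumptions: `detectFeatures` and `reportFeatures` are hypothetical names, the two probes are just examples, and the `/capabilities` endpoint in the comment is made up.

```javascript
// Hypothetical probe: `doc` is expected to behave like the browser's `document`.
// Each test asks the browser what it can do instead of reading its UA string.
function detectFeatures(doc) {
  const video = doc.createElement("video");
  const canvas = doc.createElement("canvas");
  return {
    html5Video: typeof video.canPlayType === "function",
    canvas: typeof canvas.getContext === "function",
  };
}

// Hypothetical reporter: `send` is any function that delivers a JSON string
// to the server, which keeps the probe independent of the transport.
function reportFeatures(doc, send) {
  send(JSON.stringify(detectFeatures(doc)));
}

// In a browser this might be wired up as follows (endpoint is an assumption):
//   reportFeatures(document, function (body) {
//     var xhr = new XMLHttpRequest();
//     xhr.open("POST", "/capabilities");
//     xhr.setRequestHeader("Content-Type", "application/json");
//     xhr.send(body);
//   });
```

Passing `doc` and `send` in as arguments is a design choice for the sketch: it lets the probe be exercised with a stub document outside a browser.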

~~~
talmand
Usually I go with feature detection, but in this case I couldn't work out
how to do that. I would need broken feature detection, if I knew what
feature was broken, or something along those lines.

~~~
derefr
> broken feature detection

Interesting idea! I can imagine an implementation--every day, along with
updating a malware database, each browser pulls down the list of open bugs on
its own bug tracker (and upstream component bug trackers) applying to its own
version. Then, from JavaScript, you could say something like

    
    
        if(window.renderer.hasBug("http://trac.webkit.org/3440224")){
          /* you're on a version of WebKit affected by this bug */
        }
    

The nice part of this is, to get a globally-unique token identifying a bug you
just noticed, you "just" have to file it with the party responsible for fixing
it. When the bug is closed, so is your code branch :)

