
User agent sniffing is still sometimes required. Take the SameSite bugs that Safari had: on the server, you needed to work around them based on User-Agent sniffing.
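For reference, the server-side workaround looked something like this (a TypeScript sketch; the version checks are loosely adapted from the chromium.org incompatible-clients guidance and are illustrative, not the full list):

    // Safari on iOS 12 and macOS 10.14 treated `SameSite=None` as
    // `SameSite=Strict`, so affected clients get the attribute omitted.
    function shouldOmitSameSiteNone(userAgent: string): boolean {
      const isIos12 =
        /\(iP(?:hone|ad|od)[^)]*OS 12[_\d]*[^)]*\) AppleWebKit\//.test(userAgent);
      const isMacos1014Safari =
        /\(Macintosh;[^)]*Mac OS X 10_14[_\d]*\) AppleWebKit\//.test(userAgent) &&
        /Version\/[\d.]+.*Safari\//.test(userAgent) &&
        !/Chrom(e|ium)/.test(userAgent);
      return isIos12 || isMacos1014Safari;
    }

    function sessionSetCookie(userAgent: string, value: string): string {
      return shouldOmitSameSiteNone(userAgent)
        ? `session=${value}; Secure`                 // buggy WebKit: omit SameSite
        : `session=${value}; Secure; SameSite=None`; // everyone else
    }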





> SameSite bugs that Safari had

Safari has SameSite bugs and it's infuriating that after years they don't even update the bug reports, let alone fix those.

https://bugs.webkit.org/show_bug.cgi?id=200345


Agreed. For media streaming, UA sniffing is required because of quirks each browser has in decoding media. Safari has a few quirks about the MP4 packaging it accepts via Media Source Extensions. Firefox's network stack adds a delay unrelated to the actual request because of its implementation[0], which makes timing small downloads very inaccurate. None of these differences are discernible except by UA.

[0] I believe FF includes in the download length the time a request spends sitting in the request queue, but I can't exactly remember.
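To make the Safari case concrete, the selection logic ends up looking something like this (a TypeScript sketch; the "safari-friendly" packaging label is hypothetical):

    // isTypeSupported() answers "can it decode this codec?" but says
    // nothing about packaging quirks, so a UA check remains.
    const MIME = 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"';

    function pickPackaging(userAgent: string): "safari-friendly" | "default" {
      if (typeof MediaSource === "undefined" || !MediaSource.isTypeSupported(MIME)) {
        throw new Error(`MSE playback unsupported for ${MIME}`);
      }
      // Safari reports support here even for fragmented-MP4 layouts it
      // mishandles, so fall back to sniffing.
      const isSafari =
        /AppleWebKit\//.test(userAgent) &&
        /Version\/[\d.]+.*Safari\//.test(userAgent) &&
        !/Chrom(e|ium)/.test(userAgent);
      return isSafari ? "safari-friendly" : "default";
    }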


Browsers keep those quirks precisely because web designers work around and hide them.

I also remember a CSS bug in Safari 4 or 5 with tables and touch scrollbars that we had to resort to UA detection for; it's not always possible to feature-detect bugs.

Right, and it’s not just Safari. This is the current complexity of User-Agent sniffing that the SameSite change requires if you need to disable it:

https://www.chromium.org/updates/same-site/incompatible-clie...


I think it's distinctly possible that the status quo will cause an overreaction. Feature detection is good, but we may also sometimes need the ability to do specific version checking.

What I think would be best is two things: a feature-advertising string (preferably with some versioning in itself), and a new User Agent string that is just that: the user agent. As in, "This is Firefox 72.2.813". Maybe the OS in there too. I think a non-trivial part of the problem with the UA today is "Hey... I'm Firefox. Also, I'm Mozilla. Also, I'm Edge 3, and Internet Explorer 6, and today I think I'm also Chrome. Also, I may be a monkey with a banana plugged into an Ethernet cable." Part of the problem is that people abuse the user agent, but part of the problem is that the user agent string is also a pack of outright, outrageous, self-contradictory lies. A truthful field may still have significant utility.

One of the things I've learned over the years in software engineering is that you should do your best to make sure that your system doesn't contain lies, or, failing that, treat them like any other issue you contain at the edges rather than letting them run riot through your system. For instance, if you decide to let customers have 10GB of bandwidth free before you start charging them, you should never accomplish that goal by tweaking the bandwidth counting system to report 7.5GB of usage as 0GB. That is a lie. Tell the truth about the usage and let the billing system apply the discount, which is itself a truth as well. You always pay for lies in the system in the long run.

It is a very common pattern in coding as well, where some function author figures out how to assemble the correct lie to some other function to make it do what the author thinks they want, but in the long run you're better off telling the truth and fixing the code to work with the truth. Otherwise, well, you really do end up with a tangled web of code. Which is... exactly what we've gotten with UA-detection-based code. We won't necessarily get the same mess if there's a new UA that isn't a lie.
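Concretely, the difference is something like this (a sketch; all names are hypothetical):

    interface Usage {
      customerId: string;
      bandwidthGb: number; // the metering layer always reports the truth
    }

    const FREE_TIER_GB = 10;
    const PRICE_PER_GB = 0.05;

    // Wrong: make the meter lie (report 7.5GB as 0GB) so the bill "just
    // works". The lie then leaks into dashboards, capacity planning,
    // abuse detection...
    //   const reportedGb = Math.max(0, measuredGb - FREE_TIER_GB);

    // Right: the meter tells the truth; billing applies the free tier
    // as an explicit, truthful pricing rule.
    function invoiceAmount(usage: Usage): number {
      const billableGb = Math.max(0, usage.bandwidthGb - FREE_TIER_GB);
      return billableGb * PRICE_PER_GB;
    }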

I'm not claiming this will lead to utopia. Minority forks of popular browsers will fail to pick up the necessary ad-hoc bug fixes this way, for instance. My claim is more that, in the long run, the best thing for everybody is just to have a user agent that tells the truth, even if in the short term you experience occasional problems.

(One way to at least partially achieve this is to standardize the new truthful agent string to something like (\w+) ((\d+\.)+\d+) (\w+), and specify it as "non-conforming true user agent strings MUST be entirely rejected and treated as being absent", so anyone who tries to be Firefox and Chrome and also a monkey ends up being nobody in particular.)
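A sketch of what that rejection rule looks like in practice (TypeScript; the format is the one proposed above):

    // Anchored and fully matched: "Mozilla/5.0 (compatible; ...)" soup
    // parses to nothing at all and is treated as absent.
    const TRUE_UA = /^(\w+) ((?:\d+\.)+\d+) (\w+)$/;

    function parseTrueUserAgent(
      header: string | undefined
    ): { browser: string; version: string; os: string } | null {
      const m = header?.match(TRUE_UA);
      if (!m) return null; // non-conforming: treat as absent
      return { browser: m[1], version: m[2], os: m[3] };
    }

    // parseTrueUserAgent("Firefox 72.2.813 Linux")
    //   -> { browser: "Firefox", version: "72.2.813", os: "Linux" }
    // parseTrueUserAgent("Mozilla/5.0 (X11; Linux x86_64) ...") -> null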


But that's what this proposal is about: instead of using the user agent string (which lies and needs to be parsed), you have some well-specified variables to check.

Also, I think you're missing why the user agent lies now. It wasn't done on a lark; it's because it's been abused so much by web devs, both through sloppy regular expressions and deliberately. For example, even big names like Microsoft and Google have at times used it to unnecessarily deliver poorer browsing experiences to people not using their browser.

Sure, browsers could have stuck to their guns and been "honest" about their UA. But honesty is no comfort to a browser's users when they find websites breaking for no reason. The average user is more likely to blame the browser than the website.


I am completely aware of why the user agent lies now. Didn't I explain exactly how it is broken in my post? I was basically there when it broke; I remember when IE came out claiming to be "Mozilla" because otherwise a surprising number of sites wouldn't serve it the latest whizbang Netscape 3 HTML. (I thought it was a bad idea then, but with much less understanding of why.) This is why I kept calling what I'm asking for a new field; the User Agent itself can't be rehabilitated.

The parent of my post is correct; in practice we're still going to need the occasional ability to shim in browser-specific fixes, because even if the browsers do their best, they're going to inadvertently lie in the future and claim to support WebVR1.0 in Firefox 92, but, whoops, actually it crashes the entire browser if you try to do anything serious in it. Or, whoops, Firefox 92 does do a pretty decent job of WebVR1.0, but I need some attribute they overlooked. Or any number of similar little fixups. We know from real-world field experience that we're talking about crashing bugs here at times; this is a real thing that has happened. Whatever proposal gets implemented should deal with this case too.

If we standardized on a format like the one I suggested at the end of my post, it would go a long way towards preventing future browsers from mucking up the field. If you just get "$BROWSER $VERSION $OS" in a rigid specification, and if the major browsers are sure to conform to that, and the major frameworks enforce it, it'll be enough to prevent it from becoming a problem in the future. It won't stop Joe Bob's Bait Shack & Cloud Services from giving their client a custom browser and/or server that abuses it, but there's no stopping them from doing things like that no matter what you do, so shrug.


Then I'm not sure I understand you. The proposal clearly proposes new fields that are less susceptible to abuse (whether intentional or not). Your idea of parsing a "$BROWSER $VERSION $OS" string seems inferior to Client Hints, which use structured headers.
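For concreteness, the flow the proposal describes looks roughly like this (header values are illustrative):

    // The server opts in to high-entropy hints via Accept-CH; the
    // browser then sends structured, parseable fields instead of a
    // freeform UA string.
    //
    //   Response:     Accept-CH: Sec-CH-UA-Full-Version-List, Sec-CH-UA-Platform
    //   Next request: Sec-CH-UA: "Chromium";v="112", "Not:A-Brand";v="99"
    //                 Sec-CH-UA-Mobile: ?0
    //                 Sec-CH-UA-Platform: "Linux"

    // Minimal parse of the Sec-CH-UA list (simplified; a real server
    // should use a proper structured-field parser).
    function parseSecChUa(value: string): Array<{ brand: string; version: string }> {
      return [...value.matchAll(/"([^"]+)";v="([^"]+)"/g)].map((m) => ({
        brand: m[1],
        version: m[2],
      }));
    }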

I'm saying we still need a browser version field in addition to a feature field. Feature detection would do on its own if it were perfect, but we shouldn't plan on it always being perfect. We have a demonstrated, in-the-field history of browsers claiming to support features when in fact they don't quite support them, sometimes with crashing bugs. In the real world, supporting WebVR1.0 is more than just putting "web-vr/1.0.0" in the feature string.

Culturally, you should prefer feature detection; most developers would never need anything else. But when Amazon makes its new whizbang WebVR1.0 front-end in 2024, they may need the ability to blacklist a particular browser. Lacking that ability may actually prevent them from shipping, if shipping means some non-trivial fraction of the browsers claiming "web-vr/1.0.0" will in fact crash and there is nothing they can do about it.
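What that escape hatch might look like (a sketch; the browser name, version, and broken entry are all hypothetical):

    // Feature detection first; a version denylist only for releases
    // known to advertise a feature they crash on.
    const KNOWN_BROKEN = [
      { browser: "Firefox", major: "92", feature: "web-vr/1.0.0" },
    ];

    function canUseFeature(
      advertised: string[],
      ua: { browser: string; version: string },
      feature: string
    ): boolean {
      if (!advertised.includes(feature)) return false; // feature detection first
      const major = ua.version.split(".")[0];
      return !KNOWN_BROKEN.some(
        (b) => b.feature === feature && b.browser === ua.browser && b.major === major
      );
    }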

Besides... they will find a way to blacklist the browser. Honestly "prevent anyone from ever knowing what version of the browser is accessing your site" is not something you can accomplish. If you don't give them some type of user agent in the header, it doesn't mean the Amazon engineers are just going to throw their hands up and fail to ship. They will do something even more inadvisable than user agent sniffing, because you "cleverly" backed them into a corner. If necessary, they will examine the order of headers, details of the TLS negotiation, all sorts of things. See "server fingerprinting" in the security area. You can't really stop it. Might as well just give it to them as a header. But this time, a clean, specified, strict one based on decades of experience, instead of the bashed-together mess that is User-Agent.

Or, to put it really shortly: the fact that a bashed-together User-Agent header has been a disaster is not sufficient proof that the entire idea of sending a user agent is fundamentally flawed. You can't tell from the current facts whether the problem is that User Agent sniffing is always 100% guaranteed, totally black-and-white, no-shades-of-grey mega-bad, or whether it's the bashed-together nature of the field that is the problem.


Working around browser bugs only helps ensure that those bugs never get fixed. Better that things like the UA are removed to make that impossible. In that world, sites would be best served by coding to the standard and putting the work of fixing incompatibilities/bugs back onto the browser makers, where it belongs.


