Downgrade User Agent Client Hints to 'harmful' (github.com/mozilla)
150 points by ronancremin 70 days ago | 111 comments



> Moving stuff around (from User-Agent to Sec-CH-UA-*) doesn't really solve much. That is, having to request this information before getting it doesn't help if sites routinely request all of it.

I think this is sort of ignoring the whole point of the proposal. By making sites request this information rather than simply always sending it like the User-Agent header currently does, browsers gain the ability to deny excessively intrusive requests when they occur.

That is to say, "sites routinely request all of it" is precisely the problem this proposal is intended to solve.

There are some good points in this post about things which can be improved with specific Sec-CH-UA headers, but the overall position seems to be based on a failed understanding of the purpose of client hints.
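To make the opt-in model concrete, here is a minimal sketch of what "deny excessively intrusive requests" could look like on the browser side. The `Accept-CH` response header and the `Sec-CH-UA-*` names are real; the policy itself (`hints_to_send`, the allow-list, the cap on high-entropy hints) is a hypothetical illustration, not how any browser is known to behave.

```python
# Hints that are low entropy and sent without being requested.
LOW_ENTROPY = {"sec-ch-ua", "sec-ch-ua-mobile", "sec-ch-ua-platform"}


def parse_accept_ch(header_value: str) -> list[str]:
    """Split an Accept-CH response header into lowercase hint names."""
    return [h.strip().lower() for h in header_value.split(",") if h.strip()]


def hints_to_send(accept_ch: str, allowed_high_entropy: set[str],
                  max_high_entropy: int) -> list[str]:
    """Hypothetical policy: grant low-entropy hints freely, and grant
    high-entropy hints only if allow-listed and under a per-site cap."""
    granted = []
    high_entropy_count = 0
    for hint in parse_accept_ch(accept_ch):
        if hint in LOW_ENTROPY:
            granted.append(hint)
        elif hint in allowed_high_entropy and high_entropy_count < max_high_entropy:
            granted.append(hint)
            high_entropy_count += 1
        # Anything else is silently denied.
    return granted


# A site that asks for everything only gets part of it.
granted = hints_to_send(
    "Sec-CH-UA-Platform, Sec-CH-UA-Arch, Sec-CH-UA-Model, Sec-CH-UA-Full-Version-List",
    allowed_high_entropy={"sec-ch-ua-arch", "sec-ch-ua-model"},
    max_high_entropy=1,
)
```

With the always-on User-Agent header there is no equivalent place to put such a check, because the site never has to ask.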


> browsers gain the ability to deny excessively intrusive requests when they occur

But Set-Cookie kind of proves what happens to that kind of feature. If at first sites get used to being able to request it and get it, then the browsers that deny anything will simply be ignored. And then those browsers will start providing everything, because they don't want to be left out in the cold.

That's what happened to User-Agent, that's what happened to Set-Cookie, and I can't see why it won't happen to Sec-CH-UA-*. Which the post hints at several times. Set-Cookie was supposed to have the browser ask the user to confirm whether they wanted to set a cookie. Not many clients doing that today.

To be honest, I feel the proposal is a bit naïve if it thinks that websites and all browsers will suddenly be on their best behaviour.


> Set-Cookie was supposed to have the browser ask the user to confirm whether they wanted to set a cookie. Not many clients doing that today.

No worries, that's why we have laws to make the website do in the content what the browser no longer wants to do in the viewer. ;D


Having the browser explicitly prompt for cookies is neither necessary nor sufficient to do what strong, consistently-enforced privacy laws can do, because the browser can't tell a tracking cookie (which needs a prompt) apart from a settings cookie (which does not).


And the law also only requires you to ask the user if they want to be spied on.

It's not tightly bound to cookies in any way.

And vastly misunderstood.

There was a predecessor which was somehow tied to cookies but even then you didn't need to ask for setting purely functional cookies.

But somehow everyone ended up interpreting it as such.

Maybe because most sites don't have many purely functional cookies or fingerprinting, as they always track you for other purposes, too.


I’m convinced that a lot of the really annoying cookie prompts are the result of two things:

* paranoia, from small websites that are understandably worried about massive fines that could actually put their one-man-show into the poor house

* retaliation, from large websites that intentionally want to turn public sentiment against privacy laws


We were naive if we ever thought the end result would be otherwise.


But browsers could disable third party cookies, and autodelete first party cookies on page/tab close by default.

There would be a "keep cookies for this site" button somewhere near the address bar, and at each login, the browser would also ask you if you want to save your password and/or save cookies for that domain.

99% of websites don't require persistent storage, and of those that do, 99% are sites you're logged into, where the user already gets prompted to save the password.
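A rough sketch of that policy, under loud assumptions: `SessionScopedCookieJar` and its methods are made-up names, and "third party" is reduced to an exact site-string comparison (a real browser would compare registrable domains).

```python
class SessionScopedCookieJar:
    """Hypothetical cookie jar: no third-party cookies, and first-party
    cookies are dropped on tab close unless the user chose to keep them."""

    def __init__(self) -> None:
        self.cookies: dict[str, dict[str, str]] = {}  # site -> {name: value}
        self.keep: set[str] = set()                   # sites the user opted to keep

    def keep_cookies_for(self, site: str) -> None:
        """The 'keep cookies for this site' button."""
        self.keep.add(site)

    def set_cookie(self, top_level_site: str, cookie_site: str,
                   name: str, value: str) -> bool:
        """Store a cookie only if it belongs to the site in the address bar."""
        if cookie_site != top_level_site:
            return False  # third-party cookie: refused
        self.cookies.setdefault(cookie_site, {})[name] = value
        return True

    def close_tab(self, site: str) -> None:
        """Delete the site's cookies unless the user asked to keep them."""
        if site not in self.keep:
            self.cookies.pop(site, None)


jar = SessionScopedCookieJar()
jar.keep_cookies_for("reddit.com")
jar.set_cookie("reddit.com", "reddit.com", "session", "abc")
jar.set_cookie("news.example", "news.example", "prefs", "dark")
third_party_stored = jar.set_cookie("news.example", "tracker.example", "id", "42")
jar.close_tab("reddit.com")
jar.close_tab("news.example")
```

After both tabs close, only the reddit.com cookie is still in the jar.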


That's private browsing currently. Why not use a private window?


Because I might want cookies on this page, Gmail and Reddit, and nowhere else. This would mean me starting a private window, googling something, finding a link on Reddit, opening it, either logging in again, or copying the link to a non-private window, commenting, closing that window, and back to search results.


Firefox has container tabs that do exactly this from a new tab.


I often do that, but now I have to click on cookie confirmation banners all the time. It is very annoying. It might take just seconds each time, but it adds up; eventually I will have spent hours clicking on these banners.

Sometimes these banners do not even work because of my NoScript.



Because software is supposed to make our lives easier, not to insist we keep making the same choices again and again, and undo everything as soon as we make a mistake.


That would be an extension or fork of Set-Cookie.


Of course a web server could report which cookies are for tracking, and which are for authentication or configuration, instead of doing it within the content.

But so what? The browser has no way to tell if it’s lying.


Yes, this looks like DNT all over again. Just another header that quickly becomes meaningless, wasting terabytes of bandwidth all over the world for no good reason.


DNT does nothing technically, but it has political power and that's where privacy happens to a great degree. When 70% of users say 'do not track me', it is hard to claim that they don't care about privacy.


Unless a big vendor (coff Microsoft coff) decides to enable it by default, then it becomes meaningless.


It was meaningless from the beginning: DNT was always nothing but an Evil Bit. You’re getting mad at Microsoft for pointing out that the emperor had no clothes.


It was an Evil Bit because it didn't have the force of law behind it. Now we have cookie laws.


We had "cookie laws" when DNT was created, too.


There were people promising to implement it. That's a lot better than nothing.


Is it? The whole point of this thread is that none of the big players stood by their "promises" for longer than a few months. Especially Google's hypocrisy of promoting DNT in Chrome and knowing full well their adtech teams would ignore it as soon as they had an excuse. (Microsoft and Mozilla enabling it by default sure was a "good" excuse, despite that obviously being in the best interest of the users.)


It's possible they were looking for an excuse and lying. But enabling it by default definitely seemed to defeat the entire point of the fragile agreement. It doesn't matter to them whether it's better for users.


My entire point is that if it was a "fragile agreement" it wasn't in good faith and it was lying and waiting for any excuse to break it, by definition. It doesn't matter what excuse broke it. It was always a bad faith attempt to score some regulatory points and it was never about actually doing good for users. They never should have offered a "standard" for such a "fragile agreement" they didn't really believe in, and everything they said about it was hypocrisy.


I meant a fragile good faith agreement.

I haven't seen any particularly compelling evidence it wasn't in good faith.

It was an "attempt to score some regulatory points and it was never about actually doing good for users". But that was always obvious because it was companies doing it.

It still would have done good if it was implemented.


Again, it was implemented and it didn't do any good, on purpose, as soon as users tried to actually use it.

It's not a good faith agreement if "we agree to do this only so long as the setting to enable it is in the sub-sub-basement of the browser locked in a closet marked 'Beware of Leopard'". If it wasn't "oh no a browser enabled it by default" it would have been "oh no a tutorial went viral on social media telling people how there isn't really a leopard and that everyone should just open that closet and click the button", because again, the excuse for why they stopped supporting it doesn't matter: they never planned to support it for more than the theory of it. There's always some other excuse. It was only ever a "heisenberg feature": it can either not be used or it could just not exist. I'm saying it's not possible at all to design a "heisenberg feature" like that in any possible definition of "good faith". If it wasn't designed to be used by even a paltry 5% of users at the time they balked and stopped supporting it, it wasn't designed in good faith. There's no way to look at that and think they meant anything about any of their promises when they said they'd support it.


You can keep claiming bad faith and that they would have used any excuse all you want, but if you don't back that up with anything but gut feeling then I'm not going to be convinced.

Yes it would be bad faith if this thing you're claiming is true. It would help if you had some evidence.

"We agree to let users opt out." is a pretty simple proposition and the motives make sense. It helps those users and it significantly reduces the demand to regulate them, but doesn't cost much money. Removing the part where users do the opting is a very different situation. Balking at that does not require bad faith. It requires they not have the users' best interests at heart, but... yeah, we knew that. If they did then ads would be much rarer and much less invasive, if they existed at all.


We probably have very different perspectives. I feel like I should warn you that you are very close to an ad hominem attack and I don't appreciate that.

My perspective from over a decade of software engineering is that you cannot ethically build a (privacy) feature that is "only ever to be opt in only" and not expect (privacy) experts to make a big deal about it ("hey everyone should opt in to this thing that makes your privacy better") and/or encourage other (browser) manufacturers to go ahead and make it default option ("this would make privacy better for non-expert users"). That's not an ethical feature, that's a bait-and-switch no matter what the timescale is between "this feature is opt-in only" and "oh no too many users opted in".

I am making a very opinionated judgement in leaping from "this feature was designed unethically" to "it was morally wrong for them to do so". It's fine if you don't agree that a bunch of people made a morally bad judgement call in building and marketing that feature. I believe that entire hype cycle of that feature did far more to set back privacy debates on the web for years than it did to help. I believe by making some of those morally wrong decisions years back Chrome established once (and maybe for all time) that their privacy teams cannot be trusted to build equitable/ethical/fair privacy systems. I worry that Mozilla is the last big opposition standing up to this behavior and I worry about what will happen if we lose Mozilla in this fight to keep an eye out for what other "privacy" ideas the Chrome team generates that should (rightfully) be deemed 'Harmful'.

I really don't expect you to agree with me at this point. The number of Firefox users left is abysmally tiny according to statistics. Chrome has won, despite whatever it is I (and I hope Mozilla continues to) think about their (lack of) professional ethics. At this point I'm only breaking it down for you as much as I can not to try to convince you, but to feel like I've done my part to say that I believe what Google did with DNT was very wrong, if not very evil, because there aren't a lot of people left to speak out against such wrongs. Because we as a profession don't have a proper ethics board to try these sorts of things in a court of our peers rather than let them fester and rot in the halls of companies whose "Do No Evil" marketing some people still believe despite the actions they have taken.


I'm not sure what you think is close to ad hominem, but I'm sorry.

> My perspective from over a decade of software engineering is that you cannot ethically build a (privacy) feature that is "only ever to be opt in only" and not expect (privacy) experts to make a big deal about it ("hey everyone should opt in to this thing that makes your privacy better") and/or encourage other (browser) manufacturers to go ahead and make it default option ("this would make privacy better for non-expert users"). That's not an ethical feature, that's a bait-and-switch no matter what the timescale is between "this feature is opt-in only" and "oh no too many users opted in".

Well I never said it was a particularly "ethical" feature. If they wanted to be ethical they would shut down 90+% of ads.

But you're making a big assumption here, that privacy experts bugging people about it and browsers making it the default would get the same response.

My bet is that privacy experts bugging people about it would have been tolerated just fine, and that most people still wouldn't flip the switch.

But browsers changing the default is qualitatively different from the user being able to set it.

I think your characterization of "too many users opted in" is flat-out wrong. The browser is opting, not the users. And it's not like people were choosing Edge because it would opt them out. That's a completely negligible percent of people.

Another way the two are different is very simple: If you had a big fraction of users manually flipping the switch, and the advertisers tried to cancel the feature because that was too many users, you could mobilize those tens of millions of people into a powerful political campaign to bake DNT into law.

> I believe that entire hype cycle of that feature did far more to set back privacy debates on the web for years than it did to help.

Maybe. But I'd still rather have privacy set back and have sites respecting my do not track header vs. privacy set back and it's completely useless...

As for the rest of your post, I'm confused about what you're accusing the Chrome devs in particular of doing? I agree that they're a big problem, see also flock, but with DNT they're not responsible for what the advertising group does. What could they have done better?


> My bet is that privacy experts bugging people about it would have been tolerated just fine,

But they weren't tolerated. Comment sections of such articles were full of passive aggressive bullying about how "such a feature was only for nerds to care about and people shouldn't really do it". How much of that was ad industry paid, who knows. Rumors and reports abounded that articles were deranked in especially one well known search engine and how-to videos were demonetized and hidden from recommendations in the current biggest video site. (And to not just point fingers at Google properties, there were similar rumors about algorithm shenanigans with posts on Twitter and Facebook, who both also have adtech firms deep into tracking.)

Obviously, privacy experts bugging people about it and browsers making it the default would get different responses: one is easier to deal with through skullduggery and the other is "safe" enough to make passive-aggressive PR releases about as it gives you someone to point fingers at while you "take your ball and go home". I mentioned way above that it was a "good" excuse, and that's exactly how I think of it. They could pretend to be the victim and point the blame at a company with much less adtech as the real villain (for doing what users wanted and what was good for users); win/win.

Sure, at this point we don't have clear evidence of skullduggery and it is mostly academic/hypothetical and a small assumption how the adtech firms would have reacted if something had slipped their nets and actually went viral. I don't think it's a big assumption from there how they would have reacted nearly as quickly in "take their ball and go home" mode. The only difference there is whatever excuse they come up with to blame it on, and I can imagine all sorts of excuses they might have come up with. I'm "happy" for them they found such a "good" excuse.

> But browsers changing the default is qualitatively different from the user being able to set it.

I think we're never going to agree here. It's not qualitatively different. People had plenty of choice in browser at the time and the two browsers that did it had tiny minority user bases. It was quantitatively different. It was a lot of people opting in to more privacy at once with a browser upgrade. You can claim all you want that some number of those people were simply lazy and made "no choice", but the statistics don't actually agree with that.

One, because they were already minority browsers.

Two, if we want to get specifically into Edge details: at the time, Edge was already the browser with the highest adoption of Do-Not-Track even before the version that turned it on by default. Edge put the feature front-and-center in the Settings window and made it easy to find. Edge also did a very gentle prompt "Hey there's this new feature that could enhance your privacy. Do you want it on? [Learn More]" in the versions leading up to turning it on by default. Microsoft pointed out at the time that the leading feedback they kept getting from users from those prompts was "Obviously this is a good idea, why are you even asking, please turn it on by default." Most users that upgraded to the "on by default" version would have seen one of those prompts. A few were convinced by propaganda at the time that Microsoft was trying to "do some evil" (by asking if you wanted more privacy?) and switched to Chrome.

> And it's not like people were choosing Edge because it would opt them out.

I know I convinced a few people. Anecdata isn't data, but statistically Edge started briefly growing in users again right around that time. Not by much certainly (not enough to save Edge), but clearly some. (Some people were fed up with how hard Chrome made that setting to find.)

As a user, it was qualitatively better user experience with DNT for the brief windows where adtech actually abided by their "promises" (lol) and respected it. As an Edge user at the time, I don't think the web has felt as nice until Firefox added their "Enhanced" Tracking Protection, years later. (And Apple's similar tools just recently.)

> That's a completely negligible percent of people.

I think by definition it was not negligible if it spooked the adtech companies to quit so fast.

> Another way the two are different is very simple: If you had a big fraction of users manually flipping the switch, and the advertisers tried to cancel the feature because that was too many users, you could mobilize those tens of millions of people into a powerful political campaign to bake DNT into law.

I disagree. My entire point is that they never would have allowed it to get to "tens of millions" of people in the first place. Whether by skullduggery or passive aggressive PR notes doesn't matter. They very calculatedly stopped when it was a tiny fraction of people just enough to impact the bottom line. It's not a big assumption on my part that no matter how we got to that small of a fraction of people learning about the feature and actually using it, they would always have stopped it before it became popular (whether or not you believe the skullduggery to be real or a conspiracy theory), at the very least because their shareholders would have demanded it because it was impacting perceived profits.

> As for the rest of your post, I'm confused about what you're accusing the Chrome devs in particular of doing? I agree that they're a big problem, see also flock, but with DNT they're not responsible for what the advertising group does.

I'm mostly admonishing all of Google for acting like an evil company. If we want to talk about the Chrome team's specific responsibilities: I believe that as a professional it is your job to make sure that you follow ethical standards. The Chrome team knows that their checks are signed by the adtech teams. That's a conflict of interest that makes it very hard to maintain professional ethics. I can't tell any individual developer on the Chrome team that they should stand up and walk away from that conflict of interest for the betterment of the web and the profession. How you navigate your ethics code is a personal matter. I can blame the team collectively for not standing up to the people writing their checks as a massive ethical failure. I doubt we'll see a Chrome dev team strike anytime soon, but that's within their rights.

> What could they have done better?

The obvious answer is a feature that was actually browser enforceable similar to today's Firefox's Enhanced Tracking Protection (which is good enough a lot of "publishers" call Firefox itself an "ad blocker" today, despite the fact that it blocks no ads just trackers), on by default and with an "opt out of privacy" model rather than an "opt in to privacy" wish-it-were.

It may have been "impossible" to do, because they would have actually needed to confront that conflict of interest in their hearts. They would have needed to tell their owners and masters that they were going to have to eat a couple of down quarters in profits until they either adjusted the market to charge appropriately for untracked advertising or managed to build a propaganda machine big enough to convince users in bulk that privacy wasn't in a user's best interest and that they should opt out of tracking protection.

But the guts to make that sort of ethical push, would have been the right thing to do, for everyone. Doing it then, and doing it with the majority browser, would have been an impactful statement. That would have been a "Do No Evil" Chrome moment for sure, and it is certainly obvious and easy to imagine that they had the power to do something like that, just not the ethics or morality.


Well I understand your opinion even if I disagree with parts, except a couple notes.

> I think by definition it was not negligible if it spooked the adtech companies to quit so fast.

I'm pretty sure what spooked them was all the people that were going to be on Edge because it comes with Windows, plus the extremely high chance of bigger browsers doing the same thing. Not the people switching to Edge specifically because they wanted DNT.

> The obvious answer is a feature that was actually browser enforceable similar to today's Firefox's Enhanced Tracking Protection (which is good enough a lot of "publishers" call Firefox itself an "ad blocker" today, despite the fact that it blocks no ads just trackers), on by default and with an "opt out of privacy" model rather than an "opt in to privacy" wish-it-were.

I'm not convinced that a feature like that is anywhere near as effective. I don't think we're going to have a good solution without legally mandating US servers respect something like DNT or GDPR.


> I'm pretty sure what spooked them was all the people that were going to be on Edge because it comes with Windows,

It was barely 5%-10% of web traffic at the time, it was a minority browser. Statistically no one was using the "default Windows browser" for anything other than "downloading Chrome" for more than a decade by the point that happened. Overall, Windows is itself at something of a perpetual "full market saturation" any given year. There generally aren't massive waves of new Windows users "to be afraid of" and when there are the pipeline of "Google tells people they need Chrome to use the web" convincing new users that they have to have Chrome still seems to be going very strong.

There were no "potential users" to be afraid of. That said, 5-10% of web traffic was enough to get reported as impacting the bottom line in quarterly earnings to shareholders and Occam's Razor clearly suggests that they were worried about existing users of Edge. I do think that would have been a shutdown triggering amount of web traffic no matter how long it took to get to that point and whether or not it was through a browser making it a default or any other means of growth "organic or not".

> Not the people switching to Edge specifically because they wanted DNT.

It was the only notable reason for Edge usage to go up even a fraction of a point in that time period. There weren't major new features. There weren't major new Windows versions or massive PC sales. DNT maybe only influenced "a few dozen" people, but there was a small spike and it did seem to get noticed.

> plus the extremely high chance of bigger browsers doing the same thing.

What "bigger browsers"? At the time Firefox and Safari both had shares of web traffic as small as Edge's. The only browser that was "bigger" in this time was Chrome, and I'm sure they felt pretty safe that Chrome wouldn't do it. They might be afraid that Chrome not doing it might have actually pushed Chrome's huge userbase to consider other browsers for the first time in a decade. If Edge did have a noticeable spike in that time, it would have been for that reason most likely, and that may have been scary to the hegemony.

> I'm not convinced that a feature like that is anywhere near as effective.

I really have had so many websites tell me that "Firefox is an adblocker" despite it blocking zero ads only trackers that I feel it is very effective in practice. Some of these websites (or their ad networks) had marketing teams build entire cute little multi-step animations to show you how to either 1) download Chrome like a good sheep, or 2) click through Firefox's "are you sure you want to white list this awful tracker?" privacy warnings. Anecdotally, from usage experience it's clearly far more effective than either DNT in the brief period when it was working or GDPR have been so far.

(The GDPR adds enforceable penalties, sure, but it doesn't enforce it in the user's own browser and the enforcement has the usual delays that a violation would need to be caught and sent to a court to be enforced. We absolutely need more laws in the US like the GDPR, but that still isn't the best solution because it only fixes things after the fact. We also still need strong "before the violation occurs" tools in the browser to find them/enforce them on the user's behalf in the first place because tools like GDPR need time and government intervention to be enforced.)


Yes, but it's not hard to ignore DNT on Microsoft user agents, which are a small part of the population.
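For illustration, the server-side check being described is only a few lines. The `DNT: 1` request header is real; `treats_as_do_not_track` and the token list are hypothetical names, and the User-Agent strings below are invented.

```python
def treats_as_do_not_track(headers: dict[str, str],
                           default_on_ua_tokens: tuple[str, ...]) -> bool:
    """Return True only when the request carries DNT: 1 AND the User-Agent
    does not match a browser the server has decided to disregard."""
    # Header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    if normalized.get("dnt", "").strip() != "1":
        return False  # no opt-out expressed
    user_agent = normalized.get("user-agent", "")
    # The signal is discarded for any UA the server assumes sets it by default.
    return not any(token in user_agent for token in default_on_ua_tokens)


ignored = ("DefaultOnBrowser",)  # hypothetical UA token
respected = treats_as_do_not_track(
    {"DNT": "1", "User-Agent": "ExampleBrowser/1.0"}, ignored)
discarded = treats_as_do_not_track(
    {"DNT": "1", "User-Agent": "DefaultOnBrowser/2.0"}, ignored)
```

Nothing on the browser side can detect or prevent the second case, which is the sense in which the header has no technical force.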


which were a large part of the population at the time.


Yes, I wish they would engage with how this fits into the rest of the Privacy Sandbox proposal (https://www.chromium.org/Home/chromium-privacy/privacy-sandb...). My understanding is it's:

1. Move entropy from "you get it by default" to "you have to ask for it".

2. Add new APIs that allow you to do things that previously exposed a lot of entropy in a more private way.

3. Add a budget for the total amount of entropy a site is allowed to get for a user, preventing identifying users across sites through fingerprinting.

Client hints are part of step #1. Not especially useful on its own, but when later combined with #3 sites now have a strong incentive to reduce what they ask for to just what they need.
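A toy sketch of how #1 and #3 might fit together, to show why asking explicitly matters for a budget. Everything here is an assumption for illustration: the `EntropyBudget` class, the per-surface bit costs, and the simplification that costs just add up (which treats surfaces as independent).

```python
class EntropyBudget:
    """Hypothetical per-site budget: each newly revealed surface costs some
    number of bits, and requests are refused once the limit would be exceeded."""

    def __init__(self, limit_bits: float) -> None:
        self.limit_bits = limit_bits
        self.spent: dict[str, float] = {}        # site -> bits revealed so far
        self.revealed: dict[str, set[str]] = {}  # site -> surfaces already granted

    def request(self, site: str, surface: str, cost_bits: float) -> bool:
        seen = self.revealed.setdefault(site, set())
        if surface in seen:
            return True  # already revealed, so no new information leaks
        spent = self.spent.get(site, 0.0)
        if spent + cost_bits > self.limit_bits:
            return False  # over budget: the browser declines
        self.spent[site] = spent + cost_bits
        seen.add(surface)
        return True


# Made-up costs, purely to exercise the logic.
budget = EntropyBudget(limit_bits=6)
results = [
    budget.request("shop.example", "platform-version", 3),
    budget.request("shop.example", "model", 4),             # 3 + 4 > 6, refused
    budget.request("shop.example", "arch", 2),
    budget.request("shop.example", "platform-version", 3),  # repeat, free
]
```

A site that asks for everything up front burns its budget on things it may not need, which is the incentive described above. With an always-sent header there is nothing to meter.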

(Disclosure: I work on ads at Google, speaking only for myself)


I think pretty much all browsers and a lot of web platforms made it clear in their response to FLoC that everyone except Google (and Twitter, I guess?) considers Privacy Sandbox to be harmful as a whole.


Objections to FLoC are basically about what should be included in #2. I don't understand why people would be opposed to #1 or #3 though?


It's a fundamental disagreement on the very idea:

Google's position is that it's okay for a website to know X amount of data about a user, you know, as long as it doesn't, in total, cross the creepy line.

Everyone else's position is that if the data isn't required to operate, you don't need it. If we accept that the User Agent, as it is going to be frozen, is going to be served anyways to avoid breaking the legacy web, very little of this proposal adds value, and much of it adds harm. It isn't practical to move to not serving the User Agent, so any replacement for the data in it is pointless at its very best. The frozen UA provides enough to determine if someone is mobile, the only real need for UA strings. And when most browsers are looking at reducing the tools for websites to fingerprint, Google is introducing new ones.

So Firefox's position on Privacy Sandbox as a whole is pretty logical: If it's optional enough to be requested, why offer it at all? The entire premise of Privacy Sandbox is that it wants sites to have access to some amount of information about the user, and the position of every non-Google-browser is that they want to give sites as close to no data at all as possible.

This is the core of the problem with a single company being legally permitted to operate a web browser and an ad company. Every single browser developer that doesn't own an Ads and Analytics suite is opposed to Privacy Sandbox.


> Google's position is ... Everyone else's position is...

I don't think this categorization is accurate. For example, Apple built https://webkit.org/blog/8943/privacy-preserving-ad-click-att...

> if the data isn't required to operate, you don't need it

This is simple, but it's also wrong. Some counterexamples:

* Learning from implicit feedback: dictation software can operate without learning what corrections people make, or a search engine can operate without learning what links people click on, but the overall quality will be lower. Each individual piece of information isn't required, but the feedback loop allows building a substantially better product.

* Risk-based authentication: you have various ways to identify a user, some of which are more hassle for them than others. A login cookie is lowest friction, asking for a password adds more friction, email / SMS / OTP verification add even more. You don't want to ask all users to go through the highest-friction approach on every pageview, but you also don't want to let a fraudster who gets access to someone's cookiejar/leaked password/old device/etc impersonate the user. If you have a small amount of information about the current user's browsing environment, in a way that's hard for a fraudster to imitate, you can offer much lower friction for a given level of security.

* Incremental rollouts: when you make changes to software that operates in complex environments it can be very difficult to ensure that it operates correctly through testing alone. Incremental rollouts, with telemetry to verify that there are no regressions or that relevant bugs have been fixed, produces better software. You're writing as if your position is Firefox's but even they collect telemetry by default: https://support.mozilla.org/en-US/kb/telemetry-clientid
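As a sketch of the risk-based authentication bullet: the signals, the equal weighting and the thresholds below are all hypothetical, picked only to show how a little environment information lets most users skip the high-friction steps.

```python
from dataclasses import dataclass


@dataclass
class LoginContext:
    has_valid_cookie: bool  # lowest-friction identifier
    known_device: bool      # browsing environment matches a previous session
    usual_country: bool     # coarse location matches the account's history


def required_challenge(ctx: LoginContext) -> str:
    """Map a crude risk score to the least friction that still seems safe."""
    risk = 0
    if not ctx.has_valid_cookie:
        risk += 1
    if not ctx.known_device:
        risk += 1
    if not ctx.usual_country:
        risk += 1
    if risk == 0:
        return "none"      # cookie alone is enough
    if risk == 1:
        return "password"  # one anomaly: ask for the password
    return "otp"           # several anomalies: email / SMS / OTP step-up


# A stolen cookie replayed from an unfamiliar device in an unusual country
# still triggers the strongest challenge.
stolen_cookie = required_challenge(LoginContext(True, False, False))
```

Without the `known_device` and `usual_country` signals, the only safe options are to challenge everyone or to trust the cookie alone.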

> the position of every non-Google-browser is that they want to give sites as close to no data at all as possible ... Every single browser developer that doesn't own an Ads and Analytics suite is opposed to Privacy Sandbox.

I cited Apple's conversion tracking API above, but another example of this general approach is Microsoft's https://github.com/WICG/privacy-preserving-ads/blob/main/Par... I don't know where you're getting that they're trying for "close to no data at all", as opposed to improving privacy and preventing cross-site tracking?

(Still speaking only for myself)


> Learning from implicit feedback: dictation software can operate without learning what corrections people make, or a search engine can operate without learning what links people click on, but the overall quality will be lower. Each individual piece of information isn't required, but the feedback loop allows building a substantially better product.

That sounds cool. How do I opt into it?


I would highlight that both Microsoft and Apple (to a lesser extent, mind you) also operate their own ad platforms. Don't get me wrong, I'd be happy to see a blanket ban on web browsers and ad companies being related, and have it apply to all three. I'm an equal opportunity antitrust breakup advocate. ;)

Regarding risk-based authentication, I see a lot of value in it, but I think the cost may be too high, and often the less robust methods it uses are a poor metric anyway. I gave an example elsewhere that someone might be using a wired PC and a wireless phone on two different carriers with vastly different user agents at the same time, for instance.

I think there's some merit in some very rough Geo-IP based RBA, but I'm not sure how many other strategies for that I find effective. The fact that Outlook and Gmail seem equally happy to let an account that has never signed in from outside the United States log in from Nigeria seems like low-hanging fruit in the risk-based authentication space. ;)


> I would highlight that both Microsoft and Apple (to a lesser extent, mind you) also operate their own ad platforms.

Do you mean that before when you said "every single browser developer that doesn't own an Ads and Analytics suite" you meant to exclude nearly all the browser vendors? Google, sure, but also Apple, and Microsoft. And then Opera, UC Browser, Brave, DDG, ... I think maybe everyone but Mozilla and Vivaldi has an ads product?


Perhaps it would be best to say that companies' support for privacy in web browsers is inversely correlated with their dependence on ad revenue. So Google is worse than Microsoft, which is worse than Apple, etc. I think it'd be fair to assume that if you gave all three a choice to keep their ad products or their browser, Google would keep ads, and both Microsoft and Apple would keep their browsers, because of their relative value to their core business.


IMHO #3 is fundamentally flawed as I just can't imagine browsers improving to a point where you couldn't cross-reference such "fixed" entropy budgets to clearly identify the user.

The only IMHO reasonable technical solution is to reduce entropy as much as possible, even below any arbitrary set entropy limit.

Though in the end I think the right way is an outright (law-based) ban on micro-targeting and on collecting anything but strongly, transparently and decentrally anonymized metrics.

Also I don't see Google fully pulling through, e.g. one area where Chrome is massively worse than Firefox wrt. entropy is the canvas (at least last time I checked). It's an area where there are known reliable ways to strongly hinder fingerprinting of the canvas. But I don't see Google using them as it would be in conflict with Flutter Web rendering animations in the canvas (which inherently has problems and is technically sub-par compared to how the browser could render web animations (and does in case of Firefox)).


There are really only two ways this can go:

A. Browsers successfully reduce available entropy to where users cannot reliably be tracked across sites.

B. Browsers fail at this, and widely available JavaScript libraries allow cross-site tracking. If it's possible to extract enough bits, they will be extracted.

The thing is, if you can't get all the way to (A) then in removing bits you're just removing useful functionality and adding work for browser developers and web developers. Fighting fingerprinting is only worth it if you have a serious chance of getting to (A).
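
To put rough numbers on "enough bits", here is a back-of-the-envelope sketch in Python. The shares below are made up for illustration, not measured, and adding the bits together assumes the attributes are independent, which overstates the total when they are correlated.

```python
import math
from typing import Mapping

def surprisal_bits(share: float) -> float:
    """Identifying bits revealed by a value that a fraction `share` of users have."""
    if not 0.0 < share <= 1.0:
        raise ValueError("share must be in (0, 1]")
    return -math.log2(share)

def bits_to_single_out(population: int) -> float:
    """Bits needed to pick one user out of `population` equally likely users."""
    return math.log2(population)

# Hypothetical shares of users reporting the same value for each attribute.
EXAMPLE_SHARES = {
    "platform": 1 / 4,
    "browser_major_version": 1 / 8,
    "device_model": 1 / 1024,
}

def total_bits(shares: Mapping[str, float]) -> float:
    """Sum of per-attribute bits, assuming the attributes are independent."""
    return sum(surprisal_bits(share) for share in shares.values())
```

With these made-up shares, `total_bits(EXAMPLE_SHARES)` comes to 15 bits, while `bits_to_single_out(2**33)` is 33, so on this toy accounting those three attributes alone would not single anyone out. The question for (A) is whether the sum over everything a site can read stays below that line.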

If you think (A) is off the table then I agree a regulatory solution is the best option. Even then, #1, as exemplified by UACH, is still helpful because it makes tracking more visible. If every piece of information you collect requires active work, instead of just receiving lots of bits by default, then it's much easier for external organizations to identify excessive collection.

(Still speaking only for myself)


Why not both (A) and a regulatory solution? I see no reason to avoid the regulatory route.


Legislation prohibiting fingerprinting would be great!

(Though potentially a bit tricky to craft and enforce)


Well, if the browsers can just deny those requests, then they can just drop the information entirely. (And they are dropping them from the UA.)

Of the two non-harmful pieces, one is of interest to all sites, and the other has a broken implementation in Chrome, so sites will have to use an alternative mechanism anyway. If there's any value in the idea, Google can propose it again with a set of information that brings value, instead of just fingerprinting people.


I think the idea is that there are some legitimate uses for UA information that they don't want to eliminate entirely, otherwise yeah they could just deprecate the User-Agent header and be done with it.


Yes, I got that from your post. It's just that for Google, proposing it again with harmless content is very easy, but for anybody else to filter the bad content once the Google proposal gets accepted is almost impossible. (Although, if I were working on Firefox, I would just copy the most common data from Chrome, adjusting for those 2 fields that matter. That would create problems, but it's the least problematic choice.)

So, no, it should be rejected. Entirely and severely. It doesn't mean that contextual headers are a bad practice, it's just that this one proposal is bad.


I think most of the legitimate uses could be solved in a simple statement: Let sites know whether the device is mobile or desktop, and then expect websites to send all of the logic to handle the rest client-side, so the server does not need to know.

I'd love to see browser metrics being absolutely devastated as an analytic source: It just is used today as an excuse to only support Chrome.


Risk-based authentication can use a change in user agent as an increased risk factor.


It could, but as someone who has spoofed user-agents in the past (primarily to get Chrome-only websites to cooperate) I would prefer if it wouldn't. If the baddies can snoop my https traffic or directly copy the auth cookies from my machine then also copying my user-agent isn't that big of a step for them. One might argue that detecting changes in user agents could be part of some kind of defense in depth strategy, but as a user I imagine I'm already so boned in that scenario that I doubt it would save me. So overall such a mechanism would bring me more inconvenience than security.


That's the whole point of RBA, though. That two requests have the same user agent doesn't tell me much, but if you have two different user agents from two different IPs that may sound really risky (use case dependent, of course).
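
A toy sketch of that idea in Python, purely for illustration: the `RequestContext` type, the function name and the weights are all made up, and a real system would weigh many more signals.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RequestContext:
    ip: str
    user_agent: str

def risk_score(previous: RequestContext, current: RequestContext) -> int:
    """Toy additive score; the weights are arbitrary assumptions."""
    score = 0
    if previous.ip != current.ip:
        score += 1
    if previous.user_agent != current.user_agent:
        score += 1
    if score == 2:
        # Both changed at once: treat the combination as riskier than the sum.
        score += 1
    return score
```

Identical requests score 0, a lone user agent change scores 1, and a simultaneous IP and user agent change scores 3. What to do at each level (nothing, step-up auth, block) is the use-case-dependent part.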


Unless someone is sitting at their desktop computer with their phone connected to 4G...

Privacy initiatives will probably make some risk-based authentication tricks break, but they probably weren't robust methods anyways.


>By making sites request this information rather than simply always sending it like the User-Agent header currently does, browsers gain the ability to deny excessively intrusive requests when they occur.

Browsers can just not send a UA header


I tried this. It breaks a surprisingly large number of sites (or perhaps not-so-surprisingly), and good luck trying to beat Google's captcha without a User-Agent header.


Good luck trying to beat ReCaptcha if you're doing anything that puts you outside of the normal web browser behavior as imagined by Google's Algorithm.

If User Agent Client Hints become the new normal, I'm sure anyone excessively denying requests will be flagged in the same way.


Having to request it is a terrible idea to begin with. If I want to use different templates for mobile vs desktop, I need to know, on the backend, whether the device is a mobile device, and I need it on the very first request. Having to request these headers explicitly is an unnecessary complication that would slow down the first load.

However it is nice that there's now a separate header that gives a yes or no answer on whether it's a mobile device.
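
For what it's worth, a minimal sketch of the backend side in Python. In the draft the value is a structured-field boolean, `?1` or `?0`. The template names and the desktop fallback are my own assumptions, and the headers are assumed to arrive as a mapping with lower-cased names.

```python
from typing import Mapping

def pick_template(headers: Mapping[str, str]) -> str:
    """Choose a template from Sec-CH-UA-Mobile; names are hypothetical."""
    value = headers.get("sec-ch-ua-mobile")
    if value is None:
        # Header missing (older or non-supporting client): assumed fallback.
        return "desktop.html"
    return "mobile.html" if value.strip() == "?1" else "desktop.html"
```

The fallback branch is the part that matters for the first request: if the header is not there yet, the server still has to pick something.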


Why would you need different templates for mobile/desktop? CSS is quite capable of responding to any screen orientation.


Yes it is. Except you can't use the same markup for both because the input devices, and thus interaction paradigms, are so radically different. Mice are precise and capable of hovering over things, so it makes sense to pack everything densely and add various tooltips and popup menus. Touchscreens are imprecise and don't have anything resembling hovering, so UI elements must be large, with enough padding around them, and with menus appearing on click.


Between CSS Flexbox and CSS Grid there shouldn't be any reason today that you can't handle 100% of those differences with the same markup and media stylesheets. (There's also obviously JS if you really must contort the HTML DOM to get what you want.)


You're not wrong. However, there are times when CSS isn't enough. For example:

- The Mobile vs Desktop design differences are too great.

- The site was originally created without considering mobile, and retrofitting mobile support is unfeasible.


Can you expand on the design differences?


"By making sites request this information rather than simply sending it like the User-Agent header currently does..."

This is also true with respect to SNI which leaks the domain name in clear text on the wire. The popular browsers send it even when it is not required.

The forward proxy configuration I wrote distinguishes the sites (CDNs) that actually need SNI and the proxy only sends it when required. The majority of websites submitted to HN do not need it. I also require TLSv1.3 and strip out unnecessary headers. It all works flawlessly with very few exceptions.

We could argue that sending as much unnecessary information as popular browsers do, when it is not technically necessary for the user, is user-hostile. It is one-sided. "Tech" companies and others interested in online advertising have been using this data to their advantage for decades.


How would this work?

SNI is sent by the client in the initial part of the TLS handshake. If you don't send it, the server sends the wrong/bad cert. The client could retry the handshake using SNI to get the correct cert but:

- This adds an extra RTT, on the critical path of getting the base HTML, hurting performance.

- A MITM could send back an invalid cert, causing the browser to retry with SNI, leaking it anyway (since we aren't talking about TLS 1.3 and an encrypted SNI).

I suppose the client could maintain a list of sites that don't need SNI, like the HSTS preload list, but that seems like a ton of overhead to avoid sending unneeded SNI, especially when most DNS is unencrypted and would leak the hostname just like SNI anyways.


"I suppose the client could maintain a list of sites that don't need SNI."

That list would be much larger than the list of sites that do require SNI.

Generally, I can determine whether SNI is required by IP address, i.e., whether it belongs to a CDN that requires SNI. Popular CDNs like AWS publish lists of their public IPs. I use TLSv1.3 plus ESNI with Cloudflare but they are currently the only CDN that supports it. Experimental but works great, IME.

The proxy maintains the list not the browser. The proxy is designed for this and can easily hold lists of 10s of 1000s of domains in memory. That's more domains than I visit in one day, week, month or year.
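
For illustration only, the IP-based check could look roughly like this in Python. This is not the actual proxy configuration: the function name is made up and the networks are documentation prefixes standing in for a CDN's published ranges.

```python
import ipaddress

# Placeholder networks (documentation prefixes), standing in for the
# published IP ranges of CDNs that are known to require SNI.
SNI_REQUIRED_NETWORKS = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]

def needs_sni(ip: str) -> bool:
    """Default is no SNI; send it only for addresses inside a listed network."""
    address = ipaddress.ip_address(ip)
    # Membership tests across IPv4/IPv6 simply evaluate to False.
    return any(address in network for network in SNI_REQUIRED_NETWORKS)
```

Anything not matched stays on the no-SNI default, and a site that then fails certificate verification becomes a candidate for the list.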

It is not a question of whether this is possible. "How would this work?" I have already implemented it. It works. It is not difficult to set up.

Why this works for me and would be unlikely to work for others:

I am not a heavy user of popular browsers, I "live on the command line". Installing a custom root certificate with appropriate SANs to suppress browser warnings is a nuisance that would likely dissuade others since they are heavy users of those programs. However I generally do not use those browsers to retrieve content from the web.


Ahhh. I see, you are default-no-SNI, and whitelist those that do.

If your threat model is such that you absolutely positively cannot leak the signal of what domain names you want to make HTTPS connections to, then I suppose this is an approach that can be used. But if you believe that is your threat model, I imagine you have bigger issues to protect against. As you say, it's unlikely to work for others.


No "threat model" here, just a dissatisfaction with so-called "modern" browsers and TLS extensions that disproportionally benefit hosting companies over users (privacy in this case). Plus I genuinely prefer commandline TCP clients and text-only browser to read HTML for most web use. I like the speed, reliability and more uniform presentation I get across all web sites. I like text. Big browsers that do everything under the sun written by people working for "tech" companies funded by advertising are not interesting to me. In fact, I find them annoying.

Some folks write "browser extensions" to control graphical browsers to their liking. I generally do not use graphical javascript-enabled browsers; I prefer to use a different program, a proxy, to control the browser. It works with both graphical browsers and text-only ones.


I don't think you can ever determine that a site doesn't need SNI using HTTP alone. All you can have is that it doesn't or you don't know.


I do not use "HTTP alone", I use DNS, more specifically IP address. I generate lists. The lists are largely based on the hosting provider and created automatically, but I also edit them manually when necessary, which is the exception not the rule. Most sites requiring SNI that are submitted to HN all use the same CDNs: AWS and Cloudflare. The SNI list is dominated by sites hosted on AWS. The ESNI list is all sites hosted on Cloudflare.

When I first started developing this workaround I thought I would be manually editing the SNI list constantly for "all those random sites that use SNI". This has not been the case. For the sites submitted to HN, use of SNI is mostly a CDN phenomena.

The important point here is that I do not send SNI by default. The default is privacy-by-design: no SNI. If I encounter a site that fails because it needs SNI, I add it to the list. The failure is caught by the proxy (the proxy verifies certificates, I do not rely on the browser), the SSL error is visible in the logs, and the error page the browser receives is a custom one I created myself that tells me where in the configuration the failure occurred. I can test whether a site requires SNI very quickly.

Popular browsers cannot do this, we know that. If they could, I would not be coming up with workarounds. They routinely send more data than is needed, including SNI. That is the point of the original comment.


s/phenomena/phenomenon/


> "User Agents MUST return the empty string for model if mobileness is false. User Agents MUST return the empty string for model even if mobileness is true, except on platforms where the model is typically exposed." (quoted from https://wicg.github.io/ua-client-hints/#user-agent-model)

Honestly now - who drafts and approves these specs? Not only does it make no sense whatsoever to encode such information this way - it also results in unimaginable amounts of bandwidth going to complete waste, on a planetary scale.

This is just plain incompetence. How did we let the technology powering the web devolve into this burning pile of nonsense?


Drafts: Google

Approves: no one.

Chrome just releases them in stable versions with little to no discussion, and the actual specs remain in draft stages.

Edit: grammar


Why/how does this waste bandwidth? These are opt-in, so they are only sent if requested.

I mean sure http being plaintext is silly but that's not down to the authors of this particular rfc.


> "These are opt-in, so they are only sent if requested."

Are google.com, youtube.com, netflix.com, facebook.com, amazon.com and reddit.com going to ask for User Agent Client Hints? If they're going to (which is more than likely, let's not kid ourselves) - I don't see how your point holds?

> "Why/how does this waste bandwidth?"

Based on the current proposal - non-mobile browsers or browsers that simply do not wish to expose the specific model are somehow required to return the following header in response:

    Sec-CH-UA-Model: ""

Those are 19 absolutely useless bytes. Wouldn't it make more sense to simply omit the header from the response altogether? It would convey the exact same information to the server ("my Sec-CH-UA-Model is empty"), without the overhead of sending additional data.
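
The byte count is easy to check in Python. This counts the header line as plain text; the CRLF terminator that an HTTP/1.1 message adds is counted separately.

```python
# The header line exactly as shown above, as bytes.
HEADER_LINE = b'Sec-CH-UA-Model: ""'

# Length of the field name, colon, space and the empty quoted string.
SIZE_WITHOUT_TERMINATOR = len(HEADER_LINE)

# HTTP/1.1 ends every header line with CRLF, which adds two more bytes.
SIZE_WITH_CRLF = len(HEADER_LINE + b"\r\n")
```

`SIZE_WITHOUT_TERMINATOR` is 19 and `SIZE_WITH_CRLF` is 21.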


> Are google.com, youtube.com, netflix.com, facebook.com, amazon.com and reddit.com going to ask for User Agent Client Hints?

Possibly, but they only need them on initial request I think.

> It would convey the exact same information to the server ("my Sec-CH-UA-Model is empty"), without the overhead of sending additional data.

It doesn't convey the exact same info though. This says "the client is aware of this scheme and doesn't reply" vs. "the client is unaware of the RFC". It's the true/false/none issue.

In a sane world, there would be a much shorter way to encode that, but HTTP is a bad protocol so you can't nest/namespace things or whatever.


> "Possibly, but they only need them on initial request I think."

I don't see how this is going to work.

If this is data the backend needs the client is either going to have to send it out on every request (like it does today with the User-Agent header) - or the client will only send it out once, and the response would be cached by the backend somehow. The requirement to cache this data on the backend is non-trivial to implement and is a huge paradigm shift. That's actually another pitfall with the RFC in a sense.

> It doesn't convey the exact same info though. This says "the client is aware of this scheme and doesn't reply" vs. "the client is unaware of the RFC". It's the true/false/none issue.

This doesn't make any sense to me whatsoever.

The "scheme" is what we make it out to be. If the scheme would allow for omitting headers (as it should have) - then the client would be in compliance with the scheme, wouldn't it?

By designing an RFC which allows for omitting headers we actually resolve the ambiguity between "false" and "none": the data is either explicitly there, or it's not.

I'm simply claiming that the scheme as currently proposed is nonsensical and negligent. At Chrome's scale - I expect each proposal to introduce more headers and data to every request to be very carefully weighed against impact on global bandwidth consumption and resource utilization (and energy use). I don't feel like any of that is reflected in this RFC, which is an absolute shame and a big part of the problem: we keep piling up more crap thinking "it doesn't really make a difference". I think it reflects poorly on the people involved with designing and building this.

Perhaps we should mandate providing these numbers explicitly with every RFC so that we can know how much the implementation actually "costs" in terms of bandwidth, memory, etc at the web's scale. This should help identify RFCs like this one which are seriously off on the engineering side of things.


> By designing an RFC which allows for omitting headers we actually resolve the ambiguity between "false" and "none": the data is either explicitly there, or it's not.

The point is that a server may care about the difference between "I am explicitly opting out of giving you this information" and "What are you talking about".
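
As a sketch of what that distinction looks like to server code, in Python: the function name and the three labels are my own, and the headers are assumed to arrive as a mapping with lower-cased names.

```python
from typing import Mapping

def classify_model_hint(headers: Mapping[str, str]) -> str:
    """Tell apart a missing header, an explicitly empty one, and a real value."""
    raw = headers.get("sec-ch-ua-model")
    if raw is None:
        # Nothing sent: the client may not know the scheme at all.
        return "absent"
    if raw.strip() == '""':
        # Sent, but as the empty string: the client knows the scheme.
        return "empty"
    return "model"
```

If the header could simply be omitted, the first two branches would collapse into one.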

Also keep in mind that Chrome, when communicating with Amazon, Google, or Netflix probably uses HTTP/2 or HTTP/3, which would compress the headers, making much of your complaint moot. I'd expect `Sec-CH-UA-{foo}: ""` to compress very well.


The bar for creating a wicg draft is _very_ low. Things in that space are not "specs" that are "approved" in any way.


I would rather have all this information (along with whatever is being inferred from them) be exposed through a Javascript API instead of having browsers indiscriminately flood global networks with potential PII.

Chrome came up with this? Figures. Stay evil, Google.


Can you explain the attack vector where encrypted HTTPS network traffic is vulnerable but a JS API isn't?


Your browser opens an encrypted connection to somewhere you don't want it to (e.g. loads an image or iframe, JS not required). How many connections and resources does a normal web page load? 100? More? Almost nobody has time to audit all of them. Not technically inclined? You're screwed.

My secondary concern is that there would be more traffic going around the internet that isn't being used 99+% of the time.


There is a JS companion to this proposal that splits up the information in a similar way

https://wicg.github.io/ua-client-hints/#interface


A JavaScript API has been considered as a replacement for the user agent string, but it has two big downsides:

1) JavaScript must be enabled. If it's not, then the server can't get any of the user agent data - at all.

2) The server won't get the user agent data until after it has already responded to the first request it receives from a client. That makes it a lot less useful overall. Having to load a page, then perhaps redirect the user using JS based on what the JS API says is a bit untidy.


Serving different content for the same URI based upon various metadata fields in the request goes completely against the spirit of a URI.


No it doesn't? Ever heard of Accept or Lang headers? Or cookies for that matter? Dynamic content?


Agreed, and thanks for bringing up the Accept header. The author seems uninformed about HTTP's built-in Content Negotiation. They write about servers using the User-Agent header, specifically talking about WebP. Accept: "image/webp" works just fine for the major CDNs regardless of the UA.
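
A minimal sketch of that negotiation in Python. It is deliberately naive: it ignores q-values and wildcards, and the JPEG fallback and function name are assumptions for the example.

```python
def pick_image_type(accept_header: str) -> str:
    """Serve WebP only when the client lists it in Accept; otherwise JPEG."""
    offered = {
        # Drop parameters such as ";q=0.8" and normalise case and spacing.
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    }
    return "image/webp" if "image/webp" in offered else "image/jpeg"
```

A response chosen this way should also carry `Vary: Accept` so caches keep the variants apart.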


Accept and content negotiation have a long-established history, and content negotiation is different than the server making decisions based upon metadata.

It's one thing for the client to say "give me this resource in this format"; it's another for the server to say "oh you're coming from version X.Y of OS Z, I know what you really want."


This is unfortunately the world of web apps, where a URI just gets you to the app, and the content within is dynamic.


Even with web apps, you can serve the same app from the same URI. URI doesn't imply static content.

Serving a slightly different web app from the same URI based upon other random metadata, on the other hand, makes caching all the more complicated.


I get that. I do think by and large, the user's agent (the browser) should be making display and format decisions based on itself, rather than the server serving different content. Though I think the exception is mobile, where we probably shouldn't serve the client endless garbage it doesn't need.

I mostly think the replacement for user agent should be a boolean of mobile or not mobile. And everything else should be dynamically handled by the client.


Honestly though, if it's enough content for mobile, it's enough content for desktop as well.

The "garbage" we don't want to serve mobile, is often also garbage for desktop, autoplay videos, too many tracking scripts, etc. If we force people to optimize their site for mobile and desktop then maybe we'll actually get good desktop sites.


Eh, navigation layout should definitely be different for mobile, and we shouldn't ship the desktop navigation to phone browsers, and I still think it's reasonable to offer phones smaller/more compressed image sizes and stuff by default.

I agree tracking scripts and the like should be blocked and removed across the board. But I think there's probably a suitable amount of visible UI and content that should be shipped differently or less to phones, because of how they're interacted with.


I hear you, but I'd wager the size differences are actually pretty minor. Absolute worst case you have 2X the CSS and HTML but much will be redundant so it will probably compress well with gzip.
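
A quick way to sanity-check that intuition in Python, using zlib (the same DEFLATE algorithm gzip uses). The markup is made up and far more repetitive than a real page, so treat it as a sketch of the effect rather than a measurement.

```python
import zlib

def deflated_size(markup: bytes) -> int:
    """Size in bytes after zlib (DEFLATE) compression at the highest level."""
    return len(zlib.compress(markup, 9))

# Made-up desktop markup: one navigation block repeated to stand in for a page.
DESKTOP = b'<nav class="menu"><a href="/">Home</a><a href="/about">About</a></nav>\n' * 20

# Hypothetical mobile variant that differs only in a class name.
MOBILE = DESKTOP.replace(b'class="menu"', b'class="menu mobile"')

# Worst case from above: ship both variants in one response.
COMBINED = DESKTOP + MOBILE
```

`COMBINED` is more than twice the raw size of `DESKTOP`, but `deflated_size(COMBINED)` comes out well under twice `deflated_size(DESKTOP)`, because the second copy is mostly back-references to the first.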


> UA Client Hints proposes that information derived from the User Agent header field could only be sent to servers that specifically request that information, specifically to reduce the number of parties that can passively fingerprint users using that information. We find that the addition of new information about the UA, OS, and device to be harmful as it increases the information provided to sites for fingerprinting, without a commensurate improvements in functionality or accountability to justify that. In addition to not including this information, we would prefer freezing the User Agent string and only providing limited information via the proposed NavigatorUAData interface JS APIs. This would also allow us to audit the callers. At this time, freezing the User Agent string without any client hints (which is not this proposal) seems worth prototyping. We look forward to learning from other vendors who implement the "GREASE-like UA Strings" proposal and its effects on site compatibility.

https://mozilla.github.io/standards-positions/#ua-client-hin...


I'm late to the ballgame, but what does "Sec-" mean as a HTTP header prefix anyway? I am failing at googling.


It means the browser is in control of the header, and not some script. From https://datatracker.ietf.org/doc/html/rfc8942 :

   Authors of new Client Hints are advised to carefully consider whether
   they need to be able to be added by client-side content (e.g.,
   scripts) or whether the Client Hints need to be exclusively set by
   the user agent.  In the latter case, the Sec- prefix on the header
   field name has the effect of preventing scripts and other application
   content from setting them in user agents.  Using the "Sec-" prefix
   signals to servers that the user agent -- and not application content
   -- generated the values.  See [FETCH] for more information.

As near as I can tell, the bit they're talking about in the Fetch standard is just this:

    These are forbidden so the user agent remains in full control over them. 
    Names starting with `Sec-` are reserved to allow new headers to be minted 
    that are safe from APIs using fetch that allow control over headers by 
    developers, such as XMLHttpRequest.
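
In other words, a script-facing API refuses header names with that prefix. A rough sketch of just the prefix rule in Python (the function name is made up, and the Fetch standard's full list of forbidden headers covers more than this):

```python
def is_sec_prefixed(header_name: str) -> bool:
    """True if the name starts with "Sec-", compared case-insensitively."""
    return header_name.lower().startswith("sec-")

def script_may_set(header_name: str) -> bool:
    """Sketch: scripts may not set Sec- headers; other rules are not modelled."""
    return not is_sec_prefixed(header_name)
```

So `script_may_set("Sec-CH-UA-Model")` is false while `script_may_set("X-Custom")` is true, which is what lets a server trust that a `Sec-` value came from the browser itself.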


Does it stand for something? Why the letters 'Sec'?


I don't think I've ever seen it called out, but I always assumed it's "Secure" in the sense it hasn't been modified by a script.

But that's 100% a guess on my part.


Great, so now we have the HttpOnly flag for cookies which differs from the Secure flag for cookies, while the Secure in the Sec headers has the same meaning as HttpOnly.


And we have SameSite in Cookies, and Allow-Origin in headers!


I hope they avoid situations like the SameSite=None debacle[0] if they are going to freeze the User Agent header and not provide an alternative.

The assertion of Mozilla seems to be:

>At the time sites deploy a workaround, they can’t necessarily know what future browser version won’t have the need for the workaround. Can we guarantee only retrospective use? Do Web developers care enough about retrospective workarounds for evergreen browsers?

When there are significant numbers of users on devices like iPads that don't get updated any more, you can't rely on "evergreen browsers".

[0] - https://www.chromium.org/updates/same-site/incompatible-clie...


> Sec-CH-UA-Model provides a lot of identifying bits on Android and leads...

intentional?


Is there a typo or a pun or something I'm not seeing?

Knowing the exact make and model of an Android device is a lot higher entropy than knowing the exact make and model of an iPhone.


> I'm not sure why you used such an old Chrome version to test this.

That quote from the first comment on the issue is just a cherry on top.

Chrome 88 was released in December 2020. 7 months ago.


I'm going to cut them some slack since December 2020 feels both 2 weeks and 4 years ago.


Because when you’re implementing a new spec that is still in “draft” status and constantly being updated, things could have changed drastically in 7 months and 4 major versions?


Chrome releases a new major version once every two months. It's not the job of Mozilla to reverse engineer Google's internal processes and figure out which version counts as an "extremely old" one. And no, 6-7 months do not a "very old version" make.

It's also a very good thing that Mozilla picked version 88. It had all the described problems and Chrome still shipped this draft spec with known issues enabled by default in the very next version.

v88 was the last version that had this behind a feature flag. Now that it's enabled by default, devs will rely on it and Chrome will refuse to change it because "once it's out we can't change it".

Good on Mozilla to call bullshit on Google (and not for the first time).



