Hacker News | pes10k's comments

It's definitely true that you can key responses to request information beyond the URLs, but I think there's more to it than this:

- Lots of tools don't allow you to block requests based on those additional keys (Google's Manifest v3 for one, Apple's iOS content blocking too)

- In many cases, making consistent replies based on those keys requires additional information that privacy tools also try to limit (referrer fields, cookies, etc.)

- There is practical, real world value in making things more difficult for trackers / folks sending you stuff you don't want; decisions are made at the margins!

- Ad blockers are common on the web (estimates range from 10-30% of users!); they're effective because they work with the grain of the web, and circumvention is deterred (though not prevented) because there are costs to circumventing ad blockers; WebBundles push those costs to zero. Again, decisions are made at the margins

Again I wrote the piece but…

Folks "who are well actually, you can already make URLs meaningless"'ing are missing the point of how practical web privacy tools work, including the ones you all have installed right now.

There is an enormous, important difference between:

i) circumventions that are possible but are fragile and cost more, and ii) circumventions that are effortless for the attacker / tracker

Every practical privacy / blocking tool leverages that difference today; the proposal collapses it.

I am not in any way pro-Google and I feel what they are doing with AMP is a perilous path, but your article was not convincing. It's trivially easy to rotate a JS name and serve it from the same cached edge file (e.g. with Cloudflare Workers) without incurring any major additional cost. It's not hard to imagine advertisers doing something similar with their own CDNs. This could even be done at the DNS level with random subdomains. Unless we are OK blocking *.facebook.com/*, every other part of the URL is trivially easy to randomize.
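A minimal sketch of the rotation trick being described, with made-up names (not anyone's actual deployment): any path matching a random-looking pattern resolves to the same script body, so the page can embed a fresh URL on every render while the edge still serves one underlying asset.

```javascript
// Hypothetical edge-handler logic for serving one script under rotating names.
const SCRIPT_BODY = '/* contents of the tracking script */';

// Treat any /js/<16 hex chars>.js path as an alias for the one script.
function resolve(pathname) {
  return /^\/js\/[0-9a-f]{16}\.js$/.test(pathname) ? SCRIPT_BODY : null;
}

// A page render would pick a fresh alias each time it emits the <script> tag.
function randomAlias() {
  const hex = Array.from({ length: 16 }, () =>
    '0123456789abcdef'[Math.floor(Math.random() * 16)]).join('');
  return `/js/${hex}.js`;
}
```

The point of the sketch is that no per-URL state is needed: recognition is purely pattern-based, so the name can change on every render.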

This is just super wrong. With WebBundles I can call two different things, in two different WebBundles, https://example.org/good.js, and each can be different from what the wider web sees as https://example.org/good.js.

Random URLs are just an example of the capability; they are not the fundamental problem.

> With WebBundles I can call 2 different things, in two different WebBundles https://example.org/good.js, and that can be different from what the wider web sees as https://example.org/good.js.

I can also do that on my server based on whatever criteria I want, just by returning different content. How is this significantly different than the problem you've brought up?

Based on the example at [0], my understanding is that no, you can control and serve up "/good.js" from your own domain, but WebBundles allow you to override "https://example.com/good.js" while within the context of your bundle - a totally different domain you don't control.

[0] https://www.npmjs.com/package/wbn

The criteria are well-known and basically restricted to what is contained in an HTTP request. With WebBundles (like with service workers), the logic for which actual resource a URL resolves to is instead deeply embedded into the browser.

At least, that's my understanding.
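A toy illustration of that "private namespace" idea (this is not the real WebBundle API, just the concept): each bundle carries its own URL-to-content mapping, so the same URL string can resolve to different bytes depending on which bundle the browser is reading from.

```javascript
// Conceptual model only: a bundle as a private URL -> content map.
const bundleA = new Map([['https://example.org/good.js', '/* benign */']]);
const bundleB = new Map([['https://example.org/good.js', '/* fingerprinting */']]);

// Resolution happens entirely inside the bundle; no network request is
// made, so a URL-based blocker never sees which content was chosen.
function resolveInBundle(bundle, url) {
  return bundle.get(url);
}
```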

We're talking about what the server decides to put in the initial HTML file you request. Since those are almost never served cacheable to the client, the server can just randomize the URLs it produces.

If the bundles are different, then the bundles themselves will have different URLs.

If they're served by a 3rd-party then you block on domain or address to the bundle just like today. If it's 1st-party then they can already randomize every URL.

And now they have another option to randomize urls.

Sigh. About time GPT-3 is put to use for content-blocking.

Disclaimer: I wrote the article.

I'm not sure where the claimed confusion is above.

The argument is: I want to include something like fingerprint2.js in my page. I know filter lists block it, because users don't like it.

Without web bundles, you have to either inline it (bad for performance), copy it to a new URL (which could later also be added to a list), or add some URL-generation logic to the page, plus some server-side logic somewhere that knows how to understand the programmatically generated URLs.

With WebBundles, I call it https://example.org/1.js (which is different from what https://example.org/1.js points to in a different bundle), or even just call it https://example.org/looks-to-be-benign-to-the-web.js.

The claim is not that bundles are coming from random URLs; it's that bundles create private namespaces for URLs, and that breaks any privacy tool that relies on URLs.

Understanding your situation: you're imagining running a website that wants to include fingerprinting JS? So the site today looks like:

   <script src=/fingerprint2.js>
The blocker sees ".*/fingerprint2.js" and excludes it. So far so good.

But your site could, with minimal effort, do:

   <script src=/asdf23f23g.js>
randomizing the URL on every deploy. This would circumvent url-based blocking today, with the downside of preventing the JS library from being cached between deploys.

Web bundles change none of this. Just as you can generate an HTML file that references either /fingerprint2.js or /asdf23f23g.js, so can you generate a bundle. Unlike your claim in the article, this does not turn "circumvention techniques that are expensive, fragile and difficult" into ones that are "cheap or even free".

(Disclosure: I work for Google)

Again, random URLs are just a demonstration of the problem.

At root is private name resolution. What one bundle sees as asdf23f23g.js is different from what another bundle sees as asdf23f23g.js is different from what the web sees as asdf23f23g.js.

A site changing URLs often is a pain filter-list authors deal with (automation, etc.). Private namespaces for URLs are the real problem here that makes the proposal dangerous (and using them to randomize URLs is just a demonstration of how the dangerous capability can be used)

Making it per-user instead of per-deploy is simple today, though! Assign each user a first-party cookie, and use it to generate per-user URLs. Now users can cache across deploys as well:

1. First HTML request: Generate a random string, put it in a server-side cookie, use it to name the fingerprinting script.

2. Later HTML requests: Pull the random string out of the cookie, use it to name the fingerprinting script.

3. Requests for the fingerprinting script: see that the cookie is present on the request, return the script.

This is some work, but not that much. And unlike a WebBundle version, it is cacheable.

I might have misunderstood how web bundles work, but it still seems significantly easier to implement with web bundles than without:

Your cookie-based solution still requires the cookie as a sort of state tracker to memorize the URL in question. This requires some server-side logic to coordinate different requests and deal with cases where the cookie gets lost.

In contrast, implementing the same with web bundles is as easy as generating a single HTML page: there is only a single script needed that generates the whole bundle in one go, and it can therefore also ensure the randomized URL is used correctly without any kind of state.

If you serve a full bundle in response to each request then you've given up on caching for the fingerprinting library.

If you're ok with giving up on this caching then you don't need cookies, you can have the server generate random-looking urls that it can later recognize are requests for the fingerprinting library.

Yes, I understand that.

I think the point is that this part:

> that it can later recognize are requests for the fingerprinting library.

is made significantly easier to implement by web bundles. (Because with web bundles, the server doesn't need to understand the URLs at all)

I agree however that it's questionable how well this technique would fit into real-life situations. I imagine as most ads and tracking scripts are not served locally, you usually wouldn't be able to embed them into a bundle anyway, randomised URLs or not.

> is made significantly easier to implement by web bundles. (Because with web bundles, the server doesn't need to understand the URLs at all)

I agree it's a little easier than what I described because there are no longer separate requests to connect, but it's not much easier. On the other hand, if you're going to give up on caching and bundle everything into one file you could just have your server inline the tracking script. (Or use HTTP/2 server push.)

Taking a step back, your last paragraph looks right to me. Ad scripts are very rarely served locally because there is an organization boundary between the ad network and the publisher (and a trust boundary), and they're large enough that you do really care about caching.

I don't understand this "private namespace" ability you claim WebBundles have that URLs don't already have. Any URL remapping that can be done inside a WebBundle, can be done outside the WebBundle, and on a per-user basis as well.

The address to the bundle will also be different in that case, therefore you can block the bundles themselves.

You might lose some granularity in the filtering but most blocking is already done per-domain rather than specific assets.

Today you can randomize the URL to the asset in the page. With bundles you can randomize the URL to the bundle, which then can randomize the URL to the asset.

It's one level of abstraction, but if you can control the content of the bundle then you can also control the URL to that bundle, so there is no difference. If it's from a 3rd party then you can block the domain or bundle address. If it's 1st party then it's all the same abilities as today.

I'm confused about what a WebBundle is.

Right now, when I Google "WebBundle," the first result is your article. The second result is an intro to Webpack.

These just look like optimizers that bundle multiple JavaScript files into a single file to avoid latency.

I'm confused. At least for adblocking, it's always an arms race.

> I'm confused about what a WebBundle is.


Will also add that all the other benefits (a single application package is great! signing things is great!) are true! But those do not hang on this aspect of WebBundles

Just like url remapping/obfuscation isn't a unique aspect of WebBundles?

Hi, I do privacy research at Brave. Our fingerprinting protections are still being improved (our second round of FP defenses should hit nightly in Jan), but the comments here are wrong; our defenses are much better than Firefox's and Chrome's.

We don't consider as part of our threat model preventing sites from knowing you're using Brave. I'm not aware of any browser or tool that considers this as part of their privacy threat model either (including the terrific Tor Browser Bundle). Our goal is to prevent sites from distinguishing between Brave users.

My claim is that our fingerprinting protections are strictly stronger than Firefox's and Chrome's. A partial list of fingerprinting protections that are enabled by default in Brave is at [1].

Fingerprinting is tricky, and we're tackling the problem along multiple dimensions. We need to go further, and expect to have v2 of our defenses in nightly in the next month or two [2], but we're also leading efforts in the W3C to address fingerprinting in standards (I'm snyderp; those are my / Brave's issues identifying privacy harm in standards) [3], and we're pushing hard to prevent web standards and privacy efforts from being eaten up by vaporware [4].

So, all that is to say: Brave has the most aggressive fingerprinting protections of any general-purpose browser, and it's getting better. The naysayers are welcome to back up their claims ;)

1: https://github.com/brave/browser-laptop/wiki/Fingerprinting-...

2: https://github.com/brave/brave-browser/issues/5614

3: https://github.com/w3cping/tracking-issues/issues

4: https://brave.com/brave-fingerprinting-and-privacy-budgets/
