Things like AMP and WebBundles clearly look like a self-interested attack on the open web by Google, all the while making reasonable-sounding noises about efficiency and speed.
With the inexorable rise of ad-blockers, allowing 'User Agents' to be true agents of the user is a threat to Google's business model, and any significant engineering and standards efforts by that company can and should be evaluated by assuming the driver is not web altruism, but ferocious defence of their business model.
Interesting to see Google accounts drop in to misdirect from the essence of the argument outlined in the article: namely, that bundles create private namespaces for URLs, which is an attack on a user's agency in consuming content.
It should be unsurprising that Google wants to blackbox websites. If these kinds of initiatives succeed the philosophical debate around whether a user has the right to filter content running on their machine becomes moot, because technically they won't be able to.
The bottom line is: Google's business model is threatened by users being able to control the content they consume. Splitting hairs on how bad WebBundles will be in practice, wilfully or otherwise, misses this larger, more important point.
Operating in an Australian context, it's exasperating to argue forcefully for Google and Facebook [1] with regard to the new shakedown laws outlined brilliantly by Stratechery [2], all the while knowing there are very real criticisms that should be discussed, such as AMP and WebBundles.
[1] https://news.ycombinator.com/item?id=24186376
[2] https://stratechery.com/2020/australias-news-media-bargainin...
I just don't get this criticism: nothing in web bundles appears to be particularly new, nor would it make it much easier to create private JavaScript namespaces. Years ago I built web apps that were packaged as zip files, which were decompressed in a worker and then had JavaScript loaded from them. If you want to do this right now, it's not hard.
The blog author summarises the core issue here [1].
WebBundles are an attack on user agency: the right to filter content on the open web.
Google putting forward a proposal which directly attacks the ability of general-purpose blockers to operate is not a case of "nothing to see here, I did something I think approximates this situation years back".
The moral case of "blocking = theft" clearly isn't getting political traction, so an alternative (or accompaniment) is pushing standards that destroy the right of user agency.
Keep in mind Gorhill's statement on justifying uBlock Origin [2]:
"That said, it's important to note that using a blocker is NOT theft. Don't fall for this creepy idea. The ultimate logical consequence of blocking = theft is the criminalisation of the inalienable right to privacy."
Noble and correct, but if this technical war is successfully waged as described, it all becomes moot.
Some may desire that the internet move to a set-top-box model of pressing buttons to get an un-inspectable, unmodifiable content window for consumption, but many of us do not.
I can't think of anything worse: the open web transformed by ad-tech behemoths into some kind of locked-down hotel entertainment system.
You're talking to someone who hasn't used a browser without uBlock Origin since 2014. Just looking at the spec, it's not any more effort to build a zip file than it is to build a web bundle.
Google pushing this does more to normalize the practice than anybody in this thread who 'tinkered with an equivalent idea a few years ago' or whatever.
I don't think that is an important distinction to make. Google isn't alone in wanting this; a simple search shows there are already multiple webpack plugins that produce zip files: https://www.npmjs.com/search?q=webpack+zip
If someone really wanted to sabotage the URL right now, they would have already been doing it by hacking these to randomize the filenames, which is the same thing you would have to do to deliver that type of obfuscation in a web bundle anyway.
You're missing the point. It is one thing for a small percentage of websites to obfuscate their code and assets in zip files, but it's a whole other thing if the practice is standardized and a large number of websites move to implement it.
I don't see how that is relevant at all. Obfuscating code and assets isn't in the spec. Either way, you have to go out of your way to do it. Web bundles actually seem like they would be significantly better for content blockers than zip files would be, because accessing them still uses the standard xhr APIs which can still be hooked.
If you're really worried about big mysterious obfuscated blobs of code running in your browser that can't be broken up and blocked individually, the real culprits are WebAssembly and minifying bundlers like webpack, not some packaging scheme.
By "obfuscate," I meant obfuscate the URL, which has been mentioned multiple times in this discussion, so I thought it wouldn't be confused with code mangling.
Obfuscating the URLs is a non-issue, and I think that comment is being unnecessarily negative instead of looking at ways to deal with the problem. There is no reason an adblocker should not be able to get at the actual source (i.e. knowing that request A is actually going to be fulfilled with a resource included in web bundle B coming from site C). If ad blocking tools are unable to do this for reasons unrelated to the web bundle spec, that's a separate problem.
It also seems to paper over the fact that randomizing URLs requires a server-side component to keep rebuilding the bundle, which is the same thing you would have to do with the zip-file method, and has most of the same drawbacks (i.e. changing the URL would invalidate the cache).
> I think that comment is being unnecessarily negative instead of looking at ways to deal with the problem [...] If ad blocking tools are unable to do this for reasons unrelated to the web bundle spec, that's a separate problem.
The Github thread starts with making the spec and adblockers compatible [0]. You seem to be of the opinion that they are, or that their incompatibility is better solved on a different layer. If so, your input would be appreciated at that thread, which is still active.
[0] And in response to the push back, explains the incompatibility.
I probably won't, because I don't maintain any ad blockers at this time. In my opinion, if you really care about this as an attack vector, it's better to just block JavaScript entirely. As I've said before, this type of circumvention of ad blockers is not anything new and has been possible for years. The sites that really wanted to mess with your URL filtering are already doing it, with or without web bundles.
I didn't say that, and I would assume most HN readers understand I was moving from the specific concerns content blockers have with your 'WebBundles' proposal to commentary on the wider motivations of Google: that its business interests logically drive it to find ways to thwart content blockers which frustrate their ad-tech ecosystem.
A web architecture which does not allow content to be modified – one that results in websites being a 'black box' – is the ideal outcome for Google's ad-tech ecosystem.
I am hardly claiming any one initiative takes us straight there - that would be quite the poor strategic play from Google. But Google's changes to Chrome to frustrate content blockers [1], through to AMP and now WebBundles, paint a disturbing picture for independent observers.
For examples that might speak to the institutional strategies employed by Google, recall the claims from Johnathan Nightingale that Google systematically sabotaged Firefox over a decade [2].
Those claims are telling not just for the institutional analysis, but for the revelation of an honest mindset among Google engineers internally:
"I think our friends inside Google genuinely believed that. At the individual level, their engineers cared about most of the same things we did."
When I see the valid claims made of serious issues around AMP and WebBundles, and see honest, heartfelt responses from Google engineers that it's all fine and a beat-up, I can't help but think of Nightingale's observations.
"Hey everyone, it's all fine. We mean well. Don't worry - nothing to see here."
I don't work on WebBundles, but I do work on open source web libraries, used by web developers, who often struggle to tie together tools to make up for the lack of asset bundling in the web. I care very much about this feature as a way to reduce complexity and friction for developers, fully unlock new web features, and improve UX via fast asset loading and better caching.
It's effectively serializing an HTTP/2 stream. A browser doesn't have to fetch from the bundle and can fetch from the URL directly as well. Any processing currently done at the request/response level, like blocking, is still done at the request/response level.
There is an objective truth here that is not subject to conspiracy theories about the intent of Google.
WebBundles do not prevent content from being modified, and do not make websites a "black box". That's just FUD, and I challenge you to point to where WebBundles do any such thing.
WebBundles are an archive format with an easily parsable index, where it's easy to read individual files based on their offset in the bundle. The contents of a WebBundle are individually processed, individually addressed by URL, and individually populate the network cache.
If you have any evidence to back up your description of WebBundles as a black box, please provide it, because the facts on the ground do not support that assertion, and the article in question doesn't even directly claim that, even though it sneakily skirts around the issue by comparing bundles to PDFs. PDFs aren't modelled as a bundle of several responses, so the comparison is flat-out wrong.
Again, I did not state "WebBundles = a blackbox". That is your misinterpretation of my comment which I took the time to clarify:
"I am hardly claiming any one initiative takes us straight there [to a black box of the web]".
I was extremely clear on that, and further explained that we must consider Google's pattern of behaviour and commercial interests here – not one specific action.
Ignoring my good-faith clarification you seemingly continue to suffer this misconception – that I asserted something I didn't ("back up your description of WebBundles as a black box"). You may be more interested in engaging with and refuting arguments based on a convenient misconception, but I really am not.
For anyone seriously interested in this topic from the perspective of the open web and content blocking, the blog author summarised his concerns in a Github ticket [1].
(The responses throughout the Github issue reflect a similar strategy of a 'fingers-in-the-ears, everything is fine, what are you talking about' approach)
You've held to a strongly combative &, in my opinion, fairly arrogant, dismissive, closed-minded outlook in the dozens of comments you've posted in this thread. It's frustrating having to cope with such a lopsided, one-dimensional perspective that injects itself everywhere. I feel like you have overrun & dominated every single thing anyone else has offered.
I see that you think this is a bad spec, but you don't seem to acknowledge, comprehend, or grasp that a lot of people want this for really good & wholesome reasons. Signed exchanges, for instance, are absolutely critical in allowing users to share content with each other while offline, which has radical & excellent potential.
They work exactly the same. They don't even need any modification, as far as I understand.
That's what I'm trying to point out by saying that WebBundles adhere to the current request/response model - they just preload a bunch of responses so you don't need the network round-trip. See the section I already linked: https://wicg.github.io/webpackage/draft-yasskin-wpack-bundle...
An extension that can modify requests and responses still can with bundles. In fact, it should be easier to identify and block individually addressed resources out of a WebBundle than the transpiled bundle out of webpack et al.
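As a very rough sketch of what that URL-level hook looks like today (Manifest V2-era webRequest; this assumes, as asserted above, that requests fulfilled from a bundle still surface through the same pipeline with their individual URLs):

    // Minimal URL-based blocker sketch (Manifest V2 webRequest API).
    // Assumption: requests answered out of a WebBundle still flow through
    // onBeforeRequest with their individual URLs, as described above.
    chrome.webRequest.onBeforeRequest.addListener(
      (details) => {
        // Block anything whose URL matches a simple filter-list entry.
        if (/\/fingerprint2\.js$/.test(details.url)) {
          return { cancel: true };
        }
        return {};
      },
      { urls: ['<all_urls>'] },
      ['blocking']
    );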
Just coming across this now, but since you seem to know a bit about this: can a tracker blocker prevent the preload requests from being sent in the first place, or alternatively, is it the case that the preload requests do not allow whoever initially serves up that content to track who has been loading it?
This seems like an advanced form of what service workers do, or am I misunderstanding?
Provided that adblock/content-blocking functionality wouldn't be impacted, I can provisionally get behind this. It would certainly make my life as a web developer easier.
One of the reasons to block content selectively is to save bandwidth on metered connections. Does a WebBundled site have to also provide separate resources, or is there some way for this bandwidth to be saved for those who need to save bandwidth?
They have wanted to kill the URL for a long time. The mask often slips. There have been so many half-baked attempts trotted out, and nobody liked them. Every browser has tried de-emphasizing the URL, and Chrome even tried completely eliminating it.[1]
They hate that people can deep link into their platforms and share these links with each other, untracked. They want to have complete control of the experience. They are slowly migrating us to a network of crummy walled gardens.
If it was up to them, they would just have you download a giant .exe file and require that you run with admin privileges. The only protections that the likes of Google would offer you is that the different pieces of opaque adware clogging up your device not interfere with each other, and not to so destabilize your device that you can't be advertised to. In other words, Android.
And they will use every bit of leverage they have to continue to corral us from the open pastures of the web into their nightmarish silos.
A powerful bulwark against this was Mozilla. But their #1 priority now is to validate and amplify Google's vision of the web. Rest assured, no matter how awful and user-hostile WebBundles turn out to be, Mozilla will be there, offering an inferior implementation.
Firefox got popular because they completely rejected Microsoft's vision of the web. They didn't try to implement Microsoft's terrible ideas, because they were user hostile and not worth implementing.
A bundle transparently "expands" into the URL-addressed resources in it. URLs are not going anywhere.
I'm one of the sometimes-shrill voices when the URL is getting de-emphasized, but I really don't feel like WebBundles are part of that threat.
What is different is that a server no longer has to be online serving content for folks to be able to get content. 3rd parties being able to serve content is a very different world indeed, & I have reservations about folks no longer having to run their own big world facing services & becoming reliant upon 3rd party serving. But those concerns are far outweighed by how excellent it will be being able to share web content with friends while offline together, being able to easily transfer webapps among users, &c.
I can already give everyone a unique URL to the same resource at the cost of it not being cached (which is a cost this mechanism doesn't avoid).
The article admits this, but then simultaneously tries to claim that the loss of caching would prevent that from being common... yet the bundle approach loses caching too.
I can already give everyone unique resources for the same URL by keying the response on the Referer (which I would also add to the Vary header).
The article also admits this, but then tries to claim problems that are out of scope... the "first visit" will never be to a subresource of a page.
The article further tries to claim that this is difficult or defeats caching, but adding Referer to Vary is trivial and works with all CDNs and caches.
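To make that concrete, here's a minimal sketch using Node's built-in http module; the response policy and the paths are invented for illustration:

    // Sketch: serve different JS for the "same" URL depending on Referer,
    // and tell caches/CDNs to key on it via the Vary header.
    const http = require('http');

    http.createServer((req, res) => {
      const referer = req.headers['referer'] || '';
      res.setHeader('Vary', 'Referer');
      res.setHeader('Content-Type', 'application/javascript');
      // Hypothetical policy: pages under /articles/ get the tracking build,
      // everything else gets a stub.
      if (referer.includes('/articles/')) {
        res.end('/* tracking build */');
      } else {
        res.end('/* stub build */');
      }
    }).listen(8080);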
Meanwhile, if we believe this kind of argument about "people won't bother to do it because it is slightly hard" (I don't), then we need to be consistent.
It will take work to generate the multiple obfuscated copies of this bundle, and doing so for each visitor is maximally disadvantageous to caching.
Ad blockers that work using URLs are fundamentally only possible because they aren't common: if you make them common, they will break.
I mean, ad blockers are so uncommon right now that you can often get away with blocking hosts at the DNS level... that is so trivially defeated!
If you want to do something useful, work on post-download ad "view" blocking by having an AI classify regions of the page as ads and then erase them.
This is also an arms race, but it is one that will more than likely at least result in subtle ads that you don't mind (as the AI "lets them slide") ;P.
It's definitely true that you can key responses to request information beyond the URLs, but I think there's more to it than this:
- Lots of tools don't let you block requests based on those additional keys (Google's Manifest V3 for one, Apple's iOS content blocking too)
- In many cases, making consistent replies based on those keys requires additional information that privacy tools also try to limit (referrer fields, cookies, etc)
- There is practical, real world value in making things more difficult for trackers / folks sending you stuff you don't want; decisions are made at the margins!
- Ad blockers are common on the web (estimates range from 10-30% of users!); they're effective because they work with the grain of the web, and circumvention is deterred (though not prevented) because there are costs to circumventing ad blockers; WebBundles push those costs to zero. Again, decisions are made at the margins
One of the problems with ad "view" blocking is that the problematic code still downloads and loads, so you lose much of the benefits of content blocking making your page lighter and only get the benefit of removing mental overhead/distractions (I say "only", but I'm not trying to say it's not a huge gain). Ideally this AI would work at the network level, but the fundamental problem is that a URL doesn't have to tell you anything about what is on the other end, and we haven't tried "browser antivirus" software yet that blocks JavaScript by matching it against (say) an ad network's "fingerprint"…
Yeah. For various article-context reasons I am concentrating on the blocking of "ads" here rather than on "malicious code". The latter is a much more complex problem to solve in the first place, though: like, it relies on a specific model of opponent that is also extremely dumb; amazingly, this opponent exists in the wild, but I swear it is only because people who go out of their way to block them with simple techniques are so rare as to not be worth investing into including in your un-targeted resource utilization attack. (And FWIW, the ad pages I run into that are maximally bad -- ones that abuse the browser using history state tricks, popping up notification boxes and printer dialogs, sometimes even managing to lock up the entire rendering engine on purpose in an attempt to get the user to agree to the terms of the scam -- all have URLs that are constantly being randomized so often that I have a hard time showing them to other people.)
I'm not sure where the claimed confusion is above.
The argument is: I want to include something like fingerprint2.js in my page. I know filter lists block it, because users don't like it.
Without web bundles, you have to either inline it (bad for perf), copy it to a new URL (which could later also be added to a list), or add some URL-generation logic to the page and have some server-side logic somewhere that knows how to understand the programmatically generated URLs.
The claim is not that bundles are coming from random URLs; it's that bundles create private namespaces for URLs, and that breaks any privacy tools that rely on URLs.
Understanding your situation: you're imagining running a website that wants to include fingerprinting JS? So the site today looks like:
<html>
...
<script src=/fingerprint2.js>
...
The blocker sees ".*/fingerprint2.js" and excludes it. So far so good.
But your site could, with minimal effort, do:
<script src=/asdf23f23g.js>
randomizing the URL on every deploy. This would circumvent url-based blocking today, with the downside of preventing the JS library from being cached between deploys.
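For example, roughly this kind of deploy-time step would do it (a sketch only; the file names and template placeholder are invented):

    // Sketch: rename the script to a random name at deploy time and
    // rewrite the reference in the HTML. Runs once per deploy.
    // Assumes a dist/ output directory already exists.
    const fs = require('fs');
    const crypto = require('crypto');

    const randomName = crypto.randomBytes(6).toString('hex') + '.js';
    fs.copyFileSync('fingerprint2.js', `dist/${randomName}`);

    const html = fs.readFileSync('index.template.html', 'utf8');
    fs.writeFileSync('dist/index.html',
      html.replace('{{FINGERPRINT_SRC}}', '/' + randomName));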
Web bundles change none of this. Just as you can generate an HTML file that references either /fingerprint2.js or /asdf23f23g.js, so can you generate a bundle. Unlike your claim in the article, this does not turn "circumvention techniques that are expensive, fragile and difficult" into ones that are "cheap or even free".
Again, random URLs are just a demonstration of the problem.
At root is private name resolution. What one bundle sees as asdf23f23g.js is different from what another bundle sees as asdf23f23g.js is different from what the web sees as asdf23f23g.js.
A site changing URLs often is a pain that filter-list authors deal with (automation, etc). Private namespaces for URLs are the real problem here that makes the proposal dangerous (and using it to randomize URLs is just a demonstration of how the dangerous capability can be used).
Making it per-user instead of per-deploy is simple today, though! Assign each user a first-party cookie, and use it to generate per-user URLs. Now users can cache across deploys as well:
1. First HTML request: Generate a random string, put it in a server-side cookie, use it to name the fingerprinting script.
2. Later HTML requests: Pull the random string out of the cookie, use it to name the fingerprinting script.
3. Requests for the fingerprinting script: see that the cookie is present on the request, return the script.
This is some work, but not that much. And unlike a WebBundle version, it is cacheable.
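A sketch of those three steps with Node's built-in http module (cookie handling kept deliberately crude; the names are invented):

    // Sketch of the per-user scheme above: a first-party cookie stores a
    // random token, and the page names the script after that token.
    const http = require('http');
    const crypto = require('crypto');

    http.createServer((req, res) => {
      const cookie = req.headers.cookie || '';
      const match = cookie.match(/token=([a-f0-9]+)/);
      const token = match ? match[1] : crypto.randomBytes(8).toString('hex');

      if (req.url === `/${token}.js`) {
        // The token in the URL marks this as the fingerprinting script.
        res.setHeader('Content-Type', 'application/javascript');
        res.setHeader('Cache-Control', 'max-age=31536000'); // cacheable per user
        res.end('/* fingerprinting library */');
      } else {
        // (Re)issue the cookie and emit a per-user script URL in the page.
        res.setHeader('Set-Cookie', `token=${token}; Path=/; HttpOnly`);
        res.setHeader('Content-Type', 'text/html');
        res.end(`<html><script src="/${token}.js"></script></html>`);
      }
    }).listen(8080);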
I might have misunderstood how web bundles work, but it still seems significantly easier to implement with web bundles than without:
Your cookie-based solution still requires the cookie as a sort of state tracker to memorize the URL in question. This requires some server-side logic to coordinate different requests and deal with cases where the cookie gets lost.
In contrast, implementing the same with web bundles is as easy as generating a single HTML page: there is only a single script needed that generates the whole bundle in one go, and it can therefore also ensure the randomized URL is used correctly without any kind of state.
If you serve a full bundle in response to each request then you've given up on caching for the fingerprinting library.
If you're ok with giving up on this caching then you don't need cookies, you can have the server generate random-looking urls that it can later recognize are requests for the fingerprinting library.
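One stateless way to do that, as a sketch: derive the random-looking name from a keyed HMAC so the server can recognise it later without cookies or a database (the secret and the rotation interval here are invented):

    // Sketch: stateless random-looking URLs the server can recognise later.
    // The name is an HMAC of the library name plus a time bucket, so it
    // rotates periodically but needs no cookies or stored state.
    const crypto = require('crypto');
    const SECRET = 'server-side-secret'; // hypothetical key

    function currentName() {
      const bucket = Math.floor(Date.now() / (60 * 60 * 1000)); // rotate hourly
      return crypto.createHmac('sha256', SECRET)
                   .update('fingerprint2.js:' + bucket)
                   .digest('hex')
                   .slice(0, 16) + '.js';
    }

    // Embed '/' + currentName() in the page; on each request, a URL equal to
    // '/' + currentName() is a request for the fingerprinting library.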
> that it can later recognize are requests for the fingerprinting library.
is made significantly easier to implement by web bundles. (Because with web bundles, the server doesn't need to understand the URLs at all)
I agree however that it's questionable how well this technique would fit into real-life situations. I imagine as most ads and tracking scripts are not served locally, you usually wouldn't be able to embed them into a bundle anyway, randomised URLs or not.
> is made significantly easier to implement by web bundles. (Because with web bundles, the server doesn't need to understand the URLs at all)
I agree it's a little easier than what I described because there are no longer separate requests to connect, but it's not much easier. On the other hand, if you're going to give up on caching and bundle everything into one file you could just have your server inline the tracking script. (Or use HTTP2's server push.)
Taking a step back, your last paragraph looks right to me. Ad scripts are very rarely served locally because there is an organization boundary between the ad network and the publisher (and a trust boundary), and they're large enough that you do really care about caching.
I don't understand this "private namespace" ability you claim WebBundles have that URLs don't already have. Any URL remapping that can be done inside a WebBundle, can be done outside the WebBundle, and on a per-user basis as well.
Today you can randomize the URL to the asset in the page. With bundles you can randomize the URL to the bundle, which then can randomize the URL to the asset.
It's one level of abstraction, but if you can control the content of the bundle then you can also control the URL to that bundle, so there is no difference. If it's from a 3rd party then you can block the domain or bundle address. If it's 1st party then it's all the same abilities as today.
I'll also add that all the other benefits (a single application package is great! signing things is great!) are true! But those do not hang on this aspect of WebBundles.
This post is very confused. If one entity generates the page and the ads, then they can already reasonably easily randomize URLs and circumvent ad blockers. Facebook does this. On the other hand, if one entity generates the page and another handles the ads (the typical situation where a publisher uses an ad network to put ads on their page) then web bundles don't do anything to resolve any of the coordination problems that make url randomization and ad blocker circumvention difficult.
I'm currently exploring using WebBundles to serve multiple ads in response to a single ad request, because it allows ads to be served more efficiently while keeping them secure and private, but ad blockers would still prevent the ad js from loading before the ad request was ever sent.
> I'm currently exploring using WebBundles to serve multiple ads in response to a single ad request, because it allows ads to be served more efficiently while keeping them secure and private, but ad blockers would still prevent the ad js from loading before the ad request was ever sent.
Why is this a use-case that I as a web user should care about? Frankly, I don't find it very comforting that an article raises privacy concerns, and an engineer working on ads at Google responds with "these concerns are misguided; I'm exploring this technology to make ads better."
> Why is this a use-case that I as a web user should care about?
If you block ads, then my work will have no effect on you. If you do not block ads, then my work decreases bandwidth usage and makes it harder for ads to deface each other.
> I don't find it very comforting that an article raises privacy concerns, and an engineer working on ads at Google responds with...
The article describes privacy concerns, but those concerns are based on a misunderstanding of what WebBundles make easier. Specifically, they're concerned about URL randomization, which is no easier with bundles.
This is just super wrong. With WebBundles I can call two different things, in two different WebBundles, https://example.org/good.js, and that can be different from what the wider web sees as https://example.org/good.js.
Random URLs are just an example of the capability; they are not the fundamental problem.
I can also do that on my server based on whatever criteria I want, just by returning different content. How is this significantly different than the problem you've brought up?
Based on the example at [0], my understanding is that no, you can control and serve up "/good.js" from your domain, but WebBundles allow you to override "https://example.com/good.js" while within the context of your bundle - a totally different domain you don't control.
The criteria are well-known and basically restricted to what is contained in an HTTP request. With web-bundles (like with service workers) the logic which actual resource the URL resolves to is instead deeply embedded into the browser.
We're talking about what the server decides to put in the initial HTML file you request. Since those are almost never served cacheable to the client, the server can just randomize the URLs it produces.
If the bundles are different, then the bundles themselves will have different URLs.
If they're served by a 3rd-party then you block on domain or address to the bundle just like today. If it's 1st-party then they can already randomize every URL.
Why is it no easier with bundles? If I have a template with `<script src="{adtrackername}.js"/>` that I render into a web bundle on each request (relatively cheap), then each bundle can have random URL names with no issues.
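Roughly something like this (a sketch; buildBundle() is a hypothetical stand-in for whatever bundle-writing library you'd actually use):

    // Sketch: per-request bundle with a randomised tracker name.
    // buildBundle(files) is a hypothetical stand-in for a real bundle writer.
    const crypto = require('crypto');

    function renderBundle() {
      const name = crypto.randomBytes(6).toString('hex');
      return buildBundle({
        '/index.html': `<html><script src="/${name}.js"></script></html>`,
        [`/${name}.js`]: '/* tracker, same bytes every time */',
      });
    }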
I don't block ads, and didn't block them even before I decided to start working in advertising. The ads are what fund most sites I visit; I like the sites and I wouldn't want to freeload. If a site's ads are too annoying I leave.
Funding a site by mind pollution, which is what push advertising is, is antisocial behaviour. The onus is on the site publisher to find an ethical source of revenue (just as with any business model). Putting up content on the web and funding that publishing are two different things. It's fine to be interested in and read the content while blocking an antisocial means of funding and that is not freeloading. It's protecting yourself from abuse.
I wanted to write a snarky response to this and went to look for some ad hominem seasoning for my zinger on www.jefftk.com. But I got played. It's hard to hold malice against someone who donates over 50% of their income every year [0]. Good on you. I'm going to keep using my adblocker, though.
> Julia and I believe that one of the best ways to make the world better is to donate to effective charities.
I would argue that the best way to make the world better is to First do no harm. Making a large income by pushing ads at Google doesn't uphold that rule, in my opinion. And perhaps this person's sensitivity around their job is why they feel the need to publicly list their donations, rather than giving quietly. Public philanthropy has often been used as a way to try to deflect criticism from how the money was first made.
Public philanthropy by billionaires of what is for them actually a minuscule portion of their wealth is suspect. But this is not that.
I can't speak for Mr. Kaufman, but I imagine he shares this information publicly not to brag, but to encourage a new norm of publicly donating a significant portion of one's income to effective charity.
He and his partner may have literally saved hundreds of lives with their contributions over the last few years alone[0].
What have you done for the world lately? For myself, the answer is: not enough.
> perhaps this person's sensitivity around their job is why they feel the need to publicly list their donations, rather than giving quietly
I've been listing my donations publicly since before I started working in ads. I wrote https://www.jefftk.com/p/make-your-giving-public when I was working on open source software that rewrote web pages so they would load faster.
Actually, the post says that we won't be able to block ads and read the rest of the page anymore. I can understand how Google benefits from that, but I hope you understand why I wish for this standard to fail and for you to start working on something else, maybe at a different company.
FWIW I trust Brave a lot more on privacy than Google. Almost anyone who knows both companies would. The only less reputable source for privacy information is probably something from Facebook.
After experiencing the UX disaster that is AMP, I'd just prefer Google stop throwing its weight around "innovating" on the core web standards and stick to what it's good at: giving me what I ask for in the search bar. No, I don't trust the company that first monopolized search, then the browser, then manufacturer-neutral mobile phone OS to maintain a level playing field. Google has plenty of cash to throw at developers to dream up new schemes to bend the internet to their will in subtle ways. The rest of us don't have the resources to play that game, and we shouldn't have to. No one is more eager to have a well-reasoned, thoughtful debate Monday to Sunday than a lawyer paid by the hour. I suppose engineers, too.
If only society treated advertising like it treats pornography - require people viewing it to be 18, require that they opt in to it, and allow them to block it voluntarily without stigma.
Ads OR payment? Cute theory. Here's the reality: if I pay for a subscription to the NYtimes, I still get ads. If I pay for CNN, I still get ads. If I pay for a movie ticket, I get ads and probably product placement as well (just another form of ad.)
How much money do I have to give the NYTimes before they give me a newspaper without ads? The answer seems to be that it's not something they're selling. I don't think they ever have, and I don't really expect they ever will.
If they don't provide it then you can opt to just not read it. Not paying and not viewing ads is clearly expecting free content, the very same expectation that pushed so many ads online in the first place.
Or I can use an adblocker and give such businesses the finger. I will continue to do this. If they don't like it, the ball is in their court. Let them try to stop me, ruining their site in the process for anybody who pays for a subscription while blocking their ads, or offer an ad-free service to people who pay. Either way, using an adblocker is a clear win for me and nobody can provide me with a compelling reason to stop. Whining about how I'm being unfair to the poor w'ittle corporation isn't persuasive. It was never my intention to follow the rules of the game they want me to play.
I pay for YouTube Premium and don't get ads. Other than sponsorships in the videos, but they're not nearly as annoying as external ads. Oh and some channels cross-post their videos to a paid platform called Nebula – there you can often find non-sponsored versions even.
It's user generated content. What is Google supposed to do about sponsors in videos?
Technically the YouTube subscription is for the platform itself and extra features, and a little bit of the money is shared with creators in return for removing the ads. They actually make less from YouTube Premium users, and if you paid them the same as their ad revenue then your total subscription would be much higher.
"Content costs money to make so companies need money to make it" is a true statement.
"Companies therefore must run advertisements to make content because a significant portion of the target audience does not want to pay for it" does not follow from that.
There are many industries that typically do not run advertisements for their content - book publishing comes to mind here - and they seem to get by just fine.
90% of everything is crap, so 90% of entertainment disappearing because production houses and producers stop whoring themselves out to Madison Ave and big brands is a net positive for society. Make the 10% of things you can get funded by the target audience without having to mentally and emotionally manipulate them (which is what advertising is) and you'll still have too much content for the average person to consume.
Advertising is immoral, and advertising to children is both immoral and explicitly evil.
> "significant portion of the target audience does not want to pay for it"
This is true for most online content. For various reasons, content that's not music or video has an assumption of being free. Book publishing is not comparable.
> "90% of everything is crap"
Maybe, but everyone has a different 90% so there's no "average". Content is made to match demand and there's something for everyone.
> "mentally and emotionally manipulate them ... Advertising is immoral"
Everything is manipulation and influence. Recommending something to your friends is a form of advertising (and even officially called "word of mouth marketing"). I agree that advertising should have better safeguards since it's influence for a price, but calling it immoral gets into some strange philosophical territory that doesn't accomplish anything productive.
Disclaimer: I work at Google, though not on this, and I very much want to see WebBundles succeed to solve many problems with asset distribution right now and reduce the gymnastics that bundlers have to go through.
So, this is a super unfortunate analysis that seems to be founded on a major logical inconsistency.
The post claims that:
1) It's costly to vary URLs per-user without WebBundles, because it harms caching. By caching, I presume they mean edge caching here.
2) It's cheap to vary URLs per-user with WebBundles.
The costs are not any different:
* If you vary the bundle per-user then you also harm edge caching.
* If you build a bundle per-deployment, like current bundlers do, then you're not varying URLs per-user
* If you build bundles on the fly per request, say in an edge worker, then you could also just as easily and cheaply vary URLs per user the same way, with the same or less impact on caching.
The whole thing just doesn't make sense, and undermines one of the most important proposals for fixing not only resource loading but also much of the tooling ecosystem.
> I very much want to see WebBundles succeed to solve many problems with asset distribution
I’m wondering what those are. The use cases I’ve seen seem pretty tenuous to me (people don’t want to send web bundles to each other. They want to share a link. They don’t want alternate transports in general, really, or the complexities that come with that.)
I guess I may not know what the use cases are, but right now it looks like a complication with little upside.
The benefits to native bundling are huge. I really cannot overstate how much they will improve UX, and DX.
The core web file formats (JavaScript, CSS, HTML, and images) do not have native multi-file bundling support. Today's bundlers have to jump through serious hoops, which drastically reduces the effectiveness of the network cache, to emulate bundling. And those hoops bring constraints that limit the native features of the web platform that are usable with bundlers.
JavaScript is bundled by significantly rewriting it. At the very least imports and references to imports have to be rewritten. Live bindings from modules require special care. Dynamic import() is severely restricted. import.meta.url usually doesn't work well, or at all. Base URLs for all resources are changed, so getting assets with new URL() and fetch() is usually broken. That causes bundlers to support importing module types that the browser doesn't support, so at development time you still have to run the bundler. The whole thing just diverges from standard semantics.
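A toy before/after of the rewriting being described (the output shape and file names are invented for illustration, not taken from any particular bundler):

    // --- source/app.js (as authored, standard module semantics) ---
    import { track } from './analytics.js';
    const logo = new URL('./logo.svg', import.meta.url);
    track(logo.href);

    // --- dist/bundle.js (roughly the shape a bundler has to emit) ---
    // The import is inlined, the module boundary disappears, and
    // import.meta.url / relative asset URLs no longer mean what they
    // did in the source file, so the bundler rewrites them.
    const track__inlined = (href) => { /* inlined analytics module */ };
    const logo__rewritten = new URL('/assets/logo.abc123.svg', location.origin);
    track__inlined(logo__rewritten.href);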
CSS is bundled by assuming a global CSS scope so that the individual files are not needed separately. URLs within the CSS have to be rewritten. The upcoming CSS Modules specification allows JS modules to import CSS, but since you need to import the actual stylesheet you want, and CSS doesn't have its own multi-module support, you really need native bundling to make this work well in production.
Images can be bundled _sometimes_. Maybe a spritesheet will work, or data: URLs. It all depends on how you're using them.
It's all a complicated mess that requires complicated tools and WebBundles just fix it all. You no longer have to rewrite any references in any files. If you do build files, it's purely for optimization, not because you have to.
What's important for users is that files in a WebBundle are individually cached. This makes it finally possible to do delta-updates on pages where you only send the updated files. Maybe the initial visit is with a bundle and subsequent visits use HTTP/2 push. Maybe when you deploy you build delta bundles against the previous X versions of the site. With current bundlers only the bundle is cached so you can't do this at all.
Explain to me why any of this is necessary once HTTP/3 is widely adopted?
Isn't the whole point of HTTP/3 to "emulate" bundling without actually requiring content to be zipped up into a binary blob? Isn't it also quite good at caching, such as incrementally updating individual files without complexity? Isn't it also capable of pipelining a bunch of small logical files into a contiguous stream as-if they were in one big blob? Isn't the head-of-line blocking issue, which HTTP/3 solves, going to be a problem with monolithic web packages? What's next... WebPackage 2.0 with a yet another head-of-line-blocking-solution!?
This all feels like the left hand not talking to the right hand at Google.
It also feels like Google is perfectly happy to bloat web standards to the maximum extent possible given the size of their development team, permanently and irrevocably blocking anyone from ever again developing a web browser from scratch to compete with Chrome. Unless, of course, they can devote a few thousand man-years to the task, which no one short of two or three companies can now.
I've never even heard of this "problem" that Google is trying to solve, but clearly Google is pushing very hard for it to be solved... for some unstated reason.
From the outside, this feels incredibly self-serving.
HTTP/3 is just HTTP/2 over QUIC; fixing head-of-line blocking will help some, but...
Having spent lots of time trying to optimize servers to correctly utilize HTTP/2+push, and building bundle-less deployments, not only is it quite difficult, it doesn't achieve the compression performance that bundling does. We measured ~30% better compression from a bundle over separate files.
And I don't see this as bloating standards at all, but fixing a huge problem in current web development and paving a cow path. Bundlers are central to web development today, maybe even more so than frameworks. And they cripple the features of many of the files types they bundle. WebBundles fixes that.
Thanks. That's all good. But I don't think this "...WebBundles just fix it all" is going to happen. I think it's going to be: web bundles is (yet) another option, with its own pros, cons, caveats, learning curves, migration challenges, support, etc.
That's just how I see it, and I know I can easily be wrong. If it takes off, we'll be able to measure its value by penetration. E.g., if it's being used on 1% of sites, that means it has some decent niche value. If it's being used on 10% of sites and rising, we'll know it has really good value.
Disclaimer: I work on asset bundling at Google, but unrelated to Chrome's work on web bundles.
One problem with asset distribution is that bundling (good transfer compression, efficient full download) and fine-grained loading are fundamentally at odds. Both Google and Netflix depend on dynamically combining chunks into responses today. Web bundles makes this possible using platform features and potentially also adds proper cache invalidation on top of it.
It's not quite there in the current specs but being able to send ES module graphs in web bundles does have properties that are hard to replicate without it. This isn't really related to users sharing URLs, it's more about the behind-the-scenes of loading assets for a page that got loaded using a traditional URL.
AFAIK HTTP/3 still only has header compression and does nothing for cross-resource compression. So I'm not sure how it relates to this?
I can see ways to model this kind of behavior on top of individual requests but it'd likely be even more complicated than it already is with web bundles. E.g. telling the browser that certain resources shouldn't be fetched because they'll likely be pushed because of a different pending request is super awkward without a "resource bundle" abstraction.
> people don’t want to send web bundles to each other. They want to share a link. They don’t want alternate transports in general, really, or the complexities that come with that.
Arguably "people" don't even want HTTP, either. They don't want any technical feature, they want the things they can do with that technical feature. You're right that they want simplicity, but Web Bundles done right should be very simple to the end user.
Just a random example that comes to mind: I want to share a number of photos with someone. A web bundle could very easily group those photos together along with a quick webpage showing thumbnails, selection etc. without ever having to involve uploading photos to a remote server. That seems like a really interesting possibility to me.
These are suggesting people want to share or access web sites in some other way than links. I don't think that's true in any significant way.
> ...they want the things they can do with that technical feature...
I agree. But the problem with all of the alternative transport scenarios is they have the user managing transport through some additional actions on their part. People couldn't care less about HTTP. What they like about it is they click/tap or copy/paste a link and that's it. Additional steps, like going through a specialized UI to share files via bluetooth is not going to interest very many people. That web site actually suggests people may want to share/access their website through the sneaker-net. That's just not a very important scenario.
To take your photo sharing example. That could be done using a web bundle. But it could be done in a variety of ways that would look the same from a user perspective. Keep in mind a network (with servers) is needed to share the data, whether there's a web bundle or not. So the remote servers are there, regardless.
Interestingly, most comments are talking about the cache/tech capabilities and not the privacy concerns.
I mean, the way it was shown is that WebBundles allow the author of that content to arbitrarily hide and package content in such a way that you cannot filter specific content; so either you view a page with ads and trackers or you don't see anything at all.
WebBundles can't hide content any more than a regular server can, and blockers can block individual resources from a bundle like they can individual URLs from a server.
Remember that Chrome/Chromium hamstrung the ability to block at the loading stage; only Firefox, Firefox derivatives, and hacked-up Chromium derivatives can still block prior to loading with complex rules.
All claims seem to be based on the same URL mapping ability that's ascribed to WebBundles and for some reason, not servers or edge workers.
The infrastructure needed to randomize and remap URLs for bundles is basically the same as for endpoint URLs. You can already serve all requested content, including things a blocker might want to block, as first party, meaningless URLs. https://my-site.com/1g276dn4kas for everything. Store the URL map in the session data rather than bundle.
I don't even think the hard part in either case is mapping the URLs and serving the content, but rewriting all the references to the URLs in the source files.
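A sketch of the session-keyed URL map mentioned above (the map contents and helper names are invented):

    // Sketch: per-session map from meaningless first-party URLs to the
    // real resources, kept server-side rather than in a bundle.
    const crypto = require('crypto');
    const sessions = new Map(); // sessionId -> { '/1g276dn4kas': 'fingerprint2.js', ... }

    function newSession() {
      const id = crypto.randomBytes(8).toString('hex');
      sessions.set(id, {
        ['/' + crypto.randomBytes(6).toString('hex')]: 'fingerprint2.js',
        ['/' + crypto.randomBytes(6).toString('hex')]: 'ads.js',
      });
      return id;
    }

    function resolve(sessionId, url) {
      const map = sessions.get(sessionId);
      return map ? map[url] : undefined; // undefined -> 404
    }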
No, the tech claims were wrong, but since remaking bundles is proven to be cheap and easy, the bundles can quickly be modified to carry a malicious payload (for some definition of malicious).
Many of these complaints seem... odd. The primary complaint with Web Bundles is "they make the web like SWFs" - last I remember, people LIKED the fact that Flash content could be bundled that way. You actually were able to distribute Flash content like a normal web application, broken up into multiple SWFs, images, and so on. The vast majority of Flash content was a single file, because it's easier to distribute that way.
With WebBundles, if you want to deny specific subresources, you can still have the browser block them by identifier or hash, which is the same as you can do with a regular web request. The reason why there are no content blocking tools for PDFs is because nobody's bothered to write a PDF reader with that capability. Furthermore, I don't see how putting the resource identifier in the content of a bundle makes it easier than changing the URL. Both of those will either defeat or be defeated by caching.
If you have a CDN that generates random URLs for a single cached resource and serves them on the same domain as user websites, then identifier-based blocking won't work with or without WebBundles. You can do this today with Apache rewrite rules and PHP, which are about as economical as you can get. If your threat model is "people who just know how to SFTP a handful of JavaScript files" then yes, WebBundles makes things "easier". But I'd argue your threat model is laughably unworkable if that's the case. The state-of-the-art in adtech is to smuggle your fingerprinting platform as a randomized subdomain of the client site - how do you even begin to block that with identifiers alone?
> The reason why there are no content blocking tools for PDFs is because nobody's bothered to write a PDF reader with that capability.
The reason why nobody has bothered is because PDFs contain the desired content only. Generally, when viewing a PDF, you're not bombarded by five concurrent video ads competing with the text for your attention. There's no clickjacking, or overriding of scrolling behaviour to force a full-page popup down your throat the instant you scroll up, scroll down, or interact with the page in any way.
The latest insanity that has suddenly become popular in the HTML world is the ad-forward page. When you first visit a site, it forwards you to the content page, but if you try to click "back", you end up going to a full-page advertisement instead of the referrer page. I'm sure someone thought this up to defeat ad-blockers: You can't get to the content without being forwarded through the ad page, and then the browser history is polluted, so it's too late to simply block content. Pretty soon, uBlock will have to rewrite the browser history to stop ads.
This is bordering on insanity.
It's the same behaviour as when I walk into a whitegoods shop and a pushy sales guy steps into my way and demands to "help" me. Or when on holidays in a third world country, where being white means being endlessly harassed by hawkers.
Doesn't it make you people feel dirty to have to write the code to do these things?
Why, for the love of God, can't we have some sort of model where browsers pay for content with microtransactions?
I know it has been talked about a lot in the past, and rejected for various reasons, but seriously: It has to be better than this.
Mark my words: The inevitable outcome of funding content only with ads will be all such pages eventually evolving into DRM-protected video with the ads overlaying the text. It's just a matter of time.
people LIKED the fact that Flash content could be bundled that way
Developers did. People didn't. People hated Flash because it was bloated, downloaded slowly, and killed anything with a battery.
All this pining for Flash is just revisionist history inside the HN bubble from people who have forgotten how much Flash was hated by the general public.
> All this pining for Flash is just revisionist history inside the HN bubble from people who have forgotten how much Flash was hated by the general public.
At least in my social circle that's incredibly far from the truth. My friends and I loved Flash videos and games. Downloading the Flash runtime was a very small price to pay for the functionality it unlocked. And these people are definitely not HN users.
Wasn't the majority of Flash content made before the rise of JIT compilation and the use of mobile devices everywhere?
For me, the downfall of Flash was its security issues and its similarity to Java plugins. Additionally, Adobe never invested all that much in Flash; Macromedia is still more associated with the name.
Strictly speaking, there is nothing particular to Flash that precludes JIT compilation. In fact, AS3 code runs on a VM specifically built with JIT in mind. AS1/2 code does not, but that could be remedied with shittons of guard statements like modern JavaScript VMs have.
Folks "who are well actually, you can already make URLs meaningless"'ing are missing the point of how practical web privacy tools work, including the ones you all have installed right now.
There is an enormous, important difference between:
i) circumventions that are possible but are fragile and cost more, and
ii) circumventions that are effortless for the attacker / tracker
Every practical privacy/blocking tool leverages that difference now; the proposal collapses it.
I am not in any way pro-Google, and I feel what they are doing with AMP is a perilous path, but your article was not convincing. It's trivially easy to rotate a JS name and serve it from the same cached edge file (e.g. with Cloudflare Workers) without incurring any major additional cost. It's not hard to imagine advertisers doing something similar with their own CDNs. This could even be done at the DNS level with random subdomains. Unless we are ok blocking (star).facebook.com/(star), every other part of the URL is trivially easy to randomize.
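For instance, a Workers-style sketch of that rotation (the path convention and origin URL are invented):

    // Sketch (Workers-style fetch handler): any /r/<random>.js request is
    // answered with the same upstream script, which the edge can cache once,
    // while the page is free to rotate the /r/<random>.js name per deploy.
    addEventListener('fetch', (event) => {
      const url = new URL(event.request.url);
      if (url.pathname.startsWith('/r/') && url.pathname.endsWith('.js')) {
        event.respondWith(fetch('https://origin.example/fingerprint2.js'));
      }
    });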
> WebBundles Allow Sites to Evade Privacy and Security Tools
I completely disagree. Browser-level code can still effectively apply user-defined privacy and security policies onto WebBundles.
Didn't Firefox originally solve the issue of deep-linking into opaque bundles with its jar: URI scheme idea? For example, this URI could be used to reference specific files in a zip archive in Firefox. IIRC this scheme might not be supported by Firefox anymore.
The same concept could be applied for WebBundles with another similar URI scheme so that content blockers and security tools can effectively introspect the bundle before initializing resources from the bundle. For example, something like:
bundle://[optional hash, origin, or regex?]![uri or regex]
// e.g.
bundle://https://site.example/!https://site.example/regulated-sub-bundle-file.html
Yet another case study in why allowing a privacy-hostile advertising company to control the most popular web browser and influence web standards is bad for the web.
There is no difference between a URL without web bundles, and a URL that refers to a "web bundle" and a path within it.
All the things that they claim can be done with Web Bundles can be done without just as easily by putting all dependent resources in a versioned subdirectory.
> So-called "Binary Transparency" may eventually allow users to verify that a program they've been delivered is one that's available to the public, and not a specially-built version intended to attack just them. Binary transparency systems don't exist yet, but they're likely to work similarly to the successful Certificate Transparency logs via https://wicg.github.io/webpackage/draft-yasskin-http-origin-...
The circumstances that enabled Certificate Transparency to succeed were pretty narrow, but people seem to keep proposing more of these X Transparency systems so maybe I should actually properly write down why I think that's largely futile, not because I expect it will stop anyone but because at least then I can just paste the link.
For now, let me highlight a really big consideration:
- Most of the effectiveness of CT logs stems from the logging mandate not the cryptography. The publicly trusted CAs proactively log either everything they issue or all but some relatively small controlled subset of issued certificates which purposefully fall outside the mandate.
Will your X transparency be mandatory? Who will enforce that? Won't it be more practical to reject systems that try to mandate X transparency than to implement it?
Yes, especially since you currently can't run .wasm-based apps without a server in either Chrome or Firefox; WebBundles will make this possible (I hope).
1) The use cases discussed by Google are largely offline. So how is this a "web" standard when it doesn't involve the web? If Google wants to implement an offline app format, that's their prerogative, but it's not clear to me why it should be enshrined as a web standard, just because it happens to use HTML/CSS/JS.
2) Google keeps talking about it as a tool for you and your "friend". I'm suspicious that this is the real motivation. Is the Google Chrome team really very concerned about in-flight entertainment? (Especially now when hardly anyone is flying anymore.) A number of people have suggested that the real motivation here is AMP. The "spin" in Google's proposal is too "friendly".
To me the key part is this: "Loads in the context of its origin when cryptographically signed by its publisher"
Then it feels like a bit of misdirection to immediately pivot to "easy to share and usable without an internet connection". What if you do have an internet connection? What if the real goal is actually to republish entire web sites on google.com, i.e., AMP?
> Why would Google have hosting everything on their domain be an end goal?
To become the internet! Much like AOL was in the past, kind of like how Facebook is now. Each of the big tech corps is trying to set up a walled garden that you never leave.
In the beginning, Google Search was not a destination, it was merely the way to find your destination. But now a large percentage of searches never actually leave Google. The data is re-presented by Google Search.
Google has a search engine, DNS, browser, email service, office suite... oh yeah, and a mobile platform. All of which have millions or billions of users. They just need to host web sites too in order to create their own web-on-the-web. The Google Internet. It's about controlling the entire "experience".
Think about it: the web is supposed to be "free" (as in freedom), but increasingly web site owners have to design everything around Google: ranking high in Google Search, having the right format for AMP, rendering "correctly" in Google Chrome. Even being forced to adopt https (read Dave Winer on that issue) or be erased.
Not long ago, the understanding was that browsers were mostly simple renderers for HTML with an HTTP client built in. You could serve HTML from a local file system or from a server without much difference.
It's only recently that due to security reasons, the ability to load HTML from local files was greatly restricted - so I personally would be pretty happy if loading pages from non-web sources is on the table again.
We don't make revenue from ads in pages, or lose revenue if Google smuggles ads into SWF-like WebBundles via AMP imposed on publishers under pain of search rank loss. Brave user ads fill user inventory, not in-page ad slots (perhaps you heard misinformation claiming otherwise). So even as an ad hominem rhetorical ploy, what you wrote doesn't work.
I'm not here to agree with the person you responded to, but Google smuggling unblockable ads does 100% harm Brave. If part of Brave's selling point is the built-in ad blocking and, in a hypothetical dystopian scenario, the web were overrun with unblockable ads from WebBundles, Brave would inevitably be harmed to some degree. I say that as someone who runs Brave on my phone primarily. I know I'd probably switch browsers if one of the primary reasons to use it were largely killed off.
If AMP2.0 emerges with ads and tracking embedded in bundles, we will not lose money from those ads. Our performance and battery+dataplan savings will be gone, but we will be no worse than Chrome for such content.
Because Apple and likely other browsers will not adopt SXG etc, such AMP2 content will have to include a fallback link (as AMP does today). We may in such a scenario automatically prefer the non-AMP2 content.
In any case, your argument fails because we are no worse off in the worst case, and in better cases we still block many requests for greater user wins.