AMP pages displaying your own domain (googleblog.com)
200 points by dfabulich 6 months ago | 325 comments



I can’t believe there is no way to opt out of AMP as the end user. The UX is so terrible. Oftentimes I will search for something and get a Reddit result back. When I tap the link, I get the AMP page, which:

* does not show all comments, often ones I am actually looking for

* does not let me collapse comment sections

* uses the default white background theme which burns my retinas if I am looking at my phone in a dark environment

* shows overlay ads for the Reddit app that cover about 40% of the screen for no goddamn reason

* requires 2-3 separate actions to get to the original page

Yet I cannot find a browser extension or setting to tell AMP to fuck off. Honestly AMP might be what finally gets me to switch search engines after many years of using Google.


Frankly reddit is the website that has the worst AMP implementation by far.

In contrast, say, Urban Dictionary is indistinguishable from the real thing.


You won't see AMP if you switch search engines.


Using DuckDuckGo with a backup of !g (send search to google), I don't think I've ever hit an AMP page in search results in my life. Maybe because I only use !g for really technical searches.


!s takes you to Startpage, which, if you're trying to avoid Google, gets you where you want to go by proxy.


Just this morning I got pissed off by an AMP page and was considering a search engine switch. Maybe this is my sign.


AMP was what made me abandon Google Search in January 2018 for DuckDuckGo.

Surprising how little I've noticed the change, after using Google Search for over 15 years. I try queries on google.com maybe once or twice a week if I don't find what I'm looking for on DDG. If it's anything media or product related, I feel like I'm on an old, crowded MySpace page. DDG feels more like the old Google.


You convinced me. Switching now. Fuck this noise.


You can even tweak your DuckDuckGo settings (dark theme, etc.) and save them in DuckDuckGo under a passphrase of your choice, which lets you restore them and keep them in sync across your devices.


Bing implements AMP, and there's nothing technically stopping other search engines from adding support.

Also, links in the Twitter app default to AMP as well.


I wouldn't mind AMP as a feature as long as Google let me specify that I don't want it.


Good thing, then, that most people use neither of those.


Sounds like a criticism of Reddit more so than AMP.


I pick on Reddit because it has the most glaring issues. But there are others. Certain tech sites like Gizmodo come to mind. So do some news sites. It's especially weird when the result is a page that contains a video, and the video is what I actually want but because of AMP it doesn't load, and it's not immediately apparent what's going on.

AMP is straight up broken technology. Imagine if you subscribed to a print version of the NYT but instead of getting the Sunday edition you got a ransom note looking summary of some of the articles from Clipper Magazine. Would you be OK with that?


I use Firefox, DuckDuckGo and Redirect AMP to HTML https://addons.mozilla.org/firefox/addon/amp2html/ everywhere I can, including mobile.
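For reference, a minimal client-side version of this idea is just a URL rewrite. The real extension is smarter (it can follow a page's rel=canonical link), but the two common cache URL shapes (google.com/amp/... and *.cdn.ampproject.org/c/...) can be unwrapped directly. A rough Python sketch; real cache URLs have more variants than this handles:

```python
from urllib.parse import urlsplit

def unamp(url: str) -> str:
    """Best-effort rewrite of a Google AMP cache URL to the original URL.

    Returns the input unchanged when it doesn't look like an AMP link.
    """
    parts = urlsplit(url)
    if parts.netloc.endswith("cdn.ampproject.org") and parts.path.startswith(("/c/", "/v/")):
        # e.g. https://www-example-com.cdn.ampproject.org/c/s/www.example.com/page
        rest = parts.path[3:]
    elif parts.netloc in ("google.com", "www.google.com") and parts.path.startswith("/amp/"):
        # e.g. https://www.google.com/amp/s/www.example.com/page
        rest = parts.path[len("/amp/"):]
    else:
        return url
    # A leading "s/" marks an https origin in the cache URL scheme.
    scheme = "https" if rest.startswith("s/") else "http"
    if rest.startswith("s/"):
        rest = rest[len("s/"):]
    query = f"?{parts.query}" if parts.query else ""
    return f"{scheme}://{rest}{query}"
```

So `unamp("https://www-example-com.cdn.ampproject.org/c/s/www.example.com/page")` gives back `https://www.example.com/page`.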


Use something like Searx [1], either self-hosted or hosted by someone or some organisation you trust. You can still get Google search results if you feel the need, either by explicitly asking for them (!go for search, !goi for images, !gon for news, !gos for scholar, !gov for video) or by enabling the Google engines in your config. In the latter case you get Google results mixed up with the other enabled engines. The results are presented as normal links without redirection through the originating search engine.

Searx can be extended so it would be possible to create a plugin which rewrites AMP links into non-AMP equivalents, where available. It can already do things like Open Access DOI rewrite (Avoid paywalls by redirecting to open-access versions of publications when available) so the ground work has been done. I'm currently working on improving (and fixing, where necessary) the image search engines and will probably start on such a plugin if nobody else beats me to it.

[1] https://asciimoo.github.io/searx/
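A first cut of such a plugin might look like the following. This is an untested sketch modeled on the shape of the built-in oa_doi_rewrite plugin; the `on_result` hook signature and the AMP cache URL patterns here are my assumptions, not Searx's documented API:

```python
import re

# Plugin metadata in the style Searx plugins use.
name = 'AMP link rewrite'
description = 'Rewrite AMP cache links to their origin equivalents'
default_on = False

# Matches google.com/amp/... and *.cdn.ampproject.org/c/... cache URLs;
# an "s/" segment marks an https origin.
AMP_CACHE = re.compile(
    r'^https?://(?:www\.google\.com/amp/|[^/]*cdn\.ampproject\.org/c/)(s/)?(.*)$'
)

def on_result(request, search, result):
    """Rewrite the result URL in place; always keep the result."""
    match = AMP_CACHE.match(result.get('url', ''))
    if match:
        scheme = 'https' if match.group(1) else 'http'
        result['url'] = f'{scheme}://{match.group(2)}'
    return True
```

The hook mutates the result dict the same way the DOI rewrite does, so non-AMP results pass through untouched.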


It's not really Google's fault that Reddit's AMP sucks. AMP sucks, I know; I'm working on my company's AMP pages right now and they are a PAIN. But with enough tweaking, they can be gotten right. So I wouldn't blame Google for Reddit's devs.


Using the signed exchange mechanism means you allow anyone to serve your content. You will no longer know when it has been served and by whom. Instead, Google will know more about what your users are consuming on your website than you - despite HTTPS!

Also, there is no mechanism to limit who is allowed to serve your content for you.

I see no technical reason why the content has to be prefetched from Google instead of your own server.

It's also confusing for users and administrators. Want to block access to a website in your network? Guess what: Your block will not be effective because Google will proxy the data unbeknownst to the firewall.


The reason it has to be prefetched from not-you is to protect the user's privacy. Until they click a link, it is not considered acceptable to leak their search to the potential destination. Links have to be fetched from a third party that the search engine trusts not to share the data; at the moment that's Google, but it will hopefully expand.


Google already can do this by preloading a cached page from its own domain. So this specification is unnecessary.

I think the real reason is that Google wants to build a walled garden, but doesn't want the walls to be noticeable. Even with AMP, they display a header that looks like a browser's address bar [1].

Also, on that page Google admits that it uses AMP Viewer to collect information about users:

> Data collection by Google is governed by Google’s privacy policy.

Which is probably their real motivation for creating AMP.

[1] https://developers.google.com/search/docs/guides/about-amp


> Google already can do this by preloading a cached page from its own domain.

That's what AMP already did. This spec is better because it ensures publishers retain control over their own content, and doesn't confuse users by showing "www.google.com" in the URL bar for content that didn't originate from Google.


Publishers might want to display their URL in the address bar. But as a user I want to see the actual URL, not what Google or the publisher wants to show me. I don't want to see "example.com" in the address bar while I am actually connected to Google via a TLS connection using Google's key, with my IP address collected according to Google's privacy policy.

What confuses users is Google displaying a fake address bar [1] or browser displaying the wrong URL.

[1] https://developers.google.com/search/docs/guides/images/amp0...


The URL you see _is_ the actual URL. It doesn't matter where the content was initially loaded from because the page is signed by the publisher's private key (the publisher has full control over the page contents, Google can't alter it).
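That trust model fits in a few lines. The sketch below is a toy stand-in using Ed25519 from the `cryptography` package; real signed exchanges use X.509 certificates with the CanSignHttpExchanges extension, but the shape is the same: the publisher signs once, and the browser can verify the bytes no matter who served them.

```python
# Toy model of the signed-exchange trust relationship.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

publisher_key = Ed25519PrivateKey.generate()
content = b"<html>original article from example.com</html>"
signature = publisher_key.sign(content)  # done once, by the publisher

# Later, any untrusted cache can hand out (content, signature).
# The browser checks the pair against the publisher's public key:
def browser_accepts(body: bytes, sig: bytes) -> bool:
    try:
        publisher_key.public_key().verify(sig, body)
        return True
    except InvalidSignature:
        return False
```

If the cache alters a single byte, verification fails, which is why the cache can serve but never modify the page.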


The content is served from Google's servers according to Google's (not the publisher's) privacy policy. While Google cannot alter the content, it sees the unencrypted HTTP request. I want neither Google nor the publisher to control the contents of my address bar.


Google already knows the unencrypted contents of the page, and they know you clicked on a link to it (from their search results page). The signed exchanges system doesn't reveal any information to Google _or_ the publisher that they don't already know.

Your browser controls the contents of the URL bar, not Google or the publisher.


But Google controls my browser.


Have you tried changing browser? -- Written from my Chromium browser installed from the Fedora Linux package repository.


I actually use Firefox and avoid Google products where possible, but for the majority of users Google is controlling their browser.


I copy your post, but make it available further up the thread. Even though I sync your comment’s edits to mine several times a day, I also control Hacker News, so I get them to display your username in place of mine, so as not to confuse readers.

Page and DNS prefetching exists, HTML exists, why not just link to the page on the original domain?


> I think the real reason is that Google wants to build a walled garden

Exactly, this is the real reason why this abomination came into existence. All of it is masked as work for the greater good, all for those poor kids with limited network speed. As an end effect everyone will suffer: the user will never leave the Google ecosystem, remaining on the search page without even knowing it, and the creator will lose control over his own content.


Exactly: Google built a walled garden and is now replacing the fence with glass, because people don't like the view from inside their cage. The worst thing is that it'll get away with it.


"Considered acceptable" by whom?

Why should the user's privacy be protected from the content provider rather than from the search provider? The search provider already knows more about me.


Edit: I completely misread the comment, but can't delete it anymore a minute later because someone already commented. So oops.


The caching doesn't necessarily need to be done by Google.

https://blog.cloudflare.com/announcing-amp-real-url/


I think that product is still cached by Google. Cloudflare is just providing the cryptography for Web Packaging so that the browser will show the URL of the original page instead of the Google cache.


Caching not done by Google doesn't give you priority placement in the search carousel, so it is pretty useless.


> You will no longer know when it has been served and by whom

I'm OK with publishers knowing less about the people seeing their content.


They just have to access the data via Google. Who can cross-link it with all the other data they have. No real privacy gain here.


That statement is also false, because publishers can still track when the page is served via JS.


Amp does allow Google analytics and other analytics services. Unfortunately, most places don't use server logs for much :(


Chances are if you don't want Google to serve your content to protect the privacy of your users, you don't want to use Google Analytics either.

Btw your account seems rather active for an account with the description "Inactive. Deletion Requested." :-)


Or if you want that stuff to work e.g. in China.


A bit more on this... there are a LOT of secondary bots out there, either searching for security holes to exploit or otherwise slurping content for reasons other than search.

JS based analytics (google or otherwise) is generally a better option for detecting actual usage. Yeah, you lose maybe 2% of actual users. You also lose 99% of the various bots. You still have to filter google's and bing's bots that execute JS though.


This is such a strange reaction from HN. The AMP cache URLs have been a top 3 complaint about AMP here. "I can't copy-paste URLs, it's hard for users to understand which site they are on, it looks like the content is provided by Google rather than the real provider", etc.

Now there's a solution that preserves the preloading and validation benefits of AMP caches but maintains the original URLs, in a way that's cryptographically sound, in the process of being standardized, and controlled by the publisher. This gets launched much faster than one would have expected. And suddenly everyone pretends that the AMP cache URLs were never a problem and this is some kind of a power-grab.


I think that most people are worried about Google using a controversial[0], draft web "standard" (Signed HTTP Exchanges), that introduces a major change in how the web works, in mass production, without trying to first resolve the problems raised with the proposal.

[0] For instance, Mozilla considers the current specification to be harmful[1].

[1] https://mozilla.github.io/standards-positions/


It also puts a _lot_ of work on the publisher to implement these changes. Again.

(Unless you're using CloudFlare)


Currently it's difficult to implement, but unlike rewriting a page in AMP, signing the page is a purely mechanical operation. All that's required is better tooling; it could theoretically be a one-click change for any website out there. Initially, adding gzip support to a web server was difficult and out of the reach of many webmasters; now it's basically universal.


There was an attempt to address Mozilla's concerns[1], but Mozilla never responded, unfortunately. If the Mozilla community chooses not to respond, that might cause people to consider whether or not their position should be given much weight.

[1] https://github.com/mozilla/standards-positions/issues/29#iss...


What do you mean they never responded? They say they are working on a response[0]. Taking time to respond and informing the other party that it will take a while is not "never responded".

[0] https://github.com/mozilla/standards-positions/issues/29#iss...


3 months is a long time.... how long is someone supposed to wait for a response before you just move ahead? If the answer is "forever", it becomes trivial to perform a denial of service attack on a standard. Mozilla specifically said, "this is not high priority for us". If it's not high priority for them to respond, that's fine, but waiting forever doesn't seem like a reasonable thing to require.


I am against this standard. First, I want to see the real URL in the address bar. Second, I don't want Mozilla to spend resources implementing a specification that was made by Google for its own purposes.


I simply don't trust Google to not change the rules later.

What will stop Google from down-grading 2nd class URLs (ie, not hosted with google) to page 2 results?

It's effectively the same thing as having no AMP at all, yet they cleverly got everyone on board with this tactic.

Edit: I just skimmed through this... this looks _WORSE_ than having Google show their domain. This is some of the sneakiest most deceitful garbage I could have ever imagined.

Just no way. Need convincing? Look at the animated gif half way down:

https://3.bp.blogspot.com/-Xqfy7IhiTzc/XLY7goySWzI/AAAAAAAAD...


Sneaky because now you don't know what server a web page is coming from?

Because yes, that's true, although given the cryptography it's maybe only half true.


Sneaky because (especially for news articles), the most common web-based attack is google (or fb, etc) slurping up my information.

Now, they want to remove the remaining user interface element that says they’re spying on me!

Also, this makes it even harder to ad block their junk at the network layer (is foo.com down, or is this more amp bs?)


On their page about the AMP Viewer, Google admits that they collect users' data when they view AMP pages [1]:

> The Google AMP Viewer is a hybrid environment where you can collect data about the user. Data collection by Google is governed by Google’s privacy policy.

With the replaced URL this will be more difficult to spot.

[1] https://developers.google.com/search/docs/guides/about-amp


Because literally the entire private reason for AMP is a power grab.

Not the public reason, but absolutely the private reason.

If Google, Apple, Amazon, Microsoft, or whatever publicly traded company makes a move, it's for money and power, and preferably power, since that yields even more money.

AMP on the web and in email is the perfection of embrace, extend, and extinguish.


Power grab from whom? Bing serves AMP too.

https://blogs.bing.com/Webmaster-Blog/September-2018/Introdu...


They also push a browser based on Chromium. I wonder how many intentional incompatibility issues there will be with other web browsers that are not developed by G$$gle.


Well, I wouldn't call this a solution just yet. If you read through the documentation, you'll find that this won't work on shared hosts and requires a TLS certificate "that supports the CanSignHttpExchanges flag. As of April 2019, only DigiCert provides this extension." [1] Plus, as if the lift of transforming HTML into AMP HTML weren't already big enough for your average web site owner, implementing signed exchanges will be over the heads of 99% of the folks building web pages on the Web.

IMO, while the URL problem was a big issue, the bigger issue is that AMP's restrictions and limitations give your users a neutered user experience in the end. As others have pointed out, if it weren't for Google's implicit requirement to implement AMP (e.g. to get into their carousel and other locations), AMP would have been DOA.

[1] https://amp.dev/documentation/guides-and-tutorials/optimize-...


> as if the lift of transforming HTML into AMP HTML wasn't already big enough for your average web site owner, implementing signed exchanges will be over the head of 99% of the folks building web pages on the Web

Converting web pages into AMP isn't something you can automate, but supporting signed exchanges is. You need certificate authorities to support the flag and web servers to support the protocol, but if this catches on then the only thing you'll need from the site owner is the decision on whether to allow it.

(Disclosure: I work for Google)


Well, it's disappointing DigiCert didn't tell Google to fuck off. I hope this never comes to something like Let's Encrypt, so the vast majority of developers can never use this.

Sometimes, Google needs a gentle nudge from users saying "we don't like this" and hope they reconsider (I doubt it).


> I hope this never comes to something like Let's Encrypt, so the vast majority of developers can never use this.

Let's Encrypt's response:

> I think it’s likely too early in this draft’s development for Let’s Encrypt to prioritize implementation. It looks like it has a ways to go within the IETF before it would be an internet standard.

https://community.letsencrypt.org/t/cansignhttpexchanges-ext...


Honestly, if comcast and friends started blocking this crap by default (with an opt in for people that want to be spied on by google) I’d take back at least half the mean things I’ve said about Pai.


How do you personally feel about AMP? It looks like an attempt to make the web a walled garden.


AMP has a lot of things all together, some of which I like:

* I like that when AMP is used for ads then the ads are fully declarative. Advertisers getting to run custom javascript, even in a cross-domain iframe, isn't great.

* I like that AMP allows sites (currently primarily search engines) to trigger preloading in a way that doesn't leak information to the site that is being preloaded.

* I like the way things like "sorry AMP only allows us to use 50k of CSS" can give developers leverage to push back against bad site designs.

* I like that it centralizes some measurements: instead of every ad provider using their own custom polling system to determine if the ad is on screen they can all subscribe to events triggered by a single well written system. This doesn't affect the amount of tracking (there's lots either way) but it makes it hurt the user experience less.

On the other hand, I don't like that:

* AMP uses a ton of JS, and if all you want is a simple website it's going to slow things down in the non-preloaded case. For example, taking a random post on my site (https://www.jefftk.com/p/trycontra-implementation and https://www.jefftk.com/p/trycontra-implementation.amp) I see a median speed index of 1.611s on non-AMP but 2.051s on AMP: https://www.webpagetest.org/result/190417_XB_22673cb98ce390a... https://www.webpagetest.org/result/190417_PS_1a60378762d87fb...

* A lot of people that don't want to implement AMP are doing it because then they get more search traffic. I understand how there isn't currently a non-AMP way of doing preloading in a way that doesn't leak information to the site (see above) but I think Web Packaging should be extended to support this in the general case and allow publishers to use AMP only if they want to.

* The interaction between AMP and content blockers isn't great. If you have a content blocker set to allow some JS but not all (for example, no third party JS) then it's not going to run the AMP JS or the contents of the <noscript> block, and AMP pages will render with 8s of white screen before the CSS times out. This is a pain, but I'm not sure what the right way to fix it would be. (I wish content blockers were smart enough to figure out which <noscript> tags to run, but that's probably asking too much.)

If you wanted to expand on how AMP seems like an attempt at a walled garden I would be interested in reading it; I haven't previously read any explanations that made sense.


I guess the question is: Do you trust google to treat non-AMP pages the same as AMP pages?

If they don't/won't, no matter what your justification for why is (you believe it will provide speed, security, whatever), that's one of the walls.

Sure, you can not use it, but does that limit your ability to be found on the internet? If yes, then there's that wall again.

They're in the extend stage of Microsoft's favourite strategy.


> Do you trust google to treat non-AMP pages the same as AMP pages?

Google clearly doesn't treat AMP and non-AMP pages the same way: only AMP pages are eligible for the carousel in Google search, and there's a little icon.

Once there's a way for non-AMP pages to be safely preloaded, I would be very surprised if Google search didn't start doing that, though. (Speaking only for myself, not the company.)


And replacing the URL bar contents makes it more difficult to spot the walls [1].

[1] https://developers.google.com/search/docs/guides/images/amp0...


Fair enough, but there is now zero need to load them from the AMP cache at all: this security model could allow the News Carousel to load them from the originating site and still have access to the pre-rendering instant-load magic/lies that AMP provides.

This standard feels a little dodgy to me, a bit embrace-and-extend, but I'll see how it plays out and reserve judgement until we see it happening in the wild and how well it works. Personally I'd like to be informed in the browser chrome that content was being served via this mechanism rather than by visiting the original site.

Can you maybe see that people feel the browser is now lying to them about where the content is coming from?


If you're loading the content from the originating site, surely there's no benefit at all to signing. If you're loading the content directly from the site, the browser just needs TLS to verify the integrity of the content.

And you're also back to the situation where you can't preload the content in a controlled or privacy-preserving manner, nor do you have the page-speed guarantees, since the version being served to the user is not the version that Google crawled.

It's kind of the opposite. The cache is where the actual benefits come from. That's not the part you want to get rid of. The AMP spec was just a vehicle for making the caching possible in a secure manner.

This model would theoretically allow the validation, caching and prefetching to be done for all (signed, so opt-in by the publisher) HTML pages. Which is another one of the historical top complaints about AMP: why can't light, fast-loading, mobile-friendly HTML get the same treatment in search results.

> Can you maybe see that people feel the browser is now lying to them about where the content is coming from?

I can see that they are feeling like that, I just don't understand how they arrived there.

How is this different from, e.g., company X's website being behind Cloudflare? The browser didn't contact the actual server that company X hosted the content on. Instead the browser contacted a server run by Cloudflare that could prove cryptographically (via TLS) that it was authorized to serve content on behalf of the actual site.


> And you're also back to the situation where you can't preload the content in a controlled manner or privacy-preserving manner...

A few people have pointed out the privacy-preserving aspect of AMP. I'm not sure I get how that's the case. Is this referring to the fact that the page is not being pre-loaded from the content owner's own webserver? The main privacy violators on the internet are Google and Facebook. How is loading something from Google cache protecting my privacy?

Worse still, if someone posts an amp link on Twitter or a chat client Google now gets to know when I access a specific website even though they are an unrelated third party[1].

Edit: [1] In practice this was probably already the case since Google Analytics is so popular. But still.


Good question.

If you make a search query, but have not clicked on any results, you have a privacy expectation that the web servers of the search results you have not clicked on will not know you performed this query, your ip address, cookie, etc. For example, if you search for [headache] and then close the window, mayoclinic.com knowing that you made this query would probably be a surprising result.

With naive preloading, you would preload a search result from that origin. Your browser would make an HTTP request to the site (sending an IP address, the URL you are preloading, and any cookies you may have set on that origin). So this approach would violate your expectation of privacy.

Instead, if the page is delivered from Google's own cache, the HTTP request goes to Google instead of the publisher. Google already knows that you have made this query, and are going to preload it (the search results page instructed your browser to do so in the first place). The request will not have any cookies in it except for Google's origin cookies, which Google already knows as well. Therefore this type of preload does not reveal anything new about you to any party, even Google.

AMP has been doing this for a long time in order to preload results before you click them. However, until Signed Exchanges the only way to do this was that on click the page would need to be from a Google owned cache URL (google.com/amp/...). With Signed Exchanges, that can be fixed. The network events are essentially the same.

Note that once the page has been clicked on, the expectation of privacy from the publisher is no longer there. The page itself can then load resources directly from the publishers origin, etc.

To your last point, if someone posts a link on twitter to an AMP page on a publisher domain, and then you click it, your browser will make a network request to the publisher's origin. Google will not be involved in this transaction in any way. If someone explicitly posts a link to an Google AMP Cache Signed Exchange, then yes this will trigger a request to Google but this will be far less likely going forward as these URLs will never be shown in a browser. For example, try loading https://amppackageexample-com.cdn.ampproject.org/wp/s/amppac... using Chrome 73 or later. This is a signed exchange from one domain being delivered from another. You'll never see that URL in the URL bar for more than a moment, so it's unlikely to ever be shared, like I'm doing now.
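The who-learns-what argument above can be sketched as a toy model (all names hypothetical, no real network I/O):

```python
class Server:
    """Records which (path, client_ip) requests it has seen."""
    def __init__(self, name):
        self.name = name
        self.requests_seen = []

    def fetch(self, path, client_ip):
        self.requests_seen.append((path, client_ip))
        return f"<html>{path}</html>"

def naive_preload(results, client_ip):
    # Browser fetches each unclicked result straight from its origin:
    # every publisher learns about the user before any click.
    for origin, path in results:
        origin.fetch(path, client_ip)

def cache_preload(results, cache, client_ip):
    # Browser fetches signed copies from the search engine's cache,
    # which already knows the query; publishers learn nothing.
    for _origin, path in results:
        cache.fetch(path, client_ip)

publisher = Server("mayoclinic.com")
cache = Server("amp-cache")
results = [(publisher, "/headache")]

naive_preload(results, "203.0.113.7")
leaked_before_click = len(publisher.requests_seen)  # publisher saw the user

publisher.requests_seen.clear()
cache_preload(results, cache, "203.0.113.7")
leaked_after_fix = len(publisher.requests_seen)     # publisher saw nothing
```

With naive preload the publisher records a request before any click; with cache preload only the cache (which served the results page in the first place) does.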


Thanks, this was very informative. I'm not a fan of AMP at all, but this helps me understand the reasoning a little bit better and why Google hosting the AMP cache is necessary for preserving privacy.

At its root, I think my objections to AMP boil down to a few things:

On a technical level:

1. It's buggy and weird on iOS.

2. I'm not convinced I care about a few seconds of loading time enough to justify the added complexity of making this kind of prefetching possible. Additionally, this seems like a stop-gap that will be rendered unnecessary by increasingly wide pipes for data.

On a philosophical level:

3. It gives Google way too much power over content.

4. I want the option to turn it off completely because of points [1] and [3], and because I fundamentally want to feel in control of my internet experience.

Edit: The point about SXG making AMP URLs less likely to get copy/pasted to other mediums is a key benefit I hadn't considered and will likely make avoiding AMP outside of Google search easier.


2. How many URL's do you load in a day? My browsing history over the last 10 years averages to 417 pages per day. 2 seconds per URL is 35 days of my life...

I totally want that time saved if possible.
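The arithmetic holds up, back-of-envelope (the 2-second saving per page is the hypothetical from above, not a measured figure):

```python
pages_per_day = 417        # browsing-history average quoted above
saved_per_page_s = 2       # hypothetical time saved per page load
years = 10

total_saved_s = pages_per_day * saved_per_page_s * 365 * years
days_saved = total_saved_s / 86_400   # 86,400 seconds per day
# days_saved comes out to about 35.2 days
```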


Making everything faster won't give you more time.


That's literally not true.

It looks like you were trying to make some deeper philosophical point, but you'll have to be clearer because your statement makes no sense.


Bandwidth increases do not fix latency. If a document has to round trip from the other side of the planet, that adds about 200 milliseconds until we break the speed of light. If that same document must make several round trips to be able to initially load (very common!) this adds up rather quickly. The only solutions are localized caching and prefetching.
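A quick back-of-envelope on those numbers, assuming signals in fiber travel at roughly two-thirds of c:

```python
# Distance, not bandwidth, sets the latency floor.
ANTIPODAL_KM = 20_000       # half of Earth's circumference
FIBER_KM_PER_MS = 200       # ~2/3 of light speed in vacuum (300 km/ms)

one_way_ms = ANTIPODAL_KM / FIBER_KM_PER_MS   # 100 ms
rtt_ms = 2 * one_way_ms                       # 200 ms, as stated above

# A cold page load typically needs several round trips
# (DNS, TCP, TLS, then HTTP) before the first byte arrives:
round_trips = 4
cold_load_floor_ms = round_trips * rtt_ms     # 800 ms at worst-case distance
```

No bandwidth increase touches that 800 ms; only caching or prefetching closer to the user does.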


Yes, exactly why people think this is creepy. I also expect you not to start using my battery in the background rendering shit I haven’t asked you to, or data that, again, you don’t have permission to use. Just because the majority of users don’t care doesn’t mean you, gregable, are not corrupting the foundation of a free web. I still feel you’re making the web super creepy, grabbing extra data, and everything the project is focused on could be accomplished without this embrace-and-extend: deranking slow pages more aggressively doesn’t lead to a two-tier web and doesn’t tie everyone further into Google’s brainwashing algorithms. With this “solution”, at least for now, Chrome doesn’t visit Google for these new-style links from elsewhere, so that is some improvement. But the fact that this whole project should not exist, adds zero value, and gives me no way to opt out is a massive problem for me.


If the browser were to prefetch search results, it would leak information to all the result pages about the user having done that search. (I once had a blog post accidentally rank on the first page for "XXX". I really don't want to know who is searching for that particular term.)

Google has to know what you're searching for to compute and show the results. So there are few additional privacy implications from the preload.

And your last case is exactly what will no longer happen. People will now copy-paste the original URL rather than the cache URL. Click on the link, and you're taken to the original site.


Then don’t cache stuff that you haven’t told the user you are caching from third party sites.


> If you're loading the content from the originating site, surely there's no benefit at all to signing. If you're loading the content directly from the site, the browser just needs TLS to verify the integrity of the content.

The browser security model stops them from doing this, but presumably in this new world they could allow this to work and not host the content in the carousel themselves.

I think the argument about content suddenly becoming "slow" and no longer AMP validated if it's not served from the AMP cache is a poor one.

Finally, I'm willing to postpone judgement, but I did just explain why people feel that Google is embracing and extending the web; if you can't understand why people are worried about this, that's not something I can help you with ;-)

Cloudflare does not have the same scope, power, monopoly or scale that Google have - I can change CDN provider if they start doing weird stuff, no problem, but I can never really get away from Google.


Most people agree that URL spoofing is bad. I'm not sure why google should get a pass.


My biggest quarrel with this is that it's just another way for Google to take control of the internet. Does any search provider other than Google use AMP? Does any browser other than Google's own support this? How busy are you? You can't wait 0.5 seconds for an HTTP request? And do you think it's worth feeding Google more precise data about your movements online than they already have? And as a business integrating AMP, losing control over your own content and platform? Why?


https://blogs.bing.com/Webmaster-Blog/September-2018/Introdu...

Disclaimer: I work at Google, nothing related to AMP or search.


Bing is the best thing that ever happened to Google. It's the fig leaf that protects you from antitrust.


My counterargument to this would be: we don't need more corporate control of the internet and standards, we need less... Bing throwing their weight in isn't any better imo


> You can't wait 0.5 seconds for an HTTP request?

I don't have links to hand but everything I've seen shows real dropoffs in users as you increase the time. Once you're looking at low numbers of seconds you're looking at significant numbers of users simply abandoning the site. Half a second extra is not insignificant, and the user experience changes a lot between things that feel instant and things that have a noticeable wait.


Yes but you don't need AMP to have a fast loading website or even one that applies the same principles as AMP when it comes to having inline CSS, loading scripts async etc. The biggest problem in all of this is usually ads and analytics anyways.


Also Google controlling AMP specifications means Google can decide what widgets (from what companies) can be there on the page, what ad networks and analytics systems can be used.


This is actually the opposite: users are deceived because they think they connect to publisher's site but in fact they are still inside Google's walled garden. Their data are collected according to Google's privacy policy but it is difficult to spot looking at the address bar.

Also, Google controlling AMP means that Google decides what analytics systems and ad networks are allowed on the AMP page. With Google having its own ads and analytics business, doesn't this tempt them to make life a little easier for their own products and a little more difficult for competitors'?


As bad as the URLs were, at least you could edit them to get back to the non-AMP version if you were technically literate enough. Now there'll be no distinction, you could get sent to an AMP link from Google which is a lesser experience than the 'real' site and have no way of getting out.


> have no way of getting out

I believe if you refresh the page it triggers a request to the original site, which will probably then choose to give you the non-AMP version of the site.


Can't you just click the link at the top right that will send you to the real page as it does today?


I believe they said that bar would be going away once they rolled out 'real URLs'.


What reason does Google now have for keeping the link there?


It only works in Chrome. The Web has now been split in two, and you now have to use Google Chrome to be on the faster version. Google is shamefully abusing its power in several places here.


If by not using Chrome you don't end up on an AMP page, I consider that a feature.


Google is unethically abusing their power against non-Chromium browsers like Firefox. Speed matters in the eyes of users, even if we individually block AMP. See the link below for a general pattern.

https://www.zdnet.com/article/former-mozilla-exec-google-has...


Google just gives users what they want. I've checked the link you provided and the website is a total wreck in terms of user experience (subscription popup, large obtrusive ad banners and so on).

I have push notifications disabled, but it wouldn't surprise me if they ask you to subscribe to push notifications on the first page view.

The current era of content websites is a disaster, except for a few cases like Medium and maybe Reddit, at a stretch.

AMP is the only solution for general users who just want to google a cooking recipe or the latest news in their town.


First, everyone went out of their way to break REST[1] caching by eliminating proxies via SSL (for some good and some bad reasons).

And now we're trying to shoehorn it back in?

It used to be that a local caching squid proxy was a great way to make load times of various "front pages of the Internet" bearable on a shared low bandwidth uplink (local/national news sites etc typically being served from the cache/lan).

New SSL/TLS kinda-sorta breaks that (there's no middle ground - either install an intercepting cert that catches everything, or abandon caching on everything. Either cache CNN.com and medical records, email (webmail) and Facebook messages - or neither).

AMP might be a bridge too far - but some kind of (semi) public "signed, not encrypted" would still be a good fit for hypertext applications/documents - because of the caching benefits.

[1] As excellently outlined and contrasted by Fielding in his thesis: https://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm
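The "signed, not encrypted" idea above can be illustrated in a few lines: any cache can store and serve the bytes, and the client checks a detached signature against a key it trusts. This is only a sketch - HMAC stands in for a real public-key signature purely to keep it dependency-free; a real scheme would sign with the origin's private key and let anyone verify with the public key.

```python
import hashlib
import hmac

def sign(content: bytes, key: bytes) -> str:
    # The origin produces a detached signature over the content.
    return hmac.new(key, content, hashlib.sha256).hexdigest()

def verify(content: bytes, signature: str, key: bytes) -> bool:
    # Any client can check that cached bytes are untampered,
    # no matter which intermediary (proxy, CDN, peer) served them.
    return hmac.compare_digest(sign(content, key), signature)
```

A caching proxy would then only need to store the content plus its signature; tampering by the cache is detectable, while the bytes themselves stay cacheable.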


"Oh, you don't like the AMP url? OK, how about, we solve this problem, BY MAKING YOUR BROWSER LIE TO YOU. How does that sound?" - Google, probably.

If you look back, Google has made multiple attempts at "improving" the url. It starts to become clear what they are trying to do now.


I think everyone sees clearly that AMP is a power grab but fishes for proximate reasons to reject it instead of just sticking to that.

I don't care how AMP works. It's a power grab. Done.


the preloading and caching seems like a marginal speed boost. the main win from AMP is just the stripped-down format, which does not require the cache.


People here have been complaining that AMP is a power-grab from the very start. So I do not see this as a strange reaction.


It works best if you try understanding everything from the perspective of someone tail’ing Apache logs and fiddling with their kernel parameters.


Different groups of people are commenting at different times.

Also, Mozilla members rally around a ton of stuff here on HN. That's why you see so many posts about Rust despite the fact that it's not really that popular. That's also why the top comments on stories about MS Edge switching to Chrome were lamenting the fact that they didn't choose Firefox, despite the fact that hardly anybody uses Firefox.


Let’s hope AMP, like most google products, is shut down within the next 2-3 years


What's wrong with it exactly... besides being weird? I'm not a fan of manipulating the URL the way they do with this change, but couldn't you just opt to not use AMP if you don't like it?

Ideally people would develop fast sites on their own, but apparently they need the help of Google.


If you don't use AMP, your search engine placement suffers. Often dramatically, as all the pages in Google's top-most carousel are AMP pages.

And AMP is a pain in the ass. It's sold as being "just HTML" but it isn't, really. You can't even use an <img> tag, it has to be <amp-img>. So you have to generate two versions of every page. Achievable for large companies but if you don't have a lot of resources that's a big overhead. As is so often the case, it helps concentrate all web traffic to a smaller and smaller number of sites/publishers and shutting the rest out. That's not good.


The issue is that you can't, or you risk your site being basically blacklisted from Google. Especially if you're a news site.

Users have no control outside of not using Google. If Google were to provide a setting for the user to never see AMP, I would have less of an issue with this. But they don't.

Instead, they basically force publishers to use this, because if they don't, the news carousel will not show their article. It just gives Google more control over the web for minimal-at-best benefits.


I don't know much about AMP, so my question is: why can't it be a standard?

If there are some benefits to it why shouldn't those benefits be standardized? Is Google preventing the standardization of AMP?


> but couldn't you just opt to not use AMP if you don't like it?

This is false. As a user I cannot easily opt out of using AMP.


Nah, I see AMP staying around long enough to capture a significant portion of web share that they mine data from. Essentially it's just a way to insert themselves as "the internet".


Google products that suck the most tend to last a long time. See Google+.


Amen!


Why is that? I think AMP is great.


Users love it so unlikely.


We need to decouple two things that are mashed together in this post:

Web packaging and Signed Exchanges seem benign and beneficial: you can sign a particular page inside a package (let's say a zipped folder of some kind), and now anyone can cache that data and show it, while both the browser and the user know that it's safe to display. Since the AMP format is similar, it seems quite beneficial to now have all your AMP content support this feature. And anyone who made some of their pages AMP can use that same process to support other Signed Exchanges (such as p2p networks or CDNs). This is great since it makes distributed caching much easier.

The bad part is that Google search uses this signed exchange format not to show the actual URL but rather to put it in an iframe inside Chrome (and only Chrome). The real question is whether we will be able to use this functionality outside search: if I have my own site and show a large iframe with a signed exchange page, will I also be able to change the browser URL bar? mmph, probably not.


There is a little confusion here, understandable. Google search will not show these signed exchanges in an iframe, the pages are full frame.

Try it for yourself. Using Chrome 73 or later (you probably already have this), and a mobile browser (either a phone or mobile emulation), try the query [amp dev success stories].

It will only use signed exchanges in Chrome because currently only Chrome supports signed exchanges. The search engine explicitly looks for the browser to state that it supports signed exchanges in an Accept header, like any other new technology.

Yes, any page can use this. So, for example if you went and fetched a signed exchange from https://amppackageexample.com/ (or any other site that supports one, this is just an example), you could then serve that from your own server, more or less just like any other file (the less is that you need to set the right Content-Type header, but it otherwise works just like serving an image or a zip file).

Then, if a user visited the URL on your site https://yoursite.com/cached-copy-of-amppackageexample.com/ then the browser would display https://amppackageexample.com/ in the URL bar, as though that URL had 301 redirected, but without the extra network fetch.

Google search does exactly this, just loading a cached copy of the Signed Exchange, and any other cache (or even any website) can do the same.
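The "just like serving an image or a zip file" point above could be sketched like this with Python's standard http.server. The `.sxg` file name and extension mapping are illustrative; the Content-Type value is the media type from the signed exchange draft.

```python
import http.server

# Media type from the Signed HTTP Exchanges draft (version token "b3").
SXG_CONTENT_TYPE = "application/signed-exchange;v=b3"

class SxgHandler(http.server.SimpleHTTPRequestHandler):
    def guess_type(self, path):
        # The only special part of hosting a cached signed exchange is
        # the Content-Type header; everything else is ordinary static serving.
        if path.endswith(".sxg"):
            return SXG_CONTENT_TYPE
        return super().guess_type(path)

# Usage sketch: put cached-copy.sxg next to this script and run
#   http.server.test(HandlerClass=SxgHandler, port=8000)
```

A browser that supports signed exchanges would then display the signed origin's URL for that response, as described above, while browsers that don't would treat it as an opaque download.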


Forgive my lack of know-how, but does this theoretically mean I could download this _signed package_ to my computer along with the signature and use it later to prove that the information was provided by the source according to the signature?


Yes. You can see this among other planned use cases for the web packaging spec here https://wicg.github.io/webpackage/draft-yasskin-webpackage-u...


I'm not quite following which parts were or weren't needed for what's been enabled in the post here. For the use case of delivering a single offline package that can be opened like a website, is there something that works yet? Or a repo I should be following other than the spec?

Once I can create web packages and deliver them to clients, a lot of things I want to do become hugely easier and nicer.


I believe Chrome has already shipped an implementation, I don't know any more details unfortunately. It's still in the standardization process.

I know it's not exactly easy to follow but the only implementation repo I can think of to follow right now is the Chromium repo.


Oh interesting! I'll see what I can find there, thanks!

I also had a look in the blog and the "progressive web apps" might be the right thing to look at. There's probably something subtle that's different but I think I can use these to solve the actual problem I have.

https://developers.google.com/web/updates/2019/03/nic73?hl=h...

edit - damn, I don't think this is right at all. Frustrating, as it seems pretty perfect, but I have to serve from my own domain for 30s before a user can install it :( I just want a single-file way of delivering web content! It seems like all the features are basically there, just with restrictions that focus on different use cases.


You could prove the document was signed using the source's private key. That does prove the document was signed by the source if you can prove that only the source had access to the key.


Yes


How does the browser verify that the AMP is up to date?


Good question. The publisher signs an expiration timestamp in the Signed HTTP Exchange. The publisher can choose this timestamp and the browser will not respect signatures with expirations in the past. Note also that the specification requires, and browsers enforce, that the expiration cannot be more than 7 days in the future.
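The freshness rule described above can be sketched as follows. Field names and the function shape are illustrative, not the actual wire format; the 7-day cap is the one stated in the comment.

```python
from datetime import datetime, timedelta, timezone

# Spec-mandated cap: a signature cannot be valid more than 7 days out.
MAX_LIFETIME = timedelta(days=7)

def signature_is_valid(signed_at, expires, now=None):
    """Accept a signed exchange only if its signature is unexpired
    and its total lifetime does not exceed the 7-day maximum."""
    now = now or datetime.now(timezone.utc)
    if expires <= now:
        return False          # expirations in the past are never respected
    if expires - signed_at > MAX_LIFETIME:
        return False          # over-long lifetimes are rejected outright
    return True
```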


Wouldn't it be better to borrow from HTTP and allow a head request to the original source - with a reply of a current signature?

Isn't this whole exercise really just adapting public key signatures on top of old school caching?

With a http proxy you ask for an url, the proxy fetches or serves on behalf of the owner. This adds some circumvention around the way tls/ssl breaks that type of caching. But it should still be able to do a head-like request for a current signature - with no need to download the content again if it is unchanged?
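A rough sketch of the revalidation idea proposed above. The signature-only fetch is hypothetical - neither HTTP nor the Signed Exchange draft defines such an endpoint today - so `fetch_signature` here is a caller-supplied stand-in.

```python
def revalidate(cached, fetch_signature):
    """Ask the origin for just a fresh signature; re-download the body
    only if the content hash changed. Returns (entry, needs_refetch)."""
    fresh = fetch_signature(cached["url"])        # small request, no body transfer
    if fresh["content_hash"] == cached["content_hash"]:
        cached["expires"] = fresh["expires"]      # content unchanged: extend lifetime
        return cached, False                      # no full re-download needed
    return None, True                             # content changed: re-fetch the body
```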


There is in fact some draft language around this kind of a mechanism to update a signature to extend the lifetime of the document by fetching a remote URL. See https://tools.ietf.org/id/draft-yasskin-http-origin-signed-r... .

Doing this on every page load breaks either user privacy (by making the origin fetch before the user clicks) or the preload performance gain itself (by blocking load while waiting for this round trip).


But if the signature is expired, preload would fail anyway, which would trigger a regular load "on click" - but that click should maybe result in a head request for possibly just getting an updated signature?


The intermediary (Google in this case) can choose not to serve an expired exchange.


It's ridiculous. Google wants to keep users at its domain so much that it invents a whole technology to substitute address bar contents. This shows how harmful it is when a company has a significant market share in several different areas (browsers and search engines).

I hope at least Mozilla doesn't adopt this technology and will show the true URL.

This technology is complicated. Browser vendors have to implement all of this only to please Google.


Indeed, Google just has too much market share.

Last week I blocked Google from my domains (blog: lucb1e.com/!130), hopefully others will follow suit and degrade the search quality until people get better results (at least for some more obscure content) elsewhere, or perhaps until Google notices we are really not okay with their behaviour.


Are you blocking GoogleBot by IP range or User-Agent match? Why aren't you using your robots.txt file to block GoogleBot instead or in addition to your server-side logic?


Robots.txt was my first thought as well, but that is said to not actually block your site from appearing in the results. They'll gather from other sites what the page is about (think <a href=mysite/somepage>how to knit a sweater</a>) and show that as the title without a page summary. Maybe if it looks like the site is down, they won't bother.

Blocking is based on user agent, they seem to set that reliably and the IP addresses change. You can do some reverse lookup magic but this was way easier than looking up every single IP that visits my site.


This is the exact opposite of "keeping users at its domain". That was the situation _before_ they implemented this standard. Now users will get sent to the publisher's domain instead (via a prefetched page load).


But it's served from Google, so they still control all the analytics


No they don't. The page contents are controlled by the publisher and cryptographically signed so Google can't alter it. Another improvement over the previous situation.


Since it's not fetched from the publisher, they will have to use Google Analytics or have nothing.

Guess what they’re going to choose.


Google Analytics isn't the only analytics solution out there. Publishers can use literally _any_ method of gathering analytics that's not server logs.


Remember the talk about how the Chrome team was going to "rethink" the navbar, and what domain and site identity really mean? And people were a little worried about this?

Turns out people were right to be suspicious. This is hot garbage. You can no longer ask a user "What URL does your navbar say you're at?". It is no longer a source of truth. They will actively be lied to.


But what does it mean that you are on a particular URL?

For a long time already it hasn't been connected to a particular physical server. Now it's the next step - to be completely decoupled from the server and just mean content instead.


This is meant to offload tracking from Google Analytics and SERP clicks, which are used to track user behavior (but can be blocked), onto services that cannot be blocked short of blocking Google domains entirely.

If Google hosts the website and is masking the resulting url, they're able to have more visibility than Google analytics. They'll likely give this AMP some SEO boost temporarily and that will get web admins to adopt the technology.

It's just like reCaptcha, which is used to track users across the web (requires google.com + gstatic.com urls to load, which drops its own cookies or scans existing ones), blocking recaptcha will break core web functionality... and recaptcha v3 is even worse.


Web publishers don't necessarily want their content decoupled from their own servers, but they don't have a choice now if they depend on traffic from Google.


You are not decoupled from the server. Google still sees the HTTP requests you make in plaintext and collects your data according to their privacy policy. It just won't be obvious, because of the publisher's URL in the address bar.


There was no need to be suspicious. Google wasn't being sneaky about it; they have been actively talking about, promoting, and openly developing this feature for at least a year.


This sounds terrible. Does it mean that browsers will begin lying to users and say that the users are visiting the website's server when they are really visiting a restricted version of the website that is hosted in Google's cache? I don't want my content restricted or hosted in Google's cache.

AMP doesn't load in a privacy sensitive way. It's on Google's servers and it takes many seconds to load if you have JavaScript disabled.

Also, the feature only works on Google Chrome and possibly Edge, which gives another point to the article below.

https://www.zdnet.com/article/former-mozilla-exec-google-has...

AMP is a fundamentally bad idea that needs to disappear.

Edit: Mozilla has marked Signed HTTP Exchanges as harmful.

https://mozilla.github.io/standards-positions/


The browser displays the URL from the origin that digitally signed the unmodified content.

A browser already doesn't show you what server delivered the content. That would be your wifi AP, cell phone tower, or ISP node. The internet has already long established that we can trust content without trusting intermediaries.

There are two elements that are important: integrity and privacy. The content integrity is protected via a digital signature, the "signed" part of "signed http exchanges". The signature proves that the document hasn't been tampered with.

Regarding privacy: The intermediary (a search engine in this case) already has the content being delivered as a result of crawling it. It also knows the user clicked on a link to get that content, and knows the user's ip address. Even without AMP or Signed Exchanges, the privacy situation is the same. Once the page is loaded, all further interactions with the origin are normal https traffic, so later requests are not different in privacy either.

What this enables, for search results, is the ability to load the bytes of the content before the user clicks a search result. If the browser prefetched those bytes with the origin's awareness, then the user's privacy with respect to the search query would be violated, making prefetch problematic. With this setup, documents can be prefetched while preserving user privacy and after the user clicks all browser behavior continues as normal from that point forward.


AMP allows Google to see exactly how you interact with every page on the internet.

Just from the text of the pages you visit they can build a profile around you. What your interests are, how much of an article you're likely to finish, whether you're the type of person to highlight text as you read, etc.

Unless you live on an island with a poor satellite connection AMP is useless as anything more than a corporate user data collection tool.


AMP documents don't share user data with Google, which can be trivially seen by inspecting the network events that the page generates.

If the publisher chooses, they can send logging to Google Analytics, but this is not part of AMP.

The typical argument otherwise is that the AMP javascript is loaded from Google's cache, however these javascript resources allow for a very long cache lifetime (1yr if the page came from the Google Cache), so relatively few page loads will actually end up fetching them from the network for most users.

Edit: These resources are also on cookieless domains.


> The typical argument otherwise is that the AMP javascript is loaded from Google's cache, however these javascript resources allow for a very long cache lifetime (1yr if the page came from the Google Cache), so relatively few page loads will actually end up fetching them from the network for most users.

Christ this is thin as a privacy argument.


> AMP documents don't share user data with Google, which can be trivially seen by inspecting the network events that the page generates.

Is there anything preventing Google from changing this later?


No. If Google can change the way the web works from day one, they can change anything they want. Don't forget Google is killing IMAP and DNS already. Why not HTTP too?


Also, Google explicitly states that it is collecting data in AMP Viewer [1]:

> The Google AMP Viewer is a hybrid environment where you can collect data about the user. Data collection by Google is governed by Google’s privacy policy.

I assume they collect information from the HTTP requests the browser sends when requesting an AMP page.

[1] https://developers.google.com/search/docs/guides/about-amp#a...


> AMP documents don’t share user data with Google

They might not now, but couldn't Google start creating unique URLs on each page, allowing them to track you that way?


They can already do that, and are doing so, through Search, Analytics (maybe), ads, etc. That war is long lost.


They can't if you block all their shitty domains and don't use google services. Things that many privacy-conscious users do.


We are talking about their AMP cache. If you don't use Google services, you'll never get there - unless you like to prepend their AMP cache URL to your links.

Their AMP cache happens only on their search service. They already know which links you click... having an AMP cache on top doesn't give them MORE information than they already get. The use of that cache also makes sure the website doesn't get more information, because it's preloaded.


That's not entirely true though, is it? Any link shared on Reddit, or here, or on any social network by a Chrome user can be an AMP one.


If (or when) the share of privacy-conscious users rises, Google might motivate webmasters to compile GA scripts into the main JS bundle, and considering that pretty much any website nowadays just doesn't show content without JavaScript enabled, it would be much harder to avoid.


I browse mostly without JavaScript on, and that's not true; easily more than half of websites work just fine without it, and that number goes far up if you accept some lack of features. Though there are some that indeed don't work at all.

Although your point is well taken that there could eventually be ways to sneakily track users despite the aforementioned measures, and potentially even without JavaScript being required (though I doubt that the share of privacy-conscious users will ever rise significantly - most people simply don't care).


No excuse.


Google can't tell if a link has been clicked if JavaScript is off and the `ping` attribute is removed, so AMP removes privacy there.

By forcing web publishers to host their content on a Google cache, they lose their server-side logging and the ability to determine the way they serve their own sites.

Also, why do you artificially slow page loads on AMP pages to 8 seconds when JavaScript is disabled? That is a privacy issue.


The linker (google in this case) could rewrite the link to use a redirector if they choose. If Javascript is off, AMP and thus Signed Exchanges are disabled on Google search results anyway.

You misunderstand the 8 second CSS animation in the AMP boilerplate. Here's the code (simplified):

  <style>
    body { animation:-amp-start 8s steps(1,end) 0s 1 normal both}
    @keyframes -amp-start{from{visibility:hidden}to{visibility:visible}}
  </style>
  <noscript>
    <style amp-boilerplate>
      body{animation:none}
    </style>
  </noscript>
See the noscript section: if javascript is disabled, the CSS displays the body immediately. If Javascript is enabled, but for some reason the AMP javascript fails to load, after 8 seconds, the page is displayed anyway. The page is probably somewhat broken without the javascript loading, but the 8s is a fallback, not code to slow down non-javascript browsers.


There are legitimate (privacy/speed) reasons to not load AMP's JavaScript while still not turning off JavaScript entirely. Google does have the capability to know when you're on an AMP page, because the JS loads from ampproject.org, which is registered to Google.

An 8-second delay seems like an intentional "bug" to coerce users to turn on JavaScript (and advertising).


The javascript is heavily cached, so it will not trigger a request on every page load.

That is not the intention. If javascript is disabled entirely, Google Search won't even load AMP pages. The scenario you describe of a user loading an AMP page directly without javascript enabled is somewhat rare.


Many people use tools to block third party JS from loading. AMP can't be called privacy-friendly while making it extremely difficult to use when tracking (AMP Analytics) is blocked. The 8-second delay happens to me every time I accidentally click an AMP URL in my browser.


I don't use Google Search, and I frequently get sent to Google's AMP cache via other link sources (e.g. HN).

I don't have javascript blocked, but I do have Google's tracking blocked via standard tracking protection (which is now a built-in feature in most non-Google browsers), which means <noscript> tags are not triggered, and I get the 8 second delay due to non-loading JS resources.

I don't think my setup is as rare as you make out.


It is clear that the current developments on the web are worrisome and we need real privacy. We need to be able to find a website and visit it completely anonymously, unless we actively submit information to said website or a court order is issued.

A cell phone tower or ISP node is ideally just infrastructure, "plumbing". Google seems to be trying to advance their strategic position in that direction. Rather than just being one search engine among several, they are trying to become part of the infrastructure. This could prevent future privacy solutions (and even prevent competition between search engines).


The real reason to make this spec is not to improve integrity or privacy or anything else, but to make users stay on Google's domain instead of going to other sites. Google wants to build its little walled garden, and this spec is needed to make users think that the walls aren't there.

> With this setup, documents can be prefetched while preserving user privacy and after the user clicks all browser behavior continues as normal from that point forward.

But Google can already preload and show a cached version of the page without this spec. The only difference would be that the address bar shows "google.com" instead of the publisher's domain. There is no need for this specification.


> A browser already doesn't show you what server delivered the content. That would be your wifi AP, cell phone tower, or ISP node.

No. Incorrect. Completely backwards. Factually wrong. You just failed your networking-exam.

Those things you mentioned would be transparent networking nodes forwarding your TCP-packets and they have nothing to do with any layers above that.

The fact that you don’t even know this completely invalidates any other point you may have.


I think you're missing the point of the GP. It says that you don't know and you don't care which particular server returns your content - is it a self-hosted machine, is it a cloud machine, is it a CDN? There's no way of knowing unless you inspect the deeper stack. What is very visible is which BRAND (i.e. URL) returned your content.

So this AMP exchange technology changes nothing in this regard. It's like Google provides its own free CDN; it is just not done in a traditional manner.


> It says that you don't know and you don't care which particular server returns your content

Which is plain wrong. I care.

When the URL bar says I'm looking at company.com, I expect my browser to have used my OS's DNS resolver to look that name up, connect to the given IP, and nothing else.

I certainly don’t expect it to send traffic to certainly-not-the-nsa.com which are MITMing my traffic and tracking/monitoring it.

If I can’t trust my browsers URL-bar to exclusively and accurately reflect what is actually requested, it is effectively lying to me, the user, it’s owner.

And then suddenly all URLs are phishing URLs because Google made URLs no longer matter or mean anything.

Completely unacceptable.


My point is that even if you look at the URL bar currently and it says company.com, you don't know what you're connecting to. Probably you're connecting to CloudFlare/CloudFront/Akamai/Fastly/any other CDN which is set up with good-enough certs to impersonate the domain. Therefore you're not trusting a particular server, you're trusting a relationship that the domain owner built with their service providers.

The proposed scheme is just another way to extend this kind of relationship that the publisher builds, a new mechanism if you will. There is nothing in there that requires more or less trust on your part than before.

You're complaining that you need URLs to reflect what is requested - in fact, I argue that you want the URL to tell you what is being served. But this is not what's currently happening.

URLs are already lying to you.

I doubt that you do a WHOIS lookup on all DNS-resolved IPs to verify that the IP presenting a cert is assigned to the organisational entity that you want to connect to, and have a whitelist of those entities that you actually allow your browser to connect to. Because that's what's currently required to make sure you don't go through CDNs and other intermediaries between you and the publisher.


Using a CDN currently means the company uses trusted mechanisms like DNS to delegate certain traffic to other providers (like with Cloudflare). And it does so for everyone.

In which case the URL serves what was requested.

What AMP does is provide google.com content and lie to the user and says it comes from company.com.

Which isn’t true, and it only does so for users coming from google.com. Where I’m sure google will be happy for the additional tracking data.

This is NOT the URL the user was led to believe he requested. This is not what everyone else is served.

This is malware.


> I don't want my content restricted or hosted in Google's cache.

how is this different than using your own domain, but pointing it to a github.io page? Or using medium, but with your own domain (but still being served from medium's servers)?

Is it just google you're adverse to, or the entire idea of someone else hosting your content?


1) I want full control over my servers and to not be penalized in search engines for not hosting my sites on Google. Where are the server-side logs?

2) I want full control over how I publish my sites with real web standards. AMP is not a web standard, it's a Google format that they are strong-arming people into using.

3) Mozilla considers Signed HTTP Exchanges harmful. This technology is as bad as what Microsoft was doing with IE in the old days.

4) I don't publish on Github pages, but if I did, I would still have a choice over which servers I put the sites on.

5) There shouldn't be a single company (or few companies) that dictates how we publish online.

6) Shame on the people who are splitting the web with this fake-opensource technology. There's even a Google engineer over here referring to the Web like it's a Google product. https://news.ycombinator.com/item?id=19631136


As per point 6, I wouldn’t take what was said there as a statement from Google, or potentially even an employee of Google. They posted it from a throwaway... anybody wishing to kick the hornet’s nest could have posted that, employee or not.


It's not written like someone trying to kick a hornet's nest. It's written like someone who has been conditioned inside of a culture that has begun to view the Web as a Google product on some level.


And if somebody was wanting to kick a hornet’s nest, that’s exactly how you’d want to write it :).

My point is, you cannot just blindly trust anonymous comments to be who they say they are, it’s an easy way to get yourself in trouble.


But if the comment was, say, digitally signed, on the other hand... ;)


DNS is the answer to the first two questions.

However the last question is a fair point - nobody complains about CloudFlare's caching of your web page as you designed it.

The critique of AMP is that it receives privileged placement in search results, and that content authors are being pressured into adopting this de-facto Google-controlled spec, where they host your content and control its presentation. Anything that furthers AMP helps Google in this effort.


That's a good point! Domain owners can host their websites wherever they like, and yes that includes Google's cloud.

If they go through a content network like Cloudflare, you can't even tell who's hosting the site by looking at the IP address.

It drives home the point that websites are abstractions that have no necessary relationship to any particular physical hardware. Network tools may or may not tell you a bit more about the source, depending on if there are any leaks in the abstraction.


There is a difference between the web publisher controlling that abstraction and a web publisher that has been strong armed into one abstraction or another.


There are incentives, but publishers still make their own decisions.


Being penalized in the search results is outright coercion, not an incentive.


I didn't even know about "HTTP Exchanges", and I'm more interested than ~98% of the population about this kind of stuff.

Showing the name of the "signer" in the address bar, instead of the server where the content is actually hosted goes against decades of browser UI design.

Good on Mozilla for marking it as harmful.


> Showing the name of the "signer" in the address bar, instead of the server where the content is actually hosted goes against decades of browser UI design

Does it though? If you use Cloudflare or Akamai or Cloudfront or Netlify or etc. etc. then what shows up in the URL bar is not the server where the content is actually hosted. Well, it is the server where it is hosted, it's just one of the many domains hosted by that server.


That has never been different. Cloudflare & co are reverse proxies, for all intents and purposes from a user agent view, they are where the content is coming from. They are the ones pointed to in DNS, and they have valid SSL certs.


And how is this all that much different? In fact I would say it's more secure. DNS can be spoofed pretty easily. This is a cryptographically signed package. If anything, I'd have more faith in this changing my URL than a proxy via DNS.

Just because Google invented it doesn't make it bad.


> In fact I would say it's more secure. DNS can be spoofed pretty easily. This is a cryptographically signed package

How is it more secure? If, as you say, DNS can be spoofed easily - I can easily get a certificate issued with the required extension and make a "cryptographically signed package".


> If, as you say, DNS can be spoofed easily - I can easily get a certificate issued with the required extension and make a "cryptographically signed package".

Spoofing DNS to clients is much easier than spoofing DNS to certificate authorities. Otherwise domain-validated HTTPS certs wouldn't mean much.


> And how is this all that much different?

It changes the meaning of the address bar from "this is who I'm talking to" to "this is who (at some point in time) signed this content".


But when there is a CDN there, "who I'm talking to" is really just an intermediary who pretends to be you, and may have in fact modified the content. With this, it is still an intermediary pretending to be you, but at least now the package is signed and can be verified.


The CDN is you, for all intents and purposes. It's your agent in the back and forth, as much as your hosting provider would be. A third-party cache isn't.

I don't mind that you can sign and verify content, that's fine and useful. I'm just not a fan of changing the address bar's meaning.


But what I'm saying is that the meaning that you ascribe to the address bar is incorrect -- it already only tells you who published the content, not who you are actually connected to.

What I'm saying is that this does not change the meaning of what's in the URL bar. It's the same as before. It tells you who published the content originally.


> it already only tells you who published the content

No, it tells you the origin of the document. If you are the creator, and you choose to put your content on server X it will tell you "I've got this from server X". Whether that server is a reverse proxy or a shared webhost or a dedicated server in a DC or a raspberry pi running on your desk doesn't matter - it's the designated original that you, the owner of example.org chose.

That's what it always meant, and it changes when you do a redirect, and it shows you the current URL even if there is a canonical header or http-equiv. I can put a reverse proxy on my host and proxy example.com to example.org - the address bar tells you that you're reading example.com, not example.org, as it should, because you're connected to me, not to example.org.


This is just semantics.

Do a traceroute on any domain and you'll see that the server isn't the one that gives you the answer, but some intermediary. Sure, in that case when you made the request the content is fresh and the server answered RIGHT NOW, but a cache still got the content from the server; it's just a bit older.


I decided it's time to give DuckDuckGo another shot. I just realised it's a lot nicer to scroll through its results than Google is now.


I've been using DDG for at least a year now. On some occasions I can't find what I need and end up checking Google, but in those cases, Google usually can't find what I need either.


Signed HTTP exchanges may be harmful, but Google is beginning to get enough dominance so they implement it and browsers with a minor market share must follow or are left behind.


What happens if other browsers don't implement it? It seems like they'll just show CloudFlare or Google's domains, instead of the signing domain?


The behavior for browsers without support is to show the google.com/amp URL as before, along with a small html-based bar with additional information about the original domain and share intents.


With a button to disable AMP results entirely if that's the wish of the user?

Yeah, I didn't think so.


> share intents

Does that mean that the Google+ button is coming back? Seriously? Why not just serve the content and leave it at that? Is the tiny bit of extra data you get from a unique "share on Facebook" URL worth it?


The share button simply calls the browser's share API, for example: https://developer.mozilla.org/en-US/docs/Web/API/Navigator/s...

> The Navigator.share() method invokes the native sharing mechanism of the device as part of the Web Share API.


I didn't know the Web Share API existed, but based on the RFC, it looks like yet another Google-driven "standard." I still don't see why it needs to be added to the page.


> AMP doesn't load in a privacy sensitive way. It's on Google's servers

Only if you load the page from a Google SERP, in which case, Google would already know if you visit the page. If it's loaded from a Bing SERP, it's served from a Bing server, and the same for Baidu and other AMP caches. This is far more privacy preserving than preloading a page from some third party web server that the user might never visit.


I feel like Google is pushing AMP and components too hard. I've started to balk at the idea of using them even for explorations.


AMP is like Brussels Sprouts. If it's forced on you when you are young, you will grow up hating it.


Brussels sprouts are good for you. AMP is more like medical experiments performed on you during an alien abduction.


It's solely the search engine boost you get from AMP that bothers me. Because of the money involved many sites have no choice but to implement AMP and stay competitive. If it weren't for that, it would just be another technology and the fact that it only works in few circumstances would probably make many sites not bother with it. The UX downsides would probably see many sites actively avoid it.

The fact that it has seen such adoption is testament to Google's ability to influence with its rankings alone.


Have you noticed ranking improvements? We've done AMP on some sites and not others and we saw no difference in ranking.

Sure, having a very quickly opened page is nice, but on the other hand, features are limited. That might or might not work well, depending on what kind of content you have, what engagement you're looking for.


What's up with this extreme hatred for AMP? I personally love AMP.


AMP is a threat to the entire Web itself. It forcibly takes control away from web publishers and attempts to turn the Web into a Google product.

Type "amp sucks" into a search engine to find out more.


On iOS it continues to be very broken, although the difference in scrolling "inertia" was resolved by Apple.

AMP introduces a very non-Appley top bar within the browser, adds new swipe semantics that can be confusing, breaks "tap status bar to scroll to top" behaviour, breaks reader mode (although this is inconsistent), and generally looks out of place. The best way to describe it is like a GTK or KDE app running in macOS. It's clearly not a "native" experience and doesn't really look or act like any other webpage in mobile Safari.


Most people do not like to be forced to do things a certain way. Especially if they have a working site already and now have to remake it from the ground up just because some other actor decided it isn't good enough to get visitors.

Just you wait until you notice you can't go to town in your car any more. Only Teslas are allowed into the city.


That's a bad example for me personally since I think all cars should be banned (except maybe electric cars but I haven't done the necessary research to see the actual environmental impact). I haven't used my driver's license in years and always take the train (I've also stopped flying).

But anyway, that's off topic. I understand that it's a pain for developers but for users like me who are often on a bad connection it's a life saver.


Seems like a reasonable idea. The content server says "here, you hold this for me", and the address bar only shows who originally signed it.

One could imagine replacing the serving layer with something like BitTorrent or IPFS.


Yeah, I think this feature and the signed exchanges standard both sound great. It allows CDN-like servers to host content without having to be trusted to not modify the content. That sounds like an improvement over the current CDN situation.

Also, sites that link to other sites can preload the linked site's content into the user's browser, without leaking the user's IP to the linked site, so if the user doesn't follow the link, nothing about the user is revealed to the linked site. That sounds like a performance and privacy improvement wrapped up into one. I'm finding the rest of this discussion thread extremely disappointing as it seems like most of the posts here are just "amp=bad and amp people like this so it's also bad".
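As a rough sketch of why the cache no longer needs to be trusted: the publisher signs the content, and the browser verifies the signature regardless of which server delivered the bytes. This is illustrative only; real signed exchanges use certificate-based signatures (with a dedicated certificate extension), not a shared-key HMAC as below.

```python
import hashlib
import hmac

# Stand-in for the publisher's signing key. Real signed exchanges sign with
# a certificate's private key; HMAC is used here only to keep the sketch
# self-contained.
PUBLISHER_KEY = b"example.org-signing-key"

def sign_exchange(url, body):
    """Publisher signs (url, body) before handing it to any cache."""
    mac = hmac.new(PUBLISHER_KEY, url.encode() + b"\0" + body, hashlib.sha256)
    return {"url": url, "body": body, "sig": mac.hexdigest()}

def verify_exchange(exchange):
    """Browser re-derives the signature; a tampering cache fails the check."""
    mac = hmac.new(PUBLISHER_KEY,
                   exchange["url"].encode() + b"\0" + exchange["body"],
                   hashlib.sha256)
    return hmac.compare_digest(mac.hexdigest(), exchange["sig"])

pkg = sign_exchange("https://example.org/article", b"<html>original</html>")
print(verify_exchange(pkg))  # untouched content verifies

tampered = dict(pkg, body=b"<html>injected ad</html>")
print(verify_exchange(tampered))  # modified content fails
```

The point is that the serving layer only moves bytes; integrity comes from the publisher's signature, not from trusting the cache operator.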


Signed Exchanges could also remove some of the concerns around JavaScript crypto, because there is a (potentially offline) key that signs the web app you're running, so you're not vulnerable to hacks of the hosting environment itself.

What's really needed is a way for the browser to lock a given web app/package to a specific version (and hash), so that even if the signing key becomes compromised, the app can't auto-update to a newer version containing malicious code.

Combining this with something like Certificate/Binary Transparency would allow browsers to check that they are not being uniquely targeted with a specially altered version, and you could set a policy saying "Only auto-update to a newer version of this web app if its hash has been published in a log for more than a month (and/or endorsed by signatures from N out of M other organisations I trust)".
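A toy version of that policy check, with hypothetical names and a made-up 30-day threshold (the comment suggests a month; nothing like this exists in the spec today):

```python
import hashlib
from datetime import datetime, timedelta

# Hypothetical policy: only accept an update whose package hash has been
# visible in a public transparency log for at least 30 days.
MIN_LOG_AGE = timedelta(days=30)

def package_hash(package_bytes):
    return hashlib.sha256(package_bytes).hexdigest()

def update_allowed(new_package, log_entries, now):
    """log_entries maps package hash -> datetime it first appeared in the log."""
    first_seen = log_entries.get(package_hash(new_package))
    return first_seen is not None and now - first_seen >= MIN_LOG_AGE

log = {package_hash(b"app-v2"): datetime(2019, 3, 1)}
print(update_allowed(b"app-v2", log, datetime(2019, 4, 15)))  # logged 45 days ago
print(update_allowed(b"app-v3", log, datetime(2019, 4, 15)))  # never logged
```

A compromised signing key could still sign a malicious version, but it couldn't be pushed to targets until it had sat in public view long enough to be noticed.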


Semi-related, I think Web Packages and Signed Exchanges could have some usefulness outside of Google's caches. One of their spec examples was for verifiable web page archives.

Another idea it could be used for a wifi "drop box" (drop station?) when there's no internet connection around. That isn't uncommon at some popular spots up river into the woods in the US.

The idea is that as people enter the area, they can update the drop station automatically for things like news or public posts with whatever they've cached recently.

I'm pretty sure I read about this idea before the spec was drafted but I couldn't find or remember the site, something like vehicle-transported data.


You may be thinking of this use cases section here: https://wicg.github.io/webpackage/draft-yasskin-webpackage-u...


Thanks. IIRC the site I saw was from a few years ago, before the spec was drafted. (I updated my post to be more clear). Pretty sure there was a few photographs on the page out in the flat grasslands.


In general, this sounds like an interesting use case.

One thing to note is that the specification currently limits the lifetime of a signed exchange to 7 days. It's possible that by exploring some of these use cases, especially offline, the spec could be improved with respect to some of these constraints.


Unfortunately, signed packages won't work for archival or any significant offline use. The signed exchanges are forced to be short lived (in days) to limit the damage that can be done when someone steals a TLS private key.

It's a very narrow spec designed just for AMP, basically.
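A minimal sketch of that freshness constraint (the function and field names here are illustrative, not taken from the spec):

```python
from datetime import datetime, timedelta

# The signed exchange spec caps signature validity at 7 days after signing,
# which is what rules out long-term archival use.
MAX_LIFETIME = timedelta(days=7)

def exchange_is_fresh(signed_at, now):
    """A conforming client must reject an exchange older than the cap."""
    return now - signed_at < MAX_LIFETIME

signed = datetime(2019, 4, 1)
print(exchange_is_fresh(signed, datetime(2019, 4, 5)))  # 4 days old: ok
print(exchange_is_fresh(signed, datetime(2019, 4, 9)))  # 8 days old: rejected
```

So an offline drop box would need fresh re-signed copies at least weekly, which defeats most archival scenarios.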


AMP pages take forever to load, I hate seeing a white screen > 1 second. with amp that's all I see on mobile. A white screen burning my eyeballs, seemingly forever. And then I snap out of it and hit back to escape from AMPs empty void.


I read in probably another HN thread that using an adblocker will actually slow down an AMP page - there's a hardcoded 3-second CSS delay which is cancelled via JS once it's detected that the page has loaded.


It sounds like you're probably running something that blocks 3rd-party JS and also doesn't execute <noscript> blocks. See: https://news.ycombinator.com/item?id=19680033


I would _pay_ google to be able to disable AMP permanently on mobile web results. The experience is the absolute worst. I'm fine with them wanting to ruin mobile web (that's their choice), but PLEASE let the users be able to disable this terrible "feature."


This, x1000. I just can't believe there is no setting to turn it off.


I want Google to die already.

It's so obvious why AMP is on the main google.com domain.

They're collecting your shit.

They're doing the same thing with ReCaptcha, also on the main google.com domain.

Break this shitty company apart.


After reading all of the comments here, this seems like a good thing.

This fixes the main UI issue with how AMP is currently used in Google search - mainly the URL not properly showing where the content comes from.

If signed exchange is treated the same as AMP pages in Google search, i.e. SXG content will be preloaded whether or not it's AMP, it would get rid of the second complaint about AMP - that Google's preloading of the content is an unfair playing field and that the only reason it's fast is because it's preloaded.

If SXG is treated the same as AMP in the carousel then that would fix the last and most serious complaint about AMP.

As far as I can tell, Google does seem to be moving in that direction, so this should be applauded, not derided (the original fiasco that is AMP notwithstanding).


Probably time for a congressional and/or DOJ inquiry into whether AMP is an example of Google abusing its monopoly power in the search engine space.


The headline is an outright lie: these AMP pages are loaded from Google and not your domain.

The new feature is that Google's browser displays your domain, obscuring the fact that Google is doing the serving. The change is what is displayed, not the server.


Indeed. If a web page is being served or loaded "from your own domain" that implies something very specific.

What Google actually means here is "We make AMP pages _appear_ to come from your own domain".

That's something entirely different.

This whole thing is just more doublespeak.


It's even worse than that.

When I had a website with embed videos from other sites, I had user contacting me because the other sites had some problems. They couldn't tell the difference between megavideo/youtube/dailymotion content and my site, so they came to me and blamed me.

So what this means is that not only Google bullies you into putting your traffic under their control, but now, any problem on their part will be blamed on you by the user.


> So what this means is that not only Google bullies you into putting your traffic under their control, but now, any problem on their part will be blamed on you by the user.

I hadn't even considered that. Add to this Google's notoriously absent customer support department and you have a recipe for a lot of frustration.


WTF. Lost for words.

So Google’s browser now directly lies to the user about what’s being loaded?

I bet the SSL mark is still there though?

How can anyone trust this Googlan horse?

This is why you don’t make a browser and control major web-assets at the same time. These lines should not be muddied.


At this point, if you ignore the amp aspect, how is this any different from plain http caching?


The consumer is being lied to about who is serving their request and who is tracking their online activity as a result.


When I click a search result for company.com I expect to be taken to the resulting page at company.com, not Google’s HTTP-cache of that page.


next month they'll also style it like your browser's native address bar for a better user experience and introduce a W3C standard API for hiding the real address bar. /s?


They already had the braindead idea of hiding parts of the URL like "www." or "m." so it's not that unrealistic unfortunately.


They have been trying to make it so that users can't tell if they are on real webpages or AMP pages, and it looks like they finally implemented it. AMP is about Google, tracking, and ads, not page speed, even if they have convinced many of their engineers that it's about page speed.


I hate AMP. Not just for google thinking they own the internet. They never seem to load right on my phone and crash a lot too.


So, when will Google roll out signed exchanges for plain HTML content? That's much more interesting, and if combined with e.g. a restriction such as a Lighthouse speed score of > 60, it'd be in all measurable ways better than AMP.

Faster than AMP, more open than AMP, and all the benefits of AMP.


Does nothing for publishers' needs for deeper control and analytics. Just a "feel good" gesture that results in additional complexity for everyone involved. Google is not the only company in the world that knows how to load a page efficiently.


The publisher's cookie-based analytics will operate on the origin in the URL bar in this case. The document (though not the delivery server) will have access to publisher origin cookies.

Conceptually, you can think of a signed exchange as a 301 redirect to a new URL which has already been cached by the browser (so there is no 2nd network event). The cache was populated by the contents of the signed exchange, assuming the signature validates.
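A toy model of that analogy (all names hypothetical; the real verification of certificate, signature, and freshness is elided to a boolean here):

```python
# A toy browser cache keyed by the *publisher's* URL. A verified signed
# exchange lets the browser file content delivered by some cache server
# under the original URL, so navigation needs no second network event.
browser_cache = {}

def receive_signed_exchange(publisher_url, body, signature_ok):
    """On preload: store verified content under the publisher's URL."""
    if signature_ok:  # real check: validate cert + signature + freshness
        browser_cache[publisher_url] = body

def navigate(url):
    """Like following a 301 to a URL whose response is already cached."""
    if url in browser_cache:
        return browser_cache[url]  # served locally, no network event
    return "network fetch of " + url

receive_signed_exchange("https://example.org/a", "<html>a</html>", True)
print(navigate("https://example.org/a"))  # hits the prepopulated cache
print(navigate("https://example.org/b"))  # falls back to the network
```

This also shows why the publisher's cookies and analytics work: once the content is filed under the publisher's origin, the document runs in that origin.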


AMP URLs are ugly, so cleaning them up is good for users.

