
Improving URLs for AMP Pages - taytus
https://amphtml.wordpress.com/2018/01/09/improving-urls-for-amp-pages/
======
ENGNR
A step in the right direction but it doesn't entirely solve the issue. To make
this complete they simply need to remove all Google nav/branding/back button
from AMP pages (or at least offer the option)

AMP should have been purely an open source library implementing a
specification, not a way to opt-in to becoming a sub page within Google

If I go out to a new URL it should then cut away the relationship with the old
site, even if AMP makes that transition a bit quicker and gives the new site
tools to load the page more quickly

~~~
DyslexicAtheist
_> A step in the right direction_

AMP is a pest. A textbook example of all things wrong with the web today. I'll
spare you my thoughts because it would only be filled with hate, foul language
and insults. Good luck with it though.

~~~
larkeith
I agree wholeheartedly - there are very few things that will cause me to avoid
a site quite so quickly as the Google result leading to an AMP page.

------
jopsen
The most interesting thing to me seems to be that they plan to decouple signing of
content from TLS connections. So that packages could be signed using normal
TLS certificates (or something like that).

Hmm, so maybe in some future static-only sites will be able to sign a bundle
with offline keys and not use TLS at all. Or maybe we just sign static bundle
with a TLS key for our origin and upload the bundle to Google and other web
caches. As in maybe the internet can be distributed again.

I see lots of interesting potential in decoupling origin verification from TLS
connections.

Web Packaging Format Explainer:
[https://github.com/WICG/webpackage/blob/master/explainer.md](https://github.com/WICG/webpackage/blob/master/explainer.md)
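
To make the "sign a bundle with offline keys" idea concrete, here is a minimal sketch of verification that does not depend on how the bytes were delivered. It is not the Web Packaging signature scheme, which is certificate-based. It swaps in a Lamport one-time signature only because that can be written with the Python standard library alone. All names (`generate_keypair`, `sign`, `verify`, the sample bundle) are hypothetical.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _bits(data: bytes):
    """Yield the 256 bits of SHA-256(data), most significant bit first."""
    for byte in _h(data):
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


def generate_keypair():
    # Private key: 256 pairs of random secrets, kept offline by the origin.
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    # Public key: the hash of every secret, published by the origin.
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def sign(private, bundle: bytes):
    # Reveal one secret per bit of the bundle hash. A key must sign only once.
    return [pair[bit] for pair, bit in zip(private, _bits(bundle))]


def verify(public, bundle: bytes, signature) -> bool:
    if len(signature) != 256:
        return False
    return all(
        secrets.compare_digest(_h(revealed), pair[bit])
        for revealed, pair, bit in zip(signature, public, _bits(bundle))
    )


private_key, public_key = generate_keypair()
bundle = b"<html>static site bundle</html>"
signature = sign(private_key, bundle)

# Any cache or peer can hand over (bundle, signature); the check below only
# needs the origin's public key, not a TLS connection to the origin.
print(verify(public_key, bundle, signature))
print(verify(public_key, bundle + b" tampered", signature))
```

The point of the sketch is the trust split: whoever serves the bundle does not need the signing key, and a modified bundle fails verification regardless of which server delivered it.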

~~~
majewsky
> maybe in some future static-only sites will be able to sign a bundle with
> offline keys and not use TLS at all

These two approaches are built for different threat models. Both protect you
from tampering, but only TLS also hides metadata such as exactly which page you
visited. With TLS, attackers can only observe which domain you visited.

~~~
WorldMaker
Glancing at the spec, these packages could have multiple pages or an entire
site's worth of pages for a static site, so it should still protect most of
that metadata for packaged static sites. Someone might be able to collect that
you downloaded the full package, but not have any idea of which pages within
that bundle you visit.

The spec also encourages/opens up the possibility of exchanging those bundles
over peer-to-peer networks instead of HTTP, which further mitigates the threat
of over-the-shoulder metadata collection.

------
benatkin
It's tacky reading "y'all" in an official google blog post about how they're
going to hijack browsers to further centralize the web, and where they're
suggesting that giving google all the traffic helps improve privacy, but I
guess that's the kind of world we live in.

~~~
kavok
Y’all is a pretty useful word. Is using you’re or they’ll tacky as well? It
might be that you have a subconscious cultural bias against the word.

~~~
mattmanser
Outside a certain community of Americans, it's an incredibly tacky word that
really grates whenever I read it.

Why anyone thinks it's the slightest bit professional to use it in official
communications is mind boggling. It was sort of trendy for a little bit on
Reddit with a certain type of abrasive and annoying American, but thankfully
it seems to be becoming uncool again.

It is also an extremely imperialistic word. Americans use it, it feels as if
it only refers to Americans, and Google is only listening to Americans.

~~~
kavok
It isn't only used in America, but I would agree that it is predominantly
known for that. Usage of the word appears to be increasing. It fulfills a role
in the English language that no other word seems to hit as easily. It'd be
nice if the word were more publicly acceptable. I didn't grow up in the south
but always found the word to be useful.

~~~
WorldMaker
Yeah, ever since the Middle English combination of orthography issues
(eth/thorn versus y) and religious literature accidentally merged the old
singular second person pronoun (thou) and the plural one (you), English has
missed an important counting word. `y'all` might not be the best solution to
that missing hole, but it's the best one we (all) have around these days.

~~~
dfxm12
...and the fine folks from one end of Southern Pennsylvania to the other
collectively sigh as "youse" and "yinz" aren't even considered.

~~~
WorldMaker
Personally, I've considered and rejected them. :)

"youse" falls into the bad pattern of also picking up "guys" or "all" as
hangers on, in my opinion, defeating the purpose. ("youse guys" being the
terrible patriarchic movie Mafioso cliché, and "youse all" a terrible
Frankensteinian monster I've heard far too frequently.)

"yinz" to me looks and sounds more like a weird pharmaceutical than an English
word, and y'all aren't going to convince me otherwise. ;)

But of course, my opinion is biased by geography and familiarity.

------
makecheck
Somewhat predictable to see the mess evolving. Once you start peanut-buttering
over something, not quite all the nagging problems go away and then you need
even more “solutions”. Then even more.

Enough, Google. Making small web sites is EASY, OK? No AMP needed: just write
your content and, as if by magic, it is small and loads nearly instantly. If
web sites are bloated and slow, close them and use something else. Stop
hyperextending the web to make lousy programming practices the norm.

~~~
mikeokner
Yeah I find it strange that publishers & such bother to release AMP versions
of their sites but then opt to keep their crappy, bloated versions around as
the default.

Google should just more explicitly rank pages based on load times and
incentivize sites to fix their crap.

~~~
wmf
_Google should just more explicitly rank pages based on load times and
incentivize sites to fix their crap._

Apparently they already did this and it had no effect. It seems like anything
less than a binary signal (in the carousel vs. not) is too subtle for
publishers to understand.

~~~
CaptSpify
They could just strengthen the weight of that to be more noticeable. But
Google has a vested interest in keeping JS everywhere. I seriously doubt that
they tried hard.

~~~
Andrex
> Google has a vested interest in keeping JS everywhere

I believe AMP pages have no JavaScript.

------
garganzol
What a nice touch they published that on amphtml.wordpress.com. Like, "see, we
are free as in beer and represent the voice of people". By using
*.wordpress.com instead of *.google.com they execute a well-calculated PR
strategy.

But in reality, Google tries to racket the free and open web in order to
squeeze even more juice to feed its insatiable corporate greed.

~~~
ec109685
No, they have a marketing company run their amp website.

------
dcow
Usually I try to be constructive, but I just need to get this out: fuck AMP. I
don't care downvote me to oblivion I'm a little buzzed but FFS who actually
wants AMP and why is it even a thing? Why can't Chrome just prefetch shit from
the actual servers and let ISPs handle the caching? Why does mother Google
need to serve me all the content from its overly suckled teat? I know everyone
working on AMP means well but why why why does Google insist on destroying the
internet and entirely undermining TLS in the process? Sorry. That was
therapeutic.

Bonus Quiz:

1\. When I encounter an AMP link I... a) Click it. b) Don't click it. c) I
don't see AMP links using Firefox.

2\. When my friends send me an AMP link... a) I click it. b) I don't click it.
c) Friends don't send friends AMP links.

3\. Reasons I've switched to Firefox... a) I love Rust. b) I care about
privacy. c) I hate AMP.

4\. My ISP is... a) Google b) Chrome c) None of the above.

The answer is 'c'.

~~~
Zarel
> who actually wants AMP and why is it even a thing?

AMP is a standard that restricts webpages to a subset known to load very
quickly, which is especially useful if you have a mobile device, or are in a
country with poor internet such as in the USA.

Here in this thread are people talking about how much they like AMP from a
user's perspective.

> Why can't Chrome just prefetch shit from the actual servers

The linked article answers this: for privacy reasons.

> and let ISPs handle the caching?

HTTPS does not allow ISP-level caching. This is generally a good thing; I
trust ISPs significantly less than I trust Google.

> Why does mother Google need to serve me all the content from...?

Performance, presumably.

> I know everyone working on AMP means well but why why why does Google insist
> on destroying the internet and entirely undermining TLS in the process?

I don't think they're doing that.

~~~
grey-area
_AMP is a standard that restricts webpages to a subset known to load very
quickly, which is especially useful if you have a mobile device, or are in a
country with poor internet such as in the USA._

That's not all it does. No-one would object to it if that was all it does.

The more important part is that it hijacks content and serves it from other
servers, _and_ requires including a js file from a large corp in every page.
That's a massive vulnerability waiting to happen, but it also gives complete
control of the web to whoever controls that js.

They need to ditch the requirement for js, and ditch the requirement for
framing with Google junk around pages. The web is an open ecosystem, that's
its strength.

Also, google should not be using their influence in search to push changes
which are profitable for them - that's abusing their monopoly position.

~~~
Zarel
I never said it was all it does. It's the reason AMP exists, and the reason
certain users like it.

The "hijacks content" part is by itself unobjectionable, especially with this
latest update we're discussing which fixes the URL bar issue. The other server
will serve a checksum so Google can't tamper with the contents. That makes it
just a free CDN.
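
As a rough sketch of that checksum idea: the client hashes the copy it received from the cache and compares it with a digest from the publisher. This assumes the digest reaches the client over a channel the cache cannot modify. The function names and sample bytes are hypothetical, and this is not how the AMP Cache is actually implemented.

```python
import hashlib
import hmac


def sha256_hex(content: bytes) -> str:
    """Hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(content).hexdigest()


def cache_copy_is_intact(cached_body: bytes, publisher_digest_hex: str) -> bool:
    # Constant-time comparison of the digest we computed with the one
    # the publisher announced.
    return hmac.compare_digest(sha256_hex(cached_body), publisher_digest_hex)


original = b"<html><body>Article text</body></html>"
publisher_digest = sha256_hex(original)  # assumed to come from the publisher

print(cache_copy_is_intact(original, publisher_digest))
print(cache_copy_is_intact(original + b"<script>evil()</script>", publisher_digest))
```

A bare digest only helps if the cache cannot also replace the digest, which is why the newer proposal relies on signatures rather than a plain checksum.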

Is the problem that you do not want Google to see your content at all? You'll
need to use robots.txt to ban Googlebot. You can't simultaneously want to
appear in Google search results and also not let Google see your website.

The required JavaScript is legitimately frustrating, I know. The AMP project
has an article about why they did it that way:

[https://medium.com/@cramforce/why-amp-html-does-not-take-ful...](https://medium.com/@cramforce/why-amp-html-does-not-take-full-advantage-of-the-preload-scanner-7e7f788aa94e)

Specifically, it's to prioritize resource loads. I personally don't think
their explanation is very convincing. But whatever, maybe it's easier for them
to do it this way or something.

I don't think Google is doing it to push changes which are profitable for
them, though. I legitimately believe they're doing it to make the pages load
faster and otherwise be better for the users. I don't even understand how it
could be profitable in any other way.

~~~
grey-area
_I never said it was all it does. It's the reason AMP exists, and the reason
certain users like it._

Strategically, for google, owning the frame around the web and a bit of js on
each web page is vastly more valuable than customers having faster web pages.

So no, speeding up web pages is not why AMP exists.

------
niftich
The meat of the story is:

_"We embarked on a multi-month long effort, and today we finally feel
confident that we found a solution: As recommended by the W3C TAG [1], we
intend to implement a new version of AMP Cache serving based on the emerging
Web Packaging standard [2]."_

I'm just reading through this so I'm gleaning as I go, but it looks like the
W3C TAG came out with a recommendation for 'Distributed and Syndicated
Content' [1] that specifically addresses AMP by name, and recommends
strategies to do this kind of content syndication in a way that preserves the
original provenance of the data.

The Web Packaging Format [2] apparently [3] aims to package not bare resources
but, rather, HTTP request-response pairs (maybe HPACKed?), signed and hashed
for integrity, in a flat hierarchy, in a CBOR envelope that nonetheless has
MIME-like properties? I'm still digesting what's all involved.

[1] [https://www.w3.org/2001/tag/doc/distributed-content/](https://www.w3.org/2001/tag/doc/distributed-content/)

[2] [https://github.com/WICG/webpackage](https://github.com/WICG/webpackage)

[3] [https://github.com/WICG/webpackage/blob/master/explainer.md](https://github.com/WICG/webpackage/blob/master/explainer.md)

------
ghughes
> Publishers shouldn’t know what people are interested in until they actively
> go to their pages.

Yes, that privilege is reserved for Google.

~~~
baddox
Well, Google doesn’t know anything until you go to a Google page either. Not
counting things like Google Analytics, of course, but publishers choose to
share that info with Google.

~~~
dannyw
Google Chrome, when signed in, uploads your web history to Google by default.

~~~
jankey
Emphasis "when signed in".

~~~
dmitriid
You don't need to sign in. Everybody knows everything about you. Whether it's
through tracking cookies, or through Google Analytics, or through device
fingerprinting.

Google quoting privacy concerns is especially disingenuous, as they can track
you and your activities across multiple devices.

------
spc476
Can someone explain what this is about?

> As we detailed in a deep-dive blog post last year, privacy reasons make it
> basically impossible to load the page from the publisher’s server.
> Publishers shouldn’t know what people are interested in until they actively
> go to their pages. Instead, AMP pages are loaded from the Google AMP Cache
> but with that behavior the URLs changed to include the google.com/amp/ URL
> prefix.

To me, this reads as "for _our_ privacy, we don't tell the publisher what page
has loaded" but that may be an uncharitable interpretation. I read the
referenced blog post and it didn't clear up anything about the "privacy"
issues.

~~~
niftich
The post makes the case that they don't want the publisher to know what AMP
thinks the user would want to visit, and that's more a matter of the user's
privacy, since that implementation would have the user's user-agent as the
origin.

Instead, the concern is sidestepped by the extra indirection: the user's user-
agent will load the prefetches from the AMP cache.

How much of this is moot given that many browsers, including Chrome, offer
speculative fetching as a feature, is debatable.

~~~
pgeorgi
> How much of this is moot given that many browsers, including Chrome, offer
> speculative fetching as a feature, is debatable.

There's a difference between explicit prefetching (given by html tags, ie
there's intent), speculative prefetching on the same origin (you already talk
to them) and speculative prefetching across the entire net (you talk to
somebody new out of the blue).

Preloading search results for faster display without a local Google-side cache
means that more parties know that you (IP, User Agent, cookies) are
potentially interested in certain pages due to a Google search (referer
header).

With the AMP cache as currently implemented (and with the TAG bundles in a
future version), Google gets to know that you just got the URLs A, B and C
proposed by Google. Which is no additional information for anybody, at Google
or elsewhere.
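
A toy model of that difference, counting only which hosts receive a request when search results are prefetched. The host names and the `hosts_contacted` helper are made up for illustration, and real browsers are more nuanced than this.

```python
from urllib.parse import urlsplit


def hosts_contacted(result_urls, cache_host=None):
    """Return the set of hosts that receive a prefetch request."""
    if not result_urls:
        return set()
    if cache_host is not None:
        # Every prefetch goes to the cache run by the party that already
        # produced the result list, so no new party learns anything.
        return {cache_host}
    # Direct prefetch: each publisher sees a request before any click.
    return {urlsplit(url).hostname for url in result_urls}


results = [
    "https://news.example/a",
    "https://blog.example/b",
    "https://shop.example/c",
]

direct = hosts_contacted(results)
cached = hosts_contacted(results, cache_host="cache.search.example")
print(sorted(direct))
print(sorted(cached))
```

With direct prefetching, three publishers learn of a possible interest before the user clicks anything. With the cache, only the search provider's own cache host is contacted.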

If this new scheme allows rolling back some of the less fortunate effects of
AMP (the visibility of the AMP cache URL, the in-page URL bar emulation as a
workaround to that), all the better.

------
matthewmacleod
Still shite. AMP still breaks scrolling and results in weird stub pages that
are missing features. And I can't turn it off.

AMP-enabled pages load faster, but on the other hand I have an ad-blocker and
LTE that gets 10Mbps, so the improvement is negligible. Not worth breaking the
web, IMHO.

~~~
dcow
Stop using Chrome. Problem solved.

------
ericflo
I don't understand. So now the URL bar won't always show where the page was
actually loaded from? It could show example.com but really be loaded from
Google's AMP servers? If I'm reading this right, I find it very sad.

~~~
wmf
That's how CDNs already work. For example, apple.com is served from Akamai.

And AMP is planning to use cryptographic signing so that the browser can
verify that the CDN didn't tamper with the page.

~~~
jopsen
Well, today Apple trusts Akamai with a certificate for their origin:
apple.com

In this future, it seems like the content will be signed with Apple's TLS
certificate/key and then the content can be distributed by anyone.

This could open the doors for a more distributed web too :)

~~~
ericflo
To be honest, the more that I've looked into this, the more it seems like a
good idea.

It was not presented as a new and fairly dramatically different way of
distributing content on the web, it was presented as a way to get around
showing the URL in response to pushback from the web community.

I think if this announcement were written differently, I would've come away
with a much better impression of the whole thing.

~~~
jopsen
Yeah, the AMP thing just seems like a nice use-case...

But the new doors this opens in terms of content distribution are very
interesting.

Shared caches might be a thing again... Who knows :)

------
bla2
I don't get all the amp hate. They seem to make an effort to do the right
thing, and from a user perspective it's a great experience imho.

~~~
acdha
> from a user perspective it's a great experience imho.

Until you need to share a link, wonder why the page loaded slower than normal
thanks to 100Kb of render-blocking JavaScript, or get phished or believe a
spoof because it has google.com in the URL.

I really like the stated goals but shipping something with usability problems
is a great way to tarnish its reputation. Hopefully this new incarnation
will live up to the original hope.

------
lucideer
From the article:

> _while maintaining the [...] privacy benefits of AMP Cache serving_

"AMP Cache serving" == hosted on Google's server. This makes this statement at
best, stupidly oxymoronic, at worst, deliberately dishonest advertising.

> _privacy reasons make it basically impossible to load the page from the
> publisher’s server._

Browsers (including Firefox[0][1]) already do this. There are no "privacy
reasons" preventing this. The only reason not to do this is to present another
justification for opting into their AMP Cache product.

> _can take advantage of privacy-preserving preloading and the performance of
> Google’s servers_

Also a contradiction in terms.

[0] [https://developer.mozilla.org/en-US/docs/Web/HTTP/Link_prefe...](https://developer.mozilla.org/en-US/docs/Web/HTTP/Link_prefetching_FAQ)

[1]
[https://bugzilla.mozilla.org/show_bug.cgi?id=1016628](https://bugzilla.mozilla.org/show_bug.cgi?id=1016628)

------
ferongr
I'll go against the grain. I like amp. The fewer 50MB pages drain my 250MB data
plan, the better.

------
mcny
I'm sorry. I don't get this hatred for amp. Amp is not the problem here. The
problem is that we accept third party JavaScript on every page on our
monetized website. From what I understand, you can completely self-host and be
amp-compliant. Is that not the case? If you care about user experience, you
should not let the highest bidder* run JavaScript on your website on your
user's devices. Demand your ad network that they provide ads that are just
text and images. Demand your ad network to host that text and those images
themselves with a reliably low latency and so on. Or better yet, remove the
advertising if you care about user experience.

If you are a news organization and Google won't let you be in a certain
section without serving on Google, complain.

My understanding is that AMP exists to solve a problem. It obviously isn't the
only way to solve that problem.

~~~
lucideer
> _From what I understand, you can completely self-host and be amp-compliant.
> Is that not the case?_

I don't believe this is the case, no. Unless how AMP works has significantly
changed recently, it requires your site to be hosted on, and served from,
Google's servers.

This hosting is called "AMP Cache". When AMP first launched, Google's servers
were the only available "AMP Cache". They have since added the ability to set
up your own AMP Cache, which seems like some attempt to appease people's
concerns with AMP being Google-only, but doing so seems an utterly pointless
enterprise because of the below (from AMP's docs):

> _How do I choose an AMP Cache?_

> _As a publisher, you don't choose an AMP Cache, it's actually the platform
> that links to your content that chooses the AMP Cache (if any) to use._

So if you set up your own AMP Cache, this just means your site will be hosted
on Google's servers, and served from Google search results on Google's, and
probably no one will ever visit the copy on your own AMP Cache server.

~~~
pgeorgi
The idea of additional AMP cache servers is that bing, yahoo, facebook,
twitter, ... could set them up for their outbound links, so they can implement
the same preloading behavior as an optimization (since they tend to have
servers that are closer to users, and the connection already exists and can be
reused).

Think of the Google AMP cache as an extended web site snippet on the
google.com search, not as your primary delivery mechanism: To make this safe
and useful for all parties (publisher, user, link service providers such as
google search or twitter), AMP defines an html/css/js subset that is
considered safe but still functional - so that (for example) analytics can
still be made to work, which is important to publishers, but without working
in the AMP cache hoster's domain context, which is important for their
security.

Another AMP cache provider is Bing:
[https://blogs.bing.com/search/September-2016/bing-app-joins-...](https://blogs.bing.com/search/September-2016/bing-app-joins-the-amp-open-source-effort)

------
fragmede
In related news, Firefox Quantum is out; rewritten in Rust. And, good news: the
mobile version supports extensions, in case you're missing ad block on mobile
Chrome.

~~~
jhasse
Unfortunately it's a pretty bad experience on Android thanks to
[https://bugzilla.mozilla.org/show_bug.cgi?id=806385](https://bugzilla.mozilla.org/show_bug.cgi?id=806385)
(for example YouTube links won't open the native app)

~~~
vocatus_gate
If that's the worst that happens I'll take it.

~~~
jhasse
Scrolling performance is also a lot worse.

------
dveeden2
I once wrote something similar to figure out what happened based on corrupted
and non-corrupted input.

[http://unicode-doctor.myname.nl/](http://unicode-doctor.myname.nl/)

