Hacker News
A letter about Google AMP (ampletter.org)
866 points by lainon on Jan 9, 2018 | 346 comments

I do not believe page speed should play a role in search rankings at all. It has nothing to do with the content, and the person with the most correct and relevant content doesn't always have the relevant skills or a DevOps specialist handy to meet the requirements.

This only gives the heavy-handed, SEO-optimized sites with deep pockets and time to kill yet another edge.

The article with the best insight isn't likely to be the one with the perfectly optimized website.

I understand that page speed affects the end-user experience when they hit a website; however, that's not what I searched for. I did not search for "fastest website with okay knowledge about dogs", I asked for "website with the best knowledge about dogs".

I want Google to be able to show me the most awesome page about dogs. The most in depth and relevant information. That niche dog blogger who is so passionate that they spend all their time researching dogs. I don't want "10 cool facts about dogs by Buzzfeed".

This seems clearly false.

Let's say the page with the best information about dogs took an infinite amount of time to load (never loaded). I don't think you'd want that page to be ranked highly.

What about if that page took a year to load?

What about if that page took an hour?

A minute?

What's your cutoff? Let's say, generously, that you're willing to wait 30 minutes for the page to load. Why such a sharp cutoff? Why do pages that take 30 minutes and 1 second to load get penalized, but pages that take 30 minutes are treated as fine? The user experience is basically just as acceptable (by your standards) / just as bad (by my standards).

Perhaps instead of a sharp cutoff we need some sort of sliding scale, where pages that are only slightly slower than optimal are penalized only slightly, and pages that are much slower than optimal are penalized more.
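A sliding scale like that is easy to sketch. As a toy illustration (the threshold and scaling numbers here are made up, not anything Google actually uses):

```python
def speed_penalty(load_seconds, target=1.5, scale=0.1):
    """Hypothetical graded ranking penalty: zero at or below the target
    load time, growing smoothly (and capped) the slower the page gets."""
    if load_seconds <= target:
        return 0.0
    # Penalize by how many multiples of the target the page overshoots.
    return min(1.0, scale * (load_seconds / target - 1))
```

A page at 1.6s takes a tiny hit, one at 30 minutes is maxed out, and there's no arbitrary cliff anywhere in between.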

Gosh, that sounds familiar.

There's a cutoff built into browsers in the form of a KeepAliveTimeout. A longer load time could have a valid reason, that's not for search engines to decide.

Also, your example of a year is obtuse. It's like saying you should drive a BMW because you will get killed in a Tesla if you crash at a million mph.

What Google's page speed factor does is differentiate between 1.5 and 2.5 seconds loading time. That has nothing to do with the quality of a page.

Your assumption is that the time factor can't be handled by the user. Users try somewhere else when a page doesn't load, and in my opinion that is fine. Granted, it's my opinion. Perhaps most users don't agree and here we are.

If it is truly the best information, it is what I want and I will wait. If I trust that Google can give me the best information then I would be willing to wait. I give up easily now because I can't trust Google to give me quality results. I will not wait for an ad bloated news aggregator.

When a user does a search, clicks on a result, finds that it doesn't have what they want, hits back, and then clicks on another result, that is a failure of the search engine. "Users can decide the site is worthless after they click" is an opportunity for somebody to come up with a better system.

I guess I simply don't agree. The search engine should give me the best information, period. It does not, and one of the reasons it does not is because it is busy prioritizing things that aren't relevant to content.

That example doesn't reflect the real-world differences in page load time, however. IMO the load time arbitration doesn't belong in the "search" engine. If your search engine fetches 10 fast-loading pages that lack the depth of information you need, it loses utility.

The tricky thing is also that Google does not inform the user that it is prioritising the faster-loading pages instead of the pages with the most accurate match. This is a little dishonest too.

It's good if it plays some role when the content quality is similar between results. Some pages are so slow or obnoxious that I'd almost rather not get the information at all than wait dreadfully for ads and pop-up modals to load. Perhaps ranking is the wrong tool, but it would be nice if I could know that a site is bloated before clicking the link. AMP lets me know that.

I agree. And if the speed metric is well-designed, and if making not-slow pages is easy enough (and it really is) then the Nash equilibrium is "Everyone gets the most relevant page and it loads quickly."

Once upon a time, Matt Cutts used the example of 'a cure for cancer' marked up badly with poor HTML structure but providing the true result, versus a well-marked-up site with the wrong info. Which should rank better?

The same, IMO, can be applied to speed as a factor. A slow site (due to depth of content?), badly marked up but providing the correct result, should outrank the fast site providing wrong info.

I’d argue that pages that are static with few images should be given preference. Simple CSS rules, no JavaScript (or very little of it), not a ton of images, etc. People from all walks of life are capable of publishing this content and it’s equally as accessible.

Pages that should be heavily penalized are those with lots of extraneous JS, bloated CSS files, lots of auto play videos, and huge image files that are supposedly created by “experts” that need a DevOps team to deploy and distribute all the bloat.

What if it's a photographer's portfolio? An art collection site?

Sometimes images make sense.

But nobody waits longer than 8 seconds. That's been the golden webdev rule for ages. Anything above deserves to be punished.

Art collections are easy to develop with icons and progressive images.

Unless you go there for the artist and you know what you're after.

Progressive images... eh. The JS you load for them is 3-4x the size of the images at 720p. Progressive JPEG, that's better.

> This only helps the heavy handed SEO optimized sites with deep pockets and time to kill get yet another edge.

This sounds plausible, but doesn't translate to reality. Look at the slowest content sites today: they're large, well-resourced news sites. Look at the small, independent individuals running their own personal site: many are very fast.

In reality, I'd say the biggest causes of website performance issues are (a) institutional bureaucracy - managers requiring their website to do something without regard for the overhead - and (b) professional graphic designers defining UI features without technical knowledge of their implementation details. These are both, usually, corporate inefficiencies that have little effect on smaller maintainers.

What also seems to be missing in Google's metric is that of Page Size.

That is, by also having fewer bytes in the page, less bandwidth is used. Since less bandwidth is used, less energy is used.

Unfortunately, Google doesn't measure, nor seem to care about, energy consumption, as GoogleBot only cares about response times. (Of course, there are energy concerns regarding that too... but it's only ever about speed with Google, which punishes everyone else for not evangelising it.)

Using Page Size as metric has important benefits that Page Speed deliberately undermines. It also resurrects the notion that using external cached css and js is a good thing.

Did you know that in order to get 10 URLs (the thing that you and I actually want) from Google, you need to download 500K? It seems ludicrous to me. Each search result is chock-full of CSS and JS that doesn't change.

Thankfully HN is old-school and uses external files and good-ol' minimalism.


Aren't page size and page speed (and energy consumption) correlated? If you request a 1MB page and a 1KB page from the same server, the 1KB one will load faster in basically all scenarios.

If you can test speed close to the requester geographically, it's possibly an even better indicator of energy usage because it also implies the data doesn't have to travel as far.

Also, Google has no idea what external resources your browser has cached locally.

> Also, Google has no idea what external resources your browser has cached locally.

I'm pretty sure they have enough data to figure that out.

I absolutely agree with you. Speed shouldn't play any role in ranking. However, in regards to relevancy, RankBrain is going to eventually train itself to push the most relevant/respected results. We will have to wait and watch how it pans out.

> The article with the best insight isn't likely to be the one with the perfectly optimized website.

That isn't what search is ranking. It's ranking what users want. And users most definitely want, on average, among other things, fast pages.

Did you ever try to look up something over satellite internet? AMP is a tremendous help, not because all other pages are slow but because you know you'll be able to load the page in <3s.

Funnily enough I spent a year or two in rural Australia on satellite Internet and I definitely see your point. I think it's okay for a site to offer an AMP capable site, I just don't think it should factor into the ranking. Perhaps there could be a Prefer AMP pages flag? Or the inverse, ignore Page Speed. I will concede that the majority of users probably get by with page speed as a factor and that my concerns are not those of the general user base.

As an aside, gaming on satellite internet was a real drag, the only game I had that would handle the full seconds of latency was Star Wars Galaxies. Thankfully that was already a favorite.

It is obvious Google is doing it for their own benefit as well. The faster your page loads, the faster they can crawl.

> Instead of granting premium placement in search results only to AMP, provide the same perks to all pages that meet an objective, neutral performance criterion such as Speed Index.

But who will measure it "objectively"? (Leaving aside that no one agrees on how to universally measure load speed.)

Googlebot? From what location and what machine types and how often? Should I get search results based on an average of all possible variables or the closest matching my locale and device? What happens when page content changes? What happens when the page sometimes loads slow content (e.g. ads) and sometimes doesn't? What happens when SEOers start cloaking their ad loads or taking advantage of flaws in the benchmark?

It's good to have a speed signal in search rankings, but this petition shouldn't pretend that's an objective replacement for something that always displays content first and loads media and third-party content async.

> But who will measure it "objectively"

One potential solution that fits the theme of the letter could be logic like this:

    MAX = <reasonably fast load time>

    if page loads in > MAX:
        -> punish page rank
Rather than the current logic:

    if page uses GOOG proprietary:
        -> show at the very top
        -> trick anyone that clicks
        -> profit
        -> ban website
        -> reach out to sell AdWords
"Reasonably fast" could be relative to the context - for example, ecommerce results might be a bit larger on average than news results, etc., so not a static "3 seconds at 3G speed".

"Reasonably fast" could also be defined more loosely as "page uses good optimization practices" (e.g., like a YSlow grade).

> MAX = <reasonably fast load time>

If you define 'reasonably fast' as ~10ms after click, the only possible option is AMP. Even a .txt file will be orders of magnitude slower.

Privacy reasons mean that it is basically impossible to load the content from the publisher's server until after the user clicks on the result. If you wait until the user click event to load the content, then there is nothing you can do to avoid a full network round trip.

AMP preloads the minimal content before the click, removing the round-trip. That's the whole magic, and the only way to get the 'instant' load.

There are unavoidable technical requirements to do this:

- Loading from Google's (or link provider X's) servers to preserve privacy.

- Constraining the html so that the load itself can't break privacy or add jank to the search result page.

If you aren't willing to accept these constraints, then you are always going to face a network round-trip in added latency. You can't have all 3 of:

- Zero perceived latency

- Privacy

- Serving from the publisher server

You can only pick 2. AMP forces choosing the first 2 which are what the vast majority of users care about.

Then AMP goes way out of its way to give publishers back everything they could possibly miss from the 3rd.

> - Constraining the html so that the load itself can't break privacy or add jank to the search result page.

Why this? If the HTML is already loaded from Google's servers, the load shouldn't be able to break privacy even more, and jank shouldn't be affected by the specific HTML that's being served, just by its size. So Google could just preload everything that's small enough.

Good question. A preload that actually renders the document includes running enough JavaScript and parsing enough CSS that the initial document will render. With a non-constrained document, that can include fetching resources not on Google's servers, which can peg the CPU, causing jank.

If you do constrain what can run in your preload to avoid these, you end up designing AMP.

> AMP preloads the minimal content before the click, removing the round-trip. That's the whole magic, and the only way to get the 'instant' load.

There is another way which enables almost instant loads. Turn off js completely. Try it yourself before you disregard it. It literally turns page loads instant over a regular home connection.

This is strictly slower than an AMP preload, which is why I mention that even a .txt file will load slower. The .txt file document still requires a network round-trip, regardless of javascript.

On some wired connections, a network round trip is within user-perceivable 'instant', but for many people on mobile devices, this is very noticeable.


Even on the best LTE networks, the 'core network latency' is 40-50ms. 'core network latency' is the latency for getting the packet from the phone to tower to the packet gateway. This is before the packet even goes on the internet. And you need to double it for the round trip. You usually need to double the whole thing for initial DNS lookup.

Best possible latencies for an html page load on excellent LTE connections without prefetching are around 150-200ms. Most users in the world will experience >1s. A United States major city wired connection is not at all representative of what most people experience on a mobile connection.
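The back-of-envelope arithmetic behind those numbers, using the assumed figures from the comment above:

```python
# Assumed LTE figures from the comment above; real numbers vary widely.
core_latency_ms = 45            # phone -> tower -> packet gateway (40-50ms)
rtt_ms = 2 * core_latency_ms    # double it for the round trip: 90ms
dns_ms = rtt_ms                 # double the whole thing again for the DNS lookup

best_case_ms = dns_ms + rtt_ms  # ~180ms before the server even starts responding
```

That lands squarely in the quoted 150-200ms best-case range, before any server processing or content transfer at all.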

Any WebPageTest comparison between Desktop and Mobile will demonstrate that RTT can be _vastly_ different between Desktop and Mobile.

Just look at Time to First Byte (TTFB); when you pile on SSL negotiation (since most have moved to HTTPS), the advantage is even greater since all AMP pages currently fall under the aegis of Google's domain and all non-AMP pages will require that overhead at first click.

Yeah I was talking about computers, not phones.

Then you are missing the point of AMP.

Increasingly, phones are the dominant computing tool accessing Google search results. Optimizing for phone experience is key to serving the largest growing demographic of searchers.

Preloading website content is not what I want for my phone. I have a very low datacap and I go out of my way to disable AMP and preloading because web publishers believe that my bandwidth can be used freely by them in an effort to "reduce latency". Instead it robs me of the little data I have for my cap.

That's fair, but I think you're losing perspective on the tradeoffs. The AMP preload will fetch only the main HTML file and the AMP javascript. Images, videos, etc. are all deferred until later. The javascript is cached for a year, so you probably already have it. So, compressed, this is probably ~100kb or less.

Google doesn't start preloading 50 documents, it loads the top result or 2. Even viewing a hundred amp pages in a month is going to cost you less data than one 10kx10k unoptimized image that a publisher decides to load, which happens a lot. Note that AMP optimizes images too taking into account your device resolution, so it's most likely saving you bytes.

I don't want anyone preloading anything on my mobile connection other than the absolute necessary. Period.

Everything else wastes my bandwidth, and I don't care that it is only 100kb; my monthly, effective datacap is about 50MB and I always browse without images.

The "100kb or less" is only about 500 pages preloaded, not accounting for the size of the google webpage itself. I'd rather not have any preload at all.

Note also that Google doesn't show AMP for desktop user-agents.

Google already measures a bunch of site speed parameters (time to render start, time to render the content text the user is trying to read/search for, etc.)

They factor all that stuff into a user happiness metric which impacts rankings, probably via a bunch of machine-learned models.

The AMP logos and stuff are to try and force webmasters to take speed seriously. For years people have said it's important, yet big sites like eBay and Amazon, who certainly have the financial means and motivation, still take 10x longer to load than what would maximize both revenue and user happiness.

> They factor all that stuff into a user happiness metric which impacts rankings

And speed is way down on the list. Try searching for lyrics to any popular song and tell me it isn't so.

All they really need to do is add more weight to it. But they don't want to because that breaks the lock-in that they are striving for.

Google owns a browser. They should be able to measure average load times for every page on this planet from almost any location.

(Of course, user permission must be granted, but shouldn't that be feasible?)

They do... for over a million websites! https://developers.google.com/web/updates/2017/12/crux

This data is a much better indicator of how fast a page is vs if it is AMP or not.

I wouldn't be surprised if they already have an opt-out feature which does just that.

The one thing AMP does right is have a hard and fast set of rules that pages must follow. Page performance for regular web sites dies a death from a thousand cuts. Was it the web developer who added that last analytics tag that made it slow? Or was it the 12 other third-party tags that already existed? Unless you have a single self-contained team who controls the entire stack, do a ground-up rewrite, or have a dedicated performance team, it's hard to hold the line against a thousand little things that slow the web down.

People are already cheating with AMP by putting only an introduction on the AMP page and then linking to the heavy main content at the end. Cheating is something that Google has to deal with all the time.

Starting February 1st they'll be blocking AMP doorway pages.


So making my own page display what I want is cheating now?

No, serving differing content based on the source, making users double click to view it all, while trying to reap the “SEO” benefits of AMP is.

GoogleBot runs only in a few datacenters in certain locations in the world with amazing connectivity.

It is a horrible predictor of load times.

This is probably the easiest problem to solve. Hosting a GoogleSpeedTestBot in every city with the sole purpose of speed testing a page would have laughably negligible impact on Google's resources.

You're proposing hosting thousands and thousands of servers across every city in the world where each server would be accessing all the top 1M sites in the world every day and measuring the speed of each page in that site?


Not every day; at the time of publishing, and maybe few days later. (This could be improved even further: you don't need to access every article. If you could identify a page that's representative of a real article, that could suffice. Maybe have periodic random checks to see if this is still representative.)

Not every city; the top 500 cities would probably cover 90+% of the users.

You're not understanding the scale of Google. They're incredibly good at exactly this stuff.

Oh, I am, but maybe you're not understanding the scale of the internet. Connections, cables, CDNs change on a daily and weekly basis.

If a cable is cut between Jakarta and Singapore, it can take months to fix, as Indonesia only allows Indonesian-flagged cable layers to operate in its waters.

Individual datacenters, CDNs, cables are going offline constantly, so you'd need to be continuously testing every site to rank it correctly. Not only that, each page in a site might have a very different structure, so you can't just rank a site, you need to rank individual pages and articles too, which means testing every new page and article from every city.

I guess you're not getting the scale of that...

AMP standardizes the CDN serving part, largely eliminating that issue. AMP articles can't prevent 3rd party CDNs from caching them, which means that they can be served from the closest CDN after the first time they are used.

Or just use your established botnet, a.k.a. Chrome and Android.

If you have the number of roundtrips and size of documents you can easily calculate page load speed dependent on latency, packet loss rate and connection speed. That allows you to simulate nearly any connection in the world.

Which is the reason why Google focuses so much on the number of roundtrips. For most, downloading 50kbyte doesn't cause a lag but 10 roundtrips before the site can be rendered do.
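A crude sketch of that calculation (the model and the connection figures are illustrative assumptions, and it ignores TCP slow start, packet loss, and parallel connections):

```python
def estimated_load_ms(roundtrips, total_kb, rtt_ms, bandwidth_kbps):
    """Rough page-load model: serialized round trips plus transfer time."""
    transfer_ms = total_kb * 8 / bandwidth_kbps * 1000  # KB -> kilobits -> ms
    return roundtrips * rtt_ms + transfer_ms

# 50KB on a 5Mbps link with 100ms RTT: the transfer itself is only 80ms...
one_trip = estimated_load_ms(roundtrips=1, total_kb=50, rtt_ms=100, bandwidth_kbps=5000)
# ...but 10 serialized round trips before render dominate the total.
ten_trips = estimated_load_ms(roundtrips=10, total_kb=50, rtt_ms=100, bandwidth_kbps=5000)
```

Plugging in different RTT, loss, and bandwidth numbers is how you simulate different connections from a single vantage point.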

Counterpoint: You can already simulate the way bad Internet connections work. You don't want that introducing noise on top of the site's metrics. Measure the site itself, then apply a variety of scenarios for what practical fetches look like.

Chrome's own dev tools can emulate slow connections for you. If their browser can do that, why not their crawler?

Because the user isn't sitting at a datacenter.

A user in Indonesia isn't at the closest datacenter in Singapore. It doesn't matter what you do; you can't, from Singapore, emulate what a user on a particular ISP in a particular city in Indonesia will face.

In case it isn't clear:

Consider 2 websites, both have hosting in Singapore, but one also uses a CDN in Jakarta.

The Google crawler running on the Singapore datacenter connects to both websites, speeds are great because they are sitting just next to the datacenter. Both get great ratings.

A user is in Jakarta and tries to connect to both websites. One of them takes a century to load, because it has to load everything from Singapore, and international cables from Singapore to Jakarta are expensive, so connections are capped. The experience sucks.

The second website uses a CDN in Jakarta, located at the local coloc facility to which the user's ISP is also connected to. The website loads instantly.

CDNs today are located in every single small ISP, in all countries, in all medium-sized cities in the world.

Beyond the fact that the network can be mapped (Google surely does this) and simulated, I feel the more important observation is that this doesn't matter. A good estimate of bloatiness should not require knowledge of network topology.

CDNs are a hack that can make websites load a bit faster, but that's about it. It won't make them use less RAM, or consume less of your data cap. Ultimately, performance is about the size of the site, the resources it uses on the client, and the number of requests it makes. All of which is independent of the number and location of CDNs.

The argument about bloatiness reads like a privileged position, IMO. Everyone is interested in getting customers with shitty 2G connections and tiny data caps, because there are literally billions of these people, and nobody wants to bet that they're never going to have money to spend on your services.

If you pretend that all your users have net connections like the US, Europe, or parts of east Asia then you're going to end up as a small player in a balkanized internet. That's why all these companies are making aggressive investments and acquisitions in other parts of the world.

CDNs are also part of the fundamental reality of the internet. Could you imagine Netflix or YouTube without CDNs? The amount of network capacity investment necessary to run the internet as we know it without CDNs would cripple it.

And it's not just the internet, but it's a fundamental fact of computing that you have hierarchical caching layers with some data far away and some data closer.

No CDN is changing the fact that your website with 2MB worth of content and forms weighs 20MB and loads another 10MB of trackers.

It's not about privilege or capturing emerging market. It's about laziness and waste, pure and simple. The same bloat that prevents someone from accessing your site on a smartphone in the middle of Africa is also significantly slowing down the computer of someone in western Europe, and eats into battery life too.

And let's not bring streaming video into this as it's completely off-topic.

I've heard this argument before. Think about the people with shitty 2G connections and equally bad backhaul or backbone networks. The fact is these people are not necessarily as interested in your 2MB pages as you'd like. They have their own web pages and apps that they use. They do care about CDNs because some of these networks are just plain bad, and it's not the first hop.

And that's the privileged position. Bloatiness is annoying if you're an American reading American pages, or a Spaniard reading Spanish pages, and we like to complain about it because it means the difference between being able to read an article on the subway and having to wait for the next stop. But I'm never in a position where I have to complain about the backhaul network or the backbone.

In the future, please don't try to police "off-topic" conversation.

The original point I made -- that Chrome's dev tools can simulate a slow network connection -- got nit-picked by people saying it can't figure out which CDN serves which ad bundle. This is not a useful, relevant or even coherent reply.

Public web sites that only work for "first world internet" customers shouldn't be a thing. Sites that load 100MB of ads and trackers and fifty different JS frameworks to display a few paragraphs of text should be getting the search-engine death penalty until they clean up their act. They should be thinking about people who aren't on fast broadband connections. If they aren't, the search engines should be.

YouTube learned this from the "paradoxical" result of a lighter-weight page increasing load times:


(spoiler: it was because the lighter-weight page meant it was suddenly usable by people who had slow connections who couldn't use it previously, and those people began to use it, which skewed the average load time higher)

> The original point I made -- that Chrome's dev tools can simulate a slow network connection -- got nit-picked by people saying it can't figure out which CDN serves which ad bundle. This is not a useful, relevant or even coherent reply.

Let's say your website is hosted in Jakarta, and serves Indonesian customers. You make it lightweight and fast.

Your competitor hosts its website from Singapore, and is much heavier.

Now Google, from its datacenter in Singapore, wants to test the load times for your website and your competitor's. Sadly, the host you chose doesn't pay for a lot of international traffic on the cable to Singapore, since it primarily serves the Indonesian market. Google tests both sites, and your competitor's, based in Singapore about a mile away from the Google datacenter, loads several times faster, besides being heavier.

Google ranks your competitor higher, and you are pissed off, as your customers load your site faster.

Now another case: you and your competitor host your sites in Singapore. You build a lightweight site, and your competitor's is full of heavy assets, with oversized images, etc. Google tests both of your websites and yours loads faster, and is ranked higher. You're happy.

But your competitor signs up for Akamai and replicates the heavy assets in local servers in all the major ISPs in Indonesia.

Customers in Indonesia search and see both your and your competitor's results. You rank higher and get more clicks, but as your traffic is subject to the congestion on the international cables, your site actually loads far slower than your lower-ranked and heavier competitor. Customers are unhappy.

In all of this, simulating a slow connection on Chrome's dev tools is irrelevant, because you don't know where the bottleneck is!

Again you are missing the point. And at this point it's hard not to believe you're missing the point deliberately.

> A good estimate of bloatiness should not require knowledge of network topology.

It does, because the definition of bloatness depends on the network speed.

What counts as bloat for a user sitting in Singapore isn't the same as for a user sitting in Jakarta.

> Beyond the fact that the network can be mapped (Google surely does this) and simulated

Network topology is irrelevant in this case. The ISP in Jakarta might have a satellite connection to Singapore, but have a CDN and it will load faster than in another ISP with a 10G international fiber link to Singapore.

> All of which is independent of the number and location of CDNs.

That's completely wrong. Today, the vast majority of internet traffic is served from CDNs, CDNs (and their placements) are critical for the internet as we know it.

> It does, because the definition of bloatness depends on the network speed.

No, it doesn't. It depends on the proportion of actual content to the unnecessary cruft. The difference is that what a Singaporean considers bloat might prevent someone in Jakarta from accessing the site at all.

> That's completely wrong. Today, the vast majority of internet traffic is served from CDNs, CDNs (and their placements) are critical for the internet as we know it.

That is if you include streaming video, which is off-topic for this discussion.

Imagine a perfect world with the ultimate CDN solution - something like IPFS, which ensures everyone is a CDN to everyone else (instead of wasting bandwidth with multiple downloads of the same content). The concept of bloat wouldn't disappear in such a world, because bloated websites will still require you to download unnecessary amount of data to your machine, and will reduce your computer's ability to multiprocess while also using up your battery life.

I stand by what I said before - CDNs are a hack (basically a hand-rolled, rudimentary distributed system) to give you some extra download speed. They don't solve the problems of download size and resource usage.

None of this is relevant, though.

Simply simulating a globally slowed-down connection will tell you what's going to be a problem. And, really, common sense will tell you what's going to be a problem, too -- if a site "needs" to load 100MB of ads and trackers and other JavaScript to display a couple kilobytes of actual text content, then you have a problem. A user on a slower connection is going to see that site take forever to load even if there's a CDN across the street.

The poster was arguing that negative network conditions can be simulated.

If the problem is a lack of information about global network conditions, then you can deploy canary machines around the world to gather that information.

They can be simulated, but not predicted.

> If the problem is a lack of information about global network conditions, then you can deploy canary machines around the world to gather that information.

You'd need to put machines in every single city and every single ISP in the world continuously testing every website in the world.

Doesn't make any sense at all. That's not how networks work.

OP was talking about a "neutral" speed index. You're talking about solving for every possible combination of user, ISP, and website. There's probably a reasonable middle ground in there somewhere.

But that's the point, it doesn't depend on a "neutral" speed index, it varies from website to website, ISP to ISP.

If you're in Jakarta trying to access a certain website, if that website is available at the local CDN might make a bigger difference than which ISP or even which plan you're using.

You could have a 1Gbps fiber connection from your ISP in Jakarta. If you're trying to access a website in Singapore and your ISP's international link is congested, that 1Gbps will be useless.

OTOH, if you have a 10Mbps connection, and you're trying to access a website replicated at an Akamai CDN in your ISP's network, it will be way faster.

If we accept your argument that Google can't measure the speed of a webpage then wouldn't it also follow that Google can't know if the AMP version of a page is actually faster than the non-AMP version?

> If we accept your argument that Google can't measure the speed of a webpage

Correction: Google can't predict the speed of a random webpage to a random user around the world.

But you can reasonably predict that pages hosted at widely available CDNs around the world will load faster on average than pages that aren't.

Google _can_ predict the speed of any webpage to any user. Google Analytics data is enough. Your whole thread is predicated on Google having that one product (crawler). They have everything.

It's unfeasible, but also completely unnecessary. CDNs can give you some speedup, but they don't magically make a site smaller and less resource-intensive to use.

If a site is running Google Analytics, couldn't that information feed back into page loading speeds?

You basically have the same problem then, of forcing sites to use google products in order to be ranked highly.

Measuring page speed accurately seems like a way smaller engineering problem than attempting to convert the whole internet to AMP.

Here's a potential solution for Google: just log load times for people who use Chrome and are logged in to their Google account at the browser level.

All the hard work of eliminating abuse is something they've done already for those accounts. And their userbase is so big, and presence so strong, that it would generate pressure everyone would benefit from.

Erg, the idea of Google phoning home information about each page I visit at the browser level is a bit terrifying...

It does happen already - after all, you have shared browsing history between logged in devices using Chrome.

My idea doesn't require anything new except them measuring load times, which I'm guessing they're doing already for improving Chrome, and then using this data to influence search ranking. Everything else is already set up and working, and you opt into it if you're using Chrome and are logged in browser-wide.
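
For what it's worth, browsers already expose per-page load timings to the page itself via the Navigation Timing API, so collecting this number wouldn't require anything Chrome-specific. A minimal sketch (the reporting endpoint in the comment is hypothetical):

```javascript
// Total page load time from a PerformanceNavigationTiming-style entry:
// time from navigation start until the load event finished.
function loadTimeMs(entry) {
  return entry.loadEventEnd - entry.startTime;
}

// In a browser (after the load event), the real entry is available via
// the Navigation Timing API:
if (typeof performance !== "undefined" && performance.getEntriesByType) {
  const [nav] = performance.getEntriesByType("navigation");
  if (nav) {
    console.log("page load took", loadTimeMs(nav), "ms");
    // Reporting endpoint below is hypothetical:
    // navigator.sendBeacon("/speed-report", String(loadTimeMs(nav)));
  }
}
```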

How confident are you that that doesn't happen already?

It certainly does happen already if you are opted into it.

It certainly does happen already without having to opt-in for the many sites that collaborate with Google, reporting your interactions with their site to GA.

I was trying to avoid making a controversial statement. I didn't mean to imply that you were explicitly opting in, just that you are opted in.

It already happens with every site using Google Analytics, which is nearly every site you visit...

Unless you are running add-ons that block Google Analytics, which I would assume everyone here would already be doing.

Last I knew, you couldn't disable Chrome's async DNS feature, which will bypass your system's configured DNS servers and use Google's own. So, while certainly not the same level of tracking that Google Analytics or even browser history can perform, many (all?) domains you visit are still sent back to Google in one form or another:


Holy hell. So this is the dumb DNS crap Chrome is doing. I had occasional issues with Chrome suddenly not being able to load anything due to what looked like obvious DNS issues, except they didn't affect anything else, and there was no way for me to do anything except wait it out. There were moments I thought I was imagining things.

Yeap! And if you're on a private network you can leak internal domain names. Or, if you have some hosts that are available publicly but resolve differently internally, you'll get the wrong IP. I think the only real solution is to kill Google's DNS IPs in your routing table.

Well, I don't use Chrome. Do you know if this affects Firefox as well?

It shouldn't. I must've just gotten myself mixed up. I thought this thread was about Chrome phoning home. Sorry for propagating confusion.

Everyone here constitutes maybe 1% of total traffic on the internet; still plenty of data points to be had.

UMatrix is a good way to get an easy visual of what's getting sent and loaded and from where.

If it was every page, that would be terrifying. But if it only measured the speed of pages directly linked from a Google search result, I would be fine with that. Google wouldn't be learning anything new there.

> or taking advantage of flaws in the benchmark?

Then you improve your benchmark.

That's a cat and mouse game that search engines and SEO companies have been playing for years. "Improve your benchmark" only works when you don't have adversaries.

Edit: To clarify, "improve your benchmark" is not a viable strategy by itself any more than "put your opponent in checkmate" is a viable strategy. As advice it is unhelpful. You generally have to mitigate flaws in the way you rank pages, keep large parts of the algorithm secret, and attack the economic viability of your adversary's strategies. Rather than going for checkmate, you want your opponent to resign so you can do something else instead.

There have been incentives for bad behavior on the web for so long that I don't think we have any hope of coming up with any kind of automated benchmark to tell us which pages have good user experience.

> "Improve your benchmark" only works when you don't have adversaries.

If this were true, then there would exist no search engines consistently capable of finding content highly relevant to the user's query -- as opposed to just returning promoted content, which according to your premise is always winning the adversarial battle.

Sure, there is an adversarial element, which means there is a constant 'battle' for 'fairness' among results (or whatever your goals are as a search engine). But in reality, search engines as a whole seem to work rather well at finding what we want. Of course results are biased from adversarial pressures (and that's what we're discussing on this thread), but it does seem possible for "benchmarks" to work well in an adversarial environment.

Search engines work by not just improving the benchmark but also making the benchmark a trade secret. That's what I was saying. And if it's a trade secret I would say that the word "benchmark" is a misnomer. Broadly speaking, a benchmark is a standardized way of measuring something, and if only one company can perform the measurement then it's not really a standard.

> If this were true, then there would exist no search engines consistently capable of finding content highly relevant to the user's query -- as opposed to just returning promoted content, which according to your premise is always winning the adversarial battle.

I can’t tell if you agree that Google doesn’t return highly relevant content or if you just get better results than I do. Almost all my search results are dominated by low-relevance, high-click-through brands, regardless of quality. For example: Quora often replaces high quality content with low quality paywalled answers researched from Wikipedia or other public resources. 10 years ago you would have gotten MetaFilter, which is (imho) objectively a higher quality source less hostile to web users.

> I can’t tell if you agree that google doesn’t return highly relevant content or if you just get better results than I do.

The degree to which search engines are gamed successfully by adversaries is not something I wish to present a strong opinion on, which may explain your confusion about my position.

Instead, I just wanted to illustrate the logical connection between an adversarial-defensive ability to rank based on [website loading speed], and the ability to rank sites based on [any other factors]. If you can pull off adversarial defense of the latter, you probably can achieve the former with even less effort.

That said, I do present to you tentatively the claim that most popular search engines do a decently good job at fulfilling their goals, even if those goals are not in alignment with users' goals in many cases.

For example, there are certainly many examples of paid ads given preferential treatment in search results; however I do not believe this constitutes a case of an adversary beating the search engine's metrics. Rather, it is a consensual relationship/contract between the search engine and an advertiser being fulfilled. Whether this contract/relationship is in the best interest of consumers, is another matter altogether, of course.

It's funny how no one was up in arms when facebook (the leading source of publisher traffic) did their version of accelerated pages.

AMP serves a purpose for the end user and it does so well, it loads instantly and doesn't consume much data in the process.

As for their "demands":

1. Google already states that AMP pages are ranked higher because they're faster to load.

2. I'm not sure if it's related but they they addressed that only yesterday: https://amphtml.wordpress.com/2018/01/09/improving-urls-for-...

So are you saying -- all else equal -- that if your webpage can respond just as quickly as an AMP one, that your search rank won't be docked?

E.g., your site is just as performant as AMP, but you're not using AMP.

Yup, that was in the blog post announcing AMP, and in subsequent press comments.

I'm of the persuasion that they can rank and display the results however they please, it's their site after all, so it's a non issue either way.

I don't care at all about AMP really, but this:

> I'm of the persuasion that they can rank and display the results however they please, it's their site after all, so it's a non issue either way.

I don't get why anybody says this. Of course they are in control. Nobody can force them to do it differently (maybe the government but whatever).

Most people aren't saying that Google has no right to do what they're doing. People are saying that Google _should_ be doing it differently.

> I'm of the persuasion that they can rank and display the results however they please, it's their site after all, so it's a non issue either way.

Of course they can legally do it, and we're not judges debating that.

There's a difference between what they are allowed to do legally, and what they can do that keep me coming back as a user. This is legal, but it makes me use DuckDuckGo instead.

I wonder how Google manages the network topology for testing this so that the fact that AMP is served from a Google-local cache does not give it a speed advantage to Google's speed-testing bot beyond any it may have in typical, outside of Google, use.

Google prefetches the AMP content.

Your site will never be faster than AMP because google can’t prefetch content from your site.

They can't? Or they won't?

They could've made an open standard that not only websites but other search engines could implement as well, but then again why would they do that? They'd lose a competitive advantage. I think AMP is very deliberately locked in to google.

Any site can implement an AMP reader. E.g. Cloudflare has set up an AMP cache third parties can use. I believe Twitter makes use of this.

This new open standard they are pushing will eliminate the need to even have an amp cache for a site to consume amp, win win.

Good luck trying to get to AMP performance without using AMP. That will likely be more work and in the end you come up with AMP in all but name.

> It's funny how no one was up in arms when facebook (the leading source of publisher traffic) did their version of accelerated pages.

Hmm, I guess it depends where you look. I saw quite a LOT of people upset by it. Many participated because they felt like they had to in order to get views.

> 1. Google already states that AMP pages are ranked higher because they're faster to load.

Cool, so how do I get my plain text page which loads faster than the motherfuckingwebsite.com into the AMP carousel?

Look at the source of motherfuckingwebsite.com. You'll find that it loads a certain analytics script. Motherfucker indeed.

And it's well aware, right before loading the script in the HTML is:

    <!-- yes, I know...wanna fight about it? -->
The script is also loaded async, so it's not that bad.

Exactly. Which is why I have sites that load even faster than motherfuckingwebsite.com, yet they don't get any of the search ranking benefits.

Who cares about facebook? This is the web! AMP threatens to force even small blogs to convert from open web to walled garden where Google decides what will be published and what will get blocked.

Why convert? I find it pretty easy to keep an AMP and a non-AMP version in the same codebase - just a few switches controlling what's activated where. And many of the restrictions (images with set sizes, no style tags, no use of !important) make a lot of sense for any page. Going once through Bootstrap's CSS and deleting everything you don't need also helps improve the user experience.

If anything, supporting AMP has accelerated my non-AMP pages a lot.
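
For reference, the two versions are tied together with link tags - AMP's documented discovery mechanism - so search engines can find each from the other (URLs below are placeholders):

```html
<!-- On the regular page, advertise the AMP version: -->
<link rel="amphtml" href="https://example.com/article.amp.html">

<!-- On the AMP page, point back to the canonical version: -->
<link rel="canonical" href="https://example.com/article.html">
```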

I would normally never do this, but over the past couple weeks I've been drafting a blog post proposing an open AMP alternative called Particle.

You can read it here: https://andrewrabon.com/particle-a-proposal-for-tinier-html-... [Draft]

Would anyone here be interested in seeing the blog post completed and/or helping me build it out? If so drop me a line at andrewrabon at gmail. :)

I also want feedback on if my ideas are actually solving a problem, and are doing so in the correct way. I only recently joined a news publisher so I'm not 100% cognizant of the issues at play.

Sounds like extra work that websites would need to do in order to adopt it. If there were a way to automatically remove the JavaScript and other HTML extensions and then automatically serve the site via Particle, that would facilitate adoption.

Hm, maybe have a <link rel> to the Particle version and switch over to that with a browser extension?

The real issue is incentives; that's what you need to work on.

Why not leave the css out for extra performance?

> Instead of granting premium placement in search results only to AMP, provide the same perks to all pages that meet an objective, neutral performance criterion such as Speed Index. Publishers can then use any technical solution of their choice.

Hasn't Google gone on record saying that AMP doesn't affect search results given the same page load speed for non-AMP sites?

Except that AMP pages are often in the band of promoted results that show above the remainder of the search results, at least in the mobile app. And are prominently icon'ed in the desktop results.

So yeah, Google's not honest about this.

From my test the AMP icon in Google's search result is gone (Chrome, Android, Incognito tab).

Disclaimer: I work for Google but not on AMP product.

Search for Trump on iOS and you'll see an AMP icon.

Is the issue the icon or the carousel being at the top? I just checked and the carousel had both AMP and non-AMP content for me.

If you can make your page load as fast as an AMP page you'll probably get ranked there as well. But that's really not easy to achieve. The reason Google can say they don't prefer AMP is because AMP is extremely fast (esp with high latency)


The First Amendment gives them the legal right to curate results as they wish without interference by the U.S. government. Nowhere in the Constitution does it say that other people aren't allowed to get pissed off.

I don't think the letter authors were calling for an Act of Congress. My read is that they were calling attention to the damage that AMP does to the content providers. And perhaps they could have put more emphasis on the fact that it's in Google's long-term interest not to bleed content providers dry.

They seem to suggest that Google's control over their own site's design/function is somehow an affront to their cause, not to mention malicious.

They certainly see it as a big enough grievance to include in a manifesto-type document, which is repugnantly entitled.

Free speech cuts both ways.

You seem to suggest that OP's control over their own site content (ie the manifesto) is somehow an affront to your belief in Google's free speech.

You certainly see it as a big enough grievance to write a comment on HN which is repugnantly entitled.

They get to pass comment on Google's behaviour (a comment which, incidentally, a lot of people seem to agree with). You get to pass comment on their behaviour. I get to pass comment on your behaviour.

Holding people and companies to account in a soft manner, without resorting to force (legislation counts as force) is an important aspect of a functioning free society. The freedom of speech is not freedom from criticism but freedom to criticise.

God, I hate this argument with a passion. The First Amendment was a means to an end, which was protecting free expression from the undue influence of the day's most powerful institutions. While it's technically true that all kinds of censorship and other shady shit from powerful organizations comply with it, I consider that an oversight and a bug. An organization as powerful as Google needs more constraints put on it, whether through antitrust or some other mechanism.

Speech is protected.

Anti-competitive business practices are not.

They are certainly entangled in this instance.

It's funny how everyone's a legal scholar these days.

It's true that AMP doesn't affect organic search rank. What's misleading about that statement, though, is that Google displays a carousel specifically for AMP results.

They'll probably do away with the carousel eventually, but for right now an AMP page does have an advantage in search.

The carousel displays both AMP and non-AMP content.

Does it? Then I may be misinformed (I very rarely use mobile).

If that's true, and AMP doesn't have any preference in the carousel, then Google's comments are completely accurate.

I personally don't believe it.

Does anyone actually believe a fancy AMP powered site isn't getting nice traffic boost from Google?

> Does anyone actually believe a fancy AMP powered site isn't getting nice traffic boost from Google?

If they were to get a nice traffic boost, would it be because the site actually loads faster, or because it was ranked higher? I think both would contribute to the boost.

But the question is: do the speed changes made (only regarding adding AMP) affect the PageRank?

You’d have to be pretty daft to take their word for it. Google hasn’t played fair in the search space in over a decade.

Well, they may not be risking it, as the EU would happily give em another monstrous fine for that.

Doesn't https://amphtml.wordpress.com/2018/01/09/improving-urls-for-... address one of their two demands?

To me the proposal there is arguably even scarier than what AMP is doing now.

It requires changing browsers, to let people distribute Google Hosted Apps on their own domains (as long as Google approves). Sites distributed in this way will get preferential treatment by Google.

This is essentially an App store model, similar to Google Play, and the really big danger is that we end up in a place where you either distribute (and host) your website through Google's AMP registry, or you're not actually on the web (in the same way that distributing your Android app as an APK outside of the Play store is not a realistic distribution channel).

I hope Firefox, Safari and Edge will resist this.

> you either distribute (and host) your website through Google's AMP registry

As long as anyone can set up their own AMP-like cache and build a search engine similar to Google that also supports preloading... then I'm not sure it's so bad.

The proposal GP refers to has a lot of use-cases, it basically makes caching of HTTPS by third-parties possible. At-least to some extent.

> As long as anyone can set up their own AMP-like cache and build a search engine similar to Google that also supports preloading... then I'm not sure it's so bad.

Yes, anyone could build a cache. But can anyone build a cache that can compete with Google's absolutely massive infrastructure and pile of money, not to mention the reach they already have with search?

I'd say this is not a discussion on the level of ‘can you build a better product and beat a giant company?’, more like ‘can you build a planet and get everyone to move there?’

Cloudflare can (and does). So do Apple, Facebook, Microsoft and a few others.

Yes, the web is dominated by big companies more than ever before but there's at least some competition left.

I agree, doing it is a different thing.

Cloudflare is already doing an AMP cache. If the spec works out I'm sure other big players might join.

But yes, taking on Google search is hard.

> It requires changing browsers, to let people distribute Google Hosted Apps on their own domains

Does it? The web packaging spec linked to doesn't seem to require any of that. And it's intended as an open standard for all browsers to implement, so you wouldn't need to switch.

I mean, it just looks like content addressing with more steps. Really rough around the edges, but the idea that large static portions of the web could be cached by anyone sounds great.

No. In fact, the solution detailed in that article accomplishes the exact inverse of one of the letter's demands.

While the letter wants third-party content not to be surfaced in a wrapping viewport that downplays the fact that it's actually Google's AMP Newsreader, the recent AMP announcement details a planned change where emerging tech [1] will be used to make the URL appear as if the content was directly loaded from the distant origin, because the content being served has been digitally signed by that origin and its serving has been delegated to Google akin to how a run-of-the-mill CDN is delegated the authority to serve content for a domain that's 'spiritually' owned by someone else.

However, the recent AMP announcement does address a very frequent complaint about AMP. Just one that's at odds with the one the letter is requesting.

[1] https://github.com/WICG/webpackage

If the content has to be signed by its true origin, does it matter who's technically serving the content?

There could be privacy considerations I suppose, but the standard addresses that already: https://wicg.github.io/webpackage/draft-yasskin-http-origin-...

That is a great article because it sheds light on actual technical reasons for the current behavior. The original article is pretty light on technical details of how to achieve their suggested goals, which is weird because it assumes the user knows/cares about URLs, and that is probably only true of those with more tech savvy.

In theory it does, but there are a lot of open questions there (including whether Apple will implement it). I'd say the search engine placement is the bigger of the two issues anyway, as it clearly shows Google putting their thumb on the scale.

1. H2 2018

2. Relies on a "web standard" that Google is pushing and is implementing in Chrome (and so far, I think, only confirmed to be in Chrome)

We're expecting to follow roughly the same path that SPDY (eventually HTTP/2) took: iterate on some early versions in Chrome to get experience while the format goes through the standardization process, and eventually remove those in favor of the standardized version. You can follow the IETF part of the process at https://tools.ietf.org/html/draft-yasskin-http-origin-signed....

The letter is undated so it could be old and I'm sure they'd prefer if URLs get fixed immediately rather than in a few years.

Update: So after Google announces that they're fixing AMP, some people decided to write a letter demanding that Google fix AMP? I guess the next step will be to retroactively take credit for Google's changes.

The letter seemingly dates to 2018-01-09 [1], with the earliest non-content commit, which doesn't include the letter's body, on 2017-11-10, the same day as the domain's registration. I agree it's poor form to not include an origination date in the document itself.

[1] https://github.com/amp-letter/amp-letter.github.io/commits/m...

It's on GitHub also and looks to be from today: https://github.com/amp-letter/amp-letter.github.io

No, it's recent. (last update 2 minutes ago)


Personally, speaking as a search user, all I want from Google regards AMP is a setting that disables it in search results.

Google's results ordering is already sufficiently "optimized" away from my needs that as often as not I end up in 'verbatim' mode or on page 3 of the results before I get near what I'm looking for, so an extra layer of SEO-gameable "user convenience" isn't likely to make my google experience that much worse.

Since the presence of AMP results is based on device detection, it seems a search setting removing them would be a simple matter. Google's unwillingness to provide it speaks volumes about their actual intent, IMO.

This is exactly my position on this also and the reason that I've moved to DuckDuckGo on my mobile in place of Google even though I much prefer Google as a search engine.

Thanks to Spivak, who mentions in another comment here [1] that "https://encrypted.google.com doesn't serve AMP results" my phone now has a new default search!

1: https://news.ycombinator.com/item?id=16108896

Google AMP is a blessing on mobile; I don't give a damn about site owners.

> I don't give a damn about site owners.

But aren't they the ones creating the content you are enjoying? Seems strange to care so little about them, as without them, there would be no content to enjoy.

There are other publishers. Authors, as distinct from publishers, will find other outlets. Outlets that make my life as a user less miserable.

And the net will be a better place.

Thing is, those awful, intrusive ads pay more money than AMP pages do. So I wouldn't assume that other publishers will appear, or that authors will be able to make enough to continue writing.

How many publications do you subscribe to?

> Thing is, those awful, intrusive ads pay more money than AMP pages do. So I wouldn't assume that other publishers will appear, or that authors will be able to make enough to continue writing.

If my only two options are awful intrusive ads or no content, then I'd choose no content.

Me? Several, for which I pay yearly.

I hate it on mobile, it makes Google unusable to me. I had to switch to DDG just because of AMP.

Google, please allow me to opt-out of AMP.

> I had to switch to DDG just because of AMP.

Funny because I did the same and stayed with DDG even though Google has slightly better results.

AMP pages have broken controls and I frequently encountered completely broken AMP pages (no content visible). Was this the "fast loading page" experience that AMP promises?

Please explain why you hate it?

Not OP, but the primary issues I'm not a fan of (on iOS) are:

1) Performance - Yes, it LOADS fast, but MANY pages "feel" janky when scrolling, especially pages that let you load additional articles by swiping left/right. Only the simplest pages don't "jank" on my device.

2) Scrolling - It isn't the native scroll. It seems to be a div with "overflow-scrolling" applied. Other than that feeling "off", it prevents the bottom toolbar from hiding, reducing screen real estate.

3) Sharing URLs.

4) Biggest: No way to opt-out. I'm a fan of the open web. It's why I prefer a website to a sandboxed experience you get with native apps. At least for the first click on Google SERP pages, I'm "stuck" with google.com.

One note: I get the AMP use case, and I'm sure there are users who prefer AMP pages for the performance they see. However, I'm fortunate enough to live in an area with decent mobile speed/bandwidth, and loading the original pages are rarely "slow" on my device. However, the originals all address the issues I noted above. I'll take the .5-1 second delay for the original(s), over the AMP version.

You can try going to https://encrypted.google.com, it seems to not show amp-crippled content.

Because I want the full web experience, not some neutered version.

I’ve been in mobile development for about a decade, what Google did was re-invent WAP on top of HTML5, turning back the clock at least 10 years.

Google has become an OSP. All they need to do is start using their monopoly on ccTLDs to create dark web google services that you can only get to via Google internet connections. If they switched their core services over to that, they could require you use a Google tunnel to use their services.

By also sending all non-Google traffic through the tunnel ("for improved speed!" remember when AOL included SpeedBooster Technology?) they'd control a majority of all non-search internet traffic, like AOL without the disks. And they wouldn't even have to bill you because they'd get an unlimited supply of ad data.

It actually breaks pages. Not only their functionality but the browser will actually close the page or go back to the search results with amp, and sometimes clicks don't work. It also loads "weird". I almost always have to furiously click the 'link' icon on top to load the real page, because the page was designed to load under real-world conditions, not through a web proxy.

Aside: from a privacy perspective it creeped me the hell out once I realized what was happening.

Most of the time I don't care one way or another. Occasionally I search Google for Reddit posts on my phone, and it's mildly infuriating to get a bunch of AMP results, because I strongly prefer the desktop UI that I've chosen in my preferences over the default mobile UI that AMP shows.

1) Scrolling is broken.

2) Touching tap to return to the top of a webpage is broken.

3) Searching for a word within a webpage is broken.

This is all basic functionality that I use often. And I'm sure there are other things broken about it.

yeah, as a user AMP is awesome.

Dear website owners who wrote this letter: yes, AMP has its downsides, but the reason we have AMP is that the internet as a whole failed to make fast sites. You had your chance for the previous ~30 years, and it's clear you couldn't be bothered to make speed a priority. I'm glad you're angry that AMP is eating your lunch, because maybe you'll actually start trying to compete now. But you aren't getting my sympathy.

The whole point is, Google has shut down the competition. Once Google has everyone on "the platform" they get to mandate new rules. There may as well not BE any more web frameworks. The idea that we "did this to ourselves, and are now getting our just desserts" isn't going to fare well for anyone - not even the user. Google is now the gatekeeper, and independent attempts to chip away at that foundation are almost a waste of time.

Publishers had no (and still have no) motivation to make regular pages fast, because Google won't reward them for their effort with higher traffic. Imagine what headlines like "Google to rank popular content primarily by speed" would do to the web.

+1 million. It's especially noticeable when you have poor service. AMP pages load fine, whereas just about everything else can't load.

I don't like AMP as a user. It breaks sites that I want to open in an app or in the browser with my logged-in session. I also don't like Google hosting the content and giving preference to a specific, consumption-only user experience.

I'd probably have a different opinion if I didn't use Adaway, which blocks ad servers by domain in /etc/hosts. When I've tried going without an adblocker, the web was not such a nice place.
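For context, hosts-based blockers like Adaway work by pointing ad-server hostnames at a local address so requests to them fail fast. A sketch of what the entries look like (the domains here are illustrative, not Adaway's actual blocklist):

```
# /etc/hosts — requests to these hosts never leave the device
127.0.0.1  ads.example.com
127.0.0.1  tracker.example.net
0.0.0.0    metrics.example.org   # 0.0.0.0 avoids even a local connection attempt
```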

Native apps and disabled JavaScript are a blessing on mobile.

Thousand times this.

Half of the reason we're in this mess is on-line advertising. The other half is that lots of companies think it's a great idea for everything to require constant data exchange with the cloud.

As a publisher, thanks for this blunt answer that cuts through to the crux of the issue. There are supposedly big publishers that have shut down in my country without a single obituary or fond remembrance from users. Sometimes I wonder, in this saturated online media world, how many publishers would actually be missed if we all disappeared from the earth. Probably a single-digit percentage.

Except when you want the original link.

Every Google AMP link that I've clicked on has a button in the top right corner which displays the original link. It's very easy to copy from, and isn't hidden at all.

Oh good! They fixed a problem that didn't exist before by adding additional steps, all for their benefit.

And now they’re trying to get a new standard feature to let them lie about the URL to make it even easier!

You know what worked well? 20+ years of the URL bar showing me the site I was trying to read without big companies messing with it.

Also, that was added later. Originally it wasn’t there at all.

> all for their benefit

And the benefit of everybody with a mobile device that just wants their article to load quickly.

> And now they’re trying to get a new standard feature to let them lie about the URL to make it even easier

These are called "engineering tradeoffs". It's not perfect, but it's better than what we had before.

This is not an engineering trade-off. All mobile users are forced to use AMP.

Google could allow opt out of AMP for all those who don't want it in their search results but they don't because they are trying to wall off and control a section of the web.

But they're the reason you're blessing amp on mobile.

No one seems to be up in arms about slow websites. ¯\_(ツ)_/¯

If AMP and others exist, there's a reason, and it might be more useful to respond to that than to respond to AMP.

There is a reason. Google gives it preferential treatment.

That’s not a reason that matters to me at all.

I don't see the point of AMP. Why not just give certain sites a little icon/preference if they're faster than XXXms? Glad to see this letter says as much.

Because they did that, for years, and next to nobody changed anything.

Fast websites stayed fast, and slow websites stayed slow. They also gave out tons of free "speed test" tools to make it easy for developers to test their speed, and improve it. They even made apache and nginx plugins that would auto-optimize assets as it served them! And still basically nobody used them.

Plus it's not just about "faster than XXXms". What is that time measured to? Time to text on the screen? Can they game it by lazy-loading images, then videos, then ads, then 6 MB of other JavaScript and social networking stuff and a comment system and...?

You could try a "size limit", but then you penalize asset heavy sites (like high-resolution image sharing sites, or video sites).

So their solution was to make a very restrictive system where you are forced to do things "the right way" (for at least one definition of "right"), and to pump that up in the search results.

Plugins were made for common blogging and news platforms, and now a portion of the web loads significantly faster for many, and I think that's a win.

It's not perfect, but it's at least the first thing that I've seen google try in this area that is actually working.

Obviously there is benefit to Google as a company: results served through their AMP system are faster than those of some of their competitors, and AMP gives them a nice easy way to pull structured information from an article for things like the carousel or other non-search offerings. But I genuinely don't believe that was the main motivator, seeing as Google has years of failed attempts to "fix" this problem (though maybe I'm just not cynical enough!).

FWIW, they did let performance factor, slightly, into ranking, but there was never a "fast" icon (one was rumored, but never shipped). An icon like that would've easily motivated folks, no need for AMP to be involved.

That being said, there are technical reasons why this hasn't happened yet. The good news is that some web standards (ex: https://wicg.github.io/feature-policy/) are being worked on that would (hopefully) allow Google to verify performance of sites without relying on AMP.

This would give Google no excuse to give AMP content special treatment, and would hopefully relegate AMP to what it should've been since day one: a framework for performance, but one that wasn't required or bolstered by any Google-colored carrots.

Feature Policy won't help anyone measure a site's performance unless your performance metric is "number of features used".

Not by itself, but it does provide a portion of what is needed. Specifically:

> The developer may want to use the policy to assert a promise to a client or an embedder about the use—or lack of thereof—of certain features and APIs. For example, to enable certain types of "fast path" optimizations in the browser, or to assert a promise about conformance with some requirements set by other embedders - e.g. various social networks, search engines, and so on.

You're right that there's more to it though.
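For the curious, the "promise" described in that quote is just an HTTP response header under the Feature Policy proposal. A hypothetical example (`sync-xhr` is in the draft spec; whether a search engine would trust such declarations for ranking is purely speculative):

```http
Feature-Policy: sync-xhr 'none'; document-write 'none'
```

The idea is that a page declaring it never uses synchronous XHR or document.write gives the browser (or a crawler) a verifiable basis for fast-path optimizations.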

>Fast websites stayed fast, and slow websites stayed slow.

Website did ridiculous things for SEO when Google was the main traffic-driver on the web. Ridiculous. If Google sufficiently penalized bloat in ranking most websites would remove the bloat. But it never was a significant enough factor.

But they could do the best of both worlds if they wanted. The launch of AMP was coupled with the "carousel" that appears at the top of search result pages. Only AMP pages are eligible to appear in that carousel. Why not keep the carousel, but prioritise by overall load time?

I believe they kind of do.

IIRC they treat AMP pages the same as any other in terms of ranking, but since AMP is super fast, it gets a natural boost.

(obviously AMP gets other benefits like the lightning bolt badge and inclusion in the carousel)

Is that not a contradiction in terms? Placement in the carousel means you are placed above all other search results, that's in no way being treated the same as others in terms of ranking.

Ah I misunderstood what you said.

I believe that some of the data in the carousel is pulled from the way that AMP is structured, however I'm not 100% sure.

Didn't Google get itself into this mess by claiming years ago that slow ads don't affect ranking? And then also giving speed small weight in ranking?

Here's a post from 2012 that talks about downgrading rank for sites that have too many ads.


That link is not so much about speed as it is about low-quality content. I can't find a supporting link now, but as I recall, SEO sites echoed guidance from Google many years ago that including reasonable ads wouldn't damage ranking even if they slowed the page down somewhat.

Why does anyone have to change anything? If a site decides to allocate resources towards content vs. speed is that so bad?

Google wants you to search more, visit more pages, so it can show more ads. So it needs a faster Web.

Yeah. This is why I abandoned Google a couple years ago. It became clear to me that they weren't trying to provide me with search results that satisfied my curiosity; everything was tailored to draw me in and keep me searching. It won't send you to a page that is rich with content, it'll send you to one stub article after another.

I think it's kinda hilarious how there is this fixation with load speeds amongst web devs. Most people I know are far more concerned with everything other than the load time.

Uh, there definitely isn’t any fixation with load speeds amongst web devs. If there were, the web wouldn’t be a complete disaster and there wouldn’t be any need for AMP. Users care about load speeds. Web devs care about getting their sites to work when their core i7 workstations are connected to the corporate intranet, and getting paid.

Frankly, I'm okay with slow sites staying slow. Google's initial attempt was to lead the horse to water. The horses that choose not to drink, I can choose not to, uh, ride. (If that maintains the metaphor)

The caveat comes in when no alternative exists, and when access to some information or service on very constrained connections (such as you'd see in remote third-world regions with poor Internet access) meets some reasonable definition of vital. But I've not seen where AMP is playing a significant part in that.

And because google is the police of the internet, they had to do something about criminally slow webpages.

Fuck that, it's really obvious this is another Google attempt at getting as much traffic to themselves as possible.

For example, Reddit loads extremely fast, and AMP pages are absolute shit, I don't know why they decided to use it.

And I'd rather view the website as intended (minus the ads, sorry), than in a shitty cut down version that takes 3 seconds less to load.

As I understand it, a round-trip request on a mobile network can be very expensive [0]. One of the ways that AMP enables sites to load instantly-upon-click is by starting those requests in the background. (Then, when the user clicks, it's just a matter of bringing the IFRAME into view - no network involved.)

Loading a traditional site can have side effects (e.g. reporting a view to an advertiser's analytics). By using the AMP validator, Google Search knows it's safe to preload and prerender an article, without triggering side-effects like analytics.

So it's more than just loading faster than some threshold; it needs to be safe to cache and prerender the content. The AMP validator is what asserts these are both safe.

(Note, I don't work on AMP and don't represent Google - this is my personal understanding.)

[0] Ilya Grigorik's time-to-glass talk https://www.youtube.com/watch?v=Il4swGfTOSM
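As a rough illustration of the background-fetch idea, the generic browser mechanism looks like this (these are standard resource hints, not AMP's actual implementation, which uses its own cache and iframes):

```html
<!-- Hint the browser to fetch, or even fully render, a likely next page
     before the user clicks, so the click itself involves no network round-trip -->
<link rel="prefetch" href="https://example.com/article.html">
<link rel="prerender" href="https://example.com/article.html">
```

The catch the parent describes is that doing this for arbitrary third-party pages would fire their analytics and ad impressions; AMP's validator is what lets Google know the prerender has no such side effects.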

I have a very low datacap, I don't want a website to preload anything, I'd rather have a slow load.

Preload is treated as a browser hint. Presumably, your browser knows you're on a capped network.


That would be preloading assets, not the preloading of entire AMP sites as the GP describes.

Best practices can reduce page load time from ~10s down to ~1s, and then preloading can reduce it to maybe ~20ms. Google's "preloading or nothing" attitude and its refusal to discuss tradeoffs between market power and speed really aren't helpful.

Well, if one were to buy into the hypothesis that AMP is a way for Google to monopolize content, that would be the reason. At least in search results provided by Google's search.

AMP pre-loads content, and no third party site can do that, so it's probably impossible to be as fast as AMP.

The narrative of wanting to monopolize user traffic doesn't make much sense in light of the recent announcement around URLs.

I'm not sure why you think it doesn't make much sense. Google can do things that benefit multiple parties including themselves.

I think a lot of this is in reaction to FB's Instant Articles and wanting to gain further leverage over publishers. If you can commoditize publishers' content and control the format, and you supply the users and the monetization via ads, you can make the publishers do whatever you want because the alternative is for them to lose money, which they can't afford to do.

In many ways, this is a very defensive play by Google. It also happens to provide a better user experience in some ways, which is awesome, and also helps further Google's goals of controlling ad formats and more importantly the data that is collected on publisher sites that lets them monetize their data in ways they may not be able to with AMP pages.

Downvoters: this is 100% true. AMP has some nice standards on how to make a fast page, but at the end of the day, Google is re-hosting your page in a place where it can fetch it in advance while obeying the same-origin policy, and clicking a result is the moral equivalent of unsetting a "display: none" style on the page.

AMP only works today because it's re-hosted inside the origin (google.com), so much that in order to fix this in the future, they'll have to rush out a new web standard and implement it in chrome (and hope other browsers do the same:) https://amphtml.wordpress.com/2018/01/09/improving-urls-for-...

It's not just about speed, it's about quality as well. AMP has more restrictions than just that the site is fast: AMP pages have to have all image sizes defined and the content loaded in the initial load - no fast page loads followed by spinners or lazy-loaded content, or page blocks that jump around as they load. It's a better user experience all around, not just a faster page.
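Concretely, AMP replaces `<img>` with its `<amp-img>` component, which requires explicit dimensions so the runtime can reserve layout space before the image arrives and nothing jumps around. A sketch based on AMP's documented usage (the file name is illustrative):

```html
<amp-img src="dog.jpg"
         width="600" height="400"
         layout="responsive"
         alt="A very good dog"></amp-img>
```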

I support this completely and I would go one step further. We need a search engine that is free of CPM-monetization practices. It has become clear that Google is driving the web intentionally in a profit-maximizing direction at the expense of users and ordinary developers.

Move ad blocking to the indexing layer. Create a corner of the web that is naturally fast and user-friendly again without resorting to corporate-defined subsets of well-documented open web technologies.

And it will get paid... How? Ads are keeping your net experience free.

DuckDuckGo does advertise, but ads are based on the current search, never on tracking you. I.e., ads highly relevant to you right now, not three months from now.

DDG is just a re-skin of Bing search. They don't do their own indexing.

My understanding is that they don't just use Bing. Unless that's changed?

They also use Yandex, which isn't very relevant unless your website is in Russian, and allow websites to manually set up their pages for instant answers.

This was my understanding too. Their Sources page[1] indicates that they run a crawler called DuckDuckBot (implying some indexing takes place), and have community maintained/crowd-sourced data as well.

[1] https://duck.co/help/results/sources

Those are mostly DDG's instant answers, if I'm not mistaken, where websites use the SDK to submit their data to be crawled.

DDG doesn't have webmaster tools or crawl domains natively.

It would still be ads, but much nicer ones that come out of direct negotiation between publishers and advertisers, instead of some shady clickbait ad agency randomly selected to invade your site by Google's algorithms.

My experience has been that direct-sold ads produce more revenue for publishers than programmatic ads. I would love to see research on this one day.

> My experience has been that direct-sold ads produce more revenue for publishers than programmatic ads

Yes, but do they provide an equally good ROI for advertisers? (And the same level of granularity, tracking and reach as something like Google Adwords?) That's the real issue at play.

If price is any measure of value, yes.

"Premium" sites (like the ones with engineering resources to build to AMP spec) e.g. NYTimes, WA Post, WSJ, are not really prime concern - they can afford to move to AMP and do whatever kind of ad sales they want (direct/programmatic).

It's the smaller websites (Bob's Blog) who can't afford eng resources to build to the AMP spec, and to build fast ads, who are "suffering" a loss of search-result prioritisation. (These smaller sites may turn to shady ad networks, and suffer further with shitty ad payloads, further lowering search preference if it's based on page load.)

Does that mean smaller websites are stifled? Not sure...

> much nicer [ads] that come out of direct negotiation between publishers and advertisers

Uh, usually this results in "native advertising", which is advertising pretending to be real articles. This is much more dangerous and insidious in my opinion than an explicit 'sponsored' box with an ad inside.

If this continues, successful publishers will be the ones that push the most profitable corporate propaganda, rather than the ones that effectively inform the public.

You're still thinking in terms of programmatic ad marketplaces.

I've actually done native advertising and even something as simple as adding a company's logo to an online catalog can pull in $100,000s in ad revenue. You don't need DFP for that, and it's undetectable by ad blockers because it's just an image.

We need to get out of this concept of thinking of advertising as just publishers selling blank squares of "real estate" on their site, and back to the sort of partnership model that existed in print media. A publisher specializes in reaching a certain audience with certain interests; the right advertiser will pay more to reach that audience. You don't need to track individuals to do that or build profiles on them.

Digital advertising made a new pricing model possible: CPM. But just because you have a new way of doing something doesn't mean it's the best way.

Too bad there aren't alternative options, like Patreon, or Brave, or Kickstarter. If such a concept existed, people might actually spend money on those.

I have no idea what this comment means. Do you want to block all websites that have ads on them from Google?

I mean block websites that monetize using CPM, which would incidentally be the vast majority of ads served using DoubleClick's ad marketplace. These also tend to be the ads caught with ad blockers, since CPM requires technical artifacts like iframes that are easy to spot.

Sorry, I'm trying to be as succinct as possible, but I've come to see the advent of ad blockers, intrusive ads, pervasive tracking, client-heavy web practices, and clickbait as a single phenomenon that has its roots in the rise of CPM as the primary pricing mechanism for advertising.

I'm sure that your search engine with no websites on it will be a huge success.

Do you know another way out of the Googernet ad matrix?


You can do any of the following.

1) Accept that most websites have to make money to continue to provide you content and that advertising is how most of them do it.

2) Do not visit websites that choose to support their business through advertising.

3) Use uBlock and continue to use websites.

But given the prevalence of websites that monetize via advertising a search engine that excludes websites with ads is as useful as a social network without any people.

Eh, it's not quite the same as a social network. Indexing a webpage doesn't require any action on the publisher's part (even though in practice, it's astonishing the hoops publishers jump through for SEO). If that were the case, you would be right, and there would not be any room for niche search engines like DuckDuckGo.

Like I said in other comments - my gripe is not with advertising. My gripe is with CPM. The trend of optimizing for CPM has created a perverse incentive structure that harms the web ecosystem in many ways, and IMO, only benefits Google.

>We need a search engine that is free of CPM-monetization practices

Then you should join the distributed free p2p search engine: http://yacy.net

