
Google – polling like it's the 90s - remotekieran
https://www.ably.io/blog/google-polling-like-its-the-90s/
======
nostrademons
When this feature launched it was literally an engineer sitting at home
watching TV and entering the scores into a Google Spreadsheet, which Search
pulled via Google Sheets API and stuck into a formatted OneBox. I think
they've since taken the engineer out of the loop and worked out real data
distribution deals with the sports leagues.

The thing is - from a product perspective, this is the right way to do it. At
the time of its launch (2013, IIRC), about 90% of new features launched on the
SRP didn't survive live experiment phase (i.e. nobody wanted them), and about
98% didn't survive their first birthday (if people did want them, it was only
transient, or they didn't want them enough to justify ongoing maintenance).
This was one of the lucky ones.

It's also somewhat pointless to optimize this case - from a product
perspective, polling is _better_ for the user (websockets have some edge cases
around firewalls and network changes that make them a bit less reliable), and
I guarantee that the bandwidth used by this feature is a rounding error for
Google. They spend many times more hosting jQuery for free than this feature
consumes.

Sometimes the dead-simple answer is the right one, and rather than paying
someone to do the hard bits, you're better off not doing the hard bits at all.

~~~
matt_oriordan
I appreciate that while a feature is in its test phase it makes sense to keep
things simple. But this is in production, and there are many dead-simple
solutions that do this efficiently: long polling, SSE, etc. This is Google,
after all.

~~~
Raidion
To implement this, you have to have a developer who is doing something else
switch to this, get up to speed, and make a change. This involves risk and
opportunity cost.

The question for enterprise software development isn't "Will this make this
thing better?", but "Is making this change more valuable than the other things
on the priority list?"

~~~
matt_oriordan
Sure, but I never suggested front-end developers should build new tech every
time they want a new feature. But there are solutions that do this better now,
so it seems odd to ignore them. It's a bit like saying React/Angular/Vue/any
other frontend framework won't make things tangibly better, so why bother. Or
that optimising page size won't make things better, so why bother. Google
actively invests in telling everyone else to optimize their websites and even
prioritizes fast sites over slow sites
([https://developers.google.com/speed](https://developers.google.com/speed)).
Why shouldn't they care about their own performance too?

~~~
askmike
This isn't just a frontend optimization that a good dev can apply in
isolation. With WS or SSE, a server now needs to be written and supported to
push this data, on top of all the networking issues that come with these
streaming protocols.

> Why shouldn't they care about their performance too?

They definitely do this for everything they push out that runs on billions of
devices for people all over the world. How many people use Google to keep
track of the live score of "the Laver Cup tennis championship in Australia?"
I'd be surprised if that's more than 100. But Google has these numbers, and
based on that they've chosen not to optimize this.

~~~
nostrademons
There's also the issue of optimizing for sports fans while pessimizing for the
general population.

Say that they do this "right" and include a real-time websockets client that
has all the proper reconnect logic, uses a minified protocol, has a proper
real-time stack on the backend with all the appropriate DDoS protection,
encryption, CORS, etc. All the client logic is extra bytes that need to be
shipped on every search request. Or they could split the bundle and lazy-load
it, which is still extra bytes, but fewer of them, and adds complexity. The
server code needs to be audited for security; one compromised frontend server
and the damage to Google is already more than the entire value of the feature.
One black-swan websearch outage (caused, say, by a misconfiguration leading to
the load balancers getting slammed by websocket connections, or a
misconfiguration causing the DDoS servers to think websearch is actually
undergoing a DDoS when it's just normal websocket traffic) is already more
than the entire value of the feature. The bar for simplicity and reliability
has to be pretty high for a feature that's going to be used by 0.0001% of
users, because if there are _any negative effects at all_ on mainstream
websearch traffic, that feature should never have been launched.

~~~
matt_oriordan
I agree complexity should be avoided. But I don't really follow the logic that
because any feature can break security, reliability, etc., you should never
change anything or progress. Google is doubling down on these features, as you
can see at
[https://www.seroundtable.com/google-pin-live-scores-28261.ht...](https://www.seroundtable.com/google-pin-live-scores-28261.html).
If they're continuing to invest in adding more realtime features, and
penalising sites in search results that don't optimise their websites
([https://www.sitecenter.com/insights/141-google-introduces-pe...](https://www.sitecenter.com/insights/141-google-introduces-penalty-for-slow-websites.htm)),
then I think they should apply that same logic to their own site, IMHO.

~~~
summerlight
Probably that's why Google is trying to improve network protocols (QUIC,
HTTP/3) rather than relying on one-off optimizations described in the article.

~~~
matt_oriordan
How does QUIC or HTTP/3 change things? Polling is polling, regardless of
whether it's over HTTP/1, HTTP/2 or HTTP/3 (QUIC).

~~~
summerlight
It still improves overall network efficiency by more than enough to negate the
0.001% of inefficient cases that use polling. I'd say that's the right way to
prioritize optimization engineering headcount, which is pretty scarce.

------
fooblitzky
The article considers efficiency from the client side, but it doesn't really
consider scaling effects on the server side. Simple polling requests coming in
are easy to load-balance, either randomly or by a round-robin scheme. It's
resilient, because if one request fails, the next will likely succeed and the
user won't even notice.

On the other hand, managing millions of open websocket connections full of
state would make things difficult. You would have an uneven distribution of
load across servers, the challenge of knowing which connections to close at
the load balancer, and probably the best way to keep the user experience
smooth would be to have the client assume failure rather than tolerate delay,
aggressively disconnecting and establishing a new websocket anyway.

~~~
rauhl
Agreed that websockets would be overkill and probably less reliable, but why
not Server-Sent Events[0]? They are quite simple and _should_ get the job
done. For unidirectional information I find that they’re a pretty decent
technology, and one which is surprisingly-often overlooked.

0: [https://www.w3.org/TR/eventsource/](https://www.w3.org/TR/eventsource/)
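For what it's worth, the SSE wire format is about as simple as protocols get:
each message is a few `data:` lines terminated by a blank line. A minimal
sketch of a server-side frame encoder (a hypothetical helper, not taken from
the spec itself):

```javascript
// Encode one Server-Sent Events frame as defined by the eventsource spec:
// optional "event:" and "id:" fields, one or more "data:" lines, and a
// terminating blank line that marks the end of the message.
function sseFrame({ event, id, data }) {
  let frame = "";
  if (event) frame += `event: ${event}\n`;
  if (id) frame += `id: ${id}\n`;
  // Split on newlines so multi-line payloads become multiple data: lines.
  for (const line of String(data).split("\n")) {
    frame += `data: ${line}\n`;
  }
  return frame + "\n";
}
```

On the browser side, `new EventSource(url)` plus an `onmessage` handler is the
entire client; automatic reconnection and `Last-Event-ID` resumption come for
free.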

~~~
paulddraper
Browser support may be holding SSE back. [1] It will be nice once Edge
(rebased onto Chromium) supports it.

[1]
[https://caniuse.com/#search=server%20side%20events](https://caniuse.com/#search=server%20side%20events)

------
peterkelly
Loading this article involves my browser making 141 separate requests and
downloading a total of 6.9 MB of data. Based on the numbers given in the
article, that's more than eight hours' worth of tennis scores.

~~~
matt_oriordan
Heh, that is true, and not something we're proud of. We're continuing to
optimize things where we can with the resources we have. Given our size, we're
spending our engineering effort on our product, where our optimization work on
streaming benefits our customers. Sadly, as a result, our blog has plenty of
room for improvement. As we grow, I hope the optimization tasks already in our
backlog get prioritized.

I appreciate this may come across as hypocritical, and you're more than
welcome to think that. But I don't think it changes the analysis in my
article: Google tells everyone else to optimize their sites because it makes a
better web, or they'll be penalized
([https://www.sitecenter.com/insights/141-google-introduces-pe...](https://www.sitecenter.com/insights/141-google-introduces-penalty-for-slow-websites.htm)),
and yet, with over 100k staff, they haven't optimized their own results.

~~~
kllrnohj
68KiB (worst case) per 5 minutes kinda makes it sound like Google _did_
optimize their site. They could optimize it further still, certainly, but
there doesn't seem to be any real hypocrisy here as you are claiming. The
absolute number here is still very firmly in the "very small" category.

------
sholladay
I can't decide how I feel about this. On the one hand, they are totally right
that polling is an inelegant solution. In the context of a large-scale
website, perhaps it is even harmful to a degree that we should take special
note of it. On the other hand, however, the amount of time that was spent on
trashing this work is sad to me. Polling is not merely a quick and dirty
solution, it is a high-reliability solution where other techniques may fail. I
have used many proxies that don't handle WebSockets properly, for example. It
is not simple to maintain long-lived streams or connections across all
networks. So there are practical upsides to polling that should not be
ignored. I can appreciate that this post goes into details explaining why
polling is bad, but perhaps it would have come across better as a tweet or
something, rather than a full blog post with fancy charts showing how much a
particular project sucks.

~~~
unilynx
Exactly - poll `/status.json?at=${Date.now() - Date.now() % 5000}` and let the
CDN worry about infinitely distributing state.

(and make sure the CDN isn't caching 404s triggered by clients who are already
in the future)
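In case the trick isn't obvious: quantizing the timestamp makes every client
in the same 5-second window request the identical URL, so the CDN serves one
origin fetch to everyone. A sketch of the client side (the `/status.json`
endpoint is the hypothetical one from the comment above):

```javascript
// Round the current time down to a 5-second bucket so all clients in the
// same window share one cacheable URL, and the CDN absorbs the load.
const BUCKET_MS = 5000;

function bucketedScoreUrl(nowMs = Date.now()) {
  const at = nowMs - (nowMs % BUCKET_MS);
  return `/status.json?at=${at}`;
}

// Poll on the same cadence the URL is bucketed by.
function startPolling(onUpdate) {
  return setInterval(async () => {
    const res = await fetch(bucketedScoreUrl());
    if (res.ok) onUpdate(await res.json());
  }, BUCKET_MS);
}
```

The server only ever sees roughly one request per bucket per CDN edge,
regardless of how many clients are polling.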

------
rjkennedy98
Aside from the points made about the technical solution, it's worth pointing
out that almost no one sits watching the Google scores page waiting for
updates. 95% of people are just going to check the page for the score and
leave. If they want more up-to-date scores, they'll watch an actual game
tracker on ESPN or the like. I think this blog post really misses the business
use case.

~~~
joshstrange
Wow, this isn't ranked high enough, and sadly it didn't even occur to me until
I read your comment. I'm not huge on the sports-ball, but when I do need a
score I google the game, see the score, and close the tab or sleep my phone. I
might look at it later, but even that is not a common use case for me.

------
mpetrovich
The author makes an unvalidated assumption that the users of this service care
MOST about bandwidth efficiency.

I suspect that users care more about a service that works reliably. An extra
50-100k every 5 minutes (assuming the user keeps their mobile browser open
during this time) does not seem like it would be problematic.

Ironically, the alternatives he proposes make the service LESS reliable, since
many users may be behind firewalls that block WebSockets, HTTP streaming, etc.

HTTP polling works for a larger percentage of users and can be scaled
horizontally more easily than these other methods during high-volume spikes
like the World Cup.

In short, I think Google made the right tradeoff between dumb, boring,
accessible vs. clever, complex. Especially for a product that probably doesn’t
meet the threshold for investing in a more sophisticated architecture.

~~~
microcolonel
> _firewalls that block WebSockets_

How does that work? How can a firewall tell that a TLS connection is a
WebSocket and not just an HTTP session with a server experiencing high load?

Added: just saw this gem[0] from a few days ago, does anyone have some idea
why McAfee suggests to their customers that WebSockets are a potential
security risk on a web client network? I have literally no idea how they would
be more risky than ordinary HTTP...

[0]:
[https://kc.mcafee.com/corporate/index?page=content&id=KB8405...](https://kc.mcafee.com/corporate/index?page=content&id=KB84052&actp=null&showDraft=false&locale=en_US&viewlocale=en_US)

~~~
acdha
If the firewall has access to the traffic stream - say, a local "security"
tool or a corporate managed environment - it can block the Upgrade header
which attempts to turn the HTTP connection into a WebSocket. That's the kind
of thing which doesn't affect a huge percentage of users, but at Google's
scale it's still a large number of people.

Re:McAfee, I wouldn't agree with the logic but I've heard people worry about
data being tunneled out through new protocols which are harder to filter or
used to establish some kind of a persistent control channel. In almost all
cases this has high impact with little benefit unless you're filtering all
other traffic strictly enough that malware can't use other common
circumvention techniques.
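Concretely, the handshake such a middlebox strips is visible in plaintext once
TLS is intercepted; a filter only has to look for the Upgrade header. A sketch
of that check (a hypothetical proxy-side predicate, assuming header names have
been lowercased as Node does):

```javascript
// A TLS-intercepting proxy can block WebSockets simply by refusing any
// request that asks to upgrade the connection. RFC 6455 requires both
// "Upgrade: websocket" and a Connection header containing "Upgrade".
function isWebSocketUpgrade(headers) {
  return (
    (headers["upgrade"] || "").toLowerCase() === "websocket" &&
    (headers["connection"] || "").toLowerCase().includes("upgrade")
  );
}
```

Plain polling never sends these headers, which is why it sails through
middleboxes that choke on WebSockets.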

------
sidhuko
I built an app for a major sports company in the UK several years ago now. The
choice was actually the most cost-effective one. Most providers of the data,
usually collated by people in the stadiums using an Xbox controller and a
relay device, upload static files to storage. We used a thin node.js layer to
stitch these together from a connected volume and expose them via an endpoint
that accepted a time range. If no user ever requested the results for, say,
the 36th minute of a game, our solution didn't have to do any work. Once we
did process a request for a game at a given time range, all other users would
receive the same cached result. It was very cost-effective compared with
millions of users requiring websockets, where we'd process all the events
regardless of whether anyone was watching.

~~~
brisance
Thank you for your perspective. This is a perfect example of the difference
between theory and practice.

------
Footkerchief
This did a great job convincing me that Ably's tech is more complicated, and a
terrible job convincing me that the UX improvement is worth the implementation
cost.

------
hirundo
The 68KiB per 5 minutes is trivial for most users and the 10 second latency
for a sports score usually is too. Measure fixing these problems against the
opportunity cost of not getting something else done. This reads as more of a
tradeoff than a horror story.

~~~
commandlinefan
> more of a tradeoff

You make it sound like it would have taken months of investment to just use
one of the many available better options - at worst, it might have taken a
week for somebody to learn web sockets if they actually didn’t have anybody
who knew it (and even then, that’s a slow learner). We should be delivering
the most efficient options and always weighing the alternatives - we’re
professionals, maybe some day we’ll actually start behaving that way. From the
top-level responses to this article, today is not that day.

~~~
spenczar5
Websockets imply persistent connections between servers and clients. That
long-term state tightly couples the two: deploying a change to servers now
requires draining all running connections, for example. Load balancing is much
harder. Throttling client behavior is much harder, so you’re not as insulated
from bad client behavior or heavy hitters. Consequences of that decision
ripple through the entire architecture.

I really like the polling approach used here. It’s simple, easy to reason
about, and loosely coupled. It will be reliable and resilient.

Saying it might have taken a week to learn websockets completely misses the
point. I’ve built large architectures on persistent connections and _deeply_
regretted it.

~~~
tjungblut
Exactly! Keeping all of those stateful connections open is a much bigger issue
than load-balancing simple HTTP polls (which you could cache super
aggressively). I think this is what the blog post really misses, and it shows
they don't have much of a clue about operating things at worldwide scale.

------
scary-size
Seems low tech, but _it just works_ for _every_ user.

~~~
ryandvm
Exactly. It's "good enough" engineering. HTTP polling is guaranteed to work
for every client, all of the time.

While the other solutions mentioned offer improved performance with regard to
latency and bandwidth, neither of those is going to noticeably improve the
user experience. And having to deal with things like bloated libraries, broken
websocket connections, or browser incompatibility means the proposed
alternative solutions have a very real possibility of degrading the user
experience.

I get it, the author sells real-time solutions, but in this very specific use
case, I think HTTP polling was actually the correct choice.

~~~
kekub
This. Especially because their main service is search and not live scores. I
would expect that these are shown quite rarely.

------
bastawhiz
I'm surprised that this article didn't consider HTTP/2 or QUIC at all. In
either case, the disadvantages of polling are dramatically decreased
(persistent connection or lack of need for one, header compression, etc.).
Having long-lived connections with web sockets is hard to do at scale (pinning
a user to a single server, pushing caching down a layer or two in the stack),
and when you're Google the back-end efficiency of stateless requests is hardly
a concern.

If you're on a browser that doesn't support a new version of HTTP, you're
probably also not able to use web sockets. Supporting two dedicated transports
(one for fallback, one for marginal efficiency gains over the fallback) for a
small feature like this seems crazy.

------
falcolas
Something else to consider (and something we've had to deal with at work): Not
all company firewalls support websockets, long polling, and other newer
technologies. So, if you want your product to work as many places as possible
(without working with every potential user independently), you really do need
to use tech from the 90's.

~~~
matt_oriordan
falcolas, I'm not aware of any single firewall that blocks long polling or XHR
streaming (at least for a fixed period). Can you substantiate that? We
regularly check our transports against legacy devices at Ably, so I don't
think this is true. I stand to be corrected, of course :)

------
thrower123
There are a lot of feature requests that just aren't worth implementing in a
complex way.

You can set up a simple GET request with some caching on the server and a
simple poll on the client in minutes to a couple hours and move on. Especially
when real-time accuracy doesn't matter.

------
squeaky-clean
Looks like Ably has their own page on long polling and why sometimes you can't
use it or websockets. A few cherry-picked sentences follow (though it was
actually a good read; I recommend the whole article if you have the time).

[https://www.ably.io/concepts/long-polling](https://www.ably.io/concepts/long-polling)

> Reliable message ordering can be an issue with long polling because it is
> possible for multiple HTTP requests from the same client to be in flight
> simultaneously.

> Another issue is that a server may send a response, but network or browser
> issues may prevent the message from being successfully received. Unless some
> sort of message receipt confirmation process is implemented, a subsequent
> call to the server may result in missed messages.

> Depending on the server implementation, confirmation of message receipt by
> one client instance may also cause another client instance to never receive
> an expected message at all, as the server could mistakenly believe that the
> client has already received the data it is expecting.

> Unfortunately, such complexity is difficult to scale effectively. To
> maintain the session state for a given client, that state must either be
> sharable among all servers behind a load balancer – a task with significant
> architectural complexity – or subsequent client requests within the same
> session must be routed to the same server to which their original request
> was processed.

> This can also become a potential denial-of-service attack vector – a problem
> which then requires further layers of infrastructure to mitigate that might
> otherwise have been unnecessary.

> That said, there are cases where proxies and routers on certain networks
> will block WebSocket and WebRTC connections, or where network connectivity
> can make long-lived connection protocols such as these less practical.
> Besides, for certain client demographics, there may still be numerous
> devices and clients in use that lack support for newer standards. For these,
> long polling can serve as a good fail-safe fallback to ensure support for
> everyone, irrespective of their situation.

And this one is the most important response to the OP article, in my opinion.

> That said, given the time and effort – not to mention the inefficiency of
> resource consumption – involved in implementing these approaches, care
> should be taken to assess whether their support is worth the added cost when
> developing new applications and system architectures.

------
jmathai
I miss 90s tech. I feel like building web applications was a lot more
enjoyable with 90s tech that really met the needs of users.

------
groestl
Polled responses will be cached away at Google's Edge Cache. It's dumb on the
client, but this scales practically without limit.

------
cj
Let's look at some strategies for updating content on a page...

#1: Give users a link they can click that says "Click here to refresh"

#2: Implement `<meta http-equiv="refresh">`. _Usability win: direct user
interaction not required_

#3: Implement HTTP polling. _Usability win: the page doesn't flicker and
disrupt the user, since no page refresh is necessary_

#4: Implement HTTP long polling. _Usability win: ???_

#5: Implement SSE. _Usability win: ???_

#6: Implement WebSockets. _Usability win: ???_

TLDR: The user doesn't care if you use HTTP polling or websockets.

New + complex is no better (and often worse) than old, boring and simple.

~~~
Matthias247
The usability win from #3 to the other solutions is: users don't have to wait
until the next polling interval expires in order to get an update. In
interactive applications users don't expect delays, so you should display
updates within around 100ms. Achieving that with polling might not be feasible
for some applications.

------
unilynx
Let's hope the author never finds out about HTTP Live Streaming. Now there's
some very effective abuse of HTTP, even though we thought we had much better
protocols already.

~~~
Matthias247
Which ones?

My thought is that HLS is actually quite a nice idea, once we remove the
"live" from its name. There are certainly protocols that do live broadcast a
lot better. But the ability to distribute it over plain HTTP at massive scale,
with zero changes to existing webservers, makes it a quite compelling
solution. We can't serve real realtime streaming protocols from CDNs.

------
jedberg
Ironically the ably website won't load for me right now so I can't even read
this blog post about website scalability and efficiency.

------
xg15
A technical point in addition to the other arguments: even with WS or SSE,
you'd have to send keep-alive packets periodically to find out whether the
connection has died. Those would be smaller than polled data, and the interval
is no longer bound to your refresh rate - but it's still more than nothing.
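The usual pattern is heartbeats with a tolerance window: the server emits a
tiny ping every interval, and the client declares the connection dead once
roughly two intervals pass in silence. A sketch of the staleness check (the
interval value is illustrative):

```javascript
const HEARTBEAT_MS = 15000;

// Presume the connection dead once two heartbeat intervals pass without
// hearing anything; the client should then tear down and reconnect.
function isStale(lastSeenMs, nowMs, intervalMs = HEARTBEAT_MS) {
  return nowMs - lastSeenMs > 2 * intervalMs;
}
```

That heartbeat traffic is exactly the residual cost described above: smaller
than a poll response, but never zero.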

------
anonygler
> design choices are surprisingly bad in terms of bandwidth demand, energy
> consumption

Google Search supports old browsers to an incredibly painful degree. I believe
IE6 still works.

This effectively discourages engineers from exploring new approaches, because
they always have lots of edge cases that just aren't worth solving.

------
jchw
OTOH: Polling is stateless and easy to scale. It is not the most efficient,
but presumably it can be improved versus the worst case by exploiting browser
caching effectively, and via connection re-use.

------
ammmir
Maybe the original developer left the company and it wasn't worth the effort
to introduce more state on the server side via the other methods? I can
imagine a scenario where, if an open TCP connection isn't generating revenue
(via ads or whatever), they don't want to allocate any server-side resources
for it.

But yeah, they could have improved things by diffing the data or even
leveraging the If-Modified-Since HTTP header. Seems lazy.
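The conditional-GET idea is cheap to adopt because the browser does half the
work: it remembers the validator and replays it automatically. On the server,
the check is nearly a one-liner; a sketch using `Last-Modified` (an `ETag`
comparison would work the same way):

```javascript
// Answer 304 Not Modified when the client's cached copy is still current.
// Both arguments are HTTP-date strings, e.g. "Wed, 25 Sep 2019 10:00:00 GMT".
// A missing or unparseable If-Modified-Since header means a full response.
function isNotModified(ifModifiedSince, lastModified) {
  const since = Date.parse(ifModifiedSince);
  const mod = Date.parse(lastModified);
  return Number.isFinite(since) && Number.isFinite(mod) && mod <= since;
}
```

When the score hasn't changed, a poll then costs only headers: a 304 carries
no body at all.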

------
ramenmeal
This is google.com. Supporting IE-whatever really does mean a large number of
users for them. Websockets are not an option.

------
redpillor
Google's search results are almost dead; they don't give you effective results
like they used to in the past.

------
GWSchulz
Features like this are presumably what antitrust investigators will
scrutinize. It's Google doing everything in its power to keep users within the
Google empire at the expense of potential competitors, of which there are
precious few at this point.

