The thing is - from a product perspective, this is the right way to do it. At the time of its launch (2013, IIRC), about 90% of new features launched on the SRP didn't survive live experiment phase (i.e. nobody wanted them), and about 98% didn't survive their first birthday (if people did want them, it was only transient, or they didn't want them enough to justify ongoing maintenance). This was one of the lucky ones.
It's also somewhat pointless to optimize this case - from a product perspective, polling is better for the user (websockets have some edge cases around firewalls and network changes that make them a bit less reliable), and I guarantee that the bandwidth used by this feature is a rounding error for Google. They spend many times more hosting jQuery for free than this feature consumes.
Sometimes the dead-simple answer is the right one, and rather than paying someone to do the hard bits, you're better off not doing the hard bits at all.
Who sits at the Google search page for longer than a minute or two waiting for scores to update? They probably aren’t waiting for more than a couple polls.
Your points are of course correct, but I want to note that bandwidth goes both ways. They may not care that they are wasting their own bandwidth - but they are wasting the users' bandwidth as well. If the page was kept open for some time on e.g. a metered connection, this could cause some annoying surprises.
From Google's perspective, the decisions make a lot of sense, but it's not a simple cost/benefit calculation - it also doesn't account for externalised costs.
Over a 5 minute window we're talking 68KiB before compression. The article made the (poor imo) decision to omit compression, claiming it will affect all endpoints. But of course it won't affect all endpoints equally, since the payloads themselves are not equal. But let's go with this 68KiB for now as the worst case. That means if you left this open for a solid 2 hours it'd consume a grand total of less than 1.7MiB. Even on a metered connection it'd be damn hard to single this out as anything but a rounding error.
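The back-of-the-envelope math is easy to check - a quick sketch using the 68 KiB / 5 min worst-case figure from the comment above:

```python
# Worst-case bandwidth for the polling approach, using the uncompressed
# 68 KiB per 5-minute window estimate from the comment above.
KIB_PER_WINDOW = 68          # uncompressed payload per polling window
WINDOW_MINUTES = 5
HOURS_OPEN = 2               # tab left open for a solid 2 hours

windows = HOURS_OPEN * 60 // WINDOW_MINUTES   # 24 polling windows
total_kib = windows * KIB_PER_WINDOW          # 1632 KiB
total_mib = total_kib / 1024                  # ~1.59 MiB

print(f"{total_mib:.2f} MiB over {HOURS_OPEN} hours")  # → 1.59 MiB over 2 hours
```

So the "less than 1.7 MiB" figure holds up: about 1.6 MiB uncompressed, and gzip would shrink it further.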
That's way less than so many web pages deliver on initial load.
The question for enterprise software development isn't "Will this make this thing better?", but "Is making this change more valuable than the other things on the priority list?"
> Why shouldn't they care about their performance too?
They definitely do this for everything they push out that runs on billions of devices for people all over the world. How many people use Google to keep track of the live score of "the Laver Cup tennis championship in Australia?" I'd be surprised if that's more than 100. But Google has these numbers, and based on that they've chosen not to optimize this.
Say that they do this "right" and include a real-time websockets client that has all the proper reconnect logic, uses a minified protocol, has a proper real-time stack on the backend with all the appropriate DDoS protection, encryption, CORS, etc. All the client logic is extra bytes that need to be shipped on every search request. Or they could split the bundle and lazy-load it, which is still extra bytes, but fewer of them, and adds complexity. The server code needs to be audited for security; one compromised frontend server and the damage to Google is already more than the entire value of the feature. One black-swan websearch outage (caused, say, by a misconfiguration leading to the load balancers getting slammed by websocket connections, or a misconfiguration causing the DDoS servers to think websearch is actually undergoing a DDoS when it's just normal websocket traffic) is already more than the entire value of the feature. The bar for simplicity and reliability has to be pretty high for a feature that's going to be used by 0.0001% of users, because if there are any negative effects at all on mainstream websearch traffic, that feature should never have been launched.
Google gets to see the actual error rates when these solutions are deployed in production. If you want traffic to get through to end users, it has to look like what network administrators expect, or it won't reach the long tail. The end result: QUIC encapsulated in UDP, DNS-over-HTTPS, and MPEG-DASH live streaming (making streaming video look like HTTP, and making a new request every few seconds).
If you are not at Google scale, please, yes, use the newer, simpler technology.
IMO that was more dictated by server-side complexity. A multimedia streaming server, especially for live video, that is optimized end-to-end for robustness and efficiency can be a very complex beast. Streaming transcoding, metadata injection, buffering, etc. are difficult on their own. Doing them together in a unified fashion is especially difficult. Solutions like GStreamer that abstract this away and make it conceptually simple are also extremely inefficient at scale. (Take it from someone who wrote an entire end-to-end solution that could do M:N codec, format, and transport transcoding, per-user streaming ad injection, and a bunch of other stuff. I could do 5k+ live audio streams, each with unique ad injection/splicing, with most of the pipeline occurring on a single E3 Xeon core--it was the socket writes to 5,000 clients from a single thread, and the resulting ethernet interrupts, that caused that CPU to max out. Reusing solutions like GStreamer or FFmpeg directly in your pipeline you'd be extremely lucky to get 1/10 of that, and more likely 1/100 at best.)
MPEG-DASH permits the incoming media processor to simply dump a bunch of files to disk. The client facing service can then serve those up however it wishes--it can be completely dumb or incredibly smart, but importantly both halves can evolve on their own. You could do codec and rate transcoding as a third, middle component. And you could break these components down even further, as "everything is a file" is the most basic IPC interface we have. This makes iterative development much easier.
IME, MPEG-DASH is pretty much what you'd get if you began with stodgy old (relatively) commercial broadcast equipment using nasty broadcast framing standards and organically built a streaming pipeline from scratch, without ever having the benefit/burden of knowing how to do performant streaming server development, or even how existing solutions worked. The multiple-files nonsense was never necessary for client compatibility or network filter bypassing. Apple had solved that with RTSP-over-HTTP tunneling years prior, as did Flash with its clunkier RTMP solutions later.
On the other hand, the complexity of trying to manage millions of open web socket connections filled with state would make things difficult. You would have an uneven distribution of load across servers, the challenge of knowing which connections to close in the lb, and probably the best way to keep the user experience smooth would be to have the client assume failure rather than delay, and aggressively disconnect and establish a new web socket anyway.
Those things can in some cases operationally introduce a lot more pain (e.g. in terms of resource exhaustion) than the additional short lived requests.
I appreciate streaming is harder than polling. But IMHO, if that's Google's thinking, it feels a little defeatist coming from a company that has designed and built complex systems like MillWheel (https://ai.google/research/pubs/pub41378) to solve exactly these types of problems.
At Google, those client connections are probably kept open at a load balancer. The load balancer has its own connections to servers on the other side, and those will be as short-lived as possible, so that the server is available to service the next request the lb sends its way.
If you start talking about server-side push technologies, you are keeping a long-lived connection open from the server, through the load balancer, to the client. That will affect the dynamics of the load balancing - it's not as simple as round-robin or random distribution that short-lived requests allow. Consider, for example, a server that becomes overloaded. With short-lived requests you can just add more hosts to the LB pool. With long-lived requests you might need to start thinking about migrating connections, or forcefully terminating some connections if the server hits some kind of load threshold, as well as adding hosts.
It's not impossible, it just makes it more complicated and less reliable, which is not what you want for something that just updates a score. As others have pointed out, it's probably just fetching a static file that gets updated periodically.
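For illustration, the "static file that gets updated periodically" pattern might look like this minimal sketch (the file path and payload are made up; any dumb HTTP server or CDN could then serve the file to pollers):

```python
import json
import os
import tempfile

def publish_score(path: str, score: dict) -> None:
    """Atomically replace the static score file that clients poll.

    The scoreboard pipeline writes this file whenever the score changes;
    a plain web server (or CDN with a short max-age) serves it. Writing
    to a temp file and renaming means pollers never see a half-written file.
    """
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(score, f)
    os.replace(tmp, path)  # atomic rename on POSIX

# Demo: the event pipeline pushes an update, a "client" reads it back.
path = os.path.join(tempfile.gettempdir(), "scores_demo.json")
publish_score(path, {"match": "demo", "sets": [6, 4]})
with open(path) as f:
    data = json.load(f)
print(data["sets"])  # → [6, 4]
```

No per-client state on the server at all - which is exactly why it scales so easily behind a load balancer.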
Websockets etc. are great technologies if you can handle all traffic on a single box; I use them extensively for my startup. They are a terrible technology at scale. If you want to run websockets at scale, I would actually recommend pulling all the real-time features into a side-channel that's updated via RPC when a relevant event happens and listens directly on the Internet, and then making that as lightweight as possible so that it can scale vertically and as non-critical as possible so that users don't care if it goes down momentarily. If your users don't really care about up-to-the-second information (as with sports scores), just don't do it, and use polling instead.
I appreciate this may come across as hypocritical, and you're more than welcome to think that. But I don't think that changes the analysis in my article, which is that Google tells everyone else to optimize their sites because it makes a better web - or you'll be penalized (https://www.sitecenter.com/insights/141-google-introduces-pe...) - while on the other hand they have over 100k staff and haven't optimized their own results.
(and make sure the CDN isn't caching 404s triggered by clients who are already in the future)
I haven't tracked tennis before, but for example ESPN football has a non-video, non-audio mode with event history plus up-to-date field diagram of each play, and I think even a social media or commenter text stream too. You could actually watch the game for a while this way.
I can't imagine looking at this page for longer than a couple minutes.
I suspect that users care more about a service that works reliably. An extra 50-100k every 5 minutes (assuming the user keeps their mobile browser open during this time) does not seem like it would be problematic.
Ironically, the alternatives he proposes make the service LESS reliable, since many users may be behind firewalls that block WebSockets, HTTP streaming, etc.
HTTP polling works for a larger percentage of users and can be scaled horizontally more easily than these other methods during high-volume spikes like the World Cup.
In short, I think Google made the right tradeoff between dumb, boring, accessible vs. clever, complex. Especially for a product that probably doesn’t meet the threshold for investing in a more sophisticated architecture.
How does that work? How can a firewall tell that a TLS connection is a WebSocket and not just an HTTP session with a server experiencing high load?
Added: just saw this gem from a few days ago, does anyone have some idea why McAfee suggests to their customers that WebSockets are a potential security risk on a web client network? I have literally no idea how they would be more risky than ordinary HTTP...
Re:McAfee, I wouldn't agree with the logic but I've heard people worry about data being tunneled out through new protocols which are harder to filter or used to establish some kind of a persistent control channel. In almost all cases this has high impact with little benefit unless you're filtering all other traffic strictly enough that malware can't use other common circumvention techniques.
Maintaining a stateful websocket connection on the server side isn't cost-free, and that connection would be idle nearly 100% of the time. The bandwidth consumed via Google's polling solution might well be cheaper than open socket file descriptors of a websockets solution.
You make it sound like it would have taken months of investment to just use one of the many available better options - at worst, it might have taken a week for somebody to learn web sockets if they actually didn’t have anybody who knew it (and even then, that’s a slow learner). We should be delivering the most efficient options and always weighing the alternatives - we’re professionals, maybe some day we’ll actually start behaving that way. From the top-level responses to this article, today is not that day.
I really like the polling approach used here. It’s simple, easy to reason about, and loosely coupled. It will be reliable and resilient.
Saying it might have taken a week to learn websockets completely misses the point. I’ve built large architectures on persistent connections and deeply regretted it.
While the other solutions mentioned offer improved performance with regard to latency and bandwidth, neither of those are going to noticeably improve the user experience. And having to deal with stuff like bloated libraries, broken websocket connections or browser incompatibility mean the proposed alternative solutions have a very real possibility of degrading the user experience.
I get it, the author sells real-time solutions, but in this very specific use case, I think HTTP polling was actually the correct choice.
You might say it's over-engineering, but in aggregate terms this could make a big difference in power and network usage. I'm talking more about the users (usually mobiles). So I would call it just... engineering.
Engineering also includes scoping resources for maximum impact, after all. Building a fancy, ultra-efficient live score system would be fun, but it's not going to be good engineering if the usage metrics don't justify it.
If you're on a browser that doesn't support a new version of HTTP, you're probably also not able to use web sockets. Supporting two dedicated transports (one for fallback, one for marginal efficiency gains over the fallback) for a small feature like this seems crazy.
You can set up a simple GET request with some caching on the server and a simple poll on the client in minutes to a couple hours and move on. Especially when real-time accuracy doesn't matter.
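A simple poll really is just a few lines. A minimal sketch of the client side (the URL is invented, and the injectable `fetch` hook is only there so the loop can be exercised without a network):

```python
import time
from urllib.request import urlopen

def poll_scores(url: str, interval_s: float = 10.0, fetch=None):
    """Dead-simple polling loop: fetch, yield the payload, sleep, repeat.

    `fetch` is injectable for testing; by default it does a plain GET.
    The consumer decides when to stop iterating.
    """
    fetch = fetch or (lambda u: urlopen(u).read())
    while True:
        yield fetch(url)
        time.sleep(interval_s)

# Demo with a stubbed fetch so no real request is made.
gen = poll_scores("https://example.invalid/scores", interval_s=0,
                  fetch=lambda u: b"6-4")
first = next(gen)
print(first)  # → b'6-4'
```

Stateless on the server, trivially cacheable, and every failure mode is just "retry on the next tick."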
> Reliable message ordering can be an issue with long polling because it is possible for multiple HTTP requests from the same client to be in flight simultaneously.
> Another issue is that a server may send a response, but network or browser issues may prevent the message from being successfully received. Unless some sort of message receipt confirmation process is implemented, a subsequent call to the server may result in missed messages.
> Depending on the server implementation, confirmation of message receipt by one client instance may also cause another client instance to never receive an expected message at all, as the server could mistakenly believe that the client has already received the data it is expecting.
> Unfortunately, such complexity is difficult to scale effectively. To maintain the session state for a given client, that state must either be sharable among all servers behind a load balancer – a task with significant architectural complexity – or subsequent client requests within the same session must be routed to the same server to which their original request was processed.
> This can also become a potential denial-of-service attack vector – a problem which then requires further layers of infrastructure to mitigate that might otherwise have been unnecessary.
> That said, there are cases where proxies and routers on certain networks will block WebSocket and WebRTC connections, or where network connectivity can make long-lived connection protocols such as these less practical. Besides, for certain client demographics, there may still be numerous devices and clients in use that lack support for newer standards. For these, long polling can serve as a good fail-safe fallback to ensure support for everyone, irrespective of their situation.
And this one is the most important response to the OP article, in my opinion.
> That said, given the time and effort – not to mention the inefficiency of resource consumption – involved in implementing these approaches, care should be taken to assess whether their support is worth the added cost when developing new applications and system architectures.
#1: Give users a link they can click that says "Click here to refresh"
#2: Implement <meta http-equiv="refresh">. Usability win: direct user interaction not required
#3: Implement http polling. Usability win: the page doesn't flicker and disrupt the user, since no page refresh is necessary
#4: Implement http long polling. Usability win: ???
#5: Implement SSE. Usability win: ???
#6: Implement Websockets. Usability win: ???
TLDR: The user doesn't care if you use http polling or websockets.
New + complex is no better (and often worse) than old, boring and simple.
My thoughts were that HLS is actually a quite nice idea, once we remove the "live" from its name. There are certainly protocols that do live broadcast a lot better. But the ability to distribute it over plain HTTP at massive scale with zero changes to existing webservers makes it a quite compelling solution. We can't use real realtime streaming protocols from CDNs.
Google Search supports old browsers to an incredibly painful degree. I believe IE6 still works.
This effectively discourages engineers from exploring new approaches, because they always have lots of edge cases that just aren't worth solving.
But yeah, they could have improved on diffing the data or even leveraging the If-Modified-Since HTTP header. Seems lazy.
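Conditional GET with If-Modified-Since is standard HTTP and cuts unchanged polls down to headers only. A self-contained sketch (the handler, payload, and fixed timestamp are all invented for the demo):

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.error import HTTPError
from urllib.request import Request, urlopen

LAST_MODIFIED = "Wed, 01 Jan 2020 00:00:00 GMT"  # fixed timestamp for the demo

class ScoreHandler(BaseHTTPRequestHandler):
    """Serves a score payload, honouring If-Modified-Since with 304."""
    def do_GET(self):
        if self.headers.get("If-Modified-Since") == LAST_MODIFIED:
            self.send_response(304)   # unchanged: headers only, no body
            self.end_headers()
        else:
            body = b'{"sets": [6, 4]}'
            self.send_response(200)
            self.send_header("Last-Modified", LAST_MODIFIED)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
    def log_message(self, *args):     # keep the demo quiet
        pass

def fetch(url, since=None):
    """One poll: send If-Modified-Since if we have a stamp; treat 304 as 'no change'."""
    req = Request(url)
    if since:
        req.add_header("If-Modified-Since", since)
    try:
        with urlopen(req) as resp:
            return resp.status, resp.read(), resp.headers.get("Last-Modified")
    except HTTPError as e:
        if e.code == 304:
            return 304, b"", since
        raise

server = HTTPServer(("127.0.0.1", 0), ScoreHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/scores"

status, body, stamp = fetch(url)            # first poll: full payload
print(status)                               # → 200
status, body, _ = fetch(url, since=stamp)   # second poll: nothing changed
print(status)                               # → 304
server.shutdown()
```

Same polling cadence, but the steady-state cost per unchanged poll drops to a request/response header exchange.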