Google – polling like it's the 90s (ably.io)
108 points by remotekieran 13 days ago | 80 comments

When this feature launched it was literally an engineer sitting at home watching TV and entering the scores into a Google Spreadsheet, which Search pulled via Google Sheets API and stuck into a formatted OneBox. I think they've since taken the engineer out of the loop and worked out real data distribution deals with the sports leagues.

The thing is - from a product perspective, this is the right way to do it. At the time of its launch (2013, IIRC), about 90% of new features launched on the SRP didn't survive live experiment phase (i.e. nobody wanted them), and about 98% didn't survive their first birthday (if people did want them, it was only transient, or they didn't want them enough to justify ongoing maintenance). This was one of the lucky ones.

It's also somewhat pointless to optimize this case - from a product perspective, polling is better for the user (websockets have some edge cases around firewalls and network changes that make them a bit less reliable), and I guarantee that the bandwidth used by this feature is a rounding error for Google. They spend many times more hosting jQuery for free than this feature consumes.

Sometimes the dead-simple answer is the right one, and rather than paying someone to do the hard bits, you're better off not doing the hard bits at all.


Google, just like any other company, has to balance between something working and working the right way. The Google managers are probably just fine with polling and, as you said, the bandwidth isn’t even a rounding error.

Who sits at the Google search page for longer than a minute or two waiting for scores to update? They probably aren’t waiting for more than a couple polls.


I do this all the time when I want the live scores but don't want to watch a game

> and I guarantee that the bandwidth used by this feature is rounding error for Google.

Your points are of course correct, but I want to note that bandwidth goes both ways. They may not care that they are wasting their own bandwidth - but they are wasting the users' bandwidth as well. If the page was kept open for some time on e.g. a metered connection, this could cause some annoying surprises.

From Google's perspective, the decisions make a lot of sense, but it's not simply cost/benefit; it's also not considering the externalised cost.


> but they are wasting the users' bandwidth as well. If the page was kept open for some time on e.g. a metered connection, this could cause some annoying surprises.

Over a 5-minute window we're talking 68KiB before compression. The article made the (poor, imo) decision to omit compression, claiming it would affect all endpoints. But of course it won't affect all endpoints equally, since the payloads themselves are not equal. Still, let's go with this 68KiB as the worst case. That means if you left this open for a solid 2 hours (24 of those 5-minute windows) it'd consume a grand total of less than 1.7MiB. Even on a metered connection it'd be damn hard to single this out as anything but a rounding error.


> 1.7MiB

That's way less than so many web pages deliver on initial load.


Hmm, that's a good point.

I appreciate that when features are in the test phase it makes sense to keep things simple. But this is in production, and there are plenty of dead-simple solutions that do this efficiently: long polling, SSE, etc. This is Google after all.

To implement this, you have to have a developer who is doing something else switch to this, get up to speed, and make a change. This involves risk and opportunity cost.

The question for enterprise software development isn't "Will this make this thing better?", but "Is making this change more valuable than the other things on the priority list?"


Sure, but I never suggested front-end developers should build new tech every time they want a new feature. But there are solutions to do things better now, so it seems odd to ignore that. It's a bit like saying React/Angular/Vue/any other frontend framework won't make things tangibly better, so why bother. Or optimising page size won't make things better, so why bother. Google actively invests in telling everyone else to optimize their websites and even prioritizes fast sites over slow sites (https://developers.google.com/speed). Why shouldn't they care about their own performance too?

This isn't just a frontend optimization that can be applied in isolation by a good dev. With WS or SSE there now needs to be a server written & supported to push this data over, on top of all the networking issues that come with these streaming protocols.

> Why shouldn't they care about their performance too?

They definitely do this for everything they push out that runs on billions of devices for people all over the world. How many people use Google to keep track of the live score of "the Laver Cup tennis championship in Australia?" I'd be surprised if that's more than 100. But Google has these numbers, and based on that they've chosen not to optimize this.


There's also the issue of optimizing for sports fans while pessimizing for the general population.

Say that they do this "right" and include a real-time websockets client that has all the proper reconnect logic, uses a minified protocol, has a proper real-time stack on the backend with all the appropriate DDoS protection, encryption, CORS, etc. All the client logic is extra bytes that need to be shipped on every search request. Or they could split the bundle and lazy-load it, which is still extra bytes, but fewer of them, and adds complexity. The server code needs to be audited for security; one compromised frontend server and the damage to Google is already more than the entire value of the feature. One black-swan websearch outage (caused, say, by a misconfiguration leading to the load balancers getting slammed by websocket connections, or a misconfiguration causing the DDoS servers to think websearch is actually undergoing a DDoS when it's just normal websocket traffic) is already more than the entire value of the feature. The bar for simplicity and reliability has to be pretty high for a feature that's going to be used by 0.0001% of users, because if there are any negative effects at all on mainstream websearch traffic, that feature should never have been launched.


I agree complexity should be avoided. But I don't really follow the logic that because any feature could break security, reliability, etc., you should never change anything or make progress. Google is doubling down on these features, as you can see at https://www.seroundtable.com/google-pin-live-scores-28261.ht.... If they're continuing to invest in adding more realtime features, and penalising sites in search results that don't optimise their websites https://www.sitecenter.com/insights/141-google-introduces-pe..., then I think they should apply that same logic to their own site IMHO.

That's probably why Google is trying to improve network protocols (QUIC, HTTP/3) rather than relying on the one-off optimizations described in the article.

How does QUIC or HTTP/3 change things? Polling is polling, regardless of whether it's over HTTP/1, HTTP/2 or HTTP/3 (QUIC).

It still improves overall network efficiency by more than enough to negate the 0.001% of inefficient cases that use polling. I would say this is the right prioritization of optimization engineering headcount, which is pretty scarce.

This feature is not for the Laver Cup, it's for all sports events and scores they have access to, and has been so since 2016 - http://www.thesempost.com/google-adds-live-sports-scores-sea...

Speaking of optimization, keep in mind this shows up at the top of the results page. Any of the "more optimal" streaming solutions would likely require more JS on the page, and thus make the page as a whole heavier and load slower, which is actually worse for Google as a whole. Looking at just this feature in isolation misses the real-world problems immediately surrounding it.

Those solutions are not dead-simple at scale: they consume resources (such as file handles) on the server and they have to be supported by middle-boxes (such as proxies and reverse-proxies).

Google gets to see the actual error rates when these solutions are deployed in production. If you want traffic to get through to the long tail of end-users, then it has to look like what network administrators expect. The end result: QUIC is encapsulated in UDP, DNS-over-HTTPS, and MPEG-DASH live streaming (making streaming video look like HTTP, and making a new request every few seconds).

If you are not at Google scale, please, yes, use the newer, simpler technology.


> MPEG-DASH live streaming

IMO that was more dictated by server-side complexity. A multimedia streaming server, especially for live video, that is optimized end-to-end for robustness and efficiency can be a very complex beast. Streaming transcoding, metadata injection, buffering, etc are difficult on their own. Doing them together in a unified fashion is especially difficult. Solutions like GStreamer that abstract this away and make it conceptually simple are also extremely inefficient at scale. (Take it from someone who wrote an entire end-to-end solution that could do M:N codec, format, and transport transcoding, per-user streaming ad injection, and a bunch of other stuff. I could do 5k+ live audio streams, each with unique ad injection/splicing, with most of the pipeline occurring on a single E3 Xeon core--it was the socket writes to 5,000 clients from a single thread, and the resulting ethernet interrupts, that caused that CPU to max out. Reusing solutions like GStreamer or FFmpeg directly in your pipeline, you'd be extremely lucky to get 1/10 of that, and more likely 1/100 at best.)

MPEG-DASH permits the incoming media processor to simply dump a bunch of files to disk. The client facing service can then serve those up however it wishes--it can be completely dumb or incredibly smart, but importantly both halves can evolve on their own. You could do codec and rate transcoding as a third, middle component. And you could break these components down even further, as "everything is a file" is the most basic IPC interface we have. This makes iterative development much easier.

IME, MPEG-DASH is pretty much what you'd get if you began with stodgy old (relatively) commercial broadcast equipment using nasty broadcast framing standards and organically built a streaming pipeline from scratch, without ever having the benefit/burden of knowing how to do performant streaming server development, or even how existing solutions worked. The multiple-files nonsense was never necessary for client compatibility or network filter bypassing. Apple had solved that with RTSP-over-HTTP tunneling years prior, as did Flash with its clunkier RTMP solutions later.


The article considers efficiency from the client side, but it doesn't really consider scaling effects on the server side. Simple polling requests coming in are easy to load-balance, either randomly or by a round-robin scheme. It's resilient, because if one request fails, the next will likely succeed and the user won't even notice.

On the other hand, the complexity of trying to manage millions of open web socket connections filled with state would make things difficult. You would have an uneven distribution of load across servers and the challenge of knowing which connections to close in the lb; probably the best way to keep the user experience smooth would be to have the client assume failure rather than delay, aggressively disconnecting and establishing a new web socket anyway.


Definitely, 90s tech sometimes is the right answer. A plain HTTP request involves far less back and forth between the server and the client than HTTPS, and websockets (I've used them) don't scale well into super high concurrent connection counts (like more than 1 million) -- would love to know if there is an implementation that does? Or if it's just implemented as lots of parallel servers each running ~100k websockets?

Agreed that websockets would be overkill and probably less reliable, but why not Server-Sent Events[0]? They are quite simple and should get the job done. For unidirectional information I find that they're a pretty decent technology, and one which is surprisingly often overlooked.

0: https://www.w3.org/TR/eventsource/
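
For illustration, a minimal sketch of the client side (the /scores/stream endpoint name and the event payload are made up):

    // Minimal sketch assuming a hypothetical /scores/stream endpoint that
    // emits "score" events. EventSource reconnects automatically, which is a
    // big part of SSE's appeal for one-way updates like this.
    const source = new EventSource("/scores/stream");

    source.addEventListener("score", (event) => {
      // Each event carries a small JSON payload, e.g. {"match":"...","score":"6-4 3-2"}
      const update = JSON.parse((event as MessageEvent).data);
      document.title = `${update.match}: ${update.score}`; // stand-in for real rendering
    });

    source.onerror = () => {
      // The browser retries on its own; nothing to do here beyond noting it.
      console.warn("score stream interrupted, browser will reconnect");
    };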


Browser support may be holding SSE back. [1] It will be nice once Edge (as rebranded Chromium) supports it.

[1] https://caniuse.com/#search=server%20side%20events


Those still tie up request handlers and connections on the backend side, whereas short request/response cycles do not.

Those things can in some cases operationally introduce a lot more pain (e.g. in terms of resource exhaustion) than the additional short-lived requests.


I would guess it's because they likely weren't supported yet by Google infrastructure at the time of implementation. Or that it was much simpler to just add a pollable endpoint.

Yup, the article covers SSE, long polling and XHR streaming as all vastly superior solutions that would have improved things. So yes :+1:

fooblitzky it's interesting you think that handling millions of open websocket connections with state is needed. Firstly, all of these browsers already have connections to Google's servers. HTTP keeps a connection pool open, so connections are kept open regardless of the transport. Given that, why not use long polling? No state is needed in that situation. And if you're doing that, what's wrong with one additional step: stream updates over an XHR connection, which is also effectively stateless given that any request can fail, reconnect, and resubscribe to the stream of updates.

I appreciate streaming is harder than polling. But IMHO, if that's Google's thinking, that feels a little defeatist for a company that has designed and built complex systems like Millwheel (https://ai.google/research/pubs/pub41378) to solve these types of problems.
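
To make the comparison concrete, here's roughly what a stateless long-polling loop can look like on the client (the /scores/poll endpoint and its `since` cursor are made up, not anything Google exposes):

    // Rough long-polling loop, sketch only. The server holds each request open
    // until new data exists (or a timeout expires), then the client re-asks.
    async function pollScores(since = 0): Promise<void> {
      for (;;) {
        try {
          const res = await fetch(`/scores/poll?since=${since}`);
          if (res.status === 200) {
            const { events, cursor } = await res.json();
            for (const e of events) console.log("score update", e);
            since = cursor; // resume from the last event we saw
          }
          // A 204 just means the hold timed out with nothing new; loop and re-ask.
        } catch {
          await new Promise((r) => setTimeout(r, 2000)); // back off on network errors
        }
      }
    }

    pollScores();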


You are still describing a situation where a client connects directly to a server.

At Google, those client connections are probably kept open at a load balancer. The load balancer has its own connections to servers on the other side, and those will be as short-lived as possible, so that the server is available to service the next request the lb sends its way.

If you start talking about server-side push technologies, you are keeping a long-lived connection open from the server, through the load balancer, to the client. That will affect the dynamics of the load balancing - it's not as simple as round-robin or random distribution that short-lived requests allow. Consider, for example, a server that becomes overloaded. With short-lived requests you can just add more hosts to the LB pool. With long-lived requests you might need to start thinking about migrating connections, or forcefully terminating some connections if the server hits some kind of load threshold, as well as adding hosts.

It's not impossible, it just makes it more complicated and less reliable, which is not what you want for something that just updates a score. As others have pointed out, it's probably just fetching a static file that gets updated periodically.


It's actually worse than that: IIRC (this is 10 years old), traffic to Google passes through a custom DNS resolver that geolocates the lowest-latency active datacenter; a load balancer; 2 levels of reverse proxies; DDoS protection; a bastion server that lets SRE quickly kill non-critical traffic if it threatens the stability of websearch; the webserver; numerous application servers (for core websearch this could be tens of thousands per request, but for a feature like this it's probably just one); and the storage layer. With various push technologies, all of these need to be made aware of the need for a server push, and modified to handle potentially long-running connections that receive events at the data-source's command rather than at the user's request. With polling, you write the new data into the data source once and then the infrastructure just picks it up periodically.

Websockets etc. are great technologies if you can handle all traffic on a single box; I use them extensively for my startup. They are a terrible technology at scale. If you want to run websockets at scale, I would actually recommend pulling all the real-time features into a side-channel that's updated via RPC when a relevant event happens and listens directly on the Internet, and then making that as lightweight as possible so that it can scale vertically and as non-critical as possible so that users don't care if it goes down momentarily. If your users don't really care about up-to-the-second information (as with sports scores), just don't do it, and use polling instead.


Loading this article involves my browser making 141 separate requests, downloading a total of 6.9MB of data. Based on the numbers given in the article, that's more than eight hours' worth of tennis scores.

He he, that is true, and not something we're proud of. We're continuing to optimize things where we can with the resources we have. Given our size, we're spending our engineering effort on our product, where we can bring our optimization work on streaming to our customers. Sadly, as a result, our blog has plenty of room for improvement. As we grow, I hope the existing optimization tasks in our backlog get prioritized.

I appreciate this may come across as hypocritical, and you're more than welcome to think that. But I don't think it changes the analysis in my article: Google is telling everyone else to optimize their sites because it makes a better web, or you'll be penalized (https://www.sitecenter.com/insights/141-google-introduces-pe...), while on the other hand they have over 100k staff and haven't optimized their own results.


68KiB (worst case) per 5 minutes kinda makes it sound like Google did optimize their site. They could optimize it further still, certainly, but there doesn't seem to be any real hypocrisy here as you are claiming. The absolute number here is still very firmly in the "very small" category.

I can't decide how I feel about this. On the one hand, they are totally right that polling is an inelegant solution. In the context of a large-scale website, perhaps it is even harmful to a degree that we should take special note of it. On the other hand, however, the amount of time that was spent on trashing this work is sad to me. Polling is not merely a quick and dirty solution, it is a high-reliability solution where other techniques may fail. I have used many proxies that don't handle WebSockets properly, for example. It is not simple to maintain long-lived streams or connections across all networks. So there are practical upsides to polling that should not be ignored. I can appreciate that this post goes into details explaining why polling is bad, but perhaps it would have come across better as a tweet or something, rather than a full blog post with fancy charts showing how much a particular project sucks.

Exactly - poll `/status.json?at=${Date.now() - Date.now() % 5000}` and let the CDN worry about infinitely distributing state.

(and make sure the CDN isn't caching 404s triggered by clients who are already in the future)
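
Roughly, something like this on the client (status.json and the 5-second bucket are just the example above, not a real endpoint):

    // Sketch of the bucketed-URL idea: every client in the same 5-second
    // window requests the same URL, so a CDN can answer all of them from one
    // cached object.
    const BUCKET_MS = 5000;

    async function fetchCurrentBucket(): Promise<void> {
      const bucket = Date.now() - (Date.now() % BUCKET_MS); // round down to the window start
      const res = await fetch(`/status.json?at=${bucket}`);
      if (res.ok) {
        console.log("score", await res.json());
      }
      // A 404 here usually means the bucket hasn't been written yet (the
      // "client already in the future" case) -- the CDN must not cache that.
    }

    setInterval(fetchCurrentBucket, BUCKET_MS);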


Aside from the points made about the technical solution, it's worth pointing out that really no one sits watching the Google results page waiting for the scores to update. 95% of people are just going to check the page for the score and leave. If they want more up-to-date scores they will watch an actual game tracker on ESPN or the like. I think this blog post really misses the business use case.

Wow, this isn't high enough and sadly it didn't even occur to me until I read your comment. I'm not huge on the sports-ball but when I do need a score I google the game, see the score, and close the tab or sleep my phone. I might look at it later but even that is not a common use-case for me.

Yes, there's nothing to see here except the score.

I haven't tracked tennis before, but for example ESPN football has a non-video, non-audio mode with event history plus an up-to-date field diagram of each play, and I think even a social media or commenter text stream too. You could actually watch the game for a while this way.

I can't imagine looking at this page for longer than a couple minutes.


The author makes an unvalidated assumption that the users of this service care MOST about bandwidth efficiency.

I suspect that users care more about a service that works reliably. An extra 50-100k every 5 minutes (assuming the user keeps their mobile browser open during this time) does not seem like it would be problematic.

Ironically, the alternatives he proposes make the service LESS reliable, since many users may be behind firewalls that block WebSockets, HTTP streaming, etc.

HTTP polling works for a larger percentage of users and can be scaled horizontally more easily than these other methods during high-volume spikes like the World Cup.

In short, I think Google made the right tradeoff between dumb, boring, accessible vs. clever, complex. Especially for a product that probably doesn’t meet the threshold for investing in a more sophisticated architecture.


> firewalls that block WebSockets

How does that work? How can a firewall tell that a TLS connection is a WebSocket and not just an HTTP session with a server experiencing high load?

Added: just saw this gem[0] from a few days ago, does anyone have some idea why McAfee suggests to their customers that WebSockets are a potential security risk on a web client network? I have literally no idea how they would be more risky than ordinary HTTP...

[0]: https://kc.mcafee.com/corporate/index?page=content&id=KB8405...


If the firewall has access to the traffic stream — say, a local “security” tool or a corporately managed environment — it can block the Upgrade header which attempts to turn the HTTP connection into a WebSocket. That's the kind of thing which doesn't affect a huge percentage of users, but at Google's scale it's still a large number of people.

Re:McAfee, I wouldn't agree with the logic but I've heard people worry about data being tunneled out through new protocols which are harder to filter or used to establish some kind of a persistent control channel. In almost all cases this has high impact with little benefit unless you're filtering all other traffic strictly enough that malware can't use other common circumvention techniques.


Based on our experience, it is not about actual risk, but perceived risk. As for how they are blocked, it’s done by an intercepting proxy that doesn’t allow encrypted connections through it.

Agreed, and what firewalls block long polling, which is in itself significantly better?

I built an app for a major sports company in the UK several years ago now. The choice was actually the most cost-effective. Most providers of the data, usually collated by people using an Xbox controller and relay device in the stadiums, are uploading static files to storage. We used a thin node.js layer to stitch these from a connected volume and expose them via an endpoint which would specify a time range. If a user never requested the results for, say, the 36th minute of a game, then our solution wouldn't have to do any work. Once we did process a request for a game at a given time range, all other users would then receive the same cached result. It was very cost-effective when you compare it with millions of users requiring websockets and with processing all the events regardless of whether someone is watching those results.
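
Something in that spirit, heavily simplified (the file layout, paths and query parameters are invented for illustration):

    // Sketch: serve pre-collated per-minute files from disk and cache each
    // stitched time range so later users get the cached result.
    import { promises as fs } from "fs";
    import * as http from "http";

    const cache = new Map<string, string>(); // "gameId:from:to" -> stitched JSON

    async function eventsForRange(gameId: string, from: number, to: number): Promise<string> {
      const key = `${gameId}:${from}:${to}`;
      const hit = cache.get(key);
      if (hit) return hit; // the work is done once per range, then shared

      const minutes: unknown[] = [];
      for (let m = from; m <= to; m++) {
        try {
          // One static file per game-minute, uploaded by the data provider.
          minutes.push(JSON.parse(await fs.readFile(`/data/${gameId}/${m}.json`, "utf8")));
        } catch {
          // Minute not uploaded yet (or no events) -- skip it.
        }
      }
      const body = JSON.stringify(minutes);
      cache.set(key, body);
      return body;
    }

    http.createServer(async (req, res) => {
      const u = new URL(req.url ?? "/", "http://localhost");
      const body = await eventsForRange(
        u.searchParams.get("game") ?? "",
        Number(u.searchParams.get("from")),
        Number(u.searchParams.get("to")),
      );
      res.writeHead(200, { "Content-Type": "application/json" });
      res.end(body);
    }).listen(8080);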

Thank you for your perspective. This is a perfect example of the difference between theory and practice.

This did a great job convincing me that Ably's tech is more complicated, and a terrible job convincing me that the UX improvement is worth the implementation cost.

The 68KiB per 5 minutes is trivial for most users and the 10 second latency for a sports score usually is too. Measure fixing these problems against the opportunity cost of not getting something else done. This reads as more of a tradeoff than a horror story.

Moreover it might not just be developer time vs. latency that's being traded off.

Maintaining a stateful websocket connection on the server side isn't cost-free, and that connection would be idle nearly 100% of the time. The bandwidth consumed via Google's polling solution might well be cheaper than open socket file descriptors of a websockets solution.


Moreover, if they are concerned about bandwidth they could definitely transmit this data with less than the ~2.2KB/request it's taking. Fixed-width integers, anyone? No parsing overhead, even. And that's without getting fancy. Maybe some flags to indicate a full refresh is needed if e.g. a match has ended. 2.2KB must include markup or something.
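
As a toy illustration of how small that could get (the byte layout here is invented, not any real wire format):

    // A tennis scoreline fits in a handful of bytes rather than ~2KB of markup.
    function packScore(setsA: number[], setsB: number[], pointsA: number, pointsB: number): Uint8Array {
      const buf = new Uint8Array(2 + setsA.length * 2);
      buf[0] = pointsA;            // current-game points as an index (0/15/30/40 -> 0..3)
      buf[1] = pointsB;
      setsA.forEach((games, i) => {
        buf[2 + i * 2] = games;    // games won by player A in set i
        buf[3 + i * 2] = setsB[i]; // games won by player B in set i
      });
      return buf;                  // 2 bytes + 2 per set: 8 bytes for a 3-set match
    }

    // packScore([6, 3, 4], [4, 6, 2], 2, 1).byteLength === 8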

> more of a tradeoff

You make it sound like it would have taken months of investment to just use one of the many available better options - at worst, it might have taken a week for somebody to learn web sockets if they actually didn’t have anybody who knew it (and even then, that’s a slow learner). We should be delivering the most efficient options and always weighing the alternatives - we’re professionals, maybe some day we’ll actually start behaving that way. From the top-level responses to this article, today is not that day.


Websockets imply persistent connections between servers and clients. That long-term state tightly couples the two: deploying a change to servers now requires draining all running connections, for example. Load balancing is much harder. Throttling client behavior is much harder, so you’re not as insulated from bad client behavior or heavy hitters. Consequences of that decision ripple through the entire architecture.

I really like the polling approach used here. It’s simple, easy to reason about, and loosely coupled. It will be reliable and resilient.

Saying it might have taken a week to learn websockets completely misses the point. I’ve built large architectures on persistent connections and deeply regretted it.


Exactly! Keeping all of those stateful connections open is a much bigger issue than loadbalancing simple HTTP polls (that you could cache super aggressively). I think this is what this blog post really misses and shows that they have not much of a clue how to operate things at a worldwide scale.

Seems low tech, but it just works for every user.

Exactly. It's "good enough" engineering. HTTP polling is guaranteed to work for every client, all of the time.

While the other solutions mentioned offer improved performance with regard to latency and bandwidth, neither of those are going to noticeably improve the user experience. And having to deal with stuff like bloated libraries, broken websocket connections or browser incompatibility mean the proposed alternative solutions have a very real possibility of degrading the user experience.

I get it, the author sells real-time solutions, but in this very specific use case, I think HTTP polling was actually the correct choice.


This. Especially because their main service is search and not live scores. I would expect that these are shown quite rarely.

Pair that with demand forecasting for this relatively obscure feature, and they probably made the call that simple polling was good enough for now.

That's the really important thing - we use websockets pretty heavily here, and users can have bizarre issues with them. With this, it does just work. It's not the best or the shiniest, but it's the easiest and the most compatible.

Sure, but long polling and XHR streaming work for all browsers too. The article was not saying you should use Websockets, it was meant to say HTTP polling is the least efficient way possible to do this :)

Sounds like an excuse. If I can find a way to make something like this gracefully degrade, I'm sure Google also could...

You might say it's over-engineering, but in aggregate terms this could make a big difference in power and network usage. I'm talking more about the users (usually on mobile). So I would call it just... engineering.


How long does the average user stare at the scores on the search result page? It doesn't seem like something people would leave open. Over a short period of time, say 30 seconds, the difference in bandwidth & user experience is inconsequential between the different methods.

Engineering also includes scoping resources for maximum impact, after all. Building a fancy, ultra-efficient live score system would be fun, but it's not going to be good engineering if the usage metrics don't justify it.


Sure, but that was not the point of the article. It works, but badly, and even long polling would have been a significant improvement and work for every user.

I'm surprised that this article didn't consider HTTP/2 or QUIC at all. In either case, the disadvantages of polling are dramatically decreased (persistent connection or lack of need for one, header compression, etc.). Having long-lived connections with web sockets is hard to do at scale (pinning a user to a single server, pushing caching down a layer or two in the stack), and when you're Google the back-end efficiency of stateless requests is hardly a concern.

If you're on a browser that doesn't support a new version of HTTP, you're probably also not able to use web sockets. Supporting two dedicated transports (one for fallback, one for marginal efficiency gains over the fallback) for a small feature like this seems crazy.


Something else to consider (and something we've had to deal with at work): Not all company firewalls support websockets, long polling, and other newer technologies. So, if you want your product to work as many places as possible (without working with every potential user independently), you really do need to use tech from the 90's.

falcolas I am not aware of any single firewall that blocks long polling, or XHR streaming (at least for a fixed period). Can you substantiate that? We regularly check our transports against legacy devices at Ably, so I don't think this is true. I stand to be corrected of course :)

There are a lot of feature requests that just aren't worth implementing in a complex way.

You can set up a simple GET request with some caching on the server and a simple poll on the client in minutes to a couple hours and move on. Especially when real-time accuracy doesn't matter.


Looks like Ably has their own page on long-polling and why sometimes you can't use that or web-sockets. A few cherry-picked sentences (though it actually was a good read, do recommend the whole article if you have the time).

https://www.ably.io/concepts/long-polling

> Reliable message ordering can be an issue with long polling because it is possible for multiple HTTP requests from the same client to be in flight simultaneously.

> Another issue is that a server may send a response, but network or browser issues may prevent the message from being successfully received. Unless some sort of message receipt confirmation process is implemented, a subsequent call to the server may result in missed messages.

> Depending on the server implementation, confirmation of message receipt by one client instance may also cause another client instance to never receive an expected message at all, as the server could mistakenly believe that the client has already received the data it is expecting.

> Unfortunately, such complexity is difficult to scale effectively. To maintain the session state for a given client, that state must either be sharable among all servers behind a load balancer – a task with significant architectural complexity – or subsequent client requests within the same session must be routed to the same server to which their original request was processed.

> This can also become a potential denial-of-service attack vector – a problem which then requires further layers of infrastructure to mitigate that might otherwise have been unnecessary.

> That said, there are cases where proxies and routers on certain networks will block WebSocket and WebRTC connections, or where network connectivity can make long-lived connection protocols such as these less practical. Besides, for certain client demographics, there may still be numerous devices and clients in use that lack support for newer standards. For these, long polling can serve as a good fail-safe fallback to ensure support for everyone, irrespective of their situation.

And this one is the most important response to the OP article, in my opinion.

> That said, given the time and effort – not to mention the inefficiency of resource consumption – involved in implementing these approaches, care should be taken to assess whether their support is worth the added cost when developing new applications and system architectures.


I miss 90s tech. I feel like building web applications was a lot more enjoyable with 90s tech that really met the needs of users.

Polled responses will be cached away at Google's Edge Cache. It's dumb on the client, but this scales practically without limit.

Let's look at some strategies for updating content on a page...

#1: Give users a link they can click that says "Click here to refresh"

#2: Implement <meta http-equiv="refresh">. Usability win: direct user interaction not required

#3: Implement http polling. Usability win: the page doesn't flicker and disrupt the user, since no page refresh is necessary

#4: Implement http long polling. Usability win: ???

#5: Implement SSE. Usability win: ???

#6: Implement Websockets. Usability win: ???

TLDR: The user doesn't care if you use http polling or websockets.

New + complex is no better (and often worse) than old, boring and simple.


The usability win from 3 to the other solutions is: users don't have to wait until the next polling interval has expired in order to get an update. In interactive applications users don't expect delays, so you should display updates within around 100ms. Achieving that with polling might not be feasible for some applications.

Let's hope the author never finds out about HTTP Live Streaming. Now there's some very effective abuse of HTTP, even though we thought we had much better protocols already.

Which ones?

My thoughts were that HLS is actually quite a nice idea, once we remove the "live" from its name. There are certainly things that do a lot better at live broadcast. But the ability to distribute it over plain HTTP at massive scale with zero changes to existing webservers makes it a quite compelling solution. We can't use real realtime streaming protocols from CDNs.


Ironically the ably website won't load for me right now so I can't even read this blog post about website scalability and efficiency.

> design choices are surprisingly bad in terms of bandwidth demand, energy consumption

Google Search supports old browsers to an incredibly painful degree. I believe IE6 still works.

This effectively discourages engineers from exploring new approaches, because they always have lots of edge cases that just aren't worth solving.


Some technical point in addition to the other arguments: even with WS or SSE you'd have to send keep-alive packets periodically so you can find out whether the connection has died. Those would be smaller than the polling payloads, and the interval is no longer bound to your refresh rate - but still more than nothing.

OTOH: Polling is stateless and easy to scale. It is not the most efficient, but presumably it can be improved versus the worst case by exploiting browser caching effectively, and via connection re-use.

Maybe the original developer left the company and it wasn't worth the effort to introduce more state on the server-side via the other methods? I can imagine a scenario where if an open TCP connection isn't generating revenue (via ads or whatever), they don't want to allocate any resources for it on the server-side.

But yeah, they could have improved things by diffing the data or even leveraging the If-Modified-Since HTTP header. Seems lazy.
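
For example, a conditional-polling sketch along those lines (the /scores.json path is made up):

    // Keep the 10-second poll, but let the server answer 304 with no body when
    // nothing has changed. cache: "no-store" keeps the browser's own HTTP cache
    // from hiding the 304 from the script.
    let lastModified: string | null = null;

    async function pollIfChanged(): Promise<void> {
      const headers: Record<string, string> = {};
      if (lastModified) headers["If-Modified-Since"] = lastModified;

      const res = await fetch("/scores.json", { headers, cache: "no-store" });
      if (res.status === 304) return; // unchanged: no payload transferred
      if (res.ok) {
        lastModified = res.headers.get("Last-Modified");
        console.log("updated scores", await res.json());
      }
    }

    setInterval(pollIfChanged, 10_000);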


This is google.com. Supporting IE-whatever actually means a large number of users for them. Websockets are simply not a thing there.

Google's search results are almost dead. It doesn't give you effective results like it used to in the past.

Features like this are presumably what antitrust investigators will scrutinize. It's Google doing everything in its power to keep users within the Google empire at the expense of potential competitors, of which there are precious few at this point.


