
Long Polling – Concepts and Considerations - srushtika
https://www.ably.io/concepts/long-polling
======
majke
The title brings nostalgia!

For me it all started when Michael Carter worked on Orbited, years ago (was it
2008?). Michael made a number of important "discoveries", like the famous
document.domain trick:

[https://developer.mozilla.org/en-US/docs/Web/Security/Same-o...](https://developer.mozilla.org/en-US/docs/Web/Security/Same-origin_policy#Changing_origin)

Then I played with libevent, trying to create a framework for comet
applications. Then redis came along, perfect for backend and Websockets in
browsers.

Then of course node.js became a popular engine for messaging applications.

In the meantime we worked on SockJS, to make the transition from comet to
websockets smoother. Oh, all those tricks employed in SockJS...

[https://github.com/sockjs/sockjs-client#supported-transports...](https://github.com/sockjs/sockjs-client#supported-transports-by-browser-html-served-from-http-or-https)

There is still a place for things like Opera's EventSource, but it seems
WebSockets over TLS are now the best, most stable, and most portable way of
doing messaging on the web.

------
paulddraper
Comet is really an underrated technology.

Because it's not really a separate technology: it has worked in browsers for
years, and it's compatible with any HTTP server toolkit/framework that lets
you stream a response (so basically all of them).

That said... who would use Comet for long polling? You'd use Comet to avoid
the difficulties of long polling.

~~~
tome
How is Comet different from long polling? It's not clear to me. The article
only mentions Comet in the context of a Python server, and the Wikipedia page
isn't much help:

[https://en.wikipedia.org/wiki/Comet_(programming)](https://en.wikipedia.org/wiki/Comet_\(programming\))

~~~
paulddraper
It appears my usage is a little different from Wikipedia's. Apparently "Comet"
can mean multiple things. I meant "_streaming_ Comet", which is in my
experience what most people mean.

Long polling: (1) Request (2) Wait (3) Receive an event. (4) Go back to #1.

Streaming Comet: (1) Request (2) Wait (3) Receive an event (4) Go back to #2.

Rather than getting a raw response from the server in a request, streaming
Comet loads the response in an iframe. The response is streaming HTML with a
series of <script> tags that post events to the parent window.

It's arguably easier to manage and scale (no missed events, no need to find
the same server again), though it certainly looks a little odd.
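For concreteness, the two loop shapes above can be sketched in JavaScript with the transport abstracted away (the `request` and `stream` functions here are stand-ins for an HTTP call and a long-lived response, not a real API):

```javascript
// Long polling: every event costs a full request/response round trip.
// `request` resolves once the server has an event, or with null on timeout.
async function longPoll(request, onEvent, rounds) {
  for (let i = 0; i < rounds; i++) {
    const event = await request();      // (1) request, (2) wait
    if (event !== null) onEvent(event); // (3) receive an event
  }                                     // (4) go back to (1)
}

// Streaming Comet: one request, many events. `stream` yields events
// as they arrive on the single long-lived response.
async function streamingComet(stream, onEvent) {
  for await (const event of stream()) { // (1) request once
    onEvent(event);                     // (2)-(4) wait, receive, repeat
  }
}
```

In the iframe variant, the long-lived response is HTML whose `<script>` tags call back into the parent window; the async iterator here just models the same "one connection, many events" shape.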

------
ben509
This is pretty interesting as an examination of long polling in the browser as
an alternative to websockets. Most of my experience was with it on the
backend, e.g. SQS[1], and I hadn't thought much about it beyond "sit in a loop
and check for tasks on a queue." One of the advantages of that approach, I
think, is that the queue service is dedicated to just that task, so the
complexity on that server is all in one place.

It would be great if they went into some of the additional architecture
they've had to implement to, e.g., prevent denial of service.

[1]: [https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQS...](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-long-polling.html)
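That backend loop can be sketched like this (a minimal illustration; `receive` stands in for an SQS ReceiveMessage call made with `WaitTimeSeconds` set, so the server holds the request open, and the message-deletion step is elided):

```javascript
// Worker in the "sit in a loop and check for tasks on a queue" style,
// using SQS-style long polling. `receive` resolves with a batch of
// messages, or an empty array when the server-side wait times out.
async function pollQueue(receive, handle, maxRounds) {
  for (let round = 0; round < maxRounds; round++) {
    const batch = await receive(); // blocks server-side up to WaitTimeSeconds
    for (const msg of batch) {
      await handle(msg); // real code would then delete the message from the queue
    }
  }
}
```

The long poll keeps the loop cheap: an empty queue costs one held-open request per `WaitTimeSeconds` window instead of a tight busy-wait of empty responses.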

------
jordache
These days, is there an advantage to the long-polling pattern vs. WebSockets?

~~~
napsterbr
Just the fact that it's stateless and therefore may result in a simpler
application and infrastructure. But I believe that, in this case, the pros of
WebSockets outweigh the cons.

That said, I've been working with a middle ground: HTTP/2 + SSE. It gives the
benefit of simpler infrastructure while not being as inefficient as
short/long polling.

ETA: after using WebSockets for a while, one of the major downsides is that
you can't use most of the HTTP ecosystem: stuff like Cloudflare Workers,
Cloudflare anti-DDoS[0], Swagger API docs, caching, etc. I recently switched
to HTTP/2 + SSE and do not regret it.

[0] - they do support WebSockets, but at a limited and undisclosed rate (i.e.
you don't know how many is too many until they ask you to upgrade your plan).

------
derek_frome
PubNub is probably the most experienced company in this area. Would be
interesting to see them add their experience here.

~~~
stephenblum
Hi Derek. Thank you for mentioning us here. The age-old conversation on
transport protocols is evergreen. The goal of these protocols is to allow data
transmission from one device to other devices. It is great to be on the
leading edge of technology, continuing the innovative optimizations of the use
of the internet. As the internet scales, more devices need connectivity and
each byte counts more and more. There are 20 billion connected internet
devices today (2018). That's more than 2x the number of humans, and the
device-to-human ratio continues to grow. We need efficient and effective
methods for coordinating information between devices.

The various methods to coordinate information between devices on the internet
should be looked at objectively and mechanically. At the end of the day,
modern reliable messaging protocols are built on IP, the basis of our
internet. The protocols layered on top of IP are not equal: many methods are
not compatible with the various configurations of networks, and the bytes and
bandwidth required differ between each production-ready method.

* HTTP/1.1 - 100% Compatibility

* HTTP/2.0 - 100% Compatibility with client initiated connectivity and backward compatibility with HTTP/1

* MQTT - near-full internet wide compatibility ( routing policy / network topology )

* WS - near-full internet wide compatibility ( routing policy / network topology )

Each message received using these mechanisms requires TCP ACKs. The promise of
MQTT and WS leads you to believe that data streaming to your device over WS or
MQTT doesn't require ACKs; however, that is not how TCP works. When packets
are received, there is an associated timeout and retransmission when an ACK is
late or missing. Additionally, lightweight application-level traffic is
required to maintain connectivity between the two endpoints; otherwise LRU
caches and quotas are triggered, routes can be treated as stale, and the
connection dropped altogether. This is the underlying mechanism of these
protocols that is often left out of the discussions.

There is a clear winning approach in my mind. HTTP/2.0 includes, by default,
server-initiated data push. The TLS and header compression required by the
spec allow for a secure yet efficient streaming solution. With HTTP/2.0, TCP
socket limits are less of a concern, as the client only needs one TCP socket
to subscribe to an unlimited number of data feeds. HTTP/1 requires the client
to maintain separate sockets for each independent stream, as HTTP/1 enforces
head-of-line ordering. As something special for HTTP/1 clients, we've added
multiplexing by allowing multiple topic subscriptions and filter expressions
to be passed in a single HTTP call on the same socket. This isn't natively
built into HTTP/1 and is supported in all our SDKs.
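The multiplexing idea (many subscriptions sharing one socket) can be sketched as a simple dispatcher. The channel-tagged message shape below is invented for illustration and is not PubNub's actual wire format:

```javascript
// One connection, many subscriptions: each message read off the single
// shared socket arrives tagged with a channel name and is routed to the
// handlers registered for that channel. This models HTTP/2 streams, or
// a multi-channel subscribe packed into one HTTP/1 call.
class Demultiplexer {
  constructor() {
    this.handlers = new Map(); // channel name -> array of handlers
  }
  subscribe(channel, handler) {
    if (!this.handlers.has(channel)) this.handlers.set(channel, []);
    this.handlers.get(channel).push(handler);
  }
  // Call this for every message read off the shared socket; messages on
  // channels with no subscribers are silently dropped.
  dispatch({ channel, payload }) {
    for (const h of this.handlers.get(channel) ?? []) h(payload);
  }
}
```

With HTTP/2 the transport does this per-stream; over HTTP/1 it has to happen in the application layer, which is the gap the multi-channel subscribe described above fills.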

This is why we have chosen HTTP/2.0 as our next-gen transport protocol. We
have started by providing HTTP/2.0 connectivity at our edge for select
customers. As of 2018, PubNub holds the world record for the largest online
concurrent event in human history, using HTTP/2.0 for live data streams
during a globally celebrated sporting event.

You should be using HTTP/2.0 for your customers. Here is a Dockerfile that
makes it easy for you to start testing HTTP/2.0:
[https://github.com/stephenlb/http2-proxy](https://github.com/stephenlb/http2-proxy)

Stephen Blum ( @stephenlb ) CTO PubNub

