
ReverseHttp - apgwoz
http://www.reversehttp.net/
======
extension
There are two protocols described here. One is an extension to HTTP which
allows the client and server to swap roles while still using the same
connection. This would allow a server to "push" events to a client
asynchronously. This will never work in a browser.

The other protocol tunnels an HTTP connection over another HTTP connection in
the opposite direction. Tunneling asynchronous messages over HTTP is an old
technique which can be implemented in a browser.
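
The tunneling variant amounts to a long-poll loop: the application repeatedly asks the gateway for the next serialized request, handles it, and posts back a serialized response. A minimal sketch in Python -- the `/poll` and `/respond` endpoints and the plain-text framing here are my own assumptions for illustration, not part of either protocol:

```python
import http.client

def build_raw_response(status, reason, body):
    # Serialize a minimal HTTP/1.1 response so it can be tunneled
    # back to the gateway in the body of an ordinary POST.
    return ("HTTP/1.1 %d %s\r\n" % (status, reason) +
            "Content-Length: %d\r\n\r\n" % len(body) +
            body)

def serve_via_gateway(gateway, app_name, handler):
    # Long-poll the (hypothetical) gateway: each GET blocks until a
    # request arrives for `app_name`; the handler's reply is POSTed
    # straight back on the same connection.
    while True:
        conn = http.client.HTTPConnection(gateway)
        conn.request("GET", "/poll/" + app_name)
        raw_request = conn.getresponse().read().decode("iso-8859-1")
        conn.request("POST", "/respond/" + app_name,
                     body=handler(raw_request))
        conn.getresponse().read()
        conn.close()
```
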

Neither protocol enables any kind of novel functionality. They merely add
another layer of HTTP cruft.

~~~
tonyg
Well, the intent is to design a systematic way of setting up a relay for HTTP
requests from the public internet through a gateway to an application that
otherwise wouldn't be publicly addressable. Without a protocol like the one
I've drafted, setting up HTTP servers or CGI scripts stays ad-hoc, requiring
local access to the gateway server and DNS and firewall configuration.

~~~
extension
The problem could be solved much more generally with a protocol to request
socket level forwarding of arbitrary network services. This could be used
transparently to create a gateway for HTTP or any other protocol. Some
existing protocols come close to doing this (e.g. SSH) but I don't know of any
that handle public namespace allocation.

For the case of "works in a browser today" aka Comet, it is again better to
solve the more general problem of bidirectional tunneling over HTTP (e.g.
<http://xmpp.org/extensions/xep-0124.html>), through which you could make any
sort of connection, including a reversed HTTP connection or the gateway
request protocol above.

Imposing an extra HTTP layer and/or building on top of HTTP (in the first
case) needlessly complicates and significantly restricts both of these
protocols without deriving significant value from existing standards or
infrastructure.

~~~
tonyg
Check out <http://www.orbited.org/> for TCP-sockets in the browser (though I
don't think it lets you act as a server yet). As you point out, nothing yet
handles public namespace allocation, and this is key; for the specific case of
TCP servers in the browser, port contention could become an issue fairly
quickly, in which case lifting the level of abstraction to something like XMPP
or HTTP (as I've done), where the addressing model is more flexible than
TCP's, seems the sensible thing to do to avoid this.

XEP0124 ("BOSH") is very similar indeed to what I've defined; the differences
are (1) BOSH is XML-specific and (2) it only provides a tunnel between the
client (browser or not) and the server. What I've been experimenting with is
content neutral, and, crucially, not only specifies the tunnel, but also
specifies how the gateway server should expose the application at the end of
the tunnel to the rest of the world. That's something that I have not seen
before anywhere. (Except, as you mention, by SSH in a limited way.)

With regard to leveraging existing infrastructure: this is exactly why
restricting ourselves to carrying HTTP over the transport is a good idea. We
get to reuse _all_ existing infrastructure such as URLs, proxies, and of
course the ubiquitous HTTP client libraries. Raw TCP sockets, even if the
public namespace allocation issue were addressed, do not have a URL-like
notion, and caching proxies do not exist; further, raw TCP access is in many
environments not permitted or not available (e.g. within corporate firewalls,
or running within a browser). Using HTTP rather than TCP is a deliberate
choice to structure the network by providing not just a transport (packet-
based, at that!) but a notion of addressing and a content model. HTTP out-of-
the-box is a much richer protocol than TCP.

In conclusion, what I've proposed is in its transport aspect no more
complicated than XEP0124, and in its name-registration aspect AFAIK not
comparable to anything currently existing. The restriction to HTTP gives us an
addressing model already widely supported and understood, and lets us reuse
existing infrastructure and avoid needless reimplementation or reinvention.

~~~
extension
URLs could be used with a generalized protocol. The client would specify the
URL scheme, port and an arbitrary name and the server would generate and
return a URL, or an error if it doesn't support the requested scheme (servers
could support a very limited set of schemes and ports, perhaps just one). Raw
socket endpoints would use "tcp://host:port" and "udp://host:port". Servers
that provide raw sockets would probably want to create a subdomain for each
endpoint to avoid port contention. Since the server knows the URL scheme, it
can transparently do caching/filtering/mangling for particular protocols.
Making a request with the "http" scheme would be functionally equivalent to
your reverse HTTP.
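
As a rough sketch of the gateway side of such a protocol (the function, scheme list, and domain here are invented for illustration, not a spec):

```python
# Hypothetical gateway-side handler for the forwarding protocol
# sketched above: the client names a scheme and a label, and the
# gateway either allocates a public URL or rejects the scheme.
SUPPORTED_SCHEMES = ("http", "tcp", "udp")

def allocate_endpoint(scheme, name, domain="gateway.example"):
    if scheme not in SUPPORTED_SCHEMES:
        return {"error": "unsupported scheme: " + scheme}
    # One subdomain per endpoint sidesteps port contention for raw
    # sockets, as suggested above.
    return {"url": "%s://%s.%s/" % (scheme, name, domain)}
```

A server could support just one scheme and still speak the same protocol; it simply returns an error for the rest.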

This is just off the top of my head and there are surely better approaches but
the point is that it's quite doable and probably as simple or simpler than
something at the HTTP layer.

HTTP's "richness" is also what makes it a pain in the ass. It's a
megalomaniacal protocol designed for a very specific purpose and when you are
forced to use it for any other purpose, you have to carry a lot of baggage,
and the baggage is full of rocks.

This gateway service is nearly always going to be used to create some sort of
ad-hoc messaging endpoint, rather than a proper web server with web pages, so
why force tunneling over TWO layers of HTTP while precluding the use of any
existing wire-level protocols?

It's time we buried the "use HTTP for everything" meme. We already have an
everything protocol called TCP and if there's going to be a layer after that,
it will be a carefully designed, flexible messaging protocol like AMQP. HTTP
adds negative value as a general purpose transport and it's not even that
great for serving web sites.

~~~
tonyg
The generalized protocol/URL idea is a nice one. I'm not sure it'd be simpler,
but it'd certainly be useful.

It's interesting you mention AMQP; this project actually grew out of some of
the work I did on RabbitMQ.

~~~
extension
Ah, you work at LShift. A funny coincidence indeed.

I think the URL request protocol itself would be fairly simple but any use
case would be application specific so coming up with a general purpose
implementation might be tricky.

I'm just finishing off the Erlang book and for my first project, I was going
to either build a general purpose Comet server (improving on Orbited, Meteor,
cometd, etc) or flesh out the above protocol and implement it... or a
combination of the two. If you want to offer input or be involved: jedediah at
silencegreys dawt kom.

------
samueladam
The demo speaks for itself.

<http://www.reversehttp.net/demos/demo.html>

~~~
DLWormwood
Case breaks things; entering a sub-domain name with uppercase letters causes
the local server to fail. DNS names are supposed to be case-insensitive.

<http://www.rfc-archive.org/getrfc.php?rfc=4343>

~~~
apgwoz
Is that the fault of DNS, or of the actual server sitting behind it expecting
the Host header?

~~~
tonyg
It's the fault of the server reading the host header. I'll fix it now.

~~~
DLWormwood
Cool. It normalizes to lowercase, but it's otherwise functional, even if the
URL bar's case differs from the posted version.

------
TimothyFitz
This is a different proposal from the IETF draft by Donovan Preston, which is
known as "Reverse HTTP".

This is apparently "ReverseHTTP", and the specification looks about twenty
times longer, needlessly. I'd love to hear any good reasons why I should use
this "ReverseHTTP" instead of long-polling or Reverse HTTP, the IETF draft.

~~~
tonyg
It's very similar to Lentczner & Preston's I-D, yes. We both seem to have
independently invented the same general idea and chosen the same obvious name.
The differences are the use of a ReSTful protocol, and more elaboration of the
registration/name-management aspects. And of course that I haven't submitted
it as an I-D :-)

My current draft is far too long, I agree; it describes not only the use of
HTTP to retrieve requests (which is equivalent to Donovan Preston's idea), but
also the interactions and headers needed to manage the tunnelled service. The
latter is something that Lentczner & Preston haven't addressed yet, I think.

------
patrickdlogan
Not sure I buy the premise "Polling for updates is bad." Certainly reverse
http and/or web hooks do not cover, for example, all the same cases as
http/atom/atompub. I'd like to see people's guidance on when to consider one
or the other.

~~~
tonyg
Perhaps more accurately, polling for updates in an event-based network is
_suboptimal_ -- especially since we have all this lovely packet-switching
machinery available for use! -- but it's not completely wrong. Polling an RSS
feed is equivalent to (a shitty form of) queue replication, and (slightly less
closely) to TCP retransmissions, where event notification is equivalent to
message delivery and to TCP segment transfer. The two approaches are in a
sense dual. You can construct a message-streaming system from a state-
replication system, and you can construct a state-replication system from a
message-streaming system. Of course this still doesn't address when one or the
other should be used: for that you have to get into the different scenarios
for message replication. RSS/Atom etc are great when latency doesn't matter
and you are multicasting, or when recipients desire (relative) anonymity. The
cacheability of the polling approach can also be valuable.

Neither pull nor push solves the problem that SUP tries to address. For that,
a layer on top is required -- essentially an embedded message broker with
configurable private/shared queues and bindings. One very promising approach,
once the transport is sorted out (which is what ReverseHttp is trying for), is
to transplant the AMQP model (objects and operations) into the new setting.

------
tjogin
Neat, almost a bit magical. But what kind of _common_ web development problems
does this solve?

I'm sure there are a lot of contrived examples, but are there any _good_ ones?
Facebook and Twitter.

Ok, but are there any _good_ and _common_ ones?

~~~
apgwoz
Maybe there are no "common" uses just yet because we don't have an easily
implemented way to do it. When Ajax first "came out", many people said the
same thing: "Great, but what do we use it for?" Now, can you imagine the
present-day web without it?

~~~
tptacek
I remember almost exactly the opposite about AJAX (when the paper came out,
not when the XmlHttpRequest object was introduced) --- people went ape about
what they could use it for. AJAX style genuinely made new things possible.
This (supposedly) just makes them cleaner.

~~~
DomesticMouse
Are you questioning the relative merit of reverse http, or of the ability of
the server to update the page?

I'm failing to see the advantage of reverse http over long polls, even though
I am completely sold on the difference that having server push would make...

~~~
tptacek
The relative merit.

------
axod
In safari, the loading spinner never stops. Not a good thing.

I don't really see what this is trying to solve either.

~~~
apgwoz
I believe Comet works around this by using iframes, though I could be wrong.
Either way, ReverseHTTP is another way to not have to poll for resources,
which might be useful for instance in real time chat.

~~~
axod
The definitions are all fuzzy, but this is using 'comet'.

Comet is generally used to refer to any method that can emulate a raw socket:
iframes, XHR, inserting script tags, etc.

All this is doing is proxying HTTP from the server to the browser and back.
It's an interesting thing to try, but I can't see any real life use for it.

~~~
mbreese
This is much lower level and doesn't require javascript to work. (Ignoring the
fact that the demo is an in-browser javascript http server). So, it's not
comet.

The problem is that this isn't supported by any clients in widespread use, so
you can't just load up a webpage and see a demo.

The advantage of this type of approach only becomes clear when you are using a
lower-level http client library to access resources. It gives the server a
chance to poll the client for information, without using Javascript. For
browser approaches, this may not matter. However, for lower-infrastructure
things, this approach is great.

I've used a very similar technique to link compute nodes to a job server where
the compute nodes were behind a NAT. This eliminated any long polling required
and still allowed the server to query the nodes for their status.

Again, not the type of thing where you're running anything in a browser, but I
wanted to use HTTP as the protocol for simplicity, and needed a way for the
server to talk to a client behind a NAT.

Now the down side is that you basically have to rewrite a web server in order
for this to work. I'm not sure if this could be bolted on. You also need some
sort of session management built in, so you can pair incoming (client->server)
requests and outgoing (server->client) requests. And then you need a client
library that can spin up its own HTTP server and handle its own requests.

In my case, I was able to write everything from scratch. But I doubt my code
would scale very well. I'm also not sure it wouldn't be better in this case to
just make a new protocol. There is a lot of hackery required to get this to
work, and I doubt you'll see web browsers support anything like this.
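
The request/response pairing mentioned above can be sketched as a small correlation table -- the class and its naming here are mine, not taken from any of the implementations discussed:

```python
import itertools
import queue

class RequestPairing:
    """Sketch of the session bookkeeping for server->client requests:
    outbound requests wait in a queue for the NAT'd client to poll,
    and each response is matched back to its originator by id."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.outbox = queue.Queue()  # requests awaiting client pickup
        self._pending = {}           # id -> who asked for this

    def send(self, request, reply_to):
        # Queue a server-initiated request, remembering its origin.
        rid = next(self._ids)
        self._pending[rid] = reply_to
        self.outbox.put((rid, request))
        return rid

    def receive(self, rid, response):
        # Pair the incoming response with its original requester.
        return self._pending.pop(rid), response
```
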

~~~
the_me
Don't get it. If you are interested in the low level communications, why
wouldn't you simply use a socket and send your own application defined
commands over port 80?

HTTP exists so that any browser can access any web server, it doesn't re-
implement or otherwise allow the usage of TCP/IP.

As a corollary, I don't see why I need to know about your application's
communication protocol, let alone adhere to it because it's now a standard.

~~~
mbreese
We are talking about bi-directional communication between the client and
server; specifically, server-initiated requests to the client. So the major
issue this helps overcome is NAT/firewall traversal.

This proposal would be to convert HTTP from being a client making requests to
a server to (effectively) a server making and receiving requests from another
server. So your browser would also be a (mini) server, handling requests from
the main server.

This is largely for people that want to use HTTP as a message-passing
protocol, but use it in a bi-directional manner between possibly NAT'd hosts.

~~~
tonyg
"This is largely for people that want to use HTTP as a message-passing
protocol, but use it in a bi-directional manner between possibly NAT'd hosts."

That is _exactly_ it. You've got it.

HTTP makes an almost ideal message passing protocol: it has a rich and battle-
tested addressing model; it is asymmetric in a _helpful_ way (really! the
response codes are similar to ICMP messages, where the requests are similar to
IP datagrams); it is widely supported and deployed; it is content neutral.

It doesn't even have to be inefficient ;-)
(<http://www.lshift.net/blog/2009/02/27/streamlining-http>)

------
int2e
How about just using hanging GETs?

~~~
DomesticMouse
Maybe it makes some sense in server apis?

I don't know. I'm sold on using long polls in conjunction with actors...

