
I am Doing HTTP Wrong - admp
http://lucumr.pocoo.org/2012/4/14/im-doing-http-wrong/
======
the_mitsuhiko
Oh wow. I did not expect this to hit hackernews considering the slides are
totally not intended to be consumed without notes. Let me just quickly point
out a few things before I write up something longer about this thing.

First things first:

> I'd like to add that maybe Fireteam should have not chosen HTTP as a
> transfer protocol for games, where state is everywhere and HTTP is
> stateless. Although I don't feel qualified to suggest alternatives.

We have very good reasons for supporting HTTP but it's not the only protocol
we're implementing. As pointed out on one slide: HTTP is better supported than
raw socket connections on certain devices we care about.

> Trying to over-abstract network communication is one of the classic mistakes
> in distributed system design.

We're not doing that. If you look at the slides, we're 100% HTTP compliant; we
don't build some fancy-schmancy protocol on top that merely uses HTTP as a
transport layer without embracing it. I had the "pleasure" of using SOAP in
the past, and that is not what we're after.

We're basically treating HTTP as an implementation detail for JSON/urlencoded
REST, and we do that by sprinkling a ton of meta information around in the
code. Once the request is dispatched to the function, it's reduced to the
valuable information; all the HTTP logic happens one layer higher.

The general workflow, however, is fundamentally different from what I did in
the past with Flask, or from how Django works, which also explains the title.

I will write something up because I think from the slides alone you get a
completely wrong impression about why this is cool :-)

~~~
trentonstrong
Interesting! This seems like something that might work well with 3.0's
function annotation support, with the ability to specify that the handler's
input argument conforms to either the streaming or the buffered interface.

As a point of curiosity about best practices, how are you handling the meta
information right now?

It feels like decorators provide the cleanest solution currently, but I've
always wondered if there are alternatives.
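
For what it's worth, the decorator approach usually looks something like the
following sketch. All the names here (`endpoint`, `meta`, the URL rule) are
hypothetical, invented for illustration, and not Fireteam's actual API:

```python
# Hypothetical decorator that attaches routing/serialization metadata
# to the handler function itself, keeping it next to the code it describes.
def endpoint(url, methods=("GET",), serializer="json"):
    def decorator(func):
        func.meta = {"url": url, "methods": tuple(methods),
                     "serializer": serializer}
        return func  # the function itself is returned unchanged
    return decorator

@endpoint("/users/<int:user_id>", methods=("GET", "PUT"))
def get_user(user_id):
    return {"id": user_id}

# A framework can later collect func.meta from all registered handlers.
print(get_user.meta["url"])  # /users/<int:user_id>
```

The main alternative, as discussed below, is keeping the same metadata in a
central registry or external files rather than on the functions themselves.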

~~~
the_mitsuhiko
> Interesting! This seems like something that might work well with 3.0's
> function annotation support, with the ability to specify that the handler's
> input argument conforms to either the streaming or the buffered interface.

I thought so, but the annotations in Python 3 are largely useless. They would
work if you could forward the signatures through a chain of decorators, but
there is no guarantee of that. I did play around with Python 3, but it's
actually not in any way better for what we're doing here.
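
The forwarding problem he describes can be seen in a few lines: a plain
wrapping decorator silently drops `__annotations__`, and it is only preserved
if every decorator in the chain opts in with `functools.wraps` (the handler
names and annotation values below are made up for illustration):

```python
import functools

def plain_decorator(func):
    def wrapper(*args, **kwargs):        # no functools.wraps
        return func(*args, **kwargs)
    return wrapper

def wrapping_decorator(func):
    @functools.wraps(func)               # copies __annotations__ et al.
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@plain_decorator
def handler_a(request: "buffered") -> "json":
    pass

@wrapping_decorator
def handler_b(request: "buffered") -> "json":
    pass

print(handler_a.__annotations__)  # {} -- metadata silently lost
print(handler_b.__annotations__)  # {'request': 'buffered', 'return': 'json'}
```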

> It feels like decorators provide the cleanest solution currently, but I've
> always wondered if there are alternatives.

Decorators require a routing system that can resolve routing "dependencies".
We're using the Werkzeug routing which was designed to do that. The basic idea
is that it does not matter in which order you define the URL rules, the
routing system figures out the ordering.
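
The order-independence idea can be sketched with a toy router. This is an
illustration of the concept only, not Werkzeug's actual algorithm: rules are
sorted by specificity, so a static rule beats a dynamic one no matter which
was defined first.

```python
import re

class Router:
    """Toy router: definition order doesn't matter; rules with fewer
    placeholders are considered more specific and are tried first."""

    def __init__(self):
        self.rules = []

    def add(self, pattern, endpoint):
        # More placeholders => less specific => sorted later.
        weight = pattern.count("<")
        regex = re.compile(
            "^" + re.sub(r"<(\w+)>", r"(?P<\1>[^/]+)", pattern) + "$")
        self.rules.append((weight, regex, endpoint))
        self.rules.sort(key=lambda rule: rule[0])

    def match(self, path):
        for _, regex, endpoint in self.rules:
            m = regex.match(path)
            if m:
                return endpoint, m.groupdict()
        raise LookupError(path)

router = Router()
router.add("/page/<id>", "show_page")   # dynamic rule defined first...
router.add("/page/new", "new_page")     # ...static rule still wins
print(router.match("/page/new"))        # ('new_page', {})
```

Without such reordering, whichever rule was registered first would shadow the
other, which is why frameworks lacking it push all rules into one central file.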

If you don't have that, you would have to move the definitions to a central
file (akin to urls.py in Django) to make this work because you need to define
the ordering.

We also experimented with having the meta information in JSON files but it was
not really worth it. Having it next to the function makes it easier to
maintain it.

------
pestaa
I'm always eager to hear whatever Armin has to say, but I feel so-so this
time, probably because I didn't fully understand or his point was not clearly
communicated.

The premise is that you can serialize request and response objects only after
you've made sure they are fully independent of the underlying protocol.

However, when the protocol is in fact _underlying_, you cannot fully get rid
of its nature, and, for example, resources are still going to be requested by
their URLs. The middleware he develops already takes care of the details like
the protocol headers and encoding. I believe that even with the best efforts
and intentions, HTTP will remain implied.

It is also not clear to me what something like Flask would gain from buffered
and serializable objects. At first sight it is a performance penalty rather
than a practical improvement, isn't it? I'd like to hear more.

Edit: I'd like to add that maybe Fireteam should have not chosen HTTP as a
transfer protocol for games, where state is everywhere and HTTP is stateless.
Although I don't feel qualified to suggest alternatives.

~~~
MehdiEG
I have to say that I was equally puzzled by this post the first time I read
it. Perhaps having some background information about what exactly he's working
on might have helped.

On a second read, it looks like the meat of the post is about decoupling the
implementation of your API from the underlying network protocol. Unless I'm
missing something, that would be MVC.

Controllers are completely oblivious to the fact that HTTP is being used
behind the scene. The routing mechanism takes care of parsing the request,
deserializing the request body, instantiating the appropriate controller and
invoking the correct action passing the deserialized objects as parameters.
Once the controller's action has completed and returned its result, the View
is responsible for serialising the response to HTML, JSON, XML or whatever
serialisation format you use, and for setting the correct HTTP status code and
headers.

So still puzzled, even after a second read.

~~~
PaulHoule
Trying to over-abstract network communication is one of the classic mistakes
in distributed system design.

Remember the old "RPC" idea from the 1980's? Despite a huge amount of effort
to optimize remote procedure calls, nobody could ever get an RPC to be within
orders of magnitude of a real procedure call in speed.

The RPC concept only became mainstream in the 2000's when people had given up
on performance; SOAP and POX systems use inefficient XML and JSON
serializations and are integrated into http stacks that were designed to use
something else. These systems flourished in the 2000's because
interoperability, not performance, was the driver.

Now, if you're in an AJAX, Flex or Silverlight environment, you've got the
whole asynchronous communications issue -- you can't paper that over with an
abstraction layer, you've got to build the whole application around the fact
of async comm if you want to build something that really works.

~~~
MehdiEG
Abstraction gone wrong almost always comes down to abstracting for the wrong
reasons. RPC was all about pretending, from the client side, that accessing a
remote resource was the same as calling a local method. This was and still is
nonsense, and a prime example of a leaky abstraction.

My understanding of the OP's post is that he is talking about server-side
issues rather than client-side though. Others have pointed out that he's
talking about the design of a specific python library or framework that I'm
not familiar with - I'll need to read up on this first.

------
sigil
_One of the direct consequences for instance is that the first WSGI request
object that starts consuming form data is the one that ends up being the only
one that can have it._

Protocols like HTTP that lack length-prefixing of messages have some
unfortunate implications for _streams_ of messages.

1. The reader of the first message must buffer. Since it doesn't know how
much it should read, it may buffer too much, and consume some of the second
message from the stream.

2. The reader of the second message needs access to the stream, plus that
extra data.

3. If your stream is a non-seekable file descriptor like a socket, there's no
way to rewind, or somehow put that extra data back onto the original stream.
This forces you to either (a) convey the stream + extra data, which is less
natural, or (b) create a new stream for the second message reader, copy the
extra data to it, and then copy (in userspace!) data from the original stream
into the new stream.

Take a look at any cgi or fastcgi implementation and you'll see something like
3b.

It sucks, and I think it makes a good argument for using length-prefixed
messages in your internal protocols (or whatever ones you can control).
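
A minimal illustration of the length-prefixed alternative: with a 4-byte
length header in front of each message, a reader can pull exactly one message
off a stream without over-reading into the next one. (The helper names here
are invented for the example.)

```python
import io
import struct

def write_message(stream, payload):
    # 4-byte big-endian length prefix, then the payload itself.
    stream.write(struct.pack(">I", len(payload)) + payload)

def read_message(stream):
    # Read exactly what the prefix announces -- no guessing, and no
    # accidental consumption of the next message on the stream.
    (length,) = struct.unpack(">I", stream.read(4))
    return stream.read(length)

buf = io.BytesIO()
write_message(buf, b"first message")
write_message(buf, b"second message")
buf.seek(0)
print(read_message(buf))   # b'first message'
print(read_message(buf))   # b'second message'
```

Contrast this with HTTP-style delimiting, where the first reader has to scan
for a terminator and may well swallow bytes belonging to the next message.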

------
adeelk
I am not sure whether Ronacher realizes it, but this is exactly the motivation
behind Pump [1], an HTTP abstraction that takes a different approach than
WSGI. It was criticized a lot by HN [2] and by Ronacher himself [3] but I’m
glad that he understands my point of view now.

[1] <http://adeel.github.com/pump/manual.html>

[2] <http://news.ycombinator.com/item?id=2810373>

[3] <http://lucumr.pocoo.org/2011/7/27/the-pluggable-pipedream/>

------
perfunctory
To be honest, I never understood why WSGI conflated buffered and streaming
modes in one API. Having two separate, specialized APIs makes much more
sense.

~~~
slurgfest
There isn't any notion of 'buffered mode' in WSGI. It is just left up to the
app to buffer as needed before yielding. Middleware and servers are not
supposed to perform buffering. So I don't see how the two are conflated. How
am I misunderstanding you?
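
Concretely, "buffering in the app layer" under WSGI can be as small as the
following sketch: both callables below have the same WSGI shape, and the
buffered one simply assembles its body before returning.

```python
def streaming_app(environ, start_response):
    # Streaming: yield chunks as they become available.
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"chunk one\n"
    yield b"chunk two\n"

def buffered_app(environ, start_response):
    # Buffered: assemble the whole body first, then return it as a
    # single-item iterable (which also lets us set Content-Length).
    body = b"".join([b"chunk one\n", b"chunk two\n"])
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]

# Tiny driver standing in for a WSGI server:
def run(app):
    status = {}
    def start_response(line, headers):
        status["line"] = line
    body = b"".join(app({}, start_response))
    return status["line"], body

print(run(buffered_app))   # ('200 OK', b'chunk one\nchunk two\n')
```

The server interface stays the same either way, which is the point: WSGI
itself takes no position on buffering.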

~~~
the_mitsuhiko
> There isn't any notion of 'buffered mode' in WSGI. It is just left up to the
> app to buffer as needed before yielding. Middleware and servers are not
> supposed to perform buffering. So I don't see how the two are conflated. How
> am I misunderstanding you?

The problem is that you would have to support streaming because you don't know
what the inner WSGI application is doing. Our stuff does not work on top of
WSGI because of that.

At the end of the day it is implemented as a WSGI application but it only uses
WSGI for talking to the server.

~~~
slurgfest
I don't think I have any criticism for your approach on your project.

I was trying to comment on the idea that WSGI somehow conflates buffered mode
with streaming mode. How is that even possible when WSGI does not even have a
notion of buffered mode?

If you want your app or even your framework to buffer, you can just do that in
the app layer. I can't see that the server interface should have additional
moving parts just to do what is cleanly doable in the app layer.

I can only imagine this being a disappointment if someone wanted to use WSGI
middleware to carry out buffering. But if (edit: internal) WSGI compatibility
is not a high priority, then why would you want to use WSGI middleware for
this (generally pretty messy)? Do it in your framework or in your app.

So I do not understand the complaint about the 'conflation' of buffering and
streaming. I think it is appropriate to leave the buffering decision in the
app layer. And it is also reasonable to use WSGI as just a way to talk to a
server (what else is it really for? Obviously not to offer a big fat servlet
API...)

If you want to do separate buffering and streaming APIs in your app layer then
that could be a smart way to reduce complexity but I don't see why this is
some sort of complaint about WSGI.

------
swah
I'm missing something: why is Armin mentioning that "slides are useless
without the talk" if the link is to a text with no reference of a talk?

------
PommeDeTerre
Sounds like we may be getting another name or acronym to add to the long list
of failed technologies that have already tried to promote a similar idea.

We already have ONC RPC, DCE, Java's Remote Method Invocation, Jini, CORBA,
SOAP, and .NET Remoting, among many others. Does this family of failure really
need to grow any larger?

~~~
jerf
No, this is about how the libraries that bind to HTTP don't actually match
HTTP's model of operation.

I've banged on this drum in a couple of cases myself with framework authors,
in some cases even right at the beginning of the framework's life, and
generally hit a brick wall.

HTTP is _not_ a request/response protocol. It certainly once was, back in
HTTP/1.0, but with pipelining and connection reuse and now websockets (and
soon SPDY's server push/hint), it's a streaming protocol with a common use
case where it is used for request/response, and that requires a completely,
_completely_ different API than a protocol that is truly request/response.

It is a lot easier to build a streaming base that has a special case for
request/response than it is to build a request/response base and then hack in
bizarre, conceptually-impure bullshit for streaming, but every web framework I
can find is the latter, if indeed they don't simply punt entirely on streaming
because they've written it so thoroughly out of the API. (I'm vaguely aware of
some that are actually sensible, but all in languages I don't know and haven't
gotten to yet, so I'm not sure if they really are built on a sensible base or
if they merely raise the bizarre crap to API-blessed status.)

~~~
saurik
Can you describe this further? Reading this guy's post, it sounds like he's
complaining about the exact opposite problem: the web framework is designed
for streaming and, while I agree with you that specializing request/response
out of that should be easy, he claims that this seems to make it awkward or
impossible for him to have his request/response model.

------
its_so_on
Could someone add a little more context to this please?

~~~
arthurbrown
I'd guess the impetus for this post came from this meeting at PyCon:

<http://kennethreitz.com/the-future-of-python-http.html>

------
kracekumar
How about the latest buzz, websockets?

