

Show HN: Pump, a dead simple Pythonic abstraction of HTTP. - pyninja
http://adeel.github.com/pump/

======
justin_vanw
"What WSGI should have been."

Well, not at all. Pump is what werkzeug, webob, and other more friendly
wrappers on WSGI already are. Basically pointless duplication of work, without
understanding why WSGI can't be this simple.

One basic reason that WSGI can't be as simple as just returning a dictionary
is that you don't necessarily want the entire body of your response pre-
computed before starting to return data to the client. What about long running
connections? What if you want to return the head of the response immediately,
so the client can start pulling css and js while you compute the body of the
response? What if you want to do chunked encoding to support long polling
connections, or responses where you don't know the response size beforehand?

Pump is basically doing what lots of other things already do, except without
understanding HTTP quite as well.
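
A minimal sketch of the streaming case described above, assuming a plain WSGI server that passes each yielded block on to the client as it arrives. The names `streaming_app` and `body` are made up for illustration, and the `time.sleep` call stands in for slow work:

```python
import time

def streaming_app(environ, start_response):
    """WSGI app that sends headers first, then produces the body piece by piece."""
    # No Content-Length header: the total size is not known up front.
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])

    def body():
        # The head goes out first so the client can start fetching CSS and JS.
        yield b"<html><head><link rel='stylesheet' href='/site.css'></head><body>"
        for i in range(3):
            time.sleep(0.01)  # stand-in for slow work between blocks
            yield ("<p>chunk %d</p>" % i).encode("utf-8")
        yield b"</body></html>"

    return body()
```

Nothing here is computed before the headers are handed over, which is the property a single pre-built response dictionary would lose. Whether the blocks reach the client as chunked encoding depends on the server, not on the app.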

~~~
pyninja
Thanks for the feedback. I'm not willing to accept that WSGI couldn't be this
simple. Pump's specification is modeled on Ring for Clojure
(<https://github.com/mmcgrana/ring>), and I'm sure there's a way Ring gets
around the issues you mention. I'd never had to do any of those things before
but I'll look into it now.

~~~
sigil
As a special case, how about allowing the response body to be an iterable, and
writing whatever blocks of data it produces back to the client?

You should also allow multiple headers with the same field-name, since that's
in the spec.
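
Roughly what those two suggestions could look like, as a sketch only. The response shape (a dict with `status`, `headers`, and `body` keys) and the `to_wsgi` adapter are assumptions made for illustration, not Pump's actual API:

```python
def to_wsgi(handler):
    """Adapt a dict-returning handler (hypothetical shape) to a WSGI app."""
    def wsgi_app(environ, start_response):
        response = handler(environ)
        # Headers are a list of pairs, so a field name may appear more than once.
        start_response(response["status"], list(response["headers"]))
        body = response["body"]
        # Special case: a single bytes object becomes a one-item iterable.
        return [body] if isinstance(body, bytes) else body
    return wsgi_app

def handler(request):
    def blocks():
        # Each block is written back to the client as it is produced.
        for n in range(3):
            yield ("block %d\n" % n).encode("ascii")
    return {
        "status": "200 OK",
        "headers": [
            ("Content-Type", "text/plain"),
            ("Set-Cookie", "a=1"),
            ("Set-Cookie", "b=2"),
        ],
        "body": blocks(),
    }

app = to_wsgi(handler)
```

A plain dict of headers cannot hold two `Set-Cookie` entries, which is why the sketch uses a list of pairs instead.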

I like the idea of Pump, and am tired of frameworks protecting me from HTTP.

EDIT: Looking at the WebOb code. It does have quite a few conveniences for
working with HTTP messages. I'm not sure if copying bodies into temp files in
order to make them seekable is a "completely fantastic way" of doing things,
but if I were you I'd definitely read through WebOb to see what kind of
problems you might be up against.

~~~
masklinn
> I like the idea of Pump, and am tired of frameworks protecting me from HTTP.

if you don't like being "protected from http" why not write raw wsgi
applications?

~~~
sigil
How is a gateway that reads, buffers, and parses HTTP headers into an
environment object [1] before turning it over to your "raw wsgi application"
_not_ protecting it from HTTP?

WSGI protects you from HTTP. CGI protects you from HTTP. mod_python and
mod_perl protect you from HTTP. If you're unable to read and parse the
complete HTTP request yourself -- perhaps incrementally, there's an idea --
you're protected from HTTP. _Something_ is imposing policy like how many
headers to accept, what the longest header should be, how to fold multiple
headers with the same field-name, that it's okay to consume memory buffering
all the headers, and so on.

In my ideal world, a web app server has access to the full HTTP request
stream, calls an incremental HTTP parser [2] [3], and does whatever it wants
along the way. If the typical use case is to accumulate a full request object
and call a handler, fine, that can be made convenient. But the web app gets to
decide.
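
To make "incremental" concrete, here is a toy sketch in Python. It is not the API of either linked parser; the class name, the `on_header` callback, and the `max_line` limit are all hypothetical, and it ignores details such as folded header lines:

```python
class IncrementalHeaderParser:
    """Feed raw bytes as they arrive; headers are reported one at a time."""

    def __init__(self, on_header, max_line=8192):
        self.on_header = on_header  # callback: the application sets the policy
        self.max_line = max_line    # assumed limit, chosen by the application
        self.buffer = b""
        self.request_line = None
        self.done = False

    def feed(self, data):
        self.buffer += data
        while not self.done and b"\r\n" in self.buffer:
            line, self.buffer = self.buffer.split(b"\r\n", 1)
            if len(line) > self.max_line:
                raise ValueError("header line too long")
            if self.request_line is None:
                self.request_line = line.decode("latin-1")
            elif line == b"":
                self.done = True  # a blank line ends the header section
            else:
                name, _, value = line.partition(b":")
                self.on_header(name.decode("latin-1").strip(),
                               value.decode("latin-1").strip())
        # An unfinished line that keeps growing is rejected early.
        if not self.done and len(self.buffer) > self.max_line:
            raise ValueError("header line too long")
```

The caller decides what to do with repeated field names and how much to buffer. Once `done` is set, whatever remains in `buffer` is the start of the request body.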

Perhaps my issue is not with frameworks (in the sense of Django, Ruby, etc),
but with web servers. Except, I view the infrastructure for hosting a web app
inside a web server as yet another framework. The common use case is optimized
for at the expense of the less common use cases, which become more painful
than they should be. Or sometimes outright impossible.

TL;DR -- Libraries over frameworks. In Soviet Framework Russia, you don't call
code...code call YOU.

[1] <http://www.python.org/dev/peps/pep-0333/#environ-variables>

[2] <https://github.com/ry/http-parser>

[3] <https://github.com/mongrel/mongrel/tree/master/ext/http11>

~~~
sergeys
"Libraries over frameworks. In Soviet Framework Russia, you don't call
code...code call YOU." -- exactly! Very few people get this, which is a shame.

------
collint
The Rack style API is nice for the simplicity. I <3 it.

The Rack style API is NOT a great representation of the HTTP protocol,
specifically for anything with a streaming or chunked response.

start_response/yield may not be the absolute best API for this, I haven't
thought too much about that. But if you go look into what was done in Rails 3.1
for chunked responses you may realize it's not actually too great.

------
jrockway
The point of WSGI is not to be a good abstraction of HTTP. As an app author,
you aren't supposed to care. The advantage of WSGI is that it's standard, and
everything uses it. That means you can use any web framework with any web
server, and it will all Just Work.

When you write your own copy of WSGI to change how some words are spelled, you
don't gain much, but you lose the whole WSGI community. This seems rather
pointless to me.

~~~
dschobel
FTA:

 _Take advantage of existing WSGI tools. Pump comes with adapters for serving
Pump apps with WSGI servers and converting WSGI middleware to Pump
middleware._

------
clarkevans
It seems like the author here is comparing Pump to WSGI -- perhaps a better
comparison is to Ian Bicking's WebOb (<http://webob.org>) or Armin Ronacher's
Werkzeug (<http://werkzeug.pocoo.org/>).

WSGI is a low-level protocol that provides a minimal interface to an HTTP
Server ala CGI. It is purposefully not an application-level HTTP toolkit. For
example, a WSGI component takes an input stream and returns an iterable which
could yield output chunks... or perhaps an infinite data stream. These edge
cases are sometimes very important and why the interface is designed as it is:
inconvenient as it may be for simple apps.
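
The input-stream half of that interface can be sketched as follows, assuming the server supplies `wsgi.input` and `CONTENT_LENGTH` as PEP 3333 describes. The app name and the 4096-byte read size are arbitrary choices for illustration:

```python
import hashlib
import io

def checksum_app(environ, start_response):
    """Read the request body in fixed-size pieces instead of all at once."""
    remaining = int(environ.get("CONTENT_LENGTH") or 0)
    stream = environ["wsgi.input"]
    digest = hashlib.sha256()
    while remaining > 0:
        piece = stream.read(min(4096, remaining))
        if not piece:
            break  # the client sent less than it promised
        digest.update(piece)
        remaining -= len(piece)
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [digest.hexdigest().encode("ascii")]

# Exercise the app without a server, using an in-memory request body.
upload = b"x" * 10000
fake_environ = {"CONTENT_LENGTH": str(len(upload)),
                "wsgi.input": io.BytesIO(upload)}
reply = b"".join(checksum_app(fake_environ, lambda status, headers: None))
```

Memory use stays bounded by the read size no matter how long the POST is.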

------
LeafStorm
One issue that I have with it is that there is no distinction between script
name and path info. Obviously, you don't have to call them that, but not
having a distinction between the two makes it impossible to serve two Pump
apps on the same domain and dispatch between them.
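
For comparison, this is roughly how the distinction gets used on the WSGI side. It is a sketch: `mount`, `show_paths`, and `not_found` are hypothetical names, and real dispatchers handle more edge cases:

```python
def mount(apps, fallback):
    """Dispatch on a path prefix, shifting it from PATH_INFO to SCRIPT_NAME."""
    def dispatcher(environ, start_response):
        path = environ.get("PATH_INFO", "")
        for prefix, app in apps.items():
            if path == prefix or path.startswith(prefix + "/"):
                child = dict(environ)  # copy so the caller's dict is untouched
                child["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
                child["PATH_INFO"] = path[len(prefix):]
                return app(child, start_response)
        return fallback(environ, start_response)
    return dispatcher

def show_paths(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    text = "%s|%s" % (environ["SCRIPT_NAME"], environ["PATH_INFO"])
    return [text.encode("utf-8")]

def not_found(environ, start_response):
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]

site = mount({"/blog": show_paths, "/wiki": show_paths}, not_found)
```

Each mounted app sees only its own part of the path in `PATH_INFO` and can rebuild full URLs from `SCRIPT_NAME`, so it does not need to know where it was mounted.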

Also, there's no clearly defined format for middleware added by keys - the
included middleware just adds plain old keys as it pleases.

But more generally, I don't really see the purpose of this. WSGI is obviously
not ideal (thanks to start_response and CGI environment variables), but it's
also quite firmly in place in the Python world. Not to mention that it would
take the Pump library quite a while to catch up to Werkzeug or WebOb in terms
of having all the necessary HTTP primitives implemented. (Multipart parsing,
anyone?) Unless the server makers get on board, Pump is pretty much just an
added layer of complexity on top of WSGI. Instead of "server | WSGI | WSGI
library | framework", you have "server | WSGI | Pump adapter | Pump library |
framework".

------
saurik
I don't see any way to use Pump to stream data in (a very long POST) or stream
data out (a very long body). While it is certainly "dead simple", claiming
that it is "what WSGI should have been" is a serious stretch.

------
mkramlich
Rolling your own small web framework in some arbitrary language seems to be
one of the easier ways to add "open source project author" to one's resume.
Not that I'm knocking it. But after you've seen the wheel reinvented the Nth
time, you get increasingly less impressed on N+1, N+2, etc.

------
nikcub
by passing dicts you are removing what makes WSGI work so well: lazy loading,
chunked responses, middleware (encoding, caching etc.) by decorating,
iterators/generators etc.

your solution is going to be slower, more memory intensive and will not be
able to be http 1.1 compatible. there is a reason why WSGI was designed the
way it is
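
A small sketch of the "middleware by decorating" point, assuming a body made of bytes blocks. The `uppercase` transform is a made-up stand-in for something like encoding or compression:

```python
def uppercase(app):
    """Middleware: transform each body block as it passes through."""
    def wrapped(environ, start_response):
        result = app(environ, start_response)
        try:
            for block in result:
                yield block.upper()  # one block at a time, never the whole body
        finally:
            # PEP 3333 asks middleware to close the wrapped iterable if it can.
            if hasattr(result, "close"):
                result.close()
    return wrapped

@uppercase
def hello(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello ", b"world\n"]
```

Because `wrapped` is a generator, the inner app is not even called until the server starts iterating, which is the lazy behaviour being described.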

------
benatkin
Here's another NoWSGI HTTP server library (Brubeck):

<http://news.ycombinator.com/item?id=2770866>

~~~
pyninja
I hadn't seen this, looks interesting.

------
radarsat1
Looks useful, though I think any modern abstraction like this should come with
at least the scaffolding for future support of websockets. (They are a bit in
flux right now, but will ultimately be incredibly useful.)

~~~
MostAwesomeDude
WebSockets is not HTTP and should not be handled like HTTP. I just
reimplemented a WS wrapper by _removing_ its dependency on HTTP stuff, and
ended up _decreasing_ its LOCs and complexity.

------
pbreit
This seems interesting but I'm not sure why. What are some use cases?

~~~
teej
I believe the intent is to be Rack[1] for Python. The beauty of Rack is that I
can build a new web framework, web server, or web app plugin and have it "just
work" with the rest of the ecosystem. And because the Rack API is so simple,
building to spec is easy.

In the Ruby world if I want to use the awesome library Sass all I have to do
is:

        gem install sass


Because it functions as a Rack plugin it automatically works with any Ruby web
framework & Ruby web server combo I choose. No special setup required.

-----

[1] <http://rack.rubyforge.org/>

~~~
irahul

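        # WSGI
        def app(environ, start_response):
            start_response('200 OK', [('Content-Type', 'text/plain')])
            yield b'Hello World\n'  # body blocks must be bytes on Python 3

        # Rack
        app = proc do |env|
          # the body must respond to #each, so wrap the string in an array
          [200, {'Content-Type' => 'text/plain'}, ["Hello World\n"]]
        end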

WSGI is the Rack for Python. In fact, WSGI predates Rack, and Rack is WSGI
inspired.

~~~
bobbyi
They are both predated by:

    
    
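        import java.io.IOException;
        import javax.servlet.ServletException;
        import javax.servlet.http.HttpServlet;
        import javax.servlet.http.HttpServletRequest;
        import javax.servlet.http.HttpServletResponse;

        public class App extends HttpServlet {
          public void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
            response.setContentType("text/plain");
            response.getWriter().println("Hello world");
          }
        }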

~~~
ianb
Yes, but IMHO servlets got it all wrong, and WSGI took very little from
servlets (though for instance Webware, a pre-WSGI framework did use the
servlet model). But it owes much to the superior CGI model; replace processes
with function calls and structure the response lightly and you have WSGI.

~~~
carlhu
Ianb, may I ask what difference you mean between servlets and CGI? A CGI
script written in Perl, for example, was essentially a Perl servlet, I
think. How is the CGI model fundamentally different from a servlet model,
where your application code is given a request as a parameter to a function,
and must return a response?

~~~
irahul
> How is the CGI model fundamentally different from a servlet model,

Since both are web server to application server interfaces, they aren't
fundamentally different - the difference is cgi was language independent and
hence defined for the common minimum. cgi couldn't have been defined in
request and response objects - it would have caused trouble for languages
that don't have objects.

> where your application code is given a request as a parameter to a function,
> and must return a response?

cgi is not given a request parameter - the request parameters are passed in
the environment. And cgi doesn't return a response object - whatever it writes
to stdout constitutes the response. cgi had to cater to all sorts of
implementations - assuming request/response objects wasn't a possibility.

Servlets and cgi aren't fundamentally different, but I guess we can agree they
are sufficiently different.
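
The CGI side of that comparison, sketched in Python. The function name `run_cgi` is made up, and the environment and output stream are taken as parameters only so the example can be run without a web server:

```python
import io
import os
import sys

def run_cgi(environ=os.environ, out=sys.stdout):
    """A CGI program: the request arrives in the environment, the response goes to stdout."""
    method = environ.get("REQUEST_METHOD", "GET")
    query = environ.get("QUERY_STRING", "")
    # Header block, blank line, then the body: all plain text on stdout.
    out.write("Content-Type: text/plain\n\n")
    out.write("method=%s query=%s\n" % (method, query))

# Demonstration with a fake environment and an in-memory "stdout".
fake_out = io.StringIO()
run_cgi({"REQUEST_METHOD": "POST", "QUERY_STRING": "a=1"}, fake_out)
```

There is no request object and no response object anywhere, just environment variables going in and text coming out.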

~~~
ianb
To add/agree: I think the biggest difference is that CGI dealt in data, and
did not have objects/APIs/etc. In many ways it would have been reasonable to
skip even that, and pass an HTTP request in on stdin, and get an HTTP response
on stdout, with just some minimal sanitizing promises; but I don't think I've
ever seen that approach. Coincidence of history I suppose. Anyway, WSGI also
carefully avoided any objects, only using standard data structures
(dicts/hashes, strings, ints, ordered-associative-arrays, and iterable
response). The result is a functional API without any opinions.
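
One consequence of using only standard data structures is that an app can be called with no server at all. A sketch using the standard library's `wsgiref.util.setup_testing_defaults` to fill in the required keys; the app and the `captured` dict are illustrative names:

```python
from wsgiref.util import setup_testing_defaults

def app(environ, start_response):
    text = "%s %s" % (environ["REQUEST_METHOD"], environ["PATH_INFO"])
    body = text.encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]

# The request is just a dict; the helper adds the remaining required keys.
environ = {"REQUEST_METHOD": "GET", "PATH_INFO": "/hello"}
setup_testing_defaults(environ)

captured = {}
def start_response(status, headers, exc_info=None):
    # The "response object" is a string and a list of pairs.
    captured["status"] = status
    captured["headers"] = headers

result = b"".join(app(environ, start_response))
```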

------
est
I never understand the mentality of modern wsgi/cgi design. Passing parameters
using environ? Why not directly as a python request object?

~~~
LeafStorm
For consistency. If there was a request object, there would need to be some
sort of standard implementation that there was always access to so that the
object could be properly instantiated, instance tested, etc. By using a plain
old dict, you ensure compatibility at the cost of attribute access.

Also because WSGI is designed to be low-level, and if you want a request
object you should really be using a library or framework.

~~~
est
But the old model sucks. If a process can handle more than one request, how
does it change environ if there is only one process?
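
On that question: in WSGI, `environ` is a dict built for each request and passed as an argument, not the process-wide `os.environ` that CGI uses. A rough sketch with threads standing in for a server handling several requests in one process (all names are hypothetical):

```python
import threading

def whoami(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ["PATH_INFO"].encode("utf-8")]

results = {}

def serve(path):
    # A server builds a fresh dict per request; os.environ is never touched.
    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    results[path] = b"".join(whoami(environ, lambda status, headers: None))

threads = [threading.Thread(target=serve, args=("/req/%d" % n,))
           for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each call sees only the dict it was handed, so concurrent requests do not interfere with one another.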

------
moreyes
It is ironic that they mention "don't have to reinvent the wheel" in that
page.

------
teyc
As far as web frameworks go, wsgi is pretty low level already. If you want to
do a framework, move to a narrow vertical, it might get more traction.

------
dlamotte
<http://xkcd.com/927/> Seems relevant here...

