The Future of Python HTTP (kennethreitz.com)
336 points by abraham on Apr 2, 2012 | 53 comments



Kenneth has put a lot of work into Requests, and it's really turning into something beautiful.

On a side note, Kenneth is a standup guy who helped propel me into the world of open source. I made a few slight documentation updates on Requests as part of my first pull request ever, and he made me feel like my contribution was so important.


This is something I noticed as well, someone corrected a 1-char typo in one of his projects' documentation and he quickly merged and replied with "Thanks :)"


Exactly the same thing happened to me. Getting a sparkly cake from him feels like getting a geek merit badge.


Yeah. I'm also starting by contributing to python-guide.org, and it feels great. He's a swell guy.


I think requests and werkzeug are great, but I'm not entirely sure what problems this proposal addresses.

First, it will be extremely hard to get Django to replace a large chunk of its core with an external library, not only because they don't rely on external dependencies, but because there's no obvious benefit.

For example, why would you replace the Django test client with something resembling requests? There's no need to go through the entire request process cycle just to test a small chunk of code. It makes them less efficient, less accurate, and more difficult to understand when something goes wrong.

Second, things like "cachecore" are not synonymous with "http caching", and are extremely confusing names. I think that "uricore" is a good idea, but it's another non-obvious name. It's not a "core" at all, but rather an improved version of urlparse.

I think a lot of value can come from separating common functionality out so it can be reused in various places, but I don't think Python users are concerned day-to-day about HTTP not mapping 1:1 with WSGI. I can definitely say I'm not, and I'd like to think I have some knowledge as to what matters in [Python] web development.


Don't focus too much on the notion of the Django stuff — it's just vaguely mentioned as worth exploring at the bottom of the post.

Luckily, all of these changes will be purely non-destructive refactoring and won't have any negative effects.


"Luckily, all of these changes will be purely non-destructive and won't have any negative effects"

That sounds like what economists like to call a "pareto improvement", always a good thing :)


Well, I think the replacement in Django's case wouldn't be for general unit tests, but rather for the test client that you can use to test your own views.


That's exactly what I was talking about. Why would you even consider replacing the test client? What value could it possibly bring?

There's no need to test the HTTP lifecycle when you're performing a simple "here's the data for my view, execute it, and return the response to me". The HTTP client itself should be tested, as should Django's internals. There should be a clear and obvious separation of concerns.
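
For reference, here's roughly what using the existing test client looks like (a minimal sketch; the URL and data are placeholders):

    from django.test import Client

    client = Client()
    # Runs the view through Django's URL routing and middleware
    # without ever touching a real HTTP socket.
    response = client.get('/my-view/', {'q': 'hello'})
    assert response.status_code == 200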


I agree. While I think it's great that key members of the Python community are making sure key projects work together, I can't say that this will affect how I do things day to day in the short term.

In the longer term, I'm sure this interoperability will pay bigger dividends, which I'm sure is the reason behind all this anyway.


Exactly.


...of course, the introduction of LiveServerTestCase in Django 1.4 puts another twist in this discussion, and a pretty awesome one, if I may say. The possibilities for testing the request/response cycle with varying levels of comprehensiveness and veracity have never looked so sweet.
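
For instance, something like this becomes possible (a rough sketch; the URL path is a placeholder):

    import requests
    from django.test import LiveServerTestCase

    class HomepageTest(LiveServerTestCase):
        def test_homepage(self):
            # Django 1.4 boots a real threaded server, so this exercises
            # the full HTTP request/response cycle over a socket.
            response = requests.get(self.live_server_url + '/')
            self.assertEqual(response.status_code, 200)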


"So, instead of taking the WebOb approach of using WSGI as the common protocol between services, why not use HTTP itself? The rest of the world uses HTTP as the most-common denominator after all."

I agree with the assertion that HTTP is the most common denominator.

# done from memory

    In [1]: from webob import Request

    In [2]: print Request.blank("http://google.com/index.html")
    GET /index.html HTTP/1.0
    Host: google.com:80

Looks like HTTP to me.

    In [1]: from webob import Request

    In [2]: from paste.proxy import TransparentProxy

    In [3]: print Request.blank("http://google.com").call_application(TransparentProxy())

Yeah, the syntax could be less verbose in WebOb, and there was discussion about a nicer client API at one time, but it's fairly trivial to roll your own. Though Requests makes short work of common things, I think it actually abstracts some HTTP details more than WebOb does, in a way that leaves the user less informed about what's going on in the HTTP (Basic Auth, for example). That makes it its own protocol, just like WSGI.

WSGI, in my experience, makes light work of implementing HTTP: it's fairly easy to read the part of the HTTP spec you need to handle and then make that work in WSGI. I'm not so sure Requests could make that any easier without intervention from the library authors.

I think there's still a place for Requests, but I don't see it becoming the standard way to build HTTP requests and handle HTTP responses. I view it more along the lines of a library like mechanize, only nicer. It has its place.


Thanks for noting this (speaking as the WebOb author). I should say that I don't think WebOb is particularly contrary to the idea being put forward – instead, WebOb was written with this kind of concept in mind.

Requests certainly adds a lot of stuff, some that is below WSGI (e.g., Keep-Alive – WSGI specifically avoids that level of transport), a bunch of stuff that is more stateful (the session stuff), and a bunch of stuff that just felt like it was moving around too much to put in WebOb (e.g., OAuth).

That said, there's nothing keeping a client library from adding that. TransparentProxy for instance could instead be something that handles pipelining and statefulness, and perhaps also acts as a request factory. WSGI is rather helpful here because it does not try to touch those parts – it leaves things open for other tools to take control there. Also to make a better client library you could subclass Request and add functionality that is handy but hasn't been added to WebOb (WebTest does this for functional testing, as an example). Auth probably fits in there. Redirect handling could go in there somewhere too. But if you always redirect transparently then you've made permanent redirects useless.
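
A rough sketch of the kind of subclassing I mean (ClientRequest and its send method are invented names, not WebOb API):

    from webob import Request
    from paste.proxy import TransparentProxy

    class ClientRequest(Request):
        # In the spirit of WebTest: layer client conveniences
        # on top of the plain Request.
        def send(self):
            # TransparentProxy forwards the request to the actual
            # host named in the URL and returns a webob Response.
            return self.get_response(TransparentProxy())

    response = ClientRequest.blank("http://example.com/").send()
    print response.status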

WebOb itself certainly isn't a client library, its scope is largely limited to representing HTTP requests and responses. But if you are looking for HTTP-related tools, it has a lot of them.


I don't like the idea of everything using HTTP. I prefer ZMQ.

There is substantial work involved in processing all the HTTP overhead. On top of that, there's no load balancing included; it would likely be delegated to a whole new service or machine, when ZMQ could just provide it.

Instead of relying on Werkzeug and HTTP, I can use DictShield models, serialized to JSON or Python, and send those across ZMQ sockets. ZMQ sockets don't time out like HTTP; they are instead removed when a host goes down. PUB/SUB messaging is possible, as is round-robin routing. Lots of patterns are instantly available.
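
To illustrate, the client half of a REQ/REP exchange is only a few lines with pyzmq (assuming a REP service bound on port 5555):

    import zmq

    context = zmq.Context()
    socket = context.socket(zmq.REQ)
    socket.connect("tcp://localhost:5555")
    # Serialize a plain dict to JSON and ship it over the socket.
    socket.send_json({"model": "user", "id": 42})
    reply = socket.recv_json()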

Ease in using HTTP itself is welcome, but I don't want to use it in my infrastructure. It's a band-aid for WSGI instead of a solution.


I don't understand your point and why it's the top comment. Sure, 0MQ is great and can be used in place of HTTP in some cases. But if you want to serve requests to a web browser, you have to speak HTTP at some point. These are the use cases they are handling.

It's not even a matter of preference; it's a matter of a need for HTTP and they are addressing this need, while apparently accounting for 0MQ (he mentions it in the post).

Also, you mention down-thread, "HTTP is the weaker of the two transports"; it's only "weaker" if you ignore any use case that involves communicating with a web browser.


The involvement of the requests library signals that the consumer is not a web browser...


No, it doesn't. It signals that unifying multiple code-bases and efforts containing clear overlap is a pretty smart idea. This project will still be used to serve requests from a web browser. It will also be used to make requests to HTTP services that the user has no control over.

Are you just trying to make an ideological argument that HTTP is used in cases where 0MQ might be a better option? If so, I agree, but I just don't understand why you seem dismissive of this project when HTTP has its place and, therefore, this project has its place. The fact that 0MQ is mentioned in this post seems to indicate that the author understands 0MQ's place.


I prefer ZMQ when I have the choice. ZMQ is better from an ideological perspective and better for infrastructure.

In cases where you can only use HTTP, this will be great, but if I have a choice I'd use ZMQ instead.

I don't mean to sound dismissive, but I am opinionated. I think Kenneth and Armin do great work. I'm sure this project will be excellent.


I think our only disagreement is that you viewed it as a poor alternative to 0MQ, and I viewed it as something else. Just a different perspective, which I interpreted as taking away from what this announcement was really about, but I suspect that wasn't your intention.


No, but it does signal that the producer is a Web server. Those aren't going away anytime soon, either.


As I said before: Ease in using HTTP itself is welcome, but I don't want to use it in my infrastructure. It's a band-aid for WSGI instead of a solution.


This idea actually came up in our discussions.

Requests will have a set of request-level adapters which will let you define the protocol you're speaking, whereas urllib3 aspires to have connection-level adapters which let you define the transport you're using.

So, hypothetically, once we make this a reality, you could have a ZMQ connection transport paired with a JSON request adapter, and happily use Requests in whatever made-up scenario you like. :)
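
A toy sketch of that layering (every name here is hypothetical, not the eventual API):

    import json

    class JSONRequestAdapter(object):
        # Request-level: defines the protocol you're speaking.
        def encode(self, method, path, params):
            return json.dumps({"method": method, "path": path,
                               "params": params})

    class InMemoryTransport(object):
        # Connection-level: defines the transport you're using
        # (a stand-in here for a real ZMQ socket).
        def __init__(self):
            self.sent = []

        def send(self, payload):
            self.sent.append(payload)

    transport = InMemoryTransport()
    adapter = JSONRequestAdapter()
    transport.send(adapter.encode("GET", "/users", {"id": 42}))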


I'm curious about the constraints this would impose on alternate transports. How similar would they have to be to HTTP?

For example, zerorpc (http://github.com/dotcloud/zerorpc-python) supports a request-response pattern (using ZMQ REQ/REP) and can return a stream as a response. That maps nicely to HTTP, and I could see it "mounted" as a mock HTTP endpoint, which is very interesting.
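
For anyone unfamiliar, the zerorpc shape is roughly this (from memory of their README):

    import zerorpc

    class Greeter(object):
        def hello(self, name):
            return "Hello, %s" % name

    # Server side: bind a ZMQ endpoint and serve method calls.
    server = zerorpc.Server(Greeter())
    server.bind("tcp://0.0.0.0:4242")
    server.run()

and on the client:

    import zerorpc

    client = zerorpc.Client()
    client.connect("tcp://127.0.0.1:4242")
    print client.hello("world")   # positional args over REQ/REP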

But there are lots of features in zerorpc that I'm not sure how to map to an http library. For example:

* zerorpc methods must expose positional arguments; how would that map to HTTP query arguments?

* zerorpc arguments can be arbitrary data structures.

* zerorpc has no notion of headers.

* etc.

I'm excited at the idea of making cross-transport interop easier. At the same time, I wonder if the mold of "HTTP-ness" might stifle the ability to think outside the box when designing a transport.


Keeping in mind that with zerorpc, you control both the client and the server, none of this should be a problem. You can have positional http query arguments (?foo=1&bar=2) as long as you can guarantee that both the client and server treat them as such (not the case with web browsers, obviously).
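
E.g., both sides could agree that only the order of the pairs matters (a sketch using Python 2's urlparse):

    from urlparse import parse_qsl

    def positional_args(query_string):
        # parse_qsl preserves pair order, so "foo=1&bar=2" -> ['1', '2']
        return [value for _, value in parse_qsl(query_string)]

    assert positional_args("foo=1&bar=2") == ["1", "2"]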

Either way, it'd be great if we had a big list of possible use cases we'd want to support to keep in mind as we're designing this. Anyone want to start one? :D


I think it would be pretty cool to hook Brubeck up to ZeroRPC. Brubeck can do all the web processing and then communicate with Mongrel2 for HTTP while delegating other work through ZeroRPC.


Something else worth noting: from what I understand, 0MQ isn't safe to use over the internet. It's safe in your infrastructure, inside your firewalls and such, but not over the internet. So, in terms of making, e.g., a public-facing API, you don't really have a good choice except HTTP.

EDIT: This is incorrect, read below.


You may be referring to Zed Shaw's comment in his PyCon ZeroMQ talk -- but that issue has been resolved since then.


Thanks, I didn't realize that. Specifically, this link mentions it:

http://www.zeromq.org/area:faq#toc5

Appreciate the heads-up!


Seems like, without SSL, there are still several scenarios where a web service over HTTPS is more appropriate?


You can use TLS/SSL with ZeroMQ. The docs suggest doing so here: http://www.zeromq.org/topics:encryption


Thank you Kenneth for your contribution to the Python community. Never met but I've used your code. Keep on focusing on what is "right" for the long term and maybe someday we'll get there!


Considering Python's place in web development, it's surprising how bad its basic HTTP client capabilities are.
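
For example, a GET with basic auth in the stdlib means wiring up handlers by hand (a sketch with urllib2; the URL and credentials are placeholders):

    import urllib2

    password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, "http://example.com/", "user", "secret")
    opener = urllib2.build_opener(urllib2.HTTPBasicAuthHandler(password_mgr))
    response = opener.open("http://example.com/api")
    body = response.read()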


Agreed. That's why I use the cURL bindings combined with human_curl.


I don't know much about Python, but how does Requests/WSGI handle async/long-polling/WebSockets? It would be a shame if they redid the HTTP stack without solving these issues.


Requests has support for async requests using gevent: http://docs.python-requests.org/en/latest/user/advanced/#asy...
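
Roughly, per those docs at the time (the async module was the gevent-backed piece):

    from requests import async

    urls = ["http://python.org/", "http://httpbin.org/get"]
    # Build unsent requests, then fire them concurrently on greenlets.
    rs = [async.get(u) for u in urls]
    responses = async.map(rs)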

gevent has a wsgi module, but it is very low-level. I am still looking for a clean/nice way of doing it in Python.


    $ gunicorn -k gevent myapp:app   # "myapp:app" being your WSGI module:callable


What does "Django could potentially utilize the security features provided by httpcore" mean?

Would Django have to drop its own Request/Response objects and adopt the Requests/httpcore-provided ones? Was there any discussion with the Django core devs at PyCon regarding this?


The blog post says that Paul M. was part of the discussion.


I think this issue stems from the fact that while WSGI was a great concept, it hasn't been fully developed or maintained. We need a better set of abstraction layers for socket- and HTTP-based interfaces in Python.

And by the way, I love Kenneth's work, especially Requests. We use it in production and it's a joy to build on top of it.


What about Python 3? Other than the async submodule (due to the missing gevent dep), requests already supports Python 3, but Werkzeug does not. Is part of the plan for the combined effort to change this, or will requests lose Python 3 support again? (This seems unlikely, but I thought I better ask.)


All of these changes are targeted at Python 2.6–3.x.

A rewrite of many parts of Werkzeug is required to support Python 3. Might as well kill two birds with one stone :)


Sounds good. Thanks!

Edit: As a bonus link to make this post more interesting, the other day I caused the Python 3 version of requests to be packaged for Fedora 16 and 17: https://bugzilla.redhat.com/show_bug.cgi?id=807525


Awesome, thanks for the help! I really appreciate it :)


This is a great library, and Kenneth has worked really hard to improve it over time. I love Requests.


At the risk of sounding like a negative Nancy: the problem he is fixing is the artificial impedance mismatch that was constructed ON PURPOSE by Ian Bicking when he created WSGI.

Composability is one of the most powerful concepts we have. WSGI wasn't mistake-proofing middleware by making the protocols different; it was breaking composability, which is now being fixed by Kenneth et al.

Thank you! But let us not allow these mistakes to be made again.


Philip J. Eby was the author of WSGI, not myself. But which mismatches are you referring to? I've frequently heard these complaints, but generally they are fairly obscure things (which are hard to resolve because we can't form a quorum of people who actually care), and a lot of vague FUD.


I think you may not know as much about WSGI as you think... like, say, who originally created it.


The idea of proxying ZeroMQ over an HTTP layer is intriguing. I have never used the .*MQ variants, but I understand that request/response is pretty popular. At first glance, HTTP emulation seems like it might make a good, soft introduction to a different paradigm, but I'd love to hear from people who've actually used these libraries/protocols.


It's backwards: HTTP is the weaker of the two transports, though tried and true. Mongrel2 can be used to proxy HTTP to ZMQ nicely.

Check these slides for an overview of all the things ZMQ can do: http://j2labs.tumblr.com/post/5036176531/zeromq-super-socket...

Even load balancing is included, which means you run fewer services too.


I hacked up a Faraday (my ruby HTTP client) plugin that uses ZeroMQ: https://github.com/technoweenie/faraday-zeromq. I honestly don't know if it's worth it though. It's easy enough to write your own ZeroMQ stuff.

Example: https://github.com/dotcloud/zerorpc-python


Lovely how five of the *core projects on GitHub have 200+ followers while four of the repos are empty. If anything, it shows there's interest in the proposed plan. I'm curious what comes of it!


If it ain't Lisp, it's shit.



