
Python for Humans - craigkerstiens
http://python-for-humans.heroku.com/
======
loevborg
I was intrigued by the author's library "envoy", which is intended to provide
a more intuitive interface to running processes from python.
(<https://github.com/kennethreitz/envoy>)

The back story is that the older APIs that Python comes with -- os.popen and
os.system -- are deprecated. Programmers are urged to use the "subprocess"
module instead. Although this doesn't have the problems of the original
functions, it has a rather arcane interface, in particular if you want to read
the output (stdout or stderr) of a subprocess.

"envoy" seems to aim at fixing this, by providing sane defaults and being
optimized for the common case. However, these defaults have drawbacks of their
own.

1\. envoy defaults to keeping the process output in memory, as a giant string.
This can be a bad choice with regard to memory usage and performance.

2\. You can run several processes in a pipe using ("cat foo | grep bla"). But
otherwise as far as I can see, run() ignores regular Shell semantics, such as
quotes. I imagine this can lead to unexpected results. The amount of data
passed from one process to the next is capped at 10 MB -- recipe for bugs that
are hard to find.

3\. subprocess.call() accepts an array in the style of ["ls", "-l", "/mnt/My
SD card"]. This has obvious advantages over having to deal with escaping shell
characters. A good API should preserve this advantage over os.system().

4\. The defaults cannot be overridden, and no preparations have been made to
allow changing them. Of course this can be changed in the future. However, one
of the reasons the subprocess.* API is convoluted is that it allows all kinds
of flexibility, much of which is needed in many serious programs. It may be
difficult to add this flexibility to envoy at a later stage. The point is that
a flexible API is hard.
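
To make point 3 concrete, here is a minimal sketch of the list form. The child is the Python interpreter itself (via `sys.executable`) so the example runs anywhere; `echo_argument` and `CHILD` are names invented for this illustration.

```python
import subprocess
import sys

# A child program that simply writes back the first argument it received.
CHILD = "import sys; sys.stdout.write(sys.argv[1])"


def echo_argument(arg):
    """Pass `arg` to a child Python process as one argv entry and return what it saw."""
    # The list form hands each element to the child as a separate argument;
    # no shell is involved, so spaces and quotes need no escaping.
    out = subprocess.check_output([sys.executable, "-c", CHILD, arg])
    return out.decode()


if __name__ == "__main__":
    print(echo_argument("/mnt/My SD card"))
```

The path with spaces arrives in the child as a single argument, which is the advantage a friendlier API would ideally keep.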

None of this is to discourage this initiative, which seems to me a much-needed
improvement over Python's built-in API. Also, with a version number as low as
0.0.2, there is probably little need to worry about API compatibility.

~~~
Silhouette
> subprocess.call() accepts an array in the style of ["ls", "-l", "/mnt/My SD
> card"]. This has obvious advantages over having to deal with escaping shell
> characters.

Unless you're running on Windows, in which case IME it will corrupt your
carefully constructed parameters in completely inappropriate ways that can be
debugged only at the cost of (a) changing the call() to execute a script that
dumps the actual parameters supplied verbatim, and (b) at least an hour of
your life that you're never getting back.

This "feature" is about one step above MS Word's default autoreplace behaviour
in irritation level. What happened to "Explicit is better than implicit" and
"Special cases aren't special enough to break the rules"?

~~~
loevborg
On Unix, a new process is supplied argv[], an array that contains the
executable name and individual arguments. Clearly, supplying call() a list of
arguments is the right thing to do.

I seem to remember that on Win32, all you get is an argument string, and the
process is required to do the parsing itself. This is simply a different
model, and the Unix way is cleaner and easier to work with. It seems to me
reasonable to optimize for the saner (Unix) way of doing things instead. I
confess I don't understand your analogy to Word.

~~~
Silhouette
The big difference is that the UNIX shell does all kinds of
expansion/interpolation for you, while Windows basically leaves things alone.
You get an argv[] array in both cases if you're writing in C.

Neither of these approaches is inherently superior, they're just placing
responsibility for certain operations in different places. But the fact is
that because Windows programs don't have to assume a certain set of
conventions for their command line, many do not, and if you have the
misfortune to want to automate those using subprocess, you're in for a world
of pain (until you just give up and use the single-string version instead of
the list of arguments, having realised that this is enough to stop it messing
around with your carefully crafted strings and just pass them through
verbatim).
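
For anyone who wants to see what the list form turns into on Windows: as far as I know, subprocess builds the single command-line string with its undocumented helper `subprocess.list2cmdline`, which follows the Microsoft C runtime quoting rules. A small sketch (the helper can be called on any platform, it only builds a string):

```python
import subprocess

args = ["ls", "-l", "/mnt/My SD card"]

# list2cmdline joins the list into one string, quoting arguments that
# contain whitespace and escaping embedded double quotes.
command_line = subprocess.list2cmdline(args)
print(command_line)
```

A program that parses its own command line with different conventions will not necessarily recover the original list from this string, which would explain the surprises described above.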

------
mattdeboard
Great presentation on a library I've loved (and used) for a while. However,
according to slide 42 I need to rewrite the regex module? I'm so busy this
week though.

edit: If anyone wants to see a real-world refactor from HTTPLib/2 to requests,
I did so with Pysolr here:
[https://github.com/mattdeboard/pysolr/commit/db63d8910dec42d...](https://github.com/mattdeboard/pysolr/commit/db63d8910dec42ddfa8c8626dbbf73dbd3853c51)

------
y3di
The slides don't fit vertically on my screen, so some of the content is cut
off. There's no scroll bar so initially it was difficult to figure how to see
the info cut off from the bottom. Chrome's text zoom out didn't work either.

I had to highlight the text and drag downwards in order to see the content.
But it was annoying having to do this for every slide with a lot of content.

Otherwise, these libraries seem really useful. Thanks for this.

~~~
cbs
I had the same problem too. I had no idea it was supposed to be a slide show,
then once I gave my browser as much screen real estate as the site wanted I
had to figure out how to navigate the damn thing.

What's so wrong with just sticking a bunch of static slides on a page one after
another?

------
ovi256
Wow those libraries are indeed great, amazing compared to the standard libs.
Hope they'll be included in the standard libs one day.

~~~
stock_toaster
While python has always been 'batteries included' I think some of the
batteries should not have been included.

Libraries tend to move more quickly than the language and
interpreter/compiler. Tying them together, while convenient, often leads to
rot, clunky libraries, slow moving updates, and libraries being built _to the
interpreter/compiler_ instead of _to the needs of the users_.

I would like to see instead a somewhat canonical (widely accepted) list of the
highest quality libraries for a given set of needs, with information and
pro/con/caveats listed for each, instead of them being included in the
mainline trunk.

I really applaud what Kenneth Reitz has been doing lately.

\-- a very happy user of the `requests` library

------
kenneth_reitz
Audio is available here:

<https://github.com/kennethreitz/python-for-humans>

------
joshbaptiste
The portion that explains how subprocess shuns dev/ops guys in the
beginning is so true. Perl/Bash colleagues at work would basically ask me how
to perform output=`command`. Once they had seen subprocess, they would continue
writing their scripts in Bash/Perl.
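
For reference, the closest thing to output=`command` that I know of in the standard library is `subprocess.check_output` (available since Python 2.7). A minimal sketch, where `backtick` is a made-up wrapper name and the child is the Python interpreter so the example is portable:

```python
import subprocess
import sys


def backtick(argv):
    """Run a command and return its standard output as text, like output=`command`."""
    # check_output raises CalledProcessError if the command exits non-zero.
    return subprocess.check_output(argv).decode()


if __name__ == "__main__":
    output = backtick([sys.executable, "-c", "print('hello from the child')"])
    print(output.strip())
```

Unlike shell backticks, a failing command raises an exception here instead of silently returning partial output.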

~~~
brendano
Very true. I spent quite a while trying to learn subprocess, then gave up and
just use os.popen() now. It's a shame -- there are certain subprocess features
I really would like to have, but it's too hard to remember how to use it.

~~~
dagw
I used subprocess in a program as recently as this afternoon and I'm quite
baffled about the amount of negativity it attracts. Sure it's not quite as
simple as `ls -l` but it's not too far off and in return it gives you far more
control. Is there anything in particular that you found hard or confusing?

------
teyc
I blame GOF for making Python Standard Libs hard. The patterns described were
for an OO system where functions were not first class. Python didn't need to
be complicated.

If you have a look at the older libraries, most of them were written in a
procedural style. Not only that, it is very amenable to testing in the REPL.

    
    
        import smtplib
        s=smtplib.SMTP("localhost")
        s.sendmail("me@my.org",tolist,msg)
    

note the absence of doers like "Adapters", "Handler", "Manager", "Factory"

If you have a look at the XML library, roughly when "patterns" became popular,
this style of thinking infested standard library contributions. It also
coincides with a time when camelCased function names crept into the python
standard library.

Here's one in xml/dom/pulldom.py:

    
    
        self.documentFactory = documentFactory
    

Once you see this, you know you are in for some subclassing. You can no longer
REPL your way to figure out how things work, and you now have to consult the
manual.

Here's more pain from libraries of the same era, some of which I'd argue are
un-Pythonic:

    
    
        #xml/sax/xmlreader.py:    
        def setContentHandler(self, handler):
    
        #wsgiref/simple_server.py:
        class ServerHandler(SimpleHandler):
    
        #urllib2.py:
        class HTTPDigestAuthHandler(BaseHandler,
           AbstractDigestAuthHandler):
    

The last example is especially jarring. Abstract classes have a place in
strongly typed world to declare interfaces, and help with vtable-style
dispatch. In Python, where you have duck-typing and monkey patching, a class
that virtually "does nothing" on its own stands out like a guy in a tux at a
beach party.

Even logging is infected by the same over-patterning. logging/__init__.py:

    
    
        class StreamHandler(Handler):
        LoggerAdapter(someLogger, dict(p1=v1, p2="v2"))
    

"Managers" - what a pain when plain function handles would have done the job.
Does this name even tell you what task the class performs?

    
    
        #multiprocessing/managers.py:
        class BaseManager(object):
    

If anyone remembers, Java _had_ to do OO in a big-style with OO everywhere --
there were no alternatives.

Initially, buttons had to be subclassed just to handle click events, since
functions were not first class objects. Then someone came up with a
MouseListener interface, which proved too unwieldy to handle a single click.
So the MouseEventAdapters came into being.

Therefore, to handle a click in a "pattern" manner involves

an anonymous class

which subclasses MouseAdapter

which implements MouseListener,

which overrides MouseClick.

Publishing how industry solves this problem of "MouseClick" over and over as a
pattern [ _design pattern is a general reusable solution to a commonly
occurring problem within a given context in software design_ ] only gives
legitimacy to an approach that has dubious wider applicability.

Heavens help the future developers who are forced to do it because it is now
recognized as being industrially "good practice" and codified in a renowned
book.

It isn't!

It was a style that was forced by the constraints of a language.

This is neither pythonic nor necessary:

    
    
        panel.addMouseListener
        (
          new MouseAdapter ()
          {
            public void mouseEntered (MouseEvent e) {
              System.out.println (e.toString ());
            }
          }
        );
    

Embracing "foolish, unschooled" thinking, this would be rendered in Python as:

    
    
       def mouseEntered(event):
           print(event)
       panel.mouseEntered = mouseEntered
     

or for multiple event handlers

    
    
       panel.mouseEntered.append(mouseEntered)
    

This style of API again allows effective exploration on the REPL.
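
A rough sketch of what the list-of-handlers version could look like end to end. `Panel` and `fire_mouse_entered` are hypothetical names made up for this illustration, not part of any real toolkit:

```python
class Panel(object):
    """Hypothetical widget whose events are plain lists of callables."""

    def __init__(self):
        # Handlers are called in the order they were appended.
        self.mouseEntered = []

    def fire_mouse_entered(self, event):
        # Call every registered handler and collect the return values.
        return [handler(event) for handler in self.mouseEntered]


def describe(event):
    return "mouse entered at %s" % (event,)


panel = Panel()
panel.mouseEntered.append(describe)
results = panel.fire_mouse_entered((10, 20))
print(results)
```

Everything here can be typed into the REPL line by line, and the handler list can be inspected like any other list.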

~~~
simonw
I've been programming python for nearly 10 years, but your comment just helped
me clarify a thought that I've had for ages but have never been able to put in
to words before: a well designed Python API is one that can be effortlessly used
within the REPL. And that's why urllib2 sucks.

~~~
gruseom
This is the thing that got me off OO altogether. Typical OO code is hard to
call from the REPL. You have to know what objects to construct before
constructing an object that has the method you want to call, which probably
requires more objects to be constructed and passed as args, etc. Yet more
often than not the function in question is simple enough that it ought to be
easy to call without all that baggage. That got me so annoyed that I finally
just started making all my functions easy to call from the top level and never
looked back. You're right that this leads to better designs.
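
As an illustration of that object-construction chain, here is roughly what setting up HTTP basic auth takes with the standard library. This is a sketch using the Python 3 names (urllib2 was renamed `urllib.request`); the URL and credentials are placeholders and nothing is actually fetched:

```python
import urllib.request  # urllib2 was renamed urllib.request in Python 3

# Placeholder URL and credentials, for illustration only.
URL = "http://example.com/protected"

# Three objects have to exist before a single authenticated request can be made.
password_manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
password_manager.add_password(None, URL, "user", "secret")
auth_handler = urllib.request.HTTPBasicAuthHandler(password_manager)
opener = urllib.request.build_opener(auth_handler)

# Only now could the request be sent, with opener.open(URL).
print(type(opener).__name__)
```

None of these steps is hard on its own, but you need to know all three class names before you can try anything at the prompt.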

------
dspeyer
Anybody have a text version of this? I got maybe 30 slides in before I got too
annoyed to continue.

~~~
bpierre
You can read the Markdown file:
[https://github.com/kennethreitz/python-for-humans/blob/maste...](https://github.com/kennethreitz/python-for-humans/blob/master/python-for-humans/1_content.md)

~~~
jonmulholland
Just use arrow keys on your keyboard to navigate.

------
socratic
This presentation brings up a tangential point that has always confused me:
how error-prone is starting a subprocess, really?

I agree with the author's goals of making common tasks easier and more
obvious. urllib2 is an easy target, as it was added to the standard library
over a decade ago, long before REST was something people talked about. The
best tools for packaging, versioning, and testing have always been a bit
ambiguous in any language, including Python.

However, the author points out something that has always bothered me about
Python: it is way harder to start a subprocess with an external command in
Python than almost any other language. This has been true whether using _sys_
or _os_ or even _subprocess_ , which is quite recent.

I always felt that this had something to do with the constant warnings in the
documentation about how a pipe between the subprocess and the Python process
might fill and cause the subprocess to block. Or how running the program
through shell rather than exec or something might cause some sort of security
issue. Are these real issues that other languages ignore in the name of user
convenience, or has Python just never been able to make the right API (as the
author seems to argue)?
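
On the pipe-filling warning specifically, a small sketch of the situation the documentation is describing. The child writes about 1 MB to stdout, which I am assuming is more than a typical OS pipe buffer holds; `run_and_collect` and `CHILD` are names invented for the example:

```python
import subprocess
import sys

# Child that writes about 1 MB to stdout.
CHILD = "import sys; sys.stdout.write('x' * 1000000)"


def run_and_collect():
    proc = subprocess.Popen([sys.executable, "-c", CHILD],
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # Calling proc.wait() here instead could hang: the child blocks once the
    # pipe is full, while the parent waits without reading.
    # communicate() reads both pipes until EOF and then waits for the child.
    out, err = proc.communicate()
    return proc.returncode, out, err


if __name__ == "__main__":
    code, out, err = run_and_collect()
    print(code, len(out), len(err))
```

So the issue is real, but it only bites when the parent waits for the child before draining its pipes.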

~~~
pnathan
I have never been able to figure out how - in Python - to be able to stream
asynchronously both stdout and stderr from the subprocess, both printing both
of them as well as writing the data to a file.

~~~
fdkz
I'm using the mkfifo method on linux/macosx:

    
    
        import os
        import sys
        import time
        import subprocess
    
        # turn off stdout buffering. otherwise we won't see things like wget progress-bars that update without newlines.
        sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)
    
        pipename = "tempfile"
    
        if os.path.exists(pipename):
            os.remove(pipename)
    
        # create a pipe. one side is connected to the ping process, other side is connected to python.
        os.mkfifo(pipename)
        read_fd = os.open(pipename, os.O_RDONLY|os.O_NONBLOCK)
        writer = open(pipename, "w+")
    
        proc = subprocess.Popen("ping www.google.com", cwd=sys.path[0], stdout=writer, stderr=writer, shell=True)
    
        while 1:
            try:
                # nonblocking poll data from the external process.
                s = os.read(read_fd, 1024)
                if s:
                    sys.stdout.write(s)
            except OSError:
                pass
            # sidenote: minimum sleep time is 1/64 seconds on many windows pc-s.
            time.sleep(0.1)
    
        # remember to remove the pipe "tempfile"

~~~
fdkz
Replying to myself. Using mkfifo is not necessary:

    
    
        import os, sys, time, subprocess, fcntl
        sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)
        read_fd, write_fd = os.pipe()
        fcntl.fcntl(read_fd, fcntl.F_SETFL, os.O_NONBLOCK) # don't know of any windows equivalent for this line
        proc = subprocess.Popen("ping www.google.com", cwd=sys.path[0], stdout=write_fd, stderr=write_fd, shell=True)
        while 1:
            try:
                s = os.read(read_fd, 1024)
                if s:
                    sys.stdout.write(s)
            except OSError:
                pass
            time.sleep(0.1)
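
Both snippets above are Python 2 style (Python 3 rejects `os.fdopen(..., 'w', 0)` in text mode) and they merge stderr into stdout. One alternative, sketched here for Python 3 under the assumption that line-by-line output is good enough, is one reader thread per pipe. `tee_process` and `pump` are made-up names, and the child is a Python one-liner standing in for something like ping:

```python
import os
import subprocess
import sys
import tempfile
import threading


def tee_process(argv, log_path):
    """Run argv, echoing stdout and stderr as they arrive and appending both to log_path."""
    lock = threading.Lock()

    def pump(pipe, console, log):
        # Iterate until the child closes its end of the pipe.
        for line in pipe:
            with lock:  # keep the two threads from interleaving within a line
                console.write(line)
                console.flush()
                log.write(line)
        pipe.close()

    with open(log_path, "a") as log:
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE, universal_newlines=True)
        threads = [
            threading.Thread(target=pump, args=(proc.stdout, sys.stdout, log)),
            threading.Thread(target=pump, args=(proc.stderr, sys.stderr, log)),
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return proc.wait()


# Demo: the child writes one line to each stream.
child = [sys.executable, "-c",
         "import sys; print('to stdout'); sys.stderr.write('to stderr\\n')"]
log_path = os.path.join(tempfile.mkdtemp(), "child.log")
exit_code = tee_process(child, log_path)
with open(log_path) as f:
    log_text = f.read()
```

This does not handle progress bars that update without newlines, which the unbuffered byte-level reads above were written for.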

------
prolepunk
On installing python -- the most practical way to work with it, is to have a
moderately recent os-level python install and then build all the other python
versions from source if required --
<https://github.com/collective/buildout.python>

After that use virtualenv with virtualenvwrapper.

------
drivingmenuts
Hangs on the first slide unless I'm doing something really wrong.

~~~
kenneth_reitz
Try the arrow keys

~~~
brown9-2
A clear sign that the one doing something wrong is the website's designer.

Also, it breaks the back button.

------
prolepunk
I wish the developers of the requests module would stop changing its API -- code
that was working just fine with 0.6.4 suddenly began finding missing methods
in version 0.8.5

~~~
kenneth_reitz
All changes are for the best, I promise. The API will be stable by 1.0.

------
nvictor
seems the guy nailed down many issues i have had in the past :)

------
rmc
This makes a lot of good points, but some bad ones.

Esp. the "installing python" one. Just use your package manager to install all
the versions you need.

And for "Packaging and Dependencies", just use pip.

~~~
tbatterii
What's the package manager for windows?

Mostly I agree with you, but installing python on windows is a PITA compared
to *nix.

PIP works until you need something that requires compiling extensions which is
also a PITA on windows.

IIRC easy_install can install binaries, but I could be wrong about that since
I don't do windows development anymore and thus pip works for me 99% of the
time.

Regardless, my point is, though I think we all wish it was as simple as using
your package manager and pip, it simply isn't that way for everyone.

And it probably never will be, but we can always make it simpler. :)

~~~
rmc
I would not use Windows for servers.

~~~
j_baker
Neither would I. But some people are stuck in jobs where they have to support
Windows. If that's never been the case for you, consider yourself lucky.

------
dhalexander
The Python standard library has gotten worse over time, as it got loaded up
with more and more features, obfuscating the common use cases. The irony now
is that to do simple, everyday things (like http requests) you are now better
off installing a third party package like "requests" than using the standard
library. So much for "batteries included."

The standard library needs a reboot. Why not do it in Python 3? Nobody's using
it yet anyway ;-)

------
arjn
I can confirm that the python subprocess api is a pain to use and also
documented poorly. I recently had to use (no choice) python 2.5.x to write a
script that extensively called external programs and ran into several
problems. It's strange that a language such as Python, which I find so easy to
use in many cases, does not already have a good (as in simple, safe and well
documented) subprocess API.

------
mixmastamyk
I guess I agree things could be simpler, although the cries of "garbage!" were
a bit much. I wrote a wrapper function around urllib2 about 5 years ago and
haven't looked back.

~~~
adeelk
Wrappers are handy, but as soon as you need something beyond the basic use
case they become useless. What’s great about Requests is that it seems to have
minimal leakiness as an abstraction over HTTP.

~~~
mixmastamyk
My wrapper is quite robust after almost five years. e.g. It can save headers
to alternate data streams (on NTFS) for proper 304 handling. If there is
anything left to implement it could be done pretty quickly.

Still I like these new projects; it's a shame they missed the python 3.x boat
by only a year or two. That would have been a great time to include them in
the stdlib.

~~~
adeelk
It sounds like you should release this wrapper :)

~~~
mixmastamyk
It is online, but embedded in my employer's software. I could probably do some
extraction of it, but not sure if it is worth the effort.

------
pbreit
It's really inexcusable that in 2012 (or 1992 even) a language that otherwise
is well-suited for internet programming does not come with a first class
httpclient.

------
sktrdie
What did they use to make the presentation?

~~~
shuzchen
Looks like markdown and a sinatra based presentation framework. See the code
at: <https://github.com/kennethreitz/python-for-humans>

------
tuananh
The name says it all. This is much more readable compared to urllib2

------
edna_piranha
yes please and thank you muchly. _puts away yak shaving contraption_

------
josefrichter
Why don't you just switch to Ruby?

~~~
tdfx
I didn't find making moderately complex HTTP requests in Ruby to be much more
fun than urllib2 in Python. Do you have a library you recommend?

~~~
josefrichter
I like httparty or mechanize, but feel free to make your own choice over here
<https://www.ruby-toolbox.com/categories/http_clients>

~~~
tdfx
I've tried rest-client, em-http-request, and mechanize. I don't recall their
specific failings except for mechanize, which seemed to work awesome in 95% of
cases and produce cryptic errors with the other 5%. I will take a look at
httparty, though, thanks.

------
skrebbel
I like the presentation, but for me it just underlines how it's smart to stay
far away from Python. It's great that it's improving, but many other languages
have a much better library/API situation than Python has had for years
already. Will Python catch up fast enough?

~~~
ak217
It's quite the opposite, actually. Python already has an extremely good and
very capable standard library, arguably better than any other language. Python
also has a community with a strong sense of what kind of API design best fits
with the Python philosophy (which usually boils down to readability and
consistency). So there are many efforts going on to improve the standard
libraries.

For example, the standard library already has urllib2, which provides HTTP
support much better than most other languages' standard libraries. But
Requests is a rewrite of it, taking ease of use to a whole new level without
compromising any of the features. There is no equivalent in any other
language.

~~~
frodwith
"There is no equivalent in any other language."

I don't know, LWP is pretty easy to use.

~~~
pbreit
Does Perl come with LWP?

