Python for Humans (heroku.com)
427 points by craigkerstiens on Jan 3, 2012 | hide | past | favorite | 106 comments



I was intrigued by the author's library "envoy", which is intended to provide a more intuitive interface to running processes from python. (https://github.com/kennethreitz/envoy)

The back story is that the older APIs that Python comes with -- os.popen and os.system -- are deprecated. Programmers are urged to use the "subprocess" module instead. Although this doesn't have the problems of the original functions, it has a rather arcane interface, in particular if you want to read the output (stdout or stderr) of a subprocess.

"envoy" seems to aim at fixing this, by providing sane defaults and being optimized for the common case. However, these defaults have drawbacks of their own.

1. envoy defaults to keeping the process output in memory, as a giant string. This can be a bad choice with regard to memory usage and performance.

2. You can run several processes in a pipe (e.g. run("cat foo | grep bla")). But otherwise, as far as I can see, run() ignores regular shell semantics such as quotes. I imagine this can lead to unexpected results. The amount of data passed from one process to the next is capped at 10 MB -- a recipe for bugs that are hard to find.

3. subprocess.call() accepts an array in the style of ["ls", "-l", "/mnt/My SD card"]. This has obvious advantages over having to deal with escaping shell characters (see the sketch after this list). A good API should preserve this advantage over os.system().

4. The defaults cannot be overridden, and no preparations have been made to allow changing them. Of course this can change in the future. However, one of the reasons the subprocess.* API is convoluted is that it allows all kinds of flexibility, much of which is needed in serious programs. It may be difficult to add this flexibility to envoy at a later stage. The point is that designing a flexible API is hard.
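
For illustration, a minimal sketch of the contrast in point 3 (not envoy's API; the SD-card path is just an example):

    import subprocess

    # list form: arguments reach the child verbatim, so spaces and
    # shell metacharacters need no escaping
    subprocess.call(["ls", "-l", "/mnt/My SD card"])

    # shell-string form: the same command needs manual quoting, and a
    # path containing quotes or $ would break it
    subprocess.call("ls -l '/mnt/My SD card'", shell=True)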

None of this is to discourage this initiative, which seems to me a much-needed improvement over Python's built-in API. Also, with a version number as low as 0.0.2, there is probably little need to worry about API compatibility.


> subprocess.call() accepts an array in the style of ["ls", "-l", "/mnt/My SD card"]. This has obvious advantages over having to deal with escaping shell characters.

Unless you're running on Windows, in which case IME it will corrupt your carefully constructed parameters in completely inappropriate ways that can be debugged only at the cost of (a) changing the call() to execute a script that dumps the actual parameters supplied verbatim, and (b) at least an hour of your life that you're never getting back.

This "feature" is about one step above MS Word's default autoreplace behaviour in irritation level. What happened to "Explicit is better than implicit" and "Special cases aren't special enough to break the rules"?


On Unix, a new process is supplied argv[], an array that contains the executable name and individual arguments. Clearly, supplying call() a list of arguments is the right thing to do.

I seem to remember that on Win32, all you get is an argument string, and the process is required to do the parsing itself. This is simply a different model, and the Unix way is cleaner and easier to work with. It seems to me reasonable to optimize for the saner (Unix) way of doing things instead. I confess I don't understand your analogy to Word.


The big difference is that the UNIX shell does all kinds of expansion/interpolation for you, while Windows basically leaves things alone. You get an argv[] array in both cases if you're writing in C.

Neither of these approaches is inherently superior; they're just placing responsibility for certain operations in different places. But because Windows programs don't have to assume a particular set of command-line conventions, many don't, and if you have the misfortune to want to automate those using subprocess, you're in for a world of pain -- until you give up and use the single-string version instead of the list of arguments, having realised that this is enough to stop subprocess messing with your carefully crafted strings and just pass them through verbatim.
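
A minimal sketch of that contrast ('weird.exe' and its flag are hypothetical): on Windows, a list is first re-quoted by Python (via subprocess.list2cmdline) before being handed to CreateProcess, while a plain string is passed through verbatim.

    import subprocess

    # list form: Python applies its own quoting rules to each element first
    subprocess.call(['weird.exe', '/flag:some value'])

    # string form: the command line reaches CreateProcess exactly as written
    subprocess.call('weird.exe /flag:"some value"')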


> it has a rather arcane interface, in particular if you want to read the output (stdout or stderr) of a subprocess.

  output = subprocess.check_output(my_command)


Great presentation on a library I've loved (and used) for a while. However, according to slide 42, I need to rewrite the regex module? I'm so busy this week, though.

edit: If anyone wants to see a real-world refactor from HTTPLib/2 to requests, I did so with Pysolr here: https://github.com/mattdeboard/pysolr/commit/db63d8910dec42d...


The slides don't fit vertically on my screen, so some of the content is cut off. There's no scroll bar, so initially it was difficult to figure out how to see the info cut off at the bottom. Chrome's text zoom-out didn't work either.

I had to highlight the text and drag downwards in order to see the content. But it was annoying having to do this for every slide with a lot of content.

Otherwise, these libraries seem really useful. Thanks for this.


I had the same problem too. I had no idea it was supposed to be a slide show, then once I gave my browser as much screen real estate as the site wanted I had to figure out how to navigate the damn thing.

What's so wrong with just sticking a bunch of static slides on a page one after another?


Had the same problem, but Chrome zoomed out for me (Chrome 16 -> [command -] on a Mac).


Wow, those libraries are indeed great -- amazing compared to the standard libs. Hope they'll be included in the standard libs one day.


While Python has always been 'batteries included', I think some of the batteries should not have been included.

Libraries tend to move more quickly than the language and interpreter/compiler. Tying them together, while convenient, often leads to rot, clunky libraries, slow-moving updates, and libraries being built to the interpreter/compiler instead of to the needs of the users.

I would like to see instead a somewhat canonical (widely accepted) list of the highest quality libraries for a given set of needs, with information and pro/con/caveats listed for each, instead of them being included in the mainline trunk.

I really applaud what Kenneth Reitz has been doing lately.

-- a very happy user of the `requests` library


+1

I'd like to see some community effort to build a collection of similar "better than the standard" libs.

Which, at some point, could replace the standard libs. Or be the de facto standard, a pip install call away...



The portion at the beginning that explains how subprocess shuns dev/ops guys is so true. Perl/Bash colleagues at work would basically ask me how to perform output=`command`. Once they'd seen subprocess, they would continue writing their script in Bash/Perl.


  from subprocess import check_output as qx

  output = qx(['command', 'arg1'])


Very true. I spent quite a while trying to learn subprocess, then gave up; I just use os.popen() now. It's a shame -- there are certain subprocess features I really would like to have, but it's too hard to remember how to use it.
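
(For reference, the old-style one-liner in question:)

    import os
    # deprecated, but hard to beat for brevity
    output = os.popen('ls -l').read()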


I used subprocess in a program as recently as this afternoon and I'm quite baffled about the amount of negativity it attracts. Sure it's not quite as simple as `ls -l` but it's not too far off and in return it gives you far more control. Is there anything in particular that you found hard or confusing?


I've done a considerable amount of work with subprocess. Its API isn't amazing, but it does have quite a few great features, and once you spend a little time with it, it really isn't too bad. Some of the piping features, along with the shell arguments, work quite well.


If backticks are good enough for them, then they don't need the more complex use cases that Popen allows, so just tell them to use check_call or check_output. As far as they should be concerned, the subprocess module has two functions that are straightforward to use.

And those functions are more convenient than the default behavior of backticks, because they raise an exception for you if the subprocess fails.

Python:

  import subprocess
  output = subprocess.check_output('command')

Perl:

  $output = `command`
  die "failed: $output" if $?


If you commit to building an infrastructure library, you can make a very nice and powerful interface into subprocess that makes life much easier.


I blame GOF for making Python Standard Libs hard. The patterns described were for an OO system where functions were not first class. Python didn't need to be complicated.

If you have a look at the older libraries, most of them were written in a procedural style. Not only that, this style is very amenable to testing in the REPL.

    import smtplib
    s = smtplib.SMTP("localhost")
    s.sendmail("me@my.org", tolist, msg)
Note the absence of doers like "Adapter", "Handler", "Manager", "Factory".

If you have a look at the XML library, roughly when "patterns" became popular, this style of thinking infested standard library contributions. It also coincides with a time when camelCased function names crept into the python standard library.

Here's one in xml/dom/pulldom.py:

    self.documentFactory = documentFactory
Once you see this, you know you are in for some subclassing. You can no longer REPL your way to figure out how things work, and you now have to consult the manual.

Here's more pain from libraries of the same era, some of these I'd argue un-Pythonic:

    #xml/sax/xmlreader.py:    
    def setContentHandler(self, handler):

    #wsgiref/simple_server.py:
    class ServerHandler(SimpleHandler):

    #urllib2.py:
    class HTTPDigestAuthHandler(BaseHandler,
       AbstractDigestAuthHandler):
The last example is especially jarring. Abstract classes have a place in strongly typed world to declare interfaces, and help with vtable-style dispatch. In Python, where you have duck-typing and monkey patching, a class that virtually "does nothing" on its own stands out like a guy in a tux at a beach party.

Even logging is infected by the same over-patterning. logging/__init__.py:

    class StreamHandler(Handler)
    LoggerAdapter(someLogger, dict(p1=v1, p2="v2"))
"Managers" - what a pain when plain function handles would have done the job. Does this name even tell you what task the class performs?

    #multiprocessing/managers.py:
    class BaseManager(object)
If anyone remembers, Java had to do OO in a big-style with OO everywhere -- there were no alternatives.

Initially, buttons had to be subclassed just to handle click events, since functions were not first-class objects. Then someone came up with a MouseListener interface, which proved too unwieldy for handling a single click. So the MouseAdapters came into being.

Therefore, to handle a click in a "pattern" manner involves

an anonymous class

which subclasses MouseAdapter

which implements MouseListener,

which overrides mouseClicked.

Publishing how the industry solves this "MouseClick" problem over and over as a pattern [a design pattern is "a general reusable solution to a commonly occurring problem within a given context in software design"] only gives legitimacy to an approach that has dubious wider applicability.

Heaven help the future developers who are forced to do it because it is now recognized as industrially "good practice" and codified in a renowned book.

It isn't!

It was a style that was forced by the constraints of a language.

This is neither pythonic nor necessary:

    panel.addMouseListener
    (
      new MouseAdapter ()
      {
        public void mouseEntered (MouseEvent e) {
          System.out.println (e.toString ());
        }
      }
    );
Embracing "foolish, unschooled" thinking, this would be rendered in Python as:

   def mouseEntered(event):
     print event
   panel.mouseEntered = mouseEntered
 
or for multiple event handlers

   panel.mouseEntered.append(mouseEntered)
This style of API again allows effective exploration on the REPL.


I've been programming Python for nearly 10 years, but your comment just helped me clarify a thought that I've had for ages but have never been able to put into words before: a well-designed Python API is one that can be effortlessly used within the REPL. And that's why urllib2 sucks.


This is the thing that got me off OO altogether. Typical OO code is hard to call from the REPL. You have to know what objects to construct before constructing an object that has the method you want to call, which probably requires more objects to be constructed and passed as args, etc. Yet more often than not the function in question is simple enough that it ought to be easy to call without all that baggage. That got me so annoyed that I finally just started making all my functions easy to call from the top level and never looked back. You're right that this leads to better designs.


Writing good Python has been more of a challenge for me lately. The more C#/Java I write, the harder it is for me to forget dependency injection, factories, etc. and just feel the freedom Python provides.


    > If anyone remembers, Java had to do OO in a
    > big-style with OO everywhere -- there were no
    > alternatives.
You can write Java that isn't heavily OO, but you have to implement alternatives to sections of the stdlib that most people assume or take for granted.

Related to what you're saying about the GUI, I'd be interested to see a detailed summary of what the Lighthouse people did, and how it was different from Java. I've found that NeXT-tradition stuff -- despite claims that it's heavily OO -- in fact tends to err away from subclassing towards composition. I suspect the Lighthouse interface patterns did too.


I've struggled through a few Cocoa tutorials. I think the programmer creates a delegate object which handles events. These delegates are assigned to UI elements, similar to how one would addMouseListener to a Java UI element. I may be entirely wrong here. Perhaps someone who knows can put the facts straight.


UI elements typically get assigned an instance of an object (the target) and a symbol representing the object's method (the selector).

In Cocoa touch, it's:

    [mybutton addTarget:someObjectMaybeSelf action:@selector(onClick) forControlEvents:UIControlEventTouchUpInside];
It's broken into separate methods in Cocoa but it's similar. You don't need to and usually don't create separate classes for targets.

The framework usually uses delegates when a data source is needed (for example, to build up a table you need to specify how many rows there are, what goes into each row, etc.).

Cocoa touch is really clean. Cocoa is still nice but a little messy.


I'm making an attempt [1] at simple logging. Thoughts on the basic design so far?

[1] https://github.com/peterldowns/lggr


The acid test is to see if you can avoid terms like "Writer", "Printer", and whether someone is able to explore your API using the REPL.

    >>> import log
    >>> log.log(log.CRITICAL, 'PrintModule', 'Printer failure', sys.exc_info())
    PrintModule: Printer failure
    >>> print log.configuration
    [<function printToStdErr at 0xcafebad>]
    >>> log.configuration.clear()
    >>> print log.configuration
    []
    >>> def send_mail(level, module, message, exception):
    ...     import smtplib
    ...     smtplib.SMTP('mail.test.com').sendmail('syslog@test.com',
    ...         'no-reply@mysystem.com', module + ' ' + message)
    ...
    >>> log.configuration.append(send_mail)
    >>> log.log(log.CRITICAL, 'PrintModule', 'Printer failure', sys.exc_info())
    etc.
Next, configuration must be straightforward.

    >>> import log
    >>> log.configuration.extend([
    ...  log.sendmail,
    ...  log.syslog,
    ...  log.rotatinglog])
    ...
    >>> log.sendmail.to = 'admin@test.com'
    
Notice how this can easily be captured in a file you might name logconfig.py.

You might be wondering how to assign an attribute to a function, like sendmail.to?

In this particular case, sendmail is actually a class that masquerades as a function.

    >>> class Sendmail:
    ...    to = None
    ...
    ...    def __call__(self, level, module, msg, exc):
    ...       # etc.
    ...
    >>> sendmail = Sendmail()
This way, the most common use case -- being notified when something happens -- is easy to set up, while more complicated ones are only a short step away.


You can assign arbitrary attributes to a plain function:

  >>> def f():
  ...   print f.something
  ...
  >>> f.something = 'blah'
  >>> f()
  blah


You can assign an attribute to a function in Python, since functions have a __dict__. For example:

    >>> def foo(): pass
    ...
    >>> foo.a = "asdf"
    >>> foo.a
    'asdf'
    >>> foo.__dict__
    {'a': 'asdf'}

or alternatively:

    >>> def bar():
    ...     bar.a = "asdf"
    ...
    >>> bar.a
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'function' object has no attribute 'a'
    >>> bar()
    >>> bar.a
    'asdf'


Thanks. I like the idea of classes masquerading as functions. And easier configuration. I had actually started writing this 2 days ago as something just for me, but after reading this decided to try something new.


With something that can happen as often as logging, make sure that you test it when accessed from multiple Python threads, and pound the crap out of it in various typical configurations. Python's default logging can be a real dog, and in at least one project I was working on a few years ago it turned out to be a real bottleneck.


Also keep the following in mind: you don't need Singletons in Python. The following is a code smell in Python:

     log.getLogger()
Use a module instead.
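
A minimal sketch of the module-as-singleton idea -- a hypothetical log.py, not the stdlib logging:

    # log.py -- module-level state plays the singleton's role;
    # every "import log" yields the same module object
    handlers = []

    def log(level, module, message):
        for handler in handlers:
            handler(level, module, message)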


Your example with "MouseClick" succinctly explains what I feel is wrong with much of software development today that tries to follow "modern" OO practices. Blame it on the language, or on people who try to mold the world into familiar casts at any cost by overusing patterns?


The problem with Java is that there aren't any strongly typed functions, only strongly typed classes.

Therefore, handling a mouseClick event requires a class. However, it doesn't make sense to have an IMouseClick interface with only one method, right? That would seem too wasteful.

So the natural thing to do is to group all the mouse events into a single interface.

Before you know it, you have the abomination of MouseClickListeners, MouseAdapters, and anonymous classes.

About overusing patterns: GOF is really a book of anti-patterns that is particularly useful for language designers, because it points out the glaring contortions that people have to go through to get their work done.

On the other hand, being a good programmer requires learning and subsequent unlearning of GOF patterns. I'm reminded of Bruce Lee when he started with Wing Chun, but then found the "form" too rigid. Quote from http://www.cavendishscience.org/phys/brucelee/brucelee.htm

    In describing his style, Bruce Lee made a clear 
    distinction between "no form" and "having no
    form." To have no form was to transcend form from a 
    position of mastery that enabled one to see when the
    rules could be -- in fact, needed to be -- broken. No form 
    was highly disciplined in its execution. In essence, 
    to have no form was to have access to all forms; to
    understand all forms at their most essential level
    and to see and be able to act on the connections among
    forms. On the other hand, "having no form" implied an 
    undisciplined approach, an inability to master any form.

    Lee discovered that he could not teach no form directly 
    to novices in the martial arts. A student first had to 
    master the form of karate, tae kwon do, or some other 
    school; only from a position of mastery could the 
    student begin to experiment with abandoning the form.


Anybody have a text version of this? I got maybe 30 slides in before I got too annoyed to continue.



Just use arrow keys on your keyboard to navigate.


Turn off your stylesheets and scroll down (the TOC links don't seem to work anymore when you do that).


This presentation brings up a tangential point that has always confused me: how error-prone is starting a subprocess, really?

I agree with the author's goals of making common tasks easier and more obvious. urllib2 is an easy target, as it was added to the standard library over a decade ago, long before REST was something people talked about. The best tools for packaging, versioning, and testing have always been a bit ambiguous in any language, including Python.

However, the author points out something that has always bothered me about Python: it is way harder to start a subprocess running an external command in Python than in almost any other language. This has been true whether using os.system, os.popen, or even subprocess, which is quite recent.

I always felt that this had something to do with the constant warnings in the documentation about how a pipe between the subprocess and the Python process might fill up and cause the subprocess to block. Or how running the program through a shell rather than exec might cause some sort of security issue. Are these real issues that other languages ignore in the name of user convenience, or has Python just never been able to make the right API (as the author seems to argue)?


Creating a subprocess can be complex, at least if you expose all the different subtleties. If you've ever used Java's APIs to run processes, you know that Python's aren't the worst ;-)

There are lots of interesting corner cases, for example how to join stdout and stderr properly without blocking on one stream while the other is overflowing.
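
For the common case the stdlib already has an answer to that particular corner case: Popen.communicate() drains both pipes together so neither can fill up and block the child (at the cost of buffering everything in memory), and stderr=STDOUT merges the two streams outright.

    from subprocess import Popen, PIPE, STDOUT

    # read both streams concurrently -- no deadlock, but all output is
    # held in memory until the child exits
    proc = Popen(['ls', '-l'], stdout=PIPE, stderr=PIPE)
    out, err = proc.communicate()

    # or merge stderr into stdout and read a single stream
    merged = Popen(['ls', '-l'], stdout=PIPE, stderr=STDOUT)
    out, _ = merged.communicate()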

On the other hand, almost nobody ever needs this. Ruby's "output = `command`" probably covers 90% of the use cases with the most trivial API imaginable. The hard part obviously is exposing the advanced functionality without compromising on the simplicity.

Almost all programming communities can learn a lot from Ruby's "if it's too hard, you're not cheating enough" approach (dhh quote I believe). Yes, the process could return an exabyte of stdout data, but do you really care? Is that really the problem this API should try to solve, with all special cases? That's not good computer science practice, but surprisingly effective.


The sad or happy truth is that thanks to advances in computing power, what used to be dummy toy programming is now not only a valid way of doing things but the correct one.

Using made-up stats: slurping the entire output of a process into a big string would fail 99% of the time 30 years ago, 50% of the time 20 years ago, 1% of the time 10 years ago, but less than 0.01% of the time now. You'd be wasting your time doing it 'the right way'.

So it is with most simple data processing. If your goal is just to ship a product fast, you no longer need the old type of smart programmers; nowadays, smart means doing it fast and badly.


It's funny you mention that. The author/speaker wrote a "Subprocesses for humans" module, too: https://github.com/kennethreitz/envoy

There's no fundamental problem that's stopped Python from doing this before. For some reason, all of the ways to spawn a subprocess in Python have tried to map almost directly to the underlying C API... which is pretty awful.
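
For reference, this is roughly the C-style dance those APIs mirror on POSIX (a sketch; error handling omitted):

    import os

    pid = os.fork()
    if pid == 0:
        # child: replace this process image with the external command
        os.execvp('ls', ['ls', '-l'])
    # parent: wait for the child and collect its exit status
    _, status = os.waitpid(pid, 0)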


    > For some reason, all of the ways to spawn a subprocess
    > in Python have tried to map almost directly to the
    > underlying C API
I think both are good and necessary. One of the strengths of Python is that if you have a copy of Stevens you can usually work out how to do something in Python. And this is awesome. I've written things on top of Unix that in times past would have been written in C. However, that mechanism is usually not very "pythonic".

In the early days Python had a principle that there should be one way to do things. You don't hear this so much any more: we're long past that now, with some things different between 2.6 and 2.7 (arg handling), and with multiple broken libraries in the stdlib. When you're working on your own computer, on your own time, with root access, you can always hand-roll outcomes. But it's common to have to deal with a spread of Pythons and cater to the most obsolete version. Yet I suspect some people still aspire to the one-way-to-do-it, and pretend it's true.

I think we should dump the principle.

A good example of why compromise is not the right outcome is the curses library: it's not quite Stevens, but it's not friendly either. It's hard to do good work with the curses library. We'd be better off if there were (1) a close curses mapping to the C ncurses mechanisms and (2) a nice-to-use abstraction layer that hid far more away from you.


To get around the problem of child subprocesses spewing out too much output and blocking the parent process, one can provide an open file handle to the stdout/stderr arguments of the Popen call. I've run into this many times, and this solution has reliably worked for me every time. This could be documented better in the Python docs.

For quick tasks and scripts, I've found subprocess.check_call, and subprocess.check_output with shell=True are great tools for spawning subprocesses and quickly grabbing output. They're pretty straightforward to use.
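
A minimal sketch of the file-handle approach described above ('some_command' is hypothetical):

    import subprocess

    # with a real file as the target, the OS pipe buffer can never fill
    # up, so the child can't block on its own output
    with open('child.log', 'w') as log_file:
        subprocess.check_call(['some_command', '--verbose'],
                              stdout=log_file, stderr=subprocess.STDOUT)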


I have never been able to figure out how, in Python, to stream both stdout and stderr from a subprocess asynchronously, both printing them and writing the data to a file.


I'm using the mkfifo method on Linux/Mac OS X:

    import os
    import sys
    import time
    import subprocess

    # turn off stdout buffering. otherwise we won't see things like wget progress-bars that update without newlines.
    sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)

    pipename = "tempfile"

    if os.path.exists(pipename):
        os.remove(pipename)

    # create a pipe. one side is connected to the ping process, other side is connected to python.
    os.mkfifo(pipename)
    read_fd = os.open(pipename, os.O_RDONLY|os.O_NONBLOCK)
    writer = open(pipename, "w+")

    proc = subprocess.Popen("ping www.google.com", cwd=sys.path[0], stdout=writer, stderr=writer, shell=True)

    while 1:
        try:
            # nonblocking poll data from the external process.
            s = os.read(read_fd, 1024)
            if s:
                sys.stdout.write(s)
        except OSError:
            pass
        # sidenote: minimum sleep time is 1/64 seconds on many windows pc-s.
        time.sleep(0.1)

    # remember to remove the pipe "tempfile"


Replying to myself. Using mkfifo is not necessary:

    import os, sys, time, subprocess, fcntl
    sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)
    read_fd, write_fd = os.pipe()
    fcntl.fcntl(read_fd, fcntl.F_SETFL, os.O_NONBLOCK) # don't know of any windows equivalent for this line
    proc = subprocess.Popen("ping www.google.com", cwd=sys.path[0], stdout=write_fd, stderr=write_fd, shell=True)
    while 1:
        try:
            s = os.read(read_fd, 1024)
            if s:
                sys.stdout.write(s)
        except OSError:
            pass
        time.sleep(0.1)


You're listening for two file descriptor events, so you need some sort of event loop. select can do it but it's low-level; and since there can be only one event loop per program, your choices are frameworks and not simply libraries.

Here's a way to do it with Twisted (docs here: http://twistedmatrix.com/documents/current/core/howto/proces... ):

  import sys
  from twisted.internet import reactor, protocol

  log_file = open('subprocess.log', 'w')

  class PrintAndLogProtocol(protocol.ProcessProtocol):
      def outReceived(self, data):
          # print and log each chunk as it arrives
          sys.stdout.write(data)
          log_file.write(data)
      errReceived = outReceived

      def processEnded(self, reason):
          reactor.stop()  # stop the event loop when the child exits

  reactor.spawnProcess(PrintAndLogProtocol(),
       '/path/to/exe', ['exe', 'arg1', 'arg2'])
  reactor.run()


I've done this using select.
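
A minimal sketch of that select-based approach (Unix only; 'ping' stands in for the real command):

    import os
    import select
    import subprocess
    import sys

    proc = subprocess.Popen(['ping', '-c', '4', 'www.google.com'],
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    log = open('output.log', 'w')
    streams = [proc.stdout, proc.stderr]
    while streams:
        # block until at least one pipe has data, then drain whichever is ready
        readable, _, _ = select.select(streams, [], [])
        for stream in readable:
            data = os.read(stream.fileno(), 1024)
            if not data:                     # EOF: the child closed this pipe
                streams.remove(stream)
                continue
            sys.stdout.write(data)
            log.write(data)
    proc.wait()  # reap the child once both pipes are closed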


On installing Python: the most practical way to work with it is to have a moderately recent OS-level Python install, and then build any other Python versions you need from source -- https://github.com/collective/buildout.python

After that, use virtualenv with virtualenvwrapper.


Hangs on the first slide unless I'm doing something really wrong.


Try the arrow keys


Or SPACE and SHIFT-SPACE. But it also took me a bit to figure out how to navigate the slides.


A clear sign that the one doing something wrong is the website's designer.

Also, it breaks the back button.


You know, some sort of clue that the arrow keys are used for navigation would be helpful.


It took me at least 30 seconds to figure that out. I clicked darn near every pixel on the page before that occurred to me :)

I guess you really stick by that rule of "one way to do it", huh? I kid. Great presentation, I like your passion and I hope that your call to arms is answered.


Am I weird if I say that such badly-written presentations make me much less enthusiastic about their content?


I wish the developers of the requests module would stop changing its API -- code that was working just fine with 0.6.4 suddenly began hitting missing methods in version 0.8.5.


All changes are for the best, I promise. The API will be stable by 1.0.


Seems the guy nailed down many issues I have had in the past :)


This makes a lot of good points, but some bad ones.

Esp. the "installing python" one. Just use your package manager to install all the versions you need.

And for "Packaging and Dependencies", just use pip.


What if your package manager doesn't support the version of Python you're targeting? The most common place this happens is if you have some old RHEL 5 boxen that haven't been upgraded (and who that uses RHEL doesn't?). Or suppose you use Python 2.7, which isn't supported by very many (any?) "stable" linuces (aka Debian stable, RHEL, Ubuntu LTS, etc).

And installing packages into the system python (if that's what you're suggesting) is the path to madness. It's much better to use virtualenvs you can throw away at will. All in all, it's usually best just to leave the system Python alone to avoid causing problems with any other packages that may depend on it being in a consistent state.


This problem really resonated with me. The one thing I hate about Python is the pain of compiling it from source.

Unlike startups, we enterprise dudes don't have the liberty of choosing the latest distros with better base Python versions. In some cases, the machines don't even have a compiler installed.

Trying to run your funky new admin script on these RHEL 5 boxen is a pain.

I've resorted to compiling with LD_RUN_PATH, -rpath & -r set.

I think this should be included in the python-guide for the sanity of other devops dudes.


I agree that people should use a good package manager, but the reality of the corporate world is that you'll be trying to use the latest Python on some ridiculously old Unix install without root access, and you'll never be able to get IT to install the latest for all users.


+1 for you, sir. We are constantly fighting with Unix admins who refuse to install anything but the generic, terribly old 2.3-2.4 Python versions bundled with their default enterprise SUSE packages, which only leads to full /home partitions since everyone just downloads their own versions, compiling extensions manually.


Or you could write the applications for the systems they will run on. Why develop in the latest Ubuntu with Python 2.7 when you know your company runs RHEL 5 or SUSE with Python 2.4?

I know it was annoying how long RHEL lasted with Python 2.4, but now the latest RHEL, Debian stable, and Ubuntu LTS all have Python 2.6, so write to Python 2.6 if your company uses those OSes; there's no excuse to write incompatible code targeted at Python 2.7.

In many Linux distributions it is impossible to replace the Python version; core parts of the OS, like package management, will break. It is possible to install another version in parallel and use "virtualenv --python=" to point an application at it. If you are using RHEL/CentOS 5 you can install the python26 package from the EPEL repository.


What's the package manager for windows?

Mostly I agree with you, but installing Python on Windows is a PITA compared to *nix.

PIP works until you need something that requires compiling extensions, which is also a PITA on Windows.

IIRC easy_install can install binaries, but I could be wrong about that since I don't do windows development anymore and thus pip works for me 99% of the time.

Regardless, my point is, though I think we all wish it was as simple as using your package manager and pip, it simply isn't that way for everyone.

And it probably never will be, but we can always make it simpler. :)


> PIP works until you need something that requires compiling extensions

You could use *.exe installers such as http://www.lfd.uci.edu/~gohlke/pythonlibs/

Advanced users could use MinGW to compile extensions.


"You could use *.exe installers" yeah I know, but then you aren't using PIP, and how's that work with virtualenv? Again, not as simple as some would have you believe.


Actually, the Windows binary installers work perfectly with virtualenv, and it's perfectly simple: http://stackoverflow.com/questions/3271590/can-i-install-pyt...

(It is, however, poorly documented.)


I would not use Windows for servers.


Neither would I. But some people are stuck in jobs where they have to support Windows. If that's never been the case for you, consider yourself lucky.


Neither would I. But I'm not pretending they don't exist to prove a point :)

Whether you or I prefer it or not, there are lots of people who, by choice or not, develop on Windows and in some cases deploy to Windows servers.

And for those people, things aren't as simple as "use the package manager and pip", no matter how much anyone would like it to be.

Reality sucks.


The Python standard library has gotten worse over time, as it got loaded up with more and more features, obfuscating the common use cases. The irony is that to do simple, everyday things (like HTTP requests) you are now better off installing a third-party package like "requests" than using the standard library. So much for "batteries included."

The standard library needs a reboot. Why not do it in Python 3? Nobody's using it yet anyway ;-)


I can confirm that the Python subprocess API is a pain to use and is also poorly documented. I recently had to use (no choice) Python 2.5.x to write a script that extensively called external programs, and ran into several problems. It's strange that a language like Python, which I find so easy to use in most cases, doesn't already have a good (as in simple, safe, and well-documented) subprocess API.


I guess I agree things could be simpler, although the cries of "garbage!" were a bit much. I wrote a wrapper function around urllib2 about 5 years ago and haven't looked back.


Wrappers are handy, but as soon as you need something beyond the basic use case they become useless. What’s great about Requests is that it seems to have minimal leakiness as an abstraction over HTTP.


My wrapper is quite robust after almost five years; e.g., it can save headers to alternate data streams (on NTFS) for proper 304 handling. If there is anything left to implement, it could be done pretty quickly.

Still, I like these new projects; it's a shame they missed the Python 3.x boat by only a year or two. That would have been a great time to include them in the stdlib.


It sounds like you should release this wrapper :)


It is online, but embedded in my employers software. I could probably do some extraction of it, but not sure if it is worth the effort.


That's precisely the problem: your wrapper helps you...and only you (ie, it's worthless).


It wasn't that hard... didn't mean to imply it took five years.


It's really inexcusable that in 2012 (or 1992 even) a language that otherwise is well-suited for internet programming does not come with a first-class HTTP client.


What did they use to make the presentation?


Looks like Markdown and a Sinatra-based presentation framework. See the code at: https://github.com/kennethreitz/python-for-humans


The name says it all. This is much more readable than urllib2.


yes please and thank you muchly. puts away yak shaving contraption


Why don't you just switch to Ruby?


I didn't find making moderately complex HTTP requests in Ruby to be much more fun than urllib2 in Python. Do you have a library you recommend?


I like httparty or mechanize, but feel free to make your own choice over here: https://www.ruby-toolbox.com/categories/http_clients


I've tried rest-client, em-http-request, and mechanize. I don't recall their specific failings, except for mechanize, which seemed to work great in 95% of cases and produce cryptic errors in the other 5%. I will take a look at httparty, though, thanks.


I like the presentation, but for me it just underlines how it's smart to stay far away from Python. It's great that it's improving, but many other languages have a much better library/API situation than Python has had for years already. Will Python catch up fast enough?


It's quite the opposite, actually. Python already has an extremely good and very capable standard library, arguably better than any other language's. Python also has a community with a strong sense of what kind of API design best fits the Python philosophy (which usually boils down to readability and consistency). So there are many ongoing efforts to improve the standard libraries.

For example, the standard library already has urllib2, which provides HTTP support much better than most other languages' standard libraries. But Requests is a rewrite of it, taking ease of use to a whole new level without compromising any of the features. There is no equivalent in any other language.
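
For a taste of the difference, here is (roughly) the canonical example from the Requests README of that era -- the URL and credentials are illustrative:

    import requests

    r = requests.get('https://api.github.com/user', auth=('user', 'pass'))
    print r.status_code              # e.g. 200
    print r.headers['content-type']  # e.g. 'application/json; charset=utf-8'
    print r.content                  # the raw response body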


"There is no equivalent in any other language."

I don't know, LWP is pretty easy to use.


Does Perl come with LWP?


If Python has a better standard library than the other languages you know, then you should learn some more languages.


Do you have some examples? I thought I knew a fair number of programming languages, and Python's got one of the most comprehensive.

IMO, Python's stdlib beats those of C++, Ruby, Java, JavaScript, and Perl....


Python's library is comprehensive "on paper".

Vast amounts of it are actually incomplete or outright broken when you try to use it in practice.

The reliable parts of Python's standard library are at a level little beyond C and C++ and far behind Perl/CPAN, Java, .Net and so on.

And even then, as the article we're discussing points out, the APIs aren't actually very good a lot of the time.


Yep, basically my point. But you're formulating it better :-)


> It's great that it's improving, but many other languages have a much better library/API situation than Python has had for years already.

As someone who has spent the last 5.5 years at a job where I didn't need anything but the standard library, I don't agree. What are these other languages that contain a better situation?


Well, Perl's CPAN has an uninstall feature. That's kind of nice.


CPAN is very much not a stdlib. PyPI is the equivalent.


Depends on what your definition of "std lib" is.

If you say a "std lib" is something that is installed along with the language, then CPAN fits the bill.

    $ corelist CPAN

    CPAN was first released with perl 5.004


That's more of a standard tool than a standard library. A library is a collection of reusable components; a standard library, then, is a pre-equipped collection of reusable components. Here, we define "reusable" as invokable from within the language, as first-class constructs.

Even if CPAN was a standard library, that which you install from it is clearly not, which is what the argument to date has centered around.



