
Curing Python's Neglect - twampss
http://zedshaw.com/blog/2009-05-29.html
======
tow21
Complaints around infelicities in the Python stdlib are fair enough - I don't
think anyone would defend them, but what can you do? You can't change
published APIs.

For most of the rest of his issues, I think Zed's problem is that he thinks
there's one obvious answer, but lots of other people would disagree with his
one answer. Python tends to be either very opinionated, or very agnostic.
Where there's One Way To Do Things, that's what you must do; except where
there's obvious disagreement in which case it doesn't bless a single answer.

Take API doc autogeneration: I can guarantee that if anyone came up with a default
tool with default output such that "doctool module html_directory" was all you
needed (and indeed people have) then there would be widespread, perfectly
legitimate, disagreement on any number of choices in the design of it.

(Personally, I think API doc autogeneration is entirely pointless, and indeed
fundamentally the Wrong Thing To Do for most Python libraries. When I come
across pages of JavaDoc-ish API documentation, I groan internally and just
look at the source instead.)

~~~
arohner
>Complaints around infelicities in the Python stdlib are fair enough - I don't
think anyone would defend them, but what can you do? You can't change
published APIs.

You're absolutely right, but that doesn't mean Zed isn't on to something. How
many different ways are there to open a subshell in python? There's popen,
popen2, popen3, popen4 (I think), plus more (subprocess?). Yes, the past can't
be changed, but it's still a good idea to stop and ask why there are so many
mistakes like this.

Why is there a time module, and a date module, and a datetime module, and yet
all three are broken?

The main question is _What factors allowed the stdlib to get so quirky, and
what can be done to fix that?_ Pointing out the stdlib is quirky is the first
step in fixing the problem.

~~~
aston
Python 3 intends to fix a lot of the stdlib issues. For example, the
subprocess module should obsolete all of the various popens.

<http://docs.python.org/library/subprocess.html>

~~~
dkarl
And subprocess is available in 2.6, too. It looks like a well-designed API and
certainly makes it easy to run external filters on data. For example, to get
the output of 'tidy -q' on a string s:

    
    
      import subprocess
      from subprocess import PIPE

      proc = subprocess.Popen(['tidy', '-q'], stdin=PIPE, stdout=PIPE, stderr=PIPE)
      (tidyout, tidyerr) = proc.communicate(input=s)
      if proc.returncode != 0:
          ...

~~~
tdavis
Errr, subprocess has been around since 2.4...

------
thristian
Some responses to various parts of the article:

"No easy_UNinstall" - I'm one of the lucky few who has access to an extensive
and sensible archive of easily installable and uninstallable Python library
packages; I call it "Ubuntu". From what I can see, distutils is a good
metadata format that helps real packagers like Debian and Ubuntu create real
packages - but trying to write a cross-platform installer is just asking for
trouble. Just unpack the package somewhere and set $PYTHONPATH in your startup
scripts; that's what your startup scripts are for, anyway.

"rm -rf" - my understanding is that the contents of the "os" stdlib package
are a thin wrapper around POSIX, and rmdir(2) won't remove a non-empty
directory either (although Zed says Python's rmdir will remove a directory
with subdirectories, but not files... that's odd). "shutil.rmtree" isn't in
POSIX, so it isn't in "os" either.
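
A minimal sketch of the difference, assuming a throwaway temp directory (the names `base` and `sub` are made up for illustration):

```python
import os
import shutil
import tempfile

# Build a throwaway tree: base/sub/file.txt
base = tempfile.mkdtemp()
sub = os.path.join(base, "sub")
os.mkdir(sub)
with open(os.path.join(sub, "file.txt"), "w") as handle:
    handle.write("data")

# os.rmdir mirrors rmdir(2): it refuses a non-empty directory.
try:
    os.rmdir(base)
    rmdir_failed = False
except OSError:
    rmdir_failed = True

# shutil.rmtree walks the tree and removes everything, like "rm -rf".
shutil.rmtree(base)
tree_gone = not os.path.exists(base)
```

Here `base` only contains a subdirectory, and os.rmdir still raises OSError, which is what I'd expect from a thin wrapper around POSIX.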

"Time Converstion" - Zed says "If all they did was give me the exact same
POSIX C API I’d be happy.", but so far as I can tell, the 'time' stdlib module
basically _is_ the POSIX C API, with braindead awkwardness fully intact. I'll
grant that "calendar" is stupid and "datetime" is crippled, although the
third-party "dateutil" module fills a lot of the holes in "datetime". (also,
for as long as I've known of it, mx.DateTime has been freely available)
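
To show what I mean about 'time' mirroring the C API, here's a rough sketch of a string-to-epoch round trip. The timestamp is just an example value, and the last line assumes Python 3, where datetime.timezone exists:

```python
import calendar
import time
from datetime import datetime, timezone

stamp = "2009-05-29 12:00:00"

# time.strptime is the analogue of C's strptime(3): string -> struct_time.
parsed = time.strptime(stamp, "%Y-%m-%d %H:%M:%S")

# calendar.timegm is the UTC inverse of time.gmtime; note that it lives
# in a different module from the rest of the conversion functions.
epoch = calendar.timegm(parsed)

# And back again: epoch seconds -> struct_time -> string.
round_trip = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(epoch))

# The datetime route to the same number (Python 3 only).
via_datetime = datetime(2009, 5, 29, 12, 0, 0, tzinfo=timezone.utc).timestamp()
```

Three modules for one round trip is arguably the awkwardness Zed is complaining about.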

"API Documentation Generation" - It's not in the standard library, but I've
been quite happy with the third-party tool "epydoc" for Python API doc
generation, and it pretty much is as easy as "epydoc path/to/package" (and
predates Sphinx and a lot of the other tools).

I have to agree with the rest of his examples, though - Python's had a long,
rocky road from "procedural Unix scripting language" to "Object-oriented,
Internet service-providing language", and although Python 3.0 has cleaned up a
lot of the cruft there's still some oddness that remains (like the len
function, or the del keyword). A lot of the standard-library crud has come
from people saying 'here, I've found/written an 80% solution to this
particular problem, let's put it in the standard library since it's better
than the solution that's in there at the moment' rather than 'here's a
problem, let's design a 90% or 95% solution for the standard library'.

~~~
stcredzero
_A lot of the standard-library crud has come from people saying 'here, I've
found/written an 80% solution to this particular problem, let's put it in the
standard library since it's better than the solution that's in there at the
moment' rather than 'here's a problem, let's design a 90% or 95% solution for
the standard library'._

This is always a cultural/community issue. It would need a cultural/community
fix. And, given what I've seen in other communities, it should be fixed, as
there's a lot to be gained by doing so.

------
callahad

      > Then when you are told about it, you’d make up excuses trying to explain
      > why it is totally normal.
    

That's a pretty incredible rhetorical device. "If you disagree, you're
fundamentally incapable of reasoned thought."

~~~
ellyagg
Except, do you doubt its truth? Humans are rationalization engines.
Communities will go to great lengths to prove that everything about their
culture is good.

~~~
Confusion
I not only doubt its truth: I know for sure it's false.

If I disagree with someone, then I often still grant that their opinion is the
result of reasoned thought. In such a case it's the underlying _assumptions_
that we disagree on and those assumptions are usually not amenable to
reasoning.

A programming example is bracing styles: I have my preferred one and a
colleague has his preferred one. I have my arguments and he has his arguments.
In the end, it comes down to what each of us considers to be 'best readable',
which is an entirely irrational consideration. We get along very well, despite
our differences (and of course, for each project, we settle upon a style,
based on exterior considerations like: what would be consistent with this
client's codebase).

Another example is religion: I'm a staunch atheist and my girlfriend is a
Christian. 'nuff said.

~~~
cturner

        A programming example is bracing styles
    

That's not an example of what he's talking about. He's talking about people
creating reasons to explain away a real problem in a way that preserves
reputation or personal peace-of-mind.

What he's talking about is the pattern you get when you speak to an
Alzheimer's sufferer who is trying to pretend they have nothing wrong. They
can be extremely convincing to you and themselves for a while making up
excuses for why things are the way they are. Eventually they exhaust (rapidly,
if you make them reload context) and the facade falls away.

Programmers do this stuff all the time. "Oh it has to be this way because ...
[some bullshit reason that doesn't explain why a customer has a reasonable
objection to the software cracking its head doing a double backflip when they
asked it to step forward]". Is it reasonable to the educated impartial
observer that a piece of enterprise software should crap itself on a null
pointer exception and start silently failing? Not remotely. Will you find
programmers defending their software when it exhibits such behaviour? All the
time.

------
petercooper

      >>> exit
      Use exit() or Ctrl-D (i.e. EOF) to exit
    

Enough said. If you bloody well knew what I typed, just exit FFS.

Python's design is full of shortsighted, dogmatic decisions like this - don't
even get me started with functions versus methods and __len__-esque line
noise, argggg! And there should only be one obvious way to do something? Give
us a break - that's just not how life, mathematics, or anything should work.

~~~
jjames
Explicit function calling via () is one of my favorite things in the language.
To each their own.

"exit" merely references a function. It doesn't call by name alone. It can
thus be passed to a function (a callback for example), saved to a variable,
etc for later calling. Maybe not so useful for the exit function in the
interactive shell but very useful for functional programming. This is an
example of consistency, sometimes at the expense of convenience; not great for
hacking, awesome for collaboration and large projects (imo).

      >>> dir(exit)
      ['__call__', '__class__', '__delattr__', '__dict__', '__doc__',
      '__getattribute__', '__hash__', '__init__', '__module__', '__new__',
      '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__str__',
      '__weakref__', 'name']

(pardon the dunderware)

Here's another way to exit the interactive shell ;)

      >>> exit.__call__()

~~~
derefr
But how does "exit" by itself print that pretty little message telling you
that, although it's obvious you wanted to exit, it's not going to let you?
It's not just a function reference if it shows up that way.

~~~
anuraggoel
<http://docs.python.org/library/constants.html#exit>

In CPython, exit is actually an object of type site.Quitter (the actual class
is implementation dependent). site.Quitter overrides the __repr__ special
method (which is called by the interpreter on any expression typed in the
shell to print the result). site.Quitter also overrides the __call__ special
method, so when an object of type site.Quitter is called, the overridden
__call__ method invokes system exit.

    
    
       >>> exit_class = type(exit) #gets a reference to the class
       >>> my_exit = exit_class('bye') #the arg is used to print the message
       >>> my_exit
       Use bye() or Ctrl-D (i.e. EOF) to exit
       >>> my_exit()
       <python shell exits>
    

Minor inconsistency: typing "bye()" doesn't work so technically the message is
incorrect. But I suppose they don't want you to be hacking exit() in the first
place.
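
A rough sketch of how such a class could look. This is a hypothetical stand-in written for illustration, not the actual site.py source:

```python
class Quitter(object):
    """Hypothetical stand-in for site.Quitter; not the real source."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        # The shell calls repr() on a bare expression, so typing the
        # name alone prints this hint instead of exiting.
        return "Use %s() or Ctrl-D (i.e. EOF) to exit" % self.name

    def __call__(self, code=None):
        # Calling the object is what actually ends the session.
        raise SystemExit(code)


bye = Quitter("bye")
hint = repr(bye)
```

Binding the instance to the same name that was passed to the constructor is what makes the printed hint accurate.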

------
fauigerzigerk
Yes the python APIs are very inconsistent, but I disagree about the del
statement. That's a different matter and the way Zed looks at it is very much
the kind of hemispatial neglect that tends to befall people who are immersed
too deeply in the OO paradigm.

del x unbinds x from whatever namespace it resolves to. How would you say
c.remove(x) if you don't know which collection c represents? Now, of course
you could argue that del x is too different from removing something from a
collection to have it use the same syntax. But that's not about neglect. It's
just a different opinion about what is consistent.

The reasoning is probably that removing a variable should always be done in
one way and that is del x no matter if x is inside a collection we know or
not. It's a very conscious attempt to make it consistent and that's not
neglect. Whether it's a good idea I don't know.
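
To make that concrete, here's a small sketch of the one statement covering several kinds of removal (the `Thing` class is made up for the example):

```python
x = 42
del x                 # unbinds the name x from its namespace
try:
    x
    name_still_bound = True
except NameError:
    name_still_bound = False

items = ["a", "b", "c"]
del items[1]          # same statement; dispatches to items.__delitem__(1)

table = {"k": 1}
del table["k"]        # dispatches to table.__delitem__("k")


class Thing(object):
    pass


thing = Thing()
thing.attr = 1
del thing.attr        # dispatches to attribute deletion (__delattr__)
```

In each case the syntax names the binding to drop, without the caller having to say which container owns it.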

Maybe a global del function would be better. If x is inside c and we know what
c is we would say c.del(x) and if we want c to be resolved we say del(x). But
is that really so different from what we have now?

More generally, my opinion is that the way functions are used in OO is in
itself very inconsistent. If you have a function f and variables x, y, and z,
it's mostly a consideration of implementation details that leads to a decision
of whether it should be x.f(y, z), y.f(x, z) or z.f(x, y). Why does the user
of an API have to think about or remember which one it is?

------
grandalf
I've been exploring Python over the past few weeks after a few years of
intense Ruby immersion.

There are definitely some very nice things about Python. I highly recommend
that all rubyists do a nontrivial project in it asap.

~~~
tptacek
What nice things would those be? I'm struggling for an obvious example of
something I miss from my Python days (which ended in '06).

~~~
grandalf
I've found that code is way more readable, even though I've only been using
the language for a short time.

I wonder what a whitespace sensitive ruby would be like :)

    
    
      %w(a b c d e f g).each do |i|:
           puts i
    
      foo = lambda do |x|:
           x + 2
    
      foo 3

~~~
natrius
That's one of the problems: Inline blocks are fundamentally incompatible with
significant whitespace, or at least no one has thought of a good way to
combine the two.

~~~
nathanic
I personally like how Haskell can combine the two. You can have:

    
    
      myFunc = do
        this
        that
        other
    

Or equivalently:

    
    
      myFunc = do { this; that; other; }

~~~
natrius
How does the first form work out as an inline function argument? I'm not
familiar with Haskell's syntax.

~~~
nathanic
I think I misunderstood you before. I thought you were talking about putting
multiple statements on a line.

But Haskell can pass around code blocks as well, with or without significant
whitespace.

For example, I can define a function called 'forever' to run a sequence of
actions forever like so:

    
    
      forever x = x >> forever x
    

Where 'x' is any sequence of actions. You can pronounce '>>' as "followed by".
(These are monadic actions, actually, but never mind that.)

I can invoke it like so:

    
    
      myFunc = forever (do {this; that; other})
    

or:

    
    
      myFunc = forever $ do
        this
        that
        other
    

The dollar sign is a function that lets you dispense with some parentheses.
Think of it like an opening paren that closes at the end of the subexpression
it's in.

If the block is a function rather than a sequence of actions, Haskell uses '\'
as a lambda for introducing anonymous functions.

For example, the function 'map' takes a unary function and calls it with each
value from a list. Below I'll call it with an anonymous function which returns
double its argument:

    
    
      map (\x -> x * 2) [1,2,3]
    

This results in a new list [2,4,6].

~~~
shrughes
Another good example:

    
    
        forM_ [1..9] $ \i -> do
          this i
          that
          other

------
ghshephard
Python is relatively consistent with regards to
appending/popping/inserting/removing:

      a = []
      dir(a)
      'append', 'count', 'extend', 'index', 'insert', 'pop', 'remove',
      'reverse', 'sort'

      help(a.remove)
      remove(...) L.remove(value) -- remove first occurrence of value

      help(a.pop)
      pop(...) L.pop([index]) -> item -- remove and return item at index
      (default last)

      append(...) L.append(object) -- append object to end

      insert(...) L.insert(index, object) -- insert object before index
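
Put together, a quick sketch with a made-up list:

```python
mystuff = ["apple", "pear"]
mystuff.append("plum")        # add to the end
mystuff.insert(0, "fig")      # add before index 0
last = mystuff.pop()          # remove and return the last item
first = mystuff.pop(0)        # remove and return the item at index 0
mystuff.remove("pear")        # remove the first occurrence by value
```

So append pairs with pop(), insert(i, x) pairs with pop(i), and remove works by value rather than by position.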

------
sandGorgon
Ruby's Gems ownzors easy_install - with versioned libraries, updates, etc.

And apt-get, pacman, rpm are not excuses, because I like to use Slitaz as my
OS.

If virtualenv (which I love) is something that plays havoc with package
management - they should be merged, rather than not do anything about it.

~~~
llimllib
what I love is when I want to install a single ruby gem and I get to sit
around waiting for a half hour while it fetches seventeen unrelated things
from github. For extra special sauce, let github be down.

Get pip and be enlightened.

~~~
carbon8
The situation you describe simply doesn't exist.

------
jnoller
Patches, PEPs and bug reports welcome. If you see a problem, it's polite to at
least assist in fixing it.

~~~
pbiggar
I hate that when criticised, people demand that the criticizer fix the problem
they're pointing out. I realize your request is less strong than that, but
fixing "your" software is not "my" problem. Especially when you consider that
nearly every piece of software I've ever used has bugs, and most of the bug
reports I submit are ignored.

~~~
jnoller
I'm sorry; No. I'm not demanding he fix it; I'm asking he do something more
constructive than nothing. Python isn't "my" software - it's "our" software.
Zed's smart enough to contribute and help fix things - I hold him to a higher
standard than someone just walking in off the street.

Python-core does not ignore bug reports, especially ones which really fix
things, and come with a patch. Asking him to help isn't out of bounds.

------
shadytrees
_Yet, here’s what you have to do for Sphinx which is an insane amount of work
for something that JavaDoc, POD, Doxygen, RubyDoc_

Nit: Sphinx is a system for long-form, separately written documentation. Like
manuals and tutorials; think stuff that requires indices. That's why autodoc
is an extension, why you usually set up a separate directory, why you have the
flexible build system, and so on. (Although all you have to do is create a
file and type `automodule` and then the module name.)

------
aston
Hey Zed, it's mystuff.pop(4).

~~~
batasrki
Is that intuitively the orthogonal method to append()? Not in my view. If I
appended something, I'd want to remove it or delete it, not pop it. If I
push'ed, then popping would be obvious.

~~~
smanek
Just to be pedantic, orthogonal means perpendicular (i.e., 'at a right angle
to'). It seems like you meant opposite/complementary/inverse, etc.

Sorry - people misappropriating math jargon into the mainstream is a little
pet peeve of mine. I agree with your main point though.

~~~
DanielBMarkham
Interesting.

In IT, I've heard the phrase "mutually orthogonal requirements" for years now
to indicate requirements that do not overlap.

~~~
maggie
Both of you are correct. In programming languages / the IT world, object A is
orthogonal to object B if object A can be used without thinking about the
potential consequences to object B.

In mathematics, orthogonal means perpendicular in a geometric-sense (think two
vectors) but can also be used in other contexts with a different meaning.

~~~
likpok
Even more generally, two vectors are orthogonal if their inner product is
zero.

This is why the word "perpendicular" is not used, as sometimes your vectors
don't really have "directions" in the intuitive sense of the word (e.g. the
inner product space of functions).
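
A small numeric sketch of both cases. The helper names are made up, and the integral is only approximated with a midpoint rule:

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def function_inner(f, g, lo, hi, steps=10000):
    """Approximate the integral of f(x)*g(x) over [lo, hi] (midpoint rule)."""
    width = (hi - lo) / float(steps)
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        total += f(x) * g(x)
    return total * width


# Geometric case: (1, 1) and (1, -1) are at right angles.
perpendicular = dot((1, 1), (1, -1))

# Function case: sin and cos over a full period have inner product ~0,
# while sin with itself gives roughly pi.
sin_cos = function_inner(math.sin, math.cos, -math.pi, math.pi)
sin_sin = function_inner(math.sin, math.sin, -math.pi, math.pi)
```

Nothing about sin and cos is "at a right angle" in any picture you could draw, but the inner product still comes out (numerically) zero.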

------
triplefox
The one that always bugged me but seems to be true for most dynamic languages
is immutable types being passed by value and everything else passed by
reference.

~~~
brendano
For an immutable type, there's no distinction between pass-by-value and pass-
by-reference. How would you ever know the difference?

~~~
tow21
Speed? Not sure if this is actually what the OP was thinking of, but pass-by-
value usually implies copying, while pass by reference just involves handing
around a pointer.

i.e., if you had an enormous immutable value (in Python, a tuple with lots of
entries), then in a hypothetical Python implementation which did call-by-
value, it might be significantly slower than call-by-reference, on account of
having to copy the entire tuple.

But this falls squarely into 'implementation detail' - if you're so inclined,
you could implement either CBV or CBR in hundreds of other ways. Certainly
from the perspective of time-independent program behaviour you shouldn't be
able to tell the difference.
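
For what it's worth, a quick sketch of how you could check what CPython actually does (function names are made up):

```python
def identity_inside(arg):
    # id() identifies the object the function actually received.
    return id(arg)


def append_one(seq):
    seq.append(1)     # mutates the caller's list in place


# A large immutable value: if it were copied on the way in, the callee
# would see a different object than the caller holds.
big = tuple(range(1000000))
tuple_shared = identity_inside(big) == id(big)

# A mutable value, passed exactly the same way.
nums = []
append_one(nums)
```

The tuple is handed over as the same object, not a copy, so the only observable difference between the two cases is that the list has methods that change it.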

------
rw
From the OP: "For those of you who refuse to read the article, "

Maybe some of us already know what hemispatial neglect is, Zed.

------
holygoat
I've ranted on this subject to my coworkers many times in the past. Python's
API is full of missing inverses, missing analogues, and similar things with
oddly different shapes.

------
andres
the symmetric add/delete operations are actually:

mystuff[4] = 'apple'

del mystuff[4]

i'm not saying it's very elegant, but it makes sense.

~~~
calambrac
mystuff[4] = 'apple' doesn't add an 'apple', it overwrites an existing element
in a list of at least 5 elements, and del mystuff[4] actually does change the
length of the list. They aren't symmetric at all.
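
A short sketch of the asymmetry, using a made-up five-element list:

```python
mystuff = ["a", "b", "c", "d", "e"]

mystuff[4] = "apple"          # overwrites index 4; length stays 5
len_after_assign = len(mystuff)

del mystuff[4]                # removes index 4; length drops to 4
len_after_del = len(mystuff)

mystuff.insert(4, "apple")    # the length-changing counterpart of del
len_after_insert = len(mystuff)

try:
    short = ["a"]
    short[4] = "apple"        # index assignment cannot grow a list
    grew = True
except IndexError:
    grew = False
```

If anything is the inverse of del mystuff[4], it's insert, not index assignment.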

------
erlanger
"A normal person will eat everything in front of them, but a person with
neglect will happily eat only the things _on the left side of their body
(emphasis added)_."

It appears that this condition is more serious than we originally thought.

