

Unfortunate Python - webexcess
http://excess.org/article/2011/12/unfortunate-python/

======
slurgfest
I agree with the portions of this dealing in obsolete libraries (e.g.
asyncore, SimpleHTTPServer, array).

I also agree that __del__ and copy expose pain points and should be treated
carefully.

isinstance() is often a code smell for sure, but I don't think it's
potentially harmful in the same way as using os.system/os.popen; sometimes,
INSIDE your own code, you want to fail early with a more informative message
if a class does not explicitly contract to provide a certain set of interface
behaviors. (I don't necessarily want to find out that a large/complex behavior
is wrong only after it has occurred).

(I don't find it particularly nice to start writing tons of IWhatever objects
and such to get behaviors already provided by built-in language constructs.)

"if __name__ == '__main__':" is ugly, but its purpose is not to be pretty; it
is to make your module more fool-proof and explicit about what should happen
in a script run vs an import. I don't see a good general substitute. If you
are sure that only you or someone sensible will be importing/running your
module, then I can agree with the advice just to keep those two kinds of
scripts separate.
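
A minimal sketch of what the guard buys you, assuming a made-up module (the names `greet` and `main` are purely illustrative): importing the file only defines the functions, while running it as a script also calls `main()`.

```python
import sys


def greet(name):
    """Reusable function: safe to import from other modules."""
    return "Hello, %s!" % name


def main(argv):
    """Script entry point; only meant to run when the file is executed."""
    name = argv[1] if len(argv) > 1 else "world"
    print(greet(name))
    return 0


if __name__ == "__main__":
    # True for `python thisfile.py`, false for `import thisfile`.
    main(sys.argv)
```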

And I agree that any Python code compiling giant strings is in serious need of
fixing (including that nasty bit in namedtuple as well as Simionato's
otherwise very nice non-stdlib code for signature-preserving decorators -
there's really no other way I know of yet for doing that without ugly hacks)

~~~
reinhardt
The isinstance() criticism is in principle outdated after PEP 3119 [1] but in
practice it is indeed mis(/over)-used more often than not.

Compiling strings is usually a code smell but at least in the case of
namedtuple I'm pretty sure it is justified for at least one reason:
performance. It could certainly be implemented in more idiomatic Python,
but the result would probably be less efficient compared to, say, dicts
or regular objects.

[1] <http://www.python.org/dev/peps/pep-3119/#abcs-vs-duck-typing>
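
To make the ABC point concrete, here is a small sketch, assuming a hypothetical `Playlist` class and `describe` function: `isinstance()` against `collections.abc.Sized` accepts any object that defines `__len__`, with no inheritance required, so the check stays compatible with duck typing.

```python
from collections.abc import Sized


class Playlist:
    """Hypothetical class: does not inherit from any container type."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):
        return len(self._tracks)


def describe(obj):
    # Fail early with a clear message if the interface is missing.
    if not isinstance(obj, Sized):
        raise TypeError("expected a sized object, got %s" % type(obj).__name__)
    return "%d item(s)" % len(obj)
```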

~~~
sitkack
I am with you, it seems like a knee jerk reaction to call all exec or eval
code bad. Complexity has been pushed into a library, and the amount of clarity
that namedtuple adds outweighs whatever it has "hurt" by its use of exec.

~~~
slurgfest
Yes; to clarify, I believe that if it is necessary to compile big old strings
to get a certain sort of behavior, then that might be a point at which Python
should be improved to allow similar code to be constructed dynamically (and
more safely).

And if it is necessary for performance reasons, Python should probably be able
to do a similar thing efficiently without compiling big old blobs of text.

It is true, of course, that this can be hidden as an implementation detail
with relatively little impact to end users... but the mere possibility of
relatively clean and judicious uses does not change the fact that as a pattern,
it is tricky and hard to read and easy to mess up in horrible ways. It is not
that it is fundamentally unworkable. But if it is the only/best way, then that
probably indicates a place where Python could be incrementally improved...

------
LeafStorm
Most of this I agree with, with a few comments/caveats:

> Easy Stuff First

The unfortunate part about this particular example is that subprocess's
interface isn't really that well-designed, and so people resort to os.system
simply because it is less complicated than subprocess.

> Ducks In A Row

I don't think isinstance is really that bad, but checking based on something's
exact type is definitely wrong.
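
A quick illustration of the difference (the helper names are made up for this sketch): an exact-type check rejects a perfectly good subclass, while `isinstance()` accepts it.

```python
from collections import OrderedDict


def is_dict_exact(obj):
    # Rejects every subclass, even ones that behave exactly like a dict.
    return type(obj) is dict


def is_dict_like(obj):
    # Accepts dict and any subclass of it.
    return isinstance(obj, dict)


settings = OrderedDict(debug=True)
```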

> Toys are for Children

One of the problems of a batteries-included stdlib is that you have to support
it just about forever. Though I would like to note that for basic async
programming Tornado is surprisingly good, and less complex than Twisted.

> Foreign Concepts

Apparently this is especially useful for scientific and mathematical
computing. I can't see how it's especially dangerous since just about everyone
will use list instead.

~~~
icebraining
I disagree that subprocess is complicated. Replacing os.system() is really
simple: <http://docs.python.org/library/subprocess.html#replacing-os-system>

I think people are just used to system(), particularly if they come from C.
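
For reference, a minimal sketch of the replacement. The command is an assumption chosen so the example runs anywhere (it just re-invokes the current interpreter), and it passes an argument list rather than a shell string.

```python
import subprocess
import sys

# Hypothetical command: run the current interpreter so the example is portable.
command = [sys.executable, "-c", "print('hello from a child process')"]

# os.system("...") hands a string to the shell; subprocess.call takes an
# argument list here, so no shell quoting is involved.
status = subprocess.call(command)
```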

~~~
rat87
I think people might be complaining about how hard it is to do the equivalent
of procs = `ps -ef` in ruby, not to mention crazy ways you can combine it with
string interpolation.
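
The closest stdlib equivalent of backticks is probably `subprocess.check_output`. A rough sketch, where `backtick` is a made-up helper name and the child command is a portable stand-in:

```python
import subprocess
import sys


def backtick(args):
    """Run a command and return its standard output as text."""
    return subprocess.check_output(args, universal_newlines=True)


# Portable stand-in for a real command: the child just prints two lines.
output = backtick([sys.executable, "-c", "print('one'); print('two')"])
lines = output.splitlines()
```

On a Unix system, `backtick(["ps", "-ef"])` would correspond to the Ruby snippet.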

------
ggchappell
Okay, here's one no one has disagreed with yet: copy.

If I write a Python class that is going to be used with someone else's code,
then it is up to me to make sure that my class behaves reasonably -- in the
context of Python's normal behavior. For example, if I define a bracket
operator, then Python will make me an iterator; I need to be sure that
iterator behaves well.

And similarly with "copy". Someone might want to write a generator that spits
out instances of my class. If we apply list() to the output of that generator,
then the result will not be correct unless the generated objects are actual
separate objects. copy.copy is a good way to make things actually distinct,
that might otherwise not be.
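
A small sketch of that generator scenario, assuming a hypothetical mutable `Point` class: without the copy, every element of the list is the same object in its final state.

```python
import copy


class Point:
    """Hypothetical mutable class used only for this illustration."""

    def __init__(self, x):
        self.x = x


def shared(n):
    p = Point(0)
    for i in range(n):
        p.x = i
        yield p  # the same object every time


def distinct(n):
    p = Point(0)
    for i in range(n):
        p.x = i
        yield copy.copy(p)  # a fresh shallow copy each time


same = [p.x for p in list(shared(3))]        # [2, 2, 2]
separate = [p.x for p in list(distinct(3))]  # [0, 1, 2]
```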

That means that instances of my class need to behave well if someone calls
copy.copy on one of them.

I don't see that the issues here are much different from those involving
regular old methods in a class. If I'm putting together a class, and I write a
method that does something nasty, then code that uses that class and calls the
method, will have bad results. Similarly copy.copy needs to avoid doing nasty
things, too.

SO, I don't have a problem with expecting copy.copy to produce reasonable
behavior, when I use it with a class written by someone else.

In any case, this is a nice article. Thanks for posting.

~~~
plq
I was surprised to see "copy" there. I'd agree that deepcopy (and its friends
like pickle) could be tagged with a big "use with care" sign, but especially
when one can implement __copy__ to control exactly how shallow copies are
made, I don't think it's a problem at all.
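
For example, a sketch with a made-up `Connection` class that shares its config between copies but gives each copy its own buffer:

```python
import copy


class Connection:
    """Hypothetical class: shares a config dict, but owns its own buffer."""

    def __init__(self, config, buffer=None):
        self.config = config
        self.buffer = buffer if buffer is not None else []

    def __copy__(self):
        # copy.copy() calls this hook: share config, duplicate the buffer.
        return Connection(self.config, list(self.buffer))


original = Connection({"host": "example.invalid"}, ["pending"])
clone = copy.copy(original)
clone.buffer.append("extra")  # does not touch original.buffer
```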

------
ghc
I found myself nodding along with this until the bit about the array module.
The array module is very well suited to doing real work. Maybe it seems arcane
if you're coming from a web dev background, but given the space inefficiency
of Python's lists, the array module is awfully useful in a lot of situations.

~~~
apu
Not just space, but speed as well. I've often noted _significant_ speedups in
many operations by switching to array.array. In some cases, the speedups are
comparable to or even slightly better than using numpy (which is much more
heavy-weight).

~~~
chimeracoder
Out of curiosity, what are you doing for which arrays are more suited than
NumPy?

I've been using NumPy for a while now, and I've literally never had a reason
to use array.array where NumPy arrays have worked.

~~~
apu
I find numpy too heavy weight for a lot of stuff and also not very good for
communication/interchange with other programs. For a lot of my work, I often
need to write data files to disk (for C programs to use) or mmap'ed files for
concurrent access. In these situations, because array.array offers precise
data alignment rules and tostring() and tofile() methods that are actually
sane, I end up using it instead of numpy.

Of course, if I do significant computations in my python program itself, I
just use numpy.
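
A rough sketch of the interchange case, assuming a temporary file and three doubles as stand-in data: `tofile()` writes the raw machine values, and `fromfile()` reads them back.

```python
import array
import os
import tempfile

values = array.array("d", [1.5, 2.5, 3.5])  # C doubles, native byte order

# Write the raw machine values; a C program could fread() them as double[3].
fd, path = tempfile.mkstemp(suffix=".bin")
with os.fdopen(fd, "wb") as f:
    values.tofile(f)

size_on_disk = os.path.getsize(path)

# Read them back into a fresh array with the same type code.
restored = array.array("d")
with open(path, "rb") as f:
    restored.fromfile(f, len(values))

os.remove(path)
```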

------
thurn
> If you're debugging at the interactive prompt consider debugging with a
> small script instead.

I don't understand this suggestion. The ability to interact with my code via
the REPL is a very important Python feature for me. Is the author suggesting I
abandon that?

~~~
abstractbill
I think the author is just saying if you're debugging something in a REPL _and
you're finding you need to reload() frequently_ , then you might want to
consider switching to debugging with a small script instead (and of course
there's no reason the script couldn't just import the things you're debugging,
and then dump you into a REPL).
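
One way such a script might look, as a sketch only (`area` stands in for whatever is being debugged): build the namespace up front and hand it to the stdlib `code` module.

```python
import code


def area(width, height):
    """Hypothetical function under investigation."""
    return width * height


# Names the interactive session should start with.
namespace = {"area": area, "sample": area(3, 4)}
console = code.InteractiveConsole(locals=namespace)

# Calling console.interact() here would open a prompt with `area` and
# `sample` already defined; it is left out so the sketch runs unattended.
```

Running the script with `python -i` gets a similar effect without any extra code.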

~~~
sitkack
You mean like when I debug a script as

    
    
        ipython>import foo as t
        ipython>t.test(10,20)
        ipython>reload(t)
    

Roughly 2/3rds of this guy's rant is off base. Maybe if he had fleshed out his
arguments with citations, but seriously. It is all opinion.

Reasoning with opinion is dumb.

~~~
dboat
You did the same thing. What's so bad about reasoning with opinion?

~~~
sitkack
You just got stabbed by an ironicaltite

------
dajobe
This article / talk is a collection of relatively random python things of
which only a few are unarguably good advice such as not using deprecated terms
or checking for exact types. The rest is not a good basis for pythonic best
practice that I would recommend.

------
tungwaiyip
This is just a lot of nitpicking, parroting of common wisdom, plus uninformed
opinions and misapplication of use cases.

To complain that SimpleHTTPServer is not fit for use as a production public
web server is ridiculous. That was never its intended use and most people
understand this. Not every HTTP server is a public web server, however. It is
very convenient to fire one up as a test server and to get around some of the
browser security constraints on the file:// scheme. It is also useful for a lot
of lightweight internal integration and configuration UI. Deploying
SimpleHTTPServer as a production web server is not a correct use. But neither
is deploying a heavyweight HTTP server like Apache for testing or lightweight
integration.

I agree JSON is preferable to pickle in most situations. But you have to
understand the historical context. Pickle was the standard way to serialize
objects well before JSON was popularized. Serialization is an important topic
in general, and you will find that many other languages provide similar
facilities. Also, JSON only addresses a subset of the serialization problem.
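
A small sketch of the "subset" point, with a made-up record: pickle round-trips a set and a complex number, while `json.dumps` refuses both.

```python
import json
import pickle

record = {"tags": {"a", "b"}, "when": complex(1, 2)}

# pickle round-trips the set and the complex number unchanged.
restored = pickle.loads(pickle.dumps(record))

# json has no representation for either type and refuses them.
try:
    json.dumps(record)
    json_ok = True
except TypeError:
    json_ok = False
```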

The array module is used not only for performance. It also supports tight
packing of data. If you have a million integers, array packs them tightly as 1
million x 4 bytes. Storing them in a list has a much larger memory footprint
because they are stored as 1 million individual objects plus the data
structure of the list. I happen to have written a proprietary database engine
in Python where memory, disk and I/O footprint matter hugely. Go easy on
berating it because there are use cases you are not aware of.
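
A rough way to see the footprint difference on CPython, scaled down to 1000 integers. The list figure is only an estimate: `sys.getsizeof` reports per-object sizes, and small ints are shared between users.

```python
import array
import sys

count = 1000
as_list = list(range(count))
as_array = array.array("i", as_list)  # "i" is a C int, typically 4 bytes

# Payload of the array: count * itemsize bytes, stored contiguously.
array_payload = count * as_array.itemsize

# The list holds one pointer per element, and each int is a separate object.
list_total = sys.getsizeof(as_list) + sum(sys.getsizeof(n) for n in as_list)
```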

It is sad that the author does not understand namedtuple but chooses to mock
it. I studied the code extensively and emulated it in my application. To
appreciate namedtuple, think about what the alternative to this
implementation is. Try to write your own that serves the same function. Yes,
you can do it more easily, and make the code more readable, with __setattr__
and __getattr__. The only problem is that every time you access an attribute
you get hit with a big overhead. The use of exec is not a hack but a
deliberate decision to provide attribute access at a speed comparable to
standard attribute access. I agree with the author: "do not do this at home".
Leave the heavy lifting to the experts unless you really know Python inside
out like namedtuple's author Raymond Hettinger.
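
To illustrate the alternative being described, here is a sketch of a hypothetical `__getattr__`-based version (`SlowPoint`) next to the real namedtuple. Both give the same results; the second goes through a Python-level method call on every field access, which is the overhead in question. No timings are claimed here.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])


class SlowPoint(tuple):
    """Hypothetical alternative: resolve field names at access time."""

    _fields = ("x", "y")

    def __new__(cls, x, y):
        return super().__new__(cls, (x, y))

    def __getattr__(self, name):
        # Called only after normal lookup fails, i.e. on every field access.
        try:
            return self[self._fields.index(name)]
        except ValueError:
            raise AttributeError(name)


fast = Point(1, 2)
slow = SlowPoint(1, 2)
```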

------
Jach
With regards to "Ducks in a row", I don't think the answer is try..except. You
either have a code smell, or you should use type-dispatch multimethods. (I
think Guido's "Five-minute Multimethods in Python" was recently submitted
here, but anyway: <http://www.artima.com/weblogs/viewpost.jsp?thread=101605> )
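
A minimal sketch of the idea, not the code from that article: a registry keyed on the exact tuple of argument types. The names `multimethod`, `_registry` and `combine` are assumptions for this example, and subclasses are deliberately not matched to keep it short.

```python
_registry = {}


def multimethod(*types):
    """Register the decorated function for an exact tuple of argument types."""

    def register(func):
        table = _registry.setdefault(func.__name__, {})
        table[types] = func

        def dispatch(*args):
            key = tuple(type(arg) for arg in args)
            try:
                impl = table[key]
            except KeyError:
                raise TypeError("no match for %r" % (key,))
            return impl(*args)

        return dispatch

    return register


@multimethod(int, int)
def combine(a, b):
    return a + b


@multimethod(str, int)
def combine(a, b):
    return a * b
```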

------
manojlds
Who really uses SimpleHTTPServer in production or for serious work?

~~~
obtu

      python3 -m http.server 8080
    

is a useful one-liner to serve a directory over http. There's no good reason
to have it in the standard library — mapping directories to html listings and
http paths to filesystem paths is a bit idiosyncratic — but now that it's
there I'm happy to use it.
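
The same thing can be done from code. A sketch, assuming Python 3.7+ for the `directory` argument; `make_server` is a made-up helper, and binding to 127.0.0.1 is a choice made here, not what the one-liner does by default.

```python
import functools
import http.server


def make_server(directory, port=8080, host="127.0.0.1"):
    """Build a server that lists and serves files under `directory`."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.HTTPServer((host, port), handler)
```

Calling `make_server(".").serve_forever()` would then serve the current directory until interrupted.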

------
oinksoft
> This leaves only one place where pickle makes sense — short lived data being
> passed between processes, just like what the multiprocessing module does.

ZODB disagrees.

------
bconway
_try treating the object as the first data type you expect, and catching the
failure if that type wasn't that type, and then try the second. This allows
users to create objects that are close enough to the types you expect and
still use your code._

Is chaining try-catches as a trial-and-error way to figure out a type really
the Right Way to do things these days?
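
For concreteness, the pattern under discussion looks roughly like this (`total_length` is a made-up function that accepts either text or an iterable of lines):

```python
def total_length(value):
    """Accept text or an iterable of lines; return the total line length."""
    try:
        # First guess: something string-like with splitlines().
        lines = value.splitlines()
    except AttributeError:
        try:
            # Second guess: any iterable of lines.
            lines = list(value)
        except TypeError:
            raise TypeError("expected text or an iterable of lines")
    return sum(len(line) for line in lines)
```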

------
CPlatypus
I agree with most of these, but a couple seem pretty specific to the use of
Python as a language for serious, complex, fully packaged applications. Sorry,
but Python has another use as well - to create quick utility scripts in a
better language than bash or perl. For example...

* When discussing "__name__ == '__main__'" he complains about 13 non-alphanumeric characters, and then proposes a setuptools-based alternative that's 12 extra _lines_.

* He disses asyncore/asynchat, and then proposes the much more complex Twisted instead.

If your project is already big enough to have multiple files, already complex
enough to require Twisted-level functionality (though even then Twisted sucks
compared to Tornado or just about anything else), if it's already being
packaged for general use via setuptools, then following these suggestions is
almost free. OTOH, they're way overkill for other situations. About three
years ago some colleagues and I wrote an asynchat-based server to coordinate
certain administrative actions on a 1000-node system. It was stable, it
performed as well as it needed to, and - even after working around some
"infelicities" in asynchat - it was still only half as much code as would have
been necessary in Perverted.

I don't think I'll be signing up for any idioms or coding standards that are
based on an assumption of using Python as a direct replacement for Java.
Neither should anyone else.

~~~
kamaal
_I don't think I'll be signing up for any idioms or coding standards that are
based on an assumption of using Python as a direct replacement for Java.
Neither should anyone else._

I am not sure. But if I'm using Java or C++ or another equivalent, say even
Clojure, then the glue language for the rest is going to be Perl/Bash, not
Python. The reason is that Perl does the scripting far better than Python.

Python is more fashionable today courtesy of Django, Twisted and other
frameworks.

And I think that's the problem here. Python is trying to be half serious in
both the Java and Perl worlds. Java and Perl have nearly orthogonal goals. If
you try to do half the work in both worlds, you end up pleasing neither. Try
to be either in this camp or the other.

~~~
slurgfest
Python certainly imposes greater overhead if all you are doing is writing a
short shell script of a few lines in a few minutes. I feel that Perl is closer
to bash than Python is (whether that's good or bad just depends).

But if you want to be able to go back to that code later, or reuse it in a
bigger system, or have things like robust error handling and testing, you are
out of the domain of short shell scripts anyway. Now you have a choice between
writing nice Perl code, which takes a little more overhead, effort and
thought than just using Perl, and writing nice Python, which is pretty
directly what Python's language design is about. I figure it's a matter of
what you know best and what you like and what libraries you need.

I reckon the reason you won't use Python is that you are relatively
uncomfortable with Python for whatever reason. That's legitimate, but it
doesn't mean that Python won't be a great solution for someone else doing the
same task.

Python is a 20-year-old language from the Unix/C world with mostly boring
Algol-family syntax; I don't find it very fashionable at all, and I think it is
just weird to accuse Python of "trying to be half serious" in anything except
being Python. If people prefer to write big apps in Python or Rakudo Perl, and
those languages facilitate writing big apps, it doesn't really have anything
directly to do with Java (except that all the involved languages have some
shared heritage, concepts of objects, etc.) The same is not true of C# or
Scala, which have really deep and undeniable debts to Java, nothing like the
very general family similarity you see between Java and 90s interpreted
languages like Perl and Python.

The problem with treating Python as Java is that Python is NOT Java (and has
never attempted to be Java) - so the results of treating it like Java are
often not as a Java programmer expects, giving the impression that Python is a
bad Java. But it isn't supposed to be a Java at all; this is just Doing It
Wrong. The same is undoubtedly true with other language combinations - it
isn't fair to judge Ruby harshly for responding poorly to C idioms, for
example, because it just ain't C.

