I also agree that __del__ and copy expose pain points and should be treated carefully.
isinstance() is often a code smell for sure, but I don't think it's potentially harmful in the same way as using os.system/os.popen; sometimes, INSIDE your own code, you want to fail early with a more informative message if a class does not explicitly contract to provide a certain set of interface behaviors. (I don't necessarily want to find out that a large/complex behavior is wrong only after it has occurred).
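To make the fail-early point concrete, here is a minimal sketch. The `Shape` base class, `Square`, and `total_area` are hypothetical names made up for illustration; nothing here comes from the article itself.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Hypothetical interface: anything that promises an area()."""

    @abstractmethod
    def area(self):
        ...


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


def total_area(shapes):
    shapes = list(shapes)
    # Validate up front so a bad argument fails here, with a clear message,
    # instead of halfway through a long computation.
    for s in shapes:
        if not isinstance(s, Shape):
            raise TypeError(f"expected a Shape, got {type(s).__name__}")
    return sum(s.area() for s in shapes)
```

Calling `total_area([Square(2), "oops"])` raises a `TypeError` that names the offending type before any area is computed.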
(I don't find it particularly nice to start writing tons of IWhatever objects and such to get behaviors already provided by built-in language constructs.)
"if __name__ == '__main__':" is ugly, but its purpose is not to be pretty; it is to make your module more fool-proof and explicit about what should happen in a script run vs an import. I don't see a good general substitute. If you are sure that only you or someone sensible will be importing/running your module, then I can agree with the advice just to keep those two kinds of scripts separate.
And I agree that any Python code compiling giant strings is in serious need of fixing (including that nasty bit in namedtuple as well as Simionato's otherwise very nice non-stdlib code for signature-preserving decorators - there's really no other way I know of yet for doing that without ugly hacks)
Compiling strings is usually a code smell, but at least in the case of namedtuple I'm pretty sure it is justified for at least one reason: performance. It could certainly be implemented in more idiomatic Python, but the result would probably be less efficient compared to, say, dicts or regular objects.
And if it is necessary for performance reasons, Python should probably be able to do a similar thing efficiently without compiling big old blobs of text.
It is true, of course, that this can be hidden as an implementation detail with relatively little impact to end users... but the mere possibility of relatively clean and judicious uses does not change the fact that as a pattern, it is tricky and hard to read and easy to mess up in horrible ways. It is not that it is fundamentally unworkable. But if it is the only/best way, then that probably indicates a place where Python could be incrementally improved...
There are other drawbacks to exec, though: some people have disabled exec for security reasons, and it's opaque to things like PyPy.
The existence of third-party libraries that are arguably more featureful does not render core libraries obsolete. Handling the 80% case easily is a virtue, not a vice.
> Easy Stuff First
The unfortunate part about this particular example is that subprocess's interface isn't really that well-designed, and so people resort to os.system simply because it is less complicated than subprocess.
> Ducks In A Row
I don't think isinstance is really that bad, but checking based on something's exact type is definitely wrong.
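A small sketch of the difference between the two kinds of check. The function names are made up for illustration; `OrderedDict` is used only because it is a well-known `dict` subclass.

```python
from collections import OrderedDict


def accepts_exact(obj):
    # Exact-type check: only a plain dict passes, subclasses are rejected.
    return type(obj) is dict


def accepts_isinstance(obj):
    # isinstance check: dict and any subclass of dict pass.
    return isinstance(obj, dict)


od = OrderedDict(a=1)
print(accepts_exact(od))       # False
print(accepts_isinstance(od))  # True
```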
> Toys are for Children
One of the problems of a batteries-included stdlib is that you have to support it just about forever. Though I would like to note that for basic async programming Tornado is surprisingly good, and less complex than Twisted.
> Foreign Concepts
Apparently this is especially useful for scientific and mathematical computing. I can't see how it's especially dangerous since just about everyone will use list instead.
Actually, I use Python for mathematical computing, and I've never used Python's built-in arrays. NumPy arrays are much more convenient, and NumPy is written in C, so it's way faster than using a list.
It's a win-win: the speed of C, combined with the flexibility and syntax of Python.
The only reason array really exists in Python (this implementation, that is), is to support some other libraries that need O(1) indexing but don't want to depend on NumPy. For everything else, you should really just use NumPy.
I think people are just used to system(), particularly if they come from C.
The little part you link to starts off simple, sure, but then says "Calling the program through the shell is usually not required.", and proceeds to present a "more realistic example", leading to an immediate "WTF?" for anyone just learning of subprocess.
People who have been doing things the obvious way for years will not take well to a new mechanism whose function and implications they don't understand, and whose use is immediately discouraged by its own documentation.
99% of system() calls (regardless of language) will never be so adorned. It's used for quick hacks, rarely anything more. If the return code is checked at all, people only pay attention to whether it's something other than 0.
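For comparison, a minimal sketch of the "unadorned" case written with subprocess and no shell. The helper name `run_quietly` is hypothetical, and `sys.executable` merely stands in for whatever external program you would actually call.

```python
import subprocess
import sys


def run_quietly(argv):
    """Run argv (a list, no shell involved) and return (exit_code, stdout)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return proc.returncode, proc.stdout


# sys.executable stands in for "some external program" so the sketch is portable.
code, out = run_quietly(
    [sys.executable, "-c", "print('hello'); raise SystemExit(3)"]
)
if code != 0:
    print("command failed with exit status", code)
```

Passing a list rather than a single string means no shell parses the arguments, which is the main thing os.system cannot offer.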
I think it's potentially dangerous if/when Python newbies from other languages pick up old idioms and (ab)use them thinking they're "Pythonic".
I haven't seen it with a Python codebase, but I have worked on a Java codebase that was developed from scratch by C developers new to Java. They used lots of arrays (same sin), minimal Collections, and lots of for(i = 0; i < arr.length; i++) instead of foreach. Urgh.
So it isn't Python's fault, but I see how it could pose a danger.
It's possible they learned Java at some earlier point in their lives, but my understanding was the team came from embedded/systems C backgrounds and I think they just did what felt right.
import subprocess
# shell=True hands cmd to the shell, so only pass trusted command strings
run = lambda cmd: subprocess.check_output(cmd, shell=True)
data = run("ps -e")
If I write a Python class that is going to be used with someone else's code, then it is up to me to make sure that my class behaves reasonably -- in the context of Python's normal behavior. For example, if I define a bracket operator, then Python will make me an iterator; I need to be sure that iterator behaves well.
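A minimal sketch of the bracket-operator point: a class that defines only `__getitem__` becomes iterable through Python's older sequence protocol, which calls it with 0, 1, 2, ... until `IndexError`. The `Countdown` class is a made-up example.

```python
class Countdown:
    """Hypothetical class that defines only the bracket operator."""

    def __init__(self, start):
        self.start = start

    def __getitem__(self, index):
        # Raising IndexError is what tells the implicit iterator to stop.
        if index < 0 or index >= self.start:
            raise IndexError(index)
        return self.start - index


print(list(Countdown(3)))  # [3, 2, 1]
```

If `__getitem__` never raised `IndexError`, the same `list()` call would never terminate, which is the kind of misbehavior the class author has to rule out.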
And similarly with "copy". Someone might want to write a generator that spits out instances of my class. If we apply list() to the output of that generator, then the result will not be correct unless the generated objects are actual separate objects. copy.copy is a good way to make things actually distinct, that might otherwise not be.
That means that instances of my class need to behave well if someone calls copy.copy on one of them.
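The generator scenario can be sketched like this. `Point`, `walk_shared`, and `walk_copied` are hypothetical names; the only real API used is `copy.copy`.

```python
import copy


class Point:
    """Hypothetical mutable class."""

    def __init__(self, x):
        self.x = x


def walk_shared(n):
    p = Point(0)
    for i in range(n):
        p.x = i
        yield p  # the same object every time


def walk_copied(n):
    p = Point(0)
    for i in range(n):
        p.x = i
        yield copy.copy(p)  # a distinct snapshot each time


print([p.x for p in list(walk_shared(3))])  # [2, 2, 2]
print([p.x for p in list(walk_copied(3))])  # [0, 1, 2]
```

The first list holds three references to one object, so every entry shows the last value written; the second only works if `copy.copy` behaves sensibly for the class.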
I don't see that the issues here are much different from those involving regular old methods in a class. If I'm putting together a class, and I write a method that does something nasty, then code that uses that class and calls the method, will have bad results. Similarly copy.copy needs to avoid doing nasty things, too.
So, I don't have a problem with expecting copy.copy to produce reasonable behavior, when I use it with a class written by someone else.
In any case, this is a nice article. Thanks for posting.
I've been using NumPy for a while now, and I've literally never had a reason to use array.array where NumPy arrays have worked.
Of course, if I do significant computations in my Python program itself, I just use NumPy.
The usual #python response to people using the array module was to instead consider NumPy (with its excellent arrays) or use normal lists and PyPy to make it fast.
I don't understand this suggestion. The ability to interact with my code via the REPL is a very important Python feature for me; is the author suggesting I abandon that?
ipython>import foo as t
Reasoning with opinion is dumb.
Complaining that SimpleHTTPServer is not suitable as a production public web server is ridiculous. That was never its intended use, and most people understand this. Not every HTTP server is a public web server, however. It is very convenient to fire one up as a test server and to get around some of the browser security constraints on the file:// scheme. It is also useful for a lot of lightweight internal integration and configuration UIs. Deploying SimpleHTTPServer as a production web server is not a correct use. But neither is deploying a heavyweight HTTP server like Apache for testing or lightweight integration.
I agree JSON is preferable to pickle in most situations. But you have to understand the historical context. Pickle was the standard way to serialize objects well before JSON became popular. Serialization is an important topic in general, and you will find that many other languages provide similar facilities. Also, JSON only addresses a subset of the serialization problem.
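A small sketch of what "a subset of the serialization problem" can mean in practice. The `record` value is invented for illustration; the calls are the standard `pickle` and `json` APIs.

```python
import json
import pickle
from datetime import date

# Hypothetical record containing types JSON has no native notation for.
record = {"when": date(2012, 1, 1), "tags": {"a", "b"}}

# pickle round-trips arbitrary Python objects (only unpickle trusted data).
restored = pickle.loads(pickle.dumps(record))

try:
    json.dumps(record)
    json_ok = True
except TypeError:
    # json.dumps rejects date and set unless you supply your own encoding.
    json_ok = False

print(restored == record, json_ok)  # True False
```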
The array module is used not only for performance. It also supports tight packing of data. If you have a million integers, array packs them tightly as 1 million x 4 bytes. Storing them in a list has a much larger memory footprint because they are stored as 1 million individual objects plus the data structure of the list. I happen to have written a proprietary database engine in Python where memory, disk and I/O footprint matter hugely. Go easy on berating it because there are use cases you are not aware of.
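A rough way to see the footprint difference for yourself, assuming CPython. The list figure is only an estimate: it sums per-object sizes and slightly overcounts because small integers are shared. The exact item size of type code "i" is platform dependent, so the sketch reads it from `itemsize` instead of assuming 4.

```python
import sys
from array import array

n = 1_000_000
packed = array("i", range(n))   # contiguous C ints
as_list = list(range(n))        # pointers to individual int objects

packed_bytes = packed.itemsize * len(packed)
# List overhead: the pointer table plus every int object it refers to.
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(x) for x in as_list)

print("array payload:", packed_bytes, "bytes")
print("list estimate:", list_bytes, "bytes")
```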
It is sad that the author does not understand namedtuple but chooses to mock it. I studied the code extensively and emulated it in my application. To appreciate namedtuple, think about what the alternative to this implementation is. Try to write your own that serves the same function. Yes, you can do it more easily, and with more readable code, using __setattr__ and __getattr__. The only problem is that every time you access an attribute you get hit with a big overhead. The use of exec is not a hack but a deliberate decision to provide attribute access at a speed comparable to standard attribute access. I agree with the author: "do not do this at home". Leave the heavy lifting to the experts unless you really know Python inside out like namedtuple's author Raymond Hettinger.
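A sketch of the "more readable" alternative described above, so the overhead can be measured rather than argued about. `DynamicRecord` is a hypothetical class written for this comparison and is not how namedtuple is implemented; the timings are printed, not predicted, since they vary by machine and Python version.

```python
import timeit
from collections import namedtuple


class DynamicRecord:
    """Hypothetical __getattr__-based record with fields x and y."""

    _fields = ("x", "y")

    def __init__(self, *values):
        self._values = values

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)  # avoid recursing on our own internals
        try:
            return self._values[self._fields.index(name)]
        except ValueError:
            raise AttributeError(name) from None


Point = namedtuple("Point", ["x", "y"])

d, p = DynamicRecord(1, 2), Point(1, 2)
print("__getattr__:", timeit.timeit(lambda: d.y, number=100_000))
print("namedtuple: ", timeit.timeit(lambda: p.y, number=100_000))
```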
python3 -m http.server 8080
Is chaining try-catches as a trial-and-error way to figure out a type really the Right Way to do things these days?
* When discussing "__name__ == '__main__'" he complains about 13 non-alphanumeric characters, and then proposes a setuptools-based alternative that's 12 extra lines.
* He disses asyncore/asynchat, and then proposes the much more complex Twisted instead.
If your project is already big enough to have multiple files, already complex enough to require Twisted-level functionality (though even then Twisted sucks compared to Tornado or just about anything else), if it's already being packaged for general use via setuptools, then following these suggestions is almost free. OTOH, they're way overkill for other situations. About three years ago some colleagues and I wrote an asynchat-based server to coordinate certain administrative actions on a 1000-node system. It was stable, it performed as well as it needed to, and - even after working around some "infelicities" in asynchat - it was still only half as much code as would have been necessary in Perverted.
I don't think I'll be signing up for any idioms or coding standards that are based on an assumption of using Python as a direct replacement for Java. Neither should anyone else.
- the async* library is not thread safe: loop and poll* both use a global socket_map (read the code to see what I'm saying -- on my Mac it's at /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/asyncore.py)
- There's no way to integrate timers in the loop (there are constructs like timerfd in linux to get around this deficiency in the linux C api)
That being said, IRC bots (more generally, simple chat applications) are the type of application perfectly suited for asyncore, and I'm pretty sure someone has written a tutorial on it using an IRC bot as the goal.
My bot has an idle function, which is called every 30 seconds and does background jobs like checking git repos for new commits. The 30 seconds are the timeout of the select/poll call asyncore does internally, which can of course be changed. Based on this the bot got a cron-like timing infrastructure.
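The idle-on-timeout idea can be sketched without asyncore (which was removed from the standard library in Python 3.12) using the `selectors` module. This is not the bot's actual code: `run_loop`, `on_idle`, and the `rounds` parameter are assumptions made for the example, and the timeout is shortened from 30 seconds so the demo finishes quickly.

```python
import selectors
import socket


def run_loop(sel, on_idle, timeout=30.0, rounds=None):
    """Dispatch ready sockets; call on_idle() whenever select() times out.

    rounds=None loops forever; a number is handy for demos and tests.
    """
    done = 0
    while rounds is None or done < rounds:
        events = sel.select(timeout)
        if not events:
            on_idle()  # nothing happened for `timeout` seconds
        for key, _mask in events:
            key.data(key.fileobj)  # key.data holds the registered callback
        done += 1


received, idle_calls = [], []
a, b = socket.socketpair()
sel = selectors.DefaultSelector()
sel.register(a, selectors.EVENT_READ, lambda sock: received.append(sock.recv(1024)))

b.send(b"PING")
run_loop(sel, lambda: idle_calls.append(1), timeout=0.05, rounds=2)

sel.close()
a.close()
b.close()
```

In this form the idle hook only fires when the loop has been quiet for the whole timeout, so a busy connection delays background jobs; a cron-like scheduler on top would need to track wall-clock time as well.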
It's a legacy of the import system and I'd rather see it go.
The non-alphanumeric argument is pretty weak though.
I don't know enough about asyncore/chat but SimpleHTTPServer is quite spot on. It's fine as a development tool, but really should be avoided in production environments.
(btw, Twisted is awesome. A comparison with Tornado only compares a small part of Twisted. Sure it was developed pre-PEP8 but it's still solid kit. Nothing quite like it on the Python market, afaik. Also the docs have gotten a lot better in recent years.)
I am not sure. But if I'm using Java or C++ or an equivalent, say even Clojure, then the glue language for the rest is going to be Perl/Bash, not Python. The reason is that Perl does scripting far better than Python.
Python is more fashionable today courtesy Django, Twisted and other frameworks.
And I think that's the problem here. Python is trying to be half serious in both the Java and Perl worlds. Java and Perl have nearly orthogonal goals. If you try to do half the work in both worlds, you end up pleasing neither. Try to be in either this camp or the other.
But if you want to be able to go back to that code later, or reuse it in a bigger system, or have things like robust error handling and testing, you are out of the domain of short shell scripts anyway. Now you have a choice between writing nice Perl code, which takes a little more overhead and effort and thought than just using Perl, and writing nice Python, which is pretty directly what Python's language design is about. I figure it's a matter of what you know best and what you like and what libraries you need.
I reckon the reason you won't use Python is that you are relatively uncomfortable with Python for whatever reason. That's legitimate, but it doesn't mean that Python won't be a great solution for someone else doing the same task.
As a 20-year old language from the Unix/C world with mostly boring Algol-family syntax, I don't find Python very fashionable at all, and I think it is just weird to accuse Python of "trying to be half serious" in anything except being Python. If people prefer to write big apps in Python or Rakudo Perl, and those languages facilitate writing big apps, it doesn't really have anything directly to do with Java (except that all the involved languages have some shared heritage, concepts of objects, etc.) The same is not true of C# or Scala, which have really deep and undeniable debts to Java, nothing like the very general family similarity you see between Java and 90s interpreted languages like Perl and Python.
The problem with treating Python as Java is that Python is NOT Java (and has never attempted to be Java) - so the results of treating it like Java are often not as a Java programmer expects, giving the impression that Python is a bad Java. But it isn't supposed to be a Java at all; this is just Doing It Wrong. The same is undoubtedly true with other language combinations - it isn't fair to judge Ruby harshly for responding poorly to C idioms, for example, because it just ain't C.