Hacker News
Python Packaging: Hate, hate, hate everywhere (pocoo.org)
191 points by uggedal on June 22, 2012 | 115 comments

sigh If you want to help, check out the Python-dev thread where this is being hashed out ad-infinitum rather than flamed to death: http://mail.python.org/pipermail/python-dev/2012-June/120430... - for the tl;dr crowd - yeah, things have problems, want to fix.

> check out the Python-dev thread where this is being hashed out ad-infinitum rather than flamed to death

Where am I flaming anything to death here? The sole purpose of the post was to point out that certain things worked in setuptools that broke with the introduction of pip. That's neither bashing setuptools nor pip, just stating something that you can observe yourself.

It's an observation of an unfortunate effect: someone took someone else's work to make a new tool and missed a usecase that the old tool supported, with the end result being two tools with overlapping usecases, neither of which is a superset of the other.

With some modifications virtualenv + pip works perfectly fine for us. It just takes some extra work and it's unfortunate that it had to happen. Personal lesson learned: try not to replace tools in the future without understanding all the usecases of the old tool first.

> If you want to help

I think that's one of the big fallacies of open source: that you should fix it yourself rather than saying that something is not working. I reserve the right to feel unhappy when something does not work and I reserve the right to share my experiences with others without having to feel guilty about not helping out.

I tried patching pip to be able to install binary eggs but it turned out to be impossible without replicating all the logic in easy_install because pip by design does not follow it. Rather than making pip2 I decided against it and looked at what the problem actually is: we want to have fast builds. So we could solve the problem by copying over virtualenvs and fixing the paths. Perfect? No! But it's a simple fix and solves our problem.

I didn't put binary support into pip because (a) it didn't work well on some systems, and (b) I didn't use the other systems. I am not aware of rejected patches to support binary packages on Windows (where it would be nice to include), or even rejected patches on other systems (where I think it would be misguided).

That this functionality still doesn't exist years later I think is due in part to the problem domain – people find it easier to change parts of their workflow than to add this functionality, and the whole problemspace is kind of hairy which makes it hard to advance without causing regressions in somebody else's weird workflow. (And admittedly the code is hairy too – but who wants to fix that? I don't! No one does!)

> I didn't put binary support into pip because (a) it didn't work well on some systems, and (b) I didn't use the other systems. I am not aware of rejected patches to support binary packages on Windows (where it would be nice to include), or even rejected patches on other systems (where I think it would be misguided).

I did not even manage to come up with a patch that adds binary support. I looked into it one or two months ago and the way pip finds and installs packages is very different from how easy_install does it. It looked like too big of a task for me to do in the time I allocated for it.

Until two months ago I did not care at all about binary distributions. It never was an issue for me. Suddenly a requirement changed and I was presented with a problem that turned out to be tricky to fix with pip but easy as pie with easy_install.

This is not a criticism of either pip or you as a person, just an observation of how easy it is to miss something because it does not fit one's personal requirements. I certainly did not care about binary eggs until I had to deploy code to more than one machine and wanted to have the ability to quickly switch between releases.

> And admittedly the code is hairy too – but who wants to fix that? I don't! No one does!

Yes. And now it's getting replaced with a new system and I hope that's not going to make the same mistakes. I certainly don't feel like I'm up to the task of making a replacement for distutils/setuptools.

I think you are confusing "missing a requirement" with "not prioritizing a few other people's requirements". That pip is missing this feature has always been documented, I always knew it was a missing requirement, but I did not have any personal motivation to resolve that.

Pip is great for many many use cases, but I think it kind of gets overhyped as a replacement for easy_install. People just get upset when they encounter a situation where it doesn't live up to the hype.

What hype? pip can uninstall packages. This is a very hard feature to live without for everyone. Binary eggs seem to have a devoted following but are not a universal need.

> This is a very hard feature to live without for everyone.

I never ever uninstalled a package since I started using virtualenv. I always just trash the venv every once in a while and recreate it from a requirements.txt.
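The throwaway workflow described above could be sketched roughly like this. It is only an illustration under some assumptions: it uses the standard-library `venv` module (Python 3.3+) instead of the virtualenv tool the thread is about, the function name `recreate_env` is made up, and the optional install step needs pip and network access.

```python
import subprocess
import sys
import venv
from pathlib import Path


def recreate_env(env_dir, requirements=None):
    """Throw away env_dir (if it exists) and build a fresh environment.

    `recreate_env` is a hypothetical helper, not part of virtualenv or pip.
    """
    env_dir = Path(env_dir)
    with_pip = requirements is not None
    # clear=True empties an existing environment directory before rebuilding.
    venv.EnvBuilder(clear=True, with_pip=with_pip).create(str(env_dir))
    if requirements is not None:
        # Script directory and interpreter name differ between Windows and POSIX.
        if sys.platform == "win32":
            python = env_dir / "Scripts" / "python.exe"
        else:
            python = env_dir / "bin" / "python"
        subprocess.check_call(
            [str(python), "-m", "pip", "install", "-r", str(requirements)]
        )
    return env_dir
```

Calling `recreate_env("env", "requirements.txt")` every once in a while replaces any need to uninstall individual packages, which is the point being made.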

No arguing that working on setuptools to improve it would have been better, but the thing is that Phillip (setuptools) hasn't been cooperative (to say the least) and Tarek ended up forking. At least that's how I remember it, so in that light, everything in your article is true but setuptools had to be "rebooted" as pip.

I think that pip + distribute was intended to be the pragmatic approach you're calling for and that distutils2 is the cleanup/rewrite/redesign you said should happen after. Except it's not as clean cut as "repair setuptools with as many hacks as necessary and then rewrite the implementation once all design decisions have been finalized".

That said, you're completely right, use cases were forgotten and distutils2 might have started too soon. Hopefully your article sparks interest in those other use cases.

"... Where am I flaming anything to death here? ..."

The title. Change the title to something neutral and it will match the article. Good read.

> try not to replace tools in the future without understanding all the usecases of the old tool first.

This is exactly why autotools is so hard to replace in the C/C++ world for distributing source packages on Linux/Unix/etc., and why CMake/SCons/Jam/etc. have only ever had success in the right niche (projects which are very large or cross-platform).

I think you missed the subtext. The article wasn't projecting hate, it was describing how previous hateful reactions have spawned incomplete replacements, and have made the problem worse (edit: or at least more complicated).

This article is an explanation of the mechanics of that process: why packaging isn't great. It's a software engineering lesson about use cases and partial supersets of features. I really enjoyed reading it.


Every developer should know one of the most helpful and effective ways of doing things in open source software development: "Stop complaining, fix it."

It's one thing to complain about closed source software that you have to use and probably paid for as well. But a very different one to complain about free and open source software that you want to use.

Stop the bickering, start the work on fixing the problem instead.

While I don't think it's to the same degree, I do think that "fix it" is at least in the same category of "things that turn people off OS" as RTFM.

There are different levels/kinds/what-have-you of users/coders. There are wizards who write operating systems, grey-beards who write APIs, application developers, scripters, and application users. It's perfectly valid for one kind of person to complain to another with no hope of fixing the issue themselves.

I work on open-source middleware both for a living and for fun and I welcome complaints about particular use-cases just as much (if not more than) patches. Cultural silos do exist between the different kinds/levels of coders and these kinds of heated conversations are needed. I can't even count how many times I've received a complaint from an app developer that ended up being a valid use-case that I had not foreseen. In fact, most of the time fixes for these issues are easy on my end. These complaints are helping me expand the user-base and utility of my code. In cases where the fix is not simple, it means there is a deeper architectural issue that I need to put on the road map. Even if they hurt my pride occasionally, these users are helping me and are contributing.

I just want people to know there are avenues to help, and that core developers and many others aren't blind to the mistakes, missteps and other things wrong.

I'm exhausted by the vitriol; I'd ignore a bug report (and have) that described things as idiotic, and there are avenues for people to help chime in and guide the future.

Nothing is unfixable.

I didn't see any vitriol in the article.

> I didn't see any vitriol in the article.

I hope there was not because that was not my intention at all. It's more an observation that replacing things with other things is a dangerous thing because you can easily dismiss parts of the design as “useless” by accident.

I had the feeling that happened when ditching setuptools and that's a mistake that should not be repeated.

I made that mistake in the past already at one point where I tried to replace the Python logging library and I missed perfectly valid usecases in the process.

As a newbie coming into the python world, from the nodejs world, it's somewhat hard for me to just start fixing a broken system. It took me a few hours to fully understand the npm system, yet I still cannot understand the combination of tools python has -- virtualenv, burrito, pip, easy_install, etc. Sometimes I can get things to install with easy_install, but not with pip, and sometimes it's easier the other way around, mostly I simply don't comprehend what's going on under the covers so it's hard for me to start fixing.

I'd like to fix it too, but honestly I don't think people would adopt my package manager if I were to create something new from scratch.

  virtualenv => a way of isolating a set of python libraries. 
                Sort of like a chroot for Python libraries.  
                (Python's version of Perl's perlbrew or Ruby's

  pip => Supersedes easy_install, but as this post mentions,
         doesn't cover 100% of the use-cases (just the most
         common ones). Think of it as Python's version of
         Perl's cpanminus (aka cpanm).

  burrito => The README says it all:
                With one command, have a working Python 
                virtualenv + virtualenvwrapper environment

  virtualenvwrapper => A set of shell scripts/functions for
                       working with multiple virtualenvs in
                       development. Basically sets them all up
                       in a central location so that you can
                       do things like:
                          workon my-websight-library-set

how is nobody aware of PythonBrew? to me it seems like the best and most comprehensive environment manager. it supports separate Python versions as well as virtual envs, in a very simple command line interface that is reminiscent of Ruby's RVM:


I have heard of pythonbrew[1][2][3], but I didn't feel the need to list everything.

[1] http://news.ycombinator.com/item?id=3183721 [2] http://news.ycombinator.com/item?id=2795333 [3] http://news.ycombinator.com/item?id=2856131

As a sysadmin pip and its ilk annoy me as well, although for a reason which is not mentioned in the article: It creates an entire package management system which is not the distribution's package manager. Ruby's gems tick me off for the same reason. As a sysadmin you need to decide to either manage Python and Ruby entirely outside of the OS's native package management or try and wrap every single Python egg and Ruby gem in an RPM/DPKG/whatever. Mixing the management between two packaging systems is just going to cause trouble.

There's a similar problem with Perl's CPAN, which is what I think all of these projects are aiming to emulate, but the nice thing about CPAN is that repositories like rpmforge or EPEL already have a large number of commonly used CPAN packages all wrapped up in RPMs already. It would be nice to see similar community efforts centered around Python and Ruby packages. Perhaps one day I will have enough free time to start one.

The problem is the distribution's packages are often horrendously out of date. I have stopped using them entirely even for things like Numpy and Scipy which I would really rather use them for (as they take a long time to compile and rely on several C libraries and a Fortran compiler). It isn't my favorite thing to have my fabric task compile all the dependencies but there isn't really an easy way around it.[1]

In general I think pip has drastically improved my work flow and code structures. For instance: I no longer use submodules for my library code, I pip install the git repositories. Huge improvement, way easier to manage.

[1] Obviously, you can get around it if you do things like copy the virtualenv and fix the paths as in the OP's post. However, I wouldn't call that "easy".

It's true that the distro's packages can often be out of date: particularly with "Enterprise" releases like RHEL/CentOS. That doesn't mean that one shouldn't use the package manager, though. Building custom packages to backport updates and tracking the upstream for security and bug fixes isn't the funnest thing in the world, but it is often a necessary evil. It's certainly better than a "compile from source then dump everything into a tarball" approach which leaves you with no good way to track what versions of which software are installed on which nodes.

It isn't a necessary evil when it isn't necessary. If you have isolation (like virtualenv) then each app gets exactly what it needs. Life is too short to waste it trying to make every single bit of Python use the same versions of dependencies, pinning the version in the package manager, etc.

The universe does not revolve around lazy sysadmins

I think that a 'best practice' is to do something like create a virtualenv for your app and package the entire virtualenv in the distro's package manager, then version that.

> That doesn't mean that one shouldn't use the package manager

Using the distro package manager for eggs, gems or jars is an exercise in futility.

Bundler, Maven, NPM et al are not flawless, but maturing rapidly, and a much better fit to the problem domain.

I don't understand why they can't all play along. Why can't pip packages be generic enough to be transformed into rpm/deb packages? Why can't a repository of these packages be maintained with a comprehensive and up to date selection of packages for each OS?

Why must everyone reinvent the wheel, but do so in an incomplete way?

Ubuntu is pretty good about keeping its packages up to date and specifically Numpy and Scipy.

The answer, as far as I can tell, is that it makes it easier for Python (and Ruby and Perl) developers to share code and stay on the "latest and greatest" versions of their libraries. I mean, as a library author, what would you rather do? Just create a single egg, gem, CPAN package, or whatever? Or would you rather create three or four versions of the same package (one for Debian-based systems, one for Fedora-based systems, one for Gentoo, one for Arch, etc.)? I understand that it can be frustrating from a sysadmin perspective (pip, especially, isn't sysadmin friendly at all), but from the perspective of a library author, eggs and gems are much preferable to debs and rpms because I know that they'll work in a reasonably distribution independent fashion.

EDIT: I know that's not a satisfying answer. But, as far as I can tell, it is the primary reason that runtimes like Python and Ruby include their own package management rather than falling back on the package management that the OS provides.

I don't agree with that approach. It sounds nice for a sysadmin, but as a programmer I don't want to have to wait for you or the distributor to approve and package the libs I need to work with.

> I don't agree with that approach. It sounds nice for a sysadmin, but as a programmer I don't want to have to wait for you or the distributor to approve and package the libs I need to work with.

Which is why development, QA, staging and production are all separate environments. Do what you want in your dev environment: when the code is ready for testing let the ops team know what the dependencies are. We'll take care of wrapping everything in packages and updating chef/puppet/whatever to deploy it.

This doesn't consider the use case of needing to have multiple versions of the same package installed for testing. Sure, it would be nice to be able to have a second machine provisioned to do testing on.

And, as an extension of that, if you are using virtualenv+pip for testing, it makes sense for the deployment to use that too, as that means that the testing environment is closer to the deployment environment.

I have the same problem, fortunately there is fpm. I use it to create RPM binary packages from pip repos -- problem solved.

First I've heard of it, thanks for the pointer!

I would have liked to see a rationale for using virtualenv and all this python-only stuff, when rpm/apt/msi had more mature tools and would have given sysadmins a unified sanely manageable view of what's deployed and their cross-language dependencies.

I am going to guess you are coming at this entirely from the perspective of a sysadmin who doesn't do a lot with python.

virtualenv allows you to easily isolate dependencies so that multiple versions can coexist in different projects without fighting. If you sysadmin significant amounts of Python stuff then you would have already seen that benefit. It also allows you to control PYTHONPATH more elegantly than setting it in every script. This lets you write clean sensible imports rather than using fragile path tricks or wrestling with relative imports.
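For anyone who wants to see the isolation at work, here is a minimal sketch of how a script can tell whether it is running inside an isolated environment and which package directories it will import from. The helper names `in_virtual_environment` and `site_package_dirs` are made up for illustration; the `sys.real_prefix` check covers older virtualenv releases, and `sys.base_prefix` covers the standard-library `venv` module and newer virtualenv.

```python
import sys


def in_virtual_environment():
    """Best-effort check for an isolated environment (hypothetical helper)."""
    # Older virtualenv releases set sys.real_prefix inside an environment.
    if hasattr(sys, "real_prefix"):
        return True
    # With venv (and newer virtualenv) sys.prefix points at the environment
    # while sys.base_prefix still points at the underlying installation.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


def site_package_dirs():
    """Entries on sys.path that look like third-party package directories."""
    return [p for p in sys.path if p.endswith(("site-packages", "dist-packages"))]


if __name__ == "__main__":
    print("isolated:", in_virtual_environment())
    for entry in site_package_dirs():
        print("packages from:", entry)
```

Run with the system interpreter and again with an environment's interpreter, the listed directories differ, which is why imports stay clean without setting PYTHONPATH in every script.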

Your project dependencies are often not exactly the same as what the distribution gives. On one hand, distributions can be mind-bogglingly slow about releasing updates to python libraries. On the other hand, you may not want to force your project to use a new update of something because it may introduce bugs. It is pretty essential to have some lead time to port if your app is anything important. So you need control of the versions of your dependencies.

So if your project dependencies are not exactly the same as what the distribution uses, then you will need some kind of isolation mechanism in order to allow your project to work as a distribution package. You could do that, but you could also save time and just use virtualenv.

Creating platform-specific packages is nontrivial, also it is platform-specific. If I write a library for Python I only want to specify the packaging metadata/scripts once. I am certainly not interested in trying to get it accepted in repos for every distribution.

> Creating platform-specific packages is nontrivial, also it is platform-specific.

Unfortunately, Python-specific packages are Python-specific. I don't want to manage one package manager for Python, one for Ruby, one for Perl and yet another one for system libraries. And I'd prefer to have to trust only one vendor for security updates for everything.

And anything that is not Python depends on system libraries, which are supplied by apt or yum and live in a distro-specific namespace. How does setuptools declare and pull in a dependency on a system library?

I understand the need for Python, Ruby etc. to have their own packaging for cross-platform use, including on non-Unix. But system packaging tools have their uses too. There is no easy answer, and certainly no One True Answer.

I never even implied that system packages do not have their uses, or that python packages are free-standing of libc. That is just a straw man.

In fact, I agree with you for system libraries or tools which happen to use Python and really do not need virtualenv because they are really system-global.

Where I don't agree with you is with use cases you clearly don't understand, where you are developing and/or running multiple python apps in which you NEED isolation or you NEED to manage the versions of your dependencies. Your package manager is not helping with that at all.

And if you don't understand these tools and configuration management tools, then you should not be a sysadmin for projects which involve significant amounts of Python or Ruby.

A system packaging system that was better about sandboxes and running certain programs in certain contexts would be the One True Answer. NixOS (a Linux distro) at least has the requisite features as bullet-points, but I have yet to get it to install successfully in either of my two attempts. (And my level of experience with Linux installations is "no longer need to consult the Gentoo manual to install Gentoo".)

> I am going to guess you are coming at this entirely from the perspective of a sysadmin who doesn't do a lot with python.

I am going to guess you are coming at this from the perspective of a small web developer who hasn't deployed large projects at various customer sites, and then had to support them. Your projects are also just python, not much anything else (C++, Erlang, data files).

For example if you deploy on a well known OS and you deliver a software package to customers and expect to upgrade and maintain them then OS packaging makes sense. Otherwise you are re-inventing the wheel and are shooting yourself in the foot.

I, for one, don't see how you can just dump files onto a system and then, when it comes to upgrading, never remove the previous files and simply overwrite them again.

> Creating platform-specific packages is nontrivial, also it is platform-specific.

As much as everyone bashes distutils, it does (did?) have a reasonably easy way of building rpms -- setup.py bdist_rpm. It has bugs, but it works.

How do you use rpm to provide isolation of dependencies and simple control of PYTHONPATH? This lack of awareness shows that you don't understand the purpose of virtualenv. Nobody is arguing about the utility of packages, but you really don't understand the Python world at all if you think that packaging systems cover the use case of virtualenv.

I wasn't talking about virtualenv only. I was referring more to just using setuptools and pip to manage dependencies and upgrades.

However, system packaging is orthogonal to virtualenv. You can have virtualenv setup packaged inside an RPM (an isolated, self contained package).

> How do you use rpm to provide isolation of dependencies and simple control of PYTHONPATH?

Our product has about 50 or so RPMs. Only a couple of packages needed self-containment. Isolation has its downsides -- you cannot reuse code. So we have managed to avoid it as much as possible.

> but you really don't understand the Python world at all

I understand the "deliver a product" world though. Python is part of the product, not the whole product. I think it is a little short-sighted to assume everything is just Python and that is the end of the world, and just run "setup.py install" and you are done. And yes, we have been through the "just run pip" or "just run setup.py install as root" before, and it creates a dependency and upgrade tracking nightmare.

Install a plugin using "setup.py install". Then update the package and remove some modules, run "setup.py install" again.

Now all of a sudden your system is messed up because it picks up old modules that were never cleaned out, since the previous package was not properly removed when upgrading.
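The failure mode described above can be reproduced with a small simulation. To be clear about the assumptions: this does not run "setup.py install" at all, it only mimics an installer that copies files over an existing directory without removing anything first, and the names `overwrite_install`, `stale_files` and the file names are invented for the example.

```python
import shutil
import tempfile
from pathlib import Path


def overwrite_install(release_files, target):
    """Write a release's files into target without removing anything first."""
    target = Path(target)
    target.mkdir(parents=True, exist_ok=True)
    for name, source in release_files.items():
        (target / name).write_text(source)


def stale_files(release_files, target):
    """Files present in target that the current release no longer ships."""
    installed = {p.name for p in Path(target).iterdir() if p.is_file()}
    return sorted(installed - set(release_files))


def demo():
    target = Path(tempfile.mkdtemp()) / "plugin"
    v1 = {"__init__.py": "", "core.py": "VERSION = 1\n", "legacy.py": "OLD = True\n"}
    v2 = {"__init__.py": "", "core.py": "VERSION = 2\n"}  # legacy.py was dropped
    overwrite_install(v1, target)
    overwrite_install(v2, target)  # "upgrade" by overwriting in place
    leftovers = stale_files(v2, target)
    shutil.rmtree(str(target.parent))
    return leftovers


if __name__ == "__main__":
    print(demo())
```

After the second install, `legacy.py` is still importable even though version 2 no longer ships it, which is the situation a package manager with a file manifest avoids.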

My perspective is a developer on a small team who had to try and support a legacy environment in which no two machines had the same versions of the kernel, native libraries, web server, script engines, and our code, and every box was an irreplaceable work of art which we basically had to /bin/dd to migrate. It took us a traumatic year to get from there to "yum install our-app" producing a usable app server on a freshly-imaged box, and I don't think we would have succeeded without being able to express all our dependencies regardless of language.

As for mixing different versions of a library, that way lies madness. I regard it as a problem I want the package manager to prevent. An app that isn't on the company-wide version of a library is like radioactive waste—keep it tightly contained on separate hardware and dispose of it as soon as we possibly can. This does make upgrades a Big Deal involving a lot of regression testing, but it's better than troubleshooting code that both is and isn't what's running.

> Creating platform-specific packages is nontrivial

Not so much any more - for some platforms, anyway: http://lists.debian.org/debian-python/2010/01/msg00007.html

Also look at fpm: https://github.com/jordansissel/fpm/wiki

This sysadmin uses it frequently and it's a big time-saver for many software packages.

You have a system running "Package A v1.0", which requires "Package X v2.3" and "Package Y v4.5".

You need to install "Package B v2.0", which requires "Package Y v5.0".

Note that Y4.5 and Y5.0 are incompatible. RPM/APT have ways of dealing with some of this sort of thing, but python doesn't necessarily support it. Packagers would have to do things like patch both A and B to do things like "import y45 as y" and "import y50 as y". (I've seen this approach.) Of course, this breaks Packages C and D, neither of which are packaged, that want to use Y4.5 and Y5.0, respectively.
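The conflict in this example can be written down as a tiny check. This is a deliberately simplified sketch: it assumes every dependency is pinned to one exact version (real tools handle version ranges), and `find_conflicts` is a made-up helper, not something pip, RPM or APT provides.

```python
from collections import defaultdict


def find_conflicts(requirements):
    """Return dependencies that are pinned to more than one version.

    requirements maps a top-level package to {dependency: exact version}.
    """
    wanted = defaultdict(dict)
    for package, deps in requirements.items():
        for dep, version in deps.items():
            # Remember which top-level package asked for which version.
            wanted[dep].setdefault(version, []).append(package)
    return {dep: versions for dep, versions in wanted.items() if len(versions) > 1}


requirements = {
    "A 1.0": {"X": "2.3", "Y": "4.5"},
    "B 2.0": {"Y": "5.0"},
}
conflicts = find_conflicts(requirements)

if __name__ == "__main__":
    for dep, versions in conflicts.items():
        print(dep, "is wanted in incompatible versions:", versions)
```

In a single shared site-packages only one version of Y can be installed, so either A or B breaks; with one environment per application each gets its own Y.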

Multiply this by a dozen top-level packages and scores of dependencies with potentially sparring versions and it becomes a huge mess.

Consider that developers are keen on using the latest shiny bauble, and are thus pulling code directly from github/etc. The authors of these packages don't always advertise (because they don't know) strict requirements. They might say "Doesn't work with Y5.0", but they aren't aware that it won't work with Y4.4 and don't mention that.

Working out all of the interdependencies is why the ecosystem has packagers and distributors. These people do the thankless job of sorting this all out. But they can't keep up with the bleeding edge, and sometimes developers have nonstandard requirements, so developers still often resort to pulling from trunk (or local copies) instead of using packages.

Factor in the possibility that the system default python is ancient (2.4, 2.5) and the packages depend on newer versions (2.6, 2.7). (Sadly, there are systems that are running CentOS5 (python 2.4) with no hope of upgrade. I know of an internal production system that is still running Red Hat 9! What's that, py2.2?)

In the end the only reasonable approach is sandboxing via virtualenv.

"In the end the only reasonable approach is sandboxing via virtualenv."

Why wouldn't dependencies on versions have worked? If it doesn't matter, import blah. If it matters, import blah-1.2.3.

> rpm/apt/msi had more mature tools

MSI is terrible. It has older tools, and maybe that's what you mean by mature, but every time I've ever interacted with an MSI build process, I walk away amazed that this is the actual, certified, Microsoft-approved way to package applications for the most popular OS in the world.

Furthermore, it's not really that great of a dependency tracking system, although in theory you can use it for that. Certainly nowhere near APT or RPM.

And on OS X, you've got a whole hodgepodge of competing solutions. Which one of those should the Python runtime rely on to quickly resolve packages and modules at import time?

One reason is that many distros are very slow to update the versions of these packages. It's much easier to get up-to-date packages from language-specific package managers.

Non-python programmer here. The author refers to "Python's idiotic import system" - can someone explain why it's idiotic? How does it differ from Ruby or Perl (both of which I am familiar with)?

> can someone explain why it's idiotic?

It's just very complex and until Python 3.3 it was very hard to split things on the filesystem. See also http://www.python.org/dev/peps/pep-0420/

Generally I think Brett Cannon has at least one talk online about how the Python import system works which is quite interesting.

What you can't do with the Python import system is have `org.foo.bar` and `org.foo.baz` as namespaces if they come from different PyPI packages, unless you jump through some hoops (pre Python 3.3).
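For reference, here is a minimal sketch of what PEP 420 makes possible on Python 3.3 and later: two separate directories, standing in for two separately installed distributions, each contribute a module to the same `org.foo` namespace, with no `__init__.py` files anywhere. The directory names and the helper `make_distribution` are invented for the demo.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_distribution(root, module_name, source):
    """Create root/org/foo/<module_name>.py without any __init__.py files."""
    package_dir = Path(root) / "org" / "foo"
    package_dir.mkdir(parents=True)
    (package_dir / (module_name + ".py")).write_text(source)
    return str(root)


def demo():
    base = Path(tempfile.mkdtemp())
    first = make_distribution(base / "dist_bar", "bar", "NAME = 'bar'\n")
    second = make_distribution(base / "dist_baz", "baz", "NAME = 'baz'\n")
    # Both roots go on sys.path, as two installed distributions would.
    sys.path[:0] = [first, second]
    importlib.invalidate_caches()
    bar = importlib.import_module("org.foo.bar")
    baz = importlib.import_module("org.foo.baz")
    return bar.NAME, baz.NAME


if __name__ == "__main__":
    print(demo())
```

Before 3.3 the same layout needed explicit namespace declarations in `__init__.py` files (for example via pkgutil or setuptools), which is the hoop-jumping referred to above.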

My own opinion, being more knowledgeable about Python's import than I wish I was, I feel like Node.js's require() system is what I wish Python's system was – some of the same simplicity of Python, e.g., with externally-defined names (not a name embedded into the source itself), but with a much more comprehensible and less magical loader.

Working in both python and nodejs day to day, I'm thinking the same thing. npm, coupled with nodejs's import mechanism, makes things so much simpler and it's because it goes a long way toward eliminating globals. Developing node modules is a pleasure. Python is a headache.

Ruby and Python user here:

If you ignore for a minute the complaints about the way Python modules are linked to the filesystem, I'd say there are actually some big advantages over Ruby in Python's module system.

Mainly just that you actually have proper namespacing features which are orthogonal to the notion of classes or mixins.

In Ruby, namespacing (inasmuch as it exists) is implemented via a clumsy dual-purposing of the Module/Class system; the mechanism used to import methods into a class from a mixin is the same one used to import constants from a module's namespace. A lot of the flexibility you have when importing things between python namespaces just isn't possible in Ruby, at least not without a lot of boilerplate or dubious metaprogramming.

I don't mean to start a flamewar here though. I could list Python warts just as easily. Just saying, Pythonistas, be grateful for what you have :)

I'd like to know the same.

In my limited knowledge and understanding, python's import system is actually better than ruby's (since by default you avoid dumping stuff in the global namespaces) and java's (since you can rename on import).

I believe modern perl has some nice things such as versions, "unimport" statements and explicit client control of import hooks?

Please correct me if I'm wrong.

Indeed, Perl6 has module and class versioning [1] at the 'language' level. It's pretty interesting on paper, just like everything else about Perl6 (imho).

I wouldn't call populating the global namespace with a module's contents an inferior default. Some like it that way and I'm glad that Python is flexible enough to allow both implicit and explicit imports.

[1] http://perlcabal.org/syn/S11.html#Versioning

I'd say the biggest difference between Python and Ruby's is that Python's has a syntax, whereas Ruby's is a function call, this makes Ruby's more flexible since you can always replace it, add new arguments, or in other ways manipulate it. I'm not sure if that's what Armin is referring to though.

> I'd say the biggest difference between Python and Ruby's is that Python's has a syntax, whereas Ruby's is a function call

This is not true on two fronts.

First, the import statement is a convenient, hookable wrapper around the __import__ function. It's quite customizable already. See PEP302 which references and allows implementation of PEP273 (importing modules from Zip archives). Other points of reference include [0], [1] and [2].

Also, the biggest difference is that when you 'require', you require a filename (i.e the arbitrary content of a file in the filesystem), but when you 'import', you import objects from a namespace into a scope. The namespace is resolved as an item living in the filesystem. In Ruby I could require 'foo/bar' and it could create the constants Foo::Bar and Qux::Quux. I have no way to require Qux::Quux. Rails for example tries to autoload constants based on their names, but for all I know, the second I reference Foo::Bar, it could actually define Qux::Quux. Ruby encourages this style of programming, spreading extendable classes and modules around in multiple files, while Python decides that things should be contained and well-behaving, and not trivially pollute unrelated module namespaces.

My analysis is that by definition you just can't have both, and that either one has a set of benefits and drawbacks.

[0] http://docs.python.org/library/importlib.html#importlib.impo...

[1] http://docs.python.org/library/imputil.html

[2] http://docs.python.org/py3k/library/importlib.html?highlight...
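To make the "hookable" part concrete, here is a minimal sketch of an import hook. It uses the current importlib finder/loader interface (find_spec/exec_module), which is the descendant of the PEP 302 protocol rather than the original find_module/load_module API. The module name hello_hook, the SOURCES dict and the DictFinder class are all made up for illustration.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory "filesystem": module name -> source code.
SOURCES = {"hello_hook": "GREETING = 'hi from a hook'\n"}


class DictFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finder and loader in one object, serving modules from SOURCES."""

    def find_spec(self, fullname, path=None, target=None):
        # Claim only the names we know about; returning None lets the
        # regular import machinery handle everything else.
        if fullname in SOURCES:
            return importlib.util.spec_from_loader(fullname, self)
        return None

    def create_module(self, spec):
        return None  # fall back to the default module object

    def exec_module(self, module):
        # Run the stored source inside the new module's namespace.
        exec(SOURCES[module.__name__], module.__dict__)


# Put the hook in front of the built-in finders.
sys.meta_path.insert(0, DictFinder())

# A plain import statement now goes through the hook.
import hello_hook
```

After this, hello_hook.GREETING is available even though no hello_hook.py exists on disk, which is roughly the trick zip imports rely on.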

JFTR python also has a function call to import modules http://docs.python.org/library/functions.html#__import__
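A small sketch of what that looks like next to the statement form. It only uses the builtin __import__ and importlib.import_module; the variable names are arbitrary.

```python
import importlib

# Statement form.
import json as json_stmt

# Function-call forms. Both return the module object cached in sys.modules.
json_func = __import__("json")
json_lib = importlib.import_module("json")

same_module = json_stmt is json_func is json_lib

# One gotcha with dotted names: __import__ returns the top-level package,
# while importlib.import_module returns the module that was actually named.
top = __import__("os.path")
leaf = importlib.import_module("os.path")
```

Here top is the os module and leaf is os.path, which is why importlib.import_module is usually the more convenient of the two for programmatic imports.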

We just have really high standards.

Yes, setuptools is broken. Proposal: don't use setuptools!

It's funny that the author doesn't make any mention of this, which is not new news: http://docs.python.org/dev/library/packaging.html

Maybe not every use case anyone ever proposed needs to be standardized.

That isn't the takeaway I got. It was that Setuptools is broken, but so is pip and distutils2. Pip lost very worthwhile functionality that exists in setuptools, namely binary eggs.

Binary eggs aren't a good reason to trash pip, which otherwise works great. He really didn't say how distutils2 was broken, just expressed some sort of prejudice against it.

> aren't a good reason to trash pip

Good thing that never happened then.

pip isn't broken just because it doesn't support a particular use case from setuptools.

Again, it would be nice if you stopped making shit up on the spot, TFA didn't say anywhere that pip is "broken" or anything even remotely close to that.

> That isn't the takeaway I got. It was that Setuptools is broken, but so is pip and distutils2. Pip lost very worthwhile functionality that exists in setuptools, namely binary eggs.

The takeaway should be that it's very easy to miss usecases by accident when replacing tools with other tools.

packaging hasn't been proven to be any better. It's just shiny and new. That recommendation is somewhat irresponsible.

That isn't a recommendation, that is a module which was not mentioned at all in an article where it is directly relevant.

It is standard; it is trying to fix the problem, rather than lambasting Python because setuptools is broken and you have prejudices against the solutions to the problem which are actually moving forward.

but they aren't finished with it yet. so why should it be recommended? that's irresponsible.

I never recommended anything, that is your fixation.

What is really ridiculous is to flame about how there is fragmentation in Python packaging, and COMPLETELY IGNORE the ongoing efforts which have made progress in exactly that area. And if someone tries to mention this progress, call them irresponsible. That is ridiculous.

> What is really ridiculous is to flame about how there is fragmentation in Python packaging, and COMPLETELY IGNORE the ongoing efforts which have made progress in exactly that area. And if someone tries to mention this progress, call them irresponsible. That is ridiculous.

It was pointed out in the past already that packaging/distutils2 is not yet implementing all functionality of setuptools. Right now I would not recommend using it personally. Then again, just my opinion.

Does anyone know if namespace packages actually work?

I've been wanting to put up a few utility libraries on Github. I don't think they're good enough for PyPI, and I mostly just want to be able to pip install them and have access to a few minor functions that I use periodically. However, I don't think they deserve their own top level library name, ideally, they'd all be under "socratic.*" or similar. However, there seems to be some sort of mess with init in the top level of the package, some pkgutil fix, some sort of incomplete PEP (maybe 402?), etc. Should I just give up and name my internal libraries "socratic_{name}" and be done with it? Or does having multiple packages with the same namespace work?

> Does anyone know if namespace packages actually work?

They used to work somewhat. I had flaskext.* registered as a namespace package but unfortunately it conflicts with pip which is why new Flask extensions name their package `flask_bar` instead of `flaskext.bar`. The exact problem is that setuptools uses pth magic to put libraries into a namespace package which conflicts with pip's idea of installing packages flat into site-packages.

It gets bad if you use pip and easy_install with namespace packages. If you stick to one or the other you should generally be okay.

But yes, just use socratic_{name} – it makes everything easier, and "." and "_" are just string differences. The only good reason IMHO for using namespace packages is because you are breaking up an existing package into multiple packages and you want to keep the dotted names. And even then I might just prefer a compatibility package that does the mapping and move things to entirely new package names.
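For reference, the pkgutil fix mentioned in the question looks roughly like the sketch below. It fakes two separately installed distributions in temp directories, each shipping its own socratic/__init__.py containing the extend_path line. The subpackage names textutils and dateutils are invented for the example, and this only shows the plain sys.path case, not the .pth handling that setuptools does.

```python
import os
import sys
import tempfile

# The one-line "pkgutil fix" that every copy of socratic/__init__.py carries.
INIT = "__path__ = __import__('pkgutil').extend_path(__path__, __name__)\n"


def make_dist(root, subpackage, body):
    """Create root/socratic/<subpackage>/ as a stand-in for one distribution."""
    pkg = os.path.join(root, "socratic")
    os.makedirs(os.path.join(pkg, subpackage))
    with open(os.path.join(pkg, "__init__.py"), "w") as f:
        f.write(INIT)
    with open(os.path.join(pkg, subpackage, "__init__.py"), "w") as f:
        f.write(body)


# Two independent install locations, both providing part of "socratic".
dist_a = tempfile.mkdtemp()
dist_b = tempfile.mkdtemp()
make_dist(dist_a, "textutils", "NAME = 'textutils'\n")
make_dist(dist_b, "dateutils", "NAME = 'dateutils'\n")
sys.path[:0] = [dist_a, dist_b]

# extend_path stitches both directories into socratic.__path__,
# so both halves are importable under the shared name.
from socratic import textutils, dateutils
```

Whether this survives mixing pip and easy_install is a separate question, as noted above.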

it has some rough edges, sometimes when I try to build the docs for one it has a hard time importing both packages until I kick it a few times, but they work. I'm doing it for these two modules:

http://pypi.python.org/pypi/dogpile.core/ http://pypi.python.org/pypi/dogpile.cache/

edit: also they work fine with pip so not sure what armin's issue here is.

> edit: also they work fine with pip so not sure what armin's issue here is.

See this issue which ultimately made me ditch them: https://github.com/pypa/pip/issues/3

Hate, hate, hate on a specific platform, not everywhere! All the trouble that he complains about is only on *nix. For Windows it is quite simple. Install Python, then look here for other packages: http://www.lfd.uci.edu/~gohlke/pythonlibs/

Edit: I should have said "For Windows and ReactOS"

Programmatic deployment on Windows is still a massive problem (especially for extension modules-- e.g. the entire scientific Python ecosystem).

I have used Pip, Virtualenv and virtualenvwrapper-win on windows with very little trouble... except for anything that isn't pure Python.

In order to install PIL in a virtualenv on Windows I ultimately ended up downloading the PIL Windows installer, extracting the required files and placing them in the virtualenv's site-packages directory.

This is what needs fixing! Pip with access to compiled windows binaries.

Install Pip on windows: http://stackoverflow.com/a/4921215

Virtualenv Win: https://github.com/davidmarble/virtualenvwrapper-win

> This is what needs fixing! Pip with access to compiled windows binaries.

That's the same issue really. Pip can't install (by design) binary distributions. Neither eggs nor .exes. So you are limited to easy_install on windows in many cases.

Self hosted PyPI (with eggbasket or even a static dir listing), combined with easy_install, is an old and unsexy solution that just works. Instead of saying we are limited to easy_install on Windows, I say we are embraced by setuptools on all platforms.

Inside your virtualenv you can just easy_install pil_installer.exe.

Edit: I think I should expand on the limitations of this. You can do this to any exe that was built with distutils, which you can check by changing the file extension to zip and seeing if it opens up.

A workaround is to edit the registry to make the virtualenv available as an install target:


Thanks for the link! That is interesting.

I followed through to the SO thread and discovered that you can install Windows .exe installers using easy_install into virtualenvs. That is really useful; it's a shame that PyPI doesn't also link to the Windows installers so that you wouldn't have to put the path to a Windows installer in.


Install windows binary packages into virtualenvs with easy_install - http://stackoverflow.com/a/5442340

Actually I've tried it and got no problem. Have you tried it? Installing packages on NT-kernel OS is different than on *nix, BTW! I guess that's what it was with "programmatic".

Could you explain how Christoph Gohlke's .exe installers solve the programmatic deployment problem (clicking through an .exe installer is not "programmatic deployment")?

Compared to the effort to build the binaries it is trivial to repackage the content of those installers into EGG, MSI, NSI or other programmatically deployable formats. In fact the Pythonxy and Enthought Python distributions repackage some of those installers into their formats. The exe installers created by Python distutils are the least common distribution/packaging option that works on Python 2.4 to 3.3, 32 and 64 bit. The installers are valid ZIP files and it takes a few lines of command line or Python script to extract the content to any target location. It is the sheer availability of those binaries, and the feedback (incl. patches) to the package authors about Windows build and runtime issues that is the value of the project. It does not solve the Python packaging or deployment problems.
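As a rough sketch of the "few lines of Python script" part: the function below checks that an installer is a readable ZIP and unpacks it. Since no real installer is at hand, the demo builds a fake one (a dummy stub followed by a ZIP payload). The file name demo_pkg-1.0.win32.exe and the PURELIB/ prefix inside the archive are assumptions for illustration; real installers may lay things out differently.

```python
import io
import os
import tempfile
import zipfile


def extract_installer(installer_path, target_dir):
    """Unpack a distutils-style .exe installer and return its member names."""
    if not zipfile.is_zipfile(installer_path):
        raise ValueError("%s does not look like a ZIP-based installer" % installer_path)
    with zipfile.ZipFile(installer_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()


# Build a stand-in installer: arbitrary stub bytes with a ZIP appended,
# which is the same shape as a self-extracting executable.
payload = io.BytesIO()
with zipfile.ZipFile(payload, "w") as z:
    z.writestr("PURELIB/demo_pkg/__init__.py", "VALUE = 1\n")

workdir = tempfile.mkdtemp()
fake_exe = os.path.join(workdir, "demo_pkg-1.0.win32.exe")
with open(fake_exe, "wb") as f:
    f.write(b"MZ" + b"\0" * 64)  # placeholder for the executable stub
    f.write(payload.getvalue())

names = extract_installer(fake_exe, os.path.join(workdir, "out"))
```

From there, copying the extracted package directory into a virtualenv's site-packages is what the manual PIL workaround upthread amounts to.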

Everything can be wrong when measured with a wrong reference. That, BTW, is used in marketing and PR to gain a better perception for one's "different" design. You have to do it "programmatic" on *nix because that's how it's done there. Easier alternatives just aren't developed enough because those aren't in accordance with the principles in place, thus aren't well received. Some like it, others come to hate it, like this guy did. You don't have to use "programmatic deployment" on NT-kernel systems once all you want to do is get the job done. Stop thinking *nix all the time.

Being able to build in one step is a common practice in software engineering everywhere, not just on unix. If you've ever heard the term "Continuous Deployment", then build in one step is being done (and hopefully more besides).

How do those nice .exe files work with virtualenvs?

You are using virtualenv, aren't you?

I was asking about whether the installers respect existing virtualenvs or whether they install globally.

That's great as long as it's a common package, and you want to install it globally, and you don't care what version you get.

But if that's all you want it's easier to type eg. "apt-get install python-reportlab" on *nix.

Among those common packages are pil-py3, numpy-mkl, scipy w/ umfpack, curses, videocapture, orange, twainmodule, simpleitk, pymol, vigra, cellprofiler, cellcognition, pythonmagick, cgkit, scikits.vectorplot, nlopt, slycot, casuarius, ilastik, iocbio, vlfd, mmlib, pysfml2, pywcs.

It appears as though in order to get Flask this way, I have to either also get Pylons, googlemaps, etc... or I have to unzip an exe and install it manually.

This really does not look pleasant.

The "Base distribution" is a distribution of packages for some specific needs. If you need individual installers for Pylons, googlemaps etc. you need to look elsewhere.

Yes, that is what I gathered. I just can't see this being something I would want to use.

I hate package dependencies on windows.

Most languages are in the same boat. PHP, Java, C, etc...

Another post on the same subject written a few years ago: http://www.b-list.org/weblog/2008/dec/14/packaging/

Ugh...python packaging, it makes my head hurt every time I have to figure out how it works again. This is something that Ruby got very right, how could Python be such a mess?

I wish it was just a simple 3-step process to fix http://www.youtube.com/watch?v=HR_5QFZ-kbk

I'm using bento, which keeps backward compatibility with setuptools/distribute, but aims at fixing the traditional Python packaging tools:


It is stable and usable.

I'm a python-learning guy who uses npm and homebrew.

Compared to npm and homebrew, it seems to me python's package manager is quite tricky.

It's all baby stuff compared to autotools.

Title of the post reminds me of the Player Hater's Ball.

Why doesn't the Python world just copy CPAN and update it for Python? You can e.g. literally find stuff in CPAN with hundreds of dependencies -- which is a pain, but even that works.

At least CPAN testers? Please...

If Python people insist that there is only one way to do it -- and the answer is Python, then rewrite CPAN in Python. Copy stuff that works...

"Only one way to do it" is a point of language design, not an assertion that Python is the one true way.

Python does have PyPI. Works fine. Use pip to install and uninstall packages. Use virtualenv to isolate dependencies/PYTHONPATH, as needed.

Can you be specific about what aspect of CPAN you think should be ported?

Specific? I already mentioned CPAN Testers. And that CPAN works... (I've seen people with years of Python experience who had problems doing simple things. Edit: Including uninstall of packages, which you mention.)

Edit: Here you see a result of CPAN Testers for a module with complex dependencies. Note statistics on where/how dependencies are run. Also click on a dependency icon to see the matrix of perl/os versions.


I don't have any problem doing uninstall of packages. Use pip rather than easy_install. You are done.

I'm not convinced that CPAN is better

Good for you, re uninstall...

(I really didn't expect Python users to admit that anything in any other environment could be better... :-) E.g. running tests at install by default is obviously bad? :-) )

So no comment about CPAN Testers -- automatically installing dependencies, running the test suites on different perl/os versions and then getting neat reports? What is the better way in Python?

Despite all the hate, Maven solves a lot of problems in Java land and I couldn't be happier.

There is really no exact equivalent, but you could give buildout[1] a go if you find yourself in need of a maven-like tool in Python land.

[1] http://www.buildout.org/

Sweet, thanks for pointing out that BO is closer to Maven!

I've heard of a few Python SCM tools (setuptools, distutils, fabric, scons, buildout) but haven't gone deep into Python land yet.
