Incremental Plans to Improve Python Packaging (curiousefficiency.org)
63 points by pmoriarty on Dec 21, 2014 | 37 comments

I'm going to say it, even though I'll probably get slapped by all of HN: PHP beats the crap out of Python for this exact problem. The PHP package manager Composer (https://getcomposer.org/) is an amazing tool that I wish Python would emulate. It uses a very simple JSON file for metadata, an indexing site to point package names to actual git repositories, and then git tags to handle versioning. The workflow is very simple and tends to meld with actual development nicely. In the end, all you have to do to make a new release is tag it.

Okay, commence telling me I'm an idiot for thinking PHP has some ideas worth looking at.
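To make the "a release is just a tag" idea concrete, here is a minimal Python sketch of picking the newest release from a repository's tags. The tag list, the `vMAJOR.MINOR.PATCH` naming, and the function names are all assumptions for illustration; this is not Composer's actual resolution logic.

```python
import re

# Hypothetical tag list, as `git tag` might print it for some repository.
tags = ["v1.0.0", "v1.2.0", "v1.10.1", "docs-update", "v1.9.3"]

# Assumed convention: release tags look like 1.2.3 or v1.2.3.
TAG_PATTERN = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_version(tag):
    """Return (major, minor, patch) for a release tag, or None otherwise."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def latest_release(tag_names):
    """Pick the tag with the highest numeric version, ignoring other tags."""
    releases = [(parse_version(t), t) for t in tag_names]
    releases = [(v, t) for v, t in releases if v is not None]
    if not releases:
        return None
    # Tuples compare numerically, so 1.10.1 sorts above 1.9.3.
    return max(releases)[1]


print(latest_release(tags))
```

Comparing parsed tuples rather than raw strings matters here: a plain string sort would put "v1.9.3" above "v1.10.1".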

The problem is that Python libraries will often depend on code compiled in C / C++ or something else, and inevitably something as simple as "just have a git repo" just doesn't cut it (for someone somewhere with some specific workflow).

Package management is not rocket science, but it's very political in terms of deciding who should be heard and who should be let down by The One True Blessed Way Of Packaging.

rubygems and bundler made me jump ship from python/django to the ruby and rails ecosystem. the package management and isolation in that is top notch. virtualenv and pip/easy_install/whatever just don't match it for repeatability.

I like python too but the packaging ecosystem is nowhere near where it should be.

If you ever find yourself back in Python land, give Conda a try (Miniconda is the barebones install, Anaconda is the complete version; if you don't know that you want Anaconda, start with Miniconda).

It's replaced virtualenv and friends for me, and with some more experimentation it might also replace apt for my own packages deployed to other machines. It's written in Python and for Python, but it's much less Python-centric than you might expect, and it is nothing short of excellent.

good to know that things are progressing over there. it would be nice if the packaging ecosystem was more canonical like rubygems and bundler but if it works well then that's awesome.

virtualenv beats the crap out of rvm though. If you ever tried to use rvm in crontabs, etc.

rbenv is much better and doesn't overwrite "cd" (yes the builtin command). I thought everyone moved away from rvm years ago?

yeah but you don't need rvm. rvm is not great.

edit: so what that it beats rvm? bundler + ruby-install does everything that you need and works very well.

Would you please elaborate on what features Composer has that pip doesn't? You can pin versions, you can install multiple packages from one requirements.txt, you can use virtualenv for an isolated environment, and you can install from git (yes, you can specify branches and tags too), from the web, from gzip, from different formats. What is missing?

> you can use virtualenv for isolated environment

That's my biggest gripe with Python packaging. virtualenv is a hack for the fact that the Python ecosystem fundamentally doesn't embrace that one computer can have different projects with different dependencies. Compared to npm (for example), it's much, much worse.

With Python 3.3 the venv module is now part of the standard library and the interpreter itself has been modified to support the isolation that the virtualenv had to use hacks to make happen. In Python 3.4 the venv module installs pip by default as well.

As one of the maintainers of virtualenv, it's my goal to move that project so that it will use the venv isolation mechanism when it is available, and have virtualenv just provide a level of UX on top of it as well as shims for versions of Python that don't have the venv module.
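For anyone who hasn't tried the standard-library route, a minimal sketch of creating an environment with the venv module looks something like this. The directory name and the `make_env` helper are made up for the example; `with_pip` is left off here so the sketch doesn't depend on ensurepip being present.

```python
import os
import tempfile
import venv


def make_env(target_dir, with_pip=False):
    """Create an isolated environment with the standard-library venv module.

    Returns the path of the pyvenv.cfg file that marks the environment.
    """
    # with_pip=True (available from Python 3.4) would also bootstrap pip.
    builder = venv.EnvBuilder(with_pip=with_pip)
    builder.create(target_dir)
    return os.path.join(target_dir, "pyvenv.cfg")


# Throwaway location for the example; a real project would pick its own path.
env_dir = os.path.join(tempfile.mkdtemp(), "demo-env")
config_path = make_env(env_dir)
print(os.path.exists(config_path))
```

The same thing is available from the command line as `python3 -m venv <dir>`.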

> virtualenv is a hack for the fact that the Python ecosystem fundamentally doesn't embrace that one computer can have different projects with different dependencies

Not even slightly, virtualenv is the answer to exactly that problem.

I think the complaint is that virtualenv works by isolation; instead of resolving potential conflicts they are avoided. An arguably more elegant approach versions the dependencies of each package so that everything can be installed globally instead of redundantly for every virtual environment.

It might be elegant on paper but I can't think of a nice way that Python could support something like that. Which is a shame I guess.

I don't see what the problem could be. "pip install foo" would store somewhere that foo wants bar=2.0. The Python interpreter will then, when foo imports bar, specifically load /usr/bin/python.../bar-2.0. Not sure if it would be worth it though.

That already exists, setuptools has supported it for years and years. Nobody uses it though and they prefer to use virtualenv instead. That may be because setuptools itself wasn't that great, or it may be that people just didn't prefer that mechanism for working.

Doing that isn't really much different than a virtual environment though. The only real difference is that in a virtual environment you essentially have "named" (by file system path) sets of dependencies that are automatically "activated" when you start up the Python interpreter. In the setuptools/bundler style you have in-memory sets of dependencies that are activated by calling a particular API, often done automatically via a binstub.
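A rough Python sketch of the "activate a specific version by calling an API" style might look like the following. Everything here is hypothetical: the `bar` module, the versioned directory layout, and the `activate` helper are invented for illustration, and this is not how setuptools actually implements it.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway "global" store holding two versions of a hypothetical
# module named bar, each in its own versioned directory.
store = tempfile.mkdtemp()
for version in ("1.0", "2.0"):
    version_dir = os.path.join(store, "bar-" + version)
    os.makedirs(version_dir)
    with open(os.path.join(version_dir, "bar.py"), "w") as handle:
        handle.write("VERSION = %r\n" % version)


def activate(name, version):
    """Put one specific installed version of a package first on sys.path."""
    path = os.path.join(store, "%s-%s" % (name, version))
    if not os.path.isdir(path):
        raise LookupError("%s %s is not installed" % (name, version))
    sys.path.insert(0, path)


# foo's metadata would record "bar 2.0"; here the activation is done by hand.
activate("bar", "2.0")
bar = importlib.import_module("bar")
print(bar.VERSION)
```

The catch, and part of why isolation is simpler, is that a single process can only have one `bar` imported at a time, so two packages that want different versions still conflict.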

> Not even slightly, virtualenv is the answer to exactly that problem.

Exactly. That's my point: you shouldn't need an additional tool to solve this problem.

npm is not built into v8. Composer is not built into php. Maven is not built into Java. Cabal is not built into Haskell. And all these tools benefit from being able to have their own release cycle, decoupled from the parent language.

If anything, the problem with Python is that it does have built in packaging tools - which inevitably became outdated, and just as inevitably remained in the standard library, aka the place where modules go to die. Everyone agreed that the standard tools were terrible, but because they were standard they hung on much longer than they should have.

distutils being built into the stdlib means that it's not very easy to improve the tooling by improving that module, since it's tied to the Python release and people can't depend on a new python release for many years.

setuptools isn't tied to the stdlib, though it had many problems and still does. A large portion of what was holding back improvements was that there was nobody really pushing through all the political nonsense surrounding the tooling, and there was nobody to say Yes/No when a consensus couldn't be reached. For normal features there was the PEP process, but the PEP process didn't work for a long time for packaging because Guido admits he really doesn't care much about packaging at all. Now that we have BDFL-Delegates in the form of Nick Coghlan and Richard Jones, and we have people willing to push through changes even when it takes a lot of pain to argue the points, we're finally seeing the engine of progress grind to a start.

Oh, and to be clear: when I argued for PEP 453 I was very explicitly against doing anything that meant pip wasn't upgradeable on its own outside of the standard library release cycle.

Composer is possibly the worst after Python's pip. RubyGems and npm are much saner "role models".

Looks like this is quite an old post. Head over to https://packaging.python.org/en/latest/ for updated docs about Python packaging, especially the 'Tools Recommendations' section.

Link needs a trailing slash: https://packaging.python.org/en/latest/

Oops. Thanks, corrected now.

That link you posted is dead

Yup, python's packaging is even awful for the users. Half of the time pip cannot determine what version of a package to install for a given version of python on its own. "pip install --upgrade"? Gross.

However, you know what else needs improvement? PyPI. PyPI is downright embarrassing compared to rubygems, npm, etc... and not only needs a facelift but a whole redesign. This is the reason that I absolutely love to use Python but dread using a Python library that I haven't heard of first hand. Treading outside the standard library is an exercise in sifting through garbage.

Want to see which package for "x" is most popular? No problem for any dynamic language besides Python. However, search for "http" on PyPI and you'll get a list of packages by weight which does not take into account popularity or download count or views, just the score of a textual match.

Why is that important? Well when you search for "http" you get these as pypi's top choices (no help at all):

    http               0.02
    CydraGitHTTP       0.1
    django-http-status 0.2

So what do we have here? A beta library (version 0.02) called "http" ranks 11 (off the charts) while Kenneth Reitz's insanely popular "requests", which should be mandatory as an HTTP client library, is nowhere to be found.

PyPI isn't even serving up the latest version of the Python packages it has some of the time (see: SCons).

All of this leads me to think the Pythonistas have given up on PyPI, which does more to hurt the community than you can imagine: it leaves a bad taste in the mouths of those looking for quality libraries and software when they reach for Python as their tool of choice.

PyPI is due to be replaced with something better shortly.

>You know what else needs improvement?

The whole Python 3 situation? :)

I sure appreciate pip being included in Python 3.4. It's nice when writing tutorials not to have to provide instructions for how to install pip, so the user can then install packages.

It's annoying though, because distros like Ubuntu pigheadedly decide to break functionality by removing ensurepip from the core 3.4 package because it violates Debian policies put into place 20 years ago. As a result, there's no straightforward Ubuntu-kosher way to get a Python 3 virtualenv with pip working properly, and the users suffer for it.

edit: relevant bug and argument: https://bugs.launchpad.net/ubuntu/+source/python3.4/+bug/129...
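If you want to know up front whether a given interpreter can bootstrap pip, a small check along these lines should work. The `has_ensurepip` helper name is invented for the example; it only tests whether the module can be found, not whether a distro has patched it to refuse to run.

```python
import importlib.util


def has_ensurepip():
    """Report whether this interpreter can locate the ensurepip module."""
    return importlib.util.find_spec("ensurepip") is not None


if has_ensurepip():
    print("ensurepip is available; venv should be able to bootstrap pip")
else:
    print("ensurepip is missing; pip has to be installed some other way")
```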

I recently installed numpy and scipy in a new virtualenv on my OS X 10.10 machine and was very pleasantly surprised to see them being installed directly from wheels instead of being rebuilt (installation time was a nice 1 min compared to 10+ min before). Does anybody know if the wheel format is now supported/standard on PyPI, or was this my local pip's cache?

I understand wheels have been blessed as the way forward, so I would expect PyPI to support them (where developers make them available).
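Part of why a prebuilt wheel can be picked without compiling anything is that the filename itself carries the compatibility tags. As a sketch, the PEP 427 naming scheme (name, version, optional build tag, Python tag, ABI tag, platform tag) can be pulled apart like this. The filename below is only an illustrative example, not one taken from this thread, and the parser is a simplification rather than what pip does internally.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into the fields laid out in PEP 427."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError("not a wheel: %s" % filename)
    parts = filename[:-len(suffix)].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the Python tag.
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel filename: %s" % filename)
    return {
        "name": name,
        "version": version,
        "build": build_tag,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


# Illustrative filename only.
info = parse_wheel_filename("numpy-1.9.1-cp27-none-macosx_10_6_intel.whl")
print(info["python"], info["platform"])
```

An installer compares those tags against the running interpreter and platform, and falls back to the source distribution when nothing matches.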

At least "easy_install" and the rotten "eggs" format seem to have been deprecated. Those tools had a failure rate above 50%. There were implicit assumptions about where things went, and if anything went wrong, there was neither a workaround nor a useful error message.

Nice, I've been doing a bit with packaging python modules and have found the current lay of the land incredibly confusing. This article helps quite a bit and seems to be the most authoritative guide I've seen on the subject.

As linked above, this is the current go-to resource for packaging: https://packaging.python.org/en/latest/

This post is quite old as others have noted. For those interested in the plans for secure package and metadata distribution, a proposal is available now:

http://legacy.python.org/dev/peps/pep-0458/

http://legacy.python.org/dev/peps/pep-0480/

This text seems quite old. Now:

- wheels are more common, and popular libraries (scipy, cryptography) have packages built at least for Windows. On Linux it's not a big deal since we have package managers.

- setuptools is now the only stuff we need again, since all has been merged in it.

It's still not very clear, but it's much better than before.
