Hacker News

Python packaging is such an absolute disaster. As a casual Python user, can someone explain to me how Wheels fix the problem?



As a casual python user, I've always found it fairly straightforward.

As a systems engineer, I've always found it challenging to maintain a sane python ecosystem across multiple hosts.


As a (now casual) sysadmin, I think things have gotten a bit better. Virtualenv (now part of python3) helps a lot. The trick, if running on Debian, is:

1) use system python -- if you can. Now with python3.4 and 2.7 in stable, that's the easy bit.

2) If you can't do 1), use backports and/or manually backport[1]

3) If you need python-packages that depend on c-libraries, use "apt-get build-dep python-foo".

4) use a virtualenv, or several virtualenvs, eg:

  virtualenv -p python2.7 myvenv
  ./myvenv/bin/pip list -o #list outdated packages,
   # typically that'll mean you first need to:
  ./myvenv/bin/pip install -U pip #possibly also setuptools
Then install the packages you need with pip -- see 3) for tip on stuff like pyqt that needs some (fairly hairy) packages.

This works for most everything -- except perhaps for pygame, which is exceptionally gnarly.

[1] Manually backport, the easy way:

  sudo -s
  echo "deb-src http://httpredir.debian.org/debian\
   sid main contrib non-free" \
   >> /etc/apt/sources.list.d/sid-src.list
  apt-get update
  cd /tmp
  apt-get build-dep python3
  exit # No need to build as root
  apt-get source python3 # this assumes sid has a version
   # that'll work for you
  cd python3*
  dpkg-buildpackage -uc -us
  cd ..
  sudo dpkg -i python...deb
Note, this may or may not work with major packages like python -- but it often goes a long way. If you can't get enough build-deps satisfied, things are more complicated -- and you're probably better off running testing/unstable in a chroot, possibly with the help of the schroot package, rather than fighting with conflicting packages.


The Python ecosystem is so good about portability, but flexibility always comes with a cost.

I'm assuming your systems engineer perspective is colored by RedHat (and others) packaging python-ldap at the system level and then leaving you to your own devices for everything else.

C bindings are really convenient, but then you have to worry about dynamic linking. Are you sure Python is linked to the same OpenLDAP library that NSS uses? Hence, system versions of some Python packages, but only where other system level dependencies are an issue.


"Absolute disaster" seems strong. The situation doesn't seem too bad to me. I can pip install things and it works. I keep hearing that Python's packaging is so bad, but I don't even know what the problems are.


Again, as a casual user, I don't want to state anything too strongly, because some of it could be user error... but a few of the troubles I've had just trying to use and update Python in the last couple of years include: packages from PyPI being broken and refusing to install (this included pip, so pip was not able to update itself); packages with dependencies that won't install from PyPI (or whose installation fails on my machine), sending you looking for tarballs; no clear route to upgrade between semi-major versions of Python; and things that install but need some config frobbing before they work - I'm not afraid of doing that, but it's always a hassle.

I think some of the problems were self-inflicted by trying to use Python on a Mac with no package manager, and on Windows. Things will get easier on the Linux side with Python 3 in the repositories now. But there really does seem to be a fairly common pattern for languages / programming environments: install the language's package manager, then upgrade packages separately from the OS package manager. For whatever reason this seems like a bigger headache with Python than with others.


Try that on Windows for anything with compiled C bits. Curse a lot. Give up on pip and use easy_install. Which works. Maybe. If you're lucky.


Or give up on Windows and use a real OS.

But seriously, mirror your production environment for local development and testing. This is all comically easy with VMs, you just need to alter your workflow.


> Or give up on Windows and use a real OS.

If pip doesn't work on Windows then it shouldn't ship on Windows. This isn't Windows' fault.


Actually, I was very pleasantly surprised at how well pip+python3.4+visual studio (w/correct paths set up so that pip finds the compiler) worked.

Granted this was recently, under windows 8.1 (tried it mostly for fun, and so that I have a decent calculator (ipython) on hand when I boot up in windows).


Actually, distutils looks for Visual Studio first in the registry, then via env vars. So it should find VS without you setting anything up. You just need to have the correct version installed (the banner in the REPL will tell you which; for example, for ActiveState Python, 2.7 = VS2008 and 3.x = VS2010).

I was also pleasantly surprised: building python C extensions under Windows is easy. Building some of the C or Fortran dependencies, on the other hand, might not be.


Right - I remember python giving me very nice error messages (probably over the wrong version, or 32bit vs 64bit) - along with pretty clear help on how to fix things. I just don't remember exactly what small task (hit "y"/download+install - or set up env vars) I had to do to get everything to work. I do remember it was (for windows and c compilers) very easy.


I'm glad I'm not the only one that uses IPython as a calculator. I might have to revisit trying to get it working in windows 8.


I confess, I've only dabbled with venvs on windows (had to play along with the wonderful django tutorial created by/for djangogirls[1]). But my secret to working with pip on windows has been invoking it from inside python: install python (optionally set up a venv); then:

  import pip
  pip.main("install ipython".split())
of course, python for windows drops pip in the scripts folder -- but doing everything from inside python is magically cross-platform :)

Pip will drop ipython in the scripts folder too -- so that's where one can grab a shortcut. Or just (after restarting python):

  import IPython
  IPython.start_ipython() # Ed: not .start()
[1] http://tutorial.djangogirls.org/
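For what it's worth, `pip.main` was never a supported API and newer pip releases have removed it; a sketch of the cross-platform equivalent, which shells out to the pip belonging to the current interpreter:

```python
import subprocess
import sys

def pip_install(package):
    """Install a package with the pip that belongs to this exact
    interpreter (or venv) -- works the same way on any OS."""
    return subprocess.call([sys.executable, "-m", "pip", "install", package])
```

So `pip_install("ipython")` from inside any python does the same job, without hunting for the scripts folder.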


Actually using wheels on Windows is a superior solution for most packages now.


The disaster part is not the tooling, but rather nobody understanding how "X" solves the problem, and nobody actually choosing an "X" and sticking with it.

There is a clear lack of _direct_ and _concise_ documentation about Python packaging. Something between the overly descriptive reference documentation that nobody reads and the "copy-paste this setup.py" snippets that teach nothing of substance.

Let me use the OP as an example:

  "Wheels are the new standard of python distribution and are intended to replace eggs."
Ok, I've done my share of Python coding and have published stuff on PyPI, and this sentence means absolutely nothing to me. I'm not even sure I could define what an egg is if someone asked me. It probably means even less to someone trying to make their own package.

One important thing to remember is that packaging is a means to an end. Nobody gets excited about packaging, so nobody wants to know too much about it; but knowing too little means the mere thought of doing it instills feelings of dread.

PS: This applies to other packaging scenarios too. I've seen people fall to their knees when faced with building Debian packages for deployment (ending up not doing it at all), RPMs, etc.


Python packaging seems pretty painful compared to what I've experienced with, say, Clojure (list dependencies in project.clj, done), but that seems like an unfair comparison. It's really common for Python projects to bind to C libraries for heavy lifting, while this is pretty uncommon in Java-land. Is there another language that makes heavy use of the C-lib-binding pattern which has a much more pleasant packaging experience than Python? Does Ruby, for example?


Ruby does not (IMNHO) have a more pleasant packaging experience, no (although the inverse may be true - I think writing a proper gem, and listing/uploading it, is probably easier than writing a proper python package and pushing it to pypi).

Python's virtualenv is a lot more stable than the various hoops one goes through with eg rvm. On the other hand, if you needed/wanted to use more than python2.7/python3.4 -- it might be a bit harder to juggle many python versions than it is to juggle many ruby versions with rvm.


I thought virtualenv was more comparable to bundler than rvm?


Yes, that's probably right. But is there a good way to have several instances of, say, ruby2.2 but with different gemsets outside of rvm?


Probably the main difference is that you rarely really need "many" versions of python -- and hopefully one won't really need "many" versions of ruby (as much) any more either.

So the problem bundler/venv solves becomes a solution to 95% of the entire problem, rather than just 50%.


Take a look at the requirements.txt file for your python project; you can install everything dep-wise in a one-liner.


We stopped developing web projects on Windows and switched to Linux (Ubuntu) Workstation. We are one with the Universe again. Seriously, run a VM on a reasonably powerful Windows machine and you can have both worlds and it all works very well. My favorite setup is a three monitor rig with Ubuntu on a VM running on my main monitor and other stuff running on the other monitors under the Windows host OS. There are a lot of advantages to doing this, including also running something like a Xen hypervisor and simulating an entire (small) data center infrastructure.

If by packaging you mean deployment, yeah, it's a mess. Only in the sense that you can't just FTP your application to any old server like you do with PHP projects.

It'd be nice if Python for the web "just worked", but it really isn't that difficult to get things set up (Nginx, Gunicorn, PostgreSQL, etc.). Also, there are a lot of reasons not to want a canned PHP-type setup.

We have a policy to never bring-up servers by hand. You capture your setup and configuration in Ansible scripts, document them well and things just work. Then use Selenium to confirm that all is well with a basic test application. Life is good.


It seems that every single language (even the Haskell semigods) struggles with this. Guido himself has stated that it's the part of python he hates the most.


I agree to a certain extent. The JVM/Maven pom approach seems to work the most reliably across architectures, which is nice.

Python is a friggin' nightmare by comparison. Some trivial low-complexity operations can take insane amounts of elbow grease due to missing/outdated libs.


I think clojure/lein really shows off the power of Maven/Ivy for managing packages. But I'm not sure how well that works for c++/c-extensions?

Pip does a remarkable job of magically getting and compiling missing c/c++ libraries, and works great in tandem with virtualenv.


To this day I have not found a good way to distribute compiled JNI binaries. You can either consider them completely separate from your Java code and package them separately, or you can put them into your JAR and then do some horrible hacks of unpacking the JAR at runtime, detecting the architecture and loading the library from the temporary directory. This is really something that should be built into the standard package formats/tools.


Use https://github.com/fommil/jniloader - it's still the horrible hack under the hood, but it standardizes it all for you.


Wheels can be installed into a Python environment without executing code and can be cleanly uninstalled. They're effectively an iteration on eggs.
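Concretely, a wheel is just a zip archive with a standardized filename and static metadata, which is why installing one never has to execute a setup script. A toy illustration (package name and contents are made up):

```python
import io
import zipfile

# Build a toy in-memory archive with the layout a wheel uses:
# the package code plus a .dist-info directory of static metadata.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as whl:
    whl.writestr("demo/__init__.py", "__version__ = '0.1'\n")
    whl.writestr("demo-0.1.dist-info/METADATA",
                 "Metadata-Version: 2.0\nName: demo\n")
    whl.writestr("demo-0.1.dist-info/RECORD", "")

# An installer only has to unzip this and note what it wrote --
# no setup.py to run, and the RECORD file makes uninstall clean.
names = zipfile.ZipFile(buf).namelist()
print(names)
```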


I've always found it fairly easy. Why do you think it's a disaster?


I'm in the likely unique position of being a serious Python user that has almost no experience with virtualenv, pip, setuptools, or any python packaging. I have no control over the hosts I have to support, which means I can never count on packages to be installed. I almost always use pure python 2.5-2.7. I'm also a Windows user.

From that perspective, I can see the OP's point. For most of Windows history, if you needed a library installed, you'd install it. Every installation was accomplished through the familiar Windows installer. If you needed a newer version, you installed it. There was one process, and one target.

Now take modern Python. A new layer has been introduced, where you may run every project in a new target, or through the 'default' install. In addition, you have libraries and packages that may be installed through one of several different installers or processes, most of which are different than the OS's package management, and which aren't necessarily compatible and aren't tightly managed. This is on top of multiple python versions that may have to be installed.

I can see where he's coming from.

That being said I love Python and I respect the work that has been done to allow such a vibrant user base.


You're a serious Python user that has no experience with any of the last decade's tools for packaging, and you find things hard without using any of those tools, and THAT'S why you can see where he's coming from?


The key question I'd ask is "casual Python user coming from which other ecosystem?"

Have you tried setting $JAVA_HOME and restarting? rimshot

Here's a good overview of real world issues: http://lucumr.pocoo.org/2012/6/22/hate-hate-hate-everywhere/

(I'd imagine the core team thinks Armin should just embrace Python 3 already. And some of the details have changed since this post, naturally.)


And the update where he loves wheels:

http://lucumr.pocoo.org/2014/1/27/python-on-wheels/


I assume you're being sarcastic, since "it kinda works" is not a ringing endorsement (and the entire piece has the same sulky tone).


Well, if you know Armin, you know that that's about as close to a compliment as it gets from him :)

I help co-maintain salt, a config management tool which he used and contributed quite a few bug reports to.


It is only a partial solution. You need the pip and virtualenv packages along with a private PyPI repository. Binary wheels make it so that you don't have to clutter any environments except for build servers with compilers and other build tools. There's still the problem that Linux wheels don't differentiate on distribution, so you will hit problems if you build on one distribution and try to deploy on a very different one.
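The distribution problem is visible in the wheel filename convention itself: the platform tag records the interpreter, ABI, and architecture, but nothing about the distro or libc the wheel was built against. A quick illustration of pulling a (hypothetical) filename apart:

```python
# Wheel filenames follow: name-version-python_tag-abi_tag-platform_tag.whl
wheel = "lxml-3.4.4-cp27-none-linux_x86_64.whl"

name, version, python_tag, abi_tag, platform_tag = wheel[:-len(".whl")].split("-")
print(platform_tag)  # linux_x86_64 -- says nothing about glibc or the distro
```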


As a heavy Python user what exactly do you find "an absolute disaster"?

I've had a few issues over the years, but absolutely no showstoppers, and in no way any "disaster", much less an "absolute" one.


How do I upgrade all packages with Pip? Why are there multiple ways to install packages, e.g. eggs, setup.py, pip, easy_install (just off the top of my head)? How does pip manage to break dependencies? Why does no standard way to manage packages come pre-installed? How do I pin packages at a version? Why is there a need for virtual-env to exist as a separate project?

Compare this to how languages like Go, Rust, Ruby, and Julia handle packages and dependencies and Python is an absolute disaster. Even if there are answers to the above questions, as a fairly advanced user I have no idea what they are, and I have done plenty of research.


I'll try to answer your questions.

> How do I upgrade all packages with Pip?

I don't know how to upgrade all packages, but that's not something I want to do anyway, because I want to control which packages I upgrade. To upgrade a single package you can do

     pip install --upgrade packagename
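If you really did want to upgrade everything, one rough approach -- assuming a newer pip that supports `pip list --outdated --format=json` -- is to parse that output and feed the names back to `pip install --upgrade`. A sketch using a canned sample instead of a live subprocess call:

```python
import json

# Sample of what `pip list --outdated --format=json` emits on
# newer pip versions; in practice you'd capture it with
# subprocess.check_output.
sample = """[
  {"name": "requests", "version": "2.5.0", "latest_version": "2.7.0"},
  {"name": "six", "version": "1.8.0", "latest_version": "1.9.0"}
]"""

outdated = [pkg["name"] for pkg in json.loads(sample)]

# Feed the names back into a single upgrade invocation.
upgrade_cmd = ["pip", "install", "--upgrade"] + outdated
print(" ".join(upgrade_cmd))
```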
> Why are there multiple ways to install packages, e.g. eggs, setup.py, pip, easy_install (just off the top of my head)?

Egg is a package format. setup.py is a build script. pip and easy_install are package management tools. You use setup.py to build eggs that you install with easy_install or pip. You can also install directly with setup.py, but that's not something you'd generally do. pip is a better, more recent installation tool than easy_install.

> How does pip manage to break dependencies?

I'm not sure what you mean here.

> Why does no standard way to manage packages come pre-installed?

I guess the answer is because no one had bothered solving this issue until recently. Starting with Python 3.4, pip is included by default. See https://docs.python.org/3/installing/index.html
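Concretely, 3.4 ships the `ensurepip` module, which carries a private copy of pip inside the standard library and can bootstrap it without touching the network:

```python
import ensurepip

# The version of the pip wheel bundled with this interpreter's
# stdlib; `python -m ensurepip` would install exactly this copy.
print(ensurepip.version())
```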

> How do I pin packages at a version?

You list your packages with their version numbers in a requirement file that you can pass as an argument to pip. You can use pip freeze to get a list of currently installed packages with their pinned version numbers that you can include into your requirements file.

See http://pip.readthedocs.org/en/latest/user_guide.html#require...
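For example, a requirements file (package names and versions here are hypothetical) is just one pinned specifier per line, typically generated with `pip freeze > requirements.txt` and consumed with `pip install -r requirements.txt`:

```
Django==1.8.2
requests==2.7.0
six==1.9.0
```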

> Why is there a need for virtual-env to exist as a separate project?

No need for that, it just hasn't been integrated in the standard distribution until fairly recently. Starting from Python 3.3, venv is included: https://docs.python.org/3/library/venv.html
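The stdlib version is also scriptable; a minimal sketch of creating an environment from Python (the directory name is arbitrary):

```python
import os
import tempfile
import venv

# Create an isolated environment; with_pip=True would also
# bootstrap pip into it via ensurepip.
target = os.path.join(tempfile.mkdtemp(), "myvenv")
venv.create(target, with_pip=False)

# Every venv carries a pyvenv.cfg marker pointing at its base python.
print(os.path.exists(os.path.join(target, "pyvenv.cfg")))
```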

> Compare this to how languages like Go, Rust, Ruby, and Julia handle packages and dependencies and Python is an absolute disaster.

Absolute disaster is a bit strong, but it's admittedly not as good as the other languages you mentioned. I think every Python developer who knows other languages will agree. That doesn't stop us from getting our job done though and the situation is improving.



