I switched our webapp deployment from using sdists from an internal pypi instance (yay cheeseshop!) to using wheels built on the server, and the time to install things such as lxml or numpy went from 20-minutes-ish to 20-30 seconds. Python wheels are fantastic.
I've been looking at doing something similar in our environment but there's so many options I haven't figured out what the best and most straightforward way might be.
Usage is as simple as
mkwheelhouse mybucket.mycorp.co scipy numpy
pip install --find-links mybucket.mycorp.co scipy numpy
The downside of this approach is you can't host private packages, because you need to enable public website hosting. (Although, VPC endpoints might have changed this!) But the simplicity of this approach plus the massive speedup of not needing to constantly recompile scipy was totally worth it.
I'm using cheeseshop, but some people swear by Warehouse, which is the legacy-free rewrite that's supposed to eventually run PyPI itself.
If you don't care about the search API, you can also just enable a directory-listing index page and use any web server. Pip will do the right thing when given the right incantation of magical arguments and a prayer to the pip gods.
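For instance (hostnames here are made up), something along these lines works against a plain directory listing of wheels:
pip install --no-index --find-links https://pkgs.internal.example/wheels/ mypackage
pip install --find-links https://pkgs.internal.example/wheels/ mypackage  # or keep PyPI as a fallback for anything you haven't mirrored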
You get the benefits and drawbacks of Google Cloud Platform.
As a systems engineer, I've always found it challenging to maintain a sane python ecosystem across multiple hosts.
1) use system python -- if you can. Now with python3.4 and 2.7 in stable, that's the easy bit.
2) If you can't do 1), use backports and/or manually backport
3) If you need python-packages that depend on c-libraries, use "apt-get build-dep python-foo".
4) use a virtualenv, or several virtualenvs, eg:
virtualenv -p python2.7 myvenv
./myvenv/bin/pip list -o          # list outdated packages;
                                  # typically that'll mean you first need to:
./myvenv/bin/pip install -U pip   # possibly also setuptools
This works for most everything -- except perhaps for pygame, which is exceptionally gnarly.
Manually backport, the easy way:
sudo -s                        # become root just to edit apt sources
echo "deb-src http://httpredir.debian.org/debian sid main contrib non-free" >> /etc/apt/sources.list
apt-get update
apt-get build-dep python3
exit                           # no need to build as root
apt-get source python3         # this assumes sid has a version that'll work for you
cd python3*/                   # cd into the unpacked source tree
dpkg-buildpackage -uc -us
cd ..
sudo dpkg -i python...deb      # i.e., the .debs produced by the build
I'm assuming your systems engineer perspective is colored by RedHat (and others) packaging python-ldap at the system level and then leaving you to your own devices for everything else.
C bindings are really convenient, but then you have to worry about dynamic linking. Are you sure Python is linked to the same OpenLDAP library that NSS uses? Hence, system versions of some Python packages, but only where other system level dependencies are an issue.
I think some of the problems were self-inflicted by trying to use Python on a Mac with no package manager, and on Windows. Things will get easier on the Linux side with Python 3 in repositories now. But it really seems like a fairly common pattern for languages / programming environments is to install the language's package manager and upgrade packages separately from the OS package manager. For whatever reason this seems like a bigger headache with Python than with others.
But seriously, mirror your production environment for local development and testing. This is all comically easy with VMs, you just need to alter your workflow.
If pip doesn't work on Windows then it shouldn't ship on Windows. This isn't Windows' fault.
Granted, this was recently, under Windows 8.1 (tried it mostly for fun, and so that I have a decent calculator (IPython) on hand when I boot up in Windows).
I was also pleasantly surprised: building Python C extensions under Windows is easy. Building some of the C or Fortran dependencies, on the other hand, might not be.
Pip will drop ipython in the scripts folder too -- so that's where one can grab a shortcut. Or just (after restarting python):
import IPython
IPython.start_ipython()  # Ed: not .start()
There is a clear lack of _direct_ and _concise_ documentation about Python packaging. Something between overly descriptive reference documentation which nobody reads and "copy-paste this setup.py" that teaches nothing of substance.
Let me use the OP as an example:
"Wheels are the new standard of python distribution and are intended to replace eggs."
One important thing to remember is that packaging is a means to an end. Nobody gets excited about packaging, so nobody wants to know too much about it -- but also not so little that thinking about doing it instills feelings of dread.
PS: This applies to other packaging scenarios too. I've seen people fall to their knees when faced with building Debian packages for deployment (ending up not doing it at all), RPMs, etc.
Python's virtualenv is a lot more stable than the various hoops one goes through with eg rvm. On the other hand, if you needed/wanted to use more than python2.7/python3.4 -- it might be a bit harder to juggle many python versions than it is to juggle many ruby versions with rvm.
So the problem bundler/venv solves becomes a solution to 95% of the entire problem, rather than just 50%.
If by packaging you mean deployment, yeah, it's a mess. Only in the sense that you can't just FTP your application to any old server like you do with PHP projects.
It'd be nice if Python for the web "just worked", but it really isn't that difficult to get things set up (Nginx, Gunicorn, PostgreSQL, etc.). Also, there are a lot of reasons not to want a canned PHP-type setup.
We have a policy to never bring-up servers by hand. You capture your setup and configuration in Ansible scripts, document them well and things just work. Then use Selenium to confirm that all is well with a basic test application. Life is good.
Python is a friggin' nightmare by comparison. Some trivial low-complexity operations can take insane amounts of elbow grease due to missing/outdated libs.
Pip does a remarkable job of magically getting and compiling missing c/c++ libraries, and works great in tandem with virtualenv.
From that perspective, I can see the OP's point. For most of Windows history, if you needed a library installed, you'd install it. Every installation was accomplished through the familiar Windows installer. If you needed a newer version, you installed it. There was process, and one target.
Now take modern Python. A new layer has been introduced, where you may run every project in a new target, or through the 'default' install. In addition, you have libraries and packages that may be installed through one of several different installers or processes, most of which are different than the OS's package management, and which aren't necessarily compatible and aren't tightly managed. This is on top of multiple python versions that may have to be installed.
I can see where he's coming from.
That being said I love Python and I respect the work that has been done to allow such a vibrant user base.
Have you tried setting $JAVA_HOME and restarting? rimshot
Here's a good overview of real world issues: http://lucumr.pocoo.org/2012/6/22/hate-hate-hate-everywhere/
(I'd imagine the core team thinks Armin should just embrace Python 3 already. And some of the details have changed since this post, naturally.)
I help co-maintain salt, a config management tool which he used and contributed quite a few bug reports to.
I've had a few issues over the years, but absolutely no showstoppers, and in no way any "disaster", much less an "absolute" one.
Compare this to how languages like Go, Rust, Ruby, and Julia handle packages and dependencies and Python is an absolute disaster. Even if there are answers to the above questions, as a fairly advanced user I have no idea what they are, and I have done plenty of research.
> How do I upgrade all packages with Pip?
I don't know how to upgrade all packages, but that's not something I want to do anyway, because I want to control which packages I upgrade. To upgrade a single package you can do
pip install --upgrade packagename
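If you really do want to bulk-upgrade everything, a blunt one-liner people sometimes use is something like the following (editable/VCS installs will trip it up, so treat it as a rough sketch rather than a supported feature):
pip freeze | cut -d= -f1 | xargs -n1 pip install -U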
Egg is a package format. setup.py is a build script. pip and easy_install are package management tools. You use setup.py to build eggs that you install with easy_install or pip. You can also install directly with setup.py, but that's not something you'd generally do. pip is a better, more recent installation tool than easy_install.
> How does pip manage to break dependencies?
I'm not sure what you mean here.
> Why does no standard way to manage packages come pre-installed?
I guess the answer is because no one had bothered solving this issue until recently. Starting with Python 3.4, pip is included by default. See https://docs.python.org/3/installing/index.html
> How do I pin packages at a version?
You list your packages with their version numbers in a requirements file that you can pass as an argument to pip. You can use pip freeze to get a list of currently installed packages with their pinned version numbers to include in your requirements file.
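For example:
pip freeze > requirements.txt      # capture exact, pinned versions
pip install -r requirements.txt    # reproduce them in another env/machine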
> Why is there a need for virtual-env to exist as a separate project?
No need for that, it just hasn't been integrated in the standard distribution until fairly recently. Starting from Python 3.3, venv is included: https://docs.python.org/3/library/venv.html
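For example, on 3.4+ (where ensurepip also gives you pip inside the new environment; "somepackage" stands for whatever you actually need):
python3 -m venv myenv
./myenv/bin/pip install somepackage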
> Compare this to how languages like Go, Rust, Ruby, and Julia handle packages and dependencies and Python is an absolute disaster.
Absolute disaster is a bit strong, but it's admittedly not as good as the other languages you mentioned. I think every Python developer who knows other languages will agree. That doesn't stop us from getting our job done though and the situation is improving.
For deploying software to end-users (rather than developers), bundling all required Python code with appropriate wrapping to adjust PYTHONPATH works just fine.
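A minimal sketch of what such a wrapper can look like (all paths here are made up for illustration):
#!/bin/sh
# prepend the bundled libraries so the app imports its own copies first
PYTHONPATH=/opt/myapp/lib:$PYTHONPATH
export PYTHONPATH
exec python /opt/myapp/bin/main.py "$@"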
1. linux package management is sufficient - this isn't the case. Linux package management couples your library versions with your distro version, meaning that if I want to do something like try out my code with the latest pandas, or run 2 jobs with different versions of numpy, I'm out of luck.
2. conda went ahead and forked everything without regard for the community - this is completely untrue. We sat down with Guido for the very first PyData at Google's offices in 2012 to discuss many issues, packaging being one of them. Guido acknowledged that the Python packaging ecosystem isn't sufficient for the scientific community and that we should go out and solve the problem ourselves -- so we did. Honestly, on this point you should just read Travis' own words about the issue.
Scientific users need to be able to package non-Python libs along with Python libs (think zmq, or libhdf5, or R), so we need a package-management solution that sits OUTSIDE of Python. You can think of conda as sitting somewhere in between virtualenv and Docker.
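For example (env name and package set invented for illustration), a single conda command can pull Python and non-Python libraries into the same environment:
conda create -n analysis python=3.4 numpy scipy hdf5 zeromq
source activate analysis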
Wheels and Conda are almost completely orthogonal. Conda is a higher-level abstraction than pip. Wheels operate at the lowest level of abstraction.
Multiple package management tools are relatively fine. Having multiple competing format standards (egg vs wheel vs lots of other options) would be a disaster.
(For reference, there's .deb versus .rpm, but also dpkg versus apt-get versus yum versus up2date versus...)
> Having been involved in the python world for so long, we are all aware of pip, easy_install, and virtualenv, but these tools did not meet all of our specific requirements. The main problem is that they are focused around Python, neglecting non-Python library dependencies, such as HDF5, MKL, LLVM, etc., which do not have a setup.py in their source code and also do not install files into Python’s site-packages directory.
I haven't used Conda, but I totally get their main point here. Python packages sometimes depend upon non-Python packages. Any Python-specific packaging solution that cannot express that e.g. a Python package depends upon a native library does not really solve this problem.
I don't regard packaging, dependency management, and isolation as programming-language-dependent problems.
Windows is getting a package manager now, and OS X arguably has the App Store.
Everywhere else (read: Linux/BSD) -- most of what's packaged by Conda, is already packaged by the vendor.
In short: I don't think it's the best way forward (but it may very well be the best way for many usecases/users right now and in the near future).
I believe it uses virtualenv for this, but I may be wrong. Regardless, having this feature (as well as no interference with the system Python) is one of the killer features for me, and others I'm sure.
There's probably also some element of "it isn't the best general solution, but it's super convenient for some particular group of people, e.g. python users".
On the one hand, there are people who write, distribute and deploy Unix-y-type software -- system utilities, web applications, service daemons, etc. These users write for users who are like themselves, or who are sysadmins of Unix-y systems. Their concern is with automatable, repeatable, command-line-friendly packaging, distribution and deployment. They benefit from wheels because it saves them a potentially-time-consuming build step during installation, and will generally prefer wheels or tarballs as a distribution format.
On the other hand, there are people who use Python as a scientific computing and number-crunching stack. Many of them do not write and distribute their own applications; they distribute the results of their code. Many of them run on desktop-type machines using Windows. What they primarily want is a single .exe they can download and run which will provide Python plus all the numeric/scientific libraries in one install, plus (increasingly) IPython Notebook as a way to share their work.
There is no single standard which works well for both of these groups.
Sure there is. A single exe that bootstraps Python and leverages wheels to install what the scientific user needs.
In fact, with so many new tools coming to Windows lately (as in the past decade) -- that exe could probably be a powershell script. And an equivalent shell script could probably do the same for more unix-y platforms.
Or one might make a script that builds an installer that bundles up things downloaded via the wheels framework.
The distributions that I've used, unpack themselves to look just like a mainstream installation, including pip. So, if you have to install a missing package, you still do so in the same way. The distributions just give you a head start.
Specifically, there aren’t conda packages for every Python package. And if there are, they are behind their pip alternatives in many cases. Sometimes you then need to fall back to pip. However, it is impossible to install some packages into conda environments via pip. So basically the whole solution falls apart because the python libs you install via pip can't find C libraries installed via conda.
One example is packages that link against OpenSSL: their solution was to include openssl in conda. Clearly, this solution doesn't generalize to other libs -- just things that use openssl.
This means the Python community can work towards standard tools and interfaces for working with Python packages rather than trying to support many different ways of installing / distributing packages.
Then again, we tend to struggle a bit with Python package management in general (who doesn't, I guess).
This isn't much of an advantage as you're trusting code compiled on an untrusted machine. I suppose it saves you some CPU cycles, but don't be fooled into thinking it's safer.
It's not terribly compelling given stuff like virtualenv, though, since users can just `pip install` stuff on their own and the experience is strictly nicer. And it's also not compelling if there isn't an explicit promise that this is treated as a security boundary.
Ah, that makes a lot of sense, especially for applications (eg meld) and command line utilities (eg httpie) which are nice to install systemwide as root, but only ever run in an unprivileged user's context.
Also, the ordering on the right seems to imply they are in order of popularity on pypi.
Does anyone know where they got their list or if they are associated with pypi in some way?
Also, learn to use:
pip install --user <package_name>
Depending on your resources that may be possible or not; personally I can't provide them as I only use Linux.
Unfortunately wheels don't solve this problem, as it seems inherent to how Python imports modules.
I've seen people get around this by vendoring their dependencies and I've done some tricks in the past where you provide a vendor proxy package that you can point at a different site packages but this is brittle if your dependency has additional dependencies or uses absolute imports. Now you're maintaining a fork.
I'd love to hear if anyone has had more success in this realm. I love Python but using node and rust, and having the ability to have isolated dependencies trees has made me sad about the state of python packages.
Pillow: buttery smooth on Ubuntu LTS and OS X for a long time
python setup.py bdist_rpm
Should python really have been called "rodent" with "wheels" being the "cheese wheels" we rodents get from the "cheese shop"?