Hacker News
Why I hate virtualenv and pip (pythonrants.wordpress.com)
105 points by dissent on Dec 6, 2013 | 113 comments

A lot of this seems to boil down to a combination of seeing others misuse tools and assuming that's what they're for (which is a communication/teaching failure, not a failure of the tool), and looking at stepping-stone solutions.

Take binary packages, for example. Sure, eggs did that. Sort of. But they also introduced a weird parallel universe where you had to stop doing normal Python things and start doing egg things. So pip eschewed eggs.

And meanwhile the community banded together to find a way to separate the build and install processes, the result of which is the wheel format:


Similarly, virtual environments are still being improved (with the improvements now being integrated directly into Python itself).

And yes, you can use pip's requirements files as a duplicate way to specify dependencies. And people do that, and it's unfortunate. Because the thing requirements files are really useful for is specifying a particular environment that you want to replicate. That might be a known-good set of stuff you've tested and now want to deploy on, it might be an experimental combination of things you want others to test, etc., but that's something setup.py's dependency system isn't good at. And repeatable/replicable environments are certainly an important thing.
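To make that concrete (the package names and versions here are invented), a requirements file is a snapshot of a particular environment, not an abstract dependency declaration:

```
# requirements.txt - exact pins captured from a tested environment
Django==1.5.5
psycopg2==2.5.1
South==0.8.2
```

You'd typically generate it with `pip freeze > requirements.txt` from the known-good virtualenv, and replay it elsewhere with `pip install -r requirements.txt`.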

etc., etc.

Yeah, I think that was largely my reaction as well. Attacking virtualenv in particular because you've seen others use it as a replacement for a completely isolated environment just says that at the very least they didn't understand its intended use case, and perhaps OP doesn't either.

Also, while I support his offering a full-isolation alternative, realistically not everybody is going to want to develop in a Vagrant environment. It's a great solution if you're willing to run with that sort of overhead, but not everybody is.

I'm not sure that full-isolation is an alternative, they're somewhat orthogonal solutions. It's perfectly normal to use pip and virtualenv inside a virtual container, so that the base operating system is a known environment.

Just saying "use full-isolation" doesn't solve the problem where I want to deploy multiple projects in the same unix environment. Heaven forbid I might want two processes from different projects to cooperate, without needing to do everything via APIs and TCP. Overheads there are not just performance, but also development time.

> Just saying "use full-isolation" doesn't solve the problem where I want to deploy multiple projects in the same unix environment.

Isn't that not full isolation, then?

Yeah. pip/virtualenv allows you to run multiple python processes without full isolation, but without conflicting with each other's libraries either. That's the point.

Provided there's no conflict outside of python, that is. This also sounds like a deployment scenario, which ought to be non-interactive, in which case you can set sys.path in your entry point. That's the point.

I was pleased to see wheel mentioned. Here is a wheel anecdote:

suppose you wanted to automate the installation of a python package to a windows machine, but the package has binary dependencies, is irritating to build from source, and is distributed as a pre-built .exe interactive installer (click next, next, ...). you can `wheel convert` the exe installer to get a binary wheel archive, then automate the installation of that wheel archive with pip. hopefully this isn't a common scenario, but the fact that pip + wheel make this kind of thing possible at all is very helpful.

I fail to see the attraction of wheel, and the uptake since it was invented appears to be basically non-existent.

Can someone seriously explain what the point of it is?

As far as I can tell, it's: make life easier for windows python users (which I am one of, and I don't care about it at all; you need VS to build everything anyway; one or two wheel modules make zero difference; it'll only make a difference when everything uses wheel, which will be never...).

I read the wheel PEP. I got to the part about munging #! shebang lines and stopped reading. Shebang is just totally irrelevant outside of interactive command line convenience. Just run the python you want to run explicitly! Just like java.

As a programmer who uses a lot of Java and .NET, I was surprised by how complex pip and virtualenv are. The complexity is hidden well enough if you just want to get down to business, so it's not really a practical concern, but still - copy an entire Python environment over? Why? Why can't I just have the pip'ed libraries in a subdir somewhere and tell Python to look there?

Plain old JARs get this right, Maven gets this right, NuGet gets this right, NPM gets this right. Why is it so complex on Python and Ruby? Some technological aspect of the (flexibility of) the languages that need you to basically copy over the entire world? Or just legacy of unfortunate bad design choices in earlier days?

Well, I would not want to copy a virtualenv from one machine onto another - that approach sounds like it's fraught with trouble.

I think the sane way to do it is to package your app using setuptools (and list dependencies) and then use pip install to install it in a new virtualenv on the production machines.

Here's how I do it on the project I'm currently on:


Ruby gets it right, too. You can have a very simple gem setup if you want it - set GEM_HOME and away you go. I'm sure it's possible to get the same simplicity with Python.

The issue I see is that the default tools people turn to hide this simplicity behind a shiny interface, and encourage people to think of the management tools as magic black boxes. The tools then become the only interface people know, and they're inevitably more complicated and fragile than the underlying mechanism.

Python predates Java and .NET (never mind NPM), and was originally more systems-oriented; the idea is that you install libraries globally, using your system package manager, just like with C libraries. pip/virtualenv had to be retrofitted on afterwards.

Because (and correct me if I am wrong) jars (and certainly wars) contain bits other than the part you wrote.

Let's say you are packaging up a Java and a Python "program". Both print Hello world to stdout, but use a third-party spell checking package. Both the venv and the jar will contain that third-party package.

All python needs to go from venv to jar is a tarball process.

That is roughly where I see us going anyway - PS: anyone with a good understanding of jar files please jump in - I would love to map out the python parts against the jar parts (i.e. pip install is what part of the java setup process?)

Python also understands zip format for modules. So you run

zip -r mymodule.zip mymodule/

and you have the same portability as you have with jar. For bonus points, you can append that zip to a copy of the python executable (either python.exe or whatever binary your OS uses) and it is available as a module for that instance of python. This is one way standalone Python program executables are distributed.

(this was only tangentially related, but I think it's cool. I may also be wrong)

> but use a third party spell checking package. Both the venv and the jar will contain that third party package

Jar files do not support including third party libraries. You simply cannot put a jar in a jar. Eclipse/maven etc. can explode the dependencies' class files inside of the jar, but people rarely do that for anything other than distributing binaries of client apps (licensing becomes a massive pain btw).

war/ear files, on the other hand, do support including dependent jar files.

I may have misspoken (could not re-read thanks to my antiprocrast): I am thinking of a package format, probably more similar to .deb than .jar, that will drop in the code I have written, plus the third-party packages, plus the system-level packages, i.e. all the dependencies that can get compiled.

The idea is to divide the Operating system from the application. There is never a good divide point but at some point we can say the chef/puppet/salt infrastructure will give us the operating system, and the big binary blob (.deb) will give us everything else.

Throw in configuration as a separate function and I am going to have to lie down in a dark room.

>Why? Why can't I just have the pip'ed libraries in a subdir somewhere and tell Python to look there?

Because that ties you to a single system wide python. Why would you want that?

Yeah, yeah, NPM and its community get plenty of other things wrong, but not the particular flavour of dependency hell that we're discussing here. It's all neatly there, in a local node_modules subdirectory, and assuming the modules have been properly maintained, it just works.

You have to do actual effort to install an npm module globally (namely, provide a switch). The default does the sane thing.

>You have to do actual effort to install an npm module globally (namely, provide a switch). The default does the sane thing.

No it doesn't? You have to manually create a node_modules dir in the current directory because otherwise it'll scan up your tree for any parent dir containing node_modules, which is pretty much never what you want.

The npm registry can be replicated as CouchDB is the underlying data store.


python actually has reasonably good support for this via a single environment variable (PYTHONENV or PYTHONHOME, I think; it's been a while since I looked at this). The problem is that it overrides all the default search/path locations, and thus is missing the complete stdlib unless you copy everything into it. There's also the matter of binary modules that cannot be shared between some interpreter versions (which feels like it begat Ubuntu's questionable python-dist symlink nightmare).

Nope, PYTHONPATH just prepends to sys.path. Does exactly what you say it doesn't do. It's virtualenv that can drop the system, thus necessitating a local copy.

I make no mention of PYTHONPATH.

> NPM gets this right.

Until you want to actually deploy to production -- then good luck, write your own tools.

>Provided the developer is fully conscious that this is python-level-only isolation, then it is useful.

Of course we know it's python level only isolation. We're still running in an OS with a filesystem and such. If we wanted something more we'd use jails or something similar.

>Full methods of isolation make virtualenv redundant

So what? They are too heavy handed, and 99% of the time, I don't want them anyway.

>It is very, very easy to install something as large as a Django application into a prefix. Easier, I would argue, then indirectly driving virtualenv and messing with python shebangs.

You'd argue, but you'd lose the argument.

>You need to preserve this behaviour right down the line if you want to run things in this virtualenv from the outside, like a cron job. You will need to effectively hardcode the path of the virtualenv to run the correct python. This is at least as fiddly as manually setting up your PATH/PYTHONPATH.

Yes, if only they gave you OTHER BENEFITS in exchange. Oh, wait.

In general, move along, nothing to see here...

> You'd argue, but you'd lose the argument.

Done it already, in production. The virtualenv fanboys didn't even notice. It's simple and elegant and works perfectly.

The argument wasn't that it can't be done.

It was that it's "easier" ("Easier, I would argue, then indirectly driving virtualenv and messing with python shebangs").

Also, "virtualenv fanboys"? Please, are we 16 years old?

Why I love virtualenv and pip.

We use virtualenv and pip extensively here, with virtualenvwrapper.

  mkvirtualenv <project>
  workon <project>
  pip install -r requirements.txt
It just works. I don't spend any time on it. Our developers don't have any problems with it. All the other considerations in the article we either handle as you're meant to, or understand the limitations.

Still, looking forward to some interesting comments on here.

Did you actually read the entire article, or did you just come here to say that?

It extends to deployment. Off the top of my head, the script that you have to write to trivially deploy with virtualenv is something like:

    virtualenv --no-site-packages -p "$PYTHON" "$workdir"
    (cd "$my_package_dir" && "$workdir/bin/python" setup.py install)
    virtualenv --relocatable "$workdir"
    fpm -s dir -t deb -n "$package" -p "$package.deb" -d <system dependencies> ...
And you have a deb that you can just scp anywhere and it'll just work.

Thanks for your pointless blog post

Thanks for your pointless comment

Sure, pip's imperfect; I have to install the mysql header files, woe is me. But the cost/benefit tradeoff is better than LXC; pip gets me most of the isolation with much, much less overhead.

Is the author really claiming that it's easier to script a non-virtualenv deployment than a virtualenv one? If so, great, do that - the only reason I deploy with virtualenv is because, guess what, that's easier to script.

Why default to --no-site-packages? Because it helps and it's easy. No, I'm not perfectly isolated from my host system - but then the host system could have a broken libc and then nothing, not even LXC, is going to make your isolation system work. Just because you can't isolate perfectly doesn't mean there's no point isolating as much as you can.

Yes, pip builds from source. That's a lot more reliable than the alternative. The Java guys certainly aren't mocking you if they've ever done the actually equivalent thing, i.e. deploy a library that uses JNI, which is a complete clusterfuck.

(URLs as dependencies are indeed a bad idea; don't do that. The complaint about pip freeze is purely derivative of the other complaints; it's wrong because they're wrong).

Yeah, the lxc package gives you command line tools that are a close analogue to virtualenv. Combine it with a filesystem like btrfs for copy-on-write, and it's FAR quicker too.

I'm glad you mentioned JNI. In Java, native is the exception. In python, it's much closer to the rule. A hell of a lot of python libraries rely on C components which leak out of a virtualenv.

Building from source isn't reliable. It's quite hard, not to mention relatively slow. See the great success of RPM and APT based Linux distributions as proof of this.

Why pip?

pip uninstall psycopg2


pip install --upgrade psycopg2

But I guess with easy_install you can fake it by running with -m and then deleting the errant egg files in lib and bin. That's pretty easy, I guess.

Oh but hey, you know what you can do instead? Set up a virtualenv, easy_install everything, and when it gets hopelessly out of date or munged, you can just delete the virtualenv directory and start again.

Snark aside, I would agree with the OP that the "feature" of installing via arbitrary URLs is an anti-pattern and encourages lazy development. Of course, not every package we build can be posted to a public package library, so there's always that issue with easy_install too. Sigh, what a mess we have. Good thing I'm still able to get work done with these tools :)

Good luck pip installing psycopg2 on Windows :)

Haha. Win-what? In my domain I am happily able to ignore almost all attributes of the Windows operating system. Well, except for IE <shakes fist at the air>

I think we should look back to the rant of Python Core Committer Hynek Schlawack. (https://hynek.me/articles/python-app-deployment-with-native-...)

In short: build your system and its dependencies once and once only, then pass them around through test into live.

We have three competing needs:

- a reliable deployment process that can move a binary-like blob across multiple test environments

- exactly reproducible environments, but without dd/ghosting

- keeping things simple

Isolation is good - whether through the mostly isolated approach of venv, the almost total isolation of jails/LXC, or the vagrant approach. But they focus almost entirely on binary builds - how does one pass around a python environment without rebuilding it and its dependencies each time, a la pip?

Well, by taking the built, running python environment and passing it into a package manager like apt, and calling that a binary. That might mean tarballing a venv or tarballing /usr/local/python, but in the end what matters is that we pass around the same basic bits.

I am working this out in pyholodeck.mikadosoftware.com and in my head - when I have a good answer I will shout

While I see where he's coming from, I really can't agree with many things he's saying:

"For python packages that depend on system libraries, only the python-level part of those packages are isolated."

And there's nothing really bad about it. Well-written python libraries will work with any previous version of the library they're wrapping. They will also report incompatibilities. It's ok to use system libraries - especially if you're advocating getting rid of virtualenv, as the author does.

"Full methods of isolation make virtualenv redundant"

Well... no. There are times when installing a local version of some library is required and it cannot be installed system-wide, or it will break the system's yum, for example. You're not only isolating your app from the system, but also the system tools from the app.

"virtualenv’s value lies only in conveniently allowing a user to _interactively_ create a python sandbox"

There's nothing interactive about what `tox` does for example and it's a perfect example of why virtualenv is useful. You can have not only a virtualenv for testing your app, but also multiple configurations for different selected extras - all living side by side.
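For instance, a minimal tox.ini (the env names and deps here are illustrative) builds one virtualenv per environment, entirely non-interactively:

```ini
# tox creates and reuses one virtualenv per envlist entry
[tox]
envlist = py27, py33

[testenv]
deps = pytest
commands = pytest
```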

"Clearly virtualenv advocates don’t want any hidden dependencies or incorrect versions leaking into their environment. However their virtualenv will always be on the path first, so there’s little real danger"

Until you want the same package that's available in the system, but your app's version constraint is not looked at when the system's package is upgraded. Or you want different extras selected. Or your deps are incompatible with some system application's deps, but you're calling it via subprocess (this is also where changing the python path in the shebang comes in useful).

Venvs are definitely not perfect, but for testing and installation of apps, they're amazingly useful. The binary libs issue is definitely annoying, but there's a different solution for it and I'm happy to see it used more often - don't compile extensions, but use cffi/ctypes.

Companion read, "Python Packaging: Hate, hate, hate everywhere" by Armin Ronacher (June 2012): http://lucumr.pocoo.org/2012/6/22/hate-hate-hate-everywhere/

Has anybody here used conda[1] for anything significant/serious yet? I've been using it in passing for my side and small projects but I'm still not convinced I want to go whole hog yet.

Regardless, my experience with it so far has been... ideal. It really makes building environments and linking/unlinking packages a breeze. I haven't needed it for building my own packages yet, so we'll see how that goes.

[1] http://docs.continuum.io/conda/

If anyone needs more color: Travis, the CEO of Continuum and the author of NumPy, just wrote a great post explaining why conda exists, and why virtualenv and pip aren't sufficient for the scientific python community.


To both commenters so far I would like to point at PEP 370, the per-user site-packages directory available since 2.6.

I started using it recently and I see no need for virtualenv anymore.

I have nothing to say about the pip issue though, never had an issue with pip myself.

Back in 2009, Jesse Noller wrote about his experiences with PEP 370 and seems to have felt that it is an improvement on the status quo and a complement to virtualenv, but not a replacement:


Please add a link to the PEP in your post. Thanks.

What has the author built to replace pip?


Generic algorithm of making things better:

0. Give it a go to fix it oneself first. Really.

1. Failing the previous, raise the perceived deficiency with a specific and workable proposed solution.

2. Failing the previous, indicate what's undesirable and how, and what behavior would be desirable.

3. Failing the previous, put a monetary bounty on the feature, fork the project or live with it. Rewriting from scratch has a 99.99% probability of being several times more work than it seems.

Replied to wrong parent?

Nope. The claim calls out that the author complained without also offering a constructive, alternative solution.

While I don't agree with the author's views, I have to say that he did offer the alternative solution of lxc or vagrant to more fully isolate his python environments.

The lightweight part is pretty useful. LXC is definitely overkill. I don't want to have to bridge my graphics and networking over so that I can run programs against different versions of libraries. Going more lightweight, if I'm doing PYTHONPATH and PYTHONHOME by hand, I would start scripting them, then script the installation of libraries into my virtual environment - and I've just recreated virtualenv, badly...

--no-site-packages has been default for a while. http://www.virtualenv.org/en/latest/virtualenv.html#the-syst...

I don't really see the argument about compiling against system headers and libs. Generally I do want to isolate my Python modules that are calling into other binary libs, but don't care about isolating those binary libs themselves, because their interface isn't changing when the Python wrapper for them changes. This is unless they are part of what I'm wanting to develop/deploy with, in which case the source will be in the virtualenv and install into the virtualenv using the config script at worst. A frequent example ends up being how Pygame will compile against some system libvideo.so, the behavior of which I never change, but many Pygames might have their own APIs etc., and so the many compiled versions do have their own use.

Virtualenv is actually pretty noob friendly because one of the mistakes I see far more frequently than the others is that users will install things using pip system-wide that conflict with the system package manager. This can become pretty difficult to unscrew for inexperienced Linux users.

I've been meaning to actually add some virtualenv docs because of the frequency with which inexperienced Python and Linux users will waltz in and not be able to compile something because only the old version of Cython etc. is on Ubuntu 11.blah. Thus we start bringing distribution-specific package managers into the realm Python package management was intended for, and people try to figure out what version of Ubuntu they need instead of figuring out that they can install everything in one place, maintain an entire slew of projects without conflicts, and avoid calling on IRC when synaptic clobbers things.

There's a level between virtualenv and LXC, and that's schroot. Combine it with a CoW filesystem and that will cover everything you mention. Although personally, I find LXC very lightweight. Note that in my article, I did point out that I do still use virtualenv sometimes :)

Interesting take on this. I think everyone just accepts pip + virtualenv as the only way to be without questioning it. You have definitely convinced me to reexamine that on my next project.

Well, I really don't agree with any of the arguments in the article. I'll use the experience card and just say that I've been using virtualenv/pip for years and it always served me very well. Made development, testing, and deployment easier. Even if it's hackish and that there are more robust solutions, this one strikes a perfect good enough of quality versus time versus complexity.

Here's an idea, if you want to convince me there's a better way of doing something, don't belittle me for the way I'm currently doing it.

I know pip isn't perfect. I know venv isn't perfect. They do work pretty well though. And when you find something that works well for you in your process, use it.

Some valid points (many of which have been on articles featured on HN before). Shame about the tone.

Please take a look at using and building real packages for your system: RPM and APT. These are battle tested; they handle dependencies, transitive dependencies, multiple repo sources, and {pre/post}{install/uninstall/upgrade} scripts. They provide a transactional way to add software to your systems.

You can use pip and virtualenv as well, perhaps by creating a parallel Python install in /opt or something like that. And then package that up in an RPM if needed.

But if you are installing hundreds of binary files, dlls and using requirements.txt as the main way to specify dependencies you are probably going to end up with a mess.

It is much harder if you have multiple OS systems to support. Installing on Windows, RHEL/CentOS, Ubuntu and Mac OS X in an easy and clean way is hard. But if you target a specific server platform like, say, CentOS 6, take a look at RPMs.

This happens to be just where some of my issues come from. When building a monolithic python app into an RPM, Virtualenv was somewhere between annoying and pointless. It's this belief that it actually does something other than set paths that annoys me about it. Just listen to the rants on here. If it had been named Python Pathsetter, I don't think it would have gotten a following!

This is pretty much what I often discuss with a friend of mine. Like you he always tells me to use RPM/APT instead of the Python stuff. But my goal is not that the stuff runs as clean as possible on one specific system, but that it runs on as many OS's as possible in relatively clean manner. If virtualenv doesn't tell my Ubuntu's APT that it installed some Python packages in ~/.virtualenv/project_one then an Ubuntu sysadmin will cry and try to kill me, I really don't care. On the other hand with virtualenv I can make sure, that my team leader can run the same command lines as myself and get my Python project running on his Suse box and the QA department can run my code on their Fedora as well. This is why I use Python and this is why I use virtualenv. I'm not happy with how things are in Python, but if I compare the problems between both solution paths currently I still think I'm better off with virtualenv and co.

You seem to have good reasons -- multiple OS-es.

I was just saying it is good to be aware of RPMs and APT packages. Even old and crusty setuptools has RPM support.

    $ python setup.py bdist_rpm 
(add deps to setup.cfg and make sure to have a MANIFEST.in file to add extra data besides python packages to the RPM package).
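The setup.cfg bit referred to there might look like this (the requires values are made-up examples of RPM-level dependencies):

```ini
# setup.cfg - bdist_rpm reads its options from this section
[bdist_rpm]
release = 1
requires = python-setuptools, libjpeg
```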

You seem to need multiple OS-es. That is hard and I haven't found a clean universal solution.

It also depends on the software. We have lots of mixed C/C++/Node.js/Java/Python/big data files. Using virtualenv/pip, then unpacking tar.gz, make; make install, then java's (whatever it has), then npm, all to set up a repeatable dev and test environment would be a horrible mess. That is what RPM packages provide for us.

This is pretty much what hynek was on about (see my comment below) and what the guys at Parcel are doing (I am a Johnny-come-lately with pyholodeck), but we are trying to do the same thing - build once, package into apt/deb, and deploy many times.

I think the argument above (boss runs different OS) is a fallacy - you want to deploy to the same target OS, probably in the cloud, so optimise for that first then fiddle with different OS. I guarantee people will prefer deploying a cloud server and logging in "just to see" and be happy with manually bringing things up with `setup.py develop` locally.

The code I write is a framework for testers and developers, though. So it won't run on many similar machines but on many very individual machines. But I guess it might even be possible to transform APT/RPM packages in each other and my guess would be that even yum can read one or the other, even if it's not its main package format (don't know Fedora very well).

You guys really got me excited about this. If I can get any air soon, I'll try to learn more about it!

Isolating an entire environment is a better idea than isolating a python environment, but isolating a language's environment is an easier problem to solve, so the tools for it are currently better. I doubt we'll all be doing it this way in 2020, but it works pretty well right now.

My friend who is a python guru uses buildout for all his apps.

Some of the recipes I have seen go more into configuration-management-like stuff but it is cool to see a single buildout script deploy nginx, DB, deps, and app in one go, on any linux box.

> it is cool to see a single buildout script deploy nginx, DB, deps, and app in one go, on any linux box.

Would anyone please link to a practical, working example of this? I want to use buildouts, but I learn from example, and there seem to be very few examples of how to deploy a production configuration. How do the pros do it? What are the gotchas? Is there a book I can buy? Will someone please put together a PDF explaining all this, so that I can throw money at you?

EDIT: Arg, that's exactly what I mean... https://github.com/elbart/nginx-buildout is an ok example just to learn the basics of buildout, but making it "production ready" (i.e. extending it to build postgres, etc) is left as an exercise for the reader. I was really hoping to find a production buildout example... (But thank you, rithi! I appreciate you took the time to dig that one up for me.)

I'm not sure I'd advocate using buildout for this, but (as you've probably seen) there are a few recipes for stuff like haproxy[ha], varnish[va], postgres[pg] and nginx[nx] available.

An example of how to combine many of these, for deploying a complex stack, could be:


Another somewhat complex example:


You might also find this enlightening: http://glicksoftware.com/blog/using-haproxy-with-zope-via-bu...

[pg] https://github.com/securactive/sact.recipe.postgresql

[va] https://github.com/collective/plone.recipe.varnish

[ha] https://github.com/plone/plone.recipe.haproxy

[nx] https://github.com/collective/plone.recipe.varnish

https://github.com/elbart/nginx-buildout Extend to build postgres and other dependencies.

Ansible is made for this. From a fresh instance (or instances) with only ssh access, to Postgres, Nginx, rabbitmq, redis, memcached, django app, etc., all configured and working.

You're probably better off looking into something like Ansible or Salt.

I agree that developers using virtualenv should be aware that it does not provide isolation for system dependencies. On two different projects, I've had to track down a PIL bug on another developer's machine only to find that the wrong version of libjpeg or libpng was installed. Solution? Install the right version. I've never experienced needing different versions of a system library, but if I did, LXC with btrfs sounds like an option worth trying to avoid the overhead of vagrant.

I think the author is just misunderstanding what virtualenv is for. I use it to run different python package versions on the same system. With Pythonbrew (https://github.com/utahta/pythonbrew) I can even run different versions of Python alongside each other. It's not meant for easier deployment, it's meant for easier development.

It depends on what you want. Ultimately, the holy grail is being able to build, today, a bit-for-bit identical binary to whatever you are running in production that you built a day, a week, or a month ago. That way, if you need to debug something obscure, you're not adding extra variables to the equation.

But most people just want to be able to write two apps that use different libraries, and for that, virtualenv is fine.

Whenever I set up a new Django project, it was always a bunch of boilerplate work to get virtualenv set up. Then if you forgot a step you would have to look into what was wrong.

After spending some time with Node.js, I have become spoiled by how well npm works for everything. Albeit, if you want to mess with node versions you need something like nvm, but it is easy to get the hang of.

Seeing something as simple as npm for python would be awesome.

There's nothing in node that can replace Django. Django does so much.

NPM is a wasteland of abandoned projects and nested node_modules directories with broken symlinks to or from a bin that'll never work on your VM's shared directory (unless it's VMware, I guess; some kind of secret sauce they use).

Having said that, they're all OK tools that people use to build pretty cool things. I just think there's got to be a better way, somewhere in between an extern and the npm/pip approach: store your dependencies somewhere where you don't have to worry about a site being down during deploy, or about versioning issues.

It's like freeze and shrinkwrap, but actually capturing everything you need once your stuff works, and putting it somewhere you control, not someone else.

If you're deploying an application (really, if you're working on anything other than a library intended to be used by many other projects) check in your dependencies. This solves a lot of problems regardless of what language/platform you're developing on.

Is it ever not true that when you forget a step, you have to look into what's wrong? In any language, or any project?

1. The fewer steps, the less chance of making a mistake. With node/npm there is no need to set up anything, 'virtualenv' is enabled by default, just npm install and you're set.

2. Some tools make it easier to detect the mistakes and guide you to a solution. Some present you with unintelligible messages or a (for end-users rather useless) stacktrace. Others describe the problem nicely (in prose), point you to FAQ/documentation or tell you what the most likely solution is.

> 1. The fewer steps, the less chance of making a mistake. With node/npm there is no need to set up anything, 'virtualenv' is enabled by default, just npm install and you're set.

You're assuming that setting up virtualenv is a heroic, error-prone process rather than a couple seconds the first time you get a new system – just like node/npm.

Once you've installed virtualenv or npm, the process is identical: you need something, you install it and if something breaks you have to debug that particular package. In both cases, you're going to need to be able to read an error message and in neither case does the challenge usually involve packaging rather than, say, an issue with a shared library or incompatible/unavailable dependencies.

> 2. Some tools make it easier to detect the mistakes and guide you to a solution.

Again, there's no meaningful difference between the two unless you choose to make your environment complicated, which is not a problem specific to a language. I use both node and python on a regular basis and there's no general conclusion to be made about either one – npm installs are slower, python requires me to activate a virtualenv when I open a new window[1], and none of that really matters because no developer should be spending all day installing packages or opening terminal sessions.

1. If this is truly soul-crushing, it's a solved problem: https://gist.github.com/codysoyland/2198913

I'm going to go out on a limb and guess that you haven't yet discovered application templates? If you have, then feel free to disregard this message.

If you've got a bunch of project boilerplate, you can start a new, empty project, configure it to your boilerplate scenario, and then save that as a template for future projects.

Then, the next time you start a new Django app, you'd do something like this:

    virtualenv project_name
    cd project_name
    . ./bin/activate
    django-admin.py startproject --template=/Users/username/Django-templates/boilerplate project_name
    pip install -r project_name/requirements.txt

> it was always a bunch of boilerplate work to get virtualenv setup.

  $ curl -O https://pypi.python.org/packages/source/v/virtualenv/virtualenv-X.X.tar.gz
  $ tar -xvzf virtualenv-X.X.tar.gz
  $ python ./virtualenv-X.X/virtualenv.py myEnv
  $ source ./myEnv/bin/activate
yeah, that's a killer alright...

Yeah, but there's even some boilerplate work to get a new Java project set up. mvn archetype:generate is helpful, but you still have to go in and start adding your dependencies to the pom. Once you've done it a few times it goes pretty quickly. You can easily write a little bash script that does all that stuff for you, too.

Node's npm doesn't work for everything. For example, it fails to install indirect dependencies: https://github.com/isaacs/npm/issues/3124

NPM may be simple to use, but it is not accurate.

Well, one reason to use pip over easy_install is that pip uses HTTPS by default and easy_install doesn't.

Who is dissent? Who is this 'Adam' guy?

No other posts on this 'python rants' blog, nothing else on HN?

IMO this is astroturfing ( http://en.wikipedia.org/wiki/Astroturfing )

Astroturfing generally requires a company backing the movement. Why would a company care enough about pip and virtualenvs and then want to hide that? I didn't see an alternative offered besides using existing open source tools.

docker.io comes to mind.

I actually did not know about this thread until you mentioned Docker in it.

And after reading it, I've realized I've been plain wrong in how I've been constructing my python related Docker containers.

Full disclosure: I love pip. I love virtualenv. I use them religiously.

I also work for the Docker team.

It's all genuine, just one guy who doesn't like virtualenv much. I started the blog just to have somewhere to post that. The person who mentioned names probably worked with me.

And who are you? No other posts, anonymous account, simply to call someone out and try to tie them to a company?

Full disclosure: Know 2 ex coworkers at Docker, no financial ties.

I like pip.

I don't like virtualenv quite as much, but maybe I haven't juggled enough Python versions at the same time (I don't often work on such disparate codebases that virtualenv becomes a real need, and it's a little obtuse for me to set up)...

virtualenv isn't made redundant by Vagrant at all; I use them together all the time. The remote server I deploy to is completely different from my development machine, so I use Vagrant to simulate the remote environment. Since I'm on shared hosting, though, I use virtualenv to make sure that my app isn't depending on shared modules that are out of my control. It's also a LOT more reliable than trying to deal with a userdir python build or pip --user, where you'll probably end up fiddling with envvars anyway, except worse because you'll be doing it by hand instead of through a script.

Binary python packages are useless IMHO. If I want binary packages, I make a binary package for my package management system. Running pip/easy_install on a production system is an anti-pattern.

You summarize one point I learned from this article and the comments: deployment and development are not the same. On development machines it might make sense to easy_install complete package trees, but in a deployment environment the most important concern is clean system-wide package management, and for that I need binary packages in a format that the system's package manager understands.


Yes, I tend to agree here. However, if you're building an RPM, your Python packages will be compiled to binary on the way in. It's not unreasonable in some cases to precompile things to avoid doing it repeatedly.

To each his own, I guess. They, together, are two of my favorite tools. Never had a problem with either of them.

Oh dear. What's the right way then?

LXC + eggs + setuptools + something, I guess?

As a current member of the `pip freeze > reqs.txt' brigade, I'd be interested in seeing a more detailed look at how to do it "right"/better.

Just create a setup.py and put your deps in there.

Ideally, setup.py is for generic minimum-version-known-to-work dependencies of a package. Pip requirements files are for very specific known-good combinations of dependencies of an application. https://caremad.io/blog/setup-vs-requirement/
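As a sketch of that split (all package names and versions below are made up): a library declares loose, minimum-known-to-work bounds in setup.py, while an application freezes one exact, tested combination in a requirements file.

```python
# setup.py (library): generic minimum-version-known-to-work dependencies.
from setuptools import setup

setup(
    name="mylib",                # hypothetical library
    version="1.0",
    install_requires=[
        "requests>=2.0",         # "anything from 2.0 on should work"
    ],
)

# requirements.txt (application): one specific known-good combination, e.g.
#
#   mylib==1.0
#   requests==2.2.1
```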

It's an anti-pattern to not pin versions in libs. setup.py is a contract: every version listed as a dependency should have been tested, otherwise the package will bitrot. It should be safe to pin major versions, if you can trust upstream packages to follow best practices with version numbers and you aren't using any "undocumented" features of the lib. Ideally, your packages are tested and built into binary packages before they are deployed to production systems. If you follow that best practice, then the risk of a patch breaking your system is minimized.
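The "pin the major version" contract can be illustrated with a hand-rolled comparison (a toy sketch, not setuptools' real specifier parsing):

```python
def parse(version):
    """Turn '1.9.3' into (1, 9, 3) for tuple comparison."""
    return tuple(int(part) for part in version.split("."))

def satisfies(version, lower, upper):
    """True if lower <= version < upper, i.e. a '>=1.4,<2' style pin."""
    return parse(lower) <= parse(version) < parse(upper)

assert satisfies("1.9.3", "1.4", "2")    # minor/patch bumps stay in contract
assert not satisfies("2.0", "1.4", "2")  # a major bump is excluded
```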

I consider that nonsense. If there were truth in this, other platforms besides Python would need something similar.

Virtualenv + PIP are far more portable than LXC.

They aren't portable at all :)

What definition of portable are you using? My definition is that developers on Windows, Mac OS and Linux can work with the same python dependencies. Are you dissenting with this?

Using conda to build and deploy binary dependencies. Create system-level virtual-environments with ease. Anaconda has jump-started providing binary packages for several platforms.

Getting a conda package from PyPI can be as simple as

`conda build --build-recipe <pypi-name>`

One should only hate something when they dislike it and, more importantly, it's forced on them. I'm criticizing only the title; he might be right about these tools, but using inflammatory words doesn't make an argument better.

I admit I'm stirring things up a bit here. You're quite right that I'm venting.

:+1: And the same holds for rbenv and rvm for Ruby.

I’d put it a bit more broadly: The state of working with Python packages has been painful for many years, and despite recent improvements and ongoing heroic efforts to fix things (thanks Tarek in particular), it’s still not much fun. Piling on hate doesn't help anybody though.

There’s some history worth considering. LXC requires Linux >= 2.6.24, released 24 January 2008. Virtualenv was released in October 2007. So LXC was hardly an option at the time.

Virtualenv was, I think, a pretty good attempt at a pragmatic solution for purely python-related dependency management issues at that time. I found it a hell of a lot easier and quicker than chrooting or building a whole new python installation (I used to do that) or using (shudder) zc.buildout. System-level virtualization was pretty heavyweight in 2007.

I think maybe virtualenv is showing its age a bit; I agree that the system library isolation issue is a huge hole in the virtualenv approach. But often, it’s enough to get work done.

As for pip vs. easy_install, anybody who was around at the time (sorry I can’t tell from the pythonrants blog if that includes "Adam" or not) remembers that life with easy_install was horrifically painful. It was buggy, often failed with completely unhelpful messages, and issues with it (and setuptools more generally) were simply not getting fixed at all. For _years_. (That is finally changing more recently, thankfully.) Pip was intended to route around all that (while still using setuptools internally) by doing less and by having less painful failure modes.

As one example: if you tried to easy_install a bunch of packages, or one package with a bunch of dependencies, and one somewhere in the middle failed because it couldn’t find a dependency, you’d end up with a fucked environment that had half of the packages installed and half not. And since there was no easy_uninstall, you had no easy way to clean up the mess. And you wouldn’t even have any easy way to know what actually depended on the dependency that failed to install.

Pip took the much nicer approach of downloading everything, resolving all dependencies, trying to tell you what depended on something that couldn’t be found, and building everything before installing any packages at all. So if there was a failure prior to the installation phase, it had no effect on installed libraries at all. It’s hard to overstate how much pain relief this provided.
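That "resolve everything before installing anything" behavior can be sketched as a toy (the package index here is an invented in-memory dict, nothing like pip's real internals):

```python
# Tiny in-memory "index": package name -> list of dependencies.
INDEX = {
    "webapp": ["framework", "missing-lib"],  # one dep doesn't exist
    "framework": [],
}

def resolve(name, index, plan=None):
    """Build a complete install plan, raising before anything is installed."""
    plan = plan if plan is not None else []
    if name not in index:
        raise LookupError("cannot find a distribution for %r" % name)
    for dep in index[name]:
        resolve(dep, index, plan)
    if name not in plan:
        plan.append(name)  # dependencies land before their dependents
    return plan

installed = {}
try:
    for pkg in resolve("webapp", INDEX):  # fails on "missing-lib"...
        installed[pkg] = "some-version"
except LookupError:
    pass
assert installed == {}  # ...and the environment was never touched
```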

For another example: setuptools and easy_install allow installing different versions of the same package into the same python environment at the same time. I’m not sure why that was ever considered desirable, because every time it actually happens, it’s been a source of nothing but pain for me and didn’t even seem to work as advertised. Pip took the opinionated approach that only one version should be installed and if you want different versions for some other application, just go build a separate environment for it (virtualenv or no).

I agree that the --no-site-packages vs. --system-site-packages options to virtualenv are problematic, but for me that’s largely because it amounts to a binary choice between having to build all C extensions from scratch vs. having to depend on whatever happens to be installed system-wide, which is a pretty poor choice to have to make.

As for the statement that “there’s little real danger” because “their virtualenv will always be on the path first”, he's forgetting (or possibly doesn’t know) that the easy_install.pth file messes with sys.path as well. I had several hair-tearing sessions trying to figure out why on earth the wrong version of some package was getting imported before I realized that. At one point I found that import behavior changed depending on which version of setuptools or distribute I had installed. That was fun.

It’s also true that people often misuse requirements files. I don’t think that’s the fault of the tool. Depending on the current commit of some master or other branch is idiotic regardless of what mechanism you use. Depending on a specific tarball URL is no more or less reliable than depending on a specific package version being available via pypi (hasn't everybody had the experience of somebody yanking away a package version you depended on?). The right thing to do is probably neither, but to host all your dependencies somewhere that you control (whether that’s your own python package index or .deb or .rpm server or whatever).

> he's forgetting (or possibly doesn’t know) that the easy_install.pth file messes

The .pth files don't do anything until their own paths are added to sys.path. It's safe to consider directories with .pth files as if they contain python modules.
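That behavior is easy to see with the stdlib `site` module, which is what processes .pth files in site directories (the file and directory names below are made up for the demo):

```python
import os
import site
import sys
import tempfile

# A fresh directory standing in for a site-packages dir.
sitedir = tempfile.mkdtemp()
extra = os.path.join(sitedir, "extra_packages")
os.mkdir(extra)

# A .pth file whose single line names a path (relative to the site dir)
# to be appended to sys.path when the site dir is processed.
with open(os.path.join(sitedir, "demo.pth"), "w") as f:
    f.write("extra_packages\n")

assert extra not in sys.path  # the .pth file is inert on its own...
site.addsitedir(sitedir)      # ...until site scans its directory
assert extra in sys.path      # now the entry is live
```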

