
Tools of the Modern Python Hacker: Virtualenv, Fabric and Pip - iamelgringo
http://clemesha.org/blog/2009/jul/05/modern-python-hacker-tools-virtualenv-fabric-pip/
======
grandalf
as someone who has been exploring python lately but who has spent the past few
years using mostly ruby and javascript, i have really enjoyed observing how
the python community approaches problems ...

- code readability is generally very important (unlike ruby, where magic-fu
is quite important)

- there is more heterogeneity

- there is less of an obsession with testing and boasting about it.

Most importantly, I haven't read any blog posts promoting automated code
quality measurement tools (are any of your methods over 3 lines?) -- those
things are incredibly backwards in my opinion, and the gravitational pull of
the ruby community toward them is unfortunate; it suggests that too many ruby
programmers are squandering their productivity gains refactoring code into
3-line methods...

~~~
amelim
I know this may be hearsay, but I've been using pylint and have been generally
pleased with the results. I don't take everything it says as the ultimate
source of coding standards, but it will pick up on some silly mistakes one
might make and generally make my code more maintainable.

Granted, I'm still in University, but I think your disdain for automated code
quality tools may be misplaced. Instead of blaming the tools, why not blame
the community as a whole for engaging in arcane coding practices?

~~~
jacobolus
You mean _heresy_. _Hearsay_ is information you learned from a friend. :)

~~~
yters
He may have learned that from a friend. Who here really remembers what they
did during one of those all night coding sessions?

~~~
amelim
Could be, but sadly no. Chalk this up to a fatal reliance on the Firefox
spellchecker, and to laziness.

------
ajross
Beware: talks about the benefits of using utility code to solve subproblems
without even a nod to the complexity introduced by adding dependencies to a
project and the costs thereof. This always tells me I'm reading someone who
views software development as _writing_ code only, and doesn't spend much time
maintaining old stuff. All the "hacker" verbiage notwithstanding, I think the
author is missing some important points about what makes a good hack.

That said, I don't know a thing about any of these tools. They might be
fantastically worth it for all I know. But talk about the tools, not about how
they make you a great hacker.

~~~
ubernostrum
I do real-world deployment of Python applications. I use two of these three
tools daily, and will be using the third once it hits API stability.

I use them because -- even though they represent added dependencies -- they
deal with tough problems in clean ways.

For example, every platform has its equivalent of DLL hell; in Python it's not
unheard-of for Application A to need version 1.0 of some library while
Application B needs version 2.0, both have to run on the same server, and the
two library versions are incompatible.

You can solve this by careful manual management of import paths, or you can
use virtualenv, which does it for you and ensures complete isolation of each
application's dependencies. I opt for virtualenv.
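A minimal sketch of that isolation (paths and the library name are invented):

```shell
# Create one isolated environment per application
virtualenv /tmp/envs/app_a
virtualenv /tmp/envs/app_b

# Each environment gets its own copy of the library;
# neither ever sees the other's version
/tmp/envs/app_a/bin/pip install "somelib==1.0"
/tmp/envs/app_b/bin/pip install "somelib==2.0"

# Run each application with its own interpreter
/tmp/envs/app_a/bin/python run_app_a.py
/tmp/envs/app_b/bin/python run_app_b.py
```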

Add pip and you get a reproducible build process: create a virtualenv, and
install whatever's needed to turn it into a working environment for your
application. Then 'pip freeze' and you have a requirements file which can
reproduce that environment for you on-demand (and which is also human-readable
documentation of the needed software).
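The round trip looks roughly like this (the install step is shown commented,
since it is meant for the fresh target environment):

```shell
# Inside the working virtualenv: capture the exact installed versions
pip freeze > requirements.txt

# Later, in a fresh virtualenv (or on another machine), rebuild it:
# pip install -r requirements.txt
```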

Fabric's in the middle of a big rewrite and so I'm not presently using it, but
I will be once it's stable, because it adds the final piece of the puzzle:
deployment to multiple servers, with support for multiple targets (e.g., I can
differentiate "deploy this to the staging area" from "deploy this to the
production environment").

Yes, there are tools on popular operating systems which can do these sorts of
things, but you don't always have a homogeneous deployment target where you
can rely on the same tool being available everywhere. These tools, on the
other hand, are pure Python and so get to be cross-platform largely for free.

------
jwecker
The tool he's missing that I end up using constantly is Memoize -
<http://www.eecs.berkeley.edu/~billm/memoize.html> - it enormously simplified
my make processes, and works very well with Fabric and the others.

------
oyving
I am probably missing something about Fabric, but it seems to me it solves a
problem already solved by most unix distributions.

Why not utilize the host packaging system for deploying your code and
applications?

~~~
delano
There are classes of problems that can't be solved at the system level, like
deploying to more than one system type or across multiple machines at the same
time.

~~~
moe
Sorry, but I wonder what you're talking about. Package managers are designed
exactly for this purpose.

It's not even hard to leverage the power of apt (the most advanced of the
bunch) for your own deployments.

In essence you setup a local mirror and add that to the sources.list of all
your hosts. Then you learn how to roll your own debs and push them to the
mirror. The actual deployment happens via apt-get - which can be triggered by
cron, a shell-script, puppet, a sweaty admin at 4am, or whatever fits your
bill.
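A sketch of that loop, with invented package and mirror names (reprepro is one
common tool for managing such a local mirror):

```shell
# Build a .deb for the application (assumes a prepared debian/ dir)
dpkg-buildpackage -us -uc

# Push the package onto the local reprepro-managed mirror
reprepro -b /srv/mirror includedeb stable myapp_2.0-1_amd64.deb

# Each host (from cron, a shell script, puppet, ...) then pulls it:
apt-get update && apt-get install -y myapp
```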

Working with the system this way, instead of around it (or even against it),
has various advantages. Most importantly you get proper dependency management.
Need to roll back to app version 2 from version 3, whereas version 2 depends
on an older version of _foo_? No-brainer, apt takes care of that for you, both
ways, also in much more complicated cases. Need a package or package version
that's not in the official repos? No problem, roll your own and make your
application package depend on that.

With a bit of elbow grease you can also have it mangle your database and other
auxiliary infrastructure appropriately, within the respective pre-/post-install
scripts.

Fabric and capistrano are just expressions of the old "if you don't understand
you're doomed to reinvent, poorly" meme.

~~~
delano
I agree with you. But now I want to launch and release to several EC2 and
Rackspace machines, in parallel. apt doesn't help with that. It also doesn't
help with releasing to multiple machines simultaneously (including different
types).

If I have 5 debian machines that need to be updated, I should be able to do
that with a single command and it should happen in parallel. The same applies
if I have 5 debian machines and 5 red hat machines (etc...). I'm advocating a
tool that is aware of the existing system specific package managers rather
than a replacement of them.

~~~
moe
_I agree with you. But now I want to launch and release to several EC2 and
Rackspace machines, in parallel. apt doesn't help with that._

Of course it does. What makes you think it doesn't?

 _If I have 5 debian machines that need to be updated, I should be able to do
that with a single command and it should happen in parallel._

    reprepro -Vb . stage1 myapp_2.0-1.dsc

That drops a new pkg onto the mirror where the staging hosts pick it up within
one minute, from cron. I could use the "live" distro instead of "stage1" to
roll it out to production. We use sections if we want to limit the push to
individual groups of hosts.

 _The same applies if I have 5 debian machines and 5 red hat machines
(etc...)_

If you mix linux distributions in a production environment then you have
bigger problems to resolve first.

 _I'm advocating a tool that is aware of the existing system specific package
managers rather than a replacement of them._

Those who don't understand are doomed to reinvent, poorly...

~~~
delano
I think we mostly agree, we're just looking at the problem from different
directions.

 _Of course it does._

Can apt launch EC2 instances and execute scripts (that are not part of the
package) before and after installation? Can it update security group settings,
and request and assign static IP addresses? My understanding is that apt does
not help with these problems, so we write scripts or use tools like Fabric to
do this. These scripts/tools are aware of the package manager in that they
call its commands to make things happen. This is the level I'm talking about,
where there are still open problems.

 _If you mix linux distributions in a production environment then you have
bigger problems to resolve first._

In an ideal world this is true, but it does happen. For example, one vendor
may require a specific type or version of OS, different from the rest. A
business may also choose to change the OS from one release to the next.

It's important to be aware of what is possible and account for it ahead of
time. Again, I'm not advocating _not_ to use apt or yum or rpm. I'm suggesting
that it's helpful to not tie your process to a specific one unless you have
complete control over the environment, now and for the foreseeable future.

~~~
moe
_Can apt launch EC2 instances and execute scripts (that are not part of the
package) before and after installation? Can it update security group settings
and request and assign static IP addresses? My understanding is that apt does
not help with these problems, so we write scripts or use tools like Fabric to
do this._

Well, apt does not launch EC2 instances; _you_ launch them, after you've
defined their role in your central configuration server.

The first thing a launched instance does (in rc.local) is "apt-get install
bootstrap". The bootstrap package contains everything a node needs to come
alive. Ours consists of not much more than a script that immediately runs via
the post-install hook. This script is where the magic happens: it connects to
the "hivemind" and gathers the configuration data, based on the node name the
instance was parametrized with at startup. According to the role it is asked
to assume, it will install the appropriate application packages (we call them
"logic bombs"). For sanity it makes sense to just name the packages after the
role. We have packages for "faceplate", "db", "queue" and such.

The packages will depend on other packages as needed and most of them contain
pre-install hooks for initialization tasks (e.g. mount an EBS volume for a
database node, claim an elastic IP, mangle DNS, etc.).
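A sketch of that pull-based bootstrap; all names here are invented, not the
parent's actual setup:

```shell
# In rc.local on a freshly launched instance:
apt-get update && apt-get install -y bootstrap

# --- sketch of bootstrap's post-install hook ---
# Look up the role this node was parametrized with at startup
ROLE=$(curl -s http://169.254.169.254/latest/user-data)

# Install the application package named after that role
# ("db", "queue", "faceplate", ...); its pre-install hooks handle
# EBS mounts, elastic IPs, DNS, and so on
apt-get install -y "$ROLE"
```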

Well, long story short, I think the key mistake of capistrano and fabric is to
assume _Push_ where you really want _Pull_. Once that is realized life becomes
much easier.

 _My understanding is that apt does not help with these problems, so we write
scripts or use tools like Fabric to do this._

Apt is of course just one part of the toolchain, and scripts will always be
involved either way. My point is that a toolchain built around apt most likely
has no need for something like fabric. Fabric is just not a very useful
abstraction in a scenario involving more than a handful of hosts.

 _In an ideal world this is true, but it does happen. For example, one vendor
may require a specific type or version of OS, different from the rest. A
business may also choose to change the OS from one release to the next._

Well, these are problems technology can't fix. These are problems only the HR
department can fix.

 _I'm suggesting that it's helpful to not tie your process to a specific one
unless you have complete control over the environment, now and for the
foreseeable future._

There is a word for systems where nobody assumes "complete control":
abandoned.

------
snprbob86
I took a brief look at virtualenv, but decided not to use it for some reason
that currently escapes me. The solution that I came up with instead has been
working out quite well so far...

My main package has a sub-package called "external". When I run manage.py, it
immediately imports external. External's __init__.py adds a bunch of entries
to sys.path.

Whenever I want to add a new package, I simply add the egg, or source, or
whatever to the external folder, add an entry to external/__init__.py, and
check it in. This process can even selectively load packages by platform with
a simple if-statement. Now when I check out a new enlistment from SVN, I
immediately get the full set of dependencies at their exact versions.

Simple, but effective.
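A sketch of what such an external/__init__.py might look like (the package
names are invented):

```python
# external/__init__.py -- prepend bundled dependencies to sys.path
import os
import sys

_HERE = os.path.dirname(os.path.abspath(__file__))

# Each entry is a directory or egg checked into the external/ folder
_PACKAGES = [
    "somelib-1.0",
    "otherlib-2.3.egg",
]

# Packages can be loaded selectively by platform with a simple if-statement
if sys.platform == "win32":
    _PACKAGES.append("win32helper")

for _pkg in _PACKAGES:
    _path = os.path.join(_HERE, _pkg)
    if _path not in sys.path:
        sys.path.insert(0, _path)
```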

Thoughts?

------
stcredzero
_Repetition leads to boredom, boredom to horrifying mistakes, horrifying
mistakes to God-I-wish-I-was-still-bored_

    It is by will alone I set my mind in motion.
    It is by the juice of sapho that thoughts acquire speed,
    The lips acquire stains.
    The stains become a warning.
    It is by will alone I set my mind in motion.
    -- Piter de Vries, in David Lynch's Dune

"It is caffeine alone that sets my mind in motion. It is through beans of java
that thoughts acquire speed, that hands acquire shakes, that shakes become a
warning... I am... IN CONTROL... OF MY ADDICTION!" -- From the Minicon
Graffiti Wall, 1989

------
diN0bot
>" [deploying production code from local dev] always involves several steps
like packaging up the source from your source code management system, putting
the source in the correct place remotely, and restarting the remote web
server. This can be very tedious by hand, especially for a couple of frequent,
small changes. "

did i get this right? commit/push local code, update/fetch on the server,
reload the web server.

am i deploying wrong or naively? you know some people honestly don't see the
point in version control? yeah, i don't want to be that person.
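In plain shell, those three steps might look something like this (host, paths,
and the reload command are invented):

```shell
# 1. commit/push local code
git push origin master

# 2. update/fetch on the server
ssh deploy@server 'cd /srv/app && git pull'

# 3. reload the web server
ssh deploy@server 'sudo /etc/init.d/apache2 reload'
```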

------
pibefision
I'm using Capistrano to deploy python code and it works fine.

Can fabric get the latest release from a git repo?

~~~
simeonf
Yes - fabric is really just a framework for building shell-based deployment
commands, so it can run the necessary shell command.

I think there's a lot of (forgive me) synergy from pairing pip, virtualenv and
fabric together, though: pip can easily install a project into a virtualenv by
checking it out of a source code repository (currently git, bzr, svn, hg, I
believe). Rebuilding the environment is then just a matter of "freezing" the
current virtualenv, transferring the list to the host, and building an env
with the appropriate packages (down to revision #). That's a "put" and a
"run" - two lines in a fabfile.

The article links to one of my blogs which links to the slides from talks I
did on pip, virtualenv, and fabric for Baypiggies a few months ago:
[http://simeonfranklin.com/blog/2009/mar/28/baypiggies-presentations/](http://simeonfranklin.com/blog/2009/mar/28/baypiggies-presentations/)
if you're interested...

------
davepeck
Fabric is in a wild state of flux right now. I used it a bit -- then a bit
more -- then I backed off entirely. It has promise, but it seems a bit young
at the moment.

------
s3graham
I keep meaning to use virtualenv. Anyone have a good pointer to a howto for
using it with mod_python?

~~~
inklesspen
Don't use mod_python. It's pretty obsolete.

