
A Better Pip Workflow - ycnews
http://www.kennethreitz.org/essays/a-better-pip-workflow
======
blaze33
I would additionally recommend using:

    pip freeze -r requirements-to-freeze.txt > requirements.txt

instead of just:

    pip freeze > requirements.txt

That way you keep your file structure, with comments, nicely separating your
direct dependencies from the dependencies of your dependencies. Actually, I
only used one requirements.txt file, removing everything below the "## The
following requirements were added by pip freeze:" marker and regenerating it
whenever I changed my dependencies.

And of course, beware of git URLs being replaced by egg names in the process.
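
As an illustration, a minimal sketch of the two files (the package names are
just examples):

    # requirements-to-freeze.txt -- hand-maintained, direct dependencies only
    requests[security]
    flask

    # regenerate the pinned file, preserving the structure above
    pip freeze -r requirements-to-freeze.txt > requirements.txt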

~~~
true_religion
Another tool you might find useful is pipreqs.

    pipreqs --savepath gen.requirements.txt /

The above will run through all your source code, and generate the requirements
that you _actually_ use via imports.

[https://github.com/bndr/pipreqs](https://github.com/bndr/pipreqs)

Disclaimer: I contribute to it.

~~~
StavrosK
Does it work well? I can't imagine it would be very accurate for things like a
Django project...

~~~
true_religion
It works pretty well for a Django project, with the exception of anything
that's only referenced from INSTALLED_APPS.

I have a branch that can handle Django, but I kept it private because
introspecting Django's settings.py and reading data out of it isn't really
general purpose.

------
bpicolo
This is sort of the same thing setup.py is intended for.

The way I've seen this successful in practice across many projects:

setup.py: specify top-level dependencies (i.e. those used directly by the
application). No pinned deps as a general practice, but it's fine to put a
hard min/max version on them if it's known for sure.

requirements.txt: Pin all deps + sub-deps. This is your exactly known valid
application state. As mentioned, a hasty deploy to production is not when you
want to learn a dependency upgrade has broken your app.

requirements-dev.txt: dev dependencies, plus an include of requirements.txt
(see the sketch below).
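
A minimal sketch of that layout (package names and versions are hypothetical;
the `-r` include syntax is standard pip):

    # requirements.txt -- exact pins for the full dependency tree
    flask==0.10.1
    itsdangerous==0.24

    # requirements-dev.txt -- dev tools, plus everything above
    -r requirements.txt
    pytest==2.8.2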

~~~
ianbicking
This in fact was my intention when creating this feature in pip: setup.py has
the abstract dependencies, requirements.txt is a recipe for building something
exact. And I quickly found requirements-dev was useful for adding tools (some
people have done that with setup.py extras, but the ergonomics on that never
seemed great).

The one thing I wish was part of this is that there was a record of conflicts
and purportedly successful combinations. That is, you distribute a package
that says "I work with foo>=1.0" but you don't (and can't) know if the package
works with foo 1.1 or 2.0. Semantic versioning feels like a fantasy that you
could know, but you just can't. Understanding how versions work together is a
discovery process, but we don't have a shared record of that discovery, and it
doesn't belong to any one package version.

This sense that package releases and known good combinations are separate
things developed at separate paces is also part of the motivation of
requirements.txt, and maybe why people often moved away from setup.py

~~~
e12e
A package/system that handled this would be great. Zope and Plone have the
notion of "known good sets" -- and it's not really a pleasure to use (but
much better than nothing). As far as I can tell, with Plone 5 the
recommended way to install Plone is from the unified installer, leaving known
good sets, pinning, and buildout to manage plugins:

[http://docs.plone.org/manage/installing/installing_addons.ht...](http://docs.plone.org/manage/installing/installing_addons.html)

For an example of what bootstrapping a full Plone 4 site via buildout entails,
have a look at the (defunct) good-py project:

[http://good-py.appspot.com/release/plone/4.2rc2](http://good-py.appspot.com/release/plone/4.2rc2)

[http://www.martinaspeli.net/articles/hello-good-py-a-known-g...](http://www.martinaspeli.net/articles/hello-good-py-a-known-good-versions-manager)

As long as one is able to keep projects small, each in a virtualenv of its
own, managing "known good sets" (in buildout, or for pip) shouldn't really be
too hard. But as projects grow, a real system for managing versions will be
needed. As far as I know there are no good systems for this... yet. Ideally
you'd want a list that people could update as they run into problems, so that
if projectA runs fine with foo=1.0 and bar=1.1, and projectB discovers a bug
in foo<=1.0.4rc5, it can update the requirement.

It's not a trivial thing (see also: all the package managers -- apt, yum, etc.).

------
sprayk
I was thinking about this the other day when I was helping a friend with his
method #1-style requirements.txt, and how I wish there was something similar
to composer's "lockfile".

The author's proposed method is basically the same as how php's composer does
it, with its composer.json and composer.lock. Specify your application
requirements by hand in composer.json, run composer install, and composer.lock
is generated. Check both in so you can have consistent deploys. When you want
to upgrade to the latest versions within constraints set by hand in
composer.json, run composer update to pull latest versions, updating
composer.lock. Run tests, and commit the new composer.lock if you are
satisfied.
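
Roughly, the composer flow looks like this:

    composer install   # resolve composer.json, write (or obey) composer.lock
    composer update    # re-resolve within constraints, rewrite composer.lock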

~~~
Legion
> The author's proposed method is basically the same as how php's composer
> does it

Composer merely cloned Ruby's Bundler and its Gemfile/Gemfile.lock in that
regard. Which is a good thing. It's beyond puzzling that Python has spawned
multiple dependency managers, none of which have replicated the same golden
path.

~~~
servilio
That it hasn't been adopted as core functionality doesn't mean there isn't
one; zc.buildout's first stable release predates Bundler's 0.3.0 by at least
a year:

[https://pypi.python.org/pypi/zc.buildout/1.0.1](https://pypi.python.org/pypi/zc.buildout/1.0.1)

------
K0nserv
The lack of separation between requested direct dependencies and pinned
resolved dependencies (including transitive ones) has been one of the great
confusions for me while learning Python. Having used Bundler, CocoaPods,
Composer, and NPM before, all of which have this separation built in, pip
feels broken.

However, there are a few projects that try to solve this, but the fact that
the Python community has not settled on one cripples any initiative to fix it.

~~~
mattnewton
I can second this; every time I interact with pip, I wish it were npm. I
never appreciated npm's KISS approach of just exhaustively dumping all the
dependencies in a local folder until I visited Python versioning hell upon
myself by not using a virtualenv (which is an ugly hack itself).

~~~
StavrosK
How does NPM do it? My requirements.txt looks like this and I've never had any
problems:

    foo ~= 1.8.2
    bar ~= 2.4.1
    baz

(These are only requested dependencies; resolved ones are not specified.)

~~~
K0nserv
I misspoke a little with regards to NPM. NPM has something called
npm-shrinkwrap that allows you to lock resolved dependencies. It's used in
many NPM projects and seems to be a standard chosen by the community.

I am not sure what ~= means in requirements.txt, but I'm going to guess it
means something like ~> or ^. With a system like that, if everyone follows
semver correctly we are fairly okay. The problem is that not everyone does,
and you have no guarantee that deploying the same code at two points in time
t1 and t2 will produce the same application, since one of the dependencies
might have released new code in between.
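
For reference, `~=` is PEP 440's "compatible release" operator, so that guess
is right -- it behaves like Ruby's `~>`:

    foo ~= 1.8.2   # equivalent to: foo >= 1.8.2, == 1.8.*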

------
89vision
I've been using pip-tools to manage my requirements. It allows you to specify
a top-level requirements file, and from that it builds you a requirements.txt
with specific versions and all dependencies. It has really streamlined my pip
workflow.

[https://github.com/nvie/pip-tools](https://github.com/nvie/pip-tools)
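
The basic flow, as a sketch (file names follow pip-tools' convention):

    pip-compile requirements.in   # writes a fully pinned requirements.txt
    pip-sync                      # makes the environment match requirements.txt exactly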

~~~
conradev
Yep, I'm surprised no one knows about pip-compile. It does exactly what the
OP suggests, but with the ability to specify a range of versions.

------
p4wnc6
If you don't mind committing to Anaconda Python distributions, then you should
simply use conda.

You can still pip-install things within a conda environment, and conda can
manage more dependencies than just Python dependencies (a common use case is
managing R dependencies for a Python statistical workflow).

You can do

    conda list -e > requirements.txt

then

    conda create -n newenv --file requirements.txt

to create a Python environment from the frozen requirements.

I believe that conda makes it easier to selectively update, but even if you
don't enjoy those features of conda, the same two-file trick as in this post
will work for conda as well, since you can use `conda update --file ...`.
Conda's "dry-run" features are more useful than pip's as well.
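
For example, with standard conda subcommands (exact flags can vary between
conda versions):

    conda update numpy            # selectively update a single package
    conda update --all --dry-run  # preview what a full update would do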

The perfect feature for conda to add would be the ability to specify
alternative Python distributions, either by a path to the executable or by
allowing them to be hosted on Binstar.

I can understand why Continuum wants conda to heavily influence everyone to
use only Anaconda, but I think the goodwill of making conda work for any
Python distribution would bring more to them than keeping it focused solely on
Anaconda. (For example, I know some production environments that still use
Python 2.6 and are prevented from updating to 2.7 -- and even if they did
update, they'd need to keep around some managed environments for 2.6 for
testing and legacy verification work).

~~~
odonnellryan
I agree. I've only used conda on a few projects recently, so it's not too
battle-tested for me, but I've found it much easier to use, especially with
"difficult" libraries.

------
xuan
Nice post as always from Kenneth.

However, this workflow has a little drawback. If you have a dependency that
isn't from PyPI, e.g. `pip install
git+ssh://git@github.com/kennethreitz/requests.git@master`, it won't work.
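
(Pip's VCS syntax does let you name the installed package with an egg
fragment, which keeps requirements files readable, though it doesn't save the
freeze round-trip:)

    pip install git+ssh://git@github.com/kennethreitz/requests.git@master#egg=requests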

~~~
89vision
I've given up trying to use VCS links as dependencies with pip. I deploy
everything to an internal devpi server and use --extra-index-url.
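
A sketch of that setup, with a hypothetical devpi index URL:

    pip install --extra-index-url https://devpi.example.com/root/prod/+simple/ somepackage

pip then consults the internal index in addition to PyPI.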

------
goerz
I would argue that the information in "requirements-to-freeze.txt" could
simply be expressed in the "install_requires" list in setup.py. That is, the
"general" dependencies (with only vague version information) should be in
setup.py, whereas requirements.txt should pin the exact "guaranteed to work"
versions. For small/non-commercial projects, requirements.txt may not even be
necessary.

~~~
mixmastamyk
So, you use `python setup.py` instead of pip?

~~~
goerz
No...? AFAIK (and based on my daily experience), pip processes setup.py, so it
definitely installs the dependencies listed there.

~~~
mixmastamyk
Ok, so you just skip the requirements file.

------
Diederich
Pinto
([https://metacpan.org/pod/distribution/Pinto/lib/Pinto/Manual...](https://metacpan.org/pod/distribution/Pinto/lib/Pinto/Manual/Introduction.pod))
nails this problem in the Perl ecosystem:

Pinto has two primary goals. First, Pinto seeks to address the problem of
instability in the CPAN mirrors. Distribution archives are constantly added
and removed from the CPAN, so if you use it to build a system or application,
you may not get the same result twice. Second, Pinto seeks to encourage
developers to use the CPAN toolchain for building, testing, and dependency
management of their own local software, even if they never plan to release it
to the CPAN.

Pinto accomplishes these goals by providing tools for creating and managing
your own custom repositories of distribution archives. These repositories can
contain any distribution archives you like, and can be used with the standard
CPAN toolchain. The tools also support various operations that enable you to
deal with common problems that arise during the development process.

------
akx
The `pip-tools` utilities automate this nicely.

~~~
nikolay
... but it's notorious for not working on the latest Python and pip.

~~~
njharman
A limitation you are not bound by, since they are open source.

~~~
nikolay
So, I should learn the inner workings of pip just to use a basic tool that
exists in both Node.js and Ruby land, and a dozen others?

~~~
njharman
Yes. Or wait until someone else does.

This is how open source works. How do you think Node.js and Ruby got the
capability? Do you imagine it sprang fully formed from hyperbole like "a
dozen others"?

~~~
nikolay
I've been waiting, trust me. It's never worked with the latest pip, sorry!

------
edavis
With this workflow, what purpose does requirements.txt serve? Would the file
ever be used directly?

The only thing I can think of is that you'd track top-level packages in
requirements-to-freeze.txt during development, while your deploy would use
requirements.txt to get a deterministic environment.
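
(That is, deploys would just run

    pip install -r requirements.txt

against the pinned file.)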

~~~
fredsir
That is indeed the idea.

------
smnrchrds
What does "requests[security]" do? I don't remember ever using the bracket
syntax.

~~~
luhn
Brackets install a package's optional "extras" -- additional dependencies
declared for optional features. See
[http://pythonhosted.org/setuptools/setuptools.html#declaring...](http://pythonhosted.org/setuptools/setuptools.html#declaring-extras-optional-features-with-their-own-dependencies)

In the case of requests[security], it installs some extra packages that allow
for more secure SSL.
[http://stackoverflow.com/questions/31811949/pip-install-requ...](http://stackoverflow.com/questions/31811949/pip-install-requestssecurity-vs-pip-install-requests-difference)
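
Concretely, for the requests 2.x era:

    pip install 'requests[security]'   # pulls in pyOpenSSL, ndg-httpsclient, pyasn1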

------
crdoconnor
>While the Method #2 format for requirements.txt is best practice, it is a bit
cumbersome. Namely, if I’m working on the codebase, and I want to $ pip
install --upgrade some/all of the packages, I am unable to do so easily.

    pip freeze --local | grep -v '^\-e' | cut -d = -f 1 | xargs ./venv/bin/pip install -U

Ought to be a first-class command, though.

After running that, the first thing I do is run the tests. Then I freeze,
commit, and push.

This is usually a good thing to do at the beginning of a new sprint so that
more subtle bugs caused by upgrading packages can be teased out before
releasing a new build.

For projects that I release on PyPI I don't want to use pinned dependencies,
so for those I run periodic builds that download all dependencies and run the
tests, so that I'm notified (almost) instantly when a package I depend upon
causes a bug in my code (e.g. by changing an API).

------
darkerside
Another method is to use pip constraints to enforce package versions:
[https://pip.pypa.io/en/stable/user_guide/#constraints-files](https://pip.pypa.io/en/stable/user_guide/#constraints-files)
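
A constraints file pins versions without causing anything to be installed, so
you can keep a loose requirements.txt and a strict constraints.txt side by
side:

    pip install -r requirements.txt -c constraints.txt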

------
tibbon
I'm probably going to get downvoted to hell for this, but as a Rubyist who
has been working on a Python project, I've been finding pip really weak
compared to rubygems/bundler.

With RubyGems/Bundler I love the ability to point to GitHub repos, lock
versions (or allow minor/patch versions to update), have groups, etc.

requirements.txt and pip just feel awkward and weird, especially when
combined with virtualenv; in comparison to Ruby this is just stiff and
strange.

I've had nothing but problems with more complex packages like opencv and
opencl as well.

~~~
infecto
I am fairly positive you can do everything you have described. Just doing a
quick google search returns the answers I need.

1. Point to GitHub repos: `pip install git+ssh://git@github.com/echweb/echweb-utils.git`
[http://stackoverflow.com/questions/4830856/is-it-possible-to...](http://stackoverflow.com/questions/4830856/is-it-possible-to-use-pip-to-install-a-package-from-a-private-github-repository)

2. Lock minor versions: `Django>=1.3,<1.3.99`
[http://stackoverflow.com/questions/6047670/pip-specifying-mi...](http://stackoverflow.com/questions/6047670/pip-specifying-minor-version)

What is a group? My google search did not turn up anything relevant.

~~~
brbsix
The only bummer is that you can't declare git repos as project dependencies;
AFAIK they have to be installed manually as you've described.[0][1] Does Ruby
handle this well? This is a real PITA when the Cheese Shop / Warehouse either
doesn't have a particular project or only has an outdated version. In such
cases you pretty much just have to create your own mirror.

FYI groups allow you to group various dependencies. E.g. you have one group of
dependencies for development, another for testing, and a minimal one for
production.

[0]: --process-dependency-links will permit you to pull in VCS repos as
dependencies _locally_, but it does you no good when distributing packages
for third-party consumption.

[1]: [https://groups.google.com/forum/#!topic/pypa-dev/tJ6HHPQpyJ4](https://groups.google.com/forum/#!topic/pypa-dev/tJ6HHPQpyJ4)

~~~
scandinavian
If you mean project dependencies as in dependencies declared in the setup.py
file, setuptools supports the dependency_links option, which can include VCS
links. So that shouldn't be a problem, though I believe it is frowned upon.

~~~
brbsix
You have to pass pip the --process-dependency-links flag at install time
(which is pretty shitty if users are expecting to be able to install from
PyPI normally), and it will ignore your dependency links if it thinks there
is a better package available on PyPI (e.g. some other package with the same
name has a higher version, or the GitHub repo has some critical bug fixes but
still has the same version as the PyPI package). Sometimes you can fool it
into working, but it's been such an unpredictable mess that I've given up on
dependency_links for software that's distributed to third parties.

------
nikolay
I personally use requirements.txt for the source requirements and
requirements.txt.lock for the pins, which I find more standard.

Or maybe use pip-tools' convention: requirements.in for the source and
requirements.txt for the compiled output, at least.

------
deepwalker
Hey, I wrote Pundle to solve this problem! The idea is simple: a package that
installs your other packages with frozen versions. Pundle maintains
frozen.txt alongside requirements.txt.

What's more, Pundle does not use virtualenv; it installs all packages to a
user directory (~/.Pundledir) and imports the frozen versions on demand.

It has all the nice commands like install, upgrade, info, etc.

Check it out:

Check it
[https://github.com/Deepwalker/pundler](https://github.com/Deepwalker/pundler)

------
drcongo
Vaguely related: I hacked together a little tool called Olaf[1] for people
who like to pip freeze but also like to have multiple distinct requirements
files like requirements-dev.txt. It still needs a little work, but I've been
using it on projects quite happily.

[1] [https://pypi.python.org/pypi/olaf](https://pypi.python.org/pypi/olaf)

------
njharman
Instead of doing that manually, use pip-tools:
[https://github.com/nvie/pip-tools/](https://github.com/nvie/pip-tools/)

Some posts on theory and use:
[http://nvie.com/posts/pip-tools-10-released/](http://nvie.com/posts/pip-tools-10-released/)

~~~
blaze33
He explicitly wrote: "I thought long and hard about building a tool to solve
this problem. Others, like pip-tools, already have. But, I don’t want another
tool in my toolchain; this should be possible with the tools available." so
it's not like he's not aware of pip-tools.

~~~
kenneth_reitz
/me has commit access to pip-tools.

------
ekmartin
We built Doppins ([https://doppins.com](https://doppins.com)) to be able to
use pinned PyPI dependencies and/or ranges and still keep them up-to-date
continuously. Still another tool, but it's quite quick to enable on a
repository and doesn't require any maintenance afterwards.

------
itissid
I found myself using tox for managing the general case of this. But the spirit
to the blog is in the same direction.

------
david-given
I still find myself automatically thinking of _this_ PIP.

[http://www.shaels.net/index.php/cpm80-22-documents/using-cpm...](http://www.shaels.net/index.php/cpm80-22-documents/using-cpm/6-pip-utility)

------
jbmorgado
I would also suggest Anaconda if you are using Python for science:
[https://www.continuum.io/downloads](https://www.continuum.io/downloads)

It makes managing Python packages and even Python versions quite easy.

------
bobinator606
Like it or hate it, this is exactly how it's supposed to be done.

~~~
castis
That doesn't mean it's the way it should always be done, though.

