
Overview of Python dependency management tools - mariokostelac
https://modelpredict.com/python-dependency-management-tools
======
madelyn
People always get up in arms about this, but as someone who has used Python as
her daily driver for years it's really... never been this serious of an issue
for me?

I have used virtualenv/venv and pip to install dependencies for years and
years, since I was a teen hacking around with Python. Packaging files with
setup.py doesn't really seem that hard. I've published a few packages on pypi
for my own personal use and it's not been too frustrating.

A lot of the issues people have with Python packaging seem like they could be
solved with a couple of shell aliases. Dependency hell with too many
dependencies becomes unruly in any package manager I've tried.

Is the "silent majority" just productive with the status quo and getting work
done with Python behind the scenes? Why is my experience apparently so
atypical?

~~~
a_cool_username
Let me start by saying: I love Python, and I love developing in it. It's the
"a pleasure to have in class" of languages: phenomenal library support, not
too painful to develop in, nice and lightweight so it's easy to throw together
test scripts in the shell (contrast that with Java!), and easy to specify
simple dependencies and install them (contrast that with C!).

That said... if you work on software that is distributed to less-technical
users and have any number of dependencies, python package management is a
nightmare. Specifying dependencies is just a minefield of bad results.

\- If you specify versions that are too unbounded, users will often find
themselves unable to install previous versions of your software with a simple
`pip install foo==version`, because some dependency has revved in some
incompatible way, or even worse has specified a different dependency version
that conflicts with another dependency. pip does a breadth-first search on
dependencies and will happily install a totally incompatible set of
dependencies even when a valid satisfying set exists.[1]

\- If you specify a version with strict version bounds to avoid that problem,
users will whine about not getting the newest version/conflicting packages
that they also want to install. Obviously you just ignore them or explain it,
but it's much more of a time sink than anyone wants.

\- In theory you can use virtualenvs to solve that problem, but explaining how
those work to a frustrated Windows user who just spent hours struggling to get
Python installed and into their `PATH` is no fun for anyone. Python's made
great strides here with their Windows installers, but it's frankly still
amateur hour over there.

\- Binary packages are hell. Wheels were supposed to make Conda obsolete but
as a packager, it's no fun at all to have to build binary wheels for every
Python version/OS/bitness combination. `manylinux` and the decline of 32-bit
OSes has helped here, but it's still super painful. Having a hard time
tracking down a Windows machine in your CI env that supports Python 3.9? Too
bad, no wheels for them. When a user installs with the wrong version, Python
spits out a big ugly error message about compilers because it found the sdist
instead of a wheel. It's super easy as a maintainer to just make a mistake and
not get a wheel uploaded and cut out some part of your user base from getting
a valid update, and screw over everyone downstream.

\- Heaven help you if you have to link against any C libraries you don't
control that have shitty stability policies (looking at you, OpenSSL[2]).
Users will experience your package breaking because of simple OS updates.
Catalina made this about a million times worse on macOS.

\- Python has two setup libraries (`distutils` and `setuptools`) and on a
project of any real complexity you'll find yourself importing both of them in
your setup.py file. I guess I should be grateful it's just the two of them.

\- Optional dependencies are very poorly implemented. It still isn't possible
to say "users can opt in to just a specific dependency, but by default get all
options". This is such an obvious feature; instead you're supposed to write a
post-install hook or something into distutils. (See the setup.py sketch after
this list for how extras work today.)

\- Sometimes it feels like nobody in the python packaging ecosystem has ever
written a project using PEP420 namespaces. It's been, what, 8 years now? and
we're just starting to get real support. Ridiculous.
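
For reference, a minimal sketch of how extras are declared today with
setuptools (the package and extra names here are made up): users have to opt
in with `pip install foo[plots]`, and there's no supported way to make an
extra on-by-default with an opt-out.

    from setuptools import setup

    setup(
        name="foo",                                # hypothetical package
        version="1.0.0",
        install_requires=["requests>=2.20,<3"],    # always installed
        extras_require={
            # opt-in only: plain `pip install foo` gets neither extra;
            # users must ask for `foo[plots]` or `foo[all]` explicitly
            "plots": ["matplotlib>=3.0"],
            "all": ["matplotlib>=3.0", "pandas>=1.0"],
        },
    )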

I could go on about this for days. Nothing makes me feel more like finding a
new job in a language with a functioning dependency manager than finding out
that someone updated a dependency's dependency's dependency and therefore I
have to spend half my day tracking down obscure OS-specific build issues to
add version bounds instead of adding actual features or fixing real bugs. I
have to put tons of dependencies' dependencies into my package's setup.py, not
because I care about the version, but because otherwise pip will just fuck it
up every time for some percentage of my users.

[1] I am told that this is "in progress", and if you look at pip's codebase
the current code is indeed in a folder marked "legacy".

[2] I 100% understand the OpenSSL team's opinion on this and as an open source
maintainer I even support it to some degree, but man oh man is it a
frustrating situation to be in from a user perspective. Similarly, as someone
who cares about security, I understand Apple's perspective on the versioned
dylib matter, but that doesn't make it suck any less to develop against.

~~~
ptx
> struggling to get Python installed and into their `PATH` ... it's frankly
> still amateur hour over there

But that has been solved on Windows for quite a while hasn't it?

Python installs the "py" launcher on the path, which allows you to run
whichever version you want of those you have installed. Just type "py" instead
of "python". Or "py -3.5-32" to specifically run 32-bit Python 3.5, or "py -0"
to list the available versions.

~~~
a_cool_username
It's gotten a lot better, but we still hit tons of issues with users who don't
know which Python version they installed their application into. Oh, and of
course our "binaries" in Scripts/bin don't show up in the PATH by default. So
I get to tell people "py -3.8-64 -m foo" on Windows, "foo" everywhere else.

This gets much much worse when a new version of Python comes out and we don't
support it yet (because of the build system issues I mentioned). I spent
several weeks teaching people how to uninstall 3.8 and install 3.7 before we
finally got a functioning package out for 3.8.

~~~
sergeykish
I like Mozilla's build system on Windows: you click "start-shell.bat" and it
opens a console. Python, Mercurial, Rust - it just works; I never had to check
PATH.

[https://firefox-source-docs.mozilla.org/setup/windows_build.html](https://firefox-source-docs.mozilla.org/setup/windows_build.html)

------
vmsp
pip-tools is almost never mentioned because it's boring but great. I always
default to it.

[https://github.com/jazzband/pip-tools](https://github.com/jazzband/pip-tools)

~~~
mariocesar
Yes!! I just create a Makefile target and pip-tools is all I need. I create a
requirements.in and that's all. So far I've never felt it has to be more
complicated than that. And when I want to upgrade a package I update the
requirements.in if I need to and run `make -B` for this:

    
    
      default: requirements-develop.txt
        pip install -r requirements-develop.txt
    
      requirements.txt:
        pip-compile -v requirements.in
    
      requirements-develop.txt: requirements.txt
        pip-compile -v requirements-develop.in
    

It's so nice to just write `make` instead of doing all the Poetry/Pipenv
stuff, which honestly I feel doesn't add anything really useful to the
workflow.

~~~
mariokostelac
Does it pin versions of 2nd degree dependencies too? Like pip freeze would do?
Also, when you remove a package, does it know to clear packages that were its
deps and are not needed anymore?

~~~
mjw1007
pip-tools can do both of those, yes.

For the second, pip-compile computes the new requirements.txt (which is
effectively the lockfile) from scratch, and pip-sync (not shown in that
Makefile fragment) removes packages that are no longer listed there.

~~~
mariokostelac
Thanks a lot for the reply. I'll include it some time soon in the article
update

------
drcongo
Every attempt to solve this problem in Python seems to eventually end up in a
pretty terrible place. Pipenv got off to a great start but got slower and
slower to the point that it was more painful to use than not. Poetry (which is
still my preferred option) started off with something seemingly beautifully
thought through, and very fast too. But after only a few version updates, it
seems to be hitting the same problems Pipenv did. On one project I was working
on recently I managed to screw up the Poetry.lock file, so I ran `poetry lock`
and it took 18 minutes. I still have high hopes for poetry, but I spend way
more time trying to work around its shortcomings now (v1.0.5) than I did when
it was at version 0.10.0 two years ago.

~~~
throwaway894345
I get that resolving dependencies is a SAT problem and inherently intensive;
however, I don't understand why it's so much slower in Python. Is it just that
all of these resolvers are implemented in Python (and Python is really that
much slower than other languages?), or does Python require you to download an
entire package just to determine its dependencies? In the latter case, that
seems pretty dumb, right? Like as bad as exposing the entire interpreter as
the extension interface, rendering optimizations and competing interpreters
virtually impossible.

~~~
raziel2p
> does Python require you to download an entire package just to determine its
> dependencies?

yes - the standard way of defining dependencies in Python is in setup.py,
which has to be invoked as a Python script in order to work. this script may
also need to read files from the rest of the project, so you do indeed need to
download the whole package to determine its dependencies.

even if the Python community were to agree on a new configuration format
tomorrow, there would still be a ton of packages out there that wouldn't
migrate for years.
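
As a hypothetical sketch of why static analysis isn't enough (the file and
package names are made up), a perfectly ordinary setup.py can compute its
dependency list at runtime, e.g. by reading another file in the sdist:

    from pathlib import Path
    from setuptools import setup

    # A resolver can't know these without downloading the sdist and executing
    # this script, because the list only exists at runtime.
    requirements = Path("requirements.txt").read_text().splitlines()

    setup(
        name="example",
        version="0.1.0",
        install_requires=requirements,
    )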

~~~
epistasis
It seems that this information should be cacheable after an invocation of
setup.py, at least for an installation without any extras. And perhaps even
with extras requested.

Or is there an even greater hidden challenge from using setup.py?

~~~
JaDogg
setup.py can check the OS and pick the necessary requirements, so it can have
different dependencies on different OSes.

I've used it like this -
[https://github.com/JaDogg/pydoro/blob/b1b3de38ac15b9254ef1be...](https://github.com/JaDogg/pydoro/blob/b1b3de38ac15b9254ef1bef636b6150e446272da/setup.py#L26)
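
For illustration, a minimal sketch of that pattern (the package names are
placeholders, not the ones that project actually uses):

    import sys
    from setuptools import setup

    # Computed at install time, so the dependency list can differ per OS.
    requirements = ["some-cross-platform-lib"]
    if sys.platform.startswith("win"):
        requirements.append("some-windows-only-lib")

    setup(
        name="example",
        version="0.1.0",
        install_requires=requirements,
    )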

------
slavoingilizov
The missing ingredient to really, REALLY solve these problems once and for all
is an authoritative decision to switch package formats and to have the whole
dependency resolution stack run by the core Python language contributor team.

I get backwards compatibility and open-source governance and bla-bla, but the
reality is that this cannot be done by a third-party library author and needs
to become part of the core stack, including proper support rather than just
shipping a tool which covers 90% of cases. It's crazy that apart from venv and
pip, nothing else comes with python and you're left on your own.

npm + the registry is part of Node

apt-get + a registry is part of a normal Linux distro

bundler comes with Ruby

This is a solved problem elsewhere. What we lack is a fully-supported, agreed
upon, working DEFAULT choice, so people don't have to make their own choices.
I don't know if not having that DEFAULT is a function of how the python
community thinks or its diversity, but it's painful to watch. I've almost
given up myself and seen many newcomers give up because of a trivial problem
like this.

~~~
steveklabnik
While npm comes with Node, and Bundler comes with Ruby, the governance of
these projects/tools is separate from the language.

~~~
slavoingilizov
Yet someone has made the decision to bundle them with the language and
provide a default. I'm not suggesting we have common governance, but these
decisions need to be made.

------
kissgyorgy
> Pipenv or poetry?

If you've used Pipenv for a complex project with a huge dependency tree, or
used it for a long time, you've definitely run into a blocking issue with it.
It is the worst package manager of all, and probably the reason why Python has
such a bad reputation in this area: its fundamentals are terrible.

Just go with Poetry. It's very stable, easy to use, has a superior dependency
resolver, and is way faster than Pipenv.

~~~
FridgeSeal
Am I misunderstanding Poetry? Because it seems to me more suited to packaging
your Python code up ready to be pushed to PyPI? As in, starting up a project
creates an init.py, and a Python file referencing distutils: neither of which
I need or want to do if I'm writing an app to go into a Docker container.

~~~
kissgyorgy
Its first use case is actually handling project dependencies. If I remember
correctly, it couldn't build packages at first, so the "build" subcommand was
only introduced later. It's the same type of package manager (with a lock
file) as other languages already have, like Cargo, Bundler, or npm.

------
akbo
Dependency management can be pretty overwhelming for a lot of people entering
Python. This is especially true in the data science realm, where many don't
have a SWE background. Even after you have selected a tool, it can be easy to
use it in a poor way. I have recently written a short article on how I use
conda in a disciplined way to manage dependencies safely:
[https://haveagreatdata.com/posts/data-science-python-dependency-management/](https://haveagreatdata.com/posts/data-science-python-dependency-management/)

~~~
edsac_xyzw
Python dependency management for packages that use C or C++ behind the scenes
is really problematic, and sometimes the installation simply fails. In that
case, a solution is to use Conda or Miniconda, which provide many pre-compiled
packages and also the Clang C++ compiler.

An alternative way to let people without a software engineering background
play with Python data science and machine learning tools is to provide
pre-built Docker images with everything pre-installed, which saves them from
configuration trouble.

Docker is also useful for learning about new programming languages without
installing anything. With just one command, `$ docker run --rm -it
julia-image`, one can get a Docker image containing a Go compiler, a Julia
installation, a Rust development environment, and everything else. Docker is
really a wonderful tool.

~~~
mariokostelac
Docker is definitely an interesting tool for that, but my biggest problem is
that I have to teach them Docker, which is a totally new layer of abstraction
they haven't seen before.

How do you approach this? How technical are people you prepare Docker images
for?

~~~
edsac_xyzw
You don't need to teach Docker. All you need is to provide a Docker image with
everything pre-installed, such as Julia, the R language, Python, numpy,
pandas, TensorFlow, and maybe VS Code, on top of any Linux distribution. Then
one can just type `$ docker run --rm -it -v $PWD:/cwd -w /cwd my-image
ipython`. For convenience, it is better to create a command-line wrapper or
shell script that saves one from typing all of that, such as `$ ./run-my-image
ipython`.

I don't prepare images for anyone, but I guess that if I knew nothing about
Docker and was given an image with everything ready and pre-configured, plus a
shell script or command line encapsulating all the Docker switches, I would
find it more convenient than installing everything myself or fighting some
dependency conflict or dependency hell. So Docker can be used as a portable
development environment. VS Code (aka Visual Studio Code) also supports remote
development inside Docker containers, with extensions installed per container.

I am a mechanical engineer by training, but I found Docker pretty convenient
for getting Julia, Octave, the R language, Python, and a Jupyter Notebook
server without installing anything or fighting with my Linux distribution's
package manager when attempting to install a different version of R, Julia, or
Python. This approach makes it easier to get bleeding-edge development tools
without breaking anything that is already installed. I even created a
command-line wrapper tool for using Docker in this way that simplifies all
those cases: `$ mytool bash jupyter-image; $ mytool daemon jupyter-notebook
...`

------
japhyr
If you're interested in the technical issues behind Python packaging, a recent
Podcast.__init__ episode features three people working on improving Pip's
dependency resolution algorithm. My use cases are simple enough that I've
gotten by for years just using pip and venv with requirements.txt files, but
it was still fascinating to listen to how package management is approached in
more complex situations.

Dependency Management in Pip's Resolver: [https://www.pythonpodcast.com/pip-resolver-dependency-management-episode-264/](https://www.pythonpodcast.com/pip-resolver-dependency-management-episode-264/)

------
sixhobbits
I've spent far too many hours fighting with these tools in two completely
different scenarios

* Developing and deploying production Python solutions

* Helping beginners run their first script

While it's great for beginners to use the same tools that are used in
industry, I strongly believe that the problem nearly all of these tools face
is that they can't decide whether they want to _manage_ complexity or _hide_
complexity.

You can't do both.

Some of them do a fairly good job at managing complexity. None of them do a
good job of hiding it. The dream of getting Python to "just work" on any OS is
close to impossible (online tools like repl.it are the closest I've found, but
they introduce their own limitations). I recently saw a place force their
beginner students onto Conda in Docker because getting people started with
Conda was too hard. If you're battling with the complexity of your current
layer of abstraction, sometimes it's better to start removing abstraction
rather than adding more.

That said, I'm also a happy user of `pip` and `virtualenv` and while I'm sure
that many people can use the others for more specific needs, I think
defaulting to them because they aim to be "simpler" is nearly always a
mistake. I still teach beginners to install packages system wide without
touching venv at first - it's enough to get you through your first 2-3 years
of programming usually.

~~~
franey
This is a good point about complexity. I started with pip + virtualenv, and
I'd recommend pip + venv to anyone learning Python. venv is in the standard
library, so there's official documentation for it.

I picked up Pipenv when a point-point release of a dependency broke a
production deployment. Pipenv's dependency locking meant that I wouldn't get
surprised like that again.

Part of why this topic comes up so much is the desire to run with a language
before learning to walk with it, perhaps. I'm a big fan of Poetry, but I like
it because I know what it gives me compared to vanilla pip and a setup.py
file.

Installing dependencies at the OS level will get you far as a beginner. And
when the time comes that you need a virtual environment, you'll probably know.

------
dedoussis
I've been working in Python roles for some years now and I've never understood
why the Python dependency tooling is so poor.

Pip feels like an outdated package manager, lacking essential functionality
that package managers of other languages have had for years. For example,
credential redacting in pip was only introduced in 2019, 8 years after its
initial release!

Not to mention the global-first nature of pip (packages are installed globally
unless the user explicitly requests a local installation). You can still
install packages locally, but this only shows that pip was not built with
environment reproducibility in mind. As a consequence, the need for additional
environment tooling (like venv) arose, which increased the complexity of the
local Python setup.

Tools wrapped around pip are also subpar. I cannot see why Pipenv is so
resource-intensive, leading to long and noisy builds (my machine gets close to
exploding on a `pipenv lock`), with very fragile lock files. Debugging an
unsuccessful lock in the CI of an enterprise project is a mystery that can
take an entire week to solve. Its JavaScript counterpart (npm) does the exact
same thing, faster and with less CPU usage.

Trusting the open-source community, I understand that there are probably very
good reasons for Pipenv to perform like this, but as the consumer of a
package-management tool, all I see is the same generation of file hashes I see
with npm, only npm does it way more efficiently. I really see value in the
principles Pipenv is promoting, but to me the developer experience of using it
is suboptimal.

------
bvar10
Serious question: What is the difference between virtual environments and just
having several Python installs like:

    
    
      /home/foo/a/usr/bin/python3
      /home/foo/b/usr/bin/python2
    

Python is so fast to compile and install that I just install as many throwaway
Pythons as needed.

I do not recall any isolation issues between those installs, unlike with conda
or venv, which are both subtly broken on occasion.

But I dislike opaque automation in general.

~~~
luhn
That basically _is_ what a venv is: an entirely separate Python install. Some
files are linked rather than copied, but it looks the same. venv gets you a
couple of extra conveniences, like the activation script.

I wouldn't call venv "opaque automation"; there's not much magic going on
there.
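
If you're curious, the stdlib venv module itself is a decent illustration; a
minimal sketch of using it programmatically (the target directory here is
made up):

    import venv

    # Copies/symlinks the interpreter, writes pyvenv.cfg, creates the
    # bin/Scripts directory with activation scripts, and bootstraps pip.
    builder = venv.EnvBuilder(with_pip=True)
    builder.create("/tmp/example-env")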

------
globular-toast
The trouble is these tools all do different things and aren't really
comparable. I wouldn't even include Docker in this kind of thing as it doesn't
really do anything on its own.

For me, there are two main choices today:

* An ensemble of single-purpose tools: pip, venv, pip-tools, setuptools, twine, tox,

* An all-in-one tool, for example Poetry, Pipenv or Anaconda (or Miniconda).

I prefer the former approach, but if I had to choose an all-in-one tool it
would be poetry.

~~~
mariokostelac
I agree with you that Docker should not be there, but the reality is that
people use it to replace some other tools (like venv).

I wonder why you prefer the former approach.

~~~
ggregoire
On the contrary, thanks for having included Docker in that list. It's the
obvious answer to so many problems (developing, running and deploying apps,
replicating deterministic Python environments, not installing Linux
dependencies required by Python packages directly on your machine, and so on).

BTW, to comment on one of the points you made in the article, it's not that
hard to run CUDA inside a container. It's less straightforward but quite well
documented. You basically need nvidia-docker [1] on the host and to start your
containers with the 'nvidia' runtime. docker-compose still doesn't support it
officially, but there are workarounds. [2] I'm running it on ~50 instances in
production and automated all the setup with Ansible successfully.

[1] [https://github.com/NVIDIA/nvidia-docker](https://github.com/NVIDIA/nvidia-docker)

[2]
[https://github.com/docker/compose/issues/6691](https://github.com/docker/compose/issues/6691)

~~~
globular-toast
Why thank him for including Docker if it's the "obvious answer"? You're
already using it, that's great.

Docker doesn't do anything Python specific on its own. It can be part of a
pipeline but only with support from the Python specific tools which is what
should be discussed in this kind of article.

------
franey
I think this is a good basic overview of the dependency management landscape.
I have a few things to add.

One is that because Python has been around for so long, it's easy to find
outdated or conflicting advice about how to manage Python packages.

I also think it's important to stress that pyenv isn't strictly a dependency
manager and, depending on your OS, isn't necessary. (Supported Python versions
are in the AUR[0].)

A lot of the pain from Python 2 -> 3 is that many operating systems were so
slow to switch their default Python version to 3. Unless something has changed
in the last month or so, macOS _still_ uses Python 2 as the default.

It's a shame to see Python take a beating for OS-level decisions.

[0] [https://aur.archlinux.org/](https://aur.archlinux.org/)

------
xapata
> [Pipenv] loads packages from PyPI so it does not suffer from the same
> problem as Conda does.

False. Conda manages packages installed from PyPI. This is discussed under the
Conda section, so I'm surprised the quoted line wound up in the article.

~~~
mariokostelac
Hey xapata, thanks for pointing this out.

Any chance you could give me some reference so I can fix it in the original
article?

~~~
xapata
[https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#pip-in-env](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#pip-in-env)

Basically, use Conda to manage environments, use Pip to install packages. If
you're using Conda to install anything, do that first.

------
boromi
I've always used conda since I use the SciPy stack. Can anyone clue me in on
whether I can instead use pipenv and it will download all the requisite
binaries etc.?

------
dzonga
pipenv is terrible. poetry ain't there yet. It seems the author forgot to
mention the problems with pipenv and poetry. virtualenv + pip will take you
far; then, to reproduce an environment, pipe `pip freeze` output to
requirements.txt. poetry etc. are still using pip under the hood.

------
boarnoah
Could someone summarize the issues with Pipenv (and by extension Poetry)? I've
been using them happily for the last few years and didn't know people disliked
them.

With Pipenv, ownership switched last year from the Requests library's author
to the PyPA, so it's more or less an officially blessed solution.

The only downside on this thread that I could understand so far is that it
might be slow to install dependencies on larger projects; I can't think of
anything else.

~~~
nwatson
I've been using pipenv happily for a few years now, but on projects that don't
have a huge number of dependencies (Django, DRF, MySQL/Postgres, AWS,
Kubernetes, a few other random libraries), and haven't seen too much slowness.
I suppose data-science projects with large dependencies that pull in many
other dependencies might have more issues.

I've been the person to document setting up development environments for
others in macOS (and Homebrew) with a view to deploying in Linux, and pipenv
(and pyenv, and Docker/docker-compose for setting up software
context/datasets) definitely overall minimized the complexity for those
configuring their dev environments.

(EDIT: documenting dev environments)

------
sevensor
After hitting some weird PyInstaller bugs, I gave up and started compiling
Python myself. One interpreter for every project. Shell scripts to set the
paths. All libraries go directly in site-packages, not some other layer. A
little more complicated at the outset, but this approach has yet to let me
down. And compared to the nightmares I was trying to fix, building Python is
dead easy.

~~~
tagh
I can get PyInstaller working with venv, but not conda.

------
alxmdev
Reading this makes me grateful that I can get away with just _apt-get_. I
wonder how prevalent this is, since not every project needs specific or latest
versions of the runtime and libraries, only a minimum. Some are just plumbing
tools that stick to the stable core, and the Python 3 ecosystem has been
mature for enough years that older distro packages are still useful and
capable.

------
sosodev
Personally I just try to avoid Python development because I hate feeling like
I'm dealing with what should be a solved problem. Recently I had to work with
an outdated Python Tensorflow framework and the only way we could get it to
work correctly across different dev and deployment machines was with a fat
Docker image that took hours of head scratching to build. It was miserable.

------
brummm
Using conda for environment and dependency management works really well.

~~~
cbarrick
Conda is a great tool.

But it forks the ecosystem, twice:

First, Conda packages have to be maintained separately from PyPI packages.

Second, the "default" repo is maintained by Anaconda, but the community
maintained Conda Forge repo is also separate, and officially the packages in
one are not compatible with the packages in the other. (In practice they
usually play nice).

Having three incompatible package repos is not ideal.

------
JackC
pip-tools should really be included here. That's the single-purpose tool that
handles environment reproducibility, if you're going with the single-purpose-
tool route of pyenv + pip + venv instead of the all-in-one route of
poetry/pipenv.

------
CharlieBlack11
Tbh, this is one of the reasons why I moved away from Python to Ruby for my
side projects.

~~~
xapata
Are bundler, rvm, and rbenv not as confusing?

~~~
sosodev
Definitely not. In Ruby I can immediately understand how to run a correct copy
of any project because they all use bundler. Furthermore rbenv makes switching
between specific Ruby versions for those apps trivial.

------
carapace
You know what pathlib does for paths and files? Python needs something like
that for distributions/versions/import hacking, eh?

Glyph (of Twisted fame, whence pathlib IIRC) pointed this out ages ago: model
your domain [in Python] with objects.
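
For anyone who hasn't used it, a tiny sketch of the object-style API pathlib
gives you, which is the kind of thing I mean (the paths are made up):

    from pathlib import Path

    # Paths are objects you compose and query, instead of strings you splice.
    project = Path.home() / "projects" / "example"
    for py_file in project.glob("**/*.py"):
        print(py_file.name, py_file.stat().st_size)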

------
6gvONxR4sf7o
Oof, that footnote:

> It’s 2020, but reliably compiling software from source, on different
> computer setups, is still an unsolved problem. There are no good ways to
> manage different versions of compilers, different versions of libraries
> needed to compile the main program etc.

I wonder how much stuff like this has to do with python's popularity. When I
have opaque issues like "libaslkdjfasf.so is angry with you and/or out to
lunch and/or not doing expected things," it's the most frustrating part of
programming. I'd pay devops people infinite money to not have to deal with
installation/setup issues anymore.

~~~
mariokostelac
I think this is not a problem specific to python packages, but a general
problem of how we compile C/C++ software. There is no concept of packages and
compiling one thing often requires installing a -dev package of some other
library.

The issue is that lack of packaging C/C++ world spreads to all other
communities that depend on them.

------
cycomanic
Every time I read an article about all these tools I really can't help but
think about what would have happened if Linux had taken over the desktop. All
these tools largely seem to poorly replicate Linux package management, and
because of this, devs no longer care about API stability or about not always
building against the latest and greatest.

I admit pyenv is nice for testing against different Python versions if
necessary. But on my Linux systems I'm generally fine with just installing
system packages and doing pip install --user for the odd package that is not
in the repositories.

~~~
mariokostelac
I think that works when you use Python CLI tools, but not when you're working
on 5 different projects, each running a different Python version.

~~~
Sohcahtoa82
Outside of Python 2/3 differences, are Python interpreters not backwards
compatible?

In other words, obviously a program written for 3.3 won't work in 2.7, but
will a program written for 3.3 fail to run in 3.8?

If it runs fine, why the need for multiple interpreters? I'd think you'd get
by just fine by having the latest 2.x and 3.x installed.

~~~
Izkata
Because underscore functions aren't truly private, I once saw an upgrade from
2.7.8 to 2.7.13 fail. A commonly-used package was importing one from a core
Python module.
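
A hypothetical sketch of the failure mode (the actual package and helper
involved were different):

    # Nothing stops third-party code from importing "private" names from core
    # modules, so a patch release that moves or renames one breaks the importer.
    from socket import _GLOBAL_DEFAULT_TIMEOUT  # private by convention only

    print(_GLOBAL_DEFAULT_TIMEOUT)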

------
mariocesar
I just use pyenv and pip-tools. I create a requirements.in and Makefile
targets to build the requirements.txt from it. So far I haven't found a
scenario where that combination is detrimental.

------
WillDaSilva
This is a good overview of something that took me an annoyingly long time to
learn. My personal preference is to keep things simple with pyenv, venv, and
pip.

Tangentially related is the tool tox [1], which is often used to run a test
suite inside of virtual environments created by venv, on multiple versions of
Python managed by pyenv.

Now if only setuptools could work well without hackery...

[1]:
[https://tox.readthedocs.io/en/latest/](https://tox.readthedocs.io/en/latest/)

~~~
postpawl
Wouldn’t you still need something like pip-tools to lock down subdependencies
and handle conflicts?

~~~
ploxiln
Plain old pip and venv can do that. Just "pip freeze > requirements.txt" and
elsewhere "pip install -r requirements.txt", inside venvs.

~~~
postpawl
I think that will end up installing the subdependency version of whatever is
last in the requirements.txt. You need a dependency resolver to deal with
problems with conflicting versions.

More details here: [https://medium.com/knerd/the-nine-circles-of-python-dependency-hell-481d53e3e025](https://medium.com/knerd/the-nine-circles-of-python-dependency-hell-481d53e3e025)

~~~
ploxiln
pip handles the simple cases: if you install a new pkgA that depends on
'pkgB<3', it installs the latest appropriate version of that, e.g.
'pkgB==2.5.6'. This works even if you had already installed 'pkgB==3.0.2'; it
will uninstall that first. The problem is if some 'pkgC' depends on 'pkgB>=3'.
You probably want pip (or similar) to figure out that an older version of
'pkgC' is compatible with pkgB 2.x.
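
To make that concrete, a toy sketch of the conflict, assuming the `packaging`
library (which pip itself vendors) is available:

    from packaging.specifiers import SpecifierSet
    from packaging.version import Version

    pkg_a_needs = SpecifierSet("<3")   # pkgA pins pkgB<3
    pkg_c_needs = SpecifierSet(">=3")  # latest pkgC wants pkgB>=3

    candidates = [Version("2.5.6"), Version("3.0.2")]
    ok = [v for v in candidates if v in pkg_a_needs and v in pkg_c_needs]
    # [] - no pkgB version satisfies both, so a resolver has to backtrack to
    # an older pkgC (or, like old pip, silently pick a bad combination)
    print(ok)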

But I actually don't want it to be too smart. Better to keep your dependencies
minimal and explicit, and manually specify older 'pkgC' if you need to. I have
a few non-trivial services in production, the most complex one with 16 total
dependencies + sub-dependencies. That is quite manageable.

So, I strongly recommend manually curating the most appropriate versions of
the few tastefully chosen dependencies you really need. Then, pip+venv can
easily reproduce that exact set of dependencies anytime. I also do something
very similar to this with C applications, and Go. Sub-dependencies should be a
big factor in how you choose your direct dependencies.

~~~
postpawl
The problem with doing things this way is that you’re not going to know
there’s a problem until there’s an issue in your tests (hopefully) or
production. You’ll eventually install something new, it’ll update some
subdependencies to a version that another library doesn’t support, and then
things get broken. Pip-tools is easy to use and it tells you there’s a problem
before it’s too late.

------
RMPR
It's worth noting that on Linux it's slightly different, because most of the
popular libraries can be installed with the system package manager (no problem
of dependency management, updates, ...). I rely on alternative solutions only
when I want to use a version of a library different from the one shipped by
the package manager (which is not that frequent with fast-paced distros like
Fedora) or when the library is not packaged.

~~~
_ZeD_
No. Don't mess with OS packages in your dev setup. On a very small scale I can
adapt to use the OS package version. But when you start to work on 4, 5, 15
projects, each of which needs to work with some specific version of some
package, you need to detach from the OS.

~~~
RMPR
And my point is: use the system packages whenever you can; they are there for
a reason. While developing atbswp[0], I faced a situation with wxPython where
the only package available on Linux was the one provided by the package
manager - it's a known situation[1]. The workaround I used was, instead of
"detaching from the OS", to work with the OS: more specifically, install the
package with the OS package manager, then copy the wx folder from the system's
site-packages to the venv's site-packages[2].

0: [https://github.com/rmpr/atbswp](https://github.com/rmpr/atbswp)

1: [https://wxpython.org/pages/downloads/](https://wxpython.org/pages/downloads/)

2: [https://github.com/RMPR/atbswp/blob/master/Makefile](https://github.com/RMPR/atbswp/blob/master/Makefile)

------
franciscop
For all the problems that Node.js and JS at large have with a centralized
package manager, I for one am very happy that it's not in the Python
situation. 100% of the packages I've tried to install in the last 3+ years
were simply `npm install PKG`.

~~~
xapata
Do popular Node packages rely on C and Fortran?

~~~
diegof79
I don't recall any popular Node package relying on Fortran, but there are two
popular packages that rely on C: fsevents and node-sass.

It works on macOS and Linux without any issues. Windows usually requires some
extra steps to set up node-gyp.

------
knodi
Python dependency management is much like Python 2 to 3, a mess. It's shocking
to think pip and pipenv are so widely used and still such terrible tools.

------
0xferruccio
Great article Mario! Was a pleasant surprise to open HN and find this

~~~
mariokostelac
Thank you :)

------
nurettin
pipenv also loads any .env file it finds in the directory, which makes it a
little more convenient to use than Poetry, so I haven't made the switch.

------
stared
I like this overview. However, it points to a fundamental problem with the
Python environment, one that goes very much against its own credo:

"There should be one - and preferably only one - obvious way to do it." \- The
Zen of Python; see also [https://xkcd.com/1987/](https://xkcd.com/1987/).

When it comes to package, environment, and dependency management, I think
that, ironically, the JavaScript environment is light years ahead; vide:
[https://p.migdal.pl/2020/03/02/types-tests-typescript.html](https://p.migdal.pl/2020/03/02/types-tests-typescript.html)

------
pjs_
Anyone installs Conda on my shit I hit the roof... I sympathize with the
individual but cannot tolerate the act.

~~~
emmelaich
Can you elaborate?

~~~
anxq11
Perhaps I can guess some causes of the irritation. Let me start by saying that
_on Windows_ conda is probably an improvement.

On Linux, however, I do not see much benefit, unless you frequently install
large binary C library based packages.

To me it feels cleaner to compile these packages from source. You are sure to
have no glibc mismatches etc.

Conda, despite its advertising, _does_ have library issues. C libraries are
shared between environments, and compiling inside an environment can lead to
surprising results when stale libraries are in the Miniconda path.

All in all, it feels like a second OS shoehorned into the user's home
directory. Compared to apt-get it is really slow and bloated.

It feels too intrusive on a Unix system.

Also, I'm not sure if the repositories are secured in any meaningful way.

~~~
pjs_
This is a perfect summary which exactly captures my frustrations.

I should have said that on Windows Conda is a much more understandable choice.

------
mistrial9
Anyone with insight into Debian/Ubuntu packaging care to comment?

------
mariokostelac
It's 2020, but the Python community still has not converged on a small set of
sane solutions.

It seems to me that Ruby, PHP, JS, and Rust communities have solved the
problem.

~~~
nouveaux
I'm only familiar with Python, JavaScript, and Rust. It seems to me that Rust
is the only one that has "solved" this problem.

I don't think there are any real Python devs who think dependency management
is solved. However, why would you claim JavaScript has a good solution? The
inconsistencies between Node and web dev are odd at best. Babel compilation is
annoying and slow. Are we even standardized on webpack yet?

Can anyone say with a straight face that getting a new JS dev caught up on all
the different parts needed to compile a JS program is a solved problem?

I don't fault either the Python or the JS ecosystems. As pioneers in
dependency management, they went through a lot of trial and error. New
languages like Rust benefited from it, and that's OK.

~~~
tarruda
One thing node.js got right is the module resolution logic.

Node doesn't even need something like venv since module lookup is always
local.

Also no problem with dependency hell. Each dependency can have its own private
dependencies, even different versions of dependencies shared by sibling
modules. Tools like yarn/npm can remove duplicates across a project.

~~~
pydry
It's a lot more of a pressing concern when you have an average of 1,200
dependencies per project.

(I'm actually not sure if it is the average but my anecdotal experience is
that it's an order of magnitude higher than python, and 1200 wouldn't be
unusual).

~~~
nicoburns
That's probably quite accurate for front-end projects, which pull in a ton of
packages for 1. a compatibility layer with older browsers and 2. build
tooling.

Node projects tend to have much smaller dependency graphs.

