Ask HN: Why is Python package management still a dumpster fire?
61 points by breckenedge on Sept 12, 2022 | 109 comments
Today I was trying to use Python to build some custom ZMK firmware, which relies on a package named west. For the life of me, I cannot figure out how to get it installed so that it's in my PATH. Why is Python package management still this bad after so many years, with so many wonderful examples of how good a package manager can be?



This thread is textbook trolling to get an answer:

```

- I discovered that you'd never get an answer to a problem from Linux Gurus by asking. You have to troll in order for someone to help you with a Linux problem.

- For example, I didn't know how to find files by contents and the man pages were way too confusing. What did I do? I knew from experience that if I just asked, I'd be told to read the man pages even though it was too hard for me.

- Instead, I did what works. Trolling. By stating that Linux sucked because it was so hard to find a file compared to Windows, I got every self-described Linux Guru around the world coming to my aid. They gave me examples after examples of different ways to do it. All this in order to prove to everyone that Linux was better.

- ion has quit IRC (Ping timeout)

- brings a tear to my eye... :') so true..

- So if you're starting out Linux, I advise you to use the same method as I did to get help. Start the sentence with "Linux is gay because it can't do XXX like Windows can". You will have PhDs running to tell you how to solve your problems.

```


Now it's only a matter of time until someone automates that using GPT3.

I mean we already have AI-automated "email trolling" where they use GPT3 to produce an introduction sentence which suggests that the sender took at least the time to look at my homepage before bothering me, but in reality it's all automated and they are just sending more personal-looking spam than the next guy.


I find it highly offensive on multiple levels that A) people think that sort of behavior is necessary or acceptable, and B) that there are people out there who reinforce that belief by being the ass-hat who answers valid questions with "RTFM!". If someone has trouble understanding man pages, that's reasonable. Turn them on to other (more easily parsed by "average folk") sources of the same information, and request that they "pay it forward" in similar fashion.


You have to get these things right early-on, and get everyone on the same page. Once things start to get fragmented, once you have historical decisions that prevent you from improving certain aspects, etc, there's only so much you can do after the fact to paper over the gaps.

Python's packaging story is rotten in its bones. I think at this point it's nearly impossible to fix (though many continue to try).

The way I see it, a solution would require:

- A cohesive, self-contained, easy to use, all-inclusive toolset, with a completely standardized manifest format for packages. And, for all that is holy, doesn't mess with the global environment so that we have to lie to it with virtual environments.

- That toolset would have to get overwhelming adoption by the community, to where it retroactively becomes The Python Package Manager (outside of legacy codebases, which would continue to languish). This would probably require endorsement by the Python developers, or for the tool to be so unassailably better that it's preferred without exception, or possibly both. Otherwise we'll continue to have: https://xkcd.com/927/

I want to emphasize that on the second point 50% adoption is not enough. 70% adoption is not enough. I'm talking 90%+ adoption for libraries and new projects; everything outside of it needs to be a rounding error, or we're still in fragmentation hell.

And then, even in the best case, there would be a long tail for years to come of packages that haven't been ported to the new tool/manifest format, where you have to duck out to one of the other package managers to pull them in.


Yeah at this point it appears to be a cultural issue. There’s been plenty of opportunity to anoint a proper package manager that doesn’t require virtual environments or install things globally by default, that uses lock files and resolves dependencies without too much fuss. If I’m not mistaken poetry and pipenv offer this. But there just isn’t the buy in. People in the Python world, including many in this thread, just seem to refuse this path and insist, proudly, that their workflow is just fine. It’s a kind of psychological curiosity that people would resist proper package management so fervently.


The average data scientist thinks it is OK to have a manual process to make the monthly report so long as it takes less than 1 month and so does their manager. And so does the VC that funds it.

I mean, you can’t write a command line program which is reliable and straightforward to install in blub, why would anybody need it?


There's also the fact that Python is old. Python is used as Linux's unofficial system scripting language the same way Macs use Ruby. I think we should mandate a SYSTEM_PYTHON variable to avoid polluting the PYTHONPATH further with brittle global installs that can't be easily upgraded or replaced.


Erm. Brew uses Ruby, and there are historical reasons for that, but Apple never had any favorites in that race.



> - A cohesive, self-contained, easy to use, all-inclusive toolset,

The only complex thing about Python packaging is that it separates its tools into "frontend" and "backend" build tools. On the frontend, pip is essentially the cohesive, self-contained, easy to use, all-inclusive solution, as long as your backends support the standards [0][1].

> with a completely standardized manifest format for packages

That's wheel [2]. Everything else is deprecated now, except for sdist (which doesn't really matter as it's limited to Python-only packages, and transparently supported by all tools).
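
For instance, building both formats goes through the same standard frontend interface; here's a sketch using pypa/build (the `build` package is my assumption here, not something the thread mentions — any PEP 517 frontend would do):

    python -m pip install build
    python -m build        # reads pyproject.toml, writes dist/*.whl and dist/*.tar.gz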

> And, for all that is holy, doesn't mess with the global environment so that we have to lie to it with virtual environments.

This would be nice, but it is completely out of the hands of the language developers. Different Linux (and other OSs, like OSX) distributions have made different decisions about the python version(s) they bundle by default, and in many cases the bundled versions are critical for some OS-level tools to function properly. Pretty much all interpreted languages either a) have their versions of virtual environments, or b) rely on a single system-wide interpreter.

That being said, there is no need to use virtualenvs if a) you know that you need only one version of Python, and b) you have full control over your system. The best example is service deployments, where you can install a version of your choosing (or start with an appropriate base image), and install everything else on top of it.
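
For that deployment case, a minimal sketch (the requirements file and `myapp` module are hypothetical names, and the image is assumed to carry a single Python):

    # inside a dedicated host or base image with one Python on it
    python3 -m pip install -r requirements.txt   # straight into site-packages, no venv
    python3 -m myapp                             # hypothetical entry point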

> That toolset would have to get overwhelming adoption by the community, to where it retroactively becomes The Python Package Manager (outside of legacy codebases, which would continue to languish). This would probably require endorsement by the Python developers

Why exactly? There are two ways one can work with packages:

1. The majority of users want to be able to install packages.

This is already the case. Pip is overwhelmingly adopted by everyone, with two notable exceptions:

    * Conda, which is mainly used in some niche sub-communities, and which is fully compatible with pip anyway
    * Build tools being used for installing packages, either during development or for deployment in very controlled environments
But both of these exceptions are very specific and narrow in scope.

> Otherwise we'll continue to have: https://xkcd.com/927/

We already don't have that situation. I agree that we did, but the standards are now very clear, and the tools are slowly moving in the right direction.

[0] https://peps.python.org/pep-0518/

[1] https://peps.python.org/pep-0660/

[2] https://realpython.com/python-wheels/


What do you mean? You type `pip install west` and a binary will appear in `~/.local/bin/west`. This directory needs to be part of your $PATH variable. Done.
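
A sketch of that full sequence, assuming a bash-like shell:

    pip install west
    export PATH="$PATH:$HOME/.local/bin"   # add to ~/.bashrc or ~/.profile to persist
    command -v west                        # should now print ~/.local/bin/west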

Using virtual envs or pipx or poetry or whatever is non-standard.

Packaging Python projects, however, ... don't get me started on the whole `setup.py`, `setup.cfg`, `pyproject.toml` debacle. This article has more information about it, but the fact that this is supposed to be the short version makes it even more infuriating: https://blog.ganssle.io/articles/2021/10/setup-py-deprecated...


What's really frustrating is how little information is out there for setting up a pyproject.toml. The community has really dropped the ball on their new shiny.


You are right to an extent, but this is because this "shiny" has been transforming since it was introduced.

Originally, it was meant as a central point to define your build system requirements [0]; for this purpose it needs to include only two lines:

    [build-system]
    # Minimum requirements for the build system to execute.
    requires = ["setuptools", "wheel"]  # PEP 508 specifications.
Then, package managers like pipenv and poetry started using it as a central place for storing other project metadata like dependencies, description etc. Most package managers now have their own versions of that functionality, and it is currently being standardised to a common form [1].

Finally, many other projects have started adding support for keeping their configuration in pyproject.toml. Some (like black) don't even support any other form, while others (like flake8, and until recently mypy) are resisting this trend; but it is already so prevalent that it can be considered the standard.
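
To make those three roles concrete, a hedged sketch of a pyproject.toml that combines them (the package name and version pins are made up):

    [build-system]                    # PEP 518: how to build the package
    requires = ["setuptools>=64", "wheel"]
    build-backend = "setuptools.build_meta"
    [project]                         # PEP 621: project metadata
    name = "example-package"          # hypothetical name
    version = "0.1.0"
    requires-python = ">=3.8"
    dependencies = ["requests>=2.28"]
    [tool.black]                      # third role: tool configuration
    line-length = 88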

[0] https://peps.python.org/pep-0518/

[1] https://peps.python.org/pep-0621/


And yet, only having a pyproject.toml is apparently not sufficient either (besides the annoying flake8 resistance), at least if you want to support editable installs and you or your users aren't on the latest pip, IIRC.


Support for PEP 660 was introduced in pip nearly a year ago [0]. Most build backends have implemented it since as well [1][2][3].

[0] https://pip.pypa.io/en/stable/news/#v21-3-1

[1] setuptools: https://github.com/pypa/setuptools/blob/main/CHANGES.rst#v64...

[2] flit: https://flit.pypa.io/en/latest/history.html?highlight=660#ve...

[3] poetry: https://github.com/python-poetry/poetry-core/releases/tag/1....


I don't see the dumpster fire here:

$ python -m pip install west

That's it, no venv, no sudo.

pip will install it in ~/.local/ so you'll need a:

    PATH="$PATH:$HOME/.local/bin"
to get it onto your PATH.


On my debian stable, if I want this behaviour I actually have to add `--user`.

But that's not the recommended way to install pypi packages you want to use, with debian at least. This may install dependencies that will break (for your user) some debian packages depending on other versions of said dependencies. The easy way to go is `pipx install west`.

EDIT: --user is not necessary anymore. I think this does not alter the validity of the rest of my comment.


Actually, you shouldn’t need to add that flag, AFAIK it’s been the default for a while for non-root users. But both Debian and Ubuntu tend to package their Pythons to rely on .deb packages and have been somewhat inconsistent over the years.

Does it try to install packages in your system paths and fails due to lack of permissions?


You're right, --user seems the default behaviour, I did not realize that. Maybe because it's different in Fedora that I also use? Or maybe I was just plain wrong :)

Anyway, I think this does not change the fact that it's a bad idea to populate your ~/.local/lib with python libs, since it may break some python programs installed via apt for your user, or was that wrong too?


Well, I've been using Fedora since March (after a lot of Ubuntu) and I don't need --user. I'm on 36.

As to filling your lib with stuff, it depends. AFAICR the default is to search for things there first, because it is assumed the user environment takes precedence.


    pip --user
is suicide. You can be doing everything right, using venv’s, conda and all that, install one package in a user directory and blam! You just broke every Python you use because that package is installed in all your environments.

It's a bad idea right up there with Linux distros having python2, python3, python3.1, python3.2, python3.3, …. They give up the ease of use of a command line application, where you can tell people what to type, for the mind-numbing complexity of a GUI application, where it takes a 1500-page book to explain how to do anything in Microsoft Word, because now you can't tell people what to type to run and install things.


why use -m?


It ensures you're installing into the environment of your current python. Useful in case the OP already has a messed-up environment where the default pip doesn't use the default python.


It's a good habit which ensures pip runs with the same interpreter as your `python` command.

One example (but there's many): On Windows a `pip install --upgrade pip` can't work as the OS would lock the `pip` executable, while a `python -m pip install --upgrade pip` works as the OS would lock python instead.

I also encountered environments (CI runners, things like this) with `python` and pip installed, but no `pip` shortcut.
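
A quick diagnostic sketch for checking whether the bare `pip` and your `python` actually agree:

    python -c "import sys; print(sys.executable)"   # the interpreter `python` resolves to
    pip --version            # shows which installation the bare `pip` belongs to
    python -m pip --version  # always the pip of the interpreter you just invoked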

More: https://snarky.ca/why-you-should-use-python-m-pip/


I find python packages quite easy to work with :) plus they're similar to ruby gems.

You create a venv or bundle, list requirements in a text file, then ask it to install things for you.

And if you need custom stuff, you can just pip install a .whl file, too.

I have yet to encounter a case where it's not working as expected, so my answer would be that python isn't getting fixed because it's not broken.

wontfix, works for me


> You create a venv or bundle, list requirements in a text file, then ask it to install things for you.

Just last week I had problems with a project that I hadn't touched in a while.

After installing Python 3 on a new computer (and making sure that pip is installed) I found that my scripts broke because "pip install" was no longer a thing. I now needed to do something like "python -m pip install".

Not a big issue, just a reminder that things are still changing, hopefully for the better.

That said, whenever there's native code involved, things can get tricky (especially in Alpine based containers with musl instead of glibc).

That does apply to pretty much everything, just yesterday I also discovered that Ruby was slowing down builds 2-3x because of sassc install being really slow after an update. Then again, the whole library it depends on is deprecated so there's additional migration pain to be had soon.

And don't even get me started on database drivers and their prerequisites in each environment!

That said, even if something like Maven seems (sometimes) better due to no native extensions, I'm still happy that Python/Ruby/Node package managers work as well as they do. Sure beats copying random code manually.


Your PATH is broken and you’re not finding the pip wrapper for your install. If you need to install Python manually, use pyenv, which actually makes sure pip gets into your PATH to avoid that kind of thing.


Then I guess it's broken by default, considering that it was a pretty fresh install following whatever was the first Google result.

Regardless, someone actually advised against using PATH for pip because of confusion when you have multiple Python versions installed, which seemed like a sensible argument: https://snarky.ca/why-you-should-use-python-m-pip/


"following whatever was the first Google result"

That's your problem right there. Blindly doing that without understanding the way things work is an anti-pattern.

My point was that many Python distributions (especially in OS packages) split things up. If you need a custom Python or a specific version, use pyenv, which sets things up the right way.


Edit: actually just checked what's wrong again.

When running with cmd, I get an error about Python 2 being used instead of Python3:

  C:\Users\USERNAME>pip --version
  Fatal error in launcher: Unable to create process using '"c:\python27\python.exe"  "C:\Python27\Scripts\pip.exe" '
  C:\Users\USERNAME>pip --version
When running with Git Bash, I don't get any output whatsoever:

  USER@MACHINE MINGW65 ~
  $ pip --version

  USER@MACHINE MINGW65 ~
As it turns out, PATH has both Python 3 and Python 2 in it. Why is there also Python 2 on the machine? A legacy project that needs testing whilst migrating it over? Helping someone with an older script? Who even knows at this point.

My takeaways:

  1. following the first Google result (python.org and the official installer) wasn't the problem here
  2. Python 2 and Python 3 don't play nicely together, at all, might need to remove the old one from PATH to avoid headaches
  3. Specifying the Python interpreter to use (my example above, or better yet explicit python3) solved the issue
  4. Certain shells are weird sometimes, go figure
  5. What you're saying about pyenv is probably a good idea
Or maybe even something like PyCharm which lets you specify the runtime per project (much like you could specify which JDK to use with IntelliJ for Java).

Hopefully by the time Python 4 comes out and breaks everything, we'll already have figured out use cases like this (just kidding, sort of).

Node feels much the same way, especially because it moves ahead a bit more quickly with its releases and has less backwards compatibility when compared to Python. Somehow projects like Node Version Manager (https://github.com/nvm-sh/nvm) didn't really seem to gain much traction.


Mostly agree, however the only area I think the “official” tools could improve upon is splitting dependencies between requirements.txt (top-level requirements), dev-requirements.txt (top-level dev-only requirements), and requirements-freeze.txt (all requirement versions frozen for deployment). Currently you have to be careful to ensure your requirements-freeze.txt doesn't end up with dev dependencies in it.
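
One third-party way people keep those layers straight today is pip-tools; this is just a sketch of that approach, not one of the "official" tools being asked for:

    # requirements.in and dev-requirements.in hold the top-level, unpinned deps
    pip install pip-tools
    pip-compile requirements.in        # writes a fully pinned requirements.txt
    pip-compile dev-requirements.in    # writes a fully pinned dev-requirements.txt
    pip-sync requirements.txt          # installs exactly what's pinned, nothing extra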


In the industries where I worked as a C++ coder, it was quite common to leave debugging symbols in the binaries so that customers could have better stack traces for their error diagnostics. On OS X, it is still quite common to ship production apps in a way that allows you to "symbolicate" stack dumps after the fact.

What I'm trying to say: In most production uses, the line between dev and release dependencies is so blurred that it almost doesn't exist.


It depends how complex your build is.

Instead of solving dependencies up front, pip just starts installing stuff; it tries to backtrack if it paints itself into a corner, but it frequently gets stuck.

If your dependencies are wheels it is not so bad; in fact, with the right software you can download the dependency list of a wheel without downloading the wheel itself, so you could do what real dependency management software (maven) does and download and resolve the dependency graph before installing anything.
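
For example (just a sketch, with `west` as a placeholder package): PyPI's JSON API exposes a release's declared dependencies in its requires_dist field without fetching the wheel:

    curl -s https://pypi.org/pypi/west/json \
      | python -c "import json,sys; print(json.load(sys.stdin)['info']['requires_dist'])"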

With eggs you are SOL because you have to run a Python script to determine the dependencies and if you run enough of those one will have a side effect or two.


> You create a venv or bundle

Here’s your dumpster fire. I can’t figure out why this crappy venv thing has to exist in Python when it doesn’t exist anywhere else.


It's really not special. Having a dedicated ~isolated space for a dependency graph is normal. Better than overwriting global libs.


Actually, it does. You can do the same in Java, Go, and plenty of other languages. Pretty much every language I’ve used has a way to have multiple runtimes/environments coexist on the same machine, referred to by PATH tweaks or specific environment variables.


> You can

Yes. In Python it seems like you have to, or it just doesn't seem to work.


In other languages people use the facility equivalent to venv like the way most people breathe.

Pythoners have a sense of entitlement that they can just install everything into one big pot and it will all work. Python barely survived becoming an important language on the Linux platform, because the sense of entitlement of ‘just type Python’ made the migration to Python 3 take 10 years instead of 3.

Before Docker came along I would frequently have multiple installations of customer software running on machines; so long as you can set the file system paths and database connections it ‘just works’. But people didn't have the discipline to do this, so we got Docker to turn the whole morning into one big coffee break.


> In other languages people use the facility equivalent to venv like the way most people breathe.

> Pythoners have a sense of entitlement that they can just install everything into one big pot and it will all work.

Perl has really high backwards compatibility. I've never used anything like a venv with perl (although you can) and never had any problems, and I just install everything into one big pot and it all works.

It just seems like a cultural issue with Python.


Well, that’s not my experience. And you’re welcome to read the rest of the comments from folk who share this viewpoint.


> this crappy venv thing has to exist in Python when it doesn’t exist anywhere else.

Well, there's RVM[1] for Ruby, maven-env[2] for maven, and perl5-virtualenv[3] for Perl.

[1] https://rvm.io [2] https://github.com/AlejandroRivera/maven-env [3] https://github.com/orkunkaraduman/perl5-virtualenv


Apparently, you're new to back-end programming? We have similar "locally installed package" abstractions in every major language there:

C++ -> vcpkg

Ruby -> bundler

Node -> npm vendor

Go -> Go Modules

Java -> Maven

If you disagree, please explain how Ruby's bundler caching dependencies in a vendor subfolder differs from Python's venv caching dependencies in a subfolder, and why one of them is a dumpster fire and the other one doesn't exist.


> Node -> npm vendor

> Go -> Go Modules

These are transparent though. You don't need to "activate" your venv when you enter your project's directory. You can't misuse it. You can't fuck up everything because you switched project and forgot to change the venv.

For Go you don't even need gvm or anything like that because of the compatibility promise. Just use the latest and be done with it.


When I have been doing analysis work I typically create a suite of command line programs to do that work with. Systems like npm and maven lock me in a prison because they are dependent on the directory I am in…. They take away the ability to cd to an arbitrary directory and run my command line programs.

I can’t see why people want to take the X out of Unix like that, maybe they remember using CP/M and never really figured out cd and mkdir and they want to bring us all to their level.

With a .venv, however, the program suite I am running and the directory I am in are decoupled, and I can breathe free.


Go modules and NPM are easy to use. Virtualenvs are a pain in the ass to configure. That's the difference really. If I could just go to my Python project directory and "virtualenv init" then this wouldn't be a conversation topic.


What's crappy about virtualenvs?


They get the environment under control and remove the excuses that some Python ‘programmers’ have for always getting the job 80% done.


Why? Just read the comments here. A combination of hubris and misunderstanding of how hard it is.

"Just do pip install" --> Whenever I hear this comment, I know Im talking to someone who has never used scientific libraries like numpy, scipy etc. Never seen the problem of dependency versions going into a mess because Pip by default doesnt pin dependencies (Poetry does, but it is not standard).

Python packaging is a mess because, for some weird reason that baffles me, a large majority won't even admit there is a problem. And they will start jumping on you and calling you an idiot if you say there is. A lot of gaslighting going on here.


I think Python package management is pretty good now. The absolute toughest bit is learning how to work with virtual environments, but once you have a strong understanding of that I think the system works very well.


Using virtualenv? Or venv? Or poetry? Or pyenv? Or pipenv? Or conda? Or something-else-I-can't-think-of-right-now?

It's still a mess. Arguably, a worse mess than it used to be.


The first two are the same, and the rest (apart from pyenv, which is not for packages) are effectively different abstractions over them. The whole thing about choices in python packaging became more of a meme than reality. There are specific issues (pip itself not having a lock file equivalent), but they're often blown out of proportion.


I had that convo a few days ago. What surprises me is that pip and venv were almost OK. But since then, twenty thousand new ways have popped up all at once. And I can't understand which does what better.


poetry and pipenv both use virtualenvs under the hood, wrapping it.

They provide dep management. Pip/virtualenv never did dep management really. There was requirements.txt sure, but you had to manage that yourself.


I use poetry and forget the rest of them exist. It has worked well so far.


poetry is a game changer for sure.

I’ve found using conda, in dev, to provide system libraries and poetry (+ a build.py for our C extensions) to be a really comfortable experience.

In CI we use cibuildwheel to produce “statically-linked” wheel files. They're not really statically linked, but they behave as though they are by bundling all dependent libraries into the wheel and doing some RPATH “magic” to look those up at runtime.

I’d like to reduce the delta between dev and CI at some point, but it is stable rn so I’m inclined not to futz with it.


Poetry seems to be the most npm-style package management for python.


I only have a passing knowledge of NPM, as I am not a JS dev, but I believe you are correct. Poetry has taken good ideas from other ecosystems and applied them to the Python world.


They all change PATH somehow. Which one you use depends on your use case. I, for instance, prefer pyenv for having a runtime and package set that covers most of my projects and virtualenvs for test sandboxes of new package versions.

Conda, for instance, is great for ML and scientific work, given the extra niceties for replicating environments, but most specifically because you get package sets optimized for your analysis needs.


I really don't see the problem here - you have two options, both of which involve one install command and at most a one-step setup process:

1. A user install: `pip install [package]` and make sure "$HOME/.local/bin" is on your PATH

2. A global install: `sudo pip install [package]` - it will be installed to a dir on your path already (/usr/bin I think)

As for why pip is not ideal for installing software: it's not supposed to be. It's a Python package manager, not a software package manager. It's meant to install libraries for the Python interpreter to import, not commands for you to run. Of course, people do often use library managers to install software (npm, cargo, go...), but the experience is the same in all of them - either you install with sudo and it "just works", but might cause problems later, or you install in "local" mode, which requires you to add the package manager's directory to your path.


You need three things usually:

- virtual environment (python3 -m venv --help)

- pip for manually installing things

- poetry for declaratively installing things

From a UX point of view, this is already pretty good and tends to work well. What sometimes makes this awkward and inefficient is that many python projects either don't declare their dependencies at all, or declare them with very specific versions. This apparently makes it necessary for the resolver to do things quite heuristically.
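
A minimal sketch of wiring those three things together (file and directory names are just examples, and this assumes poetry picks up the already-activated venv):

    python3 -m venv .venv              # 1. the virtual environment
    . .venv/bin/activate
    python -m pip install poetry       # 2. pip for manual installs
    poetry install                     # 3. poetry installs what pyproject.toml declares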


Perfect explanation. Not much more to do for almost every project.


Virtual environments are rarely necessary in my experience. I just install everything in my home directory (the default when running `pip` as a regular user) and I haven't had a version conflict in ages.


If I were hiring for a Python job, the most important questions I would ask would be around weeding out somebody with this attitude. If you are coding a little bit on your own account you can get away with this, but if you are on a team, you are playing to help the other team win if you think this way.


That's harsh. I just want to point out that you don't have to go for venvs by default. I believe that's one of the reasons why people who aren't familiar with it are having issues with Python's package management. They think they always have to set up a venv even if there are no conflicts, and then things become quickly complicated.

If you do run into conflicts (which, again, I believe to be rarer than an uninitiated user might think) I'm happy to point you to one of several solutions. I like pipx, but poetry or the built-in venv work as well.


Remember those platforms in Super Mario World that would start out with a number on them, say 5 and the number would count down each time you jumped off the platform?

Each time you install something that isn't conflicting now, you increase the odds of a conflict the next time. Worse yet, you aren't going to get an error that says ‘you have a conflict’; rather you are going to get some random error which is irreproducible, so you won't get help when you go to stack overflow or wherever you go to get bad advice.

It is cruel to give beginners instructions which might work but are nearly certain to break their environment if they go at it long enough.

Probably the best thing you could do to make Python easier to use is lock down all the footguns like ‘pip --user’ and pip outside of a virtualenv, the ability to configure a charset, etc.

You don’t argue about plugging a keyboard or mouse into a computer or getting a cell phone plan for your phone, if Python didn’t give you so many footguns you wouldn’t be complaining about how hard it is not to shoot yourself.


That's all good and well, but how come I don't experience conflicts? I install a lot of packages all the time. In my experience, it's not that big of an issue and all I wanted to do was provide an additional data point.

Plus, all of the advocates of using plain venv miss OP's point: it doesn't add an executable to their PATH. So much for giving cruel advice. There are many packages that I want to use system wide (or at least in every shell), like powerline in vim. Having to activate each environment manually before running an executable does not scale at all. Recommending venv outside of wrappers like pipx or poetry won't help with that. And that's my point: before adding needless complexity, just try `pip install X`. If it doesn't work, you can always fall back to pipx/poetry/etc.


Four things: you also want to specify a python version.

Some projects run fine on 3.5, some need 3.10…

Add pyenv or something similar as step 0 :)
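
Something like this, as a sketch (assuming pyenv is already installed; the version number is just an example):

    pyenv install 3.10.6     # fetch and build that interpreter version
    pyenv local 3.10.6       # pin it for the current project directory
    python -m venv .venv     # then carry on with the usual venv steps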


I think the reality is that it's misunderstood and misused. Perhaps this is an education problem.

1. All virtualenv related things are not package managers. They're isolation tools. They ensure your application controls its environment.

2. Pyenv, asdf, and others provide the same benefits for versions of Python. They're incredibly powerful tools for those of us who need to support multiple versions.

3. Tools like conda, poetry, pipfile, etc serve to solve dependency management based on how you, as a user want to interact.

Given my needs, I find poetry invaluable (and I use none of its venv integrations). For specific projects I rely on pip, as that meets the goal there.

It could be better, but IMHO poetry and friends have drastically improved the situation over pip freeze.


Adopting Poetry as the official Python package manager seems like the fastest path out of this mess.


There is no "fast path out of this mess".

If you really dig in the topic, you see that the major problem is that most packages use setuptools for packaging, which includes a setup.py which is executed during installation/dependency resolution and can change the set of dependencies dynamically based on host machine attributes.

And sadly that's not just a theoretical feature, but one that a lot of libraries in the wider ecosystem take advantage of (with some of them not seeing a need to change the behaviour).

If we really wanted to enforce Poetry-style package management, a migration effort almost as big as Python 2 -> 3 would be necessary.


The requirement for isolation tools is the dumpster fire. C#/.NET/NuGet provides system wide package caching and project level isolation. Javascript/npm doesn't bother with the caching, but still has isolation without effort.


Venv is necessary to support a command line development environment. C# devs are using Visual Studio which is doing something 100x more complex than venv behind your back.


Js has similar issues. Npm, npx, nvm, pnpm. The toolchain is lava.


Conda does do virtual environment management.

It's why I use it. It actually manages both the Python version and Python packages for my projects. I think as an interface, it is the best option for Python development, but in practice there isn't enough stuff on the conda channels, so you need to install things from both pip and conda and then you are in a messy situation.


Conda is a walled garden. It's hard to enjoy its value if it doesn't play well with the rest

> so you need to install things from both pip and conda and then you are in a messy situation.

Well obviously


I don’t get it, honestly. I have been using Python since 1.2 and never had this kind of problem since we moved to pip.

That may be because I use virtualenvs and pyenv extensively, but even without that as long as you understand how pip works and where it places packages and binaries for a non-root user, it is mostly a matter of setting PATH in your shell (I do it with a conditional to see if .local/bin exists) and you’re done…

I also have historically had very few issues with dependencies (other than lack of some OS libraries to rebuild a wheel when, say, I’m using Pillow on ARM and it needs to find libjpeg), and yet there is a constant stream of complaints here on HN… I wonder why.

Would it be the OS? (I use Mac, Fedora and Ubuntu, and just sync my projects across all machines with local virtualenvs - everything runs everywhere, but I don’t use Windows for Python development)

Is it specific packages? Complaints seem to be generic.

Or is it just Eternal September all over again?


Because a system that almost works is an enemy of a system that works.


Love this quote. My wife has put up with many a rant about "The Bluetooth of X" preventing a good system from ever being developed.


I had been trying to push this PEP ( https://peps.python.org/pep-0582/ ), which requires very minimal work and would help a lot of people, and it seems stuck.

So I'm not really surprised PIP is pretty much abandoned/dead. No one wants to introduce changes or improve it, it is easy to think: it works (for me), so why change anything?


I joined a hackathon this weekend on a Python project. I’m brand new to Python, but I know how to code and set up my Linux environment (so I thought).

First, I install the python3 packages I think I need from apt and set up vscode plugins for Jupyter notebooks. Great, I have a repl environment and can use the Python library we're working on.

Now I want to hack the library code. The project docs say to run “just make”. Ok, that’s a rust tool, so I’ll install cargo via apt and get that installed.

The Justfile's install recipe just has a pip install with dependencies listed, which appear to be defined in pyproject.toml. I run `just install`, and at the end it throws errors about dependencies.

In the end, I used conda to set up the environment and get everything installed. But it seems like if the environment doesn’t have something, it gets it from the system instead of installing its own.

The project used setuptools. The python3-pip apt package requires python3-setuptools v59. PEP 660 adds editable installs, implemented in setuptools v64.

My takeaway is that you should not use your system installed Python packages for development, and should install as few as needed by your system or else you have to set up more overrides in your project.


The python packaging situation could be better but poetry is pretty good.

I think the real problem with python packaging is a cross-language one. Show me a language that can elegantly handle building C/C++ dependencies across platforms. This is where things really break down. Python is unusual in this because it relies on C/C++ libraries for performance in a lot of domains.


For Python packages that include something you want in your PATH you can use pipx:

   $ brew install pipx # (1)
   $ pipx ensurepath
   $ pipx install west
1. https://pypi.org/project/pipx/

Alternately you can create a virtual env just for west:

    $ python3 -m venv west
    $ west/bin/pip install west
    $ west/bin/west
If you installed west with `python -m pip install west`, then `west` should be in the same place in your PATH as `python`. You can probably also run it with `python -m west`.


Is this more "someone explain what the `PATH` is"? Because that's what it sounds like. It doesn't sound like this has anything to do with package management.


One solution that I often hear about for Python packages is the use of wheels. But I am wondering how this can work when multiple packages depend (directly or indirectly) on a C library like zlib. As far as I know, there is no authority that decides which zlib version should be used. So if package A uses zlib.so.1 but package B uses zlib.so.2, can both be loaded at the same time? (The version numbers are just examples of two incompatible version numbers.)


Publicly available wheels are typically for pure Python packages (you can often see the specific runtime version they were created for). Last I checked, packaging architecture specific binaries was frowned upon, but I might be wrong here.


As far as I can tell, one of the arguments for wheels is that the "Installation of a C extension does not require a compiler on Linux, Windows or macOS"[0]. I also checked a scipy wheel file and there are some binary libraries in it. If it is not via wheels, what would be the recommended way to install packages with C extensions (requiring 3rd party libraries)? To my understanding, this is why anaconda is so popular (despite not being an "official" package manager).

[0] https://pythonwheels.com/


Well, the wheels maintained by the scientific community may sometimes bundle .so files for x86_64, I suppose. Since I work on both ARM and Intel, I have had to skip using wheels for a few things until they got aarch64 versions, and made sure whatever wheels I was using were only pure Python.


My actual issue is the absence of namespaces. It's a relatively small thing but it ensures that the odds of making an error when installing something are much lower simply because one will have to err in both the provider and the package.

Furthermore, it allows us to fork stuff without having to also change the name. Sure, installing via pip and git solves the issue, but why not inside PyPI?


The best way to get software engineers to solve your issue with X is to phrase your question “Why is X so horrible? It can’t even do Y.”

The engineers will inevitably reply “That’s so simple. You just need to …”

The ecosystem for managing python dependencies has improved a lot: pyenv, virtualenv, poetry.

PATH isn’t innate to Python. Understanding PATH will definitely help with other issues in the future.


The fact that you don't know how to install a package so that it's in your path doesn't mean Python package management is a "dumpster fire." In most cases it works far better than I had expected and I can only admire the amount of effort that went into this.


For any python project bigger than trivial, I literally just build a docker image and run it from that.


I think that is basically why Docker was invented.


Docker is about doing dev at the speed of ops. That is, ops thinks it is really fast to be able to deploy something in an hour, in dev it has to be more like a second.

So far as Python I think Docker is a way to accelerate the creation of corrupted and uncontrolled environments, so often I have seen people pick some random docker image for Python which turns out to be incorrectly configured, say the default character encoding is EBCDIC instead of UTF-8.

If people learned how to use environment variables and configure their database connections and paths properly we never would have needed docker, unfortunately the only thing sheeple will respond to is a brand.


This is the real answer.


The dirty secret about Python packaging problems is that a lot of them... are not related to packaging.

Python has first and foremost a big __bootstrapping__ problem, and the consequences of that are often seen when you try to install or use dependencies. So people conclude that packaging sucks.

Now don't get me wrong, packaging has some problems, but compared to most languages out there, it's actually pretty good.

But people cannot enjoy that, because they would have to install, set up and run Python in a perfect sequence.

So what's wrong with Python bootstrapping, and why does it affect packages?

Well, as it is often the case in FOSS, it stems from too much diversity. Diversity is touted as this one great thing, but its benefits also come with a cost.

- there are too many ways to install Python. Official installer, anaconda, Windows Store, pyenv, homebrew, Linux repositories... They all come with different configurations and lead to problems, such as anaconda being incompatible with pip or Linux default setup not providing pip.

- it's very common to have numerous and diverse Pythons installed on your machine. Different versions, different distributions, sometimes embedded Python... Now it's your job to know which Python you start and how. But people have no idea you should use "py -3.10 -m pip install --user black" and not "pip install black" on Windows if you want a clean install at the system level. Yes, this weird command is the only clean way to do it outside of a venv. I'm not making it up.

- the variation between the different ways to run Python and the commands leads to confusion. "python", "python3.10", "py -3.10", "pip", "pip3", "python -m pip", "sudo pip"... There are many wrong ways to do it, and not many good ones, and it all depends on the distribution, the OS, and whether you are in a venv or not.

So how does that affect packaging? Well, people will get a lot of "command not found", and so they will google and paste commands until one works. There is too much to know, and they want a solution, not learning the whole graph of possibilities. Eventually they do something that seems to work, but they probably installed with admin rights, or in anaconda with pip, or in the wrong Python version. On import or on run, something will then break. Maybe now. Or worse, maybe in one month. And they have no way to understand why.

What's the short term solution for you?

- Always use the official installer on Windows and Mac. Don't use anaconda, don't use homebrew, don't use pyenv.

- If you have to use Anaconda (I would advise not to), don't use virtualenv and pip, use their own tool: conda and the anaconda prompt. You will be limited in packages, but it will work. Don't mix conda and pip.

- On Linux, you will have to figure out what magic packages you need to install to get a full Python setup, because linux distributions authors don't provide a full Python by default. It's different for each distribution and Python version sometimes. It must include pip, venv, setuptools, tkinter, ssl and headers. Good luck.

- Don't use Python outside of a virtualenv. Ever. I know how to do that correctly. You probably don't. So don't. Always create a venv for each project you work on, and activate it before doing anything. From there, you can use pip and Python in peace. E.g.: Don't install black or jupyter outside of a venv. You want to use poetry? Don't let it create your venv. Create one, activate it, then install poetry with pip, then use poetry from there. I'm not kidding.

- On Windows, the command is "py -3.10 -m venv name_of_your_venv" to create it, with "-3.10" to be replaced with the version of Python you want. On Mac and Linux, the command is "python3.10 -m venv name_of_your_venv". Yes. Not a joke.

- To activate it, on Windows it will be "name_of_your_venv\Scripts\activate"; on Mac and Linux, "source name_of_your_venv/bin/activate".

Any variation on this (including using -m or pipx, which I used to recommend but don't anymore) assumes you know what you are doing.

What's the long term solution?

I give some hints here:

https://twitter.com/bitecode_dev/status/1567790837848743937

And meanwhile, take the official survey on packaging, this will help pin things down for the team:

https://www.surveymonkey.co.uk/r/M5XKQCT


Definitely conda is more problem and less solution.


https://xkcd.com/1987/

A relevant XKCD comic that encapsulates this well.


That comic does more harm than good, I believe. It's an illustration of what happens if you save 15min of learning, by doing 5h of trying random things until they maybe work by accident. If you understand how a venv works, you won't end in that situation in the first place.


It's just a comic, it doesn't have to be accurate, and I think it delivers the message.

Besides, "you understand how a venv works" is not that simple.

- if you use anaconda, do you use conda env or something else? This will have huge consequences on the availability of packages and modes of failure.

- do you use virtualenv or poetry? In that case, how do you install them? How do you run them? The answer to this will change depending on your OS.

- do you use homebrew or pyenv? Then your env may just break on you one month later; will you be able to fix it?

- do you use venv? In that case do you know about the "py" command ? Do you know about "-m"? Do you know which packages to install on linux so that it's available?

Last week I was playing with a telegram bot with a friend of mine. I asked why he didn't use a venv. He told me he could never remember how it worked.

He has been running a very lucrative Django site for 10 years.


Because you don't want to use your system's package manager apt, pacman, or whatever.


If I have a dozen projects on my machine, they will all need different versions of various dependencies. How is the system's package manager going to help me with that?


"Update your software" as I am always told as a user.

Or use better software. Or use better languages. I don't have 11 versions of ffmpeg installed despite it being required (or optional) for 11 other packages.


pypi contains 399 794 python packages.

debian repositories have 23 168 packages, all languages included.

Or are you suggesting each python package author should release one rpm, one deb, one nix, one msi, etc.? And then that the deployment script branches for every single OS there is?


Node developer here - it’s crazy that the canonical solution embraced by Python developers involves full-on virtual environments.


Is it that surprising? Wouldn't you consider node_modules/ basically a virtual environment as well? Python didn't go the node route from the beginning so virtual environments make a lot of sense to fix the issues. Node presumably drew lessons from python and other sources and went with the full recursive design from the beginning.


Yep. I set up my Node projects in the equivalent of a venv (largely because I’ve found a lot more dependency hell in Node than in Python, especially around nested dependencies that would clobber things in other projects)


But venvs are even more powerful than npm node_modules, it gives you complete control of the path. No need for “npm run” stuff.

Take a look at https://pypi.org/project/nodejs-bin/. It lets you install a specific project version of Node in a Python venv.


That's the same as what npm (node), bundler (ruby), and many other solutions do. Some details differ, but the concept is the same.


I wrote a short guide for confused node developers at work a while ago https://leontrolski.github.io/virtualenvs-for-node-devs.html



