If this project is dead, just tell us (github.com)
395 points by omni 40 days ago | 313 comments

This "project dead?" question pops up on many open source projects on github. One recent one: https://github.com/axios/axios/issues/1965

The issue seems to be similar to one that we programmers regularly encounter in our day jobs: failing to consider maintenance and/or make a plan for what happens to a project in one, two, or three years. I don't mean to say "the authors of this project failed to consider maintenance"; I mean "we, the whole community, including maintainers and consumers of the project, took maintenance for granted or were not concerned about it." Anyone who installed the tool without reading the project's governance model and maintenance plan signed up for "whatever happens happens," the governance model of most small projects. I do this all the time myself; I'm not saying I'm better by any means. To participate in the JavaScript/Python/etc. ecosystem generally requires being OK with this.

To me this isn't a question of one project, it's a question of OSS project governance in general. Is there a succession plan? How do you know when more maintainers are needed? How do you promote someone from contributor to committer to releaser?

We went around adding CoCs to everything a few years ago; perhaps a similar effort could be made with governance plans? Like a boilerplate plan projects could use, hopefully leading to a new community norm of "if you are using a project with no governance plan, that's up to you, but don't say you weren't warned!"

Broadly, "a project is being actively developed and makes releases reliably often" is a fairly tolerable maintenance plan.

It's only when that implicit contract breaks down that you get "is X dead?"

Can you suggest something a little more concrete? What should I add to my project so that if I die tomorrow the project can continue without me?

I work on a small project. I didn't add a CoC when the big push came about because I concluded that it would amount to "if I say it is bad, it is bad; otherwise it is good", which is no CoC at all, so there was no point. (I did consider asking some large project, KDE for example, to be the arbitrator, but I didn't bother.)

I haven't thought about it much and I'm not a project maintainer. As for your bus factor of 1: a) it only matters when there's actually a large group relying on your project and b) the answer is "have multiple maintainers with commit bits."

To take a shot at something "more concrete":

    * New releases will be cut at least on a quarterly schedule
    * When the number of open issues passes 25, a new maintainer shall be added (if not sooner)
    * Maintainers may leave a project at any time and are encouraged to do so if they need to move on to other things
    * Maintainers may return when they have time
    * Once per year, maintainers shall either reaffirm their desire to continue maintaining the project or step back from maintaining
    * After 2 merged PRs, a contributor is promoted to a maintainer

The main functions here are to 1) add clarity about whether anyone is actually maintaining the project, and who; 2) clarify that maintaining the project is an ongoing commitment and that it's perfectly fine to step back from it; 3) create a mechanism to add maintainers and to signal when new maintainers are needed; and 4) create some general guidelines around releases, so people know what to expect (and know when things are no longer happening).

This is certainly not a good governance plan for everyone; some may find it too prescriptive, but it seems like it could be a good start for projects like axios & pipenv. The idea is to prevent "is this project dead" type issues before they arise. When a project is losing maintainers & no one wants to step up, it's clear that the project is waning.

Of course it's also perfectly reasonable to say "that's too much work" or "f*ck off, it's my project." This convention would help people make more informed decisions when adopting a project for use. Some will be fine with "whatever", some will want a clearer plan, both are OK.

I’m in a similar situation, maintaining a project that’s used widely enough it should probably have a continuity plan, but not enough that it’s developed a robust community of other contributors who could step in.

Jazzband [0] seems meant to address exactly this: “a collaborative community to share the responsibility of maintaining Python-based projects.” And it looked promising, but it’s not entirely clear that the Jazzband project itself is all that active (only news status update is the launch announcement from 2015; last public Twitter activity was in 2017).

[0]: https://jazzband.co/

Oh, just realized that pip-tools (being discussed favorably downthread as a pipenv alternative) is a Jazzband member project: https://github.com/jazzband/pip-tools.

I actually think that accepting contributions productively is almost a full-time position and will turn you from programmer to manager.

Maintenance is something I've come to appreciate more and more every year.

It's so easy to install dependencies, but it comes with a large hidden cost and painful lessons down the line.

Yup. Further, I do a lot more research into the dependencies I bring in (if I can help it).

Does this thing also bring in 1000x sub dependencies? How much am I REALLY going to use this? What's the actual added value here?

I've seen people bring in libraries for simply the dumbest things. The worst offender I've seen is lombok brought in for a @Logger annotation on one class. That stopped our Java 6->8 migration because the version of lombok they brought in was old and it was for a highly used library (so upgrading wasn't simple).

If you can write the used functionality in 10 minutes, you should not bring it in as a dependency.

> If you can write the used functionality in 10 minutes, you should not bring it in as a dependency.

I disagree with this; it depends on the size of the dependency, plus the risk of a problem with the dependency vs the risk of making a mistake vs the risk of code drift when the same 10-min thing gets written a dozen different ways across an org.

I think you are dramatically underestimating the cost of a dependency.

The dependency may change in surprising ways in the future. The dependency may not change in expected ways in the future. (ie, now you're trapped in an old version of the language, or another one of your dependencies)

The problematic part with reinventing the wheel or NIH is that you add a fixed cost of doing business to any given change. The problem with importing the world is that you add a dynamic cost to simply existing without change.

It's a tradeoff, and it's not a tradeoff that's consistent across organizations. For some orgs it's really important to be first to market. In other orgs it's important to not break shit for your existing customers.

My org is well established in our field. Nothing matters more than retaining our existing customers. Importing a new lib requires approval from legal, which requires about a personweek. In other words, if one developer could implement the same thing in less than 40 hours, the developer should implement it themself. For us, a slow moving org, I think we're at the right place. For a startup, obviously not.

I think you may be dramatically underestimating the cost of NIH and code bloat.

I'm talking about things developers think they can do in ten minutes but that they are at high risk of getting wrong: generating SQL, parsing JSON, handling dates, etc. These are all things that have well-established libraries in most sane languages (maybe NodeJS is not sane in this regard), where everything that you save yourself from by using the library is worth even a 40 hour review process.

None of the examples you've given are 10 minute problems. It isn't reasonable to think of them as such and so it is a straw man.

10 minute problems are things like

    * is-odd - https://www.npmjs.com/package/is-odd
    * is-even - https://www.npmjs.com/package/is-even
    * leftpad - https://www.npmjs.com/package/pad

Or bringing in large libraries to use single methods or functions from them. Such as

    * Bringing in Guava for max
    * Bringing in Apache Commons for StringUtils.join

I've seen both examples in my company's code base.

A library that does JSON parsing, date handling, etc is doing a lot more heavy lifting than "is-even".

Painful lessons, yes. Instant gratification versus long-term planning. I think this applies to more than just software.

The axios example is an interesting one, because there are tons of people saying they're willing to help, but how many of those people will actually contribute anything without guidance of any kind? When the current maintainers are too busy to even provide guidance, you need to be a special kind of developer to even have a chance of becoming a real maintainer, and those kinds of developers are few and far between.

Pipenv is a very controversial project that has lost its reputation. Many times I have hit issues that made my everyday work very uncomfortable, and of course I saw issues on GitHub from other people who had the same problems. Then, in the middle of a very constructive conversation, somebody from the maintainers' team or the initial creator of the project would jump into the thread and very aggressively close it, or say something like "go f--k yourself we don't need it". Srsly? Who would like this level of conversation? It is OK if you don't have the resources or time or whatever, but we have this particular problem. Don't be rude and turn people against you.

My personal opinion is that one day the ripples on the water come full circle.

And last but not least: pip+requirements.txt are not the best, but pipenv doesn't add much over them. It gives you a little but introduces another level of abstraction over the same things, with its own level of complexity.

Can you link to one of these times a maintainer said something like that?

I've put myself in a very uncomfortable position: I need to prove my words by blaming someone who did a very good job for the community in general, or at least had good intentions. At the same time, I don't want to shame somebody in public.

I did some research and think this link is enough https://github.com/pypa/pipenv/issues/1137.

Wow... some users are subtly pressuring the maintainers into work here. The maintainer simply states "works as intended".

The users in that topic then continue the discussion in all of its aspects, expecting the maintainer to engage with them. I can imagine it feels to the maintainers as an energy-sucking discussion.

IMHO, you're not shaming the maintainers, you're shaming the community here.

Here's an archived issue thread from Open Cart when the project owner gets unreasonably aggressive https://gist.github.com/T-Spoon/5a7cca7ea11c45b63c139da009c1...

669 commits in the last year... and asking if the project is dead?

I'm a bit confused here. Maybe the project is mismanaged, or there's some upstream issue with package distro, but it seems to be far from dead.

Determining if a project is alive or dead really is a problem. And this problem will grow from year to year.

There are so many small and medium sized projects on Github, where you have no idea if they are maintained or not. Sometimes there are projects which are alive and kicking with multiple pull requests, maintainers promising changes or a major release and suddenly nothing. Sometimes smaller projects see no changes for months, but they simply work and don't need maintenance at all. Maybe the community moved elsewhere and the project will only see bugfixes coming in and no more features?

The burden of figuring this all out lies with the visitor and is an annoying hassle.

I would love if Github could somehow show me on the landing page, if the project is worth investing time in. Maybe they could send a heartbeat to the maintainer and simply show the project as dead as soon as there is no response?

You said it yourself, this is a very hard problem.

There are Java packages that haven't seen a commit in years and are still perfect for a task. There are npm packages that are 2 months old and terribly outdated and unmaintained.

My personal favorite metric is "time to maintainer response": how long does it take for a maintainer to respond to issues or pull requests. Not necessarily to resolve them, but triage issues or provide guidance on a PR.

If this happens quickly and with reasonable responses, projects are usually solid, assuming they have existed for a while and see decent usage.
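A rough sketch of how "time to maintainer response" could be computed, assuming you have already exported issues into plain dicts. The field names (`opened_at`, `comments`) and the set of maintainer logins are made up for illustration and are not any real API's schema.

```python
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


def first_response_delay(issue: dict, maintainers: set) -> Optional[timedelta]:
    """Time from opening an issue to the first comment by a maintainer, if any."""
    replies = [
        ts for author, ts in issue["comments"]
        if author in maintainers and ts >= issue["opened_at"]
    ]
    return min(replies) - issue["opened_at"] if replies else None


def median_response_time(issues: list, maintainers: set) -> Optional[timedelta]:
    """Median first-response delay over the issues that got a maintainer reply."""
    delays = [first_response_delay(i, maintainers) for i in issues]
    seconds = [d.total_seconds() for d in delays if d is not None]
    return timedelta(seconds=median(seconds)) if seconds else None
```

Note that issues with no maintainer reply are simply left out here, which flatters a project that ignores most of its tracker; counting the unanswered share separately would probably be fairer.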

> There are Java packages that haven't seen a commit in years and are still perfect for a task.

My favorite is Scriptella (http://scriptella.org/download.html) which recently just got its first update in around 7 years.

For what it does, it just worked, and for many use cases, updates were never an issue.

Author of Scriptella here. Thank you for mentioning it! The product was neglected for many years, but I never was ready to finally press the kill switch. Hoping that one day I will have more time to work on it...

I cannot promise active feature development, but I can at least keep it compatible with recent JDK versions. Let me know if you have any feature requests on https://groups.google.com/forum/#!forum/scriptella or https://github.com/scriptella/scriptella-etl

I think Scriptella is fantastic. It does one thing extremely well, and for me, that's all I can ask for.

In this case, the code is still maintained and at least some pull requests are being processed (I haven't checked in detail). The only thing is that there hasn't been an official release for some time.

These are not great metrics. There are no great metrics.

The official JSON implementation is a good example. It is a mature project. It's used everywhere. If you leave an issue or submit a PR that fixes a typo, the maintainer will flat out delete your comment and tell you to buy his book to educate yourself.

Maybe this behavior is good or maybe it's bad. But maintainers are human and metrics are not going to cleanly pick up how well a project is maintained.

How's this for a metric: number of stars divided by number of open issues.

Who cares if something has had no commits for 7 years if almost no one uses it and it has no real issues?

On the other hand, you would hope a popular library with heaps of bugs and issues is receiving lots of maintenance.
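As a sketch, the ratio is a one-liner; the only judgment call is what to do with zero open issues, which is treated as "infinitely good" here purely as an assumption.

```python
def stars_per_open_issue(stars: int, open_issues: int) -> float:
    """Stars divided by open issues; higher suggests fewer unresolved problems per user."""
    if open_issues == 0:
        # Assumption: no open issues counts as the best possible score.
        return float("inf")
    return stars / open_issues
```

For example, a project with 1000 stars and 50 open issues scores 20.0, while one with 1000 stars and 2000 open issues scores 0.5.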

> Maybe they could send a heartbeat to the maintainer and simply show the project as dead as soon as there is no response?

If I could not automate response to that, I'd ignore it.

My employer gets to make bullshit intrusions on my time like that, no way will I waste my life on make-work from Github.

Just fishing for a response wouldn't be very meaningful anyway. A far better metric would be the time it takes for a maintainer to respond to pull requests and other issues.

> My employer gets to make bullshit intrusions on my time like that, no way will I waste my life on make-work from Github.

You are not willing to tell Github whether a project which you uploaded is still active? But will you give that information to users asking for details? Which way is more annoying to you and the users?

What I want to say is, that Github will turn into a huge graveyard in the years to come, with projects long forgotten still having a landing page like it's the best thing ever. Do you want every visitor to first contact the maintainer if the thing is still alive? Isn't that a huge waste of time?

In my opinion, Github will have to clean up the mess left behind by unmaintained projects at some point. The sooner they start, the easier it will be.

I think one could get most of the way there by having a bot that: a) Autobuilds the library based on its declared dependency stack. b) Runs a test suite, and reports back breakage.

Anything that sits broken more than (say) a month can safely be considered dead. Super-stable low-level code will continue passing its tests forever, and things dependent on specific API versioning in left_pad will be marked dead quickly...
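The "broken for more than a month" rule could look roughly like this, assuming the bot keeps a chronological list of `(date, passed)` build results. The 30-day limit and the data shape are assumptions for the sake of the example.

```python
from datetime import date, timedelta


def is_dead(results: list, today: date, limit: timedelta = timedelta(days=30)) -> bool:
    """True if the build has been failing continuously for longer than `limit`.

    `results` is a list of (date, passed) tuples, oldest first.
    """
    broken_since = None
    for day, passed in results:
        if passed:
            broken_since = None      # a passing build resets the clock
        elif broken_since is None:
            broken_since = day       # start of the current failing streak
    return broken_since is not None and today - broken_since > limit
```

A project that has never been built at all is not flagged by this sketch; whether "no data" should count as dead is a separate decision.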

> The burden of figuring this all out lies with the visitor and is a annoying hassle.

I usually find it to be a much smaller hassle than recreating the functionality myself.

Take SpeedCrunch for example: the current release (0.12) dates from 2015 AFAICS, and works very well for the most part.

There's a bug in that release though: it thinks 0^1 is NaN. That bug was fixed in or before 2018 (https://bitbucket.org/heldercorreia/speedcrunch/issues/836/0...), but there has not been a release since. There have been a number of commits this year, but not a lot.

Is the project dead or not? I honestly couldn't tell.

Not sure about the project, but in a case like that the project management appears to be dead for sure, because releases should not trail vital bug fixes like that. Same for pipenv.

Those 669 commits without a single release are pretty useless to the normal users.

What would be the difference between 0 commits without a release and 669 commits without a release?

The 0 commit project is dead or finished.

The 669 commit project is clearly not finished, but also not really dead, yet.

Well for one thing it means the project isn't dead, whether or not the release is ready.

He talks about maintenance and making releases. Otherwise people will still deal with bugs in the "current" release that have been fixed in the master branch for a long time.

That would be mismanagement (and annoying of course), not "dead" I would think?

There is no difference for the average user. I'm not going to use the code in master, I will use a package from pip, pipenv or otherwise.

And even if you did, there's no guarantee that the work in master is actually in a sane state. Some try to follow models like master is always runnable, or tested/built by ci/cd, but it's often simply not the case.

I've worked on at least a few open source projects where I'd pull and build master, trying to obtain a particular bugfix, and it would fail, and I'd have to randomly roll back commits until I found a working build.

In one case the project had a separate repo for a library that wasn't kept entirely in sync with the parent project. I had to randomly move back both repos until I found a matching pair that built and ran successfully, but after the bugfix was implemented.

Releases are important for anyone who isn't working on the repository directly -- they guarantee the project is in a sane state.

There is git-bisect for that task, fyi. If you can quickly script the build and test process, it's an easy way to not be bothered manually.
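For what it's worth, a script for `git bisect run` only has to report each commit through its exit code: 0 means good, 125 means "can't test this commit, skip it", and other codes from 1 to 127 mean bad. A minimal sketch in Python, where the `make build` and `make test` commands are placeholders for whatever the project actually uses:

```python
import subprocess

# Hypothetical commands; replace with the project's real build and test steps.
BUILD_CMD = ["make", "build"]
TEST_CMD = ["make", "test"]

GOOD, BAD, SKIP = 0, 1, 125  # exit codes understood by `git bisect run`


def classify(build_ok: bool, test_ok: bool) -> int:
    """Map build/test outcomes to a bisect exit code."""
    if not build_ok:
        return SKIP  # a commit that doesn't build tells us nothing about the bug
    return GOOD if test_ok else BAD


def run(cmd: list) -> bool:
    """Run a command and report whether it exited successfully."""
    return subprocess.run(cmd).returncode == 0


def main() -> int:
    build_ok = run(BUILD_CMD)
    test_ok = run(TEST_CMD) if build_ok else False
    return classify(build_ok, test_ok)
```

Saved as, say, `bisect_step.py` (a made-up name) with a final line `raise SystemExit(main())`, it would be driven by `git bisect start`, then `git bisect bad` and `git bisect good <rev>` to mark the endpoints, then `git bisect run python bisect_step.py`. Skipping unbuildable commits is what saves you from the random rollbacks.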

It may be abandoned by everyone with release permissions.

Most of the 2018 releases were by techalchemy, who made commits in July.

Right now, it seems like the issues raised in [0] are still a problem. The release pipeline has problems, and dragging through that isn't a priority.

[0] https://github.com/pypa/pipenv/issues/3742

It's definitely tapered off, though: https://github.com/pypa/pipenv/graphs/commit-activity

There's no activity in December. There was one commit in November, and that removed a Fedora version number from the readme.

Similarly, Slackware Linux's last stable release was 14.2, in June, 2016, but their -current release tree was last updated.... yesterday.

Now, I could be super huffy that they haven't released in years, but this is open source software that I'm not paying for. Moreover, I have the power to choose multiple alternatives. I can run the -current release. I can make my own "pseudo-release" by just forking here and calling this "14.2.1". Or I can use a different distro.

It would be really entitled of me to demand Patrick make a new release just because I want one, especially when he doesn't owe me anything, and I have multiple alternatives. And besides, the project is still going, it's just not doing what I want it to do; and that's Pat's prerogative.

I mean, you're right on all points, yet I confess I would be a lot more willing to give Slackware a try again if it would return to having a more-or-less annual official release, like it managed from its start in 1993 up through 2013. (Most years prior to 2000 it had more than one release a year!)

There are things I (dimly) remember genuinely liking about Slackware in terms of its philosophy, but as near as I can tell the official installation method is "install the most recent ISO, then update from there," and when the most recent ISO is three and a half years old, that's not a great look. We've now gone the longest length in Slackware's history without even a point release.

I wouldn't demand Patrick make a new release, but I don't think it's wildly unreasonable to expect one by now. Bringing it back to the original article, I think the same can be said of pipenv.

Are those Github "releases" or git tags? Or are they using a specific branch for that?

Clementine the music player is my go-to example of this phenomenon. It's had 750 commits since its last release in 2016, and there is regular merging, but the project management is in a place where trying to deploy builds to every platform is acting as a barrier to any new releases.

It recently finally merged its Qt5 support into master (which was in a working state since 2014) about two months ago. That might help the release process.

What doesn't help is that a consumer facing product of that scale is a nightmare to manage - it has over 2k open issues on Github but probably 95% of them are "I don't know how to do X" tech support kinds of things.

From the post itself, the symptom is: people are committing, but releases are not being made.

Remember that "releases" are a Github-specific feature, not a git one.

I mean releases as in, a point in time snapshot of the code that has been declared fit to use, and released for distribution to the public. Regardless of the software used to prepare this. As distinct from just pulling the latest master branch and praying it's stable.

Indeed! It doesn't seem dead at all. While having a new release could be nice, I must also say I didn't face much trouble with the last release (I do Ansible work & use pipenv+pyenv to freeze the dependencies). I'm grateful to pipenv in how it has helped me here.

Meanwhile, Poetry just announced 1.0: https://python-poetry.org/blog/announcing-poetry-1-0-0.html

The 1.0 release is a must for a working poetry environment. The people handling it are friendly and communicate over github and discord. However, it's not all roses.

- VSCode supports poetry as a second class citizen
- Documentation is there, but only in github issues
- There is no migration path from pip and pipenv to poetry
- I can't do releases via CI because of a poetry bug.

DepHell claims to be able to convert to and from pip, pipenv and poetry, so perhaps that could be used for migrating?


We love DepHell. I migrated some work repos from Pipenv (and plain ol' pip) to Poetry. However, we didn't want to have a flag day where we updated our build tooling to be 100% Poetry, so I made a Makefile target that builds requirements.txt and setup.py from pyproject.toml. Now developers can work with pleasant tooling, but the build system can use the old stuff it already knows.

We're close to having everything migrated to Poetry. When that day comes, we can throw out all the compatibility stuff, update the build server, and be happy. Until that day, DepHell gives us an easy compatibility layer so that we don't have to do the migration all at once. It's awesome.

Ah hell, I wish I had known about this tool before migrating.

Thank you so much for pointing it out. I've added it to this issue.


There is also a bug that is keeping me from switching, and also the handling of that bug has not been great, otherwise I would do so in a heartbeat.

Huge fan of poetry. I highly recommend people try it out. It feels like cargo, but for python.

fandom always leads to these hollow hypes that end in dead projects. we do not need that in programming.

Sure we do. The whole software industry is built on hype cycles. If we didn't have them, we'd get paid normal wages and couldn't claim we're eating the world

Replace the word fan with supporter, which is more in line with what I was thinking, and is certainly needed in programming.

Speaking as a maintainer of large open-source projects, I know what it's like to get stuck working on a release branch (which is sometimes just the `master` branch) that takes forever to become an official release. I'm not sure if that's the story here, but there have been a lot of commits since the last release, so maybe?

In any case, I strongly recommend publishing alpha/beta releases along the way, without the ceremony of an official release, so that folks can give feedback during the long months that fly by when you don't have nearly enough time to spend on the project.

Respect the needs of your conservative users by not publishing official releases until they're really ready, but trust your more engaged users to use whatever you have right now, and let you know what's working/broken, without judgment.

Not dead yet.

Technically, that applies to every one of us, and every product, or service, we use...

Technically, there is an infinitesimal chance that entropy spontaneously reverses permanently and that nothing ever dies.

Pipenv has spawned so much controversy. One big shitfest.

Python packaging in general is such a messy ecosystem

Do you think Poetry solves this package management mess?

Poetry is the solution, IMHO.

I’m sure poetry has bugs and it could have other problems (not sure), but if you look into how it works, from a software design / philosophy perspective, it’s absolutely amazing. It might be worth waiting a bit longer before you use it in a large production project. Once Heroku adds native support to the Python buildpack, and some other mainstream avenues add support, Poetry usage will increase enough to create the necessary scrutiny that will bring it up to true production-grade quality.

When you add a package in Poetry it will add the latest version of that package’s dependencies that also satisfy all the other packages (and their dependencies’ dependencies).

If you remove a package, Poetry will remove any dependencies of that package (actually uninstall them, rather than just leaving junk behind like pipenv), and will reestablish the latest versions of any other dependencies whose versions were held back by the package you just removed or any of its dependencies...

Also, when you add a package with Poetry it doesn’t need to completely lock and install everything from scratch; only what you have just added and whatever its dependencies dictate; nothing more.

Also, you can tell Poetry to list off all the versions of your packages and the latest versions available (in case you’re like me and you pin all of your top level packages; having to Google “pypi django”, “pypi requests”, etc., to see if there are any newer versions worth upgrading to is a pain in the ass).

I feel like it might, if it replaced pip and got full endorsement from the python org. It's a fantastic project, but until it has complete backing it's just a technical risk to include in a project.

To a point. Poetry still isn't fully on board with being a project manager; it focuses too much on packaging. For example, it doesn't do project scripts like npm scripts.

It doesn't support shell scripts which kind of sucks, but could be reasonable if it is expected to run on windows as well as posix. This is the scripts section from one project I used Poetry in. It seems to be fine for most things, even though the focus on managing a python module for publication is a little bit annoying. Once or twice I've run into version resolution troubles, but generally it's stable and much more ergonomic+easier to explain than pip+venv. One can of course use `poetry run pip freeze` to generate the standard `requirements.txt` file

   [tool.poetry.scripts]
   # NB: Poetry won't run arbitrary callables, so these are wrapped in a python script.
   # Poetry needs the root dir to be a package (e.g. module `my_repo` must exist) in order to run scripts
   # So they are just shoved into a dummy "my_repo.py" instead of going in e.g. "scripts.py"
   # Adds Git hooks from `git_hooks` path (overwrites existing!)
   hookup = "my_repo:hookup"
   # Run formatter+linter, does not modify anything, errors if fail
   lint = "my_repo:lint"
   # runs `black` (formatter), modifies stuff
   format = "my_repo:format"
   # A list of files which failed linting. Most helpful with flake8 vim integration https://github.com/nvie/vim-flake8
   lintyfiles = "my_repo:lintyfiles"
   # unit tests
   utest = "my_repo:utest"
   # unit + integration tests in container
   citest = "my_repo:citest"
   cutest = "my_repo:cutest"

To be fair, NPM really doesn't do a great job at project scripts

Well, I'd say it makes it so much smoother, almost 'not a pain' anymore.

Python packaging is broken by convention. A million snowflake packages with no clear model for reuse and extension and a whole lot of unmaintained mess, very little of it organized or uniform. Regardless of what tooling you wrap around it, the ecosystem will always be a mess.

> Python packaging in general is such a messy ecosystem

Not just messy. It’s probably worst in class.

I at least can’t come up with a single worse package-management story that I know of.

Edit: I’m talking about platforms with actual package-management which sucks, not platforms with the absence of package-management altogether.

As someone who's been in software development for the better part of two decades now I would say that's far from the truth.

Yes, it's not great right now, and it's not completely clear whether I should recommend pip+venv, pipenv, pyenv, poetry or conda to a new user, and that's a big problem. But Python was actually quite early to handle dependencies and packaging in a standardized and structured way. I remember that most of the projects I worked on early in my career relied more or less completely on vendoring dependencies; if packaging systems existed, they were either very complex or the one that came with your OS/distro.

However Python currently needs to catch up, we need some alignment and a clear path forward, it would also be great to have an official way of building artefacts. There's plenty of ways today but no great blessed way to recommend a newcomer.

> As someone who's been in software development for two the better part of two decades now I would say that's far beyond the truth.

As someone who has had to deal with Python's packaging a lot, I would say that it's really, really bad. Maybe not the worst in the world, but it is much closer to the worst than to the best.

There are 15 ways to package Python modules, none of which are feature complete (but a lot of them pretend to be). Every year or so someone decides that they need to write another packaging tool for Python, and then they give up before it is feature complete.

A tool like virtualenv is useful, but it overpromises and underdelivers (this is the overarching theme of Python packaging imo): it does not completely separate your environment from the system. E.g. virtualenv still uses the host system python packages cache, and it does something rather nasty with .so's that get copied from the host system...

And don't get me started on dependency management in Python. When the dependencies of your dependencies make breaking interface changes, you're in for a world of pain.

I find it amusing, in a sad way, that the language that prides itself on "there is one way to do it" screwed up so badly in the packaging department by not having one good way to do it. Meanwhile the language whose motto is "there is more than one way to do it" has a standard, sane way to package things, with multiple tools that work together in a coherent way. Perl got this one right.

> Meanwhile the language whose motto is "there is more than one way to do it" has a standard, sane way to package things, with multiple tools that work together in a coherent way. Perl got this one right.

I was always impressed by CPAN compared to most other package managers I've used. And, yeah, between Python, Perl, Node, Ruby, and even PHP, Python is by far the most inscrutable and the most prone to giving me fits when existing Python apps like Lektor just... inexplicably break after one of the Homebrew versions of Python gets updated.

Oh, glad that I wasn't the only one appalled at how this breaks Python "rules" !

To add to the confusion, there’s also setuptools, easy_install, wheels, eggs, ...

I mainly use virtualenv and would recommend that to people until they have this problem trying to use matplotlib on osx: https://matplotlib.org/3.1.0/faq/osx_framework.html

osx includes a "non-framework" build of python making it hard to use matplotlib

Actually just now finding out about "venv" in the standard library introduced in python 3.3
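For anyone else just discovering it, here is a minimal sketch of driving the standard-library `venv` module from Python itself. The helper names `create_env` and `in_virtual_env` are made up for illustration; only `venv.create`, `sys.prefix` and `sys.base_prefix` come from the standard library.

```python
import os
import sys
import venv


def create_env(path: str) -> str:
    """Create a virtual environment at `path` and return its absolute path."""
    # with_pip=False keeps this sketch fast and offline; a real project would
    # usually pass with_pip=True so that pip is available inside the env.
    venv.create(path, with_pip=False)
    return os.path.abspath(path)


def in_virtual_env() -> bool:
    """Return True when the running interpreter lives inside a venv/virtualenv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the interpreter it was created from.
    # Older virtualenv releases set sys.real_prefix instead.
    base = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base or hasattr(sys, "real_prefix")
```

On the command line the equivalent is `python3 -m venv <dir>`, followed by activating the environment.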

Maven basically solved Java's dependency problems back in 2005.

Serious question: Isn't maven the reason that you can't really package Hadoop? It is my (high level) understanding that it is at the core of (or a big part of) why Hadoop is so hard to build on your own vs VM.

Hadoop is hard to package because build scripts are treated as second hand citizens for most projects. There's a high standard of code reviews for many projects, but for the build scripts it's "it works, merge it".

I've built quite big projects with Maven (>2 million LOC) without issues.

Well, C/C++ dependencies are also a hell to reliably setup on different platforms.

But C/C++ is messy because it doesn’t have a language/platform-provided package-management system, while python actually does.

So that’s apples to no oranges, I guess. And it still doesn’t leave python looking particularly good.

But Python is a mess because initially it didn't have such a system either. Then a bunch of them were introduced, and some of them became official, but there was never one that solved all the problems.

I would consider the packages shipped by most distros as platform-provided package-management. It's not specific to C/C++, but if I'm developing something for say, Debian, I'm going to use as many system-provided libs as possible.

Have you tried Conan[1] by chance? I've only used it for small projects, but it was pretty nice.


I thought NPM was everyone's favourite hated package manager/repository?

The faults aren't so much with the actual NPM software as with the whole ecosystem. The real root of the problem IMO is that Javascript has such a tiny std lib compared to other popular languages that have package management systems. This encourages lots of people who are missing various functions normally found in std libs to write packages implementing various combinations of those functions. Those who are writing larger packages then depend on various combinations of those little helper packages. So of course, if you need to use multiple large packages to do something useful, they'll tend to pull in a huge forest of tiny packages in a bunch of versions.

NPM itself was also not great, but did become great over the last few major versions. The weakness of the stdlib is still a problem, but it is very, very slowly getting better with the new ECMAScript and Node.js versions.

Hit the nail on the head.

left-pad am-i-right?

It clearly has its faults, but it is simple to use, requires no “venvs”, is CI-friendly and mostly does what it’s supposed to with few surprises.

Much unlike how things work with python.

And it comes from a time, where a 2MB-webpage is considered acceptable. The whole idea of venvs is basically still considering some reuse. If you shout out loud "SCREW YOU LIBRARY REUSE", then npm is perfect of course...

It is, at least on HN, but it is also one of the best overall.

Yes, but for opposite reasons. You probably don't get left-pad in an ecosystem without one — and preferably only one —obvious way to publish and use it.

I might be alone but I hate dealing with Java’s package management system.

What would be an example of a good packaging ecosystem?

Probably cargo.

- It was built in to Rust from the beginning and officially sanctioned so fifteen different people don't have to build their own incomplete, buggy package managers.

- You can add plugins for things like automatically updating or adding dependencies.

- It handles projects and subprojects.

- All you have to do to install dependencies and run a project is "cargo run", reducing friction for getting into new projects.

- For the most part, it just works.

That said it isn't perfect, especially when needing custom build scripts, but it's good.

Golang is probably the worst. Python a close second.

Cmake? (And especially Kitware Superbuild... try pinning packages there!)

Isn't it a build tool, not package manager?

It is used as package manager often enough -- stick "git clone", or ExternalProject into the CMakefile and you have the (bad) equivalent of "pip install" for C++.

For example, I mentioned "kitware superbuild" above. It describes itself [0] as:

> It is basically a poor man’s package manager that you make consistently work across your target build platforms

[0] https://blog.kitware.com/cmake-superbuilds-git-submodules/

I thought NPM was everyone's favourite hated package manager?


C/C++'s package management story is "defer it to distributions". Personally I much prefer `apt-get install libfoo-dev` to 1) each language inventing its own incompatible system, and 2) each developer self-publishing, so there's little to no safety or accountability when adding a new dependency.

So, to be clear, I also tend to prefer distro-supported packaging. However. Forcing everything to go through the distro means that you massively limit what's available by raising the barriers to entry, you slow pushing new versions (ranging from Arch's "as soon as a packager gets to it" to CentOS's "new major version will be available in 5 years"), and you lock yourself into each distro's packages and make portability a pain (Ubuntu ships libfoo-dev 1.2, Arch ships libfoo-dev 1.3, CentOS has libfoo-devel 0.6, and Debian doesn't package it at all). When distro packages work, they're great, but they do have shortcomings.

> limit what's available by raising the barriers to entry

But that’s what you would have to do yourself anyway. You can’t use all fresh upstream versions of everything, since they don’t all work together. So some versions you’ll have to hold off on, some other versions might require minor patching. But this is exactly what distro maintainers do.

That's a fair point. I live in embedded world most of the time, and using third party libraries in that space is not always so easy :)

> Personally I much prefer `apt-get install libfoo-dev`

So, what do you do when different projects need different versions of libfoo? Right, you download and build it locally inside the project tree and fiddle with makefiles to link to the local version rather than the one installed by apt-get. So basically you do your own dependency management. Good luck with that.

> So, what do you do when different projects need different versions of libfoo?

Well, typically I would rely on the distro to compile other people's projects as well, and then the distro maintainers would sort that out. That may include making a patch to allow older projects to use newer versions of a library; or if the library maintainer made breaking changes, it might mean maintaining multiple versions of the library.

Obviously if I myself need a newer version than the distro has, I may need to work around it somehow: I might have to build my own newer package, or poke the distro into updating their version of the library.

I mean, honestly, I don't build a huge number of external projects (for the reason listed above), and so I've never really run into the issue you describe. It seems to me that the "language-specific package" thing is either a side-effect of wanting basically the same dev environment in Windows and MacOS as on Linux, or of people just not being familiar with distributions and seeing their value.

> So, what do you do when different projects need different versions of libfoo?

That’s an untenable situation. The package which depends on the older version of libfoo is either dead, in which case you should stop using it, or it will soon be updated to use the newer version of libfoo, in which case you’ll have to wait for a newer release. This is what release management is.

> The package which depends on the older version of libfoo is either dead, in which case you should stop using it, or it will soon be updated to use the newer version of libfoo

So you are suggesting that every time libfoo bumps its version I have to update the dependencies on all of my projects to use the latest, find and fix all the incompatibilities, test, release and deploy a new version? Seriously?

Yeah, I mean what's the alternative? Your code will bit rot if you don't keep up. You don't have a living software project if you don't do this.

You should read the release notes of the new version of your dependency, fix any obvious issues from that, see if your tests pass, and wait for bug reports to roll in for non-obvious things not caught by automated tests. Ideally you should do this before the new release of the dependency hits the repos of the distros most of your users use, so that it's only the enthusiasts that are hit by unexpected bugs.

Even if you could delay and batch the work together every several releases of a dependency, you're still doing the same amount of work, and it's usually simpler to keep up bit by bit than all at once.

One trick is to not use dependencies that have constant churn and frequent backward-incompatible changes, and to avoid using newly introduced features until it's clear they've stabilised. When you choose dependencies, you're choosing how much work you're signing up for, so choose wisely.

Of course you could go the alternate route and ship all dependencies bundled - but that is a way to ignore technical debt and accidentally end up with a dead project.

Also, your project should not demand a specific version of `libfoo`. If `libfoo` follows semver and a minor release breaks your project, that is a bug in `libfoo`. Deployments of production software should pin versions, but not your project itself.
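To make the "range in the project, exact pin in the deployment" distinction concrete, here is a rough sketch of a semver-style compatibility check. It assumes plain `MAJOR.MINOR.PATCH` version strings and ignores pre-release tags and the special rules for `0.x` versions; the function names are made up for illustration.

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_compatible(required: str, installed: str) -> bool:
    """True if `installed` is at least `required` within the same major version."""
    req, inst = parse(required), parse(installed)
    # Same major version means no intentional breaking changes under semver;
    # the tuple comparison checks that we are not older than what we need.
    return inst[0] == req[0] and inst >= req


# A project would declare "libfoo 1.2.0 or any compatible release":
print(is_compatible("1.2.0", "1.3.5"))  # newer minor release, accepted
print(is_compatible("1.2.0", "2.0.0"))  # new major release, rejected
```

A deployment, by contrast, would record the single exact version it was tested against.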

> You don't have a living software project if you don't do this.

What I might have is a working piece of software that is an important part of company infrastructure. For mission critical software reliability is a much more important metric than being current. Unless there is some really compelling reason to update something it should not and will not get updated. There are mission critical services out there running on software that hasn't been changed in decades and that is a good thing.

> Also, your project should not demand a specific version of `libfoo`. If `libfoo` follows semver and a minor release breaks your project, that is a bug in `libfoo`.

Who the fuck cares if it is their bug or not? I need my service working, not play blame games. And if I have a well tested version deployed, why the hell would I want to fuck with that? And if I have tested my service when linked against libfoo I had better make sure that this is the version I have everywhere and that any new deployments I do come with this exact version.

But if I start a new project, I might want to use libfoo 3.14.15.whocares4 because it offers features X, Y and Z that I want to use.

Exactly. This is what we signed up for when we release software and commit to keeping it maintained. This is what we do.

If you instead just like writing software and throwing it over the wall/to the winds, you are an academian in an ivory tower, and have no connection to your users in the real world.

> you are an academian in an ivory tower, and have no connection to your users in the real world

Quite the opposite, actually. My responsibility is to my users. And that responsibility is to keep the software as stable as possible. So the only time I will consider upgrading my dependencies is when reliability requires it. If libfoo fixes some critical bug that affects my project, yes, maybe I should upgrade (although I am running the risk of introducing other regressions). If libfoo authors officially pronounce end of life for the version of libfoo I am using, maybe I should consider upgrading, even though it is safer to fork libfoo and maintain the well tested version myself. But it is irresponsible to introduce risk simply to keep up with the version drift. So if my project is used for anything important, I should strive to never upgrade anything unless I absolutely must.

> I should strive to never upgrade anything unless I absolutely must.

It seems that the choice is whether to live on the slightly-bleeding edge (as determined by “stable” releases, etc), or to live on the edge of end-of-life, always scrambling to rewrite things when the latest dependency library is being officially obsoleted. I advocate doing the former, while you seem to prefer the latter.

The problems with the former approach are obvious (and widely seen), but there are two problems with the latter approach, too. Firstly, you are always using very old software which does not use the latest techniques, or even reasonable techniques. This can even be considered a bug: MD5, for example, was better than what preceded it, but much software used it as a be-all-and-end-all hashing algorithm, which later turned out to be a mistake. The other problem is more subtle (and was more common in older times): it’s too easy to be seduced into freezing your own dependencies, even though they are officially unsupported and end-of-lifed. The rationalizations are numerous: “It’s stable, well-tested software”, “We can backport fixes ourselves, since there won’t be many bugs.” But of course, in doing this, you condemn your own software to a slow death.

One might think that doing the latter approach is the hard-nosed, pragmatic and responsible approach, but I think this is confusing something painful with something useful. I think that doing the former approach is more work and more pain from integration, and the latter approach is almost no work, since saying “no” to upgrades is easy. It feels like it’s good since working with an old system is painful, but I think one is fooling oneself into doing the easy thing while thinking it is the hard thing.

The other reason one might prefer the former approach to the latter is that by doing the former approach, software development in general will speed up by all the fast feedback cycles. It’s not a direct benefit; it’s more of an environmental thing which benefits the ecosystem. Doing the latter approach instead slows down all feedback cycles in all the affected software packages.

Of course, having good test coverage will also help enormously with doing the former approach.

> but I think this is confusing something painful with something useful

There is software out there that absolutely cannot break. Like "if this breaks, people will die". Medical software, power plant software, air traffic control software, these are obvious examples, but even trading and finance software falls into this category, if some hedge fund somewhere goes bankrupt because of a software bug real people suffer.

It doesn't matter how boring, inefficient and outdated these systems are. It doesn't matter how much pain they are to maintain and integrate. These are systems you do not fuck with. A lot of times people who do maintenance of these don't even fix known bugs in order to avoid introducing new ones and to avoid the rigorous compliance processes that have to be followed for every release. Updating the software just to bump up some related library to the latest version is simply not a thing in this context.

I am not working on anything like this. I work on a lighting automation system. If I fuck up my release, nobody is going to die (well, most likely, there are some scenarios), but if I fuck up sufficiently, a lot of people will be incredibly annoyed. So I have every version of every dependency frozen. All the way down the dependency tree. I do check for updates quite often and I make some effort to keep some things current, but some upgrades are simply too invasive to allow.

> I am not working on anything like this. I work on a lighting automation system. If I fuck up my release, nobody is going to die (well, most likely, there are some scenarios), but if I fuck up sufficiently, a lot of people will be incredibly annoyed. So I have every version of every dependency frozen. All the way down the dependency tree.

Most people’s systems are not that special that they absolutely need to do this, but it feeds one’s ego to imagine that it is. And, as I said, it feels more painful, but it’s actually easier to do this – i.e. being a hardass about new versions – than to do the legitimately hard job of integrating software and having good test coverage. It feeds the ego and feels useful and hard, but it’s actually easy; it’s no wonder it’s so very, very easy to fall into this trap. And once you’ve fallen in by lagging behind in this way, it’s even harder to climb out of it, since that would mean upgrading everything even faster to catch up. If you’re mostly up to date, you can afford to allow a single dependency to lag behind for a while to avoid some specific problem. But if you’re using all old unsupported stuff and there’s a critical security bug with no patch for your version, since the design was inherently buggy, you’re utterly hosed. You have no safety margin.

This is the danger of living on the edge of EoL, as you called it. Sooner or later you are forced to update a package, usually at short notice and at the most inconvenient time, due to some zero-day vulnerability found in one of your dependencies. Then the new version no longer supports the old version of a subdependency, which you also pinned, so you have to update the subdependency as well. And to upgrade that package you have to update yet another subdependency, and so on.

Suddenly a small security patch forces you to essentially replace your whole stack. If your tests flag any error, you have no idea which of the updated subpackages caused it, because you have replaced all of them. Eventually you accumulate so much tech debt that it's tempting to cherry-pick the security patches into your packages instead of updating them to mainline, sucking you even deeper into the tech debt trap.
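The cascade can be sketched as a walk over a dependency graph. Everything here is hypothetical: the package names and the `FORCES_UPGRADE_OF` mapping just stand in for "the patched release of X no longer works with the old pinned release of Y".

```python
from collections import deque
from typing import Dict, List

# Hypothetical data: upgrading the key forces an upgrade of each listed package.
FORCES_UPGRADE_OF: Dict[str, List[str]] = {
    "libfoo": ["libbar", "libqux"],
    "libbar": ["libbaz"],
    "libqux": [],
    "libbaz": [],
    "libother": [],
}


def forced_upgrades(start: str, graph: Dict[str, List[str]]) -> List[str]:
    """Return every package that must be upgraded once `start` is upgraded."""
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for dependency in graph.get(current, []):
            if dependency not in seen:
                seen.add(dependency)
                order.append(dependency)
                queue.append(dependency)
    return order


# One security patch to libfoo drags three more pinned packages along with it.
print(forced_upgrades("libfoo", FORCES_UPGRADE_OF))
```

The longer the pins have been frozen, the more edges a graph like this tends to have, which is why the single patch ends up touching most of the stack.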

Integrating often means each integration is smaller, less risky, and easier to debug when a failure happens. Of course this assumes you have good automatic test coverage, which I assume you do if the systems are as life-critical as the parent claims them to be.

There's also a big difference between embedded and connected systems here. Embedded SW usually gets flashed once and then just does whatever it is supposed to do. Such SW really is "done": there is no need to maintain it or its dependencies, because it's not connected to the internet, so zero-days or other vulnerabilities are not really a thing.

Conan is actually quite nice. I've been porting a series of projects to it and it's been a pleasant experience - Conan is very flexible, the documentation is thorough and the developers are very responsive on Slack/GitHub. https://conan.io/

maybe for web-development and ML. For my core scientific workflows I have about, uuuh, 10(?) libraries. And those are generally stable enough to upgrade 1 without breaking the other. Seems like the problem is with the users.

Just curious, you don't find learning entire languages hard or confusing but suddenly writing 3 commands to set up a venv is too 'messy'?

Not everyone uses the same 3 commands, which means that for a new Python dev, finding out about venvs isn't as immediately obvious as it should be. Even then it's (sadly) shoved in as an optional thing.

It's a pain in the ass when you have colleagues who only know JS or HTML/CSS or whatever and they need to install the project to run it on their machines, and you have to explain "virtual envs" and other bullshit that doesn't exist in sane packaging ecosystems to get them up and running.

I think the challenge is that some dependencies have sub-dependencies that need to be different versions, and venv doesn't handle that automatically.

>you don't find learning entire languages hard or confusing


Biome. We no longer say ecosystem, we say "Biome".

OK so we're sidetracking here, but I have never seen anyone use "biome". Why change from "ecosystem"?

Curious as well. The thesaurus lists them as synonyms, though it appears there are some minor differences in meaning when used as scientific terms.

I see a lot of comments about "hey it's open-source, just fork". The reason people feel upset is because this project was shilled hard when it was released. The python packaging team was officially recommending it, stuff like that. There was some backlash because of legitimate usability concerns with the software, and what was perceived as the tacit blessing of a project solely due to the maintainer's having written another popular project.

> the maintainer's having written another popular project.

Potentially relevant: https://vorpus.org/blog/why-im-not-collaborating-with-kennet...

Holy crap. I had never heard of anything like that regarding him

Wow, I didn't know it became that bad. He came across as an egotistic person, but I didn't know he's more than that...

That's the real issue here: prominent, established projects recommending other projects that are not remotely as well established, when such a recommendation may suggest that they are.

Big, high exposure projects should probably wait to recommend interesting new projects for actual use until they are sure that the management of that project is in reliable hands with enough backup to continue it if the original creator vanishes.

I think that's the default for most high exposure projects. Unless, of course, both projects have the same author, and that author's primary concern is increasing their own exposure.

The cult of personality that is being nurtured around certain members of the Python community definitely harms the Python ecosystem, because solutions with glaring flaws get a pass when they are stamped with someone's name, and the mismanagement of projects is rarely addressed.

Then there's also the issue of such members of the community exploiting the human tendency to worship others.

Do you think this is unique to the Python community? Any thoughts as to why it might be so?

You said the "project was shilled hard when it was first released". Why use a word with such strong conspiratorial connotations? Couldn't you just as easily argue that a lot of attention was given to the project without implying that it was because of a shill conspiracy?

What concrete evidence leads you to the conclusion that coordinated shills are responsible for giving the project attention?

> The python packaging team was officially recommending it

NO! This is a common misconception[1].

Edit: Correction it's not endorsed by the core Python team but it's recommended by the Python Packaging Authority in various places. See replies below for more info.

[1] https://chriswarrick.com/blog/2018/07/17/pipenv-promises-a-l...

Look, if you can’t trust the endorsement of the organization that otherwise develops and maintains all of Python’s official packaging tools, who are you supposed to trust? Guido and only Guido?

> The thing that made it “official” was a short tutorial [1] on packaging.python.org, which is the PyPA’s packaging user guide. Also of note is the Python.org domain used. It makes it sound as if Pipenv was endorsed by the Python core team. PyPA (Python Packaging Authority) is a separate organization — they are responsible for the packaging parts (including pypi.org, setuptools, pip, wheel, virtualenv, etc.) of Python

True. This same tutorial cites pip-tools, hatch, poetry. Does that mean that they “endorse” these tools as well?

However, over here [0] I see pipenv, but not those:

> Use Pipenv to manage library dependencies when developing Python applications. See Managing Application Dependencies for more details on using pipenv.

> Consider other tools such as pip when pipenv does not meet your use case.

That recommendation is absolutely unambiguous.

[0] https://packaging.python.org/guides/tool-recommendations/

Right, I stand corrected.

So if it isn't supposed to be officially endorsed, should someone file an issue at https://github.com/pypa/packaging.python.org/issues?

It hasn't been removed, because PyPA disagree [0].

> It's an official project of PyPA still, it's just not being pushed as the "be-all-end-all" option.

[0] https://github.com/pypa/packaging.python.org/issues/589

> This same tutorial cites pip-tools, hatch, poetry. Does that mean that they “endorse” these tools as well?

Those tools are relegated to a small section at the end, whereas the main body of the text says "Pipenv is recommended for collaborative projects..."

I'd say that counts as a specific endorsement of Pipenv. I guess it's an endorsement by the "Python Packaging Authority" rather than "Python Core" but it's pretty hard for someone at the tutorial-reading level to perceive the difference.

It’s officially recommended on packaging.python.org [1]. If that’s not enough to count as an official endorsement, what is?

[1] https://packaging.python.org/guides/tool-recommendations/

Those pages are managed by the "PyPA" group, not technically python.org itself, although they were given a subdomain there. Start here and follow the links for more info: https://hynek.me/articles/python-app-deps-2018/

I’ve heard good things about pip-tools: https://github.com/jazzband/pip-tools

We also recently switched from Pipenv to pip-tools and so far it has been very pleasant.

Our workflow:

- Use pyenv to manage python versions (mostly works pretty well)

- In the beginning, use builtin python 3 tooling to create virtual env for the project: "python3 -m venv venv"

- Whenever needed, add new libs to "requirements.in"

- Run the "pip-compile" command to generate a new "requirements.txt" with the dependency and its sub-dependencies pinned by default to the exact version

- Run "pip install -r requirements.txt" to install the new package and its sub dependencies

- Check both requirements.in and requirements.txt into version control

The big advantage is that requirements.in specifies just the packages you care about, while requirements.txt has your packages and all of the sub dependencies pinned to the exact version.
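As a rough illustration of that split, here is a small sketch that checks whether a compiled requirements.txt really has everything pinned with `==`. The regex is a simplification of the real requirements-file grammar (it skips option lines such as `--hash=...` and does not handle URLs or environment markers), and `unpinned_lines` is a made-up helper, not part of pip-tools.

```python
import re
from typing import List

# Simplified: a package name, optional [extras], then an exact '==' pin.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s;]+")


def unpinned_lines(requirements_text: str) -> List[str]:
    """Return the requirement lines that are not pinned to an exact version."""
    problems = []
    for raw in requirements_text.splitlines():
        # Drop comments such as the '# via ...' annotations pip-compile writes.
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and option lines like '-r other.txt' or '--hash=...'.
        if not line or line.startswith("-"):
            continue
        if not PINNED.match(line):
            problems.append(line)
    return problems


compiled = """\
django==2.2.5
    # via -r requirements.in
pytz==2019.3
    # via django
requests>=2.0
"""
print(unpinned_lines(compiled))  # → ['requests>=2.0']
```

A check like this could run in CI so that a hand-edited, loosely specified requirements.txt never reaches deployment.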

The `pip-tools` portion of that is basically what I came up with myself and described in another thread at https://news.ycombinator.com/item?id=21779929 , except that I'm trying to use `pythonloc` to install things in a local `__pypackages__` folder per PEP-582 instead of using venvs.

I also switched to using pyenv to manage versioning. So far it's been painless. You should never mess with the system Python if you're on a Mac.

I recently switched from Poetry (to which I switched after pipenv) to pip-tools for some projects, because Poetry was not able to work properly with some dependencies.

Pip-tools has been a dream. It is just a thin layer of tools on top of Pip that separates your 'abstract' requirements (eg: django<3) from your 'release' requirements (eg: django==2.2.5) and managing running pip to have your virtualenv reflect your exact requirements. No new standard like pyproject that is partially supported (by 3rd party that is), no intention to (but failing at) being the all dominating way to do python packaging. Just a tool that is there to help you.

edit: it also automates updating the release requirements file (within the constraints of the abstract requirements).
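The abstract/release relationship can be sketched in a few lines: a release pin is only valid if it still satisfies the abstract constraint. This is a deliberately simplified comparison for purely numeric versions, not the full PEP 440 rules that pip implements, and `satisfies` is a hypothetical helper.

```python
import operator
import re
from typing import Tuple

OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<=": operator.le,
    ">=": operator.ge,
    "<": operator.lt,
    ">": operator.gt,
}


def _key(version: str, width: int) -> Tuple[int, ...]:
    """Turn '2.2.5' into (2, 2, 5), zero-padded so that '3' compares equal to '3.0'."""
    parts = tuple(int(p) for p in version.split("."))
    return parts + (0,) * (width - len(parts))


def satisfies(pinned_version: str, specifier: str) -> bool:
    """Check a pinned version against a single specifier such as '<3' or '>=2.2'."""
    match = re.fullmatch(r"(==|!=|<=|>=|<|>)\s*([0-9]+(?:\.[0-9]+)*)", specifier.strip())
    if match is None:
        raise ValueError("unsupported specifier: %r" % specifier)
    op, bound = match.groups()
    width = max(pinned_version.count("."), bound.count(".")) + 1
    return OPERATORS[op](_key(pinned_version, width), _key(bound, width))


# Abstract requirement 'django<3', release requirement 'django==2.2.5':
print(satisfies("2.2.5", "<3"))  # → True
print(satisfies("3.0", "<3"))    # → False
```

For real projects, the `packaging` library's specifier handling is the safer choice; the sketch above only shows the idea.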

Thanks for this. I've been a poetry fan since... the first release and pipenv really just fails for me. I'm going to check out pip-tools today.

Never used poetry, but pip tools is especially easy to integrate with CI/CD, because they are CLI commands that do specific simple tasks and don't hide things from the caller.

I personally use setup.py (my setup.py only calls setup() and all configuration is declaratively defined in setup.cfg). Then pip-compile generates a version lock (requirements.txt), and that is passed between environments, so we ensure that the exact same dependencies are installed during deployment.

I like that poetry uses pyproject.toml to replace setup.py, requirements.txt, setup.cfg, MANIFEST.in and the newly added Pipfile. Simplifying it to one file seems smart.

I’ve heard good things wrt determinism and packaging for different environments (dev/test/prod) on top of the simplicity!

Have a look at https://github.com/jazzband/pip-tools#workflow-for-layered-r... Both the 'input' and 'output' requirements are pip-compatible files (just with different extensions). It uses pip features like '-c' to include other constraints in your requirements.

For me, if they would merge Pip-tools into Pip and call it a day, my package management issues for Python are solved.

I've tried pip-tools, pipenv and poetry and pip-tools has been the easiest to use by far. I wrote a small comparison last year: <https://www.vincentprouillet.com/blog/overview-package-manag...

HN absorbed the trailing greater-than sign into your URL.


I was pretty badly turned off by Kenneth Reitz and the way he handled conflict with Pipenv. I disliked how, instead of listening to feedback or being constructive, he just gave off a kind of fuck-you attitude. There were real and critical issues with pipenv that he would not budge on, and it truly felt like he was in the minority. I know, it's his project and he can do what he wants, but in the context it did not make sense. I especially disliked how he sold Pipenv as being officially endorsed by Python (which it's not and has never been), and it did trick me for a short time. Finding out it was never officially endorsed is what made me never want to support any of his projects again.

Posting this here again in case anyone missed it. https://vorpus.org/blog/why-im-not-collaborating-with-kennet...

> Use Pipenv to manage library dependencies when developing Python applications. See Managing Application Dependencies for more details on using pipenv.

> Consider other tools such as pip when pipenv does not meet your use case.

Is this [0] not an official endorsement? It certainly seems as much.

[0] https://packaging.python.org/guides/tool-recommendations/

Oh wow new to me. Had not seen that one. Even more disappointing because the tool is far from primetime. Going back to the original. A few years back when I was using pipenv it kept the tagline that it was the future of python and endorsed fully but I am pretty sure that was a stretch or perhaps the maintainer was using his clout and his own recommendation.

Those pages are managed by the "PyPA" group, not technically python.org itself, although they were given a subdomain there. Start here and follow the links for more info: https://hynek.me/articles/python-app-deps-2018/

I'm aware that the Python Packaging Authority is not the Python Software Foundation. But they are also the Packaging Authority.

The PSF may not have officially granted them some sort of status, but as PyPA maintain pip, setuptools and warehouse, they are in fact the authority when it comes to packaging, unless the PSF comes out with a statement saying they aren't.

By having the subdomain, they're endorsed (possibly transitively) by whoever runs python.org. If they want to _not_ endorse whatever PyPA is doing, they need to not supply the subdomain.

You know who else handled conflict in a way that wasn't always understood? Linus Torvalds. I think there's a theme here. Why is it that successful project maintainers sometimes lose their patience with the community? Remember when GvR left the Python community? I think one of his final public statements was, "Now that PEP 572 is done, I don’t ever want to have to fight so hard for a PEP and find that so many people despise my decisions."

Didn't Linus decide that he'd not been doing this well, and needed to change the way he managed conflict, though? Or did I dream that?

I thought he was apologizing for his use of personal attacks, profanity, insults, and generally what he describes as a lack of empathy. Things like that. His hardliner attitude hasn't changed.

I can kind of see a difference though. I am not condoning some of the ways I have seen Linus handle threads but it at least always seemed that there was a valid reason at the root. Maybe it was not communicated well but there did seem to be a reason. In the case of Pipenv, there were broken workflows that would have made this tool unusable for a large portion of the community and the response was just go pound sand? That specific case has been resolved since then but I dropped using it after that thread came up.

Why would someone choose to use a third party library when the first party solution[0] is more than adequate in the first place?

[0] https://docs.python.org/3/library/venv.html

They don't do the same thing. pipenv allows you to create, manage virtual environments(using venv) apart from managing requirements file(Pipfile) and an npm like locking mechanism, dependency graphs, dev dependencies and more.

I prefer poetry though, since the consensus seems to be coming together on pyproject.toml rather than individual files like Pipfile. A lot of tools have already started supporting the toml file for their config, or have PRs pending.

I understand the value of a locking mechanism in the JS ecosystem because 1/ many packages depend on an intricate web of other packages that overlap, 2/ many packages use semver ranges, and 3/ you don't want two versions of the same package running on a user's browser, due to conflicts and increased size.

I can't think of many Python packages that have the same issues, and Python code isn't sent to and running on a user's browser.

Am I wrong or is there a reason that a locking mechanism (other than git) is helpful in Python?

I think this has less to do with "you don't want two versions of the same package running on a user's browser" and more to do with "when I clone a project and run npm/pip install I want it to be in a known state".

I don't use Python/pip much but as for npm: the problem is when your dependencies, direct or indirect (dependencies of dependencies), aren't "exact". You have something like "~1.2.3" or "^1.2.3". If every developer followed SemVer perfectly, never shipped regressions or new bugs when fixing a bug, and was always able to identify every breaking change then life would be perfect.

That is, however, not the world we live in. So a "lock" file respects your "fuzzy" versions ^/~ when you first run npm install, and subsequent runs then install the exact versions you downloaded the first time. This helps solve the "works for me"/"works on my machine" problems. The idea being if you can run it locally then the build server and production can also build/run your code.
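To make the lock-file behaviour concrete, here is a minimal Python sketch. It is not how npm or pipenv are implemented: `parse`, `satisfies`, `resolve`, `install` and the package name `examplelib` are all made up for illustration, and the `^`/`~` rules are simplified (npm treats 0.x versions differently).

```python
def parse(version):
    """Turn '1.2.3' into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def satisfies(version, spec):
    """Simplified range check: '^1.2.3' keeps the major, '~1.2.3' keeps major.minor."""
    v = parse(version)
    if spec.startswith("^"):
        base = parse(spec[1:])
        return v[0] == base[0] and v >= base
    if spec.startswith("~"):
        base = parse(spec[1:])
        return v[:2] == base[:2] and v >= base
    return v == parse(spec)  # bare version means an exact match

def resolve(spec, available):
    """Pick the newest available version that satisfies the fuzzy spec."""
    candidates = [v for v in available if satisfies(v, spec)]
    if not candidates:
        raise LookupError(f"nothing satisfies {spec}")
    return max(candidates, key=parse)

def install(manifest, available, lock=None):
    """Return exact versions; anything already pinned in `lock` is reused as-is."""
    lock = dict(lock or {})
    for name, spec in manifest.items():
        if name not in lock:
            lock[name] = resolve(spec, available[name])
    return lock

manifest = {"examplelib": "^1.2.3"}
first_run = {"examplelib": ["1.2.3", "1.3.0", "2.0.0"]}
later_run = {"examplelib": ["1.2.3", "1.3.0", "1.4.0", "2.0.0"]}

lock = install(manifest, first_run)        # pins examplelib to 1.3.0
print(install(manifest, later_run, lock))  # still 1.3.0, because the lock wins
print(install(manifest, later_run))        # 1.4.0 without a lock
```

The point is only that the manifest stays fuzzy while the lock records what was actually installed, so a second machine gets the same versions as the first.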

All of those apply to Python as well except the multiple packages concern has nothing to do with a browser and everything to do with the fact that a given Python process can only load one version of a library at a time (and probably for good reason).

I mean, Rust uses the same locking mechanism to great success. I've never seen a breaking change from upgrading a Rust dependency that preserves semver (which is 99% of them)

> pipenv allows you to create, manage virtual environments(using venv) apart from managing requirements file(Pipfile) and an npm like locking mechanism, dependency graphs, dev dependencies and more.

Why on earth would anyone need all this to manage packages for their projects?

Are there any other programming languages whose package-management comes close to this level of intricacy and complexity?

“This project needs package X”

I mean how hard can that be to get right in a self-contained environment?

This whole story is just madness coupled with deep denial and Stockholm-syndrome.

> Are there any other programming languages whose package-management comes close to this level of intricacy and complexity?

I think many do. Ruby, Node, Erlang/Elixir, Java, Go, Rust, dotnet, C... I’m having trouble thinking of a modern language that doesn’t have such package management mechanisms.

For many people doing anything more than writing one-off scripts, and especially for anyone who collaborates with others or shares their code, package management is so much more than just “this project needs package X”.

But all your examples are simple and reliable tools with a minimum of intricacy.

My criticism isn’t about having a package-management story.

It’s about having a terrible and complex one.

Your original comment was responding to “create, manage virtual environments(using venv) apart from managing requirements file(Pipfile) and an npm like locking mechanism, dependency graphs, dev dependencies and more.” and saying that this is overly complicated. Yet every single one of the listed languages has package management tools that do all of these things.

If you think any of the listed examples have “simple” package management tools, I question how deeply you have used any of them. NPM has ~60 commands and hundreds of subcommands each with multiple option flags, and probably hundreds more config options. Gem/Bundle is similar, etc.

If anything, Python is trying to catch up in how complex its package managers can be.

Python only has venvs because it needs venvs.

For most other language-provided package-managers the software project you’re working on is the env, so you don’t need to construct or manage a venv at all.

So my point still stands.

> > pipenv allows you to create, manage virtual environments(using venv) apart from managing requirements file(Pipfile) and an npm like locking mechanism, dependency graphs, dev dependencies and more.

> Why on earth would anyone need all this to manage packages for their projects?

> Are there any other programming languages whose package-management comes close to this level of intricacy and complexity?

> “This project needs package X”

> I mean how hard can that be to get right in a self-contained environment?

Packages often have subdependencies and their requirements at times may conflict. If you are very specific in the versions you want, it is more likely to cause issues in dependency resolution.

I would hardly call a dev's machine a "self-contained environment". Most developers I know work in a number of repos with varying requirements, and polluting their system libraries and packages with each project's requirements quickly pollutes the system and can lead to issues.

> I would hardly call a dev's machine a "self-contained environment"

But a software project is.

Only that software project needs those packages.

>“This project needs package X”

>I mean how hard can that be to get right in a self-contained environment?

Okay, I’ll bite.

What version of package X does it need?

Is there a specific version that’s been tested with this project and is known working?

Are there specific versions of its dependencies that have been tested and are known working?

Is it needed at runtime or only at build-time?

What repository can it be found in? PyPI is not the only Python repository; private repos are common.

And as a bonus cherry on top: how easy is it to make sure you have all the project’s dependencies installed in the venv for that project, and that you don’t have packages you’re not keeping track of? This is a UX thing, but developers are human and it matters.

I don't get it either... nor do I understand why venv is considered difficult to use.

"It automatically creates and manages a virtualenv for your projects, as well as adds/removes packages from your Pipfile as you install/uninstall packages."

Well, thanks, but automate "python -m venv myvenv"? Add/remove packages from a "Pipfile"? Do I have to specify dependencies somewhere other than requirements.txt? Why?

There must be some use cases I'm not aware of.

It's about dependencies of dependencies.

Requirements.txt only lists versions of your project's requirements, but Pip actually automatically installs dependencies of those requirements too. And those versions aren't listed in your requirements.txt.

https://realpython.com/pipenv-guide/#dependency-management-w... -- This page about Pipenv vs pip + virtualenv goes into more detail.

You can just 'pip freeze > requirements.txt'

Boom, all recursive dependencies frozen to their current state.
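For anyone wondering what that captures, here is a rough Python sketch of the same idea using only the standard library (Python 3.8+). It is not pip's actual implementation, and `freeze_lines` is a name made up here; real `pip freeze` also handles editable installs and other cases this ignores.

```python
from importlib import metadata

def freeze_lines():
    """Return 'name==version' for every distribution this interpreter can see."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken metadata
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)

for line in freeze_lines():
    print(line)
```

Note that the output makes no distinction between packages you asked for and packages that were pulled in as dependencies of dependencies: everything installed is pinned.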

But you don't want to necessarily freeze them to their current state. Yes for repeatable builds, but not in your list of dependencies. They're different concerns and Pipenv splits them.

Giving a shout out to pip-tools here... it's a great, simple solution for pinning your requirements (including the deps of your deps), and keeping your venv's in sync.

pip freeze gets everything installed.

Yes, that 'freezes' everything installed... but that's the point. If you want to update a direct dependency, you just do pip install -U <my-dep>. Any indirect dependencies are updated, and you freeze again.

Sometimes direct-dependency-a and direct-dependency-b have a conflict on which version of indirect-dependency-y you need. This is what we call doing 'actual work.'

^This. I don't understand everyone in this thread complaining that it's hard to update a direct dependency. You literally just pip install it and "pip freeze > requirements.txt" again.

In my experience, the issues come around when you try to build envs cross-platform. There are a lot of dependencies that have missing versions or bugs for certain platforms. This is not a pip/venv problem though--it is more of a python problem.

It’s easy to also get the versions of all sub dependencies and put them in requirements.txt

That makes updating your direct dependencies harder. If you upgrade to a newer version of something, and that thing's dependencies have changed, you have to manually figure out which dependencies in requirements.txt were for it, and update or remove those etc.

> It’s easy to also get the versions of all sub dependencies and put them in requirements.txt

Is it easy to automatically do this when you want a new package version, without having to remember to do it? Is it easier to put together this process and train other developers in it and be diligent in its use? Is all of that easier than installing an application that has a similar interface to other tools, that does all that for you, and has a community of people to help with issues?

But then you're suddenly responsible for keeping track of your subdependencies and updating the versions of each that you want. That should be up to the dependencies.

Also if you manage to drop a dependency, you don't have an easy way to remove the things from requirements.txt that are only there because they're a subdependency.
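A small Python sketch of that cleanup problem, assuming you had the dependency graph available (which a flat requirements.txt does not record). The function names and the package names in the example are hypothetical.

```python
def reachable(direct, graph):
    """Every package reachable from the direct dependencies via `graph`."""
    seen = set()
    stack = list(direct)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(graph.get(name, ()))
    return seen

def orphans(frozen, direct, graph):
    """Pinned names that no remaining direct dependency needs any more."""
    return sorted(set(frozen) - reachable(direct, graph))

# Hypothetical project: two direct dependencies, each with its own subdependencies.
graph = {
    "web-framework": ["templating", "routing"],
    "http-client": ["tls-helper"],
}
frozen = ["web-framework", "templating", "routing", "http-client", "tls-helper"]

# After dropping http-client, its subdependency is left behind in the frozen list.
print(orphans(frozen, ["web-framework"], graph))  # ['http-client', 'tls-helper']
```

With only the frozen list and no graph, there is nothing to run this against, which is the gap being described.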

Putting it all in one requirements.txt is just too simplistic.

If I had such need, I guess I could have two versions of the requirements.txt:

- One with direct dependencies (versions pinned)
- One with direct dependencies + subdependencies (pip freeze output)

Am I being too naive? (Obviously yes if a tool such as pipenv exists, but I'm trying to figure out why people need *.lock files.)

This is close to what pipenv does.

It also adds dependency management. If one subdependency requires library_a > 1.0 and another requires library_a < 2.0, while e.g. 2.1 also exists, then it will try to find a version between 1.0 and 2.0. Pip doesn't do that.
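As a toy illustration of picking one version that satisfies several constraints at once, here is a Python sketch. It is not pipenv's resolver; `pick`, `parse` and `OPS` are names made up here, and it assumes plain dotted numeric versions with the same number of components.

```python
import operator

# Map constraint operators to comparison functions.
OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}

def parse(version):
    """Turn '1.5' into (1, 5) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def pick(available, constraints):
    """Highest available version meeting every (op, bound) constraint, else None."""
    ok = [
        v for v in available
        if all(OPS[op](parse(v), parse(bound)) for op, bound in constraints)
    ]
    return max(ok, key=parse) if ok else None

available = ["1.0", "1.5", "2.0", "2.1"]
print(pick(available, [(">", "1.0"), ("<", "2.0")]))  # 1.5
print(pick(available, [(">", "2.1")]))                # None: nothing fits
```

A real resolver also has to backtrack across many packages whose constraints interact; this only shows the single-package case.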

So in my mind, that's what pipenv is -- pip, virtualenv, those two files, plus dependency management.

> One with direct dependencies + subdependencies (pip freeze output)

That's a lock file...

Ok ! That's what I'd do then.

Are you only working on small projects? For reference, if your project is under 100k lines of code, it's small.

By your arbitrary metric, I work on both small and big projects. Not sure how the loc metric relates to the matter though.

Thanks for the link, interesting read.

I still think this is all too convoluted though. I hope an accepted de facto standard for this will emerge at some point.

1. Because pipenv is easier to use.

2. Because it's not at all clear that pipenv is a third party library. It's made by the same group that makes pip, so it's confusing that pip would be considered a de-facto standard but not pipenv when it's made by the same group, and under the same project in Github.

pip is not yet in the standard library, as opposed to venv.

Whilst you might have to define "in the standard library", pip is added in PEP 453 [0] (accepted in 2013), in a similar capacity to venv:

> However, to avoid recommending a tool that CPython does not provide, it is further proposed that the pip [18] package manager be made available by default when installing CPython 3.4 or later and when creating virtual environments using the standard library's venv module via the pyvenv command line utility.

[0] https://www.python.org/dev/peps/pep-0453/

Yes. And that is confusing, as pip is a de facto standard IMO.

Isn't pipenv a dependency management system that includes the python distribution by way of managing a venv, rather than a simple replacement for venv, sort of like poetry (though poetry, I think, does a better job, but for the problem that it doesn't seem to respect SSL options available for pip which are often needed in enterprise environments)?

pipenv does more than just create a venv, although it is my favorite tool for that. The most important thing it does is freeze the dependency tree using Pipfile.lock

> it does is freeze the dependency tree using Pipfile.lock

sorry but what does freezing the dependency tree mean?

The idea is to make builds (more) reproducible. I can build a python program, test it thoroughly, and then be reasonably assured the whole thing won't come crashing down in CI/CD from a bad update to a transitive dependency. Then when I want to update the libraries I know I'm doing it purposefully and can commit the new dependency tree to source control.

probably recording exact dependency versions, based on a loose requirements.txt and when it was built.

You may want this because you have a library that you shouldn't be pinning to the third decimal on a sem-ver package, but that you don't want to hiccup in CI due to a dot-release.

Or maybe you think a loose file your tooling can read, and a hyper-specific file your builder should read, is a better interface for a project.

Yes, kind of like that. Except that it doesn't use requirements.txt but rather a file called Pipfile. In there you can also pin versions, leave them unspecified, or only partially specify them, and you can also divide them into dev-packages and normal packages (so it allows for a bit more flexibility than a requirements.txt file).

a bit like "pip freeze > requirements.txt" then?

> a bit like "pip freeze > requirements.txt" then?

With the added bonus that it also contains a hash of the package so if someone pushes a new version with the same version number it would complain that the hashes don't match.
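Roughly, the check looks like this Python sketch. The lock layout below is a simplified assumption modelled on the "hashes" list in Pipfile.lock, and `sha256_of` and `verify` are hypothetical helpers, not pipenv functions.

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """Hash downloaded bytes in the 'sha256:<hex>' form used in the lock entry."""
    return "sha256:" + hashlib.sha256(data).hexdigest()

def verify(name, data, lock):
    """Raise if the downloaded bytes match none of the hashes recorded for `name`."""
    actual = sha256_of(data)
    if actual not in lock[name]["hashes"]:
        raise ValueError(f"{name}: hash mismatch, got {actual}")
    return True

# Hypothetical package and contents, recorded when the lock file was written.
original = b"examplelib 1.0.0 release contents"
lock = {"examplelib": {"version": "==1.0.0", "hashes": [sha256_of(original)]}}

verify("examplelib", original, lock)  # passes

try:
    # Same version number, different bytes: the re-upload case.
    verify("examplelib", b"re-uploaded with different contents", lock)
except ValueError as err:
    print(err)
```

A plain "name==version" line from pip freeze has nothing to compare against, so the second case would install silently.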

The Python Packaging Authority declared it the future of dependency management once upon a time and it nominally checked some important boxes such as managing a lockfile.

pipenv install --dev is one reason I guess and pipenv uninstall x gets rid of all the dependencies, that's nice to have

This is all MIT licensed, if people care so much, why has nobody forked this? Why are people talking about jumping ship to a completely different project instead of forking and cutting a new release from there?

It looks like 1.4k people have forked it. The question is, which fork do I use? The problem is not that the source is unable to be updated, the problem is how do you organize peoples' efforts under a trusted maintainer long term? How do I know which forking effort to trust?

My understanding is that that's kind of the point of groups like the "Python Packaging Authority". So if they're not going to merge pull requests and do maintenance on the project, that IS a problem, since they're supposed to be the official version right now.

> The question is, which fork do I use?

Return the forks of developers who have publicly stated they'd like to take over maintenance of this project.

I'd bet that narrows it to less than ten.

Now-- have a look at the blog posts where these maintainers explain their plan to sustain the project going forward and choose the most persuasive one.

I'd bet it's less than one.

Short circuited, problem solved. :)

PIL -> Pillow

pipenv -> pippi

They could call it pippi

Alternatively, they could add one of those 1.4k people in to help if they seem to have produced working code for the project prior.

Github forks are meaningless. Sometimes they're forked because people think it's the same as the "star" button. Sometimes it's to have a classy project show up on your profile. Sometimes it's because you want to submit a PR. Sometimes it's because your company requires a piece of software but isn't willing to use the public mainline.

Doesn't mean that at least one of those people doesn't actually maintain their forks and do PRs to the project.

Most of those people aren't maintaining forks in any meaningful sense.

There are over 500 commits merged to master since the last release. The community is actively contributing but these changes don't get to the end user because the maintainers are not releasing them.

Bugs get reported and closed because they are fixed in master every day, wasting not only the end users time but also that of the people actively working on the project.

The problem is the name, not the content. There are lots of guides, blogposts, etc. which say “use pipenv, is awesome”. However, if one actually tries to do this, as the bug says, one gets a very old release with seemingly no chance of bugfixes, etc.

This means a bad experience for people who try to use it, and makes the whole Python ecosystem feel a tiny bit worse too. Imagine the frustration of someone reading the blog post, spending all that time learning about the system, using it, and then discovering bugs won’t ever get fixed!

It is very easy for the author to fix: just a small, 4 line commit to the readme and website saying “the project is dead, go elsewhere”. This will allow people to move on - maybe to a fork, or to some other project which does a similar thing.

There's 1400 forks. I'm sure that people will move to one of them if the original is pronounced dead.

That normally means there have been 1.4k pull requests not 1.4k people actively maintaining forks.

The question is who will have access to make new releases on the Python Package Index. Otherwise, they'd have to find a new name for their fork.

The title and comments in that issue thread show that people misunderstand how FOSS works. The maintainer is not 'the project', it's the current direction that shapes incoming pull requests.

This viewpoint doesn't account for authenticity as verified by, e.g., PyPI. Yes, you can use anybody's pipenv, but most people would greatly prefer to not go to such lengths.

> I don't think you're bad people, but the least y'all can do is be honest

I don't like this framing (sort of implies they aren't being honest) but regardless just switch if you're not happy with the release frequency and you have viable alternatives.

Especially if they haven't even said anything yet. A better suggestion would be "It's okay if you're not working on this, but let the community know." I have a feeling some use this for production grade work.

For me, and perhaps others, there is a desire to see this project explicitly move aside (vs slowly die), so that poetry and other projects can take the reins.

A lot of people say “just fork it” or “choose something else”, but the problem is that python is a finite community with a finite amount of energy, and a lot of this energy has been absorbed by the star power Kenneth acquired from his prior successes (namely, requests... which btw now has an even better replacement called httpx).

It’s almost like Kenneth needs to come out and say “I’m sorry I screwed up, here’s my towel; good night.”, so that people can move on.

There are a couple of issues with "moving on". Since the project is owned by PyPA, it feels like it should be the default for Python projects. pipenv has a good idea but, in my opinion, a flawed implementation, and maybe all my issues are already resolved in those 600+ commits without a release.

About the framing of the question, I think it's because of all the flame wars in the past when people criticized pipenv and maintainers took it a bit personally.

I am one of those grubby little "dark matter" developers. I fled Perl for Python over a decade ago. While I do love Python, one of the impediments (not the only, and perhaps not the largest) to my progress is the endless packaging issues. Among the larger attractions to Python is that there is supposedly one obvious way to do things and here there is not. Instead, I must make my selection largely based on opinions that have the same foreboding stink of those I associate with arguments over Linux distros: dashed-off dismissals starting with "just" and drive-by engagements with the topic.

As a result, I end up rarely going outside of the standard library. In a perverse way, being locked on an un-upgradable (due to Reasons) version of Python 2.7.5 for the foreseeable future has helped put that temptation a bit further away.

Yes, Packaging Is Hard. It is certainly beyond me. I will probably never need to package anything I wrote, much less distribute it, so many of my concerns are purely academic.

Rather than fussing over things like the walrus operator (really, c'mon), I would love to see those who steer Python buckle down on issues like this, solve them, and then relentlessly backport the solution further back than everyone thinks is reasonable.

Here's how you solve this problem. I do this on my own projects and with Chart.js which we resurrected from the brink of a "2.0 is coming... please wait" cliff.

1. Add a project scope to the README -- this gives you and volunteers grounds to close issues that are not relevant.
2. Send a personal email to your 3 top contributors -- ask if you can give them push access (even if they aren't recent contributors) at a minimum as an insurance policy.
3. Automate or at least specify your release process. You should be able to do this reliably, while drunk and high, and when you have 17 other projects needing your attention. Example: https://github.com/fulldecent/FDWaveformView/blob/master/CON...

I have been involved in a few "takeovers" to implement the above and keep great projects running. A little structure and human goes a long way. You don't need to fork and be the new dictator to keep something great moving.

Serious question, what does pipenv (and poetry) have over conda?

Conda works with a different, parallel ecosystem, whose main source of packages is managed by a single company (Anaconda Inc). That company validates, rebuilds, and possibly silently patches code, to provide their own packages.

pipenv brings a simpler workflow to pip. pip uses packages which are published by their authors on PyPI, which is run by the Python Software Foundation.

> whose main source of packages is managed by a single company

That's not true today: conda-forge is a community-led effort that has open source recipes for the packages built by Anaconda Inc, plus many others contributed by the community. I run conda with conda-forge packages only and it works great.


My limited experience is that Conda is great if you are all-in on the parallel ecosystem, but it doesn't play well with others. Or at least, it didn't for me.

This may have been true at the beginning, but nowadays I use “pip install” extensively in conda-created environments. Are there particular packages that you have trouble with?

The problem with this is that pip will install dependencies of the package you're installing, not knowing (or caring) that those dependencies are already available in the conda repositories.

Later, conda may install a different version of the same dependency as a dependency of something else. Depending on how exactly they are installed (egg, zipped egg, whether the folder has the version number in it), you either get two versions of the same package installed, with which one gets imported being arbitrary, or you get two sets of metadata, with one of them not matching what is actually installed, such that pip may think version requirements are satisfied when they are not. It's messy as anything, and the breakage can be subtle. I distribute packages to users who use conda, and my packages have dependencies that are available in conda, so this has been messing with a lot of my users' installs. I'm now just making conda packages for these projects to solve the issue.

I made this package [1] to try and automate the process of making conda packages out of my existing setuptools packages, I'm quite happy with it but since it is designed to serve the needs of my projects, I can't guarantee it will suit everybody's needs.

[1] https://github.com/chrisjbillington/setuptools_conda

Very different ecosystems. It's like a pancake recipe on a vegan site versus a regular one: it does the same thing but for a different target audience. For example, I rarely see conda install instructions for packages, except when they are DS/ML oriented.

Conda is more generic - it allows packaging and distributing software written in languages other than Python. You can distribute java based software, C etc.

A lockfile for consistent builds, that alone rules out Conda for a lot of serious things.
