The issue seems to be similar to one that we programmers regularly encounter in our day jobs: failing to consider maintenance and/or make a plan for what happens to a project in one, two, or three years. I don't mean "the authors of this project failed to consider maintenance"; I mean "we, the whole community, including maintainers and consumers of the project, took maintenance for granted or weren't concerned about it." Anyone who installed the tool without reading the project's governance model and maintenance plan signed up for "whatever happens, happens," the governance model of most small projects. I do this myself all the time; I'm not saying I'm better by any means. Participating in the JavaScript/Python/etc. ecosystem generally requires being OK with this.
To me this isn't a question of one project, it's a question of OSS project governance in general. Is there a succession plan? How do you know when more maintainers are needed? How do you promote someone from contributor to committer to releaser?
We went around adding CoCs to everything a few years ago, perhaps a similar effort could be made with governance plans? Like a boilerplate plan projects could use, hopefully leading to a new community norm of "if you are using a project with no governance plan, that's up to you, but don't say you weren't warned!"
Can you suggest something a little more concrete? What should I add to my project so that if I die tomorrow the project can continue without me?
I work on a small project. I didn't add a CoC when the big push came about, because I concluded it would amount to "if I say it is bad, it is bad; otherwise it is good" - which is effectively the same as having no CoC, so there is no point. (I did consider asking some large project - KDE for example - to act as arbitrator, but I didn't bother.)
I haven't thought about it much and I'm not a project maintainer. As for your bus factor of 1: a) it only matters when there's actually a large group relying on your project and b) the answer is "have multiple maintainers with commit bits."
To take a shot at something "more concrete":
* New releases will be cut at least on a quarterly schedule
* When the number of open issues passes 25, a new maintainer shall be added (if not sooner)
* Maintainers may leave a project at any time and are encouraged to do so if they need to move on to other things
* Maintainers may return when they have time
* Once per year, maintainers shall either reaffirm their desire to continue maintaining the project or step back from maintaining
* After 2 merged PRs, a contributor is promoted to a maintainer
The main functions here are 1) to make clear whether anyone is actually maintaining the project, and who; 2) to clarify that maintaining the project is an ongoing commitment and that it's perfectly fine to step back from it; 3) to create a mechanism for adding maintainers and for signaling when new maintainers are needed; and 4) to create some general guidelines around releases, so people know what to expect (and know when things are no longer happening).
This is certainly not a good governance plan for everyone, and some may find it too prescriptive, but it seems like it could be a good start for projects like axios & pipenv. The idea is to prevent "is this project dead" type issues before they arise. When a project is losing maintainers & no one wants to step up, it's clear that the project is waning.
Of course it's also perfectly reasonable to say "that's too much work" or "f*ck off, it's my project." This convention would help people make more informed decisions when adopting a project for use. Some will be fine with "whatever", some will want a clearer plan, both are OK.
I’m in a similar situation, maintaining a project that’s used widely enough it should probably have a continuity plan, but not enough that it’s developed a robust community of other contributors who could step in.
Jazzband [0] seems meant to address exactly this: “a collaborative community to share the responsibility of maintaining Python-based projects.” And it looked promising, but it’s not entirely clear that the Jazzband project itself is all that active (the only news/status update is the launch announcement from 2015; the last public Twitter activity was in 2017).
Oh, just realized that pip-tools (being discussed favorably downthread as a pipenv alternative) is a Jazzband member project: https://github.com/jazzband/pip-tools.
Yup. Further, I do a lot more research into the dependencies I bring in (if I can help it).
Does this thing also bring in 1000x sub-dependencies? How much am I REALLY going to use this? What's the actual added value here?
I've seen people bring in libraries for simply the dumbest things. The worst offender I've seen is Lombok brought in for a @Logger annotation on one class. That stopped our Java 6->8 migration because the version of Lombok they brought in was old and used in a highly used library (so upgrading wasn't simple).
If you can write the used functionality in 10 minutes, you should not bring it in as a dependency.
> If you can write the used functionality in 10 minutes, you should not bring it in as a dependency.
I disagree with this; it depends on the size of the dependency, plus the risk of a problem with the dependency vs the risk of making a mistake vs the risk of code drift when the same 10-min thing gets written a dozen different ways across an org.
I think you are dramatically underestimating the cost of a dependency.
The dependency may change in surprising ways in the future. The dependency may not change in expected ways in the future. (i.e., now you're trapped on an old version of the language, or of another one of your dependencies)
The problematic part with reinventing the wheel or NIH is that you add a fixed cost of doing business to any given change. The problem with importing the world is that you add a dynamic cost to simply existing without change.
It's a tradeoff, and it's not a tradeoff that's consistent across organizations. For some orgs it's really important to be first to market. In other orgs it's important to not break shit for your existing customers.
My org is well established in our field. Nothing matters more than retaining our existing customers. Importing a new lib requires approval from legal, which takes about a person-week. In other words, if one developer could implement the same thing in less than 40 hours, the developer should implement it themselves. For us, a slow-moving org, I think we're at the right place. For a startup, obviously not.
I think you may be dramatically underestimating the cost of NIH and code bloat.
I'm talking about things developers think they can do in ten minutes but that they are at high risk of getting wrong: generating SQL, parsing JSON, handling dates, etc. These are all things that have well-established libraries in most sane languages (maybe NodeJS is not sane in this regard), where everything that you save yourself from by using the library is worth even a 40 hour review process.
The axios example is an interesting one, because there are tons of people saying they're willing to help, but how many of those people will actually contribute anything without guidance of any kind? When the current maintainers are too busy to even provide guidance, you need to be a special kind of developer to even have a chance of becoming a real maintainer, and those kinds of developers are few and far between.
Pipenv is a very controversial project that has lost its reputation. Many times I had issues with it that made my everyday life very uncomfortable. And of course I saw issues on GitHub from other people who had the same problems. And then, in the middle of a very constructive conversation, somebody from the maintainer team or the original creator of the project would jump into the thread and very aggressively close it, or say something like "go f--k yourself, we don't need it". Seriously? Who likes that level of conversation? It is OK if you don't have the resources or time or whatever, but we have this particular problem. Don't be rude and turn people against you.
My personal opinion is that some day these ripples on the water will come full circle.
And last but not least: pip+requirements.txt is not the best, but pipenv doesn't add much over it. It gives you a little, but introduces another level of abstraction over the same things, with its own level of complexity.
This puts me in a very uncomfortable position, where I have to back up my words by blaming someone who did a very good job for the community in general, or at least had good intentions. At the same time, I don't want to shame somebody in public.
Wow... some users are subtly pressuring the maintainers into doing work here. The maintainer simply states "works as intended".
The users in that topic then continue the discussion in all of its aspects, expecting the maintainer to engage with them. I can imagine it feels to the maintainers like an energy-sucking discussion.
IMHO, you're not shaming the maintainers, you're shaming the community here.
Determining if a project is alive or dead really is a problem. And this problem will grow from year to year.
There are so many small and medium sized projects on GitHub where you have no idea if they are maintained or not. Sometimes there are projects which are alive and kicking, with multiple pull requests and maintainers promising changes or a major release, and then suddenly nothing. Sometimes smaller projects see no changes for months, but they simply work and don't need maintenance at all. Maybe the community moved elsewhere and the project will only see bugfixes coming in and no more features?
The burden of figuring all this out lies with the visitor and is an annoying hassle.
I would love it if GitHub could somehow show me on the landing page whether the project is worth investing time in. Maybe they could send a heartbeat to the maintainer and simply show the project as dead as soon as there is no response?
You said it yourself, this is a very hard problem.
There are Java packages that haven't seen a commit in years and are still perfect for a task. There are npm packages that are 2 months old and terribly outdated and unmaintained.
My personal favorite metric is "time to maintainer response": how long does it take for a maintainer to respond to issues or pull requests? Not necessarily to resolve them, but to triage issues or provide guidance on a PR.
If this happens quickly and with reasonable responses, projects are usually solid, assuming they have existed for a while and see decent usage.
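For the curious, this metric is easy to approximate from the GitHub API. A rough sketch (unauthenticated and therefore heavily rate-limited; the repo name and the 30-issue sample are arbitrary, and author_association is only a proxy for "maintainer"):

  from datetime import datetime
  import requests

  MAINTAINER_ROLES = {"OWNER", "MEMBER", "COLLABORATOR"}

  def ts(value):
      return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")

  def median_response_hours(owner, repo, sample=30):
      # Recent issues and PRs (this endpoint returns both).
      issues = requests.get(
          f"https://api.github.com/repos/{owner}/{repo}/issues",
          params={"state": "all", "per_page": sample},
      ).json()
      hours = []
      for issue in issues:
          # First comment by someone with a maintainer-ish role.
          for comment in requests.get(issue["comments_url"]).json():
              if comment["author_association"] in MAINTAINER_ROLES:
                  delta = ts(comment["created_at"]) - ts(issue["created_at"])
                  hours.append(delta.total_seconds() / 3600)
                  break
      hours.sort()
      return hours[len(hours) // 2] if hours else None

  print(median_response_hours("pypa", "pipenv"))

It says nothing about whether the responses are any good, but it does separate "triaged within a day" projects from "silence for six months" ones at a glance.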
Author of Scriptella here. Thank you for mentioning it! The product was neglected for many years, but I never was ready to finally press the kill switch. Hoping that one day I will have more time to work on it...
In this case, the code is still maintained and at least some pull requests are being processed (I haven't checked in detail). The only thing is that there hasn't been an official release for some time.
These are not great metrics. There are no great metrics.
The official JSON implementation is a good example. It is a mature project. It's used everywhere. If you leave an issue or submit a PR that fixes a typo, the maintainer will flat out delete your comment and tell you to buy his book to educate yourself.
Maybe this behavior is good or maybe it's bad. But maintainers are human and metrics are not going to cleanly pick up how well a project is maintained.
Just fishing for a response wouldn't be very meaningful anyway. A far better metric would be the time it takes for a maintainer to respond to pull requests and other issues.
> My employer gets to make bullshit intrusions on my time like that, no way will I waste my life on make-work from Github.
You are not willing to give GitHub the information about whether a project which you uploaded is still active? But will you give that information to users asking for details? Which way is more annoying to you and the users?
What I want to say is that GitHub will turn into a huge graveyard in the years to come, with long-forgotten projects still having a landing page like it's the best thing ever. Do you want every visitor to first contact the maintainer to ask if the thing is still alive? Isn't that a huge waste of time?
In my opinion, Github will have to clean up the mess left behind by unmaintained projects at some point. The sooner they start, the easier it will be.
I think one could get most of the way there by having a bot that:
a) Autobuilds the library based on its declared dependency stack.
b) Runs a test suite, and reports back breakage.
Anything that sits broken for more than (say) a month can safely be considered dead. Super-stable low-level code will continue passing its tests forever, and things dependent on specific API versioning in left-pad will be marked dead quickly...
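A minimal sketch of such a bot for the Python case (this is an assumed workflow, not an existing GitHub feature; swap the install/test commands for other ecosystems, and in practice you'd run it inside a throwaway virtualenv or container):

  import subprocess
  import tempfile

  def builds_and_passes(git_url):
      """Clone the repo, install its declared dependency stack, run its tests."""
      with tempfile.TemporaryDirectory() as workdir:
          subprocess.run(["git", "clone", "--depth", "1", git_url, workdir], check=True)
          # (a) Autobuild: install the project plus its declared dependencies.
          if subprocess.run(["pip", "install", "-e", "."], cwd=workdir).returncode != 0:
              return False
          # (b) Run the test suite; the exit code is the breakage report.
          return subprocess.run(["pytest"], cwd=workdir).returncode == 0

A scheduler would run this periodically and flag the project once it has been red for, say, a month.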
Take SpeedCrunch for example: the current release (0.12) dates from 2015 AFAICS, and works very well for the most part.
There's a bug in that release though: it thinks 0^1 is NaN. That bug was fixed in or before 2018 (https://bitbucket.org/heldercorreia/speedcrunch/issues/836/0...), but there has not been a release since. There have been a number of commits this year, but not a lot.
Is the project dead or not? I honestly couldn't tell.
Not sure about the project, but in a case like that the project management appears to be dead for sure, because releases should not trail vital bug fixes like that. Same for pipenv.
He talks about maintenance and making releases. Otherwise people will still deal with bugs in the "current" release that have been fixed in the master branch for a long time.
And even if you did, there's no guarantee that the work in master is actually in a sane state. Some projects try to follow models like "master is always runnable" or "master is always tested/built by CI/CD", but it's often simply not the case.
I've worked on at least a few open source projects where I'd pull and build master, trying to obtain a particular bugfix, and it would fail, and I'd have to randomly roll back commits until I found a working build.
In one case the project had a separate repo for a library that wasn't kept entirely in sync with the parent project... I had to randomly move back both repos until I found a matching pair that built and ran successfully, but after the bugfix was implemented.
Releases are important for anyone who isn't working on the repository directly -- they guarantee the project is in a sane state.
Similarly, Slackware Linux's last stable release was 14.2, in June, 2016, but their -current release tree was last updated.... yesterday.
Now, I could be super huffy that they haven't released in years, but this is open source software that I'm not paying for. Moreover, I have the power to choose multiple alternatives. I can run the -current release. I can make my own "pseudo-release" by just forking here and calling this "14.2.1". Or I can use a different distro.
It would be really entitled of me to demand Patrick make a new release just because I want one, especially when he doesn't owe me anything, and I have multiple alternatives. And besides, the project is still going, it's just not doing what I want it to do; and that's Pat's prerogative.
I mean, you're right on all points, yet I confess I would be a lot more willing to give Slackware a try again if it would return to having a more-or-less annual official release, like it managed from its start in 1993 up through 2013. (Most years prior to 2000 it had more than one release a year!)
There are things I (dimly) remember genuinely liking about Slackware in terms of its philosophy, but as near as I can tell the official installation method is "install the most recent ISO, then update from there," and when the most recent ISO is three and a half years old, that's not a great look. We've now gone the longest stretch in Slackware's history without even a point release.
I wouldn't demand Patrick make a new release, but I don't think it's wildly unreasonable to expect one by now. Bringing it back to the original article, I think the same can be said of pipenv.
Clementine the music player is my go-to example of this phenomenon. It's got 750 commits since its last release in 2016, and there is regular merging, but the project management is in a place where trying to deploy builds to every platform is acting as a barrier to any new releases.
It recently finally merged its Qt5 support into master (which was in a working state since 2014) about two months ago. That might help the release process.
What doesn't help is that a consumer facing product of that scale is a nightmare to manage - it has over 2k open issues on Github but probably 95% of them are "I don't know how to do X" tech support kinds of things.
I mean releases as in, a point in time snapshot of the code that has been declared fit to use, and released for distribution to the public. Regardless of the software used to prepare this. As distinct from just pulling the latest master branch and praying it's stable.
Indeed! It doesn't seem dead at all. While having a new release could be nice, I must also say I didn't face much trouble with the last release (I do Ansible work & use pipenv+pyenv to freeze the dependencies). I'm grateful to pipenv in how it has helped me here.
The 1.0 release is a must for a working poetry environment.
The people handling it are friendly and communicate over github and discord.
However, it's not all roses.
- VSCode supports poetry as a second-class citizen
- Documentation is there, but only in GitHub issues
- There is no migration path from pip and pipenv to poetry
- I can't do releases via CI because of a poetry bug.
We love DepHell. I migrated some work repos from Pipenv (and plain ol' pip) to Poetry. However, we didn't want to have a flag day where we updated our build tooling to be 100% Poetry, so I made a Makefile target that builds requirements.txt and setup.py from pyproject.toml. Now developers can work with pleasant tooling, but the build system can use the old stuff it already knows.
We're close to having everything migrated to Poetry. When that day comes, we can throw out all the compatibility stuff, update the build server, and be happy. Until that day, DepHell gives us an easy compatibility layer so that we don't have to do the migration all at once. It's awesome.
Sure we do. The whole software industry is built on hype cycles. If we didn't have them, we'd get paid normal wages and couldn't claim we're eating the world
Speaking as a maintainer of large open-source projects, I know what it's like to get stuck working on a release branch (which is sometimes just the `master` branch) that takes forever to become an official release. I'm not sure if that's the story here, but there have been a lot of commits since the last release, so maybe?
In any case, I strongly recommend publishing alpha/beta releases along the way, without the ceremony of an official release, so that folks can give feedback during the long months that fly by when you don't have nearly enough time to spend on the project.
Respect the needs of your conservative users by not publishing official releases until they're really ready, but trust your more engaged users to use whatever you have right now, and let you know what's working/broken, without judgment.
I’m sure poetry has bugs and it could have other problems (not sure), but if you look into how it works, from a software design / philosophy perspective, it’s absolutely amazing. It might be worth waiting a bit longer before you use it in a large production project. Once Heroku adds native support to the Python buildpack, and some other mainstream avenues add support, Poetry usage will increase enough to create the necessary scrutiny that will bring it up to true production-grade quality.
When you add a package in Poetry it will add the latest version of that package’s dependencies that also satisfy all the other packages (and their dependencies’ dependencies).
If you remove a package, Poetry will remove any dependencies of that package (actually uninstall them, rather than just leaving junk behind like pipenv), and will reestablish the latest versions of any other dependencies whose versions were held back by the package you just removed or any of its dependencies...
Also, when you add a package with Poetry it doesn’t need to completely lock and install everything from scratch; only what you have just added and whatever its dependencies dictate; nothing more.
Also, you can tell Poetry to list all the versions of your packages and the latest versions available (in case you’re like me and you pin all of your top-level packages, having to Google “pypi django”, “pypi requests”, etc., to see if there are any newer versions worth upgrading to is a pain in the ass).
I feel like it might, if it replaced pip and got full endorsement from the Python org. It's a fantastic project, but until it has complete backing it's just a technical risk to include in a project.
To a point. Poetry still isn't fully on board with being a project manager; it focuses too much on packaging. For example it doesn't do project scripts like npm scripts.
It doesn't support shell scripts, which kind of sucks, but that could be reasonable if it is expected to run on Windows as well as POSIX. This is the scripts section from one project I used Poetry in. It seems to be fine for most things, even though the focus on managing a Python module for publication is a little bit annoying. Once or twice I've run into version resolution troubles, but generally it's stable and much more ergonomic and easier to explain than pip+venv. One can of course use `poetry run pip freeze` to generate the standard `requirements.txt` file.
[tool.poetry.scripts]
# NB: Poetry won't run arbitrary callables, so these are wrapped in a python script.
# Poetry needs the root dir to be a package (e.g. module `my_repo` must exist) in order to run scripts
# So they are just shoved into a dummy "my_repo.py" instead of going in e.g. "scripts.py"
# Adds Git hooks from `git_hooks` path (overwrites existing!)
hookup = "my_repo:hookup"
# Run formatter+linter, does not modify anything, errors if fail
lint = "my_repo:lint"
# runs `black` (formatter), modifies stuff
format = "my_repo:format"
# A list of files which failed linting. Most helpful with flake8 vim integration https://github.com/nvie/vim-flake8
lintyfiles = "my_repo:lintyfiles"
# unit tests
utest = "my_repo:utest"
# unit + integration tests in container
citest = "my_repo:citest"
cutest = "my_repo:cutest"
Python packaging is broken by convention. A million snowflake packages with no clear model for reuse and extension and a whole lot of unmaintained mess, very little of it organized or uniform. Regardless of what tooling you wrap around it, the ecosystem will always be a mess.
As someone who's been in software development for the better part of two decades now, I would say that's far from the truth.
Yes, it's not great right now, and it's not completely clear if I should recommend a new user pip+venv, pipenv, pyenv, poetry or conda, and that's a big problem. But Python was actually quite early to handle dependencies and packaging in a standardized and structured way. I remember that most of the projects I worked on early in my career more or less completely relied on vendoring dependencies; if packaging systems existed they were either very complex or the one that came with your OS/distro.
However, Python currently needs to catch up. We need some alignment and a clear path forward, and it would also be great to have an official way of building artefacts. There are plenty of ways today but no great blessed way to recommend to a newcomer.
> As someone who's been in software development for the better part of two decades now, I would say that's far from the truth.
As someone who has had to deal with Python's packaging a lot, I would say that it's really, really bad. Maybe not the worst in the world, but it is much closer to the worst than to the best.
There are 15 ways to package Python modules, none of which are feature complete (but a lot of them pretend to be). Every year or so someone decides that they need to write another packaging tool for Python, and then they give up before it is feature complete.
A tool like virtualenv is useful, but it overpromises and underdelivers (this is the overarching theme of Python packaging imo): it does not completely separate your environment from the system. E.g. virtualenv still uses the host system python packages cache, and it does something rather nasty with .so's that get copied from the host system...
And don't get me started on dependency management in Python. When the dependencies of your dependencies make breaking interface changes, you're in for a world of pain.
I find it amusing, in a sad way, that the language that prides itself on "there is one way to do it" screwed up so badly in the packaging department by not having one good way to do it. Meanwhile the language whose motto is "there is more than one way to do it" has a standard, sane way to package things, with multiple tools that work together in a coherent way. Perl got this one right.
> Meanwhile the language whose motto is "there is more than one way to do it" has a standard, sane way to package things, with multiple tools that work together in a coherent way. Perl got this one right.
I was always impressed by CPAN compared to most other package managers I've used. And, yeah, between Python, Perl, Node, Ruby, and even PHP, Python is by far the most inscrutable and the most prone to giving me fits when existing Python apps like Lektor just... inexplicably break after one of the Homebrew versions of Python gets updated.
Serious question: Isn't maven the reason that you can't really package Hadoop? It is my (high level) understanding that it is at the core of (or a big part of) why Hadoop is so hard to build on your own vs VM.
Hadoop is hard to package because build scripts are treated as second-class citizens in most projects. There's a high standard of code review for many projects, but for the build scripts it's "it works, merge it".
I've built quite big projects with Maven (>2 million LOC) without issues.
But the Python mess exists because initially it didn't have such a system either. Then a bunch of them were introduced, and some of them became official, but there was never one that solved all the problems.
I would consider the packages shipped by most distros as platform-provided package-management. It's not specific to C/C++, but if I'm developing something for say, Debian, I'm going to use as many system-provided libs as possible.
The faults aren't so much with the actual NPM software as with the whole ecosystem. The real root of the problem IMO is that Javascript has such a tiny std lib compared to other popular languages that have package management systems. This encourages lots of people who are missing various functions normally found in std libs to write packages implementing various combinations of those functions. Those who are writing larger packages then depend on various combinations of those little helper packages. So of course, if you need to use multiple large packages to do something useful, they'll tend to pull in a huge forest of tiny packages in a bunch of versions.
NPM itself was also not great, but it did become great over the last few major versions. The weakness of the stdlib is still a problem, but it is very, very slowly getting better with the new ECMAScript and Node.js versions.
And it comes from a time when a 2 MB webpage is considered acceptable. The whole idea of venvs at least still assumes some reuse. If you shout out loud "SCREW YOU, LIBRARY REUSE", then npm is perfect, of course...
Yes, but for opposite reasons. You probably don't get left-pad in an ecosystem without one (and preferably only one) obvious way to publish and use it.
- It was built in to Rust from the beginning and officially sanctioned so fifteen different people don't have to build their own incomplete, buggy package managers.
- You can add plugins for things like automatically updating or adding dependencies.
- It handles projects and subprojects.
- All you have to do to install dependencies and run a project is "cargo run", reducing friction for getting into new projects.
- For the most part, it just works.
That said it isn't perfect, especially when needing custom build scripts, but it's good.
It is used as package manager often enough -- stick "git clone", or ExternalProject into the CMakefile and you have the (bad) equivalent of "pip install" for C++.
For example, I mentioned "kitware superbuild" above. It describes itself [0] as:
> It is basically a poor man’s package manager that you make consistently work across your target build platforms
C/C++'s package management story is "defer it to distributions". Personally I much prefer `apt-get install libfoo-dev` to 1) each language inventing its own incompatible system, and 2) each developer self-publishing, so there's little to no safety or accountability when adding a new dependency.
So, to be clear, I also tend to prefer distro-supported packaging. However. Forcing everything to go through the distro means that you massively limit what's available by raising the barriers to entry, you slow pushing new versions (ranging from Arch's "as soon as a packager gets to it" to CentOS's "new major version will be available in 5 years"), and you lock yourself into each distro's packages and make portability a pain (Ubuntu ships libfoo-dev 1.2, Arch ships libfoo-dev 1.3, CentOS has libfoo-devel 0.6, and Debian doesn't package it at all). When distro packages work, they're great, but they do have shortcomings.
> limit what's available by raising the barriers to entry
But that’s what you would have to do yourself anyway. You can’t use the freshest upstream version of everything, since they don’t all work together. So some versions you’ll have to hold off on, and some other versions might require minor patching. But this is exactly what distro maintainers do.
> Personally I much prefer `apt-get install libfoo-dev`
So, what do you do when different projects need different versions of libfoo? Right, you download and build it locally inside the project tree and fiddle with makefiles to link to the local version rather than the one installed by apt-get. So basically you do your own dependency management. Good luck with that.
> So, what do you do when different projects need different versions of libfoo?
Well, typically I would rely on the distro to compile other people's projects as well; and then the distro maintainers would sort that out. That may include making a patch to allow older projects to use newer versions of a library; or, if the library maintainer made breaking changes, it might mean maintaining multiple versions of the library.
Obviously if I myself need a newer version than the distro has, I may need to work around it somehow: I might have to build my own newer package, or poke the distro into updating their version of the library.
I mean, honestly, I don't build a huge number of external projects (for the reason listed above), so I've never really run into the issue you describe. It seems to me that the "language-specific package manager" thing is either a side effect of wanting basically the same dev environment on Windows and macOS as on Linux, or of people just not being familiar with distributions and not seeing their value.
> So, what do you do when different projects need different versions of libfoo?
That’s an untenable situation. The package which depends on the older version of libfoo is either dead, in which case you should stop using it, or it will soon be updated to use the newer version of libfoo, in which case you’ll have to wait for a newer release. This is what release management is.
> The package which depends on the older version of libfoo is either dead, in which case you should stop using it, or it will soon be updated to use the newer version of libfoo
So you are suggesting that every time libfoo bumps its version I have to update the dependencies on all of my projects to use the latest, find and fix all the incompatibilities, test, release and deploy a new version? Seriously?
Yeah, I mean what's the alternative? Your code will bit rot if you don't keep up. You don't have a living software project if you don't do this.
You should read the release notes of the new version of your dependency, fix any obvious issues from that, see if your tests pass, and wait for bug reports to roll in for non-obvious things not caught by automated tests. Ideally you should do this before the new release of the dependency hits the repos of the distros most of your users use, so that it's only the enthusiasts that are hit by unexpected bugs.
Even if you could delay and batch the work together every several releases of a dependency, you're still doing the same amount of work, and it's usually simpler to keep up bit by bit than all at once.
One trick is to not use dependencies that have constant churn and frequent backward-incompatible changes, and to avoid using newly introduced features until it's clear they've stabilised. When you choose dependencies, you're choosing how much work you're signing up for, so choose wisely.
Of course you could go the alternate route and ship all dependencies bundled - but that is a way to ignore technical debt and accidentally end up with a dead project.
Also, your project should not demand a specific version of `libfoo`. If `libfoo` follows semver and a minor release breaks your project, that is a bug in `libfoo`. Deployments of production software should pin versions, but not your project itself.
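Concretely (libfoo and all version numbers below are made up), the project's own metadata declares a tolerant range:

  from setuptools import setup

  setup(
      name="myservice",
      version="1.0",
      install_requires=[
          # Any 1.x release from 1.2 onward is acceptable; if a minor
          # release within this range breaks us, that's libfoo's bug.
          "libfoo>=1.2,<2",
      ],
  )

while the deployment pins the exact build it was tested against (e.g. libfoo==1.6.2 in a requirements.txt or lock file).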
> You don't have a living software project if you don't do this.
What I might have is a working piece of software that is an important part of company infrastructure. For mission critical software reliability is a much more important metric than being current. Unless there is some really compelling reason to update something it should not and will not get updated. There are mission critical services out there running on software that hasn't been changed in decades and that is a good thing.
> Also, your project should not demand a specific version of `libfoo`. If `libfoo` follows semver and a minor release breaks your project, that is a bug in `libfoo`.
Who the fuck cares if it is their bug or not? I need my service working, not play blame games. And if I have a well tested version deployed, why the hell would I want to fuck with that? And if I have tested my service when linked against libfoo 1.6.2.13.whatever2 I had better make sure that this is the version I have everywhere and that any new deployments I do come with this exact version.
But if I start a new project, I might want to use libfoo 3.14.15.whocares4 because it offers features X, Y and Z that I want to use.
Exactly. This is what we signed up for when we release software and commit to keeping it maintained. This is what we do.
If you instead just like writing software and throwing it over the wall/to the winds, you are an academic in an ivory tower, and have no connection to your users in the real world.
> you are an academic in an ivory tower, and have no connection to your users in the real world
Quite the opposite, actually. My responsibility is to my users. And that responsibility is to keep the software as stable as possible. So the only time I will consider upgrading my dependencies is when reliability requires it. If libfoo fixes some critical bug that affects my project, yes, maybe I should upgrade (although I am running the risk of introducing other regressions). If libfoo authors officially pronounce end of life for the version of libfoo I am using, maybe I should consider upgrading, even though it is safer to fork libfoo and maintain the well tested version myself. But it is irresponsible to introduce risk simply to keep up with the version drift. So if my project is used for anything important, I should strive to never upgrade anything unless I absolutely must.
> I should strive to never upgrade anything unless I absolutely must.
It seems that the choice is whether to live on the slightly-bleeding edge (as determined by “stable” releases, etc), or to live on the edge of end-of-life, always scrambling to rewrite things when the latest dependency library is being officially obsoleted. I advocate doing the former, while you seem to prefer the latter.
The problems with the former approach are obvious (and widely seen), but there are two problems with the latter approach, too: Firstly, you are always using very old software which is not using the latest techniques, or even reasonable techniques. This can even rise to the level of bugs – take the MD5 hash, for example: while it was better than what preceded it, much software used MD5 as a be-all-and-end-all hashing algorithm, which later turned out to be a mistake. The other problem is more subtle (and was more common in older times): It’s too easy to be seduced into freezing your own dependencies, even though they are officially unsupported and end-of-lifed. The rationalizations are numerous: “It’s stable, well-tested software”, “We can backport fixes ourselves, since there won’t be many bugs.” But of course, in doing this, you condemn your own software to a slow death.
One might think that doing the latter approach is the hard-nosed, pragmatic and responsible approach, but I think this is confusing something painful with something useful. I think that doing the former approach is more work and more pain from integration, and the latter approach is almost no work, since saying “no” to upgrades is easy. It feels like it’s good since working with an old system is painful, but I think one is fooling oneself into doing the easy thing while thinking it is the hard thing.
The other reason one might prefer the former approach to the latter is that by doing the former approach, software development in general will speed up by all the fast feedback cycles. It’s not a direct benefit; it’s more of an environmental thing which benefits the ecosystem. Doing the latter approach instead slows down all feedback cycles in all the affected software packages.
Of course, having good test coverage will also help enormously with doing the former approach.
> but I think this is confusing something painful with something useful
There is software out there that absolutely cannot break. Like "if this breaks, people will die". Medical software, power plant software, air traffic control software, these are obvious examples, but even trading and finance software falls into this category, if some hedge fund somewhere goes bankrupt because of a software bug real people suffer.
It doesn't matter how boring, inefficient and outdated these systems are. It doesn't matter how much pain they are to maintain and integrate. These are systems you do not fuck with. A lot of times people who do maintenance of these don't even fix known bugs in order to avoid introducing new ones and to avoid the rigorous compliance processes that has to be followed for every release. Updating the software just to bump up some related library to the latest version is simply not a thing in this context.
I am not working on anything like this. I work on a lighting automation system. If I fuck up my release, nobody is going to die (well, most likely, there are some scenarios), but if I fuck up sufficiently, a lot of people will be incredibly annoyed. So I have every version of every dependency frozen. All the way down the dependency tree. I do check for updates quite often and I make some effort to keep some things current, but some upgrades are simply too invasive to allow.
> I am not working on anything like this. I work on a lighting automation system. If I fuck up my release, nobody is going to die (well, most likely, there are some scenarios), but if I fuck up sufficiently, a lot of people will be incredibly annoyed. So I have every version of every dependency frozen. All the way down the dependency tree.
Most people’s systems are not that special that they absolutely need to do this, but it feeds one’s ego to imagine that it is. And, as I said, it feels more painful, but it’s actually easier to do this – i.e. being a hardass about new versions – than to do the legitimately hard job of integrating software and having good test coverage. It feeds the ego and feels useful and hard, but it’s actually easy; it’s no wonder it’s so very, very easy to fall into this trap. And once you’ve fallen in by lagging behind in this way, it’s even harder to climb out of it, since that would mean upgrading everything even faster to catch up. If you’re mostly up to date, you can afford to allow a single dependency to lag behind for a while to avoid some specific problem. But if you’re using all old unsupported stuff and there’s a critical security bug with no patch for your version, since the design was inherently buggy, you’re utterly hosed. You have no safety margin.
This is the danger of living on the edge of EoL, as you called it. Eventually you are forced to update a package, usually on short notice at the most inconvenient time ever, due to some zero-day vulnerability found in one of your dependencies. Then the new version no longer supports the old version of a subdependency, which you also pinned, so you have to update that subdependency too. And to upgrade that package you have to update yet another subdependency, and so on.
Suddenly a small security patch forces you to essentially replace your whole stack. If your tests flag any error you have no idea which of the updated subpackages caused it, because you have replaced all of them. Eventually you accumulate so much tech debt that it's tempting to cherry-pick the security patches into your packages instead of updating them to mainline, sucking you even deeper into the tech debt trap.
Integrating often means each integration is smaller and less risky, and it's easier to pinpoint why a failure happens. Of course this assumes you have good automated test coverage, which I assume you do if the systems are as life-critical as the parent claims them to be.
There's also a big difference between embedded and connected systems here. Embedded SW usually gets flashed once and then just does whatever it is supposed to do. Such SW really is "done"; there is no need to maintain it or its dependencies, because it's not connected to the internet, so zero-days or other vulnerabilities are not really a thing.
Conan is actually quite nice. I've been porting a series of projects to it and it's been a pleasant experience - Conan is very flexible, the documentation is thorough and the developers are very responsive on Slack/GitHub.
https://conan.io/
Maybe for web development and ML. For my core scientific workflows I have about, uuuh, 10(?) libraries. And those are generally stable enough to upgrade one without breaking the others. Seems like the problem is with the users.
Not everyone uses the same 3 commands, which means that for a new Python dev, finding out about venvs isn't as immediately obvious as it should be. Even then it's (sadly) shoved aside as an optional thing.
It's a pain in the ass when you have colleagues who only know JS or HTML/CSS or whatever and they need to install the project to run it on their machines, and you have to explain "virtual envs" and other bullshit that doesn't exist in sane packaging ecosystems to get them up and running.
We also recently switched from Pipenv to pip-tools and so far it has been very pleasant.
Our workflow:
- Use pyenv to manage python versions (mostly works pretty well)
- In the beginning, use builtin python 3 tooling to create virtual env for the project: "python3 -m venv venv"
- Whenever needed, add new libs to "requirements.in"
- Run the "pip-compile" command to generate a new "requirements.txt" with the dependency and its sub-dependencies pinned by default to the exact version
- Run "pip install -r requirements.txt" to install the new package and its sub dependencies
- Check both requirements.in and requirements.txt into version control
The big advantage is that requirements.in specifies just the packages you care about, while requirements.txt has your packages and all of the sub dependencies pinned to the exact version.
The `pip-tools` portion of that is basically what I came up with myself and described in another thread at https://news.ycombinator.com/item?id=21779929 , except that I'm trying to use `pythonloc` to install things in a local `__pypackages__` folder per PEP-582 instead of using venvs.
I recently switched from Poetry (to which I switched after pipenv) to pip-tools for some projects, because Poetry was not able to work properly with some dependencies.
Pip-tools has been a dream. It is just a thin layer of tools on top of pip that separates your 'abstract' requirements (eg: django<3) from your 'release' requirements (eg: django==2.2.5) and manages running pip so that your virtualenv reflects your exact requirements.
No new standard like pyproject that is only partially supported (by third parties, that is), no intending to be (but failing at being) the all-dominating way to do Python packaging.
Just a tool that is there to help you.
edit: it also automates updating the release requirements file (within the constraints of the abstract requirements).
Never used poetry, but pip tools is especially easy to integrate with CI/CD, because they are CLI commands that do specific simple tasks and don't hide things from the caller.
I personally use setup.py (my setup.py only calls setup(), and all configuration is declaratively defined in setup.cfg). Then pip-compile generates a version lock (requirements.txt), and that is passed between environments, so we ensure that the exact same dependencies are installed during deployment.
I like that poetry uses pyproject.toml to replace setup.py, requirements.txt, setup.cfg, MANIFEST.in and the newly added Pipfile. Simplifying it to one file seems smart.
Have a look at https://github.com/jazzband/pip-tools#workflow-for-layered-r... Both the 'input' and 'output' requirements are pip-compatible files (just with different extensions). It uses pip features like '-c' to include other constraints in your requirements.
For me, if they would merge Pip-tools into Pip and call it a day, my package management issues for Python are solved.
They don't do the same thing. pipenv allows you to create and manage virtual environments (using venv), apart from managing a requirements file (Pipfile) and an npm-like locking mechanism, dependency graphs, dev dependencies and more.
I prefer poetry though, since the consensus seems to be coming together on pyproject.toml rather than individual files like Pipfile. A lot of tools have already started supporting the toml file for their config, or have PRs pending.
I understand the value of a locking mechanism in the JS ecosystem because 1/ many packages depend on an intricate web of other packages that overlap, 2/ many packages use semver ranges, and 3/ you don't want two versions of the same package running on a user's browser, due to conflicts and increased size.
I can't think of many Python packages that have the same issues, and Python code isn't sent to and running on a user's browser.
Am I wrong or is there a reason that a locking mechanism (other than git) is helpful in Python?
I think this has less to do with "you don't want two versions of the same package running on a user's browser" and more to do with "when I clone a project and run npm/pip install I want it to be in a known state".
I don't use Python/pip much, but as for npm: the problem is when your dependencies, direct or indirect (dependencies of dependencies), aren't "exact". You have something like "~1.2.3" or "^1.2.3". If every developer followed semver perfectly, never shipped regressions or new bugs when fixing a bug, and was always able to identify every breaking change, then life would be perfect.
That is, however, not the world we live in. So a "lock" file respects your "fuzzy" versions ^/~ when you first run the npm install, and then subsequent runs will install the exact versions you downloaded the first time. This helps solve the "works for me"/"works on my machine" problems. The idea being that if you can run it locally, then the build server and production can also build/run your code.
All of those apply to Python as well except the multiple packages concern has nothing to do with a browser and everything to do with the fact that a given Python process can only load one version of a library at a time (and probably for good reason).
I mean, Rust uses the same locking mechanism to great success. I've never seen a breaking change from upgrading a Rust dependency that preserves semver (which is 99% of them)
> pipenv allows you to create and manage virtual environments (using venv), apart from managing a requirements file (Pipfile) and an npm-like locking mechanism, dependency graphs, dev dependencies and more.
Why on earth would anyone need all this to manage packages for their projects?
Are there any other programming languages whose package-management comes close to this level of intricacy and complexity?
“This project needs package X”
I mean how hard can that be to get right in a self-contained environment?
This whole story is just madness coupled with deep denial and Stockholm syndrome.
> Are there any other programming languages whose package-management comes close to this level of intricacy and complexity?
I think many do. Ruby, Node, Erlang/Elixir, Java, Go, Rust, dotnet, C... I’m having trouble thinking of a modern language that doesn’t have such package management mechanisms.
For many people doing anything more than writing one-off scripts, and especially for anyone who collaborates with others or shares their code, package management is so much more than just “this project needs package X”.
Your original comment was responding to “create, manage virtual environments(using venv) apart from managing requirements file(Pipfile) and an npm like locking mechanism, dependency graphs, dev dependencies and more.” and saying that this is overly complicated. Yet every single one of the listed languages has package management tools that do all of these things.
If you think any of the listed examples have “simple” package management tools, I question how deeply you have used any of them. NPM has ~60 commands and hundreds of subcommands, each with multiple option flags, and probably hundreds more config options. Gem/Bundle is similar, etc.
If anything, Python is trying to catch up in how complex its package managers can be.
For most other language-provided package-managers the software project you’re working on is the env, so you don’t need to construct or manage a venv at all.
> > pipenv allows you to create and manage virtual environments (using venv), apart from managing a requirements file (Pipfile) and an npm-like locking mechanism, dependency graphs, dev dependencies and more.
> Why on earth would anyone need all this to manage packages for their projects?
> Are there any other programming languages whose package-management comes close to this level of intricacy and complexity?
> “This project needs package X”
> I mean how hard can that be to get right in a self-contained environment?
Packages often have subdependencies and their requirements at times may conflict. If you are very specific in the versions you want, it is more likely to cause issues in dependency resolution.
I would hardly call a dev's machine a "self-contained environment". Most developers I know work in a number of repos with varying requirements, and installing each project's requirements into the system libraries and packages quickly pollutes the system and can lead to issues.
>I mean how hard can that be to get right in a self-contained environment?
Okay, I’ll bite.
What version of package X does it need?
Is there a specific version that’s been tested with this project and is known working?
Are there specific versions of its dependencies that have been tested and are known working?
Is it needed at runtime or only at build-time?
What repository can it be found in? PyPI is not the only Python repository; private repos are common.
And as a bonus cherry on top: how easy is it to make sure you have all the project’s dependencies installed in the venv for that project, and that you don’t have packages you’re not keeping track of? This is a UX thing, but developers are human and it matters.
I don't get it either... nor do I understand why venv is considered difficult to use.
"It automatically creates and manages a virtualenv for your projects, as well as adds/removes packages from your Pipfile as you install/uninstall packages."
Well, thanks, but automate "python -m venv myvenv"? Add/remove packages from a "Pipfile"? Do I have to specify dependencies somewhere else than requirements.txt? Why?
Requirements.txt only lists versions of your project's requirements, but Pip actually automatically installs dependencies of those requirements too. And those versions aren't listed in your requirements.txt.
But you don't want to necessarily freeze them to their current state. Yes for repeatable builds, but not in your list of dependencies. They're different concerns and Pipenv splits them.
Giving a shout out to pip-tools here... it's a great, simple solution for pinning your requirements (including the deps of your deps), and keeping your venvs in sync.
Yes, that 'freezes' everything installed... but that's the point. If you want to update a direct dependency, you just do pip install -U <my-dep>. Any indirect dependencies are updated, and you freeze again.
Sometimes direct-dependency-a and direct-dependency-b have a conflict on which version of indirect-dependency-y you need. This is what we call doing 'actual work.'
^This. I don't understand everyone in this thread complaining that it's hard to update a direct dependency. You literally just pip install it and "pip freeze > requirements.txt" again.
In my experience, the issues come around when you try to build envs cross-platform. There are a lot of dependencies that have missing versions or bugs for certain platforms. This is not a pip/venv problem though--it is more of a python problem.
That makes updating your direct dependencies harder. If you upgrade to a newer version of something, and that thing's dependencies have changed, you have to manually figure out which dependencies in requirements.txt were for it, and update or remove those, etc.
> It’s easy to also get the versions of all sub dependencies and put them in requirements.txt
Is it easy to automatically do this when you want a new package version, without having to remember to do it? Is it easier to put together this process and train other developers in it and be diligent in its use? Is all of that easier than installing an application that has a similar interface to other tools, that does all that for you, and has a community of people to help with issues?
But then you're suddenly responsible for keeping track of your subdependencies and updating the versions of each that you want. That should be up to the dependencies.
Also if you manage to drop a dependency, you don't have an easy way to remove the things from requirements.txt that are only there because they're a subdependency.
Putting it all in one requirements.txt is just too simplistic.
It also adds dependency management. If one subdependency's requirement is library_a > 1.0, and another's is library_a < 2.0 while e.g. 2.1 also exists, then it will try to find a version between 1.0 and 2.0. Pip doesn't do that.
So in my mind, that's what pipenv is -- pip, virtualenv, those two files, plus dependency management.
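As a toy illustration of that resolution step (package names and versions made up), using the `packaging` library that pip itself vendors (`pip install packaging` to import it directly):

  from packaging.specifiers import SpecifierSet
  from packaging.version import Version

  # One dependency needs library_a > 1.0, another needs library_a < 2.0.
  combined = SpecifierSet(">1.0") & SpecifierSet("<2.0")

  available = ["0.9", "1.4", "1.9", "2.1"]
  compatible = [v for v in available if Version(v) in combined]
  print(max(compatible, key=Version))  # picks 1.9, even though 2.1 exists

That intersection step is the "dependency management" that plain pip doesn't do.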
2. Because it's not at all clear that pipenv is a third party library. It's made by the same group that makes pip, so it's confusing that pip would be considered a de-facto standard but not pipenv when it's made by the same group, and under the same project in Github.
Whilst you might have to define "in the standard library", pip is added in PEP 453 [0] (accepted in 2013), in a similar capacity to venv:
> However, to avoid recommending a tool that CPython does not provide, it is further proposed that the pip [18] package manager be made available by default when installing CPython 3.4 or later and when creating virtual environments using the standard library's venv module via the pyvenv command line utility.
Isn't pipenv a dependency management system that includes the python distribution by way of managing a venv, rather than a simple replacement for venv, sort of like poetry (though poetry, I think, does a better job, but for the problem that it doesn't seem to respect SSL options available for pip which are often needed in enterprise environments)?
pipenv does more than just create a venv, although it is my favorite tool for that. The most important thing it does is freeze the dependency tree using Pipfile.lock
The idea is to make builds (more) reproducible. I can build a Python program, test it thoroughly, and then be reasonably assured the whole thing won't come crashing down in CI/CD from a bad update to a transitive dependency. Then when I want to update the libraries I know I'm doing it purposefully and can commit the new dependency tree to source control.
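A rough sketch of that loop, with requests as a placeholder dependency:

    pipenv install requests    # adds it to Pipfile, resolves, and writes Pipfile.lock
    pipenv sync                # on CI or another machine: install exactly what the lock says
    pipenv update requests     # later, deliberately re-resolve and commit the new lock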
probably recording exact dependency versions, based on a loose requirements.txt and when it was built.
You may want this because you have a dependency that you shouldn't be pinning down to the patch version of a sem-ver package, but that you also don't want hiccuping in CI because of a dot-release.
Or maybe you think a loose file your tooling can read, and a hyper-specific file your builder should read, is a better interface for a project.
Yes, kind of like that. Except that it doesn't use requirements.txt but rather a file called Pipfile. In there you can also pin versions, or leave them unspecified or only partially specified, and you can also divide them into dev-packages and normal packages (so it allows for a bit more flexibility than a requirements.txt file).
> a bit like "pip freeze > requirements.txt" then?
With the added bonus that it also contains a hash of the package so if someone pushes a new version with the same version number it would complain that the hashes don't match.
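For comparison, plain pip can get part of the way there with hash-checking mode. A sketch with a placeholder digest (pipenv records the real ones in Pipfile.lock automatically):

    # requirements.txt
    requests==2.25.1 --hash=sha256:<digest copied from PyPI>

    pip install --require-hashes -r requirements.txt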
The Python Packaging Authority declared it the future of dependency management once upon a time and it nominally checked some important boxes such as managing a lockfile.
I was pretty badly turned off by Kenneth Reitz and the way he handled conflict with Pipenv. I disliked how, instead of listening to feedback or being constructive, he just gave off a kind of fuck-you attitude. There were real and critical issues with pipenv that he would not budge on, and it truly felt like he was in the minority. I know, it's his project and he can do what he wants, but in context it did not make sense. I especially disliked it because of how he sold Pipenv as being officially endorsed by Python (which it's not and never has been), and it did trick me for a short time. Finding out it was never officially endorsed is what made me never want to support any of his projects again.
> Use Pipenv to manage library dependencies when developing Python applications. See Managing Application Dependencies for more details on using pipenv.
> Consider other tools such as pip when pipenv does not meet your use case.
Is this [0] not an official endorsement? It certainly seems as much.
Oh wow, new to me. I had not seen that one. Even more disappointing because the tool is far from prime time. Going back to the original point: a few years back when I was using pipenv, it carried the tagline that it was the future of Python packaging and fully endorsed, but I'm pretty sure that was a stretch, or perhaps the maintainer was leaning on his clout and his own recommendation.
Those pages are managed by the "PyPA" group, not technically python.org itself, although they were given a subdomain there. Start here and follow the links for more info: https://hynek.me/articles/python-app-deps-2018/
I'm aware that the Python Packaging Authority is not the Python Software Foundation. But they are also the Packaging Authority.
The PSF may not have officially granted them some sort of status, but as PyPA maintain pip, setuptools and warehouse, they are in fact the authority when it comes to packaging, unless the PSF comes out with a statement saying they aren't.
By having the subdomain, they're endorsed (possibly transitively) by whoever runs python.org. If they want to _not_ endorse whatever PyPA is doing, they need to not supply the subdomain.
You know who else handled conflict in a way that wasn't always understood? Linus Torvalds. I think there's a theme here. Why is it that successful project maintainers sometimes lose their patience with the community? Remember when GvR left the Python community? I think one of his final public statements was, "Now that PEP 572 is done, I don’t ever want to have to fight so hard for a PEP and find that so many people despise my decisions."
I thought he was apologizing for his use of personal attacks, profanity, insults, and generally what he describes as a lack of empathy. Things like that. His hardliner attitude hasn't changed.
I can kind of see a difference though. I am not condoning some of the ways I have seen Linus handle threads, but it at least always seemed that there was a valid reason at the root. Maybe it was not communicated well, but there did seem to be a reason. In the case of Pipenv, there were broken workflows that would have made the tool unusable for a large portion of the community, and the response was just to go pound sand? That specific case has been resolved since then, but I dropped using it after that thread came up.
This is all MIT licensed, if people care so much, why has nobody forked this? Why are people talking about jumping ship to a completely different project instead of forking and cutting a new release from there?
It looks like 1.4k people have forked it. The question is, which fork do I use? The problem is not that the source is unable to be updated, the problem is how do you organize people's efforts under a trusted maintainer long term? How do I know which forking effort to trust?
My understanding is that that's kind of the point of groups like the "Python Packaging Authority". So if they're not going to merge pull requests and do maintenance on the project, that IS a problem, since they're supposed to be the official version right now.
Start with the forks of developers who have publicly stated they'd like to take over maintenance of this project.
I'd bet that narrows it to less than ten.
Now, have a look at the blog posts where these maintainers explain their plan to sustain the project going forward and choose the most persuasive one.
Github forks are meaningless. Sometimes they're forked because people think it's the same as the "star" button. Sometimes it's to have a classy project show up on your profile. Sometimes it's because you want to submit a PR. Sometimes it's because your company uses the software but isn't willing to depend on the public mainline.
There are over 500 commits merged to master since the last release. The community is actively contributing but these changes don't get to the end user because the maintainers are not releasing them.
Bugs get reported and closed because they are fixed in master every day, wasting not only the end user's time but also that of the people actively working on the project.
The problem is the name, not the content. There are lots of guides, blog posts, etc. which say "use pipenv, it's awesome". However, if one actually tries to do this, as the bug says, one gets a very old release with seemingly no chance of bugfixes.
This means a bad experience for people who try to use it, and it makes the whole Python ecosystem feel a tiny bit worse too. Imagine the frustration of someone reading the blog post, spending all that time learning the system, using it, and then discovering their bugs won't ever get fixed!
It is very easy for the author to fix: just a small, four-line commit to the readme and website saying "the project is dead, go elsewhere". This will allow people to move on, maybe to a fork, or to some other project which does a similar thing.
The title and comments in that issue thread show that people misunderstand how FOSS works. The maintainer is not 'the project'; they're just the person currently shaping the direction of incoming pull requests.
This viewpoint doesn't account for authenticity as verified by, e.g., PyPI. Yes, you can use anybody's pipenv, but most people would greatly prefer not to go to such lengths.
> I don't think you're bad people, but the least y'all can do is be honest
I don't like this framing (it sort of implies they aren't being honest), but regardless, just switch if you're not happy with the release frequency and you have viable alternatives.
Especially if they haven't even said anything yet. A better suggestion would be "It's okay if you're not working on this, but let the community know." I have a feeling some use this for production grade work.
For me, and perhaps others, there is a desire to see this project explicitly move aside (vs slowly die), so that poetry and other projects can take the reins.
A lot of people say “just fork it” or “choose something else”, but the problem is that python is a finite community with a finite amount of energy, and a lot of this energy has been absorbed by the star power Kenneth acquired from his prior successes (namely, requests... which btw now has an even better replacement called httpx).
It’s almost like Kent needs to come out and say “I’m sorry I screwed up, here’s my towel; good night.”, so that people can move on.
There are a couple of issues with "moving on". Since the project is owned by PyPA, it feels like it should be the default for Python projects. Pipenv has a good idea but, in my opinion, a flawed implementation, and maybe all my issues are already resolved in those 600+ unreleased commits.
About the framing of the question, I think it's because of all the flame wars in the past when people criticized pipenv and the maintainers took it a bit personally.
I am one of those grubby little "dark matter" developers. I fled Perl for Python over a decade ago. While I do love Python, one of the impediments (not the only one, and perhaps not the largest) to my progress is the endless packaging issues. Among the larger attractions of Python is that there is supposedly one obvious way to do things, and here there is not. Instead, I must make my selection largely based on opinions that have the same foreboding stink as those I associate with arguments over Linux distros: dashed-off dismissals starting with "just" and drive-by engagements with the topic.
As a result, I end up rarely going outside of the standard library. In a perverse way, being locked on an un-upgradable (due to Reasons) version of Python 2.7.5 for the foreseeable future has helped put that temptation a bit further away.
Yes, Packaging Is Hard. It is certainly beyond me. I will probably never need to package anything I wrote, much less distribute it, so many of my concerns are purely academic.
Rather than fussing over things like the walrus operator (really, c'mon), I would love to see those who steer Python buckle down on issues like this, solve them, and then relentlessly backport the solution further back than everyone thinks is reasonable.
Here's how you solve this problem. I do this on my own projects and with Chart.js which we resurrected from the brink of a "2.0 is coming... please wait" cliff.
1. Add a project scope to the README -- this gives you and volunteers grounds to close issues that are not relevant.
2. Send a personal email to your 3 top contributors -- ask if you can give them push access (even if they aren't recent contributors) at a minimum as an insurance policy.
3. Automate or at least specify your release process. You should be able to do this reliably, while drunk and high, and when you have 17 other projects needing your attention. Example: https://github.com/fulldecent/FDWaveformView/blob/master/CON...
I have been involved in a few "takeovers" to implement the above and keep great projects running. A little structure and a human touch go a long way. You don't need to fork and be the new dictator to keep something great moving.
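For point 3, one hypothetical shape for a scripted release (this assumes the build and twine packages and a made-up version number; your project's steps will differ):

    # release.sh -- bump the version first, then:
    python -m build                            # build sdist and wheel into dist/
    twine upload dist/*                        # publish to PyPI
    git tag v1.2.3 && git push origin v1.2.3   # record the release point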
Conda works with a different, parallel ecosystem, whose main source of packages is managed by a single company (Anaconda Inc). That company validates, rebuilds, and possibly silently patches code, to provide their own packages.
pipenv brings a simpler workflow to pip. pip leverages packages which are published by their authors onto PyPI, which is run by the Python Software Foundation.
> whose main source of packages is managed by a single company
That's not true today: conda-forge is a community-led effort that has open-source recipes for the packages built by Anaconda Inc., plus many more contributed by the community. I run conda with conda-forge packages only and it works great.
My limited experience is that Conda is great if you are all-in on the parallel ecosystem, but it doesn't play well with others. Or at least, it didn't for me.
This may have been true at the beginning, but nowadays I use “pip install” extensively in conda-created environments. Are there particular packages that you have trouble with?
The problem with this is that pip will install dependencies of the package you're installing, not knowing (or caring) that those dependencies are already available in the conda repositories.
Later, conda may install a different version of the same dependency as a dependency of something else. Depending on how exactly they are installed (egg, zipped egg, whether the folder has the version number in it), you either get two versions of the same package installed, with which one gets imported being arbitrary, or you get two sets of metadata, with one of them not matching what is actually installed, such that pip may think version requirements are satisfied when they are not. It's messy as anything, and the breakage can be subtle. I distribute packages to users who use conda, and my packages have dependencies that are available in conda, so this has been messing with a lot of my users' installs. I'm now just making conda packages for these projects to solve the issue.
I made this package [1] to try and automate the process of making conda packages out of my existing setuptools packages. I'm quite happy with it, but since it is designed to serve the needs of my projects, I can't guarantee it will suit everybody's needs.
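For what it's worth, a partial workaround on the user side (package names below are placeholders) is to let conda supply the shared dependencies and keep pip from re-installing them:

    conda install numpy scipy            # get the shared dependencies from conda channels
    pip install --no-deps some-package   # then let pip install only the package itself

It's fragile, though, which is why shipping proper conda packages, as above, is the cleaner fix.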
Very different ecosystems. It's like a pancake recipe on a vegan site versus a regular one: it does the same thing but targets a different audience. For example, I've rarely seen install instructions for packages cover conda except when the package is DS/ML oriented.
Conda is more generic - it allows packaging and distributing software written in languages other than Python. You can distribute Java-based software, C, etc.
There was a pull request merged a month ago. Looking at the commit history it doesn't seem dead to me. It might be more accurate to call it "poorly managed" or "uncommunicative".
The issue was posted 13 hours before your comment. I think it's reasonable to wait at least a day for somebody to respond before pronouncing them dead.
I finally tried out pipenv a few months ago but I ran into a huge showstopper: pipenv does not allow you to target multiple versions of Python. I've seen several pull requests requesting this feature, but the author has officially stated that he does not plan to support it.
My company still primarily uses Python 2, but I've been trying to push them towards 3. I need to support both in the meantime, and that simply isn't an option with pipenv. If pipenv truly does die out and something else takes the spotlight, I really hope they include the feature.
An issue that was opened 16h ago, now on HN? If I was in the maintainer's position I'd not respond now just so I don't create a precedent. This looks like bullying.
I think for each project on Github it would make sense that the community can vote to have their own "official" community fork. PRs are then accepted by community approvals.
Changes can be merged back to the original repo. The maintainer could also declare the community fork to become the official project (and get its name).
This avoids lots of hassles with blaming the maintainer or organizing the community who want to help.
For situations like this it would be very useful if there was a way to easily share or make use of the PRs and issues of forked repos. I mean, it sounds quite unnecessarily labor intensive to just create a fork and start merging those PRs?
That way the barrier for moving to a different fork would be lowered.
We have a lot of Python code, along with C++ and Java, and it is an absolute nightmare dealing with Python. If we include conda-forge as a channel, dependency resolution can take hours with the right/wrong combination of packages. We can't easily replicate environments across machines, and build times for our conda packages can take tens of minutes, just to copy a few py files around. We hate it, but our ecosystem is built on it (data science/ML/etc), so it is what we have to deal with. I always look at these threads with hope, but then envision the nightmare that would be involved in moving to poetry or some other system.
The Python ecosystem is weird. I've tried many different so-called solutions, i.e. Anaconda, pip, and the like. Admittedly I am a Python beginner, so I may be very wrong about this, but it kinda confirms to me that there is no one good solution yet.
There are commits, but no release. It's very frustrating if you depend on any of those commits and there is no one available to cut a new version, or to pass on the torch.
For four years there were commits to the beanstalkd project but no release; it took about a year after the issue was first raised until someone appeared and a maintainer was found: https://github.com/beanstalkd/beanstalkd/issues/399
Pipenv isn't stable and reliable yet though, at least compared to pip+venv. The tooling is really handy and it makes it in theory easier to write Python code, but in practice there is still work to be done.
This is a good question. I have wondered the same thing about software in general. In this specific case, though, there are 313 open issues and 33 pull requests. So people are finding bugs in the current version (and fixing them).
I used to mainly/solely use pipenv. Since it has been stale for quite a while now, I switched to the default venv, which is good enough for my use cases, and probably for 90% of other users' cases as well. I no longer feel pipenv is that important at all; get over it and move on.
I wonder if some automatic governance model could be put into place to ensure a project never dies. Of course anything I can think of has the potential for abuse so maybe forking is the best solution after all.
I see a lot of comments about "hey it's open-source, just fork". The reason people feel upset is because this project was shilled hard when it was released. The python packaging team was officially recommending it, stuff like that. There was some backlash because of legitimate usability concerns with the software, and what was perceived as the tacit blessing of a project solely due to the maintainer's having written another popular project.
That's the real issue here: prominent, established projects recommending other projects that are not remotely as well established, when such a recommendation may suggest that they are.
Big, high exposure projects should probably wait to recommend interesting new projects for actual use until they are sure that the management of that project is in reliable hands with enough backup to continue it if the original creator vanishes.
I think that's the default for most high exposure projects. Unless, of course, both projects have the same author, and that author's primary concern is increasing their own exposure.
The cult of personality that is being nurtured around certain members of the Python community definitely harms the Python ecosystem, because solutions with glaring flaws get a pass when they are stamped with someone's name, and the mismanagement of projects is rarely addressed.
Then there's also the issue of such members of the community exploiting the human tendency to worship others.
You said the "project was shilled hard when it was first released". Why use a word with such strong conspiratorial connotations? Couldn't you just as easily argue that a lot of attention was given to the project without implying that it was because of a shill conspiracy?
What concrete evidence leads you to the conclusion that coordinated shills are responsible for giving the project attention?
> The python packaging team was officially recommending it
NO! This is a common misconception[1].
Edit: Correction, it's not endorsed by the core Python team, but it is recommended by the Python Packaging Authority in various places. See replies below for more info.
Look, if you can’t trust the endorsement of the organization that otherwise develops and maintains all of Python’s official packaging tools, who are you supposed to trust? Guido and only Guido?
> The thing that made it “official” was a short tutorial [1] on packaging.python.org, which is the PyPA’s packaging user guide. Also of note is the Python.org domain used. It makes it sound as if Pipenv was endorsed by the Python core team. PyPA (Python Packaging Authority) is a separate organization — they are responsible for the packaging parts (including pypi.org, setuptools, pip, wheel, virtualenv, etc.) of Python
However, over here [0] I see pipenv, but not those:
> Use Pipenv to manage library dependencies when developing Python applications. See Managing Application Dependencies for more details on using pipenv.
> Consider other tools such as pip when pipenv does not meet your use case.
> This same tutorial cites pip-tools, hatch, poetry. Does that mean that they “endorse” these tools as well?
Those tools are relegated to a small section at the end, whereas the main body of the text says "Pipenv is recommended for collaborative projects..."
I'd say that counts as a specific endorsement of Pipenv. I guess it's an endorsement by the "Python Packaging Authority" rather than "Python Core" but it's pretty hard for someone at the tutorial-reading level to perceive the difference.
The problem is that plain pip does not resolve incompatible dependencies. Pipenv (and better: Poetry) do. Here's an example:
Your project depends on packages Spam and Eggs. Both of them depend on another package, Foo. The problem is, maybe they don't depend on the same version:
Spam 1 depends on Foo 2 or newer.
Spam 2 depends on Foo major version 2, but it's known to be broken with Foo 3.
Eggs 1 depends on Foo 2 or newer.
Eggs 2 requires some of the new features in Foo3.
If you run "pip install spam eggs", it will do something like this:
- Spam version 2 is the newest, so I'll install that.
- Spam 2 depends on Foo 2, so I'll also install Foo 2.
- Eggs 2 is the newest, so grab it!
- Oh no! Eggs 2 depends on Foo 3, but we've already installed Foo 2, so I'll warn you that I couldn't install Foo 3 and then keep on going.
Now you're in a state where "import eggs" will fail because its dependency on Foo 3 wasn't satisfied.
Suppose you use Pipenv (or better: Poetry) instead. It will do something like this:
- What's the newest version of Foo that can satisfy requirements for both Spam and Eggs?
- What's the newest version of Spam and Eggs that can use the version of Foo we identified in the previous step?
- Install Foo 2, Spam 2, and Eggs 1.
Now you have Eggs 1 instead of Eggs 2, but all the versions play nicely together. And if you say "but I really want Eggs 2!", it would give you an error message like "well, that requires Foo 3, but that means you'll have to roll back to Spam 1. Are you OK with that?" The point is that it figures all this stuff out for you and gives you the information to make smart decisions.
This is "dependency resolution". Poetry is great at it. Pipenv is decent at it. Pip doesn't do it at all. That's the real reason these tools are becoming popular.
Here is a radical approach to software development: paying for its development and maintenance. How does that sound as a disruptor? Pay for pipenv's continued development, and for other software you use and depend on.
At the same time, I think there are some corner cases, because the nature of Python packages can be a bit different compared to other packaging systems. setup.py is actually a Python file and it is executed when the package is processed. So it's possible that, depending on which libraries exist on your system, you will get very different results. For example, a package can require very different dependencies if you build on Windows versus Linux. In that case, hashes can't do much to guarantee consistency.
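For example, a hypothetical setup.py might do something like this (the package and dependency names are made up):

    # setup.py is ordinary Python, executed at build/install time,
    # so the dependency list can legitimately differ per platform.
    import sys
    from setuptools import setup

    install_requires = ["requests"]
    if sys.platform == "win32":
        install_requires.append("pywin32")   # only wanted on Windows

    setup(
        name="example-package",
        version="0.1.0",
        install_requires=install_requires,
    )

Environment markers in declarative metadata are the cleaner way to express this nowadays, but plenty of packages still branch inside setup.py like this.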
Unrelated: Is there anyone here who has a substantially large project which they would like to open-source but have refrained from doing so for fear of getting dissed by the community because it wasn't written up to standard, or because you know you won't be able to maintain it?
Not big projects but this was my fear when I started open sourcing my hobby projects. At one point you will learn how to stop worrying and love the open source.
There will be all kinds of responses. Indeed, people complaining about the quality, but also people who help you get up to standard (also remember, there is not one standard).
People who find a spelling error in your README.md and make their first open source contribution to help you fix it. From hostile complainers to humble users: you can get the entire spectrum.
Find your own level of engagement and don't get stressed out by feeling guilty. But whatever you do, imho, always communicate something. Just putting a note at the top of your project's readme that you have taken a break and don't know when you'll return is always better than leaving people guessing.
As for the Pipenv project: I think people held it to the highest standard because it was actively endorsed as the one true new way of managing packages, but it ended up not trying to solve everyone's problems. It's OK if you don't want to solve everyone's problems, but make that clear in how you promote your project, not by quietly shutting down issues and PRs.
> I'm not mad, I don't think you're bad people, but the least y'all can do is be honest with us
How entitled can someone be? You have the source code; if you're not happy, why not contribute yourself?
It's like some people don't realize most of these projects are other people working for free in their free time and giving the result away to anyone who feels like using it.
Because only one person can make a release. That's exactly the complaint: people have been contributing, a lot... but the maintainer won't release the contributions.
How would you feel if you fixed a bug more than two years ago, your PR was accepted... but users still do not have the fix?
I wouldn't feel a thing; I would release the patch on my own fork and people would be free to merge my patch into their own fork or use mine as a base. It requires more work, sure, but how else would you solve that problem? People have their lives; they can work hard on some project, then drop it and never touch it again. You cannot force people who willingly give their time away to keep doing it for other people all their lives.
Do you think that a trivial, 4-line commit to README saying "this project is no longer maintained, look elsewhere" is too much to ask?
I agree that you cannot force people to give their time away, and anyone can, and should be able to, stop maintaining any project they have.
But putting up an "I am done" notice will likely take under 30 minutes, and will help a lot of people. It's a good idea to do it just out of respect for all those people who spent their time crafting those 669 commits which will never make it into a release now.