If you publish a Python library without pinned dependencies, your code is broken. It happens to work today, but there will come a day when the artifact you have published no longer works; it's only a matter of time. The command the user ran before, like "pip install spacy==2.3.5", will no longer work, and the user will then have to go to significant trouble to find the set of versions that worked at the time.
In short, unpinned dependencies mean hopeless bit-rot. They guarantee that your system is a fleeting thing: that you will be unable to publish today an end-to-end set of commands that will still work in 2025. This is completely intolerable for practical engineering. To fix bugs you may need to go back to prior states of a system and check behaviours. If you can't ever go back and load up a previous version, you'll run into some extremely difficult problems.
Of course the people who are doing the work to actually develop these programs refuse to agree to this. No we will not fucking unpin our dependencies. Yes we will tell you to get lost if you ask us to. If you try to do it yourself, I guess we can't stop you, but no we won't volunteer our help.
It's maddening to hear people say things like, "Oh, if everyone just used semantic versioning this wouldn't be a problem". Of course this cannot work. _Think about it_. There are innumerable ways two pieces of code can be incompatible. You might have a change that alters the time-complexity for niche inputs, making some call time out that used to succeed. You might introduce a new default keyword argument that throws off a `**kwargs` forwarder. If you call these things "breaking" changes, you will constantly be increasing the major version. But if you increase the major version every release, what's the point of semver! You're not actually conveying any information about whether the changes are "breaking".
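The `**kwargs` case is easy to demonstrate. Here's a hypothetical sketch (the function names are invented): adding a default keyword argument upstream silently captures a name a caller was forwarding through `**kwargs`, so behaviour changes without any exception being raised.

```python
# Hypothetical library function, v1: accepts free-form options via **kwargs.
def render_v1(text, **options):
    return f"{text}|{sorted(options)}"

# v1.1 adds an innocent-looking default keyword argument.
def render_v2(text, style="plain", **options):
    return f"{style}|{text}|{sorted(options)}"

# A caller that happened to forward its own 'style' key worked with v1:
opts = {"style": {"bold": True}}
before = render_v1("hi", **opts)   # 'style' lands in **options, as intended

# With v2 the same call still runs, but 'style' is now captured by the new
# parameter instead of landing in **options -- behaviour changed silently.
after = render_v2("hi", **opts)
print(before)
print(after)
```

Nothing crashes, nothing warns; the caller's option just quietly changes meaning. Under a strict reading of semver, this is a breaking change hiding in a "minor" release.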
Python libraries should not pin dependencies. _Applications_ can pin dependencies, including all recursive dependencies of their libraries. There are tools like Pipenv and Poetry to make that easy.
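As a sketch of that division of labour (the package names and bounds below are illustrative, not recommendations):

```ini
# Library side: setup.cfg declares loose, honest bounds.
[options]
install_requires =
    spacy>=2.3,<3.0
    requests>=2.20

# Application side: requirements.txt (e.g. `pip freeze` output, or a
# Pipenv/Poetry lockfile) pins the full resolved tree exactly:
#     spacy==2.3.5
#     requests==2.25.1
#     certifi==2020.12.5
#     ...and every other transitive dependency
```

The library stays composable with other libraries; the application gets a reproducible install.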
This is less of an issue in (say) Node.js, where you can have multiple different versions of a library installed in different branches of the dependency tree. (Though Node.js also has a strong semver culture that almost always works well enough that pinning exact versions isn’t necessary.)
> Python libraries should not pin dependencies. _Applications_ can pin dependencies, including all recursive dependencies of their libraries.
Is the pypi package awscli an application or a library?
poetry is frustrating in that it doesn't allow you to override a library's declared requirements to break conflicts, and they refuse to add support for the feature too. awscli, for example, causes huge package-conflict issues that make poetry unusable. It's almost impossible not to run into a requirement conflict with awscli if you're using a broad set of packages, even though awscli will operate happily with a broader set of requirements than it declares.
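A hypothetical sketch of the dead end this produces (the version numbers and the second package are invented for illustration):

```toml
[tool.poetry.dependencies]
python = "^3.8"
awscli = "^1.19"     # internally pins e.g. botocore to a narrow range
some-lib = "^2.0"    # needs a newer botocore than awscli declares

# Poetry's resolver refuses this combination outright, and there is no
# override mechanism -- even though awscli would run fine with the newer
# botocore in practice.
```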
The awscli documentation recommends installing it into its own virtualenv, in which case pinned dependencies may be reasonable. There are tools like pipx to automate that.
Though in practice, there are reasons that installing applications into their own virtualenv might be inconvenient, inefficient, or impossible. And even when it’s possible, it still comes with the risk of missing security updates unless upstream is doing a really good job of staying on top of them.
I don’t think that respecting declared dependency bounds is a Poetry bug. Pip respects them too (at least as of 20.3, which enables the new resolver by default: https://pip.pypa.io/en/latest/user_guide/#changes-to-the-pip...). If a package declares unhelpful bounds, the package should be fixed. (And yes, that means its maintainer might have to deal with some extra issues being filed—that’s part of the job.)
You should use boto3
Hopefully a library! As hopefully the AWS command-line interface is maintained and distributed separately from any SDK that powers it...
This is essentially what we do where I work. When we make a tagged release, we create a new virtual environment, run a pip install, run all the tests, and then run pip freeze. The output of pip freeze is what we use for the install_requires parameter in the setup method in setup.py.
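A minimal sketch of that last step, assuming a hypothetical `parse_freeze` helper that turns `pip freeze` output into the list handed to `install_requires`:

```python
def parse_freeze(freeze_output):
    """Turn `pip freeze` output into a list of exact pins for setup.py."""
    pins = []
    for line in freeze_output.splitlines():
        line = line.strip()
        # Skip blanks, comments, and editable/VCS installs (-e ...),
        # which have no meaningful exact-pin form.
        if not line or line.startswith("#") or line.startswith("-e"):
            continue
        pins.append(line)
    return pins

example = """\
certifi==2020.12.5
chardet==4.0.0
# via requests
requests==2.25.1
"""
print(parse_freeze(example))
```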
That said, a library could certainly update its old releases with a patch release that specifies a <= requirement on a particular dependency once versions newer than that no longer work. It would be a bit of work, though, since indirect dependencies would also have to be accounted for.
One of the things that prompted the OP was this breakage in Python's cryptography package (OP actually opened this issue) due to the introduction of a Rust dependency in a 0.0.x release. The dependency change didn't change the public API at all, but still caused plenty of issues downstream. It's a great question on the topic of semver: how should you handle major dependency changes that aren't API changes? Personally, I would have preferred a new major release, but that's exactly your point, syllogism - it's a matter of opinion.
As a sidenote, Alex Gaynor, one of the cryptography package maintainers is on a memory-safe language crusade. Interesting to see how that crusade runs into conflict with the anti-static linking crusade that distro packagers are on. I find both goals admirable from a security perspective. This stuff is hard.
Asking a publisher to qualify their library against a big range of versions just means that they need to do a lot more testing and support. Obviously they want to validate their code against one version, not 20, and they certainly don't want an open-ended >= constraint, which would force them to re-validate every time a new version of a dependency is released.
Similarly, when publishers say "I will only work against version X", this puts a bigger burden on users to configure their dependencies and figure out which versions they can use. They would like to push that work onto vendors.
What's a bit depressing is that these economic concerns are not raised openly as the primary subject matter; the discussion is always veiled in terms of engineering best practices. You're not gonna engineer your way out of paying some cost. Just agree on who bears the cost and how they will be compensated for it, and then the engineering concerns become much easier.
Pinning dependencies in applications/binaries/end-products is clearly the right choice, but it’s much fuzzier for libraries.
There's pretty much no point in setuptools automatically installing library dependencies for you if you expect the library dependencies to be unpinned. In fact it would be actively harmful --- it just leads people to rely on a workflow that works today but will break tomorrow.
You're asking for an ecosystem where there's no easy way to go back and install a particular version of a particular library. That's not better than having version conflicts.
The other thing I'd note is that it's quite an understatement to say that pinning dependencies makes life "slightly easier" for library developers. We're not going to accept builds just breaking overnight, and libraries that depend on us aren't going to accept us breaking their builds either.
(At the app level, the right approach to “going back in time” is for those apps to pin all their deps, with a lockfile or ‘pip freeze’, not just top level ones. That is, one records the deps of requests==1.0.5 in addition to requests itself.)
The only conflict I've seen that can't be automatically resolved is when I had some internal dependencies with a common dependency, and one depended on the git repo of the common dep (the "version" being the sha hash of a commit), and another depended on a pinned version of the common dep. Obviously there's no good way to auto-resolve that conflict, so you should generally stick with versions for library deps and not git shas.
> you will be unable to today publish an end-to-end set of commands that will work in 2025
Since ~2010 I have maintained an application with an unpinned requirements.txt; it doesn't even have any version constraints at all.
The only breakages I had were either:
1. when switching from Python 2 to Python 3 (obviously)
2. when a new Python version introduces a bug (but Python is not pinnable anyway)
3. once, when a dependency released a new major version and removed an internal attribute I was using in my tests out of laziness (so that one is entirely on me)
The trick is to only use good libraries, that care about not breaking other people's code.
It's also worth noting that it's not your job as a developer to make sure your application can be installed anywhere; it's the packager's job to make sure your app can be installed in their distribution.
And if your users want to use pip (which is kind of the Python equivalent of wget + ./configure + make install) instead of apt/yum/... to get the very latest version of your software, then they should be able to figure out how to fix those issues.
1. 'only use good libraries'
2. 'it's not your job as a developer to make sure your application can be installed'
3. 'if your users want to use pip... they should be able to fix those issues'
However, this isn't a solution to the problem that led to the existence of language ecosystems. It is a refusal to acknowledge the problem.
That's the point of the link to Hyrum's law. The article argues that the practice of pinning encourages that attitude: consumers feel free to depend on internal implementation details, producers feel free to change behaviour arbitrarily, and no-one takes responsibility for specifying and maintaining a stable interface. Breaking that knot is how you actually fix this: producers need to specify which parts are stable interfaces and which are not, consumers need to respect that and not depend on implementation details, and then you can actually use semver, because it's clear what's a breaking change and what isn't.
Picking a suitable dependency specifier depends heavily on the maturity of the library you’re using and if you need any specific features added or removed in a specific release.
Saying your library depends on “spacy==2.3.5” is a lie that means any other library depending on spacy>=2.3.6 can’t be used alongside it, even though your code will realistically work fine with any spacy 2.x release.
I'm not saying we pin our dependencies to exact specific versions, but we absolutely do set an upper bound, usually to the minor version.
OK. That's more sensible, but "pinning" implies == to a specific version. If you know a library does semantic versioning and only breaks its API in major releases, then ~= is fine. Just not ==.
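For reference, PEP 440's compatible-release operator `~=` can be approximated like this. A rough sketch, assuming plain numeric `X.Y.Z` versions with at least as many components as the spec (real PEP 440 versions have more cases):

```python
def compatible(version, spec):
    """Rough check that `version` satisfies `~= spec` (PEP 440 style):
    ~=2.3.5 means >=2.3.5, ==2.3.*  and  ~=2.3 means >=2.3, ==2.*"""
    v = [int(p) for p in version.split(".")]
    s = [int(p) for p in spec.split(".")]
    # Every component of the spec except the last must match exactly...
    if v[:len(s) - 1] != s[:-1]:
        return False
    # ...and the last specified component may only stay the same or grow.
    return v[len(s) - 1] >= s[-1]

print(compatible("2.3.7", "2.3.5"))  # patch bump: accepted
print(compatible("2.4.0", "2.3.5"))  # minor bump: rejected
```

Note how the spec's precision controls what may change: `~=2.3.5` allows patch bumps only, while the shorter `~=2.3` also allows minor bumps.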
Libraries should absolutely not pin their dependencies. Applications should if you care about reproducible builds (not necessarily byte-for-byte, but "can build today == can build tomorrow").
Installing both libraries and applications in the same way in the same environment is a fundamental mismatch that pip encourages, and yes - it leads to fragile binaries.
The exact same mechanisms work fine with other programming languages, and (more importantly, probably) different developer communities.
In fairness, Python’s lack of static types does make things worse than the situation for compiled languages. (Though that’s a general argument against writing non-throwaway code in Python.)
People claim node does better, even though JS is also missing static types, so presumably they solved this issue somehow (testing, maybe?). I don’t use it, so I have no idea.
No, this is not true, for the simple reason that there will _always_ be unpinned dependencies (e.g. your compiler, your hardware, your processor), and thus _those_ are the ones that will guarantee bit-rot.
Pinning a dependency only _guarantees you rot the same or even faster_ because now it's less likely that you can use an updated version of the dependency that supports more recent hardware.
Compilers of languages like C, C++, Rust, Go, etc. go above and beyond to maintain backwards compatibility. It is extremely likely that you will still be able to compile old code with a modern compiler.
> your processor
Hardware is common enough that people go out of their way to make backwards compatibility shims. Things like rosetta, qemu, all the various emulators for various old gaming systems, etc.
> your hardware
Apart from your CPU (see above), your hardware goes through abstraction layers designed to maintain long-term backwards compatibility. Things like OpenGL, Vulkan, Metal, etc. The abstraction layers are in widespread enough use that as older ones become outdated, people start implementing them on top of the newer layers. E.g. here is OpenGL on top of Vulkan: https://www.collabora.com/news-and-blog/blog/2018/10/31/intr...
> [Your kernel]
Ok, you didn't say this part, but it's the other big unpinned dependency. And it too goes above and beyond to maintain backwards compatibility. In fact Linus has a good rant on nearly this exact topic that I'd recommend watching: https://www.youtube.com/watch?v=5PmHRSeA2c8&t=298s
> Pinning a dependency only _guarantees you rot the same or even faster_ because now it's less likely that you can use an updated version of the dependency that supports more recent hardware.
Dependencies are far more likely to rot because they change in incompatible ways than the underlying hardware does, even before considering emulators. It's hard to take this suggestion seriously at all.
Yes, that is true. It is also very likely that you can more easily go back to a previous version of a dependency than you can go back to a previous hardware. The argument is that, therefore, pinning can only speed up your rotting.
If you don't statically link your dependencies, and due to an upgrade something breaks, you can always go back to the previous version. If you statically link, and the hardware, compiler, processor, operating system, whatever causes your software to break, now you can't update the dependency that is causing the breakage. And it is likely that your issue is within that dependency.
Pinning can only make you rot faster.
Pinning dependencies absolutely and unquestionably works better, and for longer, than dynamic linking, for this use case.
Pinned or not, if a software update breaks things, you can always just revert to a previous version of your dependencies. This applies to a myriad of software problems, including a dependency changing its interface.
However, when pinning, when one of your static dependencies is broken due to a change outside your control (e.g. hardware, operating system, security issue making it unusable, or something else), the user's only recourse is to call the developer to fix the software.
I am not claiming that one happens more frequently than the other, nor that hardware changes cannot break the main software itself, which would often nullify the point. All these issues can happen with either static or dynamic linking. However, dynamic linking has at least one extra advantage that static linking cannot have, and the opposite is not true.
> have you ever worked as an application developer? Responsible for getting working artifacts to users as a means to an end?
Look, ironically I find that all of this crap discussion exists because of a newer generation of "application developers" who do not yet know what it means to "deliver working artifacts to users". Imagine my answer to that question.
In practice, this happens so infrequently it can be ignored as a risk. (When it does happen, users generally don't expect the software to continue to work.)
> dynamic linking has at least one extra advantage...
You don't seem to be acknowledging the downside risk to dynamic linking which motivates the discussion in the first place. An update to a dynamically linked dependency which breaks my delivered artifact is an extremely common event in practice.
Well, I disagree there. Security issues or external protocol changes (e.g. TLSv1.2 to TLSv1.3) are rather frequent, not to mention that the customer usually wants to upgrade their machines (the old ones broke) and the existing operating system no longer supports the new hardware.
> An update to a dynamically linked dependency which breaks my delivered artifact is an extremely common event in practice.
Again, I agree. A "surreptitious" dependency update breaking the software is much more common. However, I have already acknowledged that _twice_, and the point I'm making is that it doesn't matter whether you are pinning dependencies or not: the customer CAN FIX these issues without help from the developer. They just have to roll back the update!
On the other hand, the customer CAN'T fix the first issue (e.g. new hardware).
If you're not operating in the large ecosystem then fine. But if your project is on e.g. pypi, then there is an issue.
(edit: Note, yes I know the virtualenvs exist, docker exists, etc. but those are space and complexity trade-offs made as a workaround for bad development practices)
Docker with sha256 tags fixes that issue (and docker container even specify a processor architecture).
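For example, a Dockerfile along these lines (the digest is a placeholder, not a real image hash):

```dockerfile
# Pin the base image by content digest rather than by a mutable tag.
FROM python:3.9-slim@sha256:<digest-of-the-image-you-actually-tested>

COPY requirements.txt .
# requirements.txt is a full `pip freeze` lockfile, so the build stays
# repeatable for as long as the pinned wheels remain on PyPI.
RUN pip install --no-cache-dir -r requirements.txt
```

A tag like `python:3.9-slim` can be re-pointed at a new image at any time; a digest cannot.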
I can see some instances in which this expectation is important, and others where it is likely not or else certainly less important than the security implications.
At one extreme, research using spaCy has a very strong interest in reproducibility, and the impact of any security issues would likely be minimal on the whole, simply due to the relatively few people likely to run into them.
On the other extreme, say some low-level dependency is somehow so compromised that simply running the code gets the user ransomware'd, after a delay just long enough that this whole scenario is marginally plausible. Then say spaCy gets incorporated into some other project that goes up the chain a ways and ultimately ends up in LibreOffice. If all of these projects have pinned dependencies, there is now no way to quickly or reasonably create a safe LibreOffice update. It would require a rather large number of people to sequentially update their dependencies and publish a new version, so that the next project up the chain can do the same. LibreOffice would remain compromised, or at best unavailable, until the whole chain finished, or else somebody found a way to remove the offending dependency without breaking LibreOffice.
I'm not sure how to best reconcile these two competing interests. I think it seems clear that both are important. Even more than that, a particular library might sit on both extremes simultaneously depending on how it is used.
The only solution - though a totally unrealistic and terrible one - that comes to mind is to write all code such that all dependencies can be removed without additional work and all dependent features would be automatically disabled. With a standardized listing of these feature-dependency pairs you could even develop more fine-grained workarounds for removal of any feature from any dependency.
The sheer scale of possible configurations this would create is utterly horrifying.
At any rate, your utter rejection of the article's point seems excessively extreme and even ultimately user-hostile. I can understand your point of view, particularly given the library you develop, however I think you should probably give some more thought to indirect users - ie users of programs that (perhaps ultimately) use spaCy. I don't know that it makes sense to practically change how you do anything, but I don't think the other viewpoint is as utterly wrongheaded as you seem to think.
What would help a lot is if the requirements were specified outside of the actual artifact, as metadata. Then the requirements metadata could be updated separately.
There's a very simple solution here: just don't write bugs.