> At this point you could even get rid of the version altogether and just use the commit hash on the rare occasions when we need a version identifier. We're all on the internet, and we're all constantly running npm install or equivalent. Just say "Leftpad-17", it's cleaner.
> And that's it. Package managers should provide no options for version pinning.
> A package manager that followed such a proposal would foster an eco-system with greater care for introducing incompatibility. Packages that wantonly broke their consumers would "gain a reputation" and get out-competed by packages that didn't, rather than gaining a "temporary" pinning that serves only to perpetuate them.
Pinning is absolutely crucial when building software from some sources out of your control, especially with larger teams or larger software. You cannot rely on the responsibility of others, be they 100s of upstream library authors, or 100s of other peer developers on your product. One among those people may cause a dependency to break your build, either by releasing a bad version, or foolishly adding a dependency without carefully checking its “reputation” for frequent breaking changes.
In either case, without pinning, your build is non-deterministic and sometimes starts failing its tests without any diff. You can’t bisect that. Your only remediation is manual debugging and analysis. Work stops for N engineers because everyone’s branches are broken.
I don’t think any level of community fostering is worth that kind of risk.
Does the author honestly fail to realize that commit hashes are incomprehensible to most humans and look like noise, and that it takes extraordinary mental effort to compare them, while the difference between 1.2.3 and 1.4.5 is instantly apparent to any human?
> Pinning is absolutely crucial when building software from some sources out of your control
Amen. I am surprised that the author does not recognize it.
In fact, the only idea that does not come as outright false immediately after reading is "if you change behavior, rename". But thinking about it for a bit, it is wrong too. First, naming is hard. Finding a good recognizable name for a package is hard enough; if each BC break required a new one, we'd drown in names. Second, behavior breaks are not total. If RoR or ElasticSearch releases a new version with a BC break, they do not stop being essentially the same package just with somewhat different behavior. Most of the knowledge you had about it is still relevant. Some pieces are broken, but not the whole concept. A new name requires throwing out the whole concept and building a new one, essentially. It is not good for incremental gradual change.
Worse is that you can't just sort on hashes alphabetically and recover the correct version order. That alone is reason enough to use some kind of sortable versioning scheme.
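To make the sorting point concrete, here is a minimal sketch. The release list is made up, and the "hashes" are just SHA-1 digests of the version strings standing in for real commit hashes. It also shows that dotted versions only sort correctly once parsed into integers; a plain string sort gets 1.10.0 wrong.

```python
import hashlib


def parse_version(text):
    """Turn 'x.y.z' into a tuple of ints so comparison is numeric."""
    return tuple(int(part) for part in text.split("."))


# Releases in the order they were published (made-up data).
releases = ["1.2.3", "1.4.5", "1.10.0", "2.0.0"]

# Stand-in "commit hashes": SHA-1 of the version string, purely illustrative.
hashes = {v: hashlib.sha1(v.encode()).hexdigest()[:7] for v in releases}

# Sorting on parsed tuples recovers the release order.
by_version = sorted(releases, key=parse_version)

# Sorting on the raw strings does not: "1.10.0" sorts before "1.2.3".
by_string = sorted(releases)

# Sorting on hashes gives an order that carries no release information.
by_hash = sorted(releases, key=lambda v: hashes[v])

print(by_version)
print(by_string)
print([(hashes[v], v) for v in by_hash])
```

So a hash can identify a release, but you need some extra, ordered piece of data to say which release came first.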
The author's point is that the difference between 1.2.3 and 1.4.5 can be anything from a bugfix to the software going from self-hosted photo software to now it's driving a car. The assumption that users have the same perception of a major and minor version as the developer responsible for the versioning, is assuming that all humans understand versioning 100% identically.
I would think the best approach is to "trust but verify" ANY update to a dependency. A dependency might save you some time but it's not a free pass to be irresponsible. There's no such thing as FOSS.
In fact that's the entire reason why "semantic versioning" doesn't really work:
* you can never be certain that the maintainer actually follows it
* one user's breaking change is one maintainer's bugfix
* while some packaging systems attempt to automate and enforce it (e.g. Elm's) even that only goes so far due to type system limitations (e.g. changing acceptable values from [0, 4] to [0, 4[ as a side-effect of some change deep in the bowels of the library's logic is pretty much guaranteed not to surface if you don't have a dependent type system, and may not do so even then)
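A small sketch of that last bullet, using a hypothetical `set_level` function in two hypothetical releases. The signature is identical in both, so any check based only on names and types sees no change, yet a caller passing 4 breaks.

```python
import inspect


def set_level_v1(level: int) -> int:
    """Hypothetical release 1.0.0: accepts the closed range [0, 4]."""
    if not 0 <= level <= 4:
        raise ValueError(f"level out of range: {level}")
    return level


def set_level_v2(level: int) -> int:
    """Hypothetical release 1.0.1: a 'bugfix' that now accepts only [0, 4[."""
    if not 0 <= level < 4:
        raise ValueError(f"level out of range: {level}")
    return level


# A purely type-level comparison reports the two releases as identical.
same_signature = inspect.signature(set_level_v1) == inspect.signature(set_level_v2)
print("signatures identical:", same_signature)

# A consumer that relied on 4 being valid only finds out at runtime.
print(set_level_v1(4))
try:
    set_level_v2(4)
except ValueError as err:
    print("broken by the patch release:", err)
```

Without something like dependent types (or tests that happen to cover the boundary), nothing in the tooling flags this as breaking.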
I write a bunch of Rust. The Cargo package manager has several layers of "defense":
1. The "Cargo.lock" contains exact versions of all transitive dependencies. (And Cargo doesn't allow overwriting a version once published, and lock files even override normal package "yanks", so these version numbers are very stable identifiers.)
2. The "Cargo.toml" file contains manually-specified semver information. I can run "cargo update" to get the latest, semver-compatible versions of all my dependencies.
3. In the rare case that somebody messes up semver, I can just blacklist certain versions in "Cargo.toml", or add more specific constraints. I think I've done this maybe two or three times in thousands of dependency updates.
4. My own projects have unit tests, and of course, Rust does extensive compile-time checks. So I'm likely to catch any breakage anyway.
So the absolute worst-case scenario here is that somebody messes up (which is very rare). At this point, I can just file a bug upstream, lock a specific version down in "Cargo.toml", and wait for things to get sorted out.
I have zero need to "be certain". I'm quite happy with "It works nicely 99.9% of time, and it has an easy manual fallback when the inevitable problems occur."
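For anyone who hasn't used a setup like this, here is a rough model of the "update to the latest compatible version, minus the ones I've excluded" step. This is not Cargo's actual resolver, just a sketch under simplifying assumptions: versions are plain x.y.z with major >= 1, and `resolve` and `excluded` are names I made up.

```python
def parse(version):
    """Parse 'x.y.z' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def resolve(requirement, available, excluded=()):
    """Pick the highest available version that has the same major number as
    `requirement`, is not older than it, and is not explicitly excluded."""
    base = parse(requirement)
    candidates = [
        v for v in available
        if parse(v)[0] == base[0] and parse(v) >= base and v not in excluded
    ]
    if not candidates:
        raise LookupError(f"no version satisfies {requirement}")
    return max(candidates, key=parse)


available = ["1.2.0", "1.2.1", "1.3.0", "2.0.0"]

# Normal update: take the newest release within the same major version.
print(resolve("1.2.0", available))

# 1.3.0 turned out to break semver: exclude it and fall back automatically.
print(resolve("1.2.0", available, excluded={"1.3.0"}))
```

The result of a step like this is what gets written to the lock file, so nothing moves again until the next explicit update.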
- for a patch version, all exported (name, type) pairs of A.B.C are identical to those of A.B.(C-1)
- for a minor version, enforce first rule, but allow new types for new names.
I think the Elm package manager does this, based on other comments here in this thread.
Either way, that sort of automatic “exports” verification could be a base layer in a semver verification package manager, with integration tests layered on top.
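The two rules above are simple enough to sketch. This assumes each release's exports can be dumped as a mapping from name to a type string; `classify_bump` and the example exports are hypothetical, and this is not a description of how Elm's tool is implemented.

```python
def classify_bump(old_api, new_api):
    """Return the smallest version bump the export diff allows.

    old_api / new_api map each exported name to its type, written as a string.
    """
    # Any existing (name, type) pair that disappeared or changed is breaking.
    changed_or_removed = [
        name for name, type_ in old_api.items() if new_api.get(name) != type_
    ]
    # New names are additive and only need a minor bump.
    added = [name for name in new_api if name not in old_api]

    if changed_or_removed:
        return "major"
    if added:
        return "minor"
    return "patch"


v1 = {"pad": "(str, int) -> str"}
v1_fix = {"pad": "(str, int) -> str"}
v1_feature = {"pad": "(str, int) -> str", "trim": "(str) -> str"}
v2 = {"pad": "(str, int, str) -> str"}

print(classify_bump(v1, v1_fix))
print(classify_bump(v1, v1_feature))
print(classify_bump(v1, v2))
```

This only gives a lower bound on the bump: a behavior change behind an unchanged type still passes as "patch", which is where the integration tests would have to take over.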
Sometimes, this can get very gnarly. For example, C# has overloading, and it also has type inference for lambdas; and it tries to make these two work seamlessly. So, consider something like this:
int F(Func<int, int> g) => ...
string F(Func<string, string> g) => ...
F(x => x.ToUpper());
Now, when it tries Func<int, int>, it doesn't work, because if x is int, it doesn't have a method ToUpper, so the body of the lambda is ill-formed. On the other hand, given Func<string, string>, x is string, ToUpper() is well-formed, and returns a string, matching the function type. So that's the overload that gets chosen.
Now consider what happens if someone adds a method called ToUpper to int: now the code that was compiling before, and that was calling ToUpper on string, will suddenly fail because of ambiguity.
In more complicated scenarios (e.g. involving inheritance), it is even possible to have code that silently changes its meaning in such circumstances, e.g. when overloading methods across several classes in a hierarchy. It'd be rather convoluted code, but not out of the realm of possibility.
You also can never be certain that the car is not going to run through pedestrian crossing on red light when you step in - but it doesn't mean that traffic lights are useless.
Semver is like traffic lights. For reproducible builds/increased certainty harden it with some kind of pinning or local copy.
That's a bad argument. You can also never be certain any third-party code works at all, or did not change without changing versions, or does not have subtle bugs that surface only in your particular use case, etc. If you dismissed the whole concept on basis that somebody could fail to follow it you wouldn't be able to use any third-party dependencies at all. At which point the question of versioning is kinda moot.
> one user's breaking change is one maintainer's bugfix
This is a "grey zone" fallacy. From the fact that there might be disagreements on the margins, it does not follow that there is not a huge majority of clear-cut cases where everybody agrees that something is a huge or tiny change, and in this huge majority of cases it is useful. Even if there's no definition that would work in absolute 100% of cases, it works just fine in 99% of them.
Perhaps in the philosophical sense that you can never be certain of anything, this is true. But the only place that thinking leads is to staying under the covers all day.
You actually can achieve a high degree of certainty that third-party code works to some reasonable standard by testing it. Which is the whole point - once you have tested a version, and are content that it works, you wouldn't want to blindly switch to another version without again performing a similar set of tests.
Elm enforces semantic versioning with its type system, and one can also print the diff of values/functions between separate package versions. In principle, this is doable.
Effectively impossible, with a Turing complete programming environment.
(“Effectively,” because technically our physical computers have a bounded number of states and inputs, thus aren’t technically Turing complete, thus Rice’s theorem technically doesn’t apply.)
The same happens for the proposed model ¯\_(ツ)_/¯
The proposed author claims that "You cannot rely on the responsibility of others" but their scheme fails in the exact same way.
Semantic versioning and automatic "transitive" dependencies sure are handy, especially in that "naive" phase of development. But man, once you start having to start tracking performance and security, it's time to turn off any magic update.
Or, in other words, when you've got people depending on you, you need to start putting your security hat on and assume that there can be a breach with every change, or your performance hat and assume the system will crash due to a performance problem. No way would I want any rebuild OF THE SAME SOURCE subject to change.
Software versioning is a case of the XY Problem. We don't know what we want, but we know something that might help us get it (and we are wrong).
We don't care if you add behavior to your library (unless it fulfills a need we had or didn't know we had). What we really care about is if you take behavior away. And in some cases it's not even all the behavior. If I'm only using 20% of your library and you haven't changed it, I can keep upgrading whenever.
What semantic versioning is trying and failing to achieve is to make libraries adhere to the Liskov Substitution Principle (LSP). If you had a library that had sufficient tests, then any change you make that causes you to add tests but not change an existing test means that it should satisfy the LSP (tests that correct ambiguities in the spec notwithstanding). People can feel pretty safe upgrading if all of the tests they depend on are intact.
The difficult part of this 'solution' is that reusable code would have to have pretty exhaustive test coverage. I think we are already bifurcating into a world of library authors and library consumers, so I'm okay with this. We should feel comfortable demanding more than what we currently get from our library writers. In this world, what you version is your test harness, and not your code.
The problem isn't with versioning, it's with non-deterministic builds. Once you have a large enough project that you can no longer keep manual track of version changes, you build tooling which checks for the latest version and opens a pull request (and subsequent CI build on the pre-merge commit) with the latest version. This keeps your project up to date and prevents the woes of non-deterministic building.
IMHO it's a bad solution to a real problem.
Too often, I've seen software with out-of-date libraries or frameworks, with versions set at the start of a project and never touched since. In some cases it even led to security flaws not being patched.
Another side effect is that hard pinning complete versions makes for somewhat hard packaging (if you try to follow some distro guidelines (like the Debian ones), package libraries separately and not bundle everything together).
In some extreme cases, avoiding the small and incremental breaks & fixes in API can lead to code bases so far from the modern APIs of their dependencies that throwing away the code and re-implementing from scratch is actually a better solution.
I feel that what is missing in most languages is a way to properly maintain APIs (like tools native to the language that help ensure no breaks in the API contract), and languages that have concepts of API versioning and API negotiation (kind of like REST API or protocol version negotiation) native to them.
If this was also combined with a backward compatibility policy of "at least maintain version N + version (N-1)" and clear deprecation notifications, it would make for smoother updates and make most systems more reliable.
I would love to see a language with these items as core features.
But we are far from this situation. What I generally do instead is carefully choose my dependencies from projects that have a good track record of not breaking their APIs too often while being correctly maintained, and I also try to keep the number of dependencies in my projects to a minimum. I almost never hard pin dependency versions.
It's no wonder that the MIL-STD-498 has a large focus on interfaces (Interface Requirements Specification (IRS) and Interface Design Description (IDD) specification documents). It's actually one of the harder parts to get right when designing complex systems (including complex pieces of software).
But yes, you should either commit deps, or have your own repo/caching-proxy which will neither change nor drop old versions.
And suddenly, the auto-scaling fails in production because something has updated to include a new bug.
I didn't mean that semantic versioning is bad, I meant that specifying your dependencies in terms of semantic versions is bad.
The author is confusing the (sometimes bad) behavior of package managers with version numbers.
Version numbers signal the intent of the release, but don’t guarantee anything. Any change in a dependency might break your project so no update is safe, regardless of what the version number says. That’s why people pin version (or, at least should be why.) It’s not a problem of version numbers at all.
That doesn’t make version numbers useless, though. Understanding the intent of an update is important info. It lets you estimate the cost/risk of taking an update.
So the author suggests that changing versioning will somehow improve things, when what we actually want is for package managers to be conservative by default.
He actually suggests that the latest version with the same major version be taken. That would work as long as... (1) none of your dependencies ever introduces bugs or accidental compatibility changes in a non-major update; and (2) all contributors update dependencies in lockstep (since any new code contribution could be dependent on a feature or behavior of a newer version of a dependency). In other words, that doesn’t work at all.
What we want is for package managers to lock the dependencies by default and for package repositories to require a version update when committing a dependency update. In fact, better to take the version number out of it: lock dependencies to the commit hash and leave version numbers for what they are good for, to communicate intent.
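As a sketch of what "lock to the hash, keep the version for humans" could look like: the lock entry below stores both, but only the hash is checked. I'm using a SHA-256 of the artifact bytes as a stand-in for a commit hash, and the package name, lock layout and `verify` helper are all made up.

```python
import hashlib


def digest(data: bytes) -> str:
    """Content hash of an artifact (stand-in for a commit hash)."""
    return hashlib.sha256(data).hexdigest()


def verify(name, data, lock):
    """Return True only if the fetched bytes match the locked hash.

    The version string in the lock entry is informational and never checked.
    """
    entry = lock.get(name)
    if entry is None:
        raise KeyError(f"{name} is not locked")
    return digest(data) == entry["hash"]


# Bytes we reviewed and tested against when the lock was written.
reviewed = b"def leftpad(s, n): return s.rjust(n)\n"
lock = {"leftpad": {"version": "1.3.0", "hash": digest(reviewed)}}

# Same bytes later: the build proceeds.
print(verify("leftpad", reviewed, lock))

# Someone republished different code under the same version: rejected.
tampered = b"def leftpad(s, n): return s\n"
print(verify("leftpad", tampered, lock))
```

The version number still tells a reviewer what kind of update a lock change is supposed to be; the hash is what makes the build deterministic.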
But the premise that a major version conveys no more information than a full rename is simply wrong. Is libfoo n + 1 more similar to libfoo n, or libbaz m? In extremely rare cases, it will be more similar to libbaz m than libfoo n, but in 99% of cases, that's not true. Even if that is not perfectly reliable, that probability conveys information.
P.S. I'm not commenting on the other ideas raised by the article.
"Well, we need to upgrade the Epic Flatuence server to Smelly Gangrene in order to stay compatible with whatever they're calling the next version of Wild Dingo --"
"The Russian version?"
"No, by a guy out of Vancouver."
"Right. Next, let's talk about Your Mother ..."
Rather he's talking (in a rather confusing way) about changing the way we convert names-and-version-numbers into identifiers used in package management systems.
The convention he suggests is basically what Debian has always done for C libraries (putting the soname in the package name), so it's certainly a plausible way to manage things.
It completely breaks upgrades. Newer versions aren't compatible with older versions, yet there's nothing telling apt that it should upgrade other stuff depending on the older version first. Also, who is to say whether foo5.3 is newer or older than foo5, and since foo5 depends on bar3.2, while foo5.3 depends on bar3.2a, what package do you have to upgrade first?
I understand where you're coming from: a complete project name change is something very different. I also think this is something that's commonly misunderstood.
From the article:
> conveys no more actionable information
The key here is actionable. What will change in what you as someone who consumes this third-party code will have to do if you adopt a major version bump versus a new library? Unless the new version is API compatible with the old one, you're likely going to have to do some major work in incorporating the new version; if the API is backwards compatible, it's not really a major version bump. The further a different library is from the one you're currently using is of course going to require a different level of rework, but that's part of what a major version bump signifies. I see what you're getting at with respect to some arbitrary new library, but if the new major version (with a new name) is from the same maintainer, Rails2 vs Rails v2 isn't so different from what you're going to need to do in your code.
As for which name you change, I think that's where the rubber meets the road. What's the name the code is referred to in your application? That's the one that really matters, and where a name change is warranted. If the language affords module namespacing, the new library is just a new require/import. Major versions aren't drop-in replacements. If the libraries are namespaced, you can conceivably use them side-by-side in the same codebase as you migrate. This is potentially very powerful. As a practical example, in one Java library I've been working on that has a long history, it's able to use various versions of JUnit in the same testing suite exactly because the different versions are namespaced.
Another example at a different level is being able to install more than one version of, say, python or sqlite, on the same system. Some packages name the binaries with a suffix, e.g., python27 or python3, so you can include and have both of them on your path without conflict. When you install current versions of sqlite, the binary is explicitly sqlite3 so as to not be confused with older, incompatible versions.
A new major revision of a library means that there are some non-backward compatible changes, and in many cases this means a limited number of fixes have to be done. Also in general it is expected that one will want to upgrade, so probably those changes are documented. Also hopefully the two libraries will share the same main concepts so using the new version will not be terribly different from the old.
Of course it is not always the case, see Angular, where two unrelated frameworks share the same name (I really think they should have changed the name).
I agree. As I mentioned in a comment to your sibling, I think there's a bit of a disconnect as to what people are thinking of renaming. With respect to what's changed, the major version number doesn't give you any additional information as to what has changed. You need to refer, as you mentioned, to the documentation, or the results of your own testing. That's what I'm getting at when I point out the operable word actionable.
I completely agree that the two libraries, if understood to be the same project, should likely share the same core concepts: after all, that's likely what attracted people to the library in the first place, and that likely only a limited number of fixes have to be done.
Angular might be the one exception I know of to this pattern.
There are at least three levels where the names matter: the project (e.g., Rails), the artifact (e.g., a particular gem), and the module(s) (e.g., what gets referenced by require). They're each important in their own way. I think renaming the Rails project instead of a major version bump is not what is typically the intent when talking about renaming.
Artifacts are a way of referencing a collection of files, and often there's some way of embedding metadata about the artifact in the name of it, though that's accidental, and naming these artifacts is independent (or at least not necessarily dependent) of the collection it represents: the same file can be found in any number of artifact builds. I think the practical implication is that you're renaming what is being required and referenced in the code itself.
So, you'd still have the same understanding that a migration from Rails2 to Rails5 would be bigger than one from Rails4 to Rails5, and that would be a whole different kettle of fish than moving to Django (or even Sinatra). I think we may be talking past each other as to the level where renaming (or really, new, additional naming) would take place.
Often they'll be developed by separate people who happened to inherit the name of the original, or "take up the reins" after the original became abandonware—but, either way, decided to just write a new one themselves rather than continuing on with the old. At that point, there is really no difference between calling it "libfoo (n + 1).0" and calling it "libbaz 1.0".
If you want to convey ancestry, you could add another preceding number, and that's often what people do—in the package name. For example, sqlite3 vs. sqlite4. If you wanted to make this "part of" the version tuple, you could: just have a super-major number before the major number.
But I find claims that he didn't write it: http://grammar.ccc.commnet.edu/grammar/twain.htm
That comes due to the confusion of the old thorn character Þ or þ which got all mangled and misread.
But it was an incorrect substitution then, and still would be now.
In the most common scripts at the time (cursiva anglicana and later bastarda anglicana), <y> and <þ> were already very similar (the only difference being which of the two strokes had the descender—both letters have a closed bowl in these scripts), so similar that it was common in bookhands to write a small dot above <y> to aid the reader even when <þ> was still visually distinct. You can see this clearly in , written c. 1399. Use of either character to represent <th> was already becoming quite rare in the late Middle English period, and remained in use almost exclusively as a scribal shorthand; it's not uncommon to encounter manuscripts in which <þ> and <th> are mixed freely.
I can't say for certain why it is that <y> started to be substituted for <þ> in manuscripts. Obviously in the common hands of the time they're visually very similar, yet I'd imagine anyone who'd been educated as a scribe would be able to spot and write the difference. Perhaps it was a stylistic choice. But it did happen: you can see an example at . This document was written c. 1445. Gutenberg's printing press is thought to have been invented at this time, but it wouldn't arrive in England until Caxton set one up c. 1476. It seems very unlikely to me that the printing press had anything to do with the initial substitution, though I'd imagine that the press was a major nail in thorn's coffin.
(both of these documents are written in a bastarda anglicana)
That said, it’s rarely that black & white so it’s quite possible it’s a combination of everything.
I spent some time poring over facsimiles of early printed English books. Caxton is credited with printing the first book in English, actually a couple years before he returned to England, while he was still in Flanders. In that book, it looks to me like there's two glyphs, one which resembles <þ> and one which resembles <y>, but they're used completely interchangeably as if they're the same character. However, pages printed on an actual press usually wound up smudging a bit when the page was peeled off of the block, so the differences I perceived could very well just be artifacts of the printing process rather than intentionally distinct glyphs.
That said, early books that were printed in England do have a noticeably distinct <þ> character, but its use is limited to the common scribal abbreviations, which themselves are quite rare in this material. I couldn't pinpoint any precise date for <þ> -> <y>, but after around 1560, those old scribal abbreviations seem to disappear almost entirely (I only found one document after that which had any such abbreviations, from around 1680(!), and this document unmistakably uses <y>). It's always apparent whether <y> is intended in its modern sense or as a substitute for <þ> because the abbreviations are always indicated with a superscript.
I wonder if abbreviations in certain types of work were viewed as un-ideal. They certainly seem less common (but still present) in longer works and in works which seem to be a bit more formal. In short works, especially where space was at a premium (broadsides, chapbooks, newsletters, etc.), they are more common, which is understandable. This also seems to be a continuation of a trend that started in manuscripts, where fancier and more elaborately-illuminated texts started employing abbreviations less often. But at the moment I haven't the foggiest why they appeared at all in longer works—especially when print comes into play—since their usage is rare and doesn't follow any obvious pattern—perhaps the typesetter was just running out of <t>'s!
The picture is pretty muddy, and further complicated by the timing of the press's appearance because we can't really know if the development of <þ> -> <y> would've happened anyway, or if it would've simply remained a quirk of certain scribes.
Anyway, when I woke up this morning I had no idea that this was how I was going to spend my day, and it was fun to dig through all this stuff :)
Slightly off-topic: while I was busy skimming facsimiles for abbreviations, I came across this gem, which I find both interesting and... "irrationally amusing" (I don't know how else to describe it, but if I weren't so robotic I might've been giggling like a child):
Bugs can creep into software even with the best of intentions, and it is useful for consumers of packages to have a choice whether to upgrade immediately or maybe wait until they have done some risk analysis. Also, the ability to roll back is invaluable.
I don't think forcing everyone to upgrade is going to force package maintainers to make zero mistakes.
Also the package-3, package-4 thing seems odd to me. I use Nuget and have used NPM / Gem / Haskell Stack on spare time projects. I would find it annoying to use this scheme because you have to hunt around searching in a different way to look for major upgrades compared to what you do for minor ones.
But most of all: I want my build to be reproducible, going back into history. Having the pinned version numbers committed to source control ensures this (As you keep an onsite cache of the packages, or you trust the package repo to stay available and for published versions to be immutable).
Specifying a version does allow you to know which artifact to include, but that's at the artifact level, which often is conflated with the particular library that artifact provides. A build number or other included in the artifact gives you the same selection properties, yet is independent of the code the artifact actually provides. Currently code and artifact are labelled similarly, but that's more a purposeful accident rather than a guarantee or deterministic property.
As for the package-3, package-4 thing you mention, I agree, it's kind of a mess. I think that's more a result of this lack of separation between the code and the artifact (if I'm understanding you correctly), and module (what is referenced in the code itself) namespacing and aliasing. I don't think it's a solved problem.
I grammerred badly, but this is what I meant when I said:
"As you keep an onsite cache of the packages"
In Nuget land this is quite easy but then you have some housekeeping to ensure the copy of the packages is backed up. This could be done with a separate source repo if you want to use that hammer.
Either way, a major upgrade would mean a significant change in the way you use the API in your project.
Assuming no accidental breaks of backwards compatibility in minor versions, this should work.
Package managers like npm5/yarn and Cargo have lock files that pin exact versions by default, for all dependencies, recursively. If you keep that file (commit it), your project will be immune to unexpected breaking updates.
Semver-based package managers also don't just naively always upgrade to the very latest version like the article suggests. The default behavior is to limit upgrades to semver-compatible versions. There are tools like greenkeeper which use your project's unit tests to try out packages before upgrading, adding an extra level of assurance.
While none of this is a perfect guarantee, it's IMHO working well enough.
I'm curious what you think of the two posts I was basing mine on:
"Spec-ulation" by Rich Hickey: https://www.youtube.com/watch?v=oyLBGkS5ICk
"Volatile Software" by Steve Losh: http://stevelosh.com/blog/2012/04/volatile-software
I think there may partly be a "universes colliding" effect here, and partly just the future being non-uniformly distributed.
E.g. the tilde (~) and caret (^) operators in npm allow you to specify version constraints on dependencies, that allow them to be resolved/pinned to higher minor/patch versions of that package, but not to a higher major version since that will by the Semver definition contain breaking changes and those might impact your project.
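Here is a simplified model of those two operators as I understand npm's documented rules. It only handles full x.y.z versions (no pre-release tags, no partial versions like ~1.2), and `upper_bound` and `satisfies` are my own helper names, not part of any npm API.

```python
def parse(version):
    """Parse 'x.y.z' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def upper_bound(spec):
    """Exclusive upper bound for a '~x.y.z' or '^x.y.z' range."""
    op = spec[0]
    major, minor, patch = parse(spec[1:])
    if op == "~":
        # Tilde: patch-level changes only.
        return (major, minor + 1, 0)
    if op == "^":
        # Caret: anything that keeps the left-most non-zero number fixed.
        if major > 0:
            return (major + 1, 0, 0)
        if minor > 0:
            return (0, minor + 1, 0)
        return (0, 0, patch + 1)
    raise ValueError(f"unsupported range operator in {spec!r}")


def satisfies(version, spec):
    """True if `version` falls inside the range described by `spec`."""
    return parse(spec[1:]) <= parse(version) < upper_bound(spec)


print(satisfies("1.4.5", "^1.2.3"))  # minor bump is allowed by caret
print(satisfies("2.0.0", "^1.2.3"))  # major bump is not
print(satisfies("1.2.9", "~1.2.3"))  # patch bump is allowed by tilde
print(satisfies("1.3.0", "~1.2.3"))  # minor bump is not
```

Note the special-casing of 0.x in the caret branch: before 1.0.0 a minor bump is treated as potentially breaking.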
Some communities have also started doing the reverse, where as a library you can get all your users' tests to run to check that you have not broken something they were using.
When someone is on an older version of npm or on a different system, (linux / mac os) things get fucky with the lock file. The lock file is changed and they just blindly commit it. So kinks still need to be ironed out with npm, IMO.
He proposes tuples, but doesn’t care for semver. So what would the tuple mean?
As a human race, I’m sure we can do better with versioning semantics. Perhaps I’ll put some more spare personal time into that semantic schema kit I was designing. Ultimately it’s about data types, containment and meaning. And his proposal ignores most of that.
In practice, a string with a required format of "x.y.z" just is a 3-tuple of ints. Restricting the values to ints makes it easy enough to parse.
The example of npm shows some weird extensions like "1.2.3pre" which definitely complicate parsing, but that's not as much a difference between strings and tuples, as a difference between having a lot more options.
m.nn.pp-alpha-rc0 may convey meaning to the author, but it is meaningless to users because the 'alpha' and 'rc0' mean different things in different libraries, based on their release processes.
You have spent more sentences on my tuple idea than I spent introducing then completely emptying it. That suggests that I have failed to convey my larger point to you. I'm going to agree to disagree and move on.
I don't think you are using the word "meaningless" correctly. rc0 may not mean exactly the same thing in different software packages, but it always means the first release candidate. Thus, it is not meaningless - it conveys meaning. And if the user wants, they can look up that software package, and find out exactly what rc0 means, gaining even more meaning from the description. Perhaps you are trying to say that rc0 doesn't convey as much information as quickly as the alternative you are proposing?
A bug is encoded in the behavior of the program. "Fixing" a bug is changing the behavior of the program - breaking it if someone downstream was implicitly or explicitly relying on the bug.
Most people want the bug fixed, and would tag the change as 1.0.1 rather than 2.0.0; but in someone's dependency tree is version 1.0.0 of your project, and upgrading to 1.0.1 will break them, even though semver tells us that x.y.1 is a minor change for most consumers.
Incidentally, this is also why evolving the web platform is so hard. Among the uncountable sites on the internet, even if the current behavior seems totally broken, someone is shipping code that relies on it.
This is mistaken. As I've noted elsewhere in here, semver is a social protocol, not a technical one; in other words, it's wishy-washy meatspace stuff (and yet it's still the best we've got at the moment). As a consequence of being wishy-washy meatspace stuff, it lets any library that uses semver define "breaking change" in any way it wants to. This is literally the first rule of the semver specification (where, for the entirety of the document, the "public API" is how breakage is defined):
"1. Software using Semantic Versioning MUST declare a public API. This API could be declared in the code itself or exist strictly in documentation. However it is done, it should be precise and comprehensive."
The only thing that matters is that the public API is documented. It would be totally semver-compliant to have a line in your readme saying "there is no public API #yolo" and then completely redesign your library between versions 1.0.0 and 1.0.1.
Obviously this is undesirable in many ways, but I have yet to see a solution that does away with the wishy-washy meatspace stuff (at best, I have seen some tools to help package authors determine whether the public API breaks inadvertently, for some definitions of "public API").
If you're relying on undocumented behaviour in your programs, then you inevitably will find no/less value in semver.
And if your program isn't documented, then your code implicitly fills the role of documentation, which does mean that fixing any bugs can maybe be considered a breaking change. Or something like that.
If any program that used the previous behavior is still correct, then it's a non-breaking change. Despite what you say, this is a very common thing. Adding a new function, for example, usually fits here. But that is always verified from the point of view of documented behavior, never by looking at actual behavior.
If people are relying on undocumented behavior, they will have problems upgrading their libraries. They should be expecting it; if they are not, it's their problem, not the library maintainer's. Whether to depend on undocumented behavior is a decision made by the library users at their discretion by weighing their options; it shouldn't impact upstream.
For non-system package managers like npm, always using a lockfile is the best current option IMO, for those package managers that support it.
This is not at all accurate, at least in my work. I use version pinning to be resilient to changes in the greater internet: builds don't use the internet, and instead use internally pinned and vendored sources for third party libraries.
If you rely upon the internet to build and deploy your code, then someone else's outage (like github's) can become your outage, or make yours worse. If you work on a product which has high uptime requirements, this is unacceptable.
Just as important, suppose any one of those dependencies breaks the social contract described in the posted article: they make a backwards-incompatible change, and your code no longer compiles or builds or (much worse!) subtly breaks. Now you're relying very strongly upon the good intentions and competency of all third party maintainers.
Worst of all, suppose any of your dependencies gets compromised and now has a security vulnerability. What do you do?
This is not a nitpick; I think it's central. Because I need network-free, reproducible builds of known-good code, I need to maintain a copy of third-party code somewhere. When I start doing that, I need to keep track of what version (which could be a commit, sure) I'm storing to know when I'm out of date. Then, when my cached copy is out of date, I need to know how safe it is to update to match the remote version: should I expect builds to break, or not?
Semantic versioning is a way for package maintainers to signal their expectations. It's not something you can rely completely on - in these systems I'm describing, you still need automated tests to give you more confidence that upgrading is safe. But semantic versioning improves your confidence which helps avoid wasting work on an upgrade that will certainly fail, and helps identify problems in advance.
It's also helpful to have minor and patch versions so applications can concisely describe which features of third-party dependencies are required to build and run your code. This makes it easier to integrate new applications - you can clearly see whether your cached copy of the third-party code is up-to-date enough.
This discussion gets at the distinction: https://stackoverflow.com/questions/4151495/should-gemfile-l...
The use of left-pad as an example in OP was intended as an acknowledgement of the deployment issues you point out. But I was probably being so subtle that I veered into cuteness :)
 See my example of "exemplary library use" at http://arclanguage.org/item?id=20221
> When I start doing that, I need to keep track of what version (which could be a commit, sure) I'm storing to know when I'm out of date. Then, when my cached copy is out of date, I need to know how safe it is to update to match the remote version: should I expect builds to break, or not?
> Semantic versioning is a way for package maintainers to signal their expectations. It's not something you can rely completely on - in these systems I'm describing, you still need automated tests to give you more confidence that upgrading is safe. But semantic versioning improves your confidence which helps avoid wasting work on an upgrade that will certainly fail, and helps identify problems in advance.
Putting these things in Gemfile.lock does not accurately reflect the workflow where all upgrades happen with humans in the loop, who need to be aware of and initiate any change that happens in your dependencies' code.
I agree that the best approach today is to maintain a local cache of a specific version, and to maintain unit tests to warn you of any breakage. In which case: what's the point of a package manager, again?
As I briefly mentioned above, my current approach is to be extremely conservative in introducing dependencies, inline any dependencies I do introduce, and then treat them as my own code, polishing them alongside my own stuff, deleting code paths I don't need, submitting patches upstream when I discover issues from my hacking. I'm investigating better ways to write automated tests, so that when tests pass we can actually gain confidence that nothing has regressed: http://akkartik.name/about. If we could do this we wouldn't need compatibility at all! But that is blue-sky research; OP is my attempt at meeting the rest of the world half-way. Assume package managers are trying to do something useful. How can we build them to actually do what they advertise?
> Putting these things in Gemfile.lock does not accurately reflect the workflow where all upgrades happen with humans in the loop, who need to be aware of and initiate any change that happens in your dependencies' code.
I agree that upgrades should be pull not push, so I'm not sure what you're disagreeing with here. We need upgrades to be easy to perform so that we'll be more likely to perform them and so that we'll perform them more often, thus keeping our projects up-to-date on the latest vulnerabilities and bugs.
And what if a random person decided to register the package named "Rails-6" ? This system is way too prone to malicious package name squatting.
Sounds like an extremely trivial thing to solve technically.
Make all name-X belong to the same account -- that is, treat "name" as the unique identifier that binds a package name to an account.
So nobody but those that have registered name-1, name-2 etc can register name-66.
That's like 2 lines of code.
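For what it's worth, a rough sketch of that ownership rule in Python. Everything here (the `Registry` class, the account names, the lowercasing of base names) is hypothetical and not how any real registry works.

```python
class Registry:
    """Toy in-memory registry; every name in here is hypothetical."""

    def __init__(self) -> None:
        self.owners: dict[str, str] = {}  # base name -> owning account
        self.packages: set[str] = set()   # full names such as "Rails-5"

    def register(self, account: str, full_name: str) -> bool:
        # Split "Rails-6" into the base name "Rails" and the suffix "6".
        base, _, suffix = full_name.rpartition("-")
        if not base or not suffix:
            return False  # no "name-X" structure to check
        # The first account to register any "name-X" claims the base name.
        owner = self.owners.setdefault(base.lower(), account)
        if owner != account:
            return False  # somebody else already owns this base name
        self.packages.add(full_name)
        return True


registry = Registry()
assert registry.register("rails-core", "Rails-5")
assert not registry.register("squatter", "Rails-6")
assert registry.register("rails-core", "Rails-6")
```

Note the sketch assumes every package name ends in "-X"; a name like "left-pad" with no version suffix would be split into "left" and "pad", so a real rule would need a stricter naming convention.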
Note that rails is (officially) hell bent on not using semver at all as significant breakage happens even at minor versions.
The idea is that backwards incompatibility should cause a name change, so really any untaken name would do, e.g. Rails-McTavish, Rails-Ocelot, Rails-馄, Rails-0x389, Rails-3.14159, ...
The alleged reason for skipping windows 9: many software packages were checking for older Windows versions (namely 95/98) by checking if the Windows version string starts with "Windows 9"
source: msft employee who isn't even close to the windows team.
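The kind of check being alleged would look something like this sketch in Python (the function name is made up, and the original programs were of course not written in Python):

```python
def is_windows_9x(os_name: str) -> bool:
    # Hypothetical legacy check meant to match "Windows 95" and "Windows 98".
    return os_name.startswith("Windows 9")


assert is_windows_9x("Windows 95")
assert is_windows_9x("Windows 98")
# A release actually named "Windows 9" would be mistaken for 95/98.
assert is_windows_9x("Windows 9")
assert not is_windows_9x("Windows 10")
```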
that explanation ignores the fact that windows spoofs the windows version number for all applications by default (defaults to 8.1 i think), unless you have a special entry in your application manifest. so old applications will still work by default. not to mention that windows has compatibility shims to deal with this exact issue.
I'm not sure why this is potentially seen as a bad thing.
Call me crazy but I enjoy version locking every dependency because it lets me know exactly what I'm using and it's something I commit to version control so other people know as well. It becomes a form of documentation.
Incidentally, version locking every dependency is identical to just inlining every dependency in the repo. And I actually like this approach a lot. See my example of "exemplary library use" at http://arclanguage.org/item?id=20221. OP was an attempt to reconcile my priorities with the mainstream.
> To begin with, it's weird that versions are strings. Parsing versions is non-trivial. Let's just make them a tuple. Instead of "3.0.2", we'll say "(3, 0, 2)".
3.0.2 alpha 6 says what?
> Next, move the major version to part of the name of a package. "Rails 5.1.4" becomes "Rails-5 (1, 4)". By following Rich Hickey's suggestion above, we also sidestep the question of what the default version should be. There's just no way to refer to a package without its major version.
And maintainers, being already loath to bump the major, start bumping only the minor on breaking changes.
And now if the maintainer breaks the package (either wilfully or without noticing) the user is shit out of luck and jolly well fucked.
> A package manager that followed such a proposal would foster an eco-system with greater care for introducing incompatibility.
I've got this very nice bridge you may be interested in.
> Packages that wantonly broke their consumers would "gain a reputation" and get out-competed by packages that didn't
There is literally no reason for that to happen any more than it currently does.
> The occasional unintentional breakage would necessitate people downstream cloning repositories and changing dependency URLs
So pinning would still exist except you'd have to fork the project you depend on, and would forget to update it ever after? Yeah that doesn't sound like a recipe for complete disaster.
> As a result, breaking changes wouldn't live so long that they gain new users.
Call me back about the bridge thing, you'd love it.
> if you change behavior, rename.
Unless you don't care, or don't notice, then don't, and the whole edifice falls down once again.
Of course, personally, I want every dependency I pick to be consistent and stable forever. But I want to break things and work on new and interesting problems. I'm also using an open source toolchain and am not willing to pay for support or maintenance. Since I follow this stuff online, I'm also interested in new ways of solving old problems.
Maybe there should be a sense of responsibility. That seems to be lacking in this "engineering" climate.
I don't understand this criticism. The "default" in NPM is to use `npm install --save`, which adds the version of the package it installs to your `package.json` so future uses of `npm install` will automatically use a semver compatible version of that package.
You wouldn't let your teammates commit code without review, why trust a 3rd party... (Not saying need to look at diffs of third party libraries)
The complaint seems to be that the update command is not sufficiently safe by default (though the `--conservative`, `--minor` and `--strict` flags help there), which is fair enough, but why not just fix the default behavior?
This is not the case in NPM. `npm update` will only update to the latest version that matches the selector in your `package.json`.
So if you ran `npm install --save` and it wrote 'foo@^1.2.3', `npm update` will not update to release 2.0.0 which includes breaking changes, but will update to 1.2.5 which includes fixes.
The ^ symbol is the default which will allow new features and fixes, but not breaking changes. You can optionally set '~' on a conditional basis or npm-wide default for fixes only, or pin packages only if that's your fancy. But the default seems pretty sensible in my opinion.
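To spell out what the two selectors accept, here is a simplified sketch in Python. It only handles plain "x.y.z" versions with an optional leading ^ or ~ and ignores prerelease tags and the other range syntaxes npm supports; the helper names are made up.

```python
Version = tuple[int, int, int]


def parse(text: str) -> Version:
    major, minor, patch = (int(p) for p in text.split("."))
    return (major, minor, patch)


def upper_bound(selector: str) -> Version:
    """Exclusive upper bound for "^x.y.z", "~x.y.z", or an exact pin."""
    major, minor, patch = parse(selector.lstrip("^~"))
    if selector.startswith("^"):
        # Caret: the left-most non-zero component must not change.
        if major > 0:
            return (major + 1, 0, 0)
        if minor > 0:
            return (0, minor + 1, 0)
        return (0, 0, patch + 1)
    if selector.startswith("~"):
        # Tilde: only the patch component may change.
        return (major, minor + 1, 0)
    return (major, minor, patch + 1)  # no prefix: exact pin


def satisfies(version: str, selector: str) -> bool:
    lower = parse(selector.lstrip("^~"))
    return lower <= parse(version) < upper_bound(selector)


assert satisfies("1.2.5", "^1.2.3")      # fixes are accepted
assert satisfies("1.3.0", "^1.2.3")      # new features are accepted
assert not satisfies("2.0.0", "^1.2.3")  # breaking release is rejected
assert not satisfies("1.3.0", "~1.2.3")  # tilde stops at patch level
```

With a caret on a 0.x version the window narrows to the minor component, so "^0.50.0" still accepts 0.50.1.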
> The behaviour of package-lock.json was changed in npm 5.1.0 by means of pull request #16866. The behaviour that you observe is apparently intended by npm as of version 5.1.0.
> That means that package.json can trump package-lock.json whenever a newer version is found for a dependency in package.json. If you want to pin your dependencies effectively, you now must specify the versions without prefix, that means you need to write them as 1.2.0 instead of ~1.2.0 or ^1.2.0. Then the combination of package.json and package-lock.json will yield reproducible builds. To be clear: package-lock.json alone does no longer lock the root level dependencies! 
The release notes he references are an absolute farce. All of this bluster about how "npm@5's first semver-minor release!" is going to provide "a much more stable experience." And yet,
> It fixes [#16866], allowing the package.json to trump the package-lock.json.
Fixes, ha! Even that link is broken.
Folks, that was a major breaking change. And they introduced it in a "minor" update. I agree with Rich Hickey that "semver" is an epic failure. He even uses package managers as an example in the cited talk, saying, What if you had to worry about what "version" of Maven central you were using? Well, that's exactly what npm did (only in the client).
How did I stumble onto this point? Because it broke the very first CI build I deployed to GitLab. Worked locally with Rollup 0.50.0, but Rollup 0.50.1 (which the CI used, because caret) introduced a regression that happened to break my package.
So yeah, npm's default is not appropriate for CI. It assumes that patch updates are non-breaking, and we all know that they're not.
This has long been a curse of ROS, the Robot Operating System, which is a collection of vaguely related packages which speak the same interprocess protocol. Installing ROS tends to create package clashes in Ubuntu, even when using the recommended versions of everything. This gives us a sense of what version pinning does to you after a decade.
Tools for computing the total technical debt from version pinning in a package would be useful. You should be able to get a list of packages being used which have later versions, with info about how far behind you are in time and number of updates. Then at least you can audit version pinning technical debt. You can assign someone to bringing the technical debt down, updating packages to the latest version and running regression tests.
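A minimal sketch of what such an audit could compute, in Python. The `Release` record, the function name and the sample data are all hypothetical; a real tool would pull release history from the package index, and this version assumes the pinned version appears in that history.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Release:
    version: tuple[int, int, int]
    published: date


def pinning_debt(pinned: tuple[int, int, int],
                 releases: list[Release]) -> tuple[int, int]:
    """Return (number of newer releases, days behind the newest one)."""
    newer = [r for r in releases if r.version > pinned]
    if not newer:
        return (0, 0)
    # Assumes the pinned version is present in the release history.
    current = next(r for r in releases if r.version == pinned)
    latest = max(newer, key=lambda r: r.version)
    return (len(newer), (latest.published - current.published).days)


# Made-up release history for one dependency.
history = [
    Release((1, 0, 0), date(2017, 1, 1)),
    Release((1, 1, 0), date(2017, 1, 15)),
    Release((2, 0, 0), date(2017, 1, 31)),
]
updates_behind, days_behind = pinning_debt((1, 0, 0), history)
assert (updates_behind, days_behind) == (2, 30)
```

Summing these pairs over every pinned dependency gives a crude debt figure that someone can be assigned to bring down.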
If you want to be absolutely anal about versioning, do your commit-based pinning all you want, just don't drag us all into it.
If libraries upon major version changes were renamed instead, the programmer would have to maintain a mental mapping between the predecessor library and the successor, and all code and configuration references to the other artifact would have to be updated 'by hand', without assistance to be gained from the package manager.
Including multiple versions of a library in your project is unsupported in many environments, so if the library author ever envisions a situation where the two 'versions' will be used concurrently, renaming makes sense. In the Java ecosystem, this approach was used for Apache Commons Lang, Apache HTTP Client, and Jackson.
Semantic versioning has good intentions, but adherence and accuracy of the signalling is variable. Absent external enforcement, that's just the way it is -- it's a self-asserted string that no one bothers to validate and everyone hopes they can rely on. If package repositories were more than just storage, perhaps this would be a different story.
It's strange to extol the virtues of Go's dependency management, because by convention it binds the artifact identity to a resolvable external URL without any indirection. Meatspace maintenance changes like changing the hosting location of the artifact will cause Go to treat it as a different package, and inability to directly communicate externally will also break your build. Every other packaging solution has independently come to the conclusion that abstracting away artifact identity from artifact location is a good thing, but to each their own.
[Granted there's some evidence that language does affect thought patterns both for good and for ill, but overall I'm going to say my answer is...] No. Someone expressing a bad idea in German is still an idiot, except he's more of a Dummkopf.
So I think the problem with this piece is that the title and part of the argument seem to assert that versioning doesn't guarantee anything about package quality or "breaking-ness" (which is true). But then it goes on to assert that this other versioning system would help change that. I doubt it. Naming and versioning are just descriptors of a thing, and may not accurately describe the thing. In other words, a turd by any other name still smells like a turd.
The package itself is the thing. The only way to know anything about it is to test it with your own stuff before you migrate to it. Therefore always pin the last version that works, and when you become aware of a newer version, test with it, read the changelog, make a decision. Treat it like your own software, in other words. Because you're making it part of your own software - the software you'll be held responsible for.
If the author is serious about the proposals, he should really do some work to figure out why the only people to have actually tried them hate them so much, and not just relegate it to an offhand comment in a footnote.
Version numbers are what you expose to the real world, something which semantic versioning tries to standardize. I shouldn't have to care if you use git, hg or copy-paste-versioning internally.
Pinning versions in Gemfile.lock is totally fine. Pinning versions in Gemfile is a bad idea.
You specify what version constraint you want for each dependency (latest.release, major.+, major.minor.+, a specific version number...). The tool does the work of resolving those constraints to specific artifact references, which you commit to your repo in the form of a dependency.lock file that gives you reproducible builds. You can then resolve new artifact versions with `gradle generateLock`, or revert it to a previous commit if there is a breakage.
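The constraint-to-lock step can be sketched roughly like this in Python. This is not Gradle's resolver, just an illustration of the idea; the function names, the package names and the index contents are made up, and versions are assumed to be plain integer tuples.

```python
def resolve(constraint, available):
    """Pick the highest available version that matches one constraint."""
    if constraint == "latest.release":
        prefix = ()
    elif constraint.endswith(".+"):
        # "2.+" -> prefix (2,), "2.3.+" -> prefix (2, 3)
        prefix = tuple(int(p) for p in constraint[:-2].split("."))
    else:
        exact = tuple(int(p) for p in constraint.split("."))
        if exact not in available:
            raise LookupError(f"version {constraint} is not available")
        return exact
    matches = [v for v in available if v[:len(prefix)] == prefix]
    if not matches:
        raise LookupError(f"nothing satisfies {constraint}")
    return max(matches)


def generate_lock(constraints, index):
    """Resolve every declared constraint to one concrete version."""
    return {name: resolve(c, index[name]) for name, c in constraints.items()}


# Hypothetical index of published versions.
index = {
    "json-lib": [(1, 0, 0), (1, 2, 0), (2, 0, 1)],
    "http-lib": [(4, 5, 0), (4, 5, 3), (4, 6, 0)],
}
lock = generate_lock({"json-lib": "1.+", "http-lib": "4.5.+"}, index)
assert lock == {"json-lib": (1, 2, 0), "http-lib": (4, 5, 3)}
```

The resulting mapping is what gets committed; re-running the resolution later may yield newer versions, and reverting the commit restores the old ones.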
It's not clear to me what problem OP is trying to solve here.
Perhaps my two links at the start will be better uses of your time. Particularly http://stevelosh.com/blog/2012/04/volatile-software.
> the fact that manually specifying version numbers to avoid running newer code is commonplace, expected, and a “best practice” horrifies me
I think version pinning is not only a "best practice", I think it is absolutely necessary. I demand reproducible builds.
There's a distinction here between how you pick a version in production, and what happens when you do an `npm update` or `gem update`. I'm getting the sense since my last comment that in the Java world the second flow doesn't actually exist. What is your typical workflow for updating your libraries (fixed in the Gradle lockfile) to newer versions? In the Rails world you specify your dependencies in a file called Gemfile, and running `bundle install` fixes the versions chosen for them in a file called Gemfile.lock. In this context, Steve Losh is complaining about the former. Versions in Gemfile.lock are absolutely fine. It's auto-generated after all. Versions in the manually managed Gemfile are a smell.
Does this make sense? I think we have a misunderstanding rather than a fundamental disagreement. I'm actually kinda glad to hear that you had this confusion even after reading the OP (and thanks for doing so!). It makes me feel better about my own writing.
Re-generating the lock file would pull the latest versions that satisfy the constraints. That lock file is not supposed to be edited manually.
Ultimately the important things to the user are "is this version greater than this other version (implying better)", "will upgrading to this break anything" (extremely hard to answer), and "do I have to upgrade this to remain secure" (which may also break everything).
When a major change is made to something, as in a breaking change, where you as the developer are making the conscious decision that you will cause other people's shit to break if they keep pulling "latest" I absolutely think that the major version number should just be part of the naming scheme, not the versioning scheme.
We've actually implemented a similar system where I'm currently working. For versioning APIs, if you want to bump a major version then we force developers to start working with a whole new git repository, a whole new pipeline etc. If we have to patch some defect fix back to the old version, we can just cherry pick across forks, but most of the time we want to manage the life cycle of two bits of development that no longer do the same thing as being exactly that: bits of developed code that do different things.
And what if that decision is not conscious (breaking changes due to minor changes happen all the time), or what if the breakage is conscious but necessary to fix something considered a bug? What then?
The format of Product A Version B.c seems to be most actionable as long as it's consistent.
A could be a major shift in the product. So something like Windows 98, Windows 2000, iPhone 3 and 4, Android Gingerbread and ICS.
Version B would update whenever it's unstable and possibly breaking. So a version 2.0 or 3.0 would introduce lots of new things, but new things tend to be unstable, even after extensive testing.
Version .c is more stable as the number increases. So a version 4.88 is more stable than version 4.81
I don't see how more dots (e.g. version 4.81.2 or 4.81c) are actionable.
As the article brings up, the problem is when the defaults point to a shiny new unstable version instead of a stable older one. So the defaults should point to the highest c number, just a version below B. If the latest build is Version 7.2, the stable build might be 6.214.
> At this point you could even get rid of the version altogether and just use the commit hash
Right. Until someone brazenly force pushes to a repository, and makes it hell for your package manager. You would now have to manually hunt your dependency's repository for the closest commit hash to the one that was removed. :|
> Package managers should provide no options for version pinning. A package manager that followed such a proposal would foster an eco-system with greater care for introducing incompatibility.
We all want to live in this ideal world. But this requires developers to trust the authors of upstream dependencies. Sometimes this is hard to do, especially for critical projects. I just want my shit to continue working, and not be subject to the whims or genuine human errors of upstream authors.
On another note, I always understood that using the latest version by default was sensible. For me, it means that, in a new project, one should start with the latest version of whatever third-party packages one wants to use. Once development actually gets underway, then the latest versions at that time should be pinned.
If my package is named "LeftPad-17", what's to stop someone else from creating a package named "LeftPad-18", which innocent folk may assume is the latest version of LeftPad?
A project name is semantically different from the version number of that code.
Except that it's not. Sure, as far as halfway-guaranteed compatibility goes, it's true, but the name identifies the project:
* Roughly speaking, the purpose of the software doesn't change with major versions.
* Also, roughly speaking, the people working on the software don't change with major versions.
Both of which are, arguably, more important than raw compatibility.
"I recently encountered this post from 2012 by Steve Losh, pointing out that if version numbers were any good, we'd almost always be looking to use the latest version number."
No, we wouldn't. When I build a piece of software, I build it with a particular version of its dependencies. Then, I test it with those versions. After I've tested it, I absolutely do not need changes in the underlying software changing how things work. Compatibility is never perfect.
"In particular, Semantic Versioning is misguided, an attempt to fix something that is broken beyond repair. The correct way to practice semantic versioning is without any version strings at all, just Rich Hickey's directive: if you change behavior, rename."
Semantic versioning includes a major number, a minor number, and a micro number; they have different semantics. Mapping that scheme to Java, a major number change means no compatibility is guaranteed. But minor and micro changes do make partial compatibility guarantees. Specifically, if you add a method specification to an interface, everything that implements that interface has to change to match it, but methods that accept that interface as a parameter do not need to change.
If user software implements an interface, changing that interface requires a change to at least the minor number. If user software uses, but does not implement, an interface, changing that interface may require only a micro number change.
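The implementer/caller asymmetry can be shown with a small sketch. It uses Python abstract base classes as a stand-in for Java interfaces (an analogy, not the same mechanism: Java reports the problem at compile time, Python at instantiation), and all class names are made up.

```python
from abc import ABC, abstractmethod


class ShapeV1(ABC):
    @abstractmethod
    def area(self) -> float: ...


class ShapeV2(ABC):
    """The same interface after a release that adds a method."""

    @abstractmethod
    def area(self) -> float: ...

    @abstractmethod
    def perimeter(self) -> float: ...


def describe(shape) -> str:
    # A caller that only uses area() works against either version.
    return f"area={shape.area()}"


def make_square_class(interface):
    """User code written against V1: it implements only area()."""
    class Square(interface):
        def __init__(self, side: float) -> None:
            self.side = side

        def area(self) -> float:
            return self.side ** 2
    return Square


# Against V1 the implementer works and the caller is happy.
assert describe(make_square_class(ShapeV1)(2.0)) == "area=4.0"

# Against V2 the same implementer breaks: perimeter() is missing.
try:
    make_square_class(ShapeV2)(2.0)
    implementer_broke = False
except TypeError:
    implementer_broke = True
assert implementer_broke
```

`describe` never changed between the two cases, which is the reason a change that only affects implementers can be signalled differently from one that affects every caller.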
I'm really fond of how Google does versioning in their internal codebase (from the public information about it that I know). They have a monorepo containing a single version of every library. When updating something, you're supposed to take into account all the places it's being used, and avoid breakages. Tests and tooling play an important role in making it possible.
I really want something like that for the open-source world, instead of having to deal with maintaining multiple versions, and dealing with the complexity of interactions between all the permutations of different versions.
[/naive wish mode]
c.f. Koji+Bodhi for Fedora, Open Build Service for openSUSE, and the OpenQA instances for both distributions.
In short, we really seem to have forgotten the power of backward and forward compatibility because we assume a very short lifespan for our services (months, not decades).
Most apps merrily pull in hundreds of dependencies. Some deps need a complete build environment, others may require specific versions that quickly lead to fragility, dependency hell and thousands of wasted hours.
These kinds of package managers only make sense and see maximum use in SaaS-type apps or dev-centric environments, which perhaps explains why some devs do not realise the issue. But this just doesn't make sense for deployment.
SemVer is like a language (e.g., English): the more people speak the common language, the easier it is to communicate. But that doesn't prevent others who might speak EmojiVer from communicating. It certainly won't stop people using packages from both worlds.
No it doesn't. RubyGems allows you to say "I want to depend on the latest 2.x", and if your dependencies use semantic versioning it works pretty well.
You still want to pin your dependencies versions for deploying the versions you tested, but you should be able to safely upgrade using that scheme unless your dependencies screw up.
I don't love lockfiles, but you need some way to get deterministic dependency resolution.
I disagree with the “change the name” business. Sure, if your math routines turn into string routines, you probably need a name change. But if you are sticking to generally the same problem domain, but clients can expect breakage, then keep the name. The name is important.
This is not to say there have been bad ideas. Nor that there have been nothing but bad solutions. Indeed, I'd wager most solutions were perfectly fine, in isolation. They typically don't play well with other solutions, though. :(
Nitpicking, but this is not true. Lots of software is never formally released, and therefore doesn't have any version. Sometimes such software is useful enough to make its way into a software distribution, in which case the distro maintainer has to invent a version number for it. Ugh.
Sure, semver should be associated (pinned?) to build numbers. Invariant. Immutable. Never changing.
Semver is the public face. But internally, I only care what build number someone is talking about.
The Rust ecosystem might be a demonstration of the inverse problem. Pinning versions is trivial in Rust because it's the default. You will never be pushed over a major version boundary without intentionally doing it (and "always get the latest" is just the "*" version).
The problem then becomes that crates never reach 1.0. Flask is probably the oldest piece of software I use that similarly never reached 1.0 - albeit it's been pretty close ever since the project was reorganized last year - but the entire Rust ecosystem is buried in Flasks. Software that is 99% of the way to what the creator wants 1.0 to be, but the last 1% is something nobody wants to do. When you are 99% of the way there, the software works in 99% of use cases, and that last 1% is... the last 1% anyone wanted to do. Which means they didn't want to do it.
Probably the most blatant example of this in Rust is regex, which is a blessed crate that is still only at 0.2, among many others (just from my hobby Rocket project, base64, chrono (and time), cookie, dotenv, envy, error-chain, lazy_static, num-traits, rand, redis, and uuid are all still at version 0.x). That being said, the impl period going on right now and this entire year of direction from the core team was explicit in bringing as much of these fundamentals to 1.0, so we will see how successful the effort is.
Probably the greatest problem with semantic versioning in that context becomes feature creep. It is scary to go to 1.0 - you feel free in 0.x land where you can make major breaking changes in a .x release whereas you need to increment that scary major version number to do it later on. And besides the fear there is a lot of logistical headache properly maintaining semantically versioned software - if you ever release a 2.0, you should expect someone to try to contribute 1.x bugfixes down the road, with the need to release 1.x.y bugfixes for all time. Because 2.0 is, like the article says, a new library - it's a different API, you changed the meaning. So you are now maintaining two libraries.
There is also one final fear with the way Rust set up its crates ecosystem - if you are bold and break things but don't end up delaying 1.0 for way too long, you might end up incrementing the major version a bit. And there is a cultural and subconscious aversion to anything on crates.io you see at version 3.0 or heavens forbid 4.0 or more. That software is unreliable, the developer is changing it all the time! But then you go and use a 0.15 crate that is having the same problem anyway, just without saying "this probably does its job" like a 1.0 can.
In the end, versioning truly is almost meaningless; even in an enforced semantic versioning system the intent breaks down and meaning is lost just because different people release software differently. But that is a real deep almost - because it's still more information than not having it, and in Rust right now at least it gives more helpful information than not. I'd call it a success at that point - way more than say Linux, where the major version is incremented whenever Linus wants to run a Google+ poll...
> And there is a cultural and subconscious aversion to anything on crates.io you see at version 3.0 or heavens forbid 4.0 or more. That software is unreliable, the developer is changing it all the time!
I've never seen anyone at all express this attitude; in fact, I see the opposite: if you're at any version past 0, you get heaps of praise for being willing to do the work it takes to commit to having stable releases.
I've seen so many people come and go at my last job and at other companies, I don't think anybody cares.
> Packages that wantonly broke their consumers would "gain a reputation" and get out-competed by packages that didn't [...]
This quote illustrates a host of fallacious assumptions in the article:
1. That there's a large enough competitive field for any given package that one that publishes a "bad" release can be "punished" by being overtaken by competitor(s). This is weird on so many levels. The costs to switch to a competing library can often be quite high, more than just fixing the current one. Related, there's often only one viable library choice in a given domain, which implies both zero competitors and a high cost of using an alternative (whole-cloth rewrite).
2. QA is expensive: time to maintain automated tests, time to run any needed ad-hoc testing, time and experience to develop the library in a really robust way. But this proposal seeks to put in a competitive incentive that would fragment efforts to solve the same problem. So there's this assumption that the publisher is only ever just incompetent, not starved for resources to implement great software quality. A major success of open source packages is that they enable consolidation of effort (i.e. a community of contributors).
3. Sometimes a library is used in contexts that weren't foreseen by its authors, so a "breaking" change for some consumer is accidentally introduced. This may have happened before a package consumer strongly adopted it. This dashes up against an implication in the article that breakage is only ever introduced, vs. already existing. Packages are never in some pristine state of function and concept. Except for some kinds of simple packages, they're often ongoing, living exercises in understanding the real-world problem domain and applying software techniques to the solution.
Overall, I think saying that modern package managers "default" to unpinned is a straw-man: no sane package manager is intended to be used this way except for a few minutes after setting up a new project. "bundle install" that first time and now you've got a Gemfile.lock. The next time, you get the same packages as the prior install. Likewise, "bundle update" is an operation that should be dirtying your source tree, requiring the usual passes of building, testing, and other software quality process before it lands. AFAICT, nothing in this proposal would ever change these steps: initial package adoption, and software change evaluation. If there are problems with upstream package quality, those are rooted in deeper issues of "why software quality is hard" rather than being the fault of current package managers. It's difficult to see how our package managers could influence this real-world social problem at all.
Unless, you know, you care about the purpose or authorship of the library.
I don't think pinning dependencies at package levels is a bad idea - Ubuntu does this for a lot of packages, but this relies on properly followed semantic versioning to work.
And how do you encode development versions? Alphas/betas/rcs? Versions are structurally more complex than a simple tuple, and they're strings b/c that's a convenient and easy representation.
And you can't just tell people to not have alphas, etc. Sometimes I need to test stuff on branches, so it can't have a normal, mainline version number; I need to signal that this is a proposed, test version, etc. Ignoring the human consequences here defeats the point of having a version number.
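To make the "more complex than a simple tuple" point concrete, here is a rough Python sketch of how pre-release tags are ordered under the SemVer 2.0.0 precedence rules. The names `parse` and `precedence_key` are made up for illustration, and the regex is deliberately looser than the spec's grammar (it does not reject leading zeros or empty identifiers).

```python
import re

# Loose pattern: MAJOR.MINOR.PATCH, optional -prerelease, optional +build.
SEMVER = re.compile(
    r"^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?(?:\+[0-9A-Za-z.-]+)?$"
)


def parse(text):
    """Split a version string into a numeric core and pre-release identifiers."""
    match = SEMVER.match(text)
    if match is None:
        raise ValueError(f"not a semantic version: {text!r}")
    major, minor, patch, pre = match.groups()
    identifiers = pre.split(".") if pre else []
    return (int(major), int(minor), int(patch)), identifiers


def precedence_key(text):
    """Sort key following SemVer precedence (build metadata is ignored)."""
    core, identifiers = parse(text)
    # Numeric identifiers compare as numbers and rank below alphanumeric
    # ones, which compare as plain ASCII strings.
    pre_key = [
        (0, int(ident), "") if ident.isdigit() else (1, 0, ident)
        for ident in identifiers
    ]
    # A plain release (no identifiers) sorts after all of its pre-releases.
    return (core, 0 if identifiers else 1, pre_key)


versions = ["1.0.0", "1.0.0-rc.1", "1.0.0-beta.11", "1.0.0-beta.2", "1.0.0-alpha"]
print(sorted(versions, key=precedence_key))
```

So a branch build tagged something like 1.3.0-mybranch.1 still has a well-defined place (before 1.3.0), which a bare (major, minor, patch) tuple cannot express.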
> package managers uniformly fail to provide the sane default of "give me the latest compatible version, excluding breaking changes."
Cargo? NPM? (And yes, I realize the author lists this one!) As best as I can tell, the author's complaint seems to be that you have to actually notate enough information for the package manager to determine "latest compatible version, excluding breaking changes", which both Cargo and NPM are very capable of. (E.g., in Cargo, you'd need to say something like ^1.2.) But being able to pull down the absolute latest is useful in a new project; I'll often use dep = "*" to start, get comfortable, and then restrict that to dep = "^3.2" or whatever.
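For anyone unsure what ^1.2 actually buys you, here is a small Python sketch of caret matching as I understand Cargo's rules: the leftmost non-zero component stays fixed, and anything to its right may move forward. The function names are invented for the example, and it ignores pre-release tags and every other requirement operator.

```python
def parse(text):
    """Turn '1.2' or '1.2.3' into a list of ints."""
    return [int(part) for part in text.split(".")]


def caret_bounds(requirement):
    """Return (lower, upper) version tuples for a requirement like '^1.2'."""
    parts = parse(requirement.lstrip("^"))
    lower = tuple(parts + [0] * (3 - len(parts)))
    # Bump the leftmost non-zero component; if every given component is
    # zero, bump the last one that was written out.
    index = next((i for i, n in enumerate(parts) if n != 0), len(parts) - 1)
    upper = tuple(parts[:index] + [parts[index] + 1] + [0] * (2 - index))
    return lower, upper


def satisfies(version, requirement):
    """True if a full 'x.y.z' version falls inside the caret range."""
    lower, upper = caret_bounds(requirement)
    return lower <= tuple(parse(version)) < upper


def latest_compatible(available, requirement):
    """Newest available version that satisfies the requirement, or None."""
    matches = [v for v in available if satisfies(v, requirement)]
    return max(matches, key=lambda v: tuple(parse(v)), default=None)


print(caret_bounds("^1.2"))    # lower and upper bound for a 1.x requirement
print(caret_bounds("^0.2.3"))  # in 0.x, the minor version acts as the breaking one
print(latest_compatible(["1.1.0", "1.2.5", "1.10.0", "2.0.0"], "^1.2"))
```

Note the 0.x case: under this rule a 0.2 to 0.3 bump is treated as breaking, which is why crates can sit at 0.x and still make incompatible changes without touching the major number.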
> Since we always want to provide the latest version by default, the distinction between minor versions and patch levels is moot.
The distinction is a human, social-level communication construct.
And is 9583af87a8b newer or older than ab8a61fe9ac?
I can't tell if this article is advocating that a package's name always refers to a single, unchanging version of the software (which effectively removes any notion of version handling from the package manager entirely) or if we're just putting the major in the name, and always running the latest version for that major. This latter stance seems to imply that the version does carry useful information. But not being able to pin is nuts: if some breakage is introduced, you should just screw everyone? How would reproducible builds work?
> In particular, Semantic Versioning is misguided
The article doesn't convince me that it understands the reasons behind semver: communication between humans as to what types of changes a potential upgrade contains. The tooling supports this so that I can pull in changes that shouldn't break the build, test them to see if that actually is correct, and then deploy them automatically, with minimal risk. Between security fixes and bug patches, I have a need to know what types of updates I'm looking at, and to ensure that the updates that aren't breaking changes get applied diligently — and that's why we have the notion of semver: so that we can all communicate this the same way, and so that our tools can understand it, and act accordingly.
Now, just because semver says that 1.3 should be compatible w/ 1.2 doesn't necessarily mean it is. Mistakes happen. That's what tests are for, and it doesn't mean you need to blindly upgrade; this is why Cargo has separate notions of the versions that the project should be compatible with, such as ^1.2 recorded in a Cargo.toml, and the versions the project is actually using, exactly, which are contained in the lock file, so that the project can be rebuilt exactly as a previous build was, but can also be trivially upgraded where and when possible.
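A toy model of that manifest/lock split, in Python. This is not how Cargo is implemented; `compatible`, `resolve`, and `install` are hypothetical names, versions are plain tuples, and the compatibility rule is a simplified caret rule that only makes sense for majors >= 1.

```python
def compatible(version, minimum):
    """Simplified caret rule for majors >= 1: same major, not older than minimum."""
    return version[0] == minimum[0] and version >= minimum


def resolve(manifest, registry):
    """Pick the newest compatible version of every dependency (an 'update')."""
    return {
        name: max(v for v in registry[name] if compatible(v, minimum))
        for name, minimum in manifest.items()
    }


def install(manifest, lock, registry):
    """Reuse locked versions; resolve only dependencies the lock does not cover."""
    resolved = resolve(manifest, registry)
    return {name: lock.get(name, resolved[name]) for name in manifest}


registry = {"leftpad": [(1, 2, 0), (1, 2, 5)]}
manifest = {"leftpad": (1, 2, 0)}  # stands in for leftpad = "^1.2"

lock = install(manifest, {}, registry)        # first install writes the lock
registry["leftpad"] += [(1, 3, 0), (2, 0, 0)]  # upstream publishes new releases

print(install(manifest, lock, registry))  # unchanged: the lock still wins
print(resolve(manifest, registry))        # explicit update: newest 1.x, never 2.0.0
```

The point is that new upstream releases change nothing until someone explicitly re-resolves, and that re-resolve shows up as a diff to the lock that can be tested and bisected like any other change.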
The article doesn't acknowledge the very real problems that semver tries to solve, and doesn't tell us how its proposal (whatever that is) would solve these issues. Instead, we get "if you change behavior, rename"; the very real outcome of this proposal is that either people would never apply another security patch again, and that dependency would rot, or they'd start reinventing the wheel named semver.
What we need is good, language-independent tooling that automatically selects which versions to use and reports how risky the update is going to be and what problems are being fixed.
Tracking and correlating successful and failed updates centrally - call it "distributed CI".
Tentatively upgrading dependencies, running tests and rolling back on error.
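As a rough illustration of the "tentatively upgrade, test, roll back" loop, here is a minimal Python sketch. Everything in it is hypothetical: `try_upgrades` is an invented name, and `run_tests` stands in for whatever actually installs a dependency set and runs the project's test suite.

```python
def try_upgrades(lock, candidates, run_tests):
    """Try each candidate upgrade on its own; keep it only if the tests pass.

    lock:       dict of dependency name -> currently pinned version
    candidates: dict of dependency name -> proposed new version
    run_tests:  callable taking a full dependency dict, returning True on success
    """
    accepted = dict(lock)
    rejected = {}
    for name, version in candidates.items():
        trial = dict(accepted)
        trial[name] = version
        if run_tests(trial):
            accepted = trial
        else:
            # Roll back: `accepted` is left as it was before this trial.
            rejected[name] = version
    return accepted, rejected


def fake_test_suite(deps):
    """Pretend test run: this project breaks on b 2.0.0."""
    return deps["b"] != "2.0.0"


new_lock, failures = try_upgrades(
    {"a": "1.0.0", "b": "1.4.0"},
    {"a": "1.1.0", "b": "2.0.0"},
    fake_test_suite,
)
print(new_lock)   # a moves forward, b stays pinned
print(failures)   # the upgrade that had to be rolled back
```

The `rejected` map is the kind of data a central "distributed CI" service could aggregate across projects to estimate how risky a given release is.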
Don't forget that even minor, bugfix releases introduce bugs or break code by actually fixing bugs that people were inadvertently relying upon.
Lack of automated tools only encourages the "vendorize, ship and forget" model that is so popular.