Why Semantic Versioning Isn't (2015) (gist.github.com)
61 points by hliyan 19 days ago | 44 comments



Rich Hickey made a similar (but more clear) case against semantic versioning in his Spec-ulation keynote https://www.youtube.com/watch?v=oyLBGkS5ICk

He advocates talking about "breakage" and "accretion" instead of the vague term "change".

His most controversial point, as far as I remember, was that any breaking change (i.e. a major version bump in semver) might as well be considered another product/package/lib and should better choose another name instead of pretending to be the same thing with a bumped number. For me that is taking it too far in most cases (e.g. when the intent of the software remains the same), since it would also have very disruptive effects on the community, documentation, reputation, ... of the package.

Another thought about semver: a bugfix requires only a minor version bump, but IMHO this could also be breaking if people were relying on the buggy behavior. I see the value of semver, but I guess it will always be a too-neat abstraction over something that is inherently too complex to communicate in a simple version string.


> was that any breaking change (i.e. a major version bump in semver) might as well be considered another product/package/lib and should better choose another name instead of pretending to be the same thing with a bumped number

Rich Hickey is a proponent of "hammock-driven development." He's the antithesis of the Zuckerberg "move fast and break things" culture. If you're considering a breaking change, then maybe it's a good idea to hang on to it for a while until you've got a clearer picture of what you're doing. Then maybe you can aggregate a bunch of breaking changes into a partial (or total) redesign and ship that as a new product.

Yes, sometimes you may end up with the Python 3 scenario. But I think that's the right place to have a philosophical debate, rather than between only the insiders who participate in the decision to merge the breaking change.


As a Clojure user I very much agree with Rich's careful way of designing software.

But in the case of the Clojure language itself, I'd personally prefer a big-bang version 2.0 with significant backwards-incompatible changes to clear up some accumulated inconsistencies, over a fork into yet another language with a new name. Such a fork risks splitting the community in two, each below a critical level, and also makes the decision to upgrade more open-ended (i.e. a 2.0 release explicitly supersedes 1.x and thereby communicates that it is supposed to be better and recommended. A fork is much more open-ended and provides more arguments to part of the community to stay behind).

But there is obviously a point where the original is sufficiently unrecognizable that a new name is more fitting.


> Another thought about semver: a bugfix requires only a minor version bump, but IMHO this could also be breaking if people were relying on the buggy behavior. I see the value of semver, but I guess it will always be a too-neat abstraction over something that is inherently too complex to communicate in a simple version string.

I find this approach to versioning and bugs problematic, because public APIs are about the contract with the user. It's hard to work with contracts with undocumented portions. If my API says that the behavior is supposed to be one thing, and in certain cases it's different, my obligation is to bring the behavior to be consistent with what my API documentation says it's supposed to do. I feel like any other position would be a slippery slope into arguing that any change in behavior needs to be considered a breaking change, and therefore there's no such thing as a patch version change.


It's easy to treat all things in programming as if they're algorithms, but we should remember that there are people behind them too.

As a maintainer, it's on you to communicate effectively to your users. If you're patching buggy behavior but it's likely to break code, then a major bump may be appropriate. Likewise if your refactor is massively changing your basic contracts, then it may be wise to fork your own work and move to a new repo and name.

Semver's value is that it offers a clear enough guideline most of the time, which is about the best we can ask for.


Can't seem to find any cohesive point in this piece.

NOTE: Should be noted this is outdated as it's from 2014 (title says 2015, but that's just the last—minor—revision date). Particularly the stuff about Node running 0.x for a long time—effectively subverting the benefits of semver.

Arguments seem to boil down to:

- semver is harmful because some major libraries don't use semver (mainly discusses Node, which has switched to semver since this was written)

- semver is overly restrictive? because sometimes you want to make breaking changes without incrementing the major version number? Maybe? Don't quite understand the arguments made here so I may be misreading.

Is there anything else said?


> Can't seem to find any cohesive point in this piece.

Compressing complex, nuanced information into three integers is a fool's errand.


Isn't the whole point of semver that it's not complex or nuanced? A version is either backward-compatible or it's not.

The fool's errand would be expecting semver to encode all information about a version. That's not its purpose. Its purpose is to convey to users of a version some basic information about how it relates to previous versions from a user's perspective.

For more complex or nuanced information, the semver triad is not the place to look. Use git history, readme, changelog, release notes, etc.

It is tiresome to see people dismiss semver as bad because they misunderstand its purpose. "Semver is bad because it isn't what I think it should be" (then publish your own scheme), or "Semver is bad because it doesn't encode everything" (only the source itself could do that), or "Semver is bad because it would require some arbitrary numbers to increment faster than I am comfortable with" (i.e. "eww, semver has cooties").


SemVer does not handle complex or nuanced information, which is one of the beefs the article has with it. SemVer does have answers for the question it asks "what if you only broke it for 1% of users?" -> "tough luck, increase the major version" It seems that the author constructs a straw man for SemVer (albeit from real life examples) and then complains that it no longer serves its purpose.

Contrary to the article, SemVer is okay for determining whether there is a compatibility break, as long as it is followed to the letter. Is it overly strict? Yes it is, but it does guarantee results. Maybe the argument should be that not all libraries should follow SemVer, because it is a burden similar to certifications and regulatory procedures.


What if a patch or minor version change causes the library to break?

This was a question a lot of people had to ask when npm didn't use package-lock.json and the `--save` option would automatically select a range of allowable versions.

Your coworker could `npm install` something entirely different than you and have all sorts of bugs because of it. And there are plenty of documented cases of exactly this happening.
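For anyone unfamiliar with the "range of allowable versions" mentioned above: npm's default save prefix is the caret (`^`), which accepts any later version that keeps the left-most non-zero component unchanged. Here is a rough Python sketch of that rule, assuming plain `MAJOR.MINOR.PATCH` strings with no pre-release tags. The names `parse` and `satisfies_caret` are made up for illustration and are not npm APIs.

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, int, int]:
    """Parse a plain 'MAJOR.MINOR.PATCH' string (no pre-release tags)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def satisfies_caret(version: str, base: str) -> bool:
    """Return True if `version` falls inside the caret range `^base`."""
    v, b = parse(version), parse(base)
    if v < b:
        return False                  # never resolve below the base version
    if b[0] > 0:
        return v[0] == b[0]           # ^1.2.3 allows anything below 2.0.0
    if b[1] > 0:
        return v[:2] == b[:2]         # ^0.2.3 allows anything below 0.3.0
    return v == b                     # ^0.0.3 allows only 0.0.3


# Two machines resolving "^1.2.3" on different days can both be "correct":
print(satisfies_caret("1.2.3", "1.2.3"))  # True
print(satisfies_caret("1.9.0", "1.2.3"))  # True
print(satisfies_caret("2.0.0", "1.2.3"))  # False
```

Without a lockfile, both 1.2.3 and 1.9.0 are valid installs for the same manifest, which is how two coworkers end up running different code.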

The issue I see with semver is that it generally does not survive contact with the real world. People make mistakes when they package software, especially when they do it "without any warranty, express or implied". There's simply no substitute for testing whether a new package version will work or not; so in a realistic sense, semantic versioning simply doesn't work.


I agree that the version number does not give you a guarantee that the software is without bugs. However, it does work the other way around: if the major version was bumped then you know something has broken and you will have to modify your code. If it was a bug fix release and something breaks, it is on them, not on you (as in, roll back your dependency upgrade and complain on their bug tracker). That is, if you did not rely on buggy behavior in the first place.

As a side effect, if you really try to follow SemVer it will force you to think ahead a bit more because the fact that you broke the API is reflected in something quite obvious, rather than buried in a change log somewhere.


I mean, you're still going to be issuing release notes.

I think it's important to realize that semver isn't about how accurately you summarize things. It's about what actions your customers will take from your version numbers.

If they're enterprise, then they'll expect to be able to apply anything short of a major version number change with an in-place software update, and they expect that they will not experience any significant or foreseeable breaking changes. It doesn't matter if you're using semver or not, this is how your enterprise customers will behave.

Part of the issue here, I think, is that you need to be roadmapping your project. If you're just churning and churning and producing features and changes all the time and you're not planning that, say, changes X and features Y go with v.Next, then, yeah, semver is going to be really hard. However, even if you're on the extreme end of agile you had to decide some set of features that made your product v1.0, didn't you? Why can't you do that again? You just have to decide how many features result in a major increment.

It does seem to lead you down the Java/Chrome versioning problem where the primary number just doesn't have meaning anymore (i.e., both Java and Chrome are technically still at v1.x, but they dropped that major version number because they realized they were never going to increment it).

Edit: I realize I'm describing semantic-ish versioning, but I'm of the firm belief that rules are meant to be flexible, rather than blindly followed. RFC 2119 is not a suicide pact, as it were.


If so, shouldn’t we add more data, such as a fourth integer? TFA seems to suggest removing even what little benefit the three integers bring.


The Elm programming language features "Enforced Semantic Versioning" for all packages. The compiler makes you bump the major version if you have changed the API.

https://elm-lang.org/
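To make the idea of tool-enforced versioning concrete, here is a minimal Python sketch of deriving the required bump from two snapshots of a public API. It is not Elm's actual implementation. It assumes a hypothetical representation where an API is a dict mapping exported names to signature strings, and `required_bump` is an invented name.

```python
def required_bump(old_api: dict, new_api: dict) -> str:
    """Classify the smallest allowed bump from two {name: signature} maps."""
    removed_or_changed = [
        name for name, signature in old_api.items()
        if new_api.get(name) != signature   # missing or different signature
    ]
    if removed_or_changed:
        return "major"                      # existing callers may stop compiling
    added = set(new_api) - set(old_api)
    return "minor" if added else "patch"    # additions only, or no API change


# Hypothetical API snapshots, purely for illustration.
v1 = {"parse": "String -> Result", "render": "Doc -> String"}
v2 = {"parse": "String -> Result", "render": "Doc -> String", "pretty": "Doc -> String"}
v3 = {"parse": "Bytes -> Result", "render": "Doc -> String"}

print(required_bump(v1, v2))  # minor
print(required_bump(v1, v3))  # major
print(required_bump(v1, v1))  # patch
```

A check like this only sees signatures. A function that keeps its type but changes what it does would still be classified as a patch.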


That seems like false security. Even if the API is exactly the same, changed behavior can easily break things.


Major: you will have to change your software.

Minor: there are API changes which you might want to use.

Micro: the API hasn't changed.

None of which means you can skip testing when upgrading.
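The three rules above can be sketched as a small Python helper that tells a consumer which of them applies to a candidate upgrade. This assumes plain three-part numeric versions, and `upgrade_kind` and `ADVICE` are illustrative names, not part of any real tool.

```python
def upgrade_kind(current: str, candidate: str) -> str:
    """Name the left-most version component that differs."""
    cur = tuple(int(part) for part in current.split("."))
    new = tuple(int(part) for part in candidate.split("."))
    if new[0] != cur[0]:
        return "major"
    if new[1] != cur[1]:
        return "minor"
    if new[2] != cur[2]:
        return "micro"
    return "same"


# What each kind is supposed to tell the consumer, per the list above.
ADVICE = {
    "major": "you will have to change your software",
    "minor": "there are API changes which you might want to use",
    "micro": "the API hasn't changed",
    "same": "nothing to do",
}

print(upgrade_kind("1.4.2", "2.0.0"))  # major
print(ADVICE[upgrade_kind("1.4.2", "1.5.0")])
```

Whatever it returns, the test suite still has to run.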


I use it more as a helpful guide than as gospel. Compared to arbitrary versioning systems where you have no idea whether something will break or not, you have a bit more insight as to the nature of the upgrade. You still need to run tests and qualify for regressions, but overall I've run into maybe 3 minor/patch regressions over hundreds of upgrades, and none of them have made it to production.


In practical use of semantic versioning my biggest problem isn't with the concept itself, but with my (personal) human perception of version numbers as seen throughout other software and projects on a day to day basis. A major version bump usually indicates a lot of changes, new features and most of the time also breaking changes. The major versions kind of indicate important milestones and not only "if you update, some stuff is incompatible"! So when I have to make a small, but useful or necessary, breaking change I'm reluctant to either bump the major version or deploy the change at all until there is enough reason to bump the major. I don't want to have a small, early stage library, to be on major version 6 just because I made multiple small edge-case changes in the API responses. While actually not true and maybe not even seen like it by most developers, I would see a version 6 as a rather mature project.

It's a small but often occurring pain factor with semantic versioning for me and the easy way out is to just make a minor version bump on these small breaking changes.

On the other hand, I know that it's useful to be reluctant with breaking changes, especially on widely used and big projects, because it takes other people more effort to upgrade the version. But for my small projects I would personally prefer a fourth number: major.breaking.minor.patch. The breaking part is used for simple API changes, especially for different edge-case behavior, while major is reserved for big milestone upgrades.
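A rough Python sketch of how the four-part `major.breaking.minor.patch` scheme could be handled by tooling, assuming purely numeric components. `Version4`, `parse4` and `is_compatible_upgrade` are hypothetical names made up for this example.

```python
from typing import NamedTuple


class Version4(NamedTuple):
    major: int      # big milestone releases
    breaking: int   # small incompatible API changes
    minor: int      # backwards-compatible features
    patch: int      # bug fixes


def parse4(text: str) -> Version4:
    """Parse 'major.breaking.minor.patch'; raises if it is not four integers."""
    return Version4(*(int(part) for part in text.split(".")))


def is_compatible_upgrade(old: Version4, new: Version4) -> bool:
    """True when `new` is not older and neither of the first two parts moved."""
    return new >= old and (new.major, new.breaking) == (old.major, old.breaking)


print(is_compatible_upgrade(parse4("1.3.0.2"), parse4("1.3.1.0")))  # True
print(is_compatible_upgrade(parse4("1.3.0.2"), parse4("1.4.0.0")))  # False
```

For an automated updater, the first two components together play the role that the single major number plays in SemVer. The extra number only changes what a human reads into it.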


I think of this as: marketing.major.minor.patch



As far as I can tell, the author deplores that SemVer doesn't tell you the magnitude of changes as used to be the case.

Or it could be that it isn't perfect, but clearly neither is the system it replaces.

My problem is that it argues from a premise about what SemVer is/means to people. SemVer is not magical; it's just a simple contract linking the public interface with the version number.

As such it's just a different trade off from the old ("magnitude of change") model. I personally find it more useful — it tells me whether I should look up breaking changes or not.


> I personally find it more useful — it tells me whether I should look up breaking changes or not.

It doesn't though, does it? It's a contract, sure, but there's no guarantee that the producer of the software you're depending on is actually adhering to that contract. It's the same for "romantic versioning" or whatever – regardless of what the version identifier looks like, it's your responsibility as a consumer to actually determine whether the new version fits.

I think the point of the article is that trying to mechanize the meaning of the version identifier is pointless, unless it's also properly enforced. It's just an id, no better or worse than a commit hash really. Thus, you may as well design version identifiers for humans, rather than potentially robots.


The point of semantic versioning is to give the version number more meaning than just a commit hash - to give the version number semantic meaning. It prioritizes what the user should expect over what the developer feels about the new version. Yes, people may claim to follow a scheme, and then not actually follow that scheme, but that's not an argument against the scheme. A similar argument could be made to all schemes that try to provide guarantees based on their use.

I claim semantic versioning is useful to humans. It communicates to humans what their expectations should be on API changes. That "robots" can take advantage of these rules is a bonus - and a good indication it's a good system, as creating predictable rules for things that we can then automate is rather the basis of software ecosystems.


> there's no guarantee that the producer of the software you're depending on is actually adhering to that contract

That goes right back to "but neither is the system it replaces". I would rather know that an overzealous producer has guidance even if he or she doesn't end up accepting that guidance.

> Thus, you may as well design version identifiers for humans

The tone of the reaction had a lot to do with the author insisting that semantic versioning has no semantic meaning to humans in the same thread where multiple people had made the case that it does. "170.0.0" is meaningful, he just doesn't like it.


Monotonic versioning [0] seems to be a strictly better scheme; "semantic" versioning doesn't actually include meaning, just the opinions of the authors of code. A language's development environment could enforce the monotonic requirement.

[0] http://blog.appliedcompscilab.com/monotonic_versioning_manif...


That doesn't seem to allow any kind of update to old versions, most notably security patches for past versions. It also doesn't seem to allow supporting multiple concurrent branches of development (such as angular 1/2, or mobx's 4 and 5; both are set to be supported for quite some time after their version is updated).


Elm[0], Rust[1] and probably plenty of other languages have "development environments" that enforce the semantic versioning requirements.

[0]: https://elm-lang.org/ "Enforced Semantic Versioning"

[1]: https://github.com/rust-dev-tools/rust-semverver


For me, seeing a major update means, "go read the migration documentation first."

I've been playing with two ways to handle upgrading my dependencies:

1. Do them all at the start of a release cycle. Fix the bugs and breakages.

2. Update nothing. Ever. Unless I find a new feature I want or known issues that need to be fixed (security, stability, or performance).

I have yet to decide if one is better than the other. Neither has particularly hosed me... Yet.


The trick is that when your dependency tree is huge (as it is in typical ruby, and _even more so_ in typical Javascript, projects these days) -- you can't possibly have time to go review docs for everything in your dependency tree (including indirect dependencies).

The promise of semver is that you can let tooling _automatically_ update all dependencies that haven't bumped a major version. Thereby getting bug fixes, security patches (important!), and even new features, without having to manually review changelogs for hundreds/thousands of dependencies. I think that's basically its whole point, the context of automatic tooling.
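As a minimal sketch of what such tooling does for a single dependency, assuming plain three-part numeric versions and ignoring the special rules for 0.x releases: pick the highest available version that keeps the current major. `parse` and `best_compatible` are illustrative names, not any package manager's API.

```python
from typing import List, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0) so comparison is numeric, not textual."""
    return tuple(int(part) for part in version.split("."))


def best_compatible(current: str, available: List[str]) -> str:
    """Highest available version with the same major that is not a downgrade."""
    cur = parse(current)
    candidates = [
        v for v in available
        if parse(v)[0] == cur[0] and parse(v) >= cur
    ]
    # Stay on the current version when nothing newer shares its major.
    return max(candidates, key=parse, default=current)


print(best_compatible("1.2.3", ["1.2.4", "1.9.9", "1.10.0", "2.0.0"]))  # 1.10.0
print(best_compatible("3.0.0", ["2.9.0", "4.0.0"]))                     # 3.0.0
```

Note that 2.0.0 is skipped even if it is the only release carrying a given security fix, which is exactly the edge case of fixes that ship only in a major release.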

Is it perfect? No. Because some dependencies don't use semver; some maintainers make mistakes and there are 'bugs' in their release numbers; _and_ sometimes bugfixes and even security patches are available _only_ in a major release, so your automatic updates might miss them.

Is it helpful? In my experience in ruby, immensely. (It's probably no coincidence that semver was invented by a rubyist, at about the same time bundler, the tooling that can easily support this kind of updates, came into existence). You have to deal with the edge cases where it doesn't work, but it still works _a lot_, eliminating a _lot_ of manual review.

Ironically, javascript, which seems to be the ecosystem with the hugest dependency trees, only fairly recently got tooling that can conveniently do this kind of dependency update based on semver -- if you choose to use it this way. It depends on tooling, but also on developer behavior, both as maintainers and as consumers of dependencies, to have standard practices where "update everything but without any breaking changes" approaches are feasible.

I think it's indisputable that semver has made it more feasible to handle larger dependency trees in ruby. In fact, I think the most valid critique would be based on its _success_ -- the "tooling" (including semver as a "social tool") has allowed much more complex dependency trees to be feasible, at least initially, and this is a _bad_ thing, because it opens you up to much more complex, dangerous, and hard to deal with dependency management hell problems down the line, when you have so many dependencies you can't feasibly even have any idea what they all are. You couldn't even have gotten to that point without the tooling that made it seem feasible when it shouldn't have.

This could be a reasonable argument, worth considering, I'm not necessarily making it myself. We quite literally couldn't be creating the software we are creating without the growth of open source dependencies, it would be too expensive.


I agree with the author that the changelog may be more important for humans. So instead of having machine readable version numbers, maybe we should focus on having machine readable changelogs where items belong to a predefined taxonomy? Aggregating changelogs from multiple packages would make it easier to see the potential impact on your solution.
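A small Python sketch of what a machine-readable changelog with a fixed taxonomy might look like, and how entries from several packages could be aggregated. Everything here is hypothetical: the `KINDS` taxonomy, the `Entry` fields and the sample package names are invented for illustration and do not follow any existing standard.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List

# Hypothetical taxonomy, ordered from most to least urgent to review.
KINDS = ("breaking", "security", "feature", "fix")


@dataclass(frozen=True)
class Entry:
    package: str
    version: str
    kind: str
    summary: str


def aggregate(entries: Iterable[Entry]) -> Dict[str, List[Entry]]:
    """Group changelog entries from many packages by kind, in taxonomy order."""
    grouped = defaultdict(list)
    for entry in entries:
        if entry.kind not in KINDS:
            raise ValueError(f"unknown kind: {entry.kind}")
        grouped[entry.kind].append(entry)
    return {kind: grouped[kind] for kind in KINDS if grouped[kind]}


# Made-up entries for two made-up packages.
pending = [
    Entry("libfoo", "2.0.0", "breaking", "remove deprecated connect()"),
    Entry("libbar", "1.4.1", "fix", "handle empty input"),
    Entry("libbar", "1.4.1", "security", "validate header length"),
]

for kind, items in aggregate(pending).items():
    print(kind, [f"{e.package} {e.version}" for e in items])
```

The version number then becomes one field among several rather than the only channel for compatibility information.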


I think this is an idea that's in the air: I pitched something similar where I work. A file which tells you information about CVEs patched, bugs fixed, features added. A list of teams affirming it passed their testing. Estimates of installation time. Opinions of necessity, urgency and difficulty.

This can, theoretically, be encoded in semver. But why go through that? I can easily publish information at a known endpoint, or bundle it with the package. Lossily compressing information because of the limits of software distribution circa 1988 is silly.


Good Lord. I'm going to paraphrase, since it's late and the gist doesn't really lay out a rational set of arguments.

"Not all projects use it!" Well, no, getting universal adoption of an idea is hard? That's not SemVer's job or fault.

"I don't have a well defined API, so what is a breaking change?" SemVer is going to be the least of your users' concerns; the ill-defined API is going to hurt more when inevitable breakage leads to squabbles as to what is or isn't part of the API.

"SemVer won't save you from mistakes" Well… no, of course not. That's not the point. Mistakes happen. But it gives me, a developer, an idea of what to expect from an upgrade. In basically every other versioning scheme, I have to assume that any and every upgrade might include intentional breaking changes. With SemVer, I have an expectation: this shouldn't break things. (But that's no substitute for testing, of course.)

"You should write a changelog instead!" You should just write a changelog too. If you're doing SemVer well, I might never need it, but if something accidentally breaks I'll want it. Without SemVer, I don't have a choice: I need that changelog to try to assess what could have been communicated in the version number had you used SemVer: is this a breaking change?

SemVer adds meaning to a version number; that meaning can be used to derive useful information, either as a machine or a human: that this change will break things.

Ideally, when a maintainer chooses to cause breakage, it can be small, minimal, and with a clearly communicated upgrade path (that ideally can be automated with tooling). The major bump tells me I might need to adjust my side / consumption of this dependency.

Since the article suggests nothing else in place of SemVer… most other schemes are date based (not relevant information to encode?) or just bump random numbers whenever they feel like it (Linux, I'm looking at you).


So much this. SemVer isn't perfect but it's a lot better than nothing.

In addition, generally speaking folks are pretty good at sticking to it. I've analysed the numbers from all the updates Dependabot makes to build SemVer compliance scores for packages (https://dependabot.com/compatibility-score/). Running across all packages for the Ruby ecosystem you get:

- Patch releases generally pass CI for 97% of their users

- Minor releases generally pass CI for 95% of their users

- Major releases generally pass CI for 85% of their users

It's not a perfect system, and library maintainers should definitely keep a changelog too (https://keepachangelog.com), but it's a lot better than nothing, or any of the alternatives.


Dependabot is great, and I love these statistics too :)

Do the above stats for patch and minor releases ignore 0.* releases?


They do, yep - only post 1.x releases are included.


Thanks :)


> SemVer adds meaning to a version number; that meaning can be used to derive useful information, either as a machine or a human: that this change will break things.

But it does not work like that in practice: people seem to be very fearful of being the guy who introduced a breaking change in a minor increment, so they increment the major whenever they are not sufficiently confident that there can't be a corner case where an API consumer would break. Since that corner case is almost universally possible (e.g. your new API extension's names might collide with names an API consumer might have foolishly declared under your namespace) the mindset of "when in doubt, increment major" dominates. The practical outcome of semver is that the important distinction between "chances are your app will be fine after the upgrade, no promises though" and "you might actually have to rearchitect a bit" has been completely lost.


> people seem to be very fearful of being the guy who introduced a breaking change in a minor increment, so they increment the major whenever they are not sufficiently confident that there can't be a corner case where an API consumer would break

Again, see the point about not having a well-defined API. (Again, this is taking issue with SemVer when you're not actually practicing it. But let's say I bend the rules a bit: what you suggest is fine, really. If you inadvertently bump a major and describe what the change is s.t. I can account for whether it breaks me, that's ok. As good as what SemVer wanted, no, but still better than just arbitrary versioning. There's likely still some subset of changes that you can reasonably assume are non-breaking. A commit that does linting only? Not going to break your API.)

> Since that corner case is almost universally possible (e.g. your new API extension's names might collide with names an API consumer might have foolishly declared under your namespace

In the few languages where this is possible, it is almost always implicitly not part of the API. E.g., in Python… if you're monkey-patching members/attributes inside another package and that causes breakage, SemVer is not your problem here. You're going to have issues under any versioning scheme.


I think it's worked out pretty well in ruby. Which I believe is the ecosystem the semver inventor was working in when he invented it, so maybe that's why it's a good fit.

While certainly not perfect, I think semver has greatly eased dependency management in ruby ecosystem.

Certainly more sophisticated forms of machine-readable change information would be useful, if feasible to implement and maintain at a justifiable cost for the benefit provided. But that's no reason not to use a simpler thing that helps.

Unless it is making dependency management _harder_ in some ways in some ecosystems. (Or perhaps making it no easier or harder, but at significant expense, or the cost of confusing devs as to what's going on). That would be an interesting argument. This post didn't make it or provide any such examples from actual experience or with concrete context.

What matters is if it helps or hurts or does nothing in actual practice, which is tricky because it's hard to know that in the hypothetical; you need a dependency ecosystem where large numbers of maintainers are trying to use semver to know, instead of just trying to predict, how it's working out. But we have several of those now. What ultimately matters is if it's helping or not. I think it is helping significantly.


The SemVer summary states:

Major: for incompatible API changes

Minor: for backwards-compatible new functionality

Patch: for bug fixes
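Read from the maintainer's side, the summary amounts to a small function: given the current version and the kind of change, compute the next version, resetting the lower components to zero. A minimal Python sketch, assuming plain three-part numeric versions; the change labels and the name `bump` are made up for this example.

```python
def bump(version: str, change: str) -> str:
    """Return the next version for a change of the given kind."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "incompatible":          # incompatible API change
        return f"{major + 1}.0.0"
    if change == "feature":               # backwards-compatible functionality
        return f"{major}.{minor + 1}.0"
    if change == "fix":                   # bug fix
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change kind: {change}")


print(bump("1.4.2", "fix"))           # 1.4.3
print(bump("1.4.2", "feature"))       # 1.5.0
print(bump("1.4.2", "incompatible"))  # 2.0.0
```

Nothing in the function measures how large a change is, only what kind it is.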

A few sparse thoughts:

The definition seems to imply that version 1.1.0 can have twice the features of v.4.0.0, which in turn might have exactly the same features as version 1.0.0, just backwards incompatible.

APIs that strive for backward compatibility never get a "major" release? Has Windows ever had a "major" release?

If anybody was relying on a bug to be present, does that qualify the patch as major? And how do you know it in advance?

I can understand the usefulness of SemVer; what I don't understand is why it should replace instead of integrate the usual release numbers. Just because they both are described by the same English word?


> The definition seems to imply that version 1.1.0 can have twice the features of v.4.0.0, which in turn might have exactly the same features as version 1.0.0, just backwards incompatible.

That's possible. Feature regressions sometimes happen when software is redesigned, or obsolete features may have been deprecated and removed. Either case would be a breaking change leading to a major version bump.

> APIs that strive for backward compatibility never get a "major" release?

That's right. Why would you announce that you're introducing breaking API changes when you're actually maintaining backward compatibility? That just makes headaches for your users, and for the sake of what? Marketing?

> If anybody was relying on a bug to be present, does that qualify the patch as major?

If fixing the bug changed any official, public, documented APIs in a breaking way, then yes—or if the documentation was inadequate such that the implementation essentially was the official APIs. If it only affects users who were not following the APIs, that's more of a grey area. I'd say that's their problem, not the library authors'; if you use a library in interesting and novel ways, contrary to the directions, and it breaks as a result, you get to keep both pieces. That isn't a bug in the library and doesn't reflect on the semantic versioning.

> And how do you know it in advance?

Sometimes you don't. In that case the unexpected API breakage is a bug just like any other, and you'd need to retract the minor/patch release and either bump the major version or fix the break in the API.


I honestly think a lot of complaints about SemVer could be fixed by just tacking a vestigial "1." at the front of the number.


The Java solution.


Wahhh SemVer isn’t perfect, thus it’s useless, I’m so much cleverer than everyone...

Then a junior dev comes along at work and parrots this suggesting we just use the git sha, sweet bejeeesus!



