Why Semantic Versioning Isn't (gist.github.com)
60 points by jashkenas on Aug 29, 2014 | 35 comments

There are a lot of things you can say about Semantic Versioning, but it's hard to say that it isn't semantic. What is it, then? Syntactic? Temporal?

The fact of the matter is that Semantic Versioning does capture something about the behavior of the release relative to others. It helps you answer the question, given the same inputs (where input can be defined as a sequence of API calls and their arguments), will these two versions produce the same results? That is inherently a question of the semantics of the libraries. In fact it's very similar to the question that the Liskov substitution principle[0] asks of objects and their classes, but using software libraries as the reusable, swappable component.

People make mistakes with Semantic Versioning. Sometimes, they just flat out don't care. The fallibility of humanity is no excuse to throw out a perfectly good concept that can be very useful even when executed improperly, given that no other system is at hand to assist with the complex task of building large software projects out of a bunch of smaller parts.

Properly understanding what the "semantic" in Semantic Versioning means can go a long way. For example, if semantics is what the version number is looking to capture, perhaps a good place to turn when deciding how to bump a version is tests. Have they changed? How so? If you're doing property-based testing, answering this question becomes even easier.

[0]: http://en.wikipedia.org/wiki/Liskov_substitution_principle
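
One way to make the "same inputs, same results" question concrete is to replay a set of inputs through two releases and compare the outputs. The sketch below is illustrative only: `slugify_v1`, `slugify_v2`, and `differing_inputs` are hypothetical stand-ins, not part of any real library.

```python
from typing import Callable, Iterable, List


def slugify_v1(text: str) -> str:
    """Hypothetical 1.x behavior: lowercase, join words with hyphens."""
    return "-".join(text.lower().split())


def slugify_v2(text: str) -> str:
    """Hypothetical 2.x behavior: lowercase, join words with underscores."""
    return "_".join(text.lower().split())


def differing_inputs(
    old: Callable[[str], str],
    new: Callable[[str], str],
    inputs: Iterable[str],
) -> List[str]:
    """Return the inputs for which the two versions disagree."""
    return [value for value in inputs if old(value) != new(value)]


samples = ["Hello World", "single", "  padded  text "]
changed = differing_inputs(slugify_v1, slugify_v2, samples)
```

If `changed` is non-empty for inputs that exercise documented behavior, that is a hint the release is not a drop-in replacement and probably deserves a major bump. A property-based testing tool would generate the inputs instead of using a fixed list.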

Well, at some point the argument regarding the "semantic" nature of the build numbers is just silly. Especially in today's world where it is very possible that you could do things in a version 1.x.x of a product that you cannot do in a version 2+.x.x. Anyone that just takes in any x.x.y update is asking for trouble if they don't have good testing in place. How does the saying go, "one developer's bug is some user's feature?" That is, if the only real meaning I can get is that "these two aren't compatible," I don't care what friggin scheme you used.

I confess that I am particularly annoyed with semantic versioning romance where it is deemed capable of solving all dependency issues internally to a company. To that point, this piece really resonated with me.

But, at the end of the day, I don't particularly care that there is "some breaking change" between two versions much more than I care that there were additional methods added. In the end, I want to know exactly what changed between things. As such, a changelog and a non-automated update system that is audited and purposely used is much preferable to the promise of semantic versioning.

Though, I do have to concede that the promise of semantic versioning is better than the reality of never updating that happens for many folks.

1. That some things are possible in 1.x.x that aren't possible in 2+.x.x is a fact that nobody would deny, and is consistent with semantic versioning. The point is to clarify the vague terms of "compatibility" and "breaking" by tying them to the semantics of the library, as described above. Ultimately, this is a determination made by the developer. Any responsible operator is going to test before deploying, and that's how it should be regardless of versioning scheme.

2. In combination with automated testing, it can go a long way.

3. The first part doesn't make any sense to me. As to the second part, semantic versioning and a changelog are not mutually exclusive. In fact, the practice of maintaining semantic versions typically leads to much higher quality changelogs.

4. Yes, the question is: If not semantic versioning, then what?

For me, the answer would be conscious upgrades of dependencies with a thoughtful view of the changelog.

Of course, I think I would also prefer less of a web of dependencies between so many different things now. It is almost comical. A few years ago, the complaint was that something like Java's JButton would have 160+ methods from the web of interfaces and superclasses it came from. It seems that today, the "you can't possibly comprehend it" web lies more in the dependencies of a product than in its individual classes.

And this only gets worse as some dependencies grow to encompass more and more.

I suspect you can be quite rigorous about the definition of a "breaking change." Removing methods from a public API or changing their interfaces (including adding or deleting required fields from payloads, or removing/re-typing any existing fields from responses) is just a quick stab at it.
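
As a rough sketch of that "quick stab", the rule can be written down as a comparison of two API surfaces. Everything here is an assumption made for illustration: the `Api` model (function name mapped to its required parameter names) and `classify_change` are hypothetical, and the check only sees signatures, not behavior.

```python
from typing import Dict, FrozenSet

# Hypothetical model of a public API: function name -> required parameter names.
Api = Dict[str, FrozenSet[str]]


def classify_change(old: Api, new: Api) -> str:
    """Return 'major', 'minor', or 'patch' for the change from old to new."""
    for name, required in old.items():
        if name not in new:
            return "major"  # a public function was removed
        if new[name] != required:
            return "major"  # required parameters were added or removed
    if set(new) - set(old):
        return "minor"  # only additions to the public API
    return "patch"  # the public surface is unchanged


v1: Api = {"get": frozenset({"id"}), "list": frozenset()}
v2: Api = {**v1, "create": frozenset({"name"})}
v3: Api = {"get": frozenset({"id", "tenant"}), "list": frozenset()}
```

Here `classify_change(v1, v2)` reports a minor change and `classify_change(v1, v3)` a major one. A change in what a function returns for the same arguments would pass unnoticed, so a check like this can only be a first filter.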

It seems like, rather than attempting to document some of the complexity of API changes/versioning at an admittedly coarse grain, your argument is that we should stick with version "truthiness." I don't think I can get behind that.

> I suspect you can be quite rigorous about the definition of a "breaking change."

You'd be surprised. Here are two real examples.

1) Project X 1.5.2 inadvertently returns `undefined` in some edge cases when it is documented to return (and previously returned) `null`. Due to the way people use the API, it's rarely encountered (hey they're both falsy) and nobody even bothers to report the discrepancy between docs and behavior until 1.7.2 is the current version. Unfortunately, some large projects that are very common use the actual behavior of 1.6.0 and are checking for a return `=== undefined`. Are you justified in breaking them?

2) Project X 1.8.0 does a big refactoring that eliminates tons of bugs and improves performance, and even passes extensive unit tests with flying colors. Unfortunately there are lots of "Ten things I learned by reading the source for Project X" blogs that describe non-public API behavior. And again, the projects using these behaviors are popular projects that many other people depend on. How do you deal with these breaking changes of undocumented functionality?

Bugs and non-public API shouldn't be covered. If you were supposed to do X but you did Y, it's OK to fix that and break clients who rely on Y without bumping your major version. It's nice to keep stuff working when you can do so reasonably, but not a requirement.

These questions are tricky when you're building a popular OS with a lot of third-party software whose users will blame you if they break on a new release, even if the third-party software maker is actually to blame. They're a lot less tricky in a case like this, where you're providing a library for other programmers to consume.

If it's not documented it's not part of the "public API".

Software using Semantic Versioning MUST declare a public API. This API could be declared in the code itself or exist strictly in documentation. However it is done, it should be precise and comprehensive.

These can both be considered fixes. Undocumented functionality is not part of a public API.

So I know this is an idealistic point of view, but:

1. It's documented to return `null`. The fact that it returns `undefined` in some edge cases is a bug. If you want to be nice about it, you can approach them and explain the upcoming behavior change as a heads up, but it's still a bug that should be fixed.

2. No sympathy for those projects here; you can't use undocumented/private interfaces and expect them to be officially supported. I do it myself, but I do it with full knowledge that every single version change is potentially breaking to me, and I make sure to have unit tests to confirm the behavior still exists in X version. I may also approach the project and say something like "Hey, I found using X undocumented interface is actually really useful, how can we expose this as an official thing?"

It isn't just an idealistic view; it is somewhat counter to the view where source is the ultimate documentation.

This is especially true in projects that don't have the resources to maintain a well done documentation site with the source.

It is also completely counter to the attitude of the most successful open source maintainer, by many measures.[1]

[1] https://bugzilla.redhat.com/show_bug.cgi?id=638477#c129

You would be better served by having a roadmap of features that are planned to go with (or came with) major versions. Following that, don't bloody make heavy changes that can be avoided.

And realize that many folks don't consider the version number of a product so much as its name.

Semantic versioning is useful for specific interfaces.

Not long ago I implemented a scheme almost exactly like it, but for a specific software component. The client is compiled against a header file for that component and pulls in, statically, the major and minor numbers. I've implemented this sort of thing more than once in the past, too.

At run time, the client is linked dynamically to the component, and so it can be calling a newer version than the one against which it was compiled. The component supports a "checkversion" function in which the client reports the numbers with which it was compiled. The function returns the current numbers to the client, along with an indication of whether there is compatibility (which the client can ignore). If necessary, the component uses the client's info to implement any backward compatibility hacks needed for that client. The client can also use the returned numbers in some way, if necessary.

These numbers for that component have nothing to do with any official version number for the project, or the package which contains that component. They are, exactly as this article says for the "robot", not for humans.

It definitely is "semantic" versioning: the numbers have rigorously defined semantics, which are evaluated and used to make a concrete decision based on rules.
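
The handshake described above could look roughly like the following. This is a Python sketch of a scheme the comment describes for a compiled component, so the names (`check_version`, `VersionReply`), the example numbers, and especially the compatibility rule are assumptions; the comment does not say which rule the real component applies.

```python
from typing import NamedTuple


class VersionReply(NamedTuple):
    major: int        # component's current major number
    minor: int        # component's current minor number
    compatible: bool  # advisory flag; the client may ignore it


# Hypothetical numbers for the component as currently built.
COMPONENT_MAJOR = 3
COMPONENT_MINOR = 4


def check_version(client_major: int, client_minor: int) -> VersionReply:
    """Report the component's numbers and whether the client can use it.

    Assumed rule: same major number, and the component's minor number is
    at least the one the client was compiled against.
    """
    compatible = (
        client_major == COMPONENT_MAJOR and client_minor <= COMPONENT_MINOR
    )
    return VersionReply(COMPONENT_MAJOR, COMPONENT_MINOR, compatible)


reply = check_version(3, 2)  # client built against an older minor version
```

A real component could also remember the client's numbers at this point and use them to switch on backward-compatibility behavior for that client.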

Um, yeah, semantic versioning is for machines. That's the point. You're primarily writing software to be consumed by other software, not humans.

If you're actually shipping a product, then go ahead and have your human version number. No one needs to know the semantic version.

Just don't try to put your round humanized version peg into the square semantic version hole.

When underscore does get a major change that the author feels is actually worth a major number signal, then it's quite possible that we'd be better served by a name change: 'super underscore' or something in the same vein.

When such major changes happen, there is usually a large contingent of users that are disappointed because the old version will probably stop getting enhancements and may even stop getting bug fixes.

A name change is also a very strong signal that the API changes are significant. Wider adoption of SemVer has ensured that major number changes are seen as less significant than they used to be, especially as the number gets larger.

After reading "Semver Has Failed Us" [1] and this current debacle (among others) I feel like something like "ferver" [2] is more practical. The funny thing is that ferver is what a lot of libs are de facto, even if they are trying to rigidly adhere to semver. `save-prefix = ~` has been in my ~/.npmrc ever since the default became `^`.

[1] http://www.jongleberry.com/semver-has-failed-us.html

[2] https://github.com/jonathanong/ferver
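
For readers unfamiliar with the two prefixes, here is a simplified sketch of how `~` and `^` ranges differ. It handles only full `x.y.z` versions with no prerelease tags, and it is not the node-semver implementation; `parse`, `tilde_upper`, `caret_upper`, and `satisfies` are names made up for this example.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Parse a plain 'x.y.z' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def tilde_upper(base: Version) -> Version:
    """~x.y.z allows patch-level updates only: upper bound x.(y+1).0."""
    major, minor, _ = base
    return (major, minor + 1, 0)


def caret_upper(base: Version) -> Version:
    """^x.y.z allows updates that keep the left-most non-zero number."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def satisfies(version: str, range_spec: str) -> bool:
    """Check a version against a '~x.y.z' or '^x.y.z' range."""
    prefix, base = range_spec[0], parse(range_spec[1:])
    if prefix not in "~^":
        raise ValueError(f"unsupported range prefix: {prefix!r}")
    upper = tilde_upper(base) if prefix == "~" else caret_upper(base)
    return base <= parse(version) < upper
```

With a dependency saved at 1.2.3, a 1.3.0 release is picked up under `^1.2.3` but not under `~1.2.3`, which is why the tilde prefix limits exposure to libraries that ship breaking changes in minor releases.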

Another view:

Of all the recent ideas of how version numbers should work, SemVer has this advantage: I know how to interpret version numbers.

But when FF 31 becomes FF 32 and Chrome 37 becomes Chrome 38, what will that mean? (I honestly don't know. Can anyone explain?)

It means 6 weeks have passed since the last Firefox release, and they are releasing a snapshot of whatever features they were able to make stable in that time. Also, it will not be an extended support release, because 32 % 7 != 3 (every 7th release is an ESR starting at version 10).
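
The rule in that comment is small enough to write down. The sketch below encodes only the commenter's description (every 7th release is an ESR, starting at version 10); the function name is made up, and nothing here claims the pattern holds for releases beyond the ones discussed in this thread.

```python
def is_esr_by_comment_rule(version: int) -> bool:
    """True if `version` is an ESR under the rule stated in the comment."""
    # 10 % 7 == 3, so every 7th release counted from 10 leaves remainder 3.
    return version >= 10 and version % 7 == 3


esr_versions = [v for v in range(1, 33) if is_esr_by_comment_rule(v)]
```

Under this rule version 31 is an ESR and version 32 is not, matching the comment.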

Well, now at last I know. Thanks!

Semantic versioning means less for a product like Firefox or Chrome than for a library like Underscore that thousands of other pieces of software rely on.

The difference here is that a library usually has some notion of a well-defined interface/API, so it is possible to track and communicate when major (backward-compatibility break), minor (improvement, extension), and bug-fix level changes happen. For products like Chrome, too much internal engineering effort would be required, for too small a payout, to define and maintain what the "interface" of the application is and to track when and what kind of changes happened to it.

Browsers expose a large number of well-defined APIs to web pages! But breaking backwards compatibility for those is very rarely acceptable.

SemVer has some objective meaning. How much meaning, how accurate it is, how objective, and how useful it is can be debated. The point is that it has some kind of objective interpretation at all.

This is in contrast to almost every other versioning system which basically boils down to "and how does this change make you feel?"

Browsers aren't a great example, as things like that really need to be A: up to date and B: unable to break, i.e. no web browser is probably going to be able to remove the <FONT> HTML tag even though it should have been out of use for what... a decade already?

So all you need to see is whether there is a newer version than your current one. v3.4.31 or v3.6.2 doesn't really tell me anything.

If I really care I have to read the patch notes either way.

>no web browser is probably going to be able to remove the <FONT> HTML tag even though it should be out of use by what... a decade already?

You'd be surprised. Even thousands of webpages written TOMORROW will use the FONT tag.

Version numbers attempt to compress an awful lot of information into one totally ordered number. I'm really looking forward to seeing the approach in https://thestrangeloop.com/sessions/towards-annex-a-fact-bas...

I think the only issue is that the middle digit needs to become a 'breaking change', the last digit anything which isn't, and the first digit should be left to the maintainer to increment as he/she sees fit. It seems like that's how most people care to use it, and it also bypasses the '0.x.x is special' issue.

We use this format http://datever.org/ in our shop.

It's easy to point out problems with semantic versioning; I'd rather read about solutions. Semantic versioning is not harmful. What's harmful is its misuse; don't rely on all projects to apply it the same way.

With or without semver, I'm going to lock my dependencies and test thoroughly when I do bump them. But semver sure is handy to help me comprehend the degree of changes in project dependencies. It's also great for finding the sweet spots among common dependencies.

I'm all ears for something better, but semantic versioning does not need to be condemned; we simply need to use it correctly.

It's a false promise... but also a worse-is-better phenomenon. Good enough, if you ignore the problem cases, to allow rapid progress. It tempts you, and delivers convenience often enough that you come to rely on it, only to discover later the more subtle problems it was hiding. It's the worst system, except for all the others.

I've always thought of semver as documenting common sense: fix bugs often, add features when possible, and don't break the public API unless you have to.

If all projects strictly followed semver, we wouldn't have this debate. But it's not semver's fault that some developers ignore common sense.

Simple: add an optional human-significant "ultra" version (better name suggestions welcome) to indicate a philosophical version upgrade consistent with how major versions have historically been used (as marketing communications).

E.g., or 1.7.0 are equivalent.

Is there not some kind of clever code or progress score to be derived from the churn of a project's test coverage? Might be a cool way to encourage rolling source control and writing tests into a core way of judging quality.

Semver has had a major success in the OSGi community, where adoption seems like 100 percent. Part of the reason is the fantastic build tool bnd, which can analyze your code and tell you if it has introduced a breaking change. Personally, I don't think such a tool is even worth pursuing in languages like JS, since redefinition and metaprogramming are common practice. The failure, for me, is the language and the community, not the idea.

I really wish OSGi wasn't in that ubiquitous space where it gets nothing but 5 and 1 star reviews. Worse, each end of the spectrum seems to have an agenda.

Seriously, it sounds like a nice idea and all. But it also sounds like an increase in work. For many shops, just getting out what they are currently committed to is tough. Increasing the commitment level seems counter to getting things done.
