I disagree with much of this post.
> Go makes updating major versions so cumbersome that in the majority of cases, we have opted to just increment minor versions when we should increment major versions.
I disagree with this, wholeheartedly. I think Go's module system making modules separate packages on major versions is actually great once you know how it works. The problem is simply that it's non-obvious and not well understood. I think they could have made it more obvious by requiring it on v1, for starters.
Beyond that you should absolutely be tagging major versions for every breaking change, and it should be something a developer has to stop and think about. Not something to be taken lightly, EVEN with regard to internal libraries.
You can learn how to code things preemptively so they are flexible for additive changes and require fewer breaking changes. It's almost always worth the effort in the long run.
To my post's point, the problem is just that people don't know how it works, and its design makes it something you learn late in a project, when you're tagging v2, rather than something you hit early.
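For anyone who hasn't hit it yet, here is a minimal sketch of the import-path rule being argued about: v0 and v1 live at the bare module path, and v2 onward get a /vN suffix. The module path example.com/widget and the helper ImportPath are made up for illustration, and the sketch ignores the gopkg.in style of suffix.

```go
package main

import (
	"fmt"
	"strconv"
)

// ImportPath returns the path a consumer would import for a given major
// version: v0 and v1 share the bare path, v2 and later append "/vN".
// ImportPath is a hypothetical helper, not part of the go tool.
func ImportPath(base string, major int) string {
	if major < 2 {
		return base
	}
	return base + "/v" + strconv.Itoa(major)
}

func main() {
	// example.com/widget is a placeholder module path.
	for _, major := range []int{0, 1, 2, 3} {
		fmt.Printf("v%d -> %s\n", major, ImportPath("example.com/widget", major))
	}
}
```

The point of contention is visible in the output: nothing changes in the path until v2, which is why most people only learn the rule when they tag their first breaking release.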
FWIW, my post for anyone who didn't catch it:
The problem is a mismatch between the Go team's perception of semver usage (or perhaps how they think it should be used) and its real-world usage. Some examples:
Major versions are used to communicate more than breaking changes.
Project identity doesn't change if a project changes, it is not a new package.
Hardly any projects use strict semver in the real world, and other package managers allow a more flexible, looser semver.
Just what is a breaking change anyway? Real-world packages make small security changes which break compatibility, sometimes minor changes do too, and that's ok if breakage is very limited in scope.
In the current go modules, new major versions are not easily discoverable or announced, and new importers will need to consult docs or know about go get @latest to find the latest version. This could be fixed with tooling if the problem is recognised. There are possible solutions if they want to keep /v2 paths, but we shouldn't pretend there are no problems caused by this change. It makes it easier to produce breaking changes and harder to version increment just for new features (as most projects from linux to rails do).
Personally I think it condones breaking changes and will lead to more breaking changes, not fewer - if you're forced to change import paths at higher versions why not just delete that api you don't like? Within Google say there are significant downsides to this and pressure to fix your own breakage, outside, not so much, including in packages people are required to use, like api client packages for popular services.
Significant breaking changes introduce real pain for consumers and in an open ecosystem should be avoided at all costs.
If they used go mod and didn't see it as a breaking change you'd be in the same situation. I think many people would debate whether this is breaking or not or deserves a major version - I think most people would say no if that was the only change.
If they used go mod and did see this as a breaking change, they'd increment the major version and you'd never notice. The unfortunate consequence is most of their consumers would continue happily using a version 1.3 multiple major versions behind and missing critical security fixes (most projects backport fixes for say two major versions) - and they'd probably end up with users on a range of major versions from 1.x to 21.x if this is your definition of a significant breaking change.
In theory under strong semver every single possible breaking change would be a new major version and every bit of software more than a few years old would be on version 945.5.1 - in practice, in the real world we never see that sort of versioning, and people use major versions for something very different (for significant changes in api surface (whether breaking or not), and sometimes for significant breaking changes). It's a signal rather than a hard objective rule.
None of this is insurmountable of course, but it does need to be attended to. At present go mod seems to assume strong semver and the result would be IMO a proliferation of breaking changes and outdated software being used (as opposed to current Go which almost forces everyone to be on HEAD and to try to avoid breaking others).
I don't argue for strict semver, you can never be sure if a supposedly minor change won't actually break things, sure, but some changes are guaranteed to break things. Why not at least mark them with the version bump?
I'm pretty sure a huge number of Go projects no longer work with earlier Go versions, it's just a question of how far back you go. If you're back at Go 1.10 I'd recommend freezing your dependencies and putting them in your own repo, it's the only way to be sure. Otherwise this is likely to happen to you again as people aren't great about supporting all versions of Go 1.x - it's quite a support burden.
I would note that even the Go language itself has dropped architectures and support for bootstrapping with older Go versions without bumping the major version. I think that's fine. Of course that does break strict semver (oops), but nobody cares because nobody actually expects strict semver in real software. What's important is the impact - it's sometimes ok to remove features if nobody is using them and I've never seen a project use a major version for that. I've also never seen a project bump a major version for a breaking bug fix.
What would be ideal is if we had a means of communicating more information about releases of code and their relationship than semver provides. I.e. I can publish a repo and add some kind of "language-support.json" document that specifies I support the 3 latest major releases of Go, and have the package manager figure out whether my version of Go is supported. Other ideas for metadata would be the ability to add labels to releases, and have the option to filter/prioritize upgrades based on those.
I would love if package managers supported labels as part of the metadata, and I could get a summary of all the labels between my current version and the new version. So on an upgrade, I could get a label diff for the package versions like "security:cve-1234 feature:oauth2-support bugfix:stale-kafka-messages". Those are cherry picked things that make nice labels, not everything makes sense in those messages. But sometimes it feels like we just do global updates and make sure everything builds, just to keep the tech debt low. I have no idea what actually changed in the package, and as long as it builds and tests pass I don't have to know. That's because we have so many dependencies, and it would take ages to read the release notes for all of the versions of all of the dependencies between our current version and the new one. Labels provide a means to communicate a succinct version of what changed; succinct enough for someone to read while they're waiting on tests to run.
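A rough sketch of what that label diff could look like, assuming a package index that exposes an ordered release list with labels attached. No such metadata exists in Go modules today; the Release type, the LabelDiff function and the version numbers are all invented for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Release is a hypothetical index entry: a version plus free-form labels.
type Release struct {
	Version string
	Labels  []string
}

// LabelDiff collects the labels of every release after `from`, up to and
// including `to`. The releases slice is assumed to be ordered oldest first.
// The result is de-duplicated and sorted.
func LabelDiff(releases []Release, from, to string) []string {
	seen := map[string]bool{}
	collecting := false
	for _, r := range releases {
		if collecting {
			for _, l := range r.Labels {
				seen[l] = true
			}
		}
		if r.Version == from {
			collecting = true
		}
		if r.Version == to {
			break
		}
	}
	out := make([]string, 0, len(seen))
	for l := range seen {
		out = append(out, l)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Made-up release history for a made-up dependency.
	history := []Release{
		{Version: "v1.3.0"},
		{Version: "v1.4.0", Labels: []string{"bugfix:stale-kafka-messages"}},
		{Version: "v1.5.0", Labels: []string{"security:cve-1234", "feature:oauth2-support"}},
	}
	fmt.Println(LabelDiff(history, "v1.3.0", "v1.5.0"))
}
```

Something this small would be short enough to read while the tests run, which is the whole appeal.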
Go the language is really best in class when it comes to keeping backwards compatibility.
And at this point, I just stopped digging down that rabbit hole and instead added
TESTIFY := github.com/stretchr/testify
mkdir -p "$(GOPATH)/src/$(TESTIFY)" ; git clone --depth 1 --branch v1.3.0 https://$(TESTIFY).git "$(GOPATH)/src/$(TESTIFY)"
Isn't one of the reasons we use semver in the first place so that after you make some small change to the application you're working on, you don't suddenly find yourself having to update 2/3 of your environment just to compile it?
I think strict semver is reasonably possible for things that are libraries rather than products in their own right.
But you're right, people want major versions to indicate something significant and new (and occasionally major versions have a contractual significance as well - requiring customers to pay an upgrade fee), rather than small breaking changes. It's even sometimes possible to create a new library that has huge changes in the intended way you use it, its conceptual model and masses of new functionality, but it maintains backwards compatibility - it's perfectly understandable that the package owner wants to indicate this with a major version bump.
The other issue I've come across is that what constitutes a breaking change is much more subjective than many people realise. Any change is a breaking change if someone is reading in your library and tweaking bytes in specific locations. Of course, for most libraries they shouldn't be doing that, but that means that if someone is using your library wrong, then you don't worry about breaking them. But the question of if someone is using your library wrong is pretty subjective. At an extreme, that could be considered to be relying on any behaviour not explicitly documented as intended. Ultimately, it comes down to the judgement of the package maintainer, and that doesn't always match up with the judgement of the user.
Having version numbers that mean something to machines is very useful when it works though. Perhaps we should just separate those from human-targeted versions rather than go full sentimental versioning: http://sentimentalversioning.org/ Maybe something like <human version>.<machine readable api version>.<unique incrementing build number> could work.
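To make the three-field idea concrete, here is a small sketch of parsing such a version and deciding whether an upgrade is safe to take unattended. The comment above only proposes the format; the rule used here (same API number and a higher build number means safe) is my assumption, and the names Version, Parse and SafeUpgrade are invented.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Version follows the proposed <human>.<api>.<build> layout.
type Version struct {
	Human int // human-facing number, free to bump for any reason
	API   int // assumed to change only when the API breaks
	Build int // unique, always increasing
}

// Parse reads a string such as "3.7.4031" into a Version.
func Parse(s string) (Version, error) {
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return Version{}, fmt.Errorf("want 3 dot-separated fields, got %d in %q", len(parts), s)
	}
	var nums [3]int
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return Version{}, fmt.Errorf("field %d of %q is not a non-negative integer", i+1, s)
		}
		nums[i] = n
	}
	return Version{Human: nums[0], API: nums[1], Build: nums[2]}, nil
}

// SafeUpgrade reports whether moving from one version to another keeps the
// same API number and moves forward in build order (an assumed rule).
func SafeUpgrade(from, to Version) bool {
	return from.API == to.API && to.Build > from.Build
}

func main() {
	cur, err := Parse("3.7.4031")
	if err != nil {
		fmt.Println(err)
		return
	}
	next, err := Parse("4.7.4100")
	if err != nil {
		fmt.Println(err)
		return
	}
	// The human version jumped from 3 to 4, but the API number did not move.
	fmt.Println(SafeUpgrade(cur, next))
}
```

Under this reading the marketing number can jump for a big launch without telling the tooling that anything broke.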
Different producers/projects have different expectations for strictness in semver, and that's ok, they can use it differently in a negotiation with their consumers. Also different consumers have different requirements, and most tools provide a way for them to specify that on a per-import basis (upgrade only minor of these pls unattended, upgrade nothing on this one, only breaking changes on that one).
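As a sketch of that per-import negotiation, here is roughly what a consumer-side policy check could look like. The policy names and the Allowed function are invented for illustration and do not correspond to any particular package manager's syntax.

```go
package main

import "fmt"

// Version is a plain major.minor.patch triple.
type Version struct{ Major, Minor, Patch int }

// Policy is a hypothetical per-dependency setting chosen by the consumer.
type Policy int

const (
	Hold          Policy = iota // never upgrade unattended
	PatchOnly                   // accept new patch releases only
	MinorAndPatch               // accept anything within the same major
)

// newer reports whether b is a later version than a.
func newer(a, b Version) bool {
	if a.Major != b.Major {
		return b.Major > a.Major
	}
	if a.Minor != b.Minor {
		return b.Minor > a.Minor
	}
	return b.Patch > a.Patch
}

// Allowed reports whether an unattended upgrade is permitted by the policy.
func Allowed(p Policy, from, to Version) bool {
	if !newer(from, to) {
		return false
	}
	switch p {
	case PatchOnly:
		return to.Major == from.Major && to.Minor == from.Minor
	case MinorAndPatch:
		return to.Major == from.Major
	default:
		return false
	}
}

func main() {
	from := Version{1, 3, 0}
	to := Version{1, 4, 2}
	fmt.Println(Allowed(PatchOnly, from, to), Allowed(MinorAndPatch, from, to))
}
```

How much to trust a minor bump is then a decision the consumer makes per dependency, rather than something the tool decides for everyone.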
In short, loose semver is a feature, not a bug.
You would like to announce a new major version with fanfare, bumping a version number.
A programmer wants to be sure that his piece of software continues working if there has only been a minor version bump of a library he is using.
Did you read the reasoning of Russ Cox with respect to vgo, versioning and the module system in general? It reads as very plausible and understandable.
Just because so many people interpret semver in their own way instead of what it really means shouldn't prevent the Go core team from trying to do it right.
There are multiple reasons given above why strict semantic versioning is not used in real-world software, that is a reductive summary of one of them.
Yes, I've read those articles a while back when they were written. In a very real sense, a consumer can never be sure that software keeps running properly if a library changes in any way without extensive checks. Real-world loose semantic versioning is a promise, not a proof, and it works better that way.
I don't personally think the current go proposal is terrible or sucks, but I do think it could be improved, and it'll be improved by listening to how people use versions.
It is a new package. From a practical perspective, once you break backwards compatibility the new major version is its own independent thing. If you pretend it's the same thing then you run into the situation where two libraries need v1 and v2 and therefore can't coexist. This is going to destroy your entire ecosystem. Replacing a few strings is nothing in comparison.
Not saying it's right or wrong, but clearly it is not a death sentence.
Linux distros have the same issue, rpm or .deb packages cannot have the same name but still support having 2 versions of a package/library installed.
If it's a shared library and they don't get the SONAME versioning right for backwards-incompatible changes, it's even more cumbersome, and incurs more work for everyone consuming that library.
That perspective isn't appropriate to use outside of a narrow scope. In particular it's not appropriate to use at an ecosystem level, it's too surprising.
Part of the appeal of Go is simplicity: get some things down in a minimal yet complete way, then consistently do it on the strength that it's a net win: less complexity to learn that takes you further in most cases. This isn't the only game in town, but it is a good game. C++ is different: we've got 21 million ways of doing things, but you only pay for what you use. That's fine too if you know what you're doing.
Back to Go: the fact that the real world is wishy-washy here (sometimes in the mood, sometimes not; some days I like blondes, other days brunettes) is a sometimes depressing part of DevOps: it's not formal. It's not the Go way either.
So I think Go's standpoint, that you change something to explicitly ask for a breaking change, isn't out of line. It can't be that tough of an ask to mean what you say and say what you mean when we label code with a semantic version.
Both this post and the other V2 problem Go post fail to get at the crux of the problem: in Go there are two names:
- the path we import in code
- the actual Git URL (or vendor ref) down to tag/commitId in go.mod that the previous item ultimately resolves to
The module name in code is ambiguous. It's desirable to some that code is never changed wrt the imported module name. If true, then we must turn our attention to what's in go.mod. Here the ask is for hints that there's a breaking version available ... and it'll be up to the dev to use it or ignore it, changing go.mod as they deem makes sense.
If false, then you've gotta change the code to reference the breaking version if that's the desire ... realizing it's still ambiguous because it doesn't resolve down to a commitId or tag ... so that leaves go.mod on the table for change too.
Nobody debates that code must change to reflect breaking changes if the breaking change is included in go.mod. The questions are: did we ask? Where? Did we know there's a breaking change, and can't we get some help knowing there's a breaking change since Go hits Git anyway?
At its crux then ... there are two names. Leaving the code unchanged and changing go.mod with some tool help is better IMHO.
It's just an API kind of thing. If a function / constant / var / type was available and is not anymore, it's a breaking change. It's an objective measure, there is nothing left to subjectivity.
Now, of course a bug fix is going to "break" things if your code relied on the bogus behavior. An updated algorithm can also break things (I maintain a SAT solver, whenever I update the underlying algorithm, even if it's way faster on average, in some limited cases it can make one very specific problem way slower to solve, which can have bad consequences). But as it has no consequence on the API part of things, it's not a major change in semver's meaning.
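To show how mechanical that check can be, here is a sketch that lists exported package-level names present in one version of a source file and missing from the next, using the standard go/parser and go/ast packages. It only covers the "was available and is not anymore" case: it skips methods and does not look at changed signatures or struct fields. The function names and the sample sources are made up.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"sort"
)

// exportedNames parses one Go source file and returns its exported
// package-level functions, types, constants and variables.
func exportedNames(src string) (map[string]bool, error) {
	f, err := parser.ParseFile(token.NewFileSet(), "api.go", src, 0)
	if err != nil {
		return nil, err
	}
	names := map[string]bool{}
	add := func(id *ast.Ident) {
		if id.IsExported() {
			names[id.Name] = true
		}
	}
	for _, decl := range f.Decls {
		switch d := decl.(type) {
		case *ast.FuncDecl:
			if d.Recv == nil { // plain functions only, methods are skipped
				add(d.Name)
			}
		case *ast.GenDecl:
			for _, spec := range d.Specs {
				switch s := spec.(type) {
				case *ast.TypeSpec:
					add(s.Name)
				case *ast.ValueSpec:
					for _, id := range s.Names {
						add(id)
					}
				}
			}
		}
	}
	return names, nil
}

// Removed lists exported names found in oldSrc but not in newSrc, sorted.
func Removed(oldSrc, newSrc string) ([]string, error) {
	oldNames, err := exportedNames(oldSrc)
	if err != nil {
		return nil, err
	}
	newNames, err := exportedNames(newSrc)
	if err != nil {
		return nil, err
	}
	var gone []string
	for n := range oldNames {
		if !newNames[n] {
			gone = append(gone, n)
		}
	}
	sort.Strings(gone)
	return gone, nil
}

func main() {
	// Two made-up versions of a tiny package.
	before := "package widget\n\nconst MaxSize = 10\n\nfunc New() int { return 0 }\n\nfunc Reset() {}\n"
	after := "package widget\n\nfunc New() int { return 0 }\n"
	gone, err := Removed(before, after)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("removed:", gone)
}
```

A non-empty result is the objective signal for a major bump under this definition; behavioural changes like the SAT solver example never show up in it, which is exactly the distinction being drawn.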
Or are they? Check this out: https://www.unisonweb.org/docs/tour#-the-big-technical-idea
Strict semver (as proposed by go mod) - any breaking change means a new major version, and a 'new package' at a new import url. This means older users are left behind unless they explicitly upgrade and producers are encouraged to make breaking changes because it is easy for them. At present there are no measures to alert consumers to upgrade or communicate what the changes are. This doesn't reflect current practice even in the Go project itself (at 1.x despite small breaking changes/deprecations).
Loose semver (as used in the real world) - package identity is constant, minor breakage happens at every patch level and the level of breakage is negotiated between producers and consumers at the package level. semver is used to signal changes (major - big changes, possible breakage; minor - small changes, less breakage; patch - tiny changes, possible breakage to that area). Note major versions usually are used for big changes, not just breaking changes and big changes can be just as painful for consumers (e.g. in v3 we're introducing a new api for payments, start using it, the old one is still there but will go away soon in v3.1). Usually package management systems provide mechanisms to help that negotiation (automatic upgrades within minor/patch on the assumption breakage is minimal).
There are good reasons for the current go mod defaults (simple dependency resolution, simple migration between versions), but they do ignore real-world usage IMO and will lead to a bit of pain without some further work to resolve those contradictions.
A convenient tip: in the real world this is almost never true, so when you find yourself insisting that it is, stop and think.
And that's fine, because when you break compatibility, you're actually not talking about the same library. And this makes upstream developers think twice about breaking compatibility. Accretion of features should be preferable to breakage.
It's what Rich Hickey talks about in his Spec-ulation talk: https://www.youtube.com/watch?v=oyLBGkS5ICk
I think Go's package management is really well designed, and it actually relies on semver semantically.
I think most people, particularly those using Go, would agree with this.
It does not follow that we should make breaking changes easier or routine, nor that we should force people to use strict semver (which is not widely used for good reasons).
I see why they've done that as it simplifies assumptions but prefer the way other package managers handle this where it is left to producers and consumers to negotiate how strict they want to be.
Do you actually want that? Do you actually know that the multiple versions of the dependency you happen to be using do not conflict? If the dependency is in different namespaces their locks and other globals are also in different namespaces.
There are disadvantages to both and in particular duplicating dependencies should be seen as a short-term fix for a transition, not something you do routinely. If you do it routinely you'd see bugs due to shared locks, state, or changed data structures - it breaks assumptions about pkg globals for example, an important part of the language used extensively in the stdlib.
If you don't duplicate versions then you will have to wait for every single library in existence to update to the latest major version. This can take decades and still fail. Just take a look at Python 3.
If people aren't using it, it's weird.
That smells of trouble to me... either it should be good enough that people want to use it, or painful enough not using it that people grit their teeth and do the right thing because it's less effort in the long term (e.g. you cannot use modules at all unless you do it right).
If there is no obvious downside to doing the 'wrong' thing, and it's less effort, and people are doing the 'wrong thing', in practice, in the wild...
...well, I'm not sure this has really worked out ideally.
Certainly, "great" is not how I would describe it.