
Semantic Import Versioning - SamWhited
https://research.swtch.com/vgo-import
======
calcifer
From the "Avoiding Singleton Problems" section:

> Another problem would be if there were two HTTP stacks in the program.
> Clearly only one HTTP stack can listen on port 80; we wouldn't want half the
> program registering handlers that will not be used. Go developers are
> already running into problems like this due to vendoring inside vendored
> packages.

This is only a problem if you allow nested vendor/ directories, which "dep"
(you know, the "official experiment" that suddenly got discarded to the
surprise of its developers) doesn't have, because it recurses through the
entire dependency tree and flattens it into a single vendor/, just like the
tooling of many (most?) other languages does.

The whole post reads like the author thinks Go has a very unique dependency
management problem that no other language ever had which somehow necessitates
a completely unorthodox solution. Three blog posts into "vgo", I still don't
see why...

~~~
dkarl
Go doesn't have a unique dependency management problem. This problem is shared
by many other languages that have all solved it poorly. I haven't written much
Go and am not a big fan of the language, but I am watching this discussion
eagerly because a successful solution in Go will be an example for other
languages to follow.

~~~
kenhwang
Curious about what languages you think solved it poorly, and why.

~~~
dkarl
I guess now that I think of it, all the solutions I've seen break down into
two techniques.

First, loading different versions of the library under the same name to
provide to different parts of your code. This has different risks depending on
how and when symbols are looked up or linked. In some languages you can end up
with the two versions accidentally calling into each other or using each
other's symbols. In other languages you run into the "expected Foo but got
Foo" type errors mentioned by munificent. That's what happens when you use
classloader tricks as a half-assed way of isolating "components" in Java.

Second, loading different versions of the library under different names. This
requires hacking the compiled code or the source code; convenience and
reliability will depend on the quality of the tools you're working with.
Sophistication ranges from using sed to munge source code to using tools like
objcopy that can read and rewrite compiled artifacts. Java "shading" (not
"shadowing" as I said earlier) relies on rewriting class files.

------
JeremyBanks
This article actually made the issue finally click for me: Go is having
trouble solving this problem because it has more going on in the global
environment, like its ancestor languages, and unlike more modern languages.
They need to come up with a complicated framework to rein in their spooky
action at a distance, because they have lots of implicit global relationships
where we might prefer more explicit and local ones.

Yikes.

~~~
ithkuil
> where we might prefer more explicit and local ones.

could you expand on this? who is "we" and what are those explicit and local
relationships? are you talking about an opensource ecosystem or a private
enterprise?

~~~
JeremyBanks
In this case I mean open-source communities and the companies that heavily
rely on them.

~~~
ithkuil
I'm not sure I understand your point. It seems you're implying (please correct
me if I'm wrong) that the ecosystems of modern open-source languages are more
local, more tightly coupled, with people working in concert towards a common
goal.

In my experience this is often what happens when a new community is spawned.
E.g. when nodejs started, there was always a single implementation of that
thing I needed, and it was reasonably up to date with the rest of the things
around it (including the runtime environment, of which there were only one or
perhaps two major versions). As time went on, more people started
contributing, often with different levels of commitment.

Often "modern" gets conflated with "young". Almost by definition, young
communities don't develop the same kind of problems of mature communities,
yet.

------
bbatha
I like this quite a lot; it fits very nicely into the Go stack.

Absent from this article is a discussion of alternatives to renaming. Reading
the article would hint that a semver major bump would just leave you high and
dry! This is certainly true in some ecosystems like Ruby or Java < 9. But
other ecosystems have solved this problem at the language level. JavaScript,
Rust, and others allow you to import multiple versions of a module so long as
you don't expose types from that module in your public interfaces (not
enforced by JS -- but by convention). You still reach the problem the article
references once you have those types in your public interfaces or they use
singletons. That means that your language's package manager needs to handle
these dependencies differently (peer dependencies), giving you more
flexibility at the cost of additional complexity in package management.
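
To make the "types in your public interfaces" caveat concrete, here's a rough
Go sketch (module path and types are made up): the two majors are distinct
packages with unrelated types, so the duplication stops being invisible the
moment one of them leaks into an exported signature.

    package client

    import (
        oauth "example.com/oauth"      // hypothetical major version 1
        oauthv2 "example.com/oauth/v2" // hypothetical major version 2
    )

    // Refresh is part of this package's exported API and exposes the v1 type.
    func Refresh(t oauth.Token) oauth.Token {
        return t
    }

    func example() {
        var t2 oauthv2.Token
        _ = t2
        // Refresh(t2) // would not compile: oauthv2.Token is unrelated to oauth.Token
    }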

------
ryanianian
Basically some motivation and how to use Go + semver.

But is there a way to statically-compile dependencies? Is that even a thing?

(I'm not a go user so forgive me if I need a good RTFM session.)

It seems like a lot of these problems come from two dependencies wanting
different versions of a third dependency.

Instead of just depending on a dependency as a semver string, I could
(theoretically, I think) depend on a statically-compiled version of the
dependency so it's free to call whatever libs it wants - effectively
eliminating the concept of transitive dependencies.

For projects with large dependency graphs, you may end up with relatively
large binaries since you end up with lots of duplicate object-code for common
libraries, but I wonder how much of a problem this actually is (and if simple
de-duping may solve a huge chunk of it).

We spend lots of engineering effort to resolve dependencies as source just to
end up compiling them into our executable anyway.

I'm sure this isn't a new concept, but it struck me as odd that Go is fighting
with it so much recently, considering statically-compiled, "library-free"
executables are definitely in Go's wheelhouse - why not extend that to
libraries?

~~~
pjmlp
The problem also happens with static linking; it is as old as the library
concept itself.

For example, the public symbols of the libraries might collide, they might
have side effects that misbehave because there are multiple versions, they can
rely on yet another library that can actually only be linked exactly once, and
so on.

~~~
ryanianian
Isn't that just a 'bug' in the algorithm for code layout - that the symbol
names weren't unique enough or something? This would require some breaking
changes in the way libraries are laid out and linked against, but basically
you "should" be able to statically compile a dependency and then effectively
hide all the symbols of its dependencies so nothing else knows how to link
against them directly.

------
lifeisstillgood
Am I oversimplifying the article to say

"At some point you cannot transparently support multiple target platforms"

We are all used to different builds for Intel, ARM, 32/64, etc. Why should we
be surprised to see Azure and AWS as fundamentally incompatible?

I mean, I know I was horrified when I grepped my node dependencies and found
900+ packages, but I was pleased to find I had a clear version number on each
of them. (Yes, I am hand-waving over what the dependency manager resolves,
which is I guess the point of this post in some manner, but the post here
seemed to be saying that when you have incompatible requirements you are
stuffed. And yes, that's true. So don't have incompatible transitive
requirements - this is only a problem for package maintainers, not for
developers, and so I suspect it is a package _aggregation_ problem?)

------
munificent
_> "Incompatible changes should not be introduced lightly to software that has
a lot of dependent code. ..."

I certainly agree that “incompatible changes should not be introduced
lightly.”_

This is agreeing with a sentence that the semver authors didn't write. The
clause "that has a lot of dependent code" isn't in there arbitrarily.

What everyone in an ecosystem wants is high quality, easy-to-use, stable
packages. In a perfect world populated by programming demigods, v1 of every
package would be all three of those. In practice, human software engineers do
not design usable APIs and write robust bug-free code without feedback from
users. In order to act on that feedback, they need to change their code, which
sacrifices stability.

The way this works in other healthy package ecosystems is that packages have a
lifecycle. Early in the package's lifetime, it is undergoing rapid, breaking
change while it finds its way. It can do that relatively easily because there
are a small number of users harmed by the churn. If it gets popular, that
implies it has found a good local optimum of design and quality. At that
point, stability takes precedence and the package's evolution slows down.

The path to a great library is usually through several versions of a kinda-
shitty one. A good package manager supports both maintainers and consumers
working on packages at all stages of that lifecycle.

 _> Able to predict the effects on users more clearly, authors might well make
different, better decisions about their changes. Alice might look for a way to
introduce the new, cleaner API into the original OAuth2 package alongside the
existing APIs, to avoid a package split. Moe might look more carefully at
whether he can use interfaces to make Moauth support both OAuth2 and Pocoauth,
avoiding a new Pocomoauth package. Amy might decide it’s worth updating to
Pocoauth and Pocomoauth instead of exposing the fact that the Azure APIs use
outdated OAuth2 and Moauth packages. Anna might have tried to make the AWS
APIs allow either Moauth or Pocomoauth, to make it easier for Azure users to
switch._

Those decisions are only "better" because they route around a difficulty the
package manager arbitrarily put there in the first place.

There is already _plenty_ of essential friction discouraging package
maintainers from shipping breaking changes arbitrarily. Literally receiving
_furious_ email from users that have to migrate is pretty high on that list. I
don't see value in explicitly adding more friction in the package manager
because the package manager authors think they know better than the package
maintainer how to serve their users.

 _> To be clear, this approach creates a bit more work for authors, but that
work is justified by delivering significant benefits to users._

Users don't want all of the work pushed onto maintainers. Life needs to be
easy for maintainers too, because happy maintainers are how users get lots of
stuff to use in the first place. If you push all of the burden onto package
maintainers, you end up with a beautiful, brilliantly-lit grocery store full
of empty shelves. Shopping is a pleasure but there's nothing to buy because
producing is a chore.

Good tools distribute the effort across both kinds of users. There's obviously
some amortization involved because a package is consumed more than it's
maintained, but I'm leery of any plan that deliberately makes life harder for
a class of users, without very clear positive benefit to others. Here, it
seems like it makes it harder to ship breaking changes, without making
anything else noticeably easier in return.

 _> They can't just decide to issue v2, walk away from v1, and leave users
like Ugo to deal with the fallout. But authors who do that are hurting their
users._

Are they hurting users worse than not shipping v2 _at all_? My experience is
that users will prefer an imperfect solution over no solution when given the
choice. It may offend our purist sensibilities, but the reality is that lots
of good applications add value to the world built on top of mediocre, half-
maintained libraries. Even the most beautiful, well-designed, robust packages
often went through a period in their life where they were hacky, buggy, or
half-abandoned.

A good ecosystem enables packages to _grow_ into high quality over time,
instead of trying to gatekeep out anything that isn't up to snuff.

 _> In Go, if an old package and a new package have the same import path, the
new package must be backwards compatible with the old package._

This doesn't define _for whom_ it must be backwards compatible. Breaking
changes are not all created equal. Semver is a pessimistic measure. You bump
the major version if a change _could break at least one user, in theory._ In
practice, most "breaking" changes do not break most users.

If you remove a function that turned out to not be useful, that's a "breaking"
change. But any consumer who wasn't calling that function in the first place
is not broken by it. If maintainer A ships a change that doesn't break user B,
a good package manager lets user B accept that change as easily as possible.

As far as I can tell, the proposal here requires B to rewrite all of their
imports and deal with the fact that their application may now have two
versions of that package floating around if some other dependency still uses
the old version. That's pretty rough.
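
Concretely (names made up): suppose example.com/util v2's only change is
deleting a function B never called. Nothing in B's code actually breaks, yet
adopting v2 still means editing the import path in every file:

    package app

    // v1 of the hypothetical example.com/util exported both Useful and Obsolete;
    // v2 deletes Obsolete, which semver counts as a breaking change.

    import util "example.com/util/v2" // previously: import util "example.com/util"

    // Nothing here ever called util.Obsolete, so no call site changes --
    // the import-path edit, repeated in every importing file, is the whole migration.
    func Run() {
        util.Useful()
    }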

What you'll probably see is that A just never removes the function even though
it's dead weight both for the maintainer and consumer. This scheme encourages
packages to calcify at whatever their current level of quality happens to be.
That might be fine if the package already happens to be great, but if it has a
lot of room for improvement, this just makes it harder to do that improvement.

~~~
matt_m
Well, the article calls for a v0, which seems to be exactly for the use case
you describe? There are no import path changes, undergoing "rapid, breaking
change" is allowed, and if you ever find a good local optimum you can graduate
to v1 without any import path change either. I don't see any requirement to
ever move to v1, although users may understandably prefer libraries that do. I
don't quite understand what additional support you are looking for from "a
good package manager".

I'm also not sure this makes it harder to ship a v2. Sure, users will have to
change their import paths, although I'm sure tooling like GoLand can easily
automate this. But this also frees library maintainers to do extensive API
redesigns, without worrying about breaking everything or hanging their
existing users out to dry. In particular, the ability to make v1 depend on
(and become a wrapper for) v2 is quite nice. Not only does this pattern not
break existing code, but it even allows users who have not yet migrated to the
new API to benefit from the active development on the latest branch. And of
course there is the potential for some degree of automated migration, through
inlining wrapper functions as mentioned in the article.
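
For reference, a rough sketch of that wrapper pattern (package names made up);
type aliases, available since Go 1.9, let the v1 shim re-export v2's types so
values can flow between code on either major:

    // Package moauth is the frozen v1 API, maintained as a thin shim over v2.
    package moauth

    import v2 "example.com/moauth/v2" // hypothetical v2 import path

    // Token aliases the v2 type so old and new callers can share values.
    type Token = v2.Token

    // Authorize keeps the v1 signature and forwards to the v2 implementation.
    func Authorize(user string) (Token, error) {
        return v2.Authorize(user)
    }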

------
throw7
Has something like gcc symbol versioning been talked about? I can already
sense the sneers from some, but I'd imagine there could be an evolution/"Go
way" to implement it.

~~~
4ad
This is about software (source code) versioning. Shared library symbol
versioning is a completely orthogonal concept.

------
__david__
I feel like the article made its case pretty well, but I really dislike the
idea that I need to duplicate my library into a "v2/" directory (or a
different top-level git repo) in practice. Maybe I'm misunderstanding
something, but this seems to be exactly what branches are for. If I'm not able
to specify a branch name in the package "path" then there's something really
wrong.

~~~
matt_m
It wasn't obvious to me either, but apparently vgo translates that into the
appropriate git tag; it's not actually a separate directory.

~~~
4ad
Yeah, it's just the import path that's changed, but it's still ugly as sin
and makes the mapping between import paths and filesystem paths non-trivial.
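
Roughly, the consumer side looks like this (module path made up) - the
dependency's repository need not contain a v2/ directory at all, which is
exactly where the path mapping gets non-obvious:

    package consumer

    // The "/v2" here names a major version, not a directory: vgo resolves this
    // path to the example.com/mylib repository at a v2.x.y tag (declared in
    // that repo's go.mod), so the repository needs no v2/ subdirectory.
    import _ "example.com/mylib/v2"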

------
k__
While I still think SemVer is crap (because of the edge cases), this seems to
be a reasonable approach to library versioning.

~~~
4ad
Mind expanding on SemVer?

~~~
k__
It says only majors should break the API, but bugs do it all the time. So that
rule is just wrong and gives a false sense of safety.

~~~
4ad
And the alternative is?

~~~
k__
Simply treat every new version as a major release; everything else is a lie.

Sure, you can structure it by "intent", but don't pretend a bugfix can't break
your API.

------
justicezyx
I am guessing my view of versioning as the fundamental abstraction for
constructing software systems is not widely shared.

I did not see anything here that isn't an approximation of versioning, with
added semantics tailored to more focused use cases.

