Go += Package Versioning (swtch.com)
336 points by bketelsen on Feb 20, 2018 | 201 comments



I'm going to comment mostly on the parts of the proposal that I think are wrong, but don't take this to be an overall negative response. I'm excited to see smart folks working on this, and package management is a really hard problem. There are no silver bullets to code reuse.

Context for those who don't know: I along with Natalie Weizenbaum wrote pub[1], the package manager used for Dart.

> Instead of concluding from Hyrum's law that semantic versioning is impossible, I conclude that builds should be careful to use exactly the same versions of each dependency that the author did, unless forced to do otherwise. That is, builds should default to being as reproducible as possible.

Right on. Another way to state this is: Changing the version of a dependency should be an explicit user action, and not an implicit side effect of installing dependencies.

    import "github.com/go-yaml/yaml/v2"
> Creating v2.0.0, which in semantic versioning denotes a major break, therefore creates a new package with a new import path, as required by import compatibility. Because each major version has a different import path, a given Go executable might contain one of each major version. This is expected and desirable. It keeps programs building and allows parts of a very large program to update from v1 to v2 independently.

It took me several readings to realize that you encode the major version requirement both in the import string and in the module requirements. The former lets you have multiple copies of the "same" module in your app at different major versions. The latter lets you express more precise version requirements like "I need at least 2.3, not just 2.anything".

I think it's really going to confuse users to have the major version in both places. What does it mean if my code has:

    import "github.com/go-yaml/yaml/v2"
But my go.mod has:

    require (
      "github.com/go-yaml/yaml" v1.5.2
    )
I don't know if the goal of this is to avoid lockfiles, or to allow multiple versions of the same package to co-exist, but I think it's going to end up a confusing solution that doesn't cleanly solve any problem.
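To make the mismatch in that example concrete, here is a minimal sketch, assuming the proposal's convention that major versions 2 and up appear as a trailing "/vN" element of the import path. `splitMajor` and `requirementMajor` are hypothetical helpers written for illustration, not part of vgo, and v0 is ignored for brevity.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// splitMajor is a hypothetical helper (not part of vgo). It splits an import
// path into its base path and the major version implied by a trailing "/vN"
// element. Paths without such a suffix are treated as major version 1.
func splitMajor(importPath string) (base string, major int) {
	i := strings.LastIndex(importPath, "/")
	if i < 0 {
		return importPath, 1
	}
	last := importPath[i+1:]
	if len(last) >= 2 && last[0] == 'v' {
		if n, err := strconv.Atoi(last[1:]); err == nil && n >= 2 {
			return importPath[:i], n
		}
	}
	return importPath, 1
}

// requirementMajor extracts the major number from a version such as "v1.5.2".
func requirementMajor(version string) (int, error) {
	head, _, _ := strings.Cut(strings.TrimPrefix(version, "v"), ".")
	return strconv.Atoi(head)
}

func main() {
	base, importMajor := splitMajor("github.com/go-yaml/yaml/v2")
	reqMajor, err := requirementMajor("v1.5.2")
	if err != nil {
		panic(err)
	}
	fmt.Println(base, "import major:", importMajor, "require major:", reqMajor)
	if importMajor != reqMajor {
		fmt.Println("the import and the require line name different major versions")
	}
}
```

Under that assumption the import and the `require` line refer to two different major versions, which is exactly the situation the question above is asking about.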

For what it's worth, Dart does not let you have two versions of the same package in your application, even different major versions. This restriction does cause real pain, but it doesn't appear to be insurmountable. Most of the pain seems to be in the performance issues over-constrained dependencies caused in our old version solver and not in the user's code itself.

In almost all cases, I think there is a single version of a given package that would work in practice, and I think it's confusing for users to have an application that has multiple versions of what they think of as the "same" package inside it. This may be less of an issue in Go because it's structurally typed, but in Dart you could get weird errors like "Expected a Foo but got a Foo" because those "Foo"s are actually from different versions of "foo". Requiring a single version avoids that.

> I believe this is the wrong default, for two important reasons. First, the meaning of “newest allowed version” can change due to external events, namely new versions being published. Maybe tonight someone will introduce a new version of some dependency, and then tomorrow the same sequence of commands you ran today would produce a different result.

No, I think newest (stable version) is the right default. Every package manager in the world works this way and the odds that they all got this wrong are slim at this point.

At the point in time that the user is explicitly choosing to mess with their dependencies, picking the current state of the art right then is likely what the user wants. If I'm starting a brand new from scratch Ruby on Rails application today, in 2017, there is no reason it should default to having me use Rails 1.0 from 2005.

Every version of the package is new to me because I'm changing my dependencies right now. Might as well give me the version that gets me as up-to-date as possible because once I start building on top of it, it gets increasingly hard to change it. Encouraging me to build my app in terms of an API that may already be quite out of date seems perverse.

> This proposal takes a different approach, which I call minimal version selection. It defaults to using the oldest allowed version of every package involved in the build. This decision does not change from today to tomorrow, because no older version will be published.

I think this is confusing "older" versions with "lower" ones. You could, I suppose, build a package manager that forbids publishing a version number lower than any previously published version of the package and thus declare this to be true by fiat.

But, in practice, I don't think most package managers do this. In particular, it's fairly common for a package to have multiple simultaneously supported major or minor versions.

For example, Python supports both the 2.x and 3.x lines. 2.7 was released two years after 3.0.

When a security issue is found in a package, it's common to see point releases get released for older major/minor versions. So if foo has 1.1.0 and 1.2.0 out today and a security bug that affects both is found, the maintainers will likely release 1.1.1 and 1.2.1. This means 1.1.1 is released later than 1.2.0.
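The difference between "highest version" and "most recently published" can be sketched as follows. The publication order of the two patch releases (1.2.1 before 1.1.1) is an assumption for the example, and `release`, `highest` and `lastPublished` are illustrative names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// release pairs a version with its publication order (larger seq = later).
type release struct {
	version string
	seq     int
}

// parse turns "1.2.3" into [1 2 3]; malformed parts count as 0 in this sketch.
func parse(v string) [3]int {
	var out [3]int
	for i, p := range strings.SplitN(strings.TrimPrefix(v, "v"), ".", 3) {
		out[i], _ = strconv.Atoi(p)
	}
	return out
}

// less reports whether version a is lower than version b.
func less(a, b string) bool {
	pa, pb := parse(a), parse(b)
	for i := range pa {
		if pa[i] != pb[i] {
			return pa[i] < pb[i]
		}
	}
	return false
}

// highest returns the highest version number; rs must be non-empty.
func highest(rs []release) string {
	best := rs[0].version
	for _, r := range rs[1:] {
		if less(best, r.version) {
			best = r.version
		}
	}
	return best
}

// lastPublished returns the version published most recently; rs must be non-empty.
func lastPublished(rs []release) string {
	last := rs[0]
	for _, r := range rs[1:] {
		if r.seq > last.seq {
			last = r
		}
	}
	return last.version
}

func main() {
	foo := []release{{"1.1.0", 1}, {"1.2.0", 2}, {"1.2.1", 3}, {"1.1.1", 4}}
	fmt.Println("highest:", highest(foo))              // highest: 1.2.1
	fmt.Println("last published:", lastPublished(foo)) // last published: 1.1.1
}
```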

I think preferring minimum versions also has negative practical consequences. Package maintainers have an easier job if most of their users are on similar, recent versions of the package's own dependencies. It's no fun getting bug reports from users who are using your code with ancient versions of its dependencies. As a maintainer, you're spending most of your time ensuring your code still works with the latest, so having your users in a different universe makes it harder to be in sync with them.

Look at, for example, how much more painful Android development is compared to iOS because Android has a much longer tail of versions still in the wild that app developers need to deal with.

If you do minimum version selection, my hunch is that package maintainers will just constantly ship new versions of their packages that bump the minimum dependencies to forcibly drag their users forward. Or they'll simply state that they don't support older versions beyond some point in time even when the package's own manifest states that it technically does.

There is a real fundamental tension here. Users — once they have their app working — generally want stability and reproducibility. No surprises when they aren't opting into them. But the maintainers of the packages those users rely on want all of their users in the same bucket on the latest and greatest, not smeared out over a long list of configurations to support.

A good package manager will balance those competing aims to foster a healthy ecosystem, not just pick one or the other.

[1]: https://pub.dartlang.org/


You (and likely everyone else) should look at the tour as well before commenting, as I think many people are misunderstanding some of the subtler points.

> If I'm starting a brand new from scratch Ruby on Rails application today, in 2017, there is no reason it should default to having me use Rails 1.0 from 2005.

In the tour it states, "We've seen that when a new module must be added to a build to resolve a new import, vgo takes the latest one." which means that the newest Rails would be used and set in your `go.mod` file.

From that point onwards the "minimal version" will be used, which means vgo won't upgrade you to a version released tomorrow unless you (or a module you use) explicitly state that they need that newer version.

This is a much saner default than the one you describe (imo) as people still get recent versions for new projects, but once they are using a specific version they won't upgrade unless they need to or want to.


> When a security issue is found in a package, it's common to see point releases get released for older major/minor versions. So if foo has 1.1.0 and 1.2.0 out today and a security bug that affects both is found, the maintainers will likely release 1.1.1 and 1.2.1. This means 1.1.1 is released later than 1.2.0.

I should have addressed this in the original reply and it's too late to edit now, but this isn't an issue. I downloaded vgo and verified that you CAN release a 1.1.1 AFTER 1.2.0 and it is treated correctly as far as I can tell.

See github.com/joncalhoun/vgo_main:

    $ vgo list -m -u
    MODULE                          VERSION                    LATEST
    github.com/joncalhoun/vgo_main  -                          -
    github.com/joncalhoun/vgo_demo  v1.0.1 (2018-02-20 18:26)  v1.1.0 (2018-02-20 18:25)
v1.0.1 is newer than v1.1.0, but isn't treated as the latest version. I suspect that RSC didn't mean "older" in the literal datetime sense, but rather in the context of semantic versioning where "older" means you don't release v1.3.4 AFTER you have released v1.3.5


> In the tour it states, "We've seen that when a new module must be added to a build to resolve a new import, vgo takes the latest one." which means that the newest Rails would be used and set in your `go.mod` file.

That works for adding a new dependency. But, as I understand it, if I decide to upgrade my dependency on foo by changing its already-present version in my app's module file, this does not upgrade any of the transitive dependencies that foo has. Instead, it selects the lowest versions of all of those transitive dependencies even though my goal with foo itself is to increase its version.

So now I have to reason about why it sometimes picks the latest version and sometimes doesn't, depending on the kind of change I'm making.


The new release of the dependency can also bump the minimum required versions of its dependencies, as part of its release cycle. If it doesn't, you can upgrade them like any other dependency; after all, transitive dependencies are just dependencies.

That said, you can just upgrade all the dependencies with vgo get -u and get the "always latest" behaviour. This is a desirable result, but it shouldn't happen at each and every fresh build.

You can have automation that periodically tries to bump all the versions and, if all tests pass, sends you a PR with the proposed update.

With the proposed rules you get: 1. repeatable builds, as with lock files; 2. constraint resolution that is simple to reason about when multiple modules depend on the same module.


Let's say I create a program that is using foo and end up with the following dependencies:

main:

    requires "foo" v1.0.0
foo (v1.0.0):

    requires "bar" v1.0.0
Right now if I check my dependencies, I'll have something like this:

    MODULE    VERSION
    main      -
    bar       v1.0.0
    foo       v1.0.0
Now let's say some time passes, and both foo and bar release new versions:

foo:

    v1.0.0
    v1.1.0
bar:

    v1.0.0
    v1.0.1
    v1.1.0
    v1.1.1
    v1.1.2

And the deps for foo v1.1.0 are:

foo (v1.1.0):

    require "bar" v1.0.1
Realizing that foo has an update, I decide I want to upgrade. I'd do vgo get foo. My updated dependencies (shown with "vgo list -m") are:

    MODULE    VERSION
    main      -
    bar       v1.0.1
    foo       v1.1.0
bar gets its version increased as well, using the version specified by the foo package's module. This makes sense to me - the foo package maintainer has stated that he only needs v1.0.1 to be stable, so we default to what he specified.

Now imagine I want to add another package, say it is the wham package and it has the following dependencies:

wham (v1.0.0):

    require "bar" v1.1.1
If I add this to my code my versions will now be:

    MODULE    VERSION
    main      -
    wham      v1.0.0
    bar       v1.1.1
    foo       v1.1.0
bar now uses v1.1.1 because it is the minimal version that satisfies all of my modules. vgo DOES upgrade bar for us, but not beyond the lowest version number required to satisfy all of our modules. That said, we can still upgrade it manually with "vgo get bar", after which it will be using v1.1.2 because our main dependencies would become:

main:

    requires "foo" v1.1.0
    requires "wham" v1.0.0
    requires "bar" v1.1.2
In short, upgrading foo WILL upgrade all of foo's dependencies in order to meet its minimum version requirements, but no further. That said, you can still manually upgrade any of those dependencies.

To me this makes sense. The creator of foo may have avoided upgrading the dependency on bar for some performance reasons, so this upgrade only happens in your code if it is required by another package, you initiate it manually, or if the foo package releases a new version with updated dependencies in its go.mod file.
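The walkthrough above can be reproduced with a small sketch of the selection rule as described in this comment: collect every requirement reachable from the main module and keep the highest version asked for per module. This is a simplified illustration under that reading, not vgo's actual implementation; `req`, `requirements` and `buildList` are names made up for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// req is one line of a module file: a module and the minimum version needed.
type req struct{ module, version string }

// requirements maps "module@version" to that release's own requirements,
// copied from the foo/bar/wham example. Releases not listed here (all the
// bar releases) require nothing.
var requirements = map[string][]req{
	"foo@v1.0.0":  {{"bar", "v1.0.0"}},
	"foo@v1.1.0":  {{"bar", "v1.0.1"}},
	"wham@v1.0.0": {{"bar", "v1.1.1"}},
}

// parse turns "v1.2.3" into [1 2 3]; malformed parts count as 0 in this sketch.
func parse(v string) [3]int {
	var out [3]int
	for i, p := range strings.SplitN(strings.TrimPrefix(v, "v"), ".", 3) {
		out[i], _ = strconv.Atoi(p)
	}
	return out
}

// less reports whether version a is lower than version b.
func less(a, b string) bool {
	pa, pb := parse(a), parse(b)
	for i := range pa {
		if pa[i] != pb[i] {
			return pa[i] < pb[i]
		}
	}
	return false
}

// buildList walks every requirement reachable from the root module and keeps,
// for each module, the highest version that anything asked for.
func buildList(root []req) map[string]string {
	selected := map[string]string{}
	seen := map[string]bool{}
	queue := append([]req(nil), root...)
	for len(queue) > 0 {
		r := queue[0]
		queue = queue[1:]
		key := r.module + "@" + r.version
		if seen[key] {
			continue
		}
		seen[key] = true
		if cur, ok := selected[r.module]; !ok || less(cur, r.version) {
			selected[r.module] = r.version
		}
		queue = append(queue, requirements[key]...)
	}
	return selected
}

// show prints a build list sorted by module name.
func show(label string, list map[string]string) {
	names := make([]string, 0, len(list))
	for m := range list {
		names = append(names, m)
	}
	sort.Strings(names)
	fmt.Println(label)
	for _, m := range names {
		fmt.Printf("  %-5s %s\n", m, list[m])
	}
}

func main() {
	show("after vgo get foo:", buildList([]req{{"foo", "v1.1.0"}}))
	show("after adding wham:", buildList([]req{{"foo", "v1.1.0"}, {"wham", "v1.0.0"}}))
	show("after vgo get bar:", buildList([]req{{"foo", "v1.1.0"}, {"wham", "v1.0.0"}, {"bar", "v1.1.2"}}))
}
```

The three calls correspond to the three states in the walkthrough: bar ends up at v1.0.1, then v1.1.1, then v1.1.2. Nothing in the rule looks at which releases exist, only at which versions were asked for, which is why publishing a new bar release changes nothing until someone requires it.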

PS - I've tested this all using the prototype of vgo. You can see yourself by grabbing this code: github.com/joncalhoun/vgo_foo_main and then use vgo to list dependency versions and try upgrading foo which has a dep on demo.


> For what it's worth, Dart does not let you have two versions of the same package in your application, even different major versions. This restriction does cause real pain, but it doesn't appear to be insurmountable. Most of the pain seems to be in the performance issues over-constrained dependencies caused in our old version solver and not in the user's code itself.

> In almost all cases, I think there is a single version of a given package that would work in practice, and I think it's confusing for users to have an application that has multiple versions of what they think of as the "same" package inside it. This may be less of an issue in Go because it's structurally typed, but in Dart you could get weird errors like "Expected a Foo but got a Foo" because those "Foo"s are actually from different versions of "foo". Requiring a single version avoids that.

I think this makes a strong case for not releasing major version upgrades that use the same package names. The very idea of two incompatible things having the same name should set off alarm bells. Instead of trying to make that work, we should be avoiding it.

In the absence of this principle, the Java ecosystem has developed a compensatory mechanism of packaging "shadowed" versions of their dependencies alongside their own code. This is an ugly hack to accomplish the same thing after the fact, so we are already incurring even more complexity than would be imposed by following this rule.


> I think this makes a strong case for not releasing major version upgrades that use the same package names. The very idea of two incompatible things having the same name should set off alarm bells. Instead of trying to make that work, we should be avoiding it.

If you do that, I think you'll find in practice that one of two things happens (or more likely, both, in a confusing mixture):

1. People start releasing packages whose names include version numbers. "markdown2", etc. Then you get really confusing hallway conversations like, "Yeah, you need to use markdown2 1.0.0."

2. People start coming up with weird confusing names for the next major version of packages because the current nice name is taken. Then you get confusing conversations like, "Oh, yeah, you need to upgrade from flippitywidget to spongiform. It's almost exactly the same, but they removed that one deprecated method." Also don't forget to rename all of your imports.


> I think you'll find in practice that one of two things happens

I think the existence of those practices (like Java dependency shading) proves that people are struggling towards this solution on their own, without support from the language or the community. With official support, if major versions work the same way for everybody, it won't need to be so janky.

In practice, I predict that people would start behaving better, doing what they should have been doing (and what many have been doing) all along: avoiding unnecessary breaking changes in non-0.x libraries, choosing function signatures carefully, and growing by accretion and living with their mistakes. Right now, I think some developers see major version bumps as a convenient way to erase their mistakes, without taking into account the cost imposed on users who end up juggling dependency conflicts.


The major version goes into the name, markdown2, but the version numbers should be monotonic, so that when viewed on a number line they are in order. This also allows the programmer to import both, and have a smooth transition between the deps.


I'm not really enthused by efforts to bring in the same packaging constructs as other languages, which all have the effect of making time-to-compile after git clone longer.

Frankly I think we tend to conflate two separate but related tasks in these discussions: communicating updates, and distributing dependencies.

vendor/ folders are a totally fine distribution system - optimal even. Time-to-compile is 0 because you get the dependencies with git clone.

So really the problem we have is communicating updates (and some git server tooling smarts to deduplicate files but let github solve that).


According to the article, the solution I suggest has been proposed in the Go community, and I don't know of any language where it has actually been adopted, so Go might be the first. I just know that Java has been forced to work around its absence.

As for the tasks that need to be solved here, the primary one I see is reconciling the needs of different libraries. What do you do when you depend on library A and library B and they need two incompatible versions of library C? As I see it, there's no clean way to answer that question if A and B expect the two incompatible versions to be present with the same name.


Yeah, I'm also a bit concerned with vendoring going away. The proposed solution to preserving upstream is imho very elegant (caching proxies) and scales better than vendoring, but it requires a bit more infrastructure. Perhaps vgo could be taught to look in a local directory for an exploded content of what logically is the upstream archive (and that local directory could be checked in git, or preserved as a cache by your CI etc).


I think vendoring is very useful and it would be a step back if it becomes harder.

Caching proxies for zip downloads sounds nice, but it's more than just "a bit more infrastructure". I think it would be a huge burden to package publishers if each of them has to manage their own dependency zip mirror as a separate piece of infrastructure. You need version control anyway; checking your dependencies into that same version control does not require a new piece of infrastructure.

Coming from Ruby, where rubygems.org is a very painful point of failure, in my eyes the fact that Go dependencies are not a separate download is a big plus.

In fact without a single blessed dependency repository such as rubygems.org, in the Go case you have as many points of failure at build time as there are different code hosting sites in your dependency graph.


I don't think package publishers are the ones that need to manage the caching proxies: everyone who wants stable builds that don't break when some upstream deletes an old version of a package (or a networking error) needs a proxying cache.

Vendoring turned your git repo into your poor man's proxying cache. It also made some people unhappy. In my current company we use phabricator for code reviews and it doesn't work well if the commit size is bigger than some threshold.

I love to have the option of not checking in dependencies. I'm not sure this option has to be forced on everybody though.


> In the absence of this principle, the Java ecosystem has developed a compensatory mechanism of packaging "shadowed" versions of their dependencies alongside their own code.

well that is okish..

however... looking at the xml stuff.. that is probably bad. basically a lot of packages repackage xerces2 or the java.xml apis. besides that jaxp with streaming is present in java since 1.6+. But nobody removes this stuff.


Now is a good time to mention the great blog post, "So you want to write a package manager," by Sam Boyer: https://medium.com/@sdboyer/so-you-want-to-write-a-package-m...


>I'm going to comment mostly on the parts of the proposal that I think are wrong, but don't take this to be an overall negative response. I'm excited to see smart folks working on this, and package management is a really hard problem.

And as usual Golang ignores progress in the area by package managers such as npm, cargo, et al for what seems like a half-hearted solution.

Issues I see: the introduction of modules on top of packages solves no real problem, nor does the addition of major version numbers as part of the package identification (thus allowing the same program to use different versions of a package), and "minimal version selection" solves nothing that lock/freeze files wouldn't solve better, while preventing users from getting important minor but compatible updates (e.g. security) as a "default".


Having had lengthy conversations with Sam Boyer on this topic, I know that at least he deeply knows how these systems work. So it hasn't felt like "ignored" to me, at least, as an outside observer.


Perhaps not ignored in the "didn't know about them" sense, but in the "nevertheless went ahead and did its own thing" sense.


That's unconstructive and substance-less.

Could you expand on the progress you mentioned, or explain what parts of the counterexamples you gave the Golang folks should learn from? In what ways is the proposal a half-hearted solution?


Added some issues with the current proposal.


> I think it's really going to confuse users to have the major version in both places. What does it mean if my code has:

after reading the proposal, my understanding is:

the 'import "github.com/go-yaml/yaml/v2"' directive would lead to installing the oldest version of yaml 2.x that is supported by your other dependencies.

meanwhile, the go.mod file would mean that any dependencies that use the incompatible yaml 1.x library would lead to you installing the oldest 1.x version after 1.5.2, which would then be used by all dependencies that import the 1.x version

> No, I think newest (stable version) is the right default. Every package manager in the world works this way and the odds that they all got this wrong are slim at this point.

Doing this is meant to allow reproducible builds without requiring the use of a lock file. As to why they don't want a lock file... that isn't really addressed in the article. Lock files do seem like the most sane way to provide truly reproducible builds (that aren't dependent on the repo tags not changing since they are usually locked to a specific commit hash). I think the decision to avoid a lock file is a bad one and certainly needs to be justified.

> I think this is confusing older versions and lower. You could, I suppose, build a package manager that forbids publishing a version number lower than any previously published version of the package and thus declare this to be true by fiat.

I agree. I also think they meant to say "minimal minor version", since major versions have different import paths and are backward incompatible.

Ideally, "prefer oldest / prefer newest" should be something that can be configured per requirement in the go.mod file so that people who don't care about reproducibility don't have to go through and bump their minimum versions every time any dependency has a new release. Making this dependent on using a flag every time you run 'vgo get' is silly and doesn't allow you to do this for some packages and not others without having to write your own script to make a bunch of 'vgo get' invocations.

> I think preferring minimum versions also has negative practical consequences. Package maintainers have an easier job if most of their users are on similar, recent versions of the package's own dependencies.

Ideally, the first step you take when you encounter a bug with a library would be to check to see if a more recent version of the library fixes the bug. In practice, I don't know how many people would fail to do this before filing a bug report.


Current Go vendoring already allows two versions of a package to be used at the same time, and it is problematic. Both copies of the package will run their initialization, which works if they are completely self-contained, but if they interact with any other global state things can start going wrong. Your mutexes protecting external global resources don't work, because they are in different namespaces. These and similar problems are why common wisdom is that libraries should not pull in vendored dependencies.
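A minimal sketch of the mutex problem, assuming each copy of the library keeps its own package-level lock around some shared external resource. `pkgCopy` is a made-up stand-in for one copy's package-level state, and `TryLock` (Go 1.18+) is used only to make the outcome visible without blocking.

```go
package main

import (
	"fmt"
	"sync"
)

// pkgCopy models the package-level state of one vendored copy of a library.
// Each copy has its own mutex, even though both guard the same resource.
type pkgCopy struct {
	mu sync.Mutex
}

// acquire reports whether this copy believes it obtained exclusive access
// to the shared external resource.
func (p *pkgCopy) acquire() bool {
	return p.mu.TryLock()
}

func main() {
	var copyA, copyB pkgCopy
	fmt.Println("A acquires:", copyA.acquire())       // A acquires: true
	fmt.Println("B acquires:", copyB.acquire())       // B acquires: true
	fmt.Println("A acquires again:", copyA.acquire()) // A acquires again: false
}
```

Both copies report success at the same time, so the exclusion only holds within one copy of the package and not across the program.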

So yes, attempting to allow multiple versions of the same module will cause grief.

You would also need to work out a way to override the choices made by dependencies, e.g. to rebuild with security or bug fixes, without requiring me to fork everything.


> Dart does not let you have two versions of the same package in your application, even different major versions. This restriction does cause real pain

Dart is primarily targeting web deployment, in which code/executable size is a major concern. For Dart's use case it makes perfect sense to force the developer to sort this out ahead of time, painful as it might be. For lots of other languages (including Go), the primary expected deployment is a binary executable, where bloating it with multiple versions of dependencies, to make builds easier and to make it possible to use two dependencies with mismatched dependencies of their own, is very rarely a problem.


How would it be possible to guarantee reproducible/deterministic compatibility without using something like the hash of the entire library as an implicit "version"? (say, the git SHA)

I believe that the Nix OS does something along these lines to guarantee deterministic builds.

A number of times in at least a couple of languages, I've seen a library keep the same (exact) version number but make some small "bugfix" change that ended up breaking things. Often, nothing stops someone from doing that.
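One way to catch a release whose content changed under an unchanged version number is to record a digest of the content the first time it is fetched and compare on later fetches. The sketch below assumes the module content is available as a byte slice; `digest`, `verify` and the `foo@v1.2.0` key are illustrative, not anything the proposal defines.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digest returns the hex-encoded SHA-256 of the given content.
func digest(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// verify reports whether content fetched for key ("module@version") matches
// the digest recorded earlier. Unknown keys fail verification.
func verify(pins map[string]string, key string, content []byte) bool {
	want, ok := pins[key]
	return ok && want == digest(content)
}

func main() {
	original := []byte("package foo\n\nfunc Answer() int { return 42 }\n")
	pins := map[string]string{"foo@v1.2.0": digest(original)}

	// Same version number, silently changed content.
	retagged := []byte("package foo\n\nfunc Answer() int { return 43 }\n")

	fmt.Println(verify(pins, "foo@v1.2.0", original)) // true
	fmt.Println(verify(pins, "foo@v1.2.0", retagged)) // false
}
```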


The proposal doesn't seek to guarantee reproducible builds; it merely seeks to enable them, through the methods they outline.

If you did want to guarantee reproducible builds with SHA-1 hashes, one way would be to introduce those into the .mod files they outlined. But that'd be clunky; it's much easier to reason about a version number than it is a digest hash.

Another method would be to introduce a lock file where those details are kept from plain view, but my sense was that they wanted a little more openness about the mechanism they were using than a lockfile provides (which is why .mod files use Go syntax, save the new "module" keyword they would introduce). After all, that's how dep works right now: they might as well just keep the lock file.

Cases where tags are being deleted, or worse, where accounts are deleted, then recreated with (other? same?) code, may be said to break the semver contract that the library or binary author has with their users. As such, it may be seen as outside of scope for what they are seeking to accomplish with vgo.


What are the criticisms of lockfiles? I've used lockfiles more or less successfully in Rails, Elixir, and of late, Node. I thought it was a proven (if imperfect) idea...


I should note I am fine with lockfiles myself, so I can only speculate as to what RSC/others feel. It is fair to say that lockfiles are not Go; they would be another set of syntax that has nothing else to do with the language aside from their use in package management. So one might argue that it would be desirable to have a solution to Go's package management that was achieved using Go, which is what the module files are written in.


So make the lock file a simple go program that just returns an array?

This approach actually adds tons of flexibility... not sure it’s needed, but you could return different lock data based on whatever logic you needed


> Dart does not let you have two versions of the same package in your application

Not as direct imports, but Rust allows transitive deps to get the version they specify.


I have misgivings about all this version pinning. At first, it seems to make things easier. Programs don't break because some imported package changed. So it looks like a win. At first.

Over time, though, version pinning builds up technical debt. You have software locked to old unmaintained versions of packages. If you later try to bring some package up to date, it can break the fragile lace of dependencies locked in through past version pin decisions.

Some version pinning systems let you import multiple versions of the same package into the same build to accommodate dependencies on different versions. That bloats executable code and may cause problems with multiple copies of the same data.

On the other side, once package users are using version pinning, it's easier for package maintainers to make changes that aren't backwards compatible. This makes it harder for package users to upgrade to the new package and catch up. Three years after you start version pinning, it's going to be a major effort to clean up the mess.

The Go crowd tries to avoid churn. Their language and libraries don't change too much. That's a good, practical decision. Go should reject version pinning as incompatible with the goals of Go.


This is horrible advice.

I work in a lot of languages (including Go), which gives me some perspective. On the extreme ends of this issue we have Maven, which specifies hard version numbers (fuzzing is possible but culturally taboo) and NPM, which only recently acquired the ability to lock version numbers and still has a strong culture of "always grab latest".

The Maven approach is unquestionably superior; code always builds every time. If you walk away from an NPM build for nine months, the chance that it will build (let alone work as before) is vanishingly small. I waste stupid amounts of time fighting with upstream dependencies rather than delivering business value.

People break library interfaces, that's just a fact of life. The only question is whether they will break your code when you explicitly decide to bump a version, or randomly when someone (possibly not even you) makes changes to other parts of the system.


> The Maven approach is unquestionably superior; code always builds every time.

If that's your goal. There's a middle ground which includes security updates: asking the community to follow semver. Rust devs seem to do it well and my crates build well with non-exact version numbers after not visiting them for a while. Not sure why the majority of Rust devs do a good job with this and JS devs don't. I suppose it's the strictness of the contract in the first place.

With Go, I've found popular library maintainers doing a good job at BC contract compat even when there has been only one blessed version. I don't assume giving them a bit more flexibility is going to hurt anything.


> There's a middle ground which includes security updates: asking the community to follow semver.

No, that doesn't work. People make mistakes, and you end up not being able to build software. It might work _most_ of the time, but when things break, it's of course always at the worst possible time.

I think Cargo got it right: use semver, but also have lock files, so you can build with _exactly_ the same dependencies later.


Sorry for the confusion, but that's what I meant. I was addressing the use-exact-versions-all-the-time argument. Maven goes too far in that it is taboo to do anything but use specific versions. Cargo, composer, etc do follow the proper approach. Maven w/ version numbers like "[1.2,1.3)" shouldn't be so taboo IMO. The "commit lock files for apps and don't for libs" is also practical.


Ah, OK, we're in agreement then :)


With strong typing, you can analyze the code and automatically increment the semver versions based on public API changes. Elm does this AFAIK. Reduces the amount of mistakes that are possible.
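
A rough sketch of that idea in Go, not Elm's actual implementation: the `API` map of exported names to signature strings and the `Bump` function are hypothetical, and a real tool would extract the signatures from the compiler rather than take them as strings.

```go
package main

import "fmt"

// API maps an exported name to its type signature (hypothetical representation).
type API map[string]string

// Bump reports which semver component should increase between two releases.
func Bump(prev, next API) string {
	for name, sig := range prev {
		nextSig, ok := next[name]
		if !ok || nextSig != sig {
			return "major" // something callers relied on was removed or changed
		}
	}
	for name := range next {
		if _, ok := prev[name]; !ok {
			return "minor" // purely additive change
		}
	}
	return "patch" // public surface is identical
}

func main() {
	v1 := API{"Parse": "func(string) (Doc, error)"}
	v2 := API{"Parse": "func(string) (Doc, error)", "Format": "func(Doc) string"}
	v3 := API{"Parse": "func([]byte) (Doc, error)"}
	fmt.Println(Bump(v1, v2)) // minor
	fmt.Println(Bump(v2, v3)) // major
	fmt.Println(Bump(v1, v1)) // patch
}
```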


> With strong typing, you can analyze the code and automatically increment the semver versions based on public API changes.

This is nice, but it should be noted that it does not absolutely prevent missing backward-incompatible changes; that a function has the same signature does not mean that it has backward-compatible behavior.

(With a rich enough strong, static type system, used well to start with, you might hope that all meaningful behavior changes would manifest as type changes as well, but I don't think it's clear that that would actually be the case even in a perfect world, and in any case, it's unlikely to be the case in real use even if it would be in the ideal case.)
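
A small Go illustration of the point, with made-up names: `SplitV1` and `SplitV2` stand in for two releases of the same hypothetical library function. Both have the type `func(string) []string`, so a signature-based check would report no change, yet callers see different results.

```go
package main

import (
	"fmt"
	"strings"
)

// SplitV1 is the hypothetical original release: it keeps empty fields.
func SplitV1(s string) []string {
	return strings.Split(s, ",")
}

// SplitV2 has the same signature but drops empty fields,
// a behavior change the type checker cannot see.
func SplitV2(s string) []string {
	var out []string
	for _, f := range strings.Split(s, ",") {
		if f != "" {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	in := "a,,b"
	fmt.Println(len(SplitV1(in)), len(SplitV2(in))) // 3 2
}
```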


Yup, elm pioneered this space. We're giving it a shot too https://github.com/rust-lang-nursery/rust-semverver


Rust actually uses lock files, just not for libraries.

I've been bit by this a few times when I've come back to code 2-3 months later and forgot to include the Cargo.lock so I pretty much always use the lock file these days.


Lock files are best of both worlds: you specify the latest at check in time and then freeze whatever was picked. No need to reinvent the world.


Yeah it's crazy that there's still so much controversy around this topic considering Node and Ruby have had amazing dependency management for over half a decade at this point. Dependency management in those languages is pretty much a solved problem and the fact that go isn't there yet drives me nuts since I have to work with it every day for my job.


> Node and Ruby have had amazing dependency management for over half a decade at this point.

Eh?

NPM didn't have package-lock.json until v5, released in 2017. Before then there was the optional shrinkwrap that nobody used, so builds were totally unreproducible.

Ruby at least had Gemfile.lock from early days. Unfortunately there have been so many compatibility problems with different versions of Ruby itself that someone needed to invent rvm, rbenv, and chruby. Getting every dependency to behave in the same Ruby version was sometimes an odyssey. Still, at least builds are reproducible... as long as you're running on the same OS/CPU arch (uh oh native code!)

Ruby is actually pretty alright given the constraints, but Node/NPM is the canonical example of how NOT to do dependency management, and they're still trying to figure it out here in 2018.


In my experience NPM shows exactly how to build a package manager. They are slowly fixing the problems one by one but half a decade ago NPM was terrible compared to what it is today.


Well, the ways Node and Ruby "solve" this are almost diametrically opposed, so I don't think it makes sense to call this a solved problem.


Indeed this thread has caused me to retreat back into my Ruby hole. Seems like any contemporary solutions should be at least as good as a Gemfile/.lock.


Bundler has been out since 2008. Just saying.


Node breaks on me all the time.


The picture becomes more complicated when you consider a hierarchy of packages, because presumably the packages you depend on would have their own lock files, which represent the versions those packages have been tested with.

npm will pick the latest version of the dependencies compatible with the package.json of the packages (package-lock.json is not published as part of the package). This means that with npm we may end up using a version of a transitive dependency which is much later than the one that your direct dependency was tested with.

The proposal for vgo will take the minimum version compatible with the packages, and hence pick a version which is closer to the actually tested one.


This proposed “minimum version” behavior will effectively prevent “automated” security updates, which is what most reasonable people expect from their package manager.

Consider all tens of thousands of CVE bugs found in widely used image, video, XML, etc. libraries. Even “updated” software is often compromised via outdated libraries.

With a “min version” approach, none of those will get patched without explicit action by a developer, who won’t do that because he doesn’t even know about the bug unless he reads the thousands of messages each day on Bugtraq.

If anything, history has shown that we should be focusing package management on getting security fixes to the end user as soon as possible.

This proposal is bad for security.


Note that even currently, lockfiles have the effect of locking you to an old version of the software. You could choose to automatically update all dependencies to latest in CI, but then by that token you can run vgo get -u, to get the latest versions here as well.

One of the issues which the vgo developers point out is that the "latest" behaviour has the perverse effect that a module A may declare a dependency of version 1.1 of a module B, and may have never been even tested on that version because the package manager always picked the latest.

In some sense, the vgo approach is saying that across the hierarchy of module owners, each one should keep upgrading their manifest to the latest versions of their direct dependencies, and ensure that things keep working, rather than relying on the topmost module to pull in the latest of the entire tree of dependencies. This seems to be good for the entire ecosystem.


> each one should keep upgrading their manifest to the latest versions of their direct dependencies, and ensure that things keep working

But that’s the problem. You’re relying on N people, including the package maintainers and end developers to all take timely action in order to get a security fix to the end user.

That simply won’t happen.

What should happen is a single package maintainer fixes a vulnerability, and that fix automatically flows through the system to all places where that package is used. And insecure versions should be made unavailable or throw a critical “I won’t build” error.

Perhaps some way of marking versions as security critical might help, but the proposed approach will leave tons of vulnerable libraries in the wild.

All the current package managers for other languages have this issue to some degree. Golang should do better with knowledge of those mistakes.


The PHP package manager, Composer, uses lock files only for applications, not for libraries. The developer makes sure that everything works and then commits the lock file.


> People break library interfaces, that's just a fact of life.

Sure, but in my experience it happens at least 10x as frequently in NPM as with Go. It's really common for me to update all my Go dependencies and everything just continues to work. With NPM I have to start looking for migration documentation and change a bunch of my code.


For me, it was difficult that I couldn't have reproducible builds. I would kick off a deploy and it would one day break due to some dependency (incorrectly) making a breaking change. It's important to keep packages up to date, but I believe it should be a conscious decision of the developer.


I also think reproducible builds are very important. Vendoring has solved this for me in Go even before it was officially supported.

My main fear is that when people get used to the idea that their master repo doesn't have to have a backwards compatible interface, then updating dependencies will look awfully similar to NPM where authors change interfaces based on their weekly mood.


> I waste stupid amounts of time fighting with upstream dependencies rather than delivering business value.

Have you paid the upstream developers? Because you sound like you did.


Maven doesn't specify hard version numbers - the default meaning of "1.0" is >= 1.0, and some packages specify dependencies without a version at all. It's quite possible to get updates of transitive dependencies without realising it (especially since most tooling does not encourage pinning any dependency other than the superficial ones).


We do reject version pinning. There is no way to pin a particular version. The only constraint you can express is ">= this specific version". If nothing else pushes it forward, though, you'll keep getting that specific version. But it's not pinning, it's just stating a minimum, and the system uses the single oldest version that satisfies all the stated minimums (the max of the mins).


That seems like it only works well if library authors are both extremely well behaved and also don't make mistakes, and there's no way that will be universally true. Every time I've had to pin a version it's because a backwards incompatibility issue or more likely just a bug is, intentionally or not, added to a library.

It's going to be much more painful if people have to fork when a bug is introduced lest they have to wait a week or more to get a bugfix upstreamed.


I think you misunderstood what was said. In this proposal, each of your modules specifies a minimum version, but when code is built it tries to use the minimal version that meets those requirements.

Eg I might have my module that says:

    "some/pkg" v1.4.1
Which means I need at least version 1.4.1.

When the code builds it will try to use 1.4.1 EVEN IF 1.4.2 EXISTS, unless it is forced to use 1.4.2 by another module you depend on. That is, say you are using module X and it says:

    "some/pkg" v1.4.2
At this point 1.4.1 cannot be used - module X, which you are using, won't work with 1.4.1 - so pinning doesn't help. Your code will not build unless you manually update your pinned version.

What the minimal version selection does is say "okay one module needs >=1.4.1, another needs >=1.4.2. What is the minimal version that satisfies these requirements?" And the answer to that is `1.4.2`, so even if 1.4.8 exists, 1.4.2 is used in that scenario.

I don't know how this will work long term. I think there are definitely some concerns to consider (eg many minor version bumps are for security fixes), but the scenario you are describing - the newer version causing issues - just isn't an issue as I understand it because your code won't opt to use a newer version unless you (or another module you use) explicitly tell it to.
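
A minimal sketch of that selection rule in Go, assuming a simplified model rather than vgo's actual code: the `Version`, `Req`, and `Select` names are hypothetical, versions are reduced to a minor/patch pair within v1, and the requirement graph is passed in as a map instead of being read from go.mod files.

```go
package main

import "fmt"

// Version is a simplified (minor, patch) pair within one major version.
type Version struct{ Minor, Patch int }

func (v Version) Less(w Version) bool {
	if v.Minor != w.Minor {
		return v.Minor < w.Minor
	}
	return v.Patch < w.Patch
}

func (v Version) String() string { return fmt.Sprintf("v1.%d.%d", v.Minor, v.Patch) }

// Req says "module Path, at least version Min".
type Req struct {
	Path string
	Min  Version
}

// Select walks the requirement graph starting from the root's requirements and
// keeps, for each module, the highest stated minimum. Versions that nobody
// asked for are never considered. reqs maps "path@version" to that module
// version's own requirements.
func Select(root []Req, reqs map[string][]Req) map[string]Version {
	chosen := map[string]Version{}
	seen := map[string]bool{}
	work := append([]Req(nil), root...)
	for len(work) > 0 {
		r := work[len(work)-1]
		work = work[:len(work)-1]
		key := r.Path + "@" + r.Min.String()
		if seen[key] {
			continue
		}
		seen[key] = true
		if cur, ok := chosen[r.Path]; !ok || cur.Less(r.Min) {
			chosen[r.Path] = r.Min
		}
		work = append(work, reqs[key]...)
	}
	return chosen
}

func main() {
	root := []Req{
		{"some/pkg", Version{4, 1}},
		{"module/x", Version{0, 0}},
	}
	reqs := map[string][]Req{
		"module/x@v1.0.0": {{"some/pkg", Version{4, 2}}},
	}
	fmt.Println(Select(root, reqs)["some/pkg"]) // v1.4.2, even if v1.4.8 has been published
}
```

Note that the list of published versions is not an input at all here; only stated minimums are, which is why a new upstream release cannot change the result.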


Gotcha, so it always chooses the minimum possible version, I was conflating multiple pieces from this thread. That would solve this possible issue at least, you're correct.


It sounds like you won't see the bug unless you commit a change to your own project's go.mod to change a version number. Assuming you test your changes before committing, your repo can never be broken by someone's commit to a different repo.

On the other hand, whenever you edit a direct dependency's minimum version, it could upgrade any other dependency via a cascade of minimum version upgrades. (In which case, don't commit it.) This keeps your team moving, but it might make dependency upgrades harder.

For example, it's up to each repo's owners (for applications, at least) to notice when there's a security patch to one of its transitive dependencies, and generate and test a commit. Someone will need to write a tool to do this automatically.

If the test fails: now what? I guess that's when you need to look into forking.


If you have no way to do pinning, how do you e.g. reproduce a bug or issue with a specific set of older dependencies?


You can specify that a given dependency be replaced by another one. That only applies at the top level though, not when the go.mod file with the replace clause is used by another module.


The ability to depend on library versions that do not exist is a misfeature. It should not be possible for someone to build a new version of their software and cause your software to cease building or running.

This doesn't just result in non-reproducible builds, but it results in them at unpredictable times and, if you have servicing branches of your code, backward through time. This is not a good property if you need to know what you are building today is the same as what you built yesterday, modulo intentional changes, or even that it will build or run.


It is difficult to live without version pinning in commercial projects. Imagine if you get an urgent task that needs to be completed today, but then you find out that the dependencies were updated, your project doesn't build anymore and you need to fix that first.


> Over time, though, version pinning builds up technical debt.

Version pinning does not. Misusing it might, but don't do that.

Version pinning of dependencies is a tool for assuring the behavior of releases, but in development you should generally be updating to the latest stable version of dependencies and then pinning them for the next release, not keeping the old pinned versions.


> Misusing it might, but don't do that.

Isn't technical debt usually just the situation of having many things that should be done, but aren't? In other words, the accumulation of "don't do that" cases in code.


> Isn't technical debt usually just the situation of having many things that should be done, but aren't?

Sure, the things that you shouldn't do are the things that lead to technical debt. What I'm saying is that "version pinning to assure consistent behavior in stable releases" is not one of those things.

What is one of those things is "leaving pins from the last release in place in development".

Version pinning in releases is a means of providing stability for downstream users. It should be encouraged.

Leaving those pins in place while developing the next version is a source of technical debt. It should be discouraged.


Technical debt is definitely something to worry about, however I think most companies' first priority is a working product.

Technical debt is to be handled during the development cycle, not in source code and most of the issues you bring up (bloated software, version pinning) are solved in build systems and again should not be handled by source code.

Anyone working in an enterprise environment, especially one that has clients on different systems, would instantly crumble without being able to target specific builds.

Legacy and long term support software is everything.

>Go should reject version pinning as incompatible with the goals of Go.

This would instantly make go a 'no-go' in any enterprise environment.


Don't think of it as /version/ pinning. Think of it as Interface / API pinning.

A //public library// should have a well thought out interface that works at least for one project and preferably for several different projects.

If the interface needs to change I agree with the recommended semantics of planning what that change should look like and creating a new named interface (API) for that specification.

Since go expects everything to build against the 'latest' version of an interface, it should be obvious what's wrong if a stale cached version is in use. Thus /expanding/ an interface / API is permissible. They should be thought of as /write only/.


Being able to create consistent builds seems more important to me than being forced to fix builds because random packages changed their API.

You're right that it is technical debt to not be on the latest version, but if I need to roll back to a previous version of my application due to a regression, I want to be 100% sure I get the exact same state as I did before. If my dependency management software doesn't solve this, it's not good enough.


Just because you don't need reproducible builds doesn't mean nobody needs them.

As soon as some tool is used to provide reproducible builds, you have effective version pinning.

If your organization values avoiding technical debt accumulation over reproducible builds, they don't have to use version pinning and can always run 'go get' with the '-u' flag.

> once package users are using version pinning, it's easier for package maintainers to make changes that aren't backwards compatible.

I don't think the ease of making backwards incompatible changes is affected by vgo. Both before and after vgo, backwards incompatible changes must go in a new major version / new import path. The ability to pin major versions already exists.


Although no one talks about it you don't have to use go get. I don't know if that's bad or not, but at least two projects are doing it.


> Over time, though, version pinning builds up technical debt. You have software locked to old unmaintained versions of packages.

If you are talking about open source, it is not a problem: anyone can upgrade dependencies and make a pull request. Without version pinning, the project will always be in a "broken" state.

For example, recently I tried to run Firefox 3.5 on linux and it crashed with modern versions of gobject libraries although they are supposed to be backwards-compatible. I wish Debian package manager allowed to install an old version of a library specially for this application.


I am pretty sure this is exactly the problem semantic versioning is trying to solve. You only pin to the major version, which is supposed to ensure no breaking change occurs.


But it just delays the problem and introduces new ones...

Patch changes, don't care, minor changes, don't care, major changes, you're screwed [maybe].

https://www.youtube.com/watch?v=oyLBGkS5ICk (https://github.com/matthiasn/talk-transcripts/blob/master/Hi...)


personally this is the main reason I have avoided go from the beginning. without being able to specify a version of the library I want to use, who's to say that the next time I try to run my program it won't use a newer version of a library and not run. If that happens there was no easy way around it other than digging into the code and fixing the problem.


Since Go is compiled and statically linked there's no way that library version can change between runs of your program. It can change between compilations of your program though, the standard go solution to that is vendoring code which will guarantee that it's always the same code getting compiled. This is not without its issues, but I find it to be a pretty usable system, just smart enough to get what I need done. To be fair though, vendoring didn't exist in the beginning of go, so this was a perfectly good reason to avoid it then.


Go stopped being statically linked a couple of versions ago.

Yes, by default it uses static linking, however generating dynamically linked binaries and libraries is also an option.


Damn it. Just when dep seemed like it was going to finally end the horrible nightmare that is go dependency management, this comes along. There's this great survey of all the progress that had been made and then "but we're doing it in this bizarre terrible way because..."


this sentence

> First, the meaning of “newest allowed version” can change due to external events, namely new versions being published.

is the root of what's wrong with this proposal - namely conflating time of dependency resolution with the time of build. cargo (using it because it's been referenced by the post) solves this problem by telling you to check in the cargo.lock that it generates whenever you 'cargo update'. this means that your version control preserves the build dependencies, just as it preserves build recipes (e.g. build.rs or travis or whatever). in this world, there's no 'newest allowed version', because there is only one version.

wise men before me said 'as simple as possible, but no simpler' and I'm afraid this proposal is too simple.


>End of GOPATH

Yes yes yes yes finally! This has been the #1 reason I've avoided Go for years and even my recent forays back into the language were only after I figured out some hacks that I could use to avoid it.


I've been programming in Go for a couple of years now and found that this need to use "hacks ... to avoid it" is totally naive and irrational.

Agreed, GOPATH is awkward--at first. Because it's different from what most programmers are used to. But once it's adopted it actually makes a lot of sense, and looking back it seems bizarre to ever have such a strong desire to avoid it at all costs--and the costs are high.

The same applies for Gofmt as well.


I should add that another big part of the reason I've been avoiding the language for years is the cult of Golang telling me my complaints are "totally naive and irrational".


I've been working with Go for 6 years, your complaints are not "totally naive and irrational".

I don't mind GOPATH, but I did argue that if vendoring was to be a thing then it needed to be dropped entirely. I was also not a fan of the vendoring mechanism.

The approach given here looks much more thought out getting rid of each, first proposal I've actually liked.


This kind of attitude around package management and especially dependency injection have been major turn-offs for me as well. More than a sufficient number of the community has been emitting a high-pressure stream of NIH combined with ignorance of existing solutions, and it has been crippling the community for years.


Dependency injection is very commonly used in the Go ecosystem. Do not confuse resistance about unidiomatic DI frameworks with resistance about DI itself.


Which ones are common? I've been hoping this would be the case, but I've been unable to find evidence for it.

e.g. the most-popular one I can see so far is Facebook's "inject", which has a remarkably small 841 stars: https://github.com/facebookgo/inject

Even Glide, Godep, and Dep have several thousand. And those can barely claim "most", much less "very common".


What do you mean which ones? I am talking about dependency injection as a technique/pattern. In Go you can just do it without importing anything.

Please read:

https://medium.com/@benbjohnson/standard-package-layout-7cdb...

(I've linked to the exact point in the article where it talks about DI but you might want to read the rest as well.)
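
For readers who do not follow the link, a minimal sketch of hand-written dependency injection in Go. The `UserStore`, `Greeter`, and `mapStore` names are made up for illustration and are not taken from the linked article; the only import is `fmt`, used for errors and printing.

```go
package main

import "fmt"

// UserStore is the dependency, expressed as an interface.
type UserStore interface {
	Name(id int) (string, error)
}

// Greeter receives its UserStore from whoever constructs it.
type Greeter struct {
	store UserStore
}

// NewGreeter is the injection point: callers pass in any UserStore.
func NewGreeter(s UserStore) *Greeter { return &Greeter{store: s} }

func (g *Greeter) Greet(id int) (string, error) {
	name, err := g.store.Name(id)
	if err != nil {
		return "", err
	}
	return "hello, " + name, nil
}

// mapStore is an in-memory implementation, the kind a test would inject.
type mapStore map[int]string

func (m mapStore) Name(id int) (string, error) {
	name, ok := m[id]
	if !ok {
		return "", fmt.Errorf("no user %d", id)
	}
	return name, nil
}

func main() {
	g := NewGreeter(mapStore{1: "gopher"})
	msg, err := g.Greet(1)
	fmt.Println(msg, err) // hello, gopher <nil>
}
```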


Yeah, it's widespread as a hand-coded thing. It has to be - you essentially have no choice in Go for even basic things like tests.

But that just means that everyone rolls their own thing by hand, with widely-varying feature-sets, method names, bugs, etc. DI frameworks are very much nicer for interop between a bunch of libs.


gofmt, sure. Consistency in style is great.

But GOPATH was a mistake in the same way Go's hitherto lack of versioning was: it deliberately breaks easy isolation between projects / dependency management.

This mindset only works if you control most of the ecosystem (such as with a monorepo), but it's an irritating mess otherwise.


But you do control all the source code within your GOPATH! It's right there on your file system.

There's also a mechanism to take control of the same code even when it's outside your GOPATH. Github calls this mechanism a "pull request".


It's really not irrational to want to organize my projects in a way other than by their (current) git repo URL. It's also not irrational to not want disparate projects to share the same pool of dependencies (which even go has begrudgingly accepted as evidenced by "dep" and this proposal). Go is alright but I refuse to drink the GOPATH Kool-Aid.


It’s totally reasonable to want to organize projects in a different way, but I think GOPATH’s caveats are overstated.

- It’s an environment variable, which makes it relatively easy to modify. To switch your dependencies from one pool to another, you can modify your GOPATH.

- The vendor directory lets you pin your dependencies per-project without modifying GOPATH. It’s not the most beautiful solution (and Go doesn’t often offer the most beautiful solution to any particular thing), but it’s workable.


GOPATH was weird to get used to at first, but I gave in and I actually store all my projects, even non-go projects, in the same folder hierarchy now. It has made me much more organized. In fact here's a zsh function I have for cloning new projects into that structure. It makes my home directory much less of a mess.

  gclone() {
    # Map a clone URL (https://host/path.git or git@host:path.git) to host/path.
    local dir
    dir=$(echo "$1" | sed 's/^http\(s*\):\/\///g' | sed 's/^git@//g' | sed 's/\.git$//g' | sed 's/:/\//g')
    git clone "$1" "$HOME/src/$dir" && cd "$HOME/src/$dir"
  }


Because you didn't want to set an env var?


The problem is not the env var, because whether or not it was set or with a default value, you have to clone the project https://github.com/foo/bar.git at ~/go/src/github.com/foo/bar and name the imports github.com/foo/bar accordingly and run the go commands from there. I used to (automatically) just export GOPATH="${PWD}/.gopath" but this doesn't work anymore since 1.8 or 1.9.

People† at large just want to clone https://github.com/foo/bar.git wherever they see fit, like directly at ~/Workspace/bar, ~/work/bar, ~/projects/contrib/bar or even /tmp/bar.

† "People" includes CI systems that expect the typical "clone and cd straight into" thing to work, resulting in a lot of boilerplate and/or symlink hacks to work around such expectations.


> Because you didn't want to set an env var?

You haven't needed to set a GOPATH since 1.8, which was released over a year ago (we're now at 1.10). Since 1.8, the Go toolchain will use a default GOPATH; the environment variable is only needed as an override.


GOPATH has a default. Unix: $HOME/go Windows: %USERPROFILE%\go

but on CI servers it could be awkward.


For what reasons did you need to avoid it?


Some people want to store their projects in their own folder structure and not be forced to store it in the very weird way Go does it. I didn't avoid Go for this reason but I did find the whole Go workspace way it does things a big turnoff.


You say that because you have infinite filesystem space. I don't.


Though GOPATH avoids duplication, it's entirely possible to avoid duplication without GOPATH.


And downloading the full Git repository for every `go get` is frequently a far greater waste than even storing copies of all the versions you're currently using in all projects you work on.

GOPATH is unambiguously terrible.


you're right.


Glad to see this! I'm still digesting the details, but to comment on https://research.swtch.com/cargo-newest.html

Cargo does not use the latest version. Saying

  toml = "0.4.1"
is the same as saying

  toml = "^0.4.1"
NOT saying

  toml = "=0.4.1"
which is what rsc would guess.

This decision was made because ^ is the most reasonable default; over-use of `=` destroys Cargo's ability to help you upgrade, given that you basically asked it to not think about it at all. Further, it would significantly bloat binaries, as this would mean many more duplicate versions of packages in your project. Basically, idiomatic use of = is only in very specific situations.

We could have required that you always specify some operator, but we also like reasonable defaults. Writing `^` everywhere is just more typing.

The transitive dependency issue isn't any different, really: that's due to said packages also not using =.


Hi. As I said in that page, I do know that cargo is working as designed, and that "0.4.1" is the same as "^0.4.1". My point is maybe a little more subtle, that cargo takes the newest allowed under the constraints, so given "^0.4.1", it has a choice between 0.4.1, 0.4.2, 0.4.3, 0.4.4, and 0.4.5, and it takes the last.

When you're adding a new direct dependency, taking the latest is almost certainly right. But when you're pulling in a new transitive dependency, I think taking the last is problematic: it seems much safer to me to take the one closest to what the author of that code tested with.

I'll elaborate more on this in a post tomorrow.

For what it's worth, I think I do understand what cargo is doing, and why, and I completely respect that approach. I just think this other approach might have some different properties worth exploring. As I've been working on this I've been using cargo as the gold standard, really. If the end of this all is that people say Go's package management is as nice as cargo, then I will be very very happy. A long way to go yet for sure.
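
To make the difference concrete, here is a rough Go sketch of the two policies applied to the same constraint. It is not Cargo's or vgo's code: `CaretAllows` encodes the caret rule as Cargo documents it for fully specified versions (pre-release tags are ignored), and `Version` and `Pick` are hypothetical helpers.

```go
package main

import "fmt"

// Version is a plain major.minor.patch triple.
type Version struct{ Major, Minor, Patch int }

func (v Version) String() string { return fmt.Sprintf("%d.%d.%d", v.Major, v.Minor, v.Patch) }

func (v Version) Less(w Version) bool {
	if v.Major != w.Major {
		return v.Major < w.Major
	}
	if v.Minor != w.Minor {
		return v.Minor < w.Minor
	}
	return v.Patch < w.Patch
}

// CaretAllows reports whether v satisfies "^base": at least base, and no
// change to the leftmost non-zero component of base.
func CaretAllows(base, v Version) bool {
	if v.Less(base) {
		return false
	}
	switch {
	case base.Major > 0:
		return v.Major == base.Major
	case base.Minor > 0:
		return v.Major == 0 && v.Minor == base.Minor
	default:
		return v == base
	}
}

// Pick returns the oldest and newest published versions allowed by "^base".
func Pick(base Version, published []Version) (oldest, newest Version, ok bool) {
	for _, v := range published {
		if !CaretAllows(base, v) {
			continue
		}
		if !ok || v.Less(oldest) {
			oldest = v
		}
		if !ok || newest.Less(v) {
			newest = v
		}
		ok = true
	}
	return oldest, newest, ok
}

func main() {
	published := []Version{{0, 4, 1}, {0, 4, 2}, {0, 4, 3}, {0, 4, 4}, {0, 4, 5}, {0, 5, 0}}
	oldest, newest, _ := Pick(Version{0, 4, 1}, published)
	fmt.Println(oldest, newest) // 0.4.1 0.4.5
}
```

Both answers satisfy the constraint; which one the tool returns is the policy question under discussion.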


I think your wording there is incorrect and very misleading, even if you do understand it. You say

> Cargo selects the latest version of a new dependency and its own new dependencies

but that's not true; it selects the "latest semver-compatible minor version", which is a pretty different thing. The way you've phrased it makes it seem like cargo just flat out selects the newest version period (which can cause breaking changes and a whole lot of pain).

So of course "people are frequently surprised", your actual statement is incorrect and misleading. "Blindly updating to new versions" is a completely different (and scary) proposition from "updating to semver compatible minor versions". The latter still has its issues (I think the maximally minimal approach that vgo is taking is a pretty neat solution to this and some other issues) but it's a much more contained set of issues.


> but that's not true; it selects the "latest semver-compatible minor version", which is a pretty different thing.

No, it's exactly the same thing. Russ already established that an incompatible module is not just a later version but, in fact, a totally different thing.


To be clear, I don't think you don't understand it; I'm afraid that the way that you've worded it means that others won't understand it.

I also very much don't think you should cargo cult! As munificent said above, this is a Hard Problem, and there are a lot of options. In some senses, I'm glad you're not just copying things, as well, that's how we all learn.


I think if you aren't going to simply "ignore versions", like we do in rebar3 and took from maven (we used cargo and maven/leiningen as the gold standards when working on rebar3) then taking the latest makes more sense. A patch version must be important enough to release instead of waiting for the next minor release, which I think says something about taking the latest.

But that is only if versions are used this way at all, I've found, at least in the Erlang world, the model of taking the first "version" found in the dep tree and locking to it to work well. In that case a patch version is not considered special anyway.


> I think taking the last is problematic: it seems much safer to me to take the one closest to what the author of that code tested with.

If someone doesn't respect semantic versioning the difference of one or a few minor versions is irrelevant.

> For what it's worth, I think I do understand what cargo is doing, and why, and I completely respect that approach.

I think understanding SemVer would help a long way.


I get down voted but look at this https://go-review.googlesource.com/c/vgo/+/95700/4/vendor/cm...

hard coding v1 and v2 check, anyone who understands semver would never make such a mistake!


`=` deps also are inherently incompatible with libraries, especially in languages like go where you can only have one version of a library in your dependency graph at a time. e.g. if my library depends on foo =0.4.1 and I try to bring in a library that needs foo >=0.4.2 I can't compile my program. If I directly depend on foo I can override it, if it's nested I can't reach into my dependencies without an override mechanism (which I suspect the go devs will want to avoid to keep the system simple).

It's also worth noting that cargo uses a lock file. toml will only get new versions when you add/remove/update a package in your cargo. Normal builds will all use your Cargo.lock.


> Minimal Version Selection

Means no security fixes at the price of, well, minor developer inconvenience? What is the inconvenience, exactly?

> ...tomorrow the same sequence of commands you ran today would produce a different result.

I mean, I guess this is technically true. But seems like it shouldn't be an issue in practice as the API you're calling shouldn't have changed, just the implementation. If it has changed, then downgrade the dependency?


As a long-time maintainer of various packaging systems (and co-author of one):

I find myself wholly in agreement with the idea that maximal version selection is appropriate for operating systems and general software components, but not necessarily desirable for a build system.

When you consider the evaluation of a dependency graph in the context of a SAT solver, you realize that the solver would consider both a minimal and maximal version as potentially satisfying the general constraints found in a dependency graph. Whether you then optimize the solution for a minimal or maximal version becomes a matter of policy, not correctness.

The security concern can be appropriately addressed by increasing the minimum version required in the appropriate places.

With all of that said, I think that Go's potential decision to optimize for minimal version selection is likely to be considered a bug by many because it will not be the "accepted" behavior that most are accustomed to. In particular, I can already imagine a large number of developers adding a fix, expecting others to automatically pick it up in the next build, and being unpleasantly surprised when that doesn't happen.

This is an interesting experiment at the least, but I hope they make the optimizing choice for minimal/maximal version configurable (although I doubt they will).


> When you consider the evaluation of a dependency graph in the context of a SAT solver, you realize that the solver would consider both a minimal and maximal version as potentially satisfying the general constraints found in a dependency graph. Whether you then optimize the solution for a minimal or maximal version becomes a matter of policy, not correctness.

I believe rsc is hoping they can avoid the need for a SAT solver entirely by going with minimum versions (and, implied, not allowing a newer package to be published with a lower version number).


Whether you use a SAT solver or not doesn't really matter for the purposes of this discussion -- the SAT solver is just a tool that can be used to find a solution given a set of logical statements.

My point really was this: both a maximal and minimal version can exist that satisfy a minimum-version (>=) bound dependency. In such a case, which one is chosen is a matter of policy, not correctness.
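To make the policy-versus-correctness point concrete, here is a small sketch. The `pick` function and its `preferMax` flag are hypothetical names, and the version list is made up. Both answers satisfy the same lower bound; the flag only decides which valid answer is returned.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parse splits a version such as "1.3.2" into major, minor, patch.
func parse(v string) [3]int {
	var out [3]int
	for i, p := range strings.SplitN(strings.TrimPrefix(v, "v"), ".", 3) {
		out[i], _ = strconv.Atoi(p)
	}
	return out
}

// compare returns -1, 0, or 1 depending on how a orders against b.
func compare(a, b string) int {
	pa, pb := parse(a), parse(b)
	for i := range pa {
		if pa[i] < pb[i] {
			return -1
		}
		if pa[i] > pb[i] {
			return 1
		}
	}
	return 0
}

// pick returns a published version that is >= lower. Every such version is a
// correct answer; preferMax only chooses between the lowest and the highest.
func pick(published []string, lower string, preferMax bool) (string, bool) {
	best, found := "", false
	for _, v := range published {
		if compare(v, lower) < 0 {
			continue // does not satisfy the bound
		}
		switch {
		case !found:
			best, found = v, true
		case preferMax && compare(v, best) > 0:
			best = v
		case !preferMax && compare(v, best) < 0:
			best = v
		}
	}
	return best, found
}

func main() {
	published := []string{"1.0.0", "1.1.0", "1.2.0", "1.3.2"}

	lowest, _ := pick(published, "1.1.0", false)
	highest, _ := pick(published, "1.1.0", true)
	fmt.Println(lowest, highest) // prints 1.1.0 1.3.2
}
```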

As for not allowing a newer package to be published with a lower version number, that is sometimes necessary in a back-publishing scenario. For example, imagine that you publish a newer package with a higher version that has breaking changes and a security fix, and you also publish a newer package of an older version that just has the security fix. It's entirely valid to do so, for what should be obvious reasons.


Actually, I think the SAT solver is avoided by making the only version constraints of the form >=.

Using the min version appears to eschew the need for lock files. Want to upgrade? Bump your min version.


> Using the min version appears to eschew the need for lock files.

This only works if the system also prevents you from publishing a version of a package with a lower number than any previously-published version of that package.

So, for example, after you've shipped foo 1.3.0, if a critical security issue is found in foo 1.2.0, you can't ship foo 1.2.1. Instead, all of your users have to deal with revving all the way up to foo 1.3.1 where you're allowed to publish a fix.

It's not clear to me why they're trying so hard to avoid lockfiles. Lockfiles are great.


You CAN ship v1.2.1 after v1.3.0 is live. I have tested this with the vgo prototype and it works fine (see github.com/joncalhoun/vgo_main):

    $ vgo list -m -u
    MODULE                          VERSION                    LATEST
    github.com/joncalhoun/vgo_main  -                          -
    github.com/joncalhoun/vgo_demo  v1.0.1 (2018-02-20 18:26)  v1.1.0 (2018-02-20 18:25)
Notice that v1.0.1 was released AFTER v1.1.0

What the minimum version is doing is giving our code a way to automatically resolve upgrades if they are necessary. E.g. if module X requires module Z with a version >= 1.0.1, while module Y requires Z with a version >= 1.1.0, we clearly CANNOT use v1.0.1, as it won't satisfy the requirements of Y, but we CAN use v1.1.0 because it satisfies both.

The "minimum" stuff basically means that even if a version 1.3.2 of Z is available, our code will still use v1.1.0 because this is the minimal version to satisfy our needs. You can still upgrade Z with vgo, or if you upgrade a module X and it now needs a newer version of Z vgo will automatically upgrade in that case (but to the minimal version that X needs), but random upgrades to new versions don't just occur between builds.
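The X/Y/Z example above boils down to taking the highest of the stated minimums. A minimal sketch, assuming a hypothetical `maxOfMinimums` helper and a simplified three-part version comparison (this is not vgo's code): note that the list of published versions, including the v1.3.2 mentioned above, is never consulted.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parse splits a version such as "v1.1.0" into major, minor, patch.
func parse(v string) [3]int {
	var out [3]int
	for i, p := range strings.SplitN(strings.TrimPrefix(v, "v"), ".", 3) {
		out[i], _ = strconv.Atoi(p)
	}
	return out
}

// compare returns -1, 0, or 1 depending on how a orders against b.
func compare(a, b string) int {
	pa, pb := parse(a), parse(b)
	for i := range pa {
		if pa[i] < pb[i] {
			return -1
		}
		if pa[i] > pb[i] {
			return 1
		}
	}
	return 0
}

// maxOfMinimums returns the highest of the minimum versions that the
// modules in a build ask for. Newer published versions play no role.
func maxOfMinimums(mins ...string) string {
	best := ""
	for _, v := range mins {
		if best == "" || compare(v, best) > 0 {
			best = v
		}
	}
	return best
}

func main() {
	// X requires Z >= v1.0.1, Y requires Z >= v1.1.0.
	fmt.Println(maxOfMinimums("v1.0.1", "v1.1.0")) // prints v1.1.0
}
```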


What happens if:

1. I depend on foo with constraint ">1.5.0". The current minimum version of foo that meets that is 1.7.0.

2. Later, foo 1.6.0 is published.

3. I run go get.

If I understand the proposal correctly, that go get will now spontaneously downgrade me to foo 1.6.0. That defies the claim that builds are always reproducible.
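The scenario can be sketched as follows, under the assumption stated in the comment: a resolver that takes a strict lower bound and picks the lowest published version above it. The `lowestAbove` function is hypothetical and models that reading only; it is not a description of what vgo does.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parse splits a version such as "1.6.0" into major, minor, patch.
func parse(v string) [3]int {
	var out [3]int
	for i, p := range strings.SplitN(strings.TrimPrefix(v, "v"), ".", 3) {
		out[i], _ = strconv.Atoi(p)
	}
	return out
}

// compare returns -1, 0, or 1 depending on how a orders against b.
func compare(a, b string) int {
	pa, pb := parse(a), parse(b)
	for i := range pa {
		if pa[i] < pb[i] {
			return -1
		}
		if pa[i] > pb[i] {
			return 1
		}
	}
	return 0
}

// lowestAbove returns the lowest published version strictly greater than bound.
func lowestAbove(published []string, bound string) (string, bool) {
	best, found := "", false
	for _, v := range published {
		if compare(v, bound) <= 0 {
			continue
		}
		if !found || compare(v, best) < 0 {
			best, found = v, true
		}
	}
	return best, found
}

func main() {
	published := []string{"1.5.0", "1.7.0"}
	before, _ := lowestAbove(published, "1.5.0")

	// Later, foo 1.6.0 is back-published.
	published = append(published, "1.6.0")
	after, _ := lowestAbove(published, "1.5.0")

	fmt.Println(before, after) // prints 1.7.0 1.6.0
}
```

If the constraint instead names a version that actually exists (for example `>=1.7.0`), the lowest satisfying version stays the same no matter what is published later.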


So, I think you're right... but this is only a flaw if you as a user specify a lower bound that does not exist. The tool won't do this. And it can be prevented by disallowing referring to versions that don't exist.

It's entirely valid (and interesting! I hadn't thought of this one), but I'm not sure if this would happen even once IRL, except for people trying to break the system. Which can be fun, but isn't a risk.


My experience from maintaining a package manager and trying to keep the ecosystem healthy, which mirrors my experience on lots of other systems with many users, is that anything your system allows people to do will be done at some point.


heh, good point :)

as always, of course there's a relevant xkcd: https://xkcd.com/1172/


If your module requires at least version 1.3.0 of a dependency (i.e. 1.3.0 is the minimum version that satisfies the constraint) then it doesn't matter if a new version appears upstream. 1.3.0 is always the minimum version that satisfies >=1.3.0. That is, unless version 1.3.0 disappears from upstream.

If there is only one constraint against a dependency, then it behaves exactly as version locking.

The "maximum of the minimums" rule kicks in when the same dependency appears more than once in the dependency graph, because the constraints might be different.

vgo won't fail and say "incompatible versions". It will just resolve to the biggest of the lower limits. It's up to the build and test system to judge if the combination works.
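A sketch of how the "maximum of the minimums" rule could be applied across a whole dependency graph. The `module` type, the `buildList` function, and the example graph are all hypothetical; this follows the rule as described in this thread and is not taken from vgo's source. Resolution never fails: every module simply ends up at the highest version that anything reachable asks for.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// module identifies one specific version of one module.
type module struct{ name, version string }

// parse splits a version such as "v1.1.0" into major, minor, patch.
func parse(v string) [3]int {
	var out [3]int
	for i, p := range strings.SplitN(strings.TrimPrefix(v, "v"), ".", 3) {
		out[i], _ = strconv.Atoi(p)
	}
	return out
}

// compare returns -1, 0, or 1 depending on how a orders against b.
func compare(a, b string) int {
	pa, pb := parse(a), parse(b)
	for i := range pa {
		if pa[i] < pb[i] {
			return -1
		}
		if pa[i] > pb[i] {
			return 1
		}
	}
	return 0
}

// buildList walks every module version reachable from root and keeps, for
// each module name, the highest version that anything in the graph requires.
func buildList(root module, reqs map[module][]module) map[string]string {
	selected := map[string]string{}
	seen := map[module]bool{}
	var visit func(m module)
	visit = func(m module) {
		if seen[m] {
			return
		}
		seen[m] = true
		if cur, ok := selected[m.name]; !ok || compare(m.version, cur) > 0 {
			selected[m.name] = m.version
		}
		for _, dep := range reqs[m] {
			visit(dep)
		}
	}
	visit(root)
	return selected
}

func main() {
	// Each key lists the minimum versions that module version requires.
	reqs := map[module][]module{
		{"main", "v0.0.0"}: {{"X", "v1.0.0"}, {"Y", "v1.0.0"}},
		{"X", "v1.0.0"}:    {{"Z", "v1.0.1"}},
		{"Y", "v1.0.0"}:    {{"Z", "v1.1.0"}},
	}
	list := buildList(module{"main", "v0.0.0"}, reqs)
	fmt.Println(list["X"], list["Y"], list["Z"]) // prints v1.0.0 v1.0.0 v1.1.0
}
```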


One potential problem with dependency resolution that focuses solely on minimal or maximal-bound resolution is that it usually ignores the untested version combination problems.

That is, a given version that satisfies a version bound may not have necessarily been tested with all of the different combinations of versions of components it is used with. It will be interesting to see how minimal version selection interacts with that particular challenge. For a build system it may matter much less than an operating system.


As far as I know, that problem is effectively unsolvable. For most real-world-sized dependency graphs, the set of all valid combinations of dependency versions is heat-death-of-the-universe huge.


The way the system I worked on "solved it" was to provide another layer of constraints called "incorporations" that effectively constrained the allowed versions used to resolve dependencies to a version that was delivered for a given OS build. Administrators could selectively disable those constraints for some packages, at which point resolution was solely based on the "standard" dependencies. But by default, dependencies would "resolve as expected".

The "incorporate" dependencies are described here: https://docs.oracle.com/cd/E26502_01/html/E21383/dependtypes...


We can only hope they'll at least have a tool that you can run that will bump up all the versions to specify the latest. But then of course that will make solving dependencies a lot harder since it's much more likely you'll get an unsolvable set of deps. It's hard to see how this could ever help anything.

Lock files also have the major advantage that you have a hash of the thing you're downloading so you're not depending on the maintainer not e.g. force pushing something.


Because only the minimum version can be specified, you will always get a solvable set of deps. Those deps might not build but they will be solvable.


These issues all seem to be solved by lockfiles, which are a part of the dep tool. Why ditch them here? rsc seems to be hinting at an argument for simplicity. But the long history of package managers, and my anecdotal experience, confirm that you're going to need a lock file no matter what. Good versioning practices are not enough. npm versions 1-4 prove that you need locking, and it must be on by default to make reproducible software. Minimum versioning will likely be better than the newest version for stability, but it's not enough.


Not only that, but locking allows for what is IMO the best way to handle this through CI:

Always run your builds with maximal version selection in CI (unlocked), and lock the new versions in if it passes all tests, otherwise leave the previous locked versions and flag it for review.


> I mean, I guess this is technically true. But seems like it shouldn't be an issue in practice as the API you're calling shouldn't have changed, just the implementation. If it has changed, then downgrade the dependency?

And if the new implementation has a new bug, you might be screwed. It worked last week, but not this week. How do I get back to the working version?


Use 'git bisect' or some similar tool to walk back and find the point where the lock file change triggered the bug?


If you have a lock file, there's no problem. I was arguing about what happens when there isn't a lock file.

I may have inferred something that wasn't in the original comment, by reading too many of the other comments on this page.


Sorry, but given that we're relying on git mostly, what are the advantages of lock files over submodules/subtrees?


Well, that would lock all of the projects into git. What if some libraries are in git, and others are in mercurial and darcs?

I think it’s better to have a package manager that’s not reliant on a particular source control system.


A lot of this seems to hinge on "go get" being the best way to get Go source code. Most discussions I've read so far come down to "well, this might be cool, but it would break go get, so we cannot ever do it".

I for one could live without go get altogether. In 99% of cases it boils down to a simple git clone anyway, which could (should?) be done by the package manager.

Maybe it's time to re-evaluate the existence of go get, now that we're seeing that "just clone master and hope nothing breaks" has obviously not worked for everyone outside of Google? Maybe bundling Go (the compiler and linker) with tools for fetching source code wasn't such a good idea after all?

Just my 2 cents...


I find it bizarre how unnecessarily hard that’s been made to avoid a slight amount of extra typing. The github imports work with the most basic options but I’ve seen people waste a lot of time on internal servers, SSH vs. HTTPS, tagging, etc. where it’d have been cleaner, easier and forward-compatible if you just gave it a URL and it shelled out to run “git clone <URL>”.


go get, go install and go build are cool and great tools! Let's not turn them into a maven/pom/npm crap!


I like the removal of GOPATH, even though I don't mind it at all, it can be an annoyance, especially when I don't want to co-mingle work/personal stuff. Now I don't need to worry about setting GOPATH based on what I'm working on. Plus sometimes I'd wonder, is it all in vendor, or is it using GOPATH for something.

Plus now libraries are "modules", so libraries can have dependencies on specific versions. Before, the question was what to do if multiple dependencies had the same dependencies in their own separate vendor directories. This change removes that, as it's handled by the vgo tool for all modules in that build.

go.mod is kind of a cross between lock-file and dependency listing. I think it will work alright.

All together, it seems to be a cross between gb and dep, while also attempting to solve library-packages tracking dependencies too.


I highly suggest reading the demo at https://research.swtch.com/vgo-tour, which really clarifies everything. I do not write Go, but I did write some bits in it in the past, and now think that this will be a very useful addition to the Go toolchain. When I looked at Go for the first time I really disliked the simple but sort-of castrated go-build, and didn't really follow the developments since then, but this seems to solve quite a few problems. I especially like how you can shadow a certain dependency with a given package at a given path, and how it facilitates vendoring. I hope this gets included in Go soon.


If there is neither GOPATH nor vendor directories, where will the downloaded packages be?

I dug into vgo a bit and found the downloaded packages currently reside in `$GOPATH/src/v/cache`, and it seems like vgo won't work without GOPATH for now (https://github.com/golang/vgo/blob/b6ca6ae975e2b066c002388a8...).


I think that the choice to use "prefer minimum version by default" is going to confuse a lot of new developers used to the industry standard being the opposite. If you are going against the industry default, there should be strong reasons. I don't think the reasons provided really justify this.

> First, the meaning of “newest allowed version” can change due to external events, namely new versions being published. Maybe tonight someone will introduce a new version of some dependency, and then tomorrow the same sequence of commands you ran today would produce a different result.

This is why we have lock files and they work better for this.

While vgo's approach does allow reproducible builds, it doesn't allow you to guarantee reproducibility the way that a lock file that has the specific commit hashes does. With specific commit hashes you can verify that the version you are building with in production is the exact same code as the one your security team audited prior to release (assuming git has fixed its hash collision vulnerability).

You can get around this by abandoning version constraints in your go.mod file, but then you have to track these version constraints out of band and manually figure out what commit hash to stick in go.mod

You could also get around this by storing the hashes for the approved versions out of band and creating a build script that verifies these hashes prior to building.
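A minimal sketch of what such an out-of-band check might look like. The `verify` and `hashOf` functions, the `module@version` key format, and the use of SHA-256 over an archive's bytes are all assumptions for illustration; a real build script would read the archive from disk and the approved hashes from wherever the security team keeps them.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashOf returns the hex-encoded SHA-256 digest of data.
func hashOf(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// verify checks an archive against the approved hash recorded for key.
// approved maps a hypothetical "module@version" key to an audited digest.
func verify(approved map[string]string, key string, archive []byte) error {
	want, ok := approved[key]
	if !ok {
		return fmt.Errorf("%s: no approved hash on record", key)
	}
	if got := hashOf(archive); got != want {
		return fmt.Errorf("%s: hash mismatch: got %s, want %s", key, got, want)
	}
	return nil
}

func main() {
	// Stand-in for the bytes of the archive the security team audited.
	audited := []byte("contents of the audited archive")
	approved := map[string]string{
		"example.com/foo@v1.2.3": hashOf(audited),
	}

	// The same bytes pass; anything else is rejected before the build runs.
	fmt.Println(verify(approved, "example.com/foo@v1.2.3", audited) == nil)            // prints true
	fmt.Println(verify(approved, "example.com/foo@v1.2.3", []byte("tampered")) == nil) // prints false
}
```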

Both of these workarounds seem to defeat the point of having a standard package manager in the first place.

> Second, to override this default, developers spend their time telling the package manager “no, don't use X,” and then the package manager spends its time searching for a way not to use X.

If you are concerned with allowing users to override this default, why not have a directive to override it that can optionally be added to each requirement in the go.mod file? This avoids an unexpected default and doesn't force people who want the industry standard default to use a script to set which dependencies use or don't use the -u flag with 'go get'.


>With specific commit hashes you can verify the the version you are building with in production is the exact same code as the one your security team audited prior to release (assuming git has fixed its hash collision vulnerability)

Can we take a moment and point out that the hash collision vulnerability is still at large with no ETA? This is after years of it being considered insecure.

I feel like this point continues to be glossed over with versioning systems that depend on git commit hashes.

Bruce Schneier warned in February 2005 that SHA needed to be replaced. (https://www.schneier.com/blog/archives/2005/02/cryptanalysis...) Git development didn't start until April 2005. So before git was even developed, SHA was identified as needing to be deprecated.

So now, 13 years later, this is still an issue.


If you want to check that you're building with the sources you think you're building, you probably don't want to be using `go get` anyway; the internet is kinda lossy, and that repo you wanted to pull probably doesn't exist anymore. I would probably use the proxy / redirect support to feed it the known-good sources you archived (a la maven / srpm / etc.).

As an example, jteeuwen/go-bindata was deleted this month and somebody made a new repo in its place... hopefully with the same contents.


I predict that under the proposed minimum version selection system some package will decide it's important and flawless enough that users should always depend on the most recent version. The package will recommend that users depend on version v1.0.0 but the first release will be v1.100.0. Subsequent releases will decrement the minor version to make sure newer releases are selected.


It certainly complicates the picture. I think the only way to make this workable is if you adopt a convention that you specify your minimum dependency if you're a library, but the current version if you're an executable. Of course, this could just be codified in the tool, which would be a big improvement. It also makes testing a lot harder though since your library ends up tested against a different version than what your users are using.

Ugh.


Users wouldn't be able to depend on version v1.0.0 if it doesn't exist. From what I understand the tool isn't downloading the list of all versions from GitHub and selecting the minimum. It's searching for the maximum version number (i.e. minimal required version) from your go.mod file and all your dependencies' go.mod files.


I think you're correct. So my nasty little hack wouldn't work. That's probably a good thing.


Interesting loophole, but it requires the community to go along with it. If it turns out people like the minimum version policy and the package is actually important, I would expect someone to fork the repo and do things the normal way?


As I look this over a couple things really jump out at me...

1. VCS tags are mutable. That's why lock files store revision ids. Go is being used to build immutable infrastructure but the proposed package management system uses mutable versions.

2. The proposal is less featureful than dep, npm/yarn in JS, composer in PHP, maven, crates for rust, and others. I wonder how people will react to that.


Mutable tags are my primary concern here, yeah. It seems pretty mitigate-able by using that (wonderful IMO[1]) `v1.2.3-date-sha` syntax though - it's just not human editable[2].

[1]: it fixes so many readability problems with SHA-pinned lock files, easily shows downgrades in `diff` output, and `sort` likely produces the exact result you wanted.

[2]: which may not be a problem, since you could in theory just re-run the tool to fix it when you enter "v1.2.3" by hand.


One thing that I'm not able to tell from the proposal: is the /v1 at the end of an import path a magical value that the tool handles, or do I literally need to have that directory in my repo and have a version history represented at the top level of every project?

Why not do the symbol mangling in the compiler if the goal is to have multiple simultaneous versions built in? This is easy to make backward compatible since it could default to v1.
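For illustration only, here is one way a tool could treat the suffix as a value it interprets rather than a literal directory, defaulting to v1 when no suffix is present. The `splitMajor` function is hypothetical, and this sketch does not settle which of the two readings vgo actually implements.

```go
package main

import (
	"fmt"
	"strings"
)

// splitMajor separates a trailing /vN element (N >= 2) from an import path.
// Paths without such a suffix are reported as major version 1, which is the
// backward-compatible default suggested above. A trailing "/v1" or "/v0" is
// left in place and treated as an ordinary path element.
func splitMajor(path string) (prefix string, major int) {
	i := strings.LastIndex(path, "/")
	if i < 0 {
		return path, 1
	}
	suffix := path[i+1:]
	if len(suffix) < 2 || suffix[0] != 'v' || suffix[1] == '0' {
		return path, 1
	}
	n := 0
	for _, c := range suffix[1:] {
		if c < '0' || c > '9' {
			return path, 1 // something like "vendor", not a version
		}
		n = n*10 + int(c-'0')
	}
	if n < 2 {
		return path, 1
	}
	return path[:i], n
}

func main() {
	for _, p := range []string{
		"github.com/go-yaml/yaml",
		"github.com/go-yaml/yaml/v2",
	} {
		prefix, major := splitMajor(p)
		fmt.Println(prefix, major)
	}
	// prints:
	// github.com/go-yaml/yaml 1
	// github.com/go-yaml/yaml 2
}
```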


Like all Go things, it looks kinda strange. But after playing around for a bit, it feels really good. The minimal version selection simplicity is genius.


I feel like the Go devs optimized implementation simplicity early on (likely a sensible choice given Google's "one giant repo" workflow), and that has resulted in some not-insignificant ecosystem complexity later (for others not using "one giant repo" workflows).

It will be interesting to see how this plays out in the longer term.


Please don't remove vendoring...


I'm fine with vendoring going away as long as we can check these new modules into version control and have vgo use them. Adding a caching proxy adds too much infrastructure for most folks, and without it you can't have reproducible and available builds.

I like Go's distributed package imports (even if everyone just uses Github), but it means you're distributing single points of failure all over the place. Vendoring (whether source or modules) solves this problem.


Yeah, this.

You want reproducible builds? Vendor your crap. Check it into your repository. Live free.


Or adopt a system with a sane versioning system and immutable artifacts, e.g. Maven Central.


which still means that your ci needs to download this stuff. with vendor it is already checked in. consider builds on docker etc, where build caching is hard. go is a breeze.

build takes like half a second because everything is already there. mvn will first download the world. I'm a scala developer and sbt will probably take like 10 minutes just downloading stuff, if a cache missed (even with a local proxy).


I wonder how it is possible to take 10 minutes, unless it is downloading directly from internet.

We always have internal Plexus mirrors, and the Jenkins servers have their global .m2 local repository.

Even big JEE applications barely take more than 5 minutes to build.

The only build system I really dislike are Android builds with Gradle, trying to beat C++ compilation times.


well sbt uses ivy which is way slower than maven, when it comes to resolving stuff.


First things first, Maven's config files are an elaborate form of torture. Second, there is no such thing as remote artifacts that are immutable.

Check your shit into your git repository. And if managing that becomes a problem, you've already done goofed up and have too many dependencies, and it's unlikely there's _anything_ reproducible about your software.


> Second, there is no such thing as remote artifacts that are immutable.

And yet, that's exactly what Maven Central artifacts are.

And why Maven Central is rock solid for reproducible builds while supporting versioning.


Vendoring is here to stay.


Am I totally missing something in this document or does it not talk about where these modules either as zips or source code are stored? It mentions $GOPATH and vendor as not necessary anymore.

Where will the actual code be?


I am curious why go doesn't just vendor dependencies like npm does when confronted with conflicting requirements instead of trying to figure out a version that satisfies all worlds? I always found this to be a nice feature when working with npm.

I am also a firm believer in version-pinning/lockfiles. Updating versions should be a separate workflow with first-class built-in tools to support it. I think that is the area where most package managers fall flat. They basically rely on the developer to do all the heavy lifting.


nodejs's "require", which pulls different versions depending on which module calls it, is a cool trick.

Unfortunately, it also has a downside, and in go that downside would be noticeable.

Let's take one easy example: a logging library. Let's say I pull in "logrus v1.0.1" and one of my dependencies pulls in "logrus v1.0.2". In my "main" function, I set logrus's default loglevel to debug (logrus.SetLevel(logrus.DebugLevel)).

If go did the thing nodejs does (a different copy of logrus for my dependency than my main), the "logrus.SetLevel" would only affect my package, not any of my dependencies; they'd still log at a default level.

This would be true of other things; "func init" code would run once per dependency that had that library, not just once, maps wouldn't be shared, pools, etc.

This is a lot more memory usage, but it's also really surprising, especially in the case of logging.

I definitely prefer having only one copy of a library in memory and having package-level variables (like loglevel) work as expected.

If that ends up failing and I need two different versions, vendor+import-path rewriting allows an escape hatch

These days, npm actually tries to minimize and flatten dependency versions as much as possible to avoid the huge memory tax.
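The logging surprise can be imitated in a single program. Go can't import two versions of one package at the same path, so this sketch stands in for that with a hypothetical `logger` struct playing the role of a library's package-level state; it does not use logrus itself.

```go
package main

import "fmt"

// logger stands in for a logging library's package-level state.
type logger struct{ level string }

// newLogger mimics a freshly initialized copy of the library,
// with "info" as an assumed default level.
func newLogger() *logger { return &logger{level: "info"} }

// SetLevel mimics a package-level setter such as a SetLevel function.
func (l *logger) SetLevel(level string) { l.level = level }

func main() {
	// One copy of the library in the build: main and the dependency
	// share the same state, so the dependency sees the new level.
	shared := newLogger()
	mainLog, depLog := shared, shared
	mainLog.SetLevel("debug")
	fmt.Println(depLog.level) // prints debug

	// Two copies (the nodejs-style layout): the setting made in main
	// never reaches the dependency's copy.
	mainCopy, depCopy := newLogger(), newLogger()
	mainCopy.SetLevel("debug")
	fmt.Println(depCopy.level) // prints info
}
```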


> This proposal ... deprecates GOPATH in favor of a project-based workflow

That would be awesome. GOPATH was an awful idea that works only for distribution package managers.


How would you specify dependencies on a commit/tag in a particular branch (say, during development I always want to build with the HEAD of the development branch for this module)? I know I could clone what I want somewhere and use a replace directive inside the go.mod file, but I was wondering if there's any way to do this using only vgo.


  To name untagged commits, the pseudo-version v0.0.0-
  yyyymmddhhmmss-commit identifies a specific commit made on 
  the given date.
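A small sketch of building such a pseudo-version string from a commit time and hash. The `pseudoVersion` and `mustParse` functions are hypothetical, the commit hash is made up, and shortening the hash to 12 characters is an assumption rather than something the quoted text specifies.

```go
package main

import (
	"fmt"
	"time"
)

// mustParse parses an RFC 3339 timestamp and panics on bad input.
func mustParse(stamp string) time.Time {
	t, err := time.Parse(time.RFC3339, stamp)
	if err != nil {
		panic(err)
	}
	return t
}

// pseudoVersion formats v0.0.0-yyyymmddhhmmss-commit for an untagged commit.
// The 12-character hash prefix is an assumption made for this sketch.
func pseudoVersion(commitTime time.Time, commit string) string {
	if len(commit) > 12 {
		commit = commit[:12]
	}
	return "v0.0.0-" + commitTime.UTC().Format("20060102150405") + "-" + commit
}

func main() {
	older := pseudoVersion(mustParse("2018-02-20T18:26:00Z"), "0123456789abcdef0123")
	newer := pseudoVersion(mustParse("2018-02-21T09:00:00Z"), "fedcba98765432100000")

	fmt.Println(older) // prints v0.0.0-20180220182600-0123456789ab
	fmt.Println(newer) // prints v0.0.0-20180221090000-fedcba987654

	// Because the timestamp comes first, plain string order is commit order.
	fmt.Println(older < newer) // prints true
}
```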


Will there be an official proxy or will open-source projects still break when someone moves/deletes a repository?


That's what local caches are for.


Sure, but me cloning someone else's repository won't give me that person's cache. Without a central cache, it might work on that person's computer but not on mine if a repository moved.


On one hand, as a developer, I like a Go builtin package management authored and supported by the Go team; on the other hand, I feel bad for the open source contributors who worked hard on third party package managers. Even dep, the officially supported PDM, seems supposed to fade.


> Even dep, the officially supported PDM, seems supposed to fade.

I believe this was always the plan. The dep readme certainly makes it sound that way at least:

> dep is a prototype dependency management tool for Go. It requires Go 1.8 or newer to compile. dep is safe for production use.

> dep is the official experiment, but not yet the official tool. Check out the Roadmap for more on what this means!


> A build of a module by itself will always use the specific versions of required dependencies listed in the go.mod file. As part of a larger build, it will only use a newer version if something else in the build requires it.

This seems fraught


I think you need a noun to go with that adjective.


With danger


This is a well reasoned proposal. I've got a couple qualms.

Let's say you care about strict build reproducibility. You want some form of manifest + lock file that defines exactly what versions of dependencies your package expects. Great; more power to you.

The only reason you'd want that is if you aren't storing your dependencies with your source code in source control. Otherwise, what's the point? If you were vendoring and committing it, you already have strict reproducibility, and you have the "supported versions" defined in the git repositories that come alongside your dependencies.

So, adding this manifest+lock file allows you to get away with not vendoring+committing. Awesome. You've gained reproducibility, but certainly not strict reproducibility.

Dependency package maintainers can rewrite git history. They can force push. They can push breaking changes with minor versions. They can delete their repositories. All of these things have already happened in NPM, either maliciously or accidentally; why do we think they wouldn't happen with Go?

If you want strict reproducibility but are expecting it with just a manifest+lock file, without dependency vendoring+committing, you're not getting it. Full stop.

So, really, by adding a manifest+lock file, you're adding a "third level" of reproducibility (#2 on this list).

1. Low Reproducibility: Pull HEAD on build.

2. Partial Reproducibility: Use tooling to pin to a git commitish, pull this on build.

3a. Full Reproducibility (Easy+Dirty): Vendor+Commit all dependencies.

3b. Full Reproducibility (Hard+Clean): Mirror your dependencies into a new self-controlled git repo.

I am struggling to think of a solid use case that would find real value in #2. I have no doubt that people think it would be valuable, but then the moment the left-pad author deletes his repository, you're going to second guess yourself. You didn't want #2; you wanted #3 and settled for #2 because #3 was too hard or dirty and you convinced yourself that adding tooling and version numbers was keeping you safe. Because semver, right?

Moreover, this comes at the cost of complexity, which is strictly against Go's core doctrine.

Moreover, a #2 solution looks strikingly similar to gopkg.in. Instead of referencing Repository:HEAD we reference Repository:Commitish. The main difference is that gopkg.in is an external service whereas this would be controlled tooling. But by choosing #2 you're already exposing yourself to bad actors, accidents, github going down, all of the above. So you're willing to accept that exposure, but aren't willing to add that little extra exposure of one more service?

I agree that the problem today, even with dep, isn't perfect. But I still strongly believe that the answer lies somewhere in the ideas we already have, not by importing ideas from NPM. We need a #3 that is Easy and Clean, not a #2.


Is the "import compatibility rule" (ie, `/v2` suffixes in package paths) implemented in vgo today?


Amazing that in 2018, a language team needs to be convinced that versioning packages is a good idea.

What's next, persuading them that generics are pretty cool too?


Considering there are existing and official dependency management tools for Go, the Go team doesn't need to be convinced that versioning is a good idea. This article is a discussion about mainlining a dependency management system as part of the existing `go` tool.

They're similarly aware that generics are a desirable thing to have, but haven't determined the ideal way to implement generics into the language.


Did you actually read the article? It is clear from the background and history sections that the need for versioning was apparent from early on.


this looks damn complex.


Just use Bazel to build your go code. It already solved this problem.


How does Go not have package versioning? What do you guys use to get the appropriate version of a dep?


There's a bunch of external packages. One by the go team: https://github.com/golang/dep


None of the top 10 contributors to dep are from the Go core team. dep is a community project that eventually clawed its way into official recognition by sheer force of will by some great people.


Just use Bazel to build your go code. It has already solved this problem.


Generics first please, I can live with using tools like vndr and dep.


Sorry, maybe generics next.


Thanks!

Vendoring needs to be solved and unified, I get it.

But to me, the biggest thorn is generics. I have run into too many cases where generics would have made my life a lot easier.


We've heard this one before, and uppercase doesn't summon it any more effectively.

https://news.ycombinator.com/newsguidelines.html


I was being cheeky.

Thank you for your attention.


> we also must not remove the best parts of the current go command: its simplicity, speed, and understandability.

Then you propose to introduce the concept of modules and make the world a hell?

I believe versioning should be left to third-party tools. For example, no one is complaining about the lack of versioning built into Node.js.



