Context for those who don't know: I along with Natalie Weizenbaum wrote pub, the package manager used for Dart.
> Instead of concluding from Hyrum's law that semantic versioning is impossible, I conclude that builds should be careful to use exactly the same versions of each dependency that the author did, unless forced to do otherwise. That is, builds should default to being as reproducible as possible.
Right on. Another way to state this is: Changing the version of a dependency should be an explicit user action, and not an implicit side effect of installing dependencies.
It took me several readings to realize that you encode the major version requirement both in the import string and in the module requirements. The former lets you have multiple copies of the "same" module in your app at different major versions. The latter lets you express more precise version requirements like "I need at least 2.3, not just 2.anything".
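To make that concrete, here is a hedged sketch of the two places the major version appears; the package path and version numbers are illustrative (borrowed from the yaml example used elsewhere in this discussion), not something stated in the comment above.

    // Illustrative only: assumes a v2 module published at this path.
    package main

    import (
        "fmt"

        yaml "github.com/go-yaml/yaml/v2" // major version encoded in the import path
    )

    func main() {
        out, err := yaml.Marshal(map[string]int{"answer": 42})
        if err != nil {
            panic(err)
        }
        fmt.Print(string(out))
        // The module file carries the finer-grained requirement, roughly:
        //     require "github.com/go-yaml/yaml/v2" v2.3.0
        // i.e. "I need at least 2.3", not just "2.anything".
    }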
I think it's really going to confuse users to have the major version in both places. What does it mean if my code has:
For what it's worth, Dart does not let you have two versions of the same package in your application, even different major versions. This restriction does cause real pain, but it doesn't appear to be insurmountable. Most of the pain seems to be in the performance issues over-constrained dependencies caused in our old version solver and not in the user's code itself.
In almost all cases, I think there is a single version of a given package that would work in practice, and I think it's confusing for users to have an application that has multiple versions of what they think of as the "same" package inside it. This may be less of an issue in Go because it's structurally typed, but in Dart you could get weird errors like "Expected a Foo but got a Foo" because those "Foo"s are actually from different versions of "foo". Requiring a single version avoids that.
> I believe this is the wrong default, for two important reasons. First, the meaning of “newest allowed version” can change due to external events, namely new versions being published. Maybe tonight someone will introduce a new version of some dependency, and then tomorrow the same sequence of commands you ran today would produce a different result.
No, I think newest (stable version) is the right default. Every package manager in the world works this way and the odds that they all got this wrong are slim at this point.
At the point in time that the user is explicitly choosing to mess with their dependencies, picking the current state of the art right then is likely what the user wants. If I'm starting a brand new from scratch Ruby on Rails application today, in 2017, there is no reason it should default to having me use Rails 1.0 from 2005.
Every version of the package is new to me because I'm changing my dependencies right now. Might as well give me the version that gets me as up-to-date as possible because once I start building on top of it, it gets increasingly hard to change it. Encouraging me to build my app in terms of an API that may already be quite out of date seems perverse.
> This proposal takes a different approach, which I call minimal version selection. It defaults to using the oldest allowed version of every package involved in the build. This decision does not change from today to tomorrow, because no older version will be published.
I think this is confusing older versions with lower ones. You could, I suppose, build a package manager that forbids publishing a version number lower than any previously published version of the package and thus declare this to be true by fiat.
But, in practice, I don't think most package managers do this. In particular, it's fairly common for a package to have multiple simultaneously supported major or minor versions.
For example, Python supports both the 2.x and 3.x lines. 2.7 was released two years after 3.0.
When a security issue is found in a package, it's common to see point releases get released for older major/minor versions. So if foo has 1.1.0 and 1.2.0 out today and a security bug that affects both is found, the maintainers will likely release 1.1.1 and 1.2.1. This means 1.1.1 is released later than 1.2.0.
I think preferring minimum versions also has negative practical consequences. Package maintainers have an easier job if most of their users are on similar, recent versions of the package's own dependencies. It's no fun getting bug reports from users who are using your code with ancient versions of its dependencies. As a maintainer, you're spending most of your time ensuring your code still works with the latest, so having your users in a different universe makes it harder to stay in sync with them.
Look at, for example, how much more painful Android development is compared to iOS because Android has a much longer tail of versions still in the wild that app developers need to deal with.
If you do minimum version selection, my hunch is that package maintainers will just constantly ship new versions of their packages that bump the minimum dependencies to forcibly drag their users forward. Or they'll simply state that they don't support older versions beyond some point in time even when the package's own manifest states that it technically does.
There is a real fundamental tension here. Users — once they have their app working — generally want stability and reproducibility. No surprises when they aren't opting into them. But the maintainers of the packages those users rely on want all of their users in the same bucket on the latest and greatest, not smeared out over a long list of configurations to support.
A good package manager will balance those competing aims to foster a healthy ecosystem, not just pick one or the other.
> If I'm starting a brand new from scratch Ruby on Rails application today, in 2017, there is no reason it should default to having me use Rails 1.0 from 2005.
In the tour it states, "We've seen that when a new module must be added to a build to resolve a new import, vgo takes the latest one." which means that the newest Rails would be used and set in your `go.mod` file.
From that point onwards the "minimal version" will be used, which means vgo won't upgrade you to a version released tomorrow unless you (or a module you use) explicitly state that they need that newer version.
This is a much saner default than the one you describe (imo) as people still get recent versions for new projects, but once they are using a specific version they won't upgrade unless they need to or want to.
I should have addressed this in the original reply and it's too late to edit now, but this isn't an issue. I downloaded vgo and verified that you CAN release a 1.1.1 AFTER 1.2.0 and it is treated correctly as far as I can tell.
$ vgo list -m -u
MODULE VERSION LATEST
github.com/joncalhoun/vgo_main - -
github.com/joncalhoun/vgo_demo v1.0.1 (2018-02-20 18:26) v1.1.0 (2018-02-20 18:25)
That works for adding a new dependency. But, as I understand it, if I decide to upgrade my dependency on foo by changing its already-present version in my app's module file, this does not upgrade any of the transitive dependencies that foo has. Instead, it selects the lowest versions of all of those transitive dependencies even though my goal with foo itself is to increase its version.
So now I have to reason about sometimes it picks the latest version and sometimes it doesn't, depending on the kind of change I'm making.
That said, you can just upgrade all the dependencies with vgo get -u and get the "always latest" behaviour. This is a desirable result, but it shouldn't happen at each and every fresh build.
You can have automation that periodically tries to bump all the versions and, if all tests pass, sends you a PR with the proposed update.
With the proposed rules you get:
1. Repeatable builds, as with lock files.
2. Constraint resolution that is simple to reason about in the case of multiple modules depending on the same module.
requires "foo" v1.0.0
requires "bar" v1.0.0
require "bar" v1.0.1
Now imagine I want to add another package, say it is the wham package and it has the following dependencies:
require "bar" v1.1.1
requires "foo" v1.1.0
requires "wham" v1.0.0
requires "bar" v1.1.2
To me this makes sense. The creator of foo may have avoided upgrading the dependency on bar for some performance reasons, so this upgrade only happens in your code if it is required by another package, you initiate it manually, or if the foo package releases a new version with updated dependencies in its go.mod file.
PS - I've tested this all using the prototype of vgo. You can see yourself by grabbing this code: github.com/joncalhoun/vgo_foo_main and then use vgo to list dependency versions and try upgrading foo which has a dep on demo.
I think this makes a strong case for not releasing major version upgrades that use the same package names. The very idea of two incompatible things having the same name should set off alarm bells. Instead of trying to make that work, we should be avoiding it.
In the absence of this principle, the Java ecosystem has developed a compensatory mechanism of packaging "shaded" versions of their dependencies alongside their own code. This is an ugly hack to accomplish the same thing after the fact, so we are already incurring even more complexity than would be imposed by following this rule.
If you do that, I think you'll find in practice that one of two things happens (or more likely, both, in a confusing mixture):
1. People start releasing packages whose names include version numbers. "markdown2", etc. Then you get really confusing hallway conversations like, "Yeah, you need to use markdown2 1.0.0."
2. People start coming up with weird confusing names for the next major version of packages because the current nice name is taken. Then you get confusing conversations like, "Oh, yeah, you need to upgrade from flippitywidget to spongiform. It's almost exactly the same, but they removed that one deprecated method." Also don't forget to rename all of your imports.
I think the existence of those practices (like Java dependency shading) proves that people are struggling towards this solution on their own, without support from the language or the community. With official support, if major versions work the same way for everybody, it won't need to be so janky.
In practice, I predict that people would start behaving better, doing what they should have been doing (and what many have been doing) all along: avoiding unnecessary breaking changes in non-0.x libraries, choosing function signatures carefully, and growing by accretion and living with their mistakes. Right now, I think some developers see major version bumps as a convenient way to erase their mistakes, without taking into account the cost imposed on users who end up juggling dependency conflicts.
Frankly I think we tend to conflate two separate but related tasks in these discussions: communicating updates, and distributing dependencies.
vendor/ folders are a totally fine distribution system - optimal even. Time-to-compile is 0 because you get the dependencies with git clone.
So really the problem we have is communicating updates (plus some git-server tooling smarts to deduplicate files, but let GitHub solve that).
As for the tasks that need to be solved here, the primary one I see is reconciling the needs of different libraries. What do you do when you depend on library A and library B and they need two incompatible versions of library C? As I see it, there's no clean way to answer that question if A and B expect the two incompatible versions to be present with the same name.
Caching proxies for zip downloads sounds nice, but it's more than just "a bit more infrastructure". I think it would be a huge burden to package publishers if each of them has to manage their own dependency zip mirror as a separate piece of infrastructure. You need version control anyway; checking your dependencies into that same version control does not require a new piece of infrastructure.
Coming from Ruby, where rubygems.org is a very painful point of failure, in my eyes the fact that Go dependencies are not a separate download is a big plus.
In fact without a single blessed dependency repository such as rubygems.org, in the Go case you have as many points of failure at build time as there are different code hosting sites in your dependency graph.
Vendoring turned your git repo into a poor man's proxying cache. It also made some people unhappy. At my current company we use Phabricator for code reviews, and it doesn't work well if the commit size is bigger than some threshold.
I love to have the option of not checking in dependencies. I'm not sure this option has to be forced on everybody though.
Well, that is OK-ish.
However, looking at the XML stuff, that is probably bad.
Basically, a lot of packages repackage Xerces2 or the java.xml APIs. Besides that, JAXP with streaming has been present in Java since 1.6+. But nobody removes this stuff.
And as usual, Golang ignores progress made in this area by package managers such as npm, Cargo, et al., in favor of what seems like a half-hearted solution.
Issues I see: the introduction of modules on top of packages solves no real problem; major version numbers become part of the package identification (thus allowing the same program to use different versions of a package); and "minimal version selection" solves nothing that lock/freeze files wouldn't solve better, while preventing users from getting important minor but compatible updates (e.g. security fixes) by default.
Could you expand on the progress you mentioned, or explain what parts of the counterexamples you gave the Golang folks should learn from? In what ways is the proposal a half-hearted solution?
After reading the proposal, my understanding is:
The 'import "github.com/go-yaml/yaml/v2"' directive would lead to installing the oldest version of yaml 2.x that is supported by your other dependencies.
Meanwhile, the go.mod file means that any dependencies that use the incompatible yaml 1.x library would lead to installing the oldest 1.x version after 1.5.2, which would then be used by all dependencies that import the 1.x version.
> No, I think newest (stable version) is the right default. Every package manager in the world works this way and the odds that they all got this wrong are slim at this point.
Doing this is meant to allow reproducible builds without requiring the use of a lock file. As to why they don't want a lock file... that isn't really addressed in the article. Lock files do seem like the most sane way to provide truly reproducible builds (ones that aren't dependent on repo tags not changing, since they are usually locked to a specific commit hash). I think the decision to avoid a lock file is a bad one and certainly needs to be justified.
> I think this is confusing older versions and lower. You could, I suppose, build a package manager that forbids publishing a version number lower than any previously published version of the package and thus declare this to be true by fiat.
I agree. I also think they meant to say "minimal minor version", since major versions have different import paths and are backwards incompatible.
Ideally, "prefer oldest / prefer newest" should be something that can be configured per requirement in the go.mod file so that people who don't care about reproducibility don't have to go through and bump their minimum versions every time any dependency has a new release. Making this dependent on using a flag every time you run 'vgo get' is silly and doesn't allow you to do this for some packages and not others without having to write your own script to make a bunch of 'vgo get' invocations.
> I think preferring minimum versions also has negative practical consequences. Package maintainers have an easier job if most of their users are on similar, recent versions of the package's own dependencies.
Ideally, the first step you take when you encounter a bug with a library would be to check to see if a more recent version of the library fixes the bug. In practice, I don't know how many people would fail to do this before filing a bug report.
So yes, attempting to allow multiple versions of the same module will cause grief.
You would also need a way to override the choices made by dependencies, e.g. to rebuild with security or bug fixes, without requiring me to fork everything.
Dart is primarily targeting web deployment, in which code/executable size is a major concern. For Dart's use case it makes perfect sense to force the developer to sort this out ahead of time, painful as it might be. For lots of other languages (including Go), the primary expected deployment is a binary executable, where bloating it with multiple versions of dependencies, to make builds easier and to make it possible to use two dependencies with mismatched dependencies of their own, is very rarely a problem.
I believe that the Nix OS does something along these lines to guarantee deterministic builds.
A number of times in at least a couple of languages, I've seen a library keep the same (exact) version number but make some small "bugfix" change that ended up breaking things. Often, nothing stops someone from doing that.
If you did want to guarantee reproducible builds with SHA-1 hashes, one way would be to introduce those into the .mod files they outlined. But that'd be clunky; it's much easier to reason about a version number than it is a digest hash.
Another method would be to introduce a lock file where those details are kept from plain view, but my sense was that they wanted a little more openness about the mechanism they were using than a lockfile provides (which is why .mod files use Go syntax, save the new "module" keyword they would introduce). After all, that's how dep works right now: they might as well just keep the lock file.
Cases where tags are deleted, or worse, where accounts are deleted and then recreated with (other? same?) code, may be said to break the semver contract the library or binary author has with their users. As such, it may be seen as outside the scope of what they are seeking to accomplish with vgo.
This approach actually adds tons of flexibility... not sure it’s needed, but you could return different lock data based on whatever logic you needed
Not as direct imports, but Rust allows transitive deps to get the version they specify.
Over time, though, version pinning builds up technical debt. You have software locked to old unmaintained versions of packages. If you later try to bring some package up to date, it can break the fragile lace of dependencies locked in through past version pin decisions.
Some version pinning systems let you import multiple versions of the same package into the same build to accommodate dependencies on different versions. That bloats executable code and may cause problems with multiple copies of the same data.
On the other side, once package users are using version pinning, it's easier for package maintainers to make changes that aren't backwards compatible. This makes it harder for package users to upgrade to the new package and catch up. Three years after you start version pinning, it's going to be a major effort to clean up the mess.
The Go crowd tries to avoid churn. Their language and libraries don't change too much. That's a good, practical decision. Go should reject version pinning as incompatible with the goals of Go.
I work in a lot of languages (including Go), which gives me some perspective. On the extreme ends of this issue we have Maven, which specifies hard version numbers (fuzzing is possible but culturally taboo) and NPM, which only recently acquired the ability to lock version numbers and still has a strong culture of "always grab latest".
The Maven approach is unquestionably superior; code always builds every time. If you walk away from an NPM build for nine months, the chance that it will build (let alone work as before) is vanishingly small. I waste stupid amounts of time fighting with upstream dependencies rather than delivering business value.
People break library interfaces; that's just a fact of life. The only question is whether they will break your code when you explicitly decide to bump a version, or randomly, when someone (possibly not even you) makes changes to other parts of the system.
If that's your goal. There's a middle ground which includes security updates: asking the community to follow semver. Rust devs seem to do it well and my crates build well with non-exact version numbers after not visiting them for a while. Not sure why the majority of Rust devs do a good job with this and JS devs don't. I suppose it's the strictness of the contract in the first place.
With Go, I've found popular library maintainers doing a good job at BC contract compat even when there has been only one blessed version. I don't assume giving them a bit more flexibility is going to hurt anything.
No, that doesn't work. People make mistakes, and you end up not being able to build software. It might work _most_ of the time, but when things break, it's of course always at the worst possible time.
I think Cargo got it right: use semver, but also have lock files, so you can build with _exactly_ the same dependencies later.
This is nice, but it should be noted that it does not absolutely prevent missing backward-incompatible changes; that a function has the same signature does not mean that it has backward-compatible behavior.
(With a rich enough strong, static type system, used well to start with, you might hope that all meaningful behavior changes would manifest as type changes as well, but I don't think it's clear that that would actually be the case even in a perfect world, and in any case, it's unlikely to be the case in real use even if it would be in the ideal case.)
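As a tiny, purely illustrative Go sketch of that point (the function names and the "v1.0"/"v1.1" behaviors are invented): two releases can keep an identical signature while returning different results, so the type system never notices.

    package main

    import (
        "fmt"
        "strings"
    )

    // SplitV1 mimics a hypothetical v1.0 behavior: empty fields are dropped.
    func SplitV1(s, sep string) []string {
        var out []string
        for _, f := range strings.Split(s, sep) {
            if f != "" {
                out = append(out, f)
            }
        }
        return out
    }

    // SplitV2 mimics a hypothetical v1.1 "bugfix" that keeps empty fields:
    // same signature, different results for the same input.
    func SplitV2(s, sep string) []string {
        return strings.Split(s, sep)
    }

    func main() {
        fmt.Println(len(SplitV1("a,,b", ","))) // 2
        fmt.Println(len(SplitV2("a,,b", ","))) // 3
    }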
I've been bit by this a few times when I've come back to code 2-3 months later and forgot to include the Cargo.lock, so I pretty much always use the lock file these days.
NPM didn't have package-lock.json until v5, released in 2017. Before then there was the optional shrinkwrap that nobody used, so builds were totally unreproducible.
Ruby at least had Gemfile.lock from early days. Unfortunately there have been so many compatibility problems with different versions of Ruby itself that someone needed to invent rvm, rbenv, and chruby. Getting every dependency to behave in the same Ruby version was sometimes an odyssey. Still, at least builds are reproducible... as long as you're running on the same OS/CPU arch (uh oh native code!)
Ruby is actually pretty alright given the constraints, but Node/NPM is the canonical example of how NOT to do dependency management, and they're still trying to figure it out here in 2018.
npm will pick the latest version of the dependencies compatible with the package.json of the packages (package-lock.json is not published as part of the package). This means that with npm we may end up using a version of a transitive dependency which is much later than the one that your direct dependency was tested with.
The proposal for vgo will take the minimum version compatible with the packages, and hence pick a version which is closer to the actually tested one.
Consider all tens of thousands of CVE bugs found in widely used image, video, XML, etc. libraries. Even “updated” software is often compromised via outdated libraries.
With a “min version” approach, none of those will get patched without explicit action by a developer, who won't do that because he doesn't even know about the bug unless he reads the thousands of messages each day on Bugtraq.
If anything, history has shown that we should be focusing package management on getting security fixes to the end user as soon as possible.
This proposal is bad for security.
One of the issues which the vgo developers point out is that the "latest" behaviour has the perverse effect that a module A may declare a dependency of version 1.1 of a module B, and may have never been even tested on that version because the package manager always picked the latest.
In some sense, the vgo approach is saying that across hierarchy of module owners, each one should keep upgrading their manifest to the latest versions of their direct dependencies, and ensure that things keep working, rather than relying on the topmost module to pull in the latest of the entire tree of dependencies. This seems to be good for the entire ecosystem.
But that’s the problem. You’re relying on N people, including the package maintainers and end developers to all take timely action in order to get a security fix to the end user.
That simply won’t happen.
What should happen is a single package maintainer fixes a vulnerability, and that fix automatically flows through the system to all places where that package is used. And insecure versions should be made unavailable or throw a critical “I won’t build” error.
Perhaps some way of marking versions as security critical might help, but the proposed approach will leave tons of vulnerable libraries in the wild.
All the current package managers for other languages have this issue to some degree. Golang should do better with knowledge of those mistakes.
Sure, but in my experience it happens at least 10x as frequently in NPM than with Go. It's really common for me to update all my Go dependencies and everything just continues to work. With NPM I have to start looking for migration documentation and change a bunch of my code.
My main fear is that when people get used to the idea that their master repo doesn't have to have a backwards compatible interface, then updating dependencies will look awfully similar to NPM where authors change interfaces based on their weekly mood.
Have you paid the upstream developers? Because you sound like you did.
It's going to be much more painful if people have to fork when a bug is introduced lest they have to wait a week or more to get a bugfix upstreamed.
Eg I might have my module that says:
When the code builds it will try to use 1.4.1 EVEN IF 1.4.2 EXISTS, unless it is forced to use 1.4.2 by another module you depend on. That is, say you are using module X and it says:
What the minimal version selection does is say "okay one module needs >=1.4.1, another needs >=1.4.2. What is the minimal version that satisfies these requirements?" And the answer to that is `1.4.2`, so even if 1.4.8 exists, 1.4.2 is used in that scenario.
I don't know how this will work long term. I think there are definitely some concerns to consider (eg many minor version bumps are for security fixes), but the scenario you are describing - the newer version causing issues - just isn't an issue as I understand it because your code won't opt to use a newer version unless you (or another module you use) explicitly tell it to.
On the other hand, whenever you edit a direct dependency's minimum version, it could upgrade any other dependency via a cascade of minimum version upgrades. (In which case, don't commit it.) This keeps your team moving, but it might make dependency upgrades harder.
For example, it's up to each repo's owners (for applications, at least) to notice when there's a security patch to one of its transitive dependencies, and generate and test a commit. Someone will need to write a tool to do this automatically.
If the test fails: now what? I guess that's when you need to look into forking.
This doesn't just result in non-reproducible builds, but it results in them at unpredictable times and, if you have servicing branches of your code, backward through time. This is not a good property if you need to know what you are building today is the same as what you built yesterday, modulo intentional changes, or even that it will build or run.
Version pinning does not. Misusing it might, but don't do that.
Version pinning of dependencies is a tool for assuring the behavior of releases, but in development you should generally be updating to the latest stable version of dependencies and then pinning them for the next release, not keeping the old pinned versions.
Isn't technical debt usually just the situation of having many things that should be done, but aren't? In other words, the accumulation of "don't do that" cases in code.
Sure, the things that you shouldn't do are the things that lead to technical debt. What I'm saying is that "version pinning to assure consistent behavior in stable releases" is not one of those things.
What is one of those things is "leaving pins from the last release in place in development".
Version pinning in releases is a means of providing stability for downstream users. It should be encouraged.
Leaving those pins in place while developing the next version is a source of technical debt. It should be discouraged.
Technical debt is to be handled during the development cycle, not in source code, and most of the issues you bring up (bloated software, version pinning) are solved in build systems and, again, should not be handled by source code.
Anyone working in an enterprise environment, especially one that has clients on different systems, would instantly crumble without being able to target specific builds.
Legacy and long term support software is everything.
>Go should reject version pinning as incompatible with the goals of Go.
This would instantly make Go a 'no-go' in any enterprise environment.
A //public library// should have a well thought out interface that works at least for one project and preferably for several different projects.
If the interface needs to change I agree with the recommended semantics of planning what that change should look like and creating a new named interface (API) for that specification.
Since go expects everything to build against the 'latest' version of an interface, it should be obvious what's wrong if a stale cached version is in use. Thus /expanding/ an interface / API is permissible. They should be thought of as /write only/.
You're right that it is technical debt to not be on the latest version, but if I need to roll back to a previous (of my application) version due to a regression, I want to be 100% sure I get the exact same state as I did before. If my dependency management software doesn't solve this, it's not good enough.
As soon as some tool is used to provide reproducible builds, you have effective version pinning.
If your organization values avoiding technical debt accumulation over reproducible builds, they don't have to use version pinning and can always run 'go get' with the '-u' flag.
> once package users are using version pinning, it's easier for package maintainers to make changes that aren't backwards compatible.
I don't think vgo changes the ease of making backwards incompatible changes. Both before and after vgo, backwards incompatible changes must go in a new major version / new import path. The ability to pin major versions already exists.
If you are talking about open source, it is not a problem: anyone can upgrade dependencies and make a pull request. Without version pinning, the project will always be in a "broken" state.
For example, recently I tried to run Firefox 3.5 on linux and it crashed with modern versions of gobject libraries although they are supposed to be backwards-compatible. I wish Debian package manager allowed to install an old version of a library specially for this application.
Patch changes, don't care, minor changes, don't care, major changes, you're screwed [maybe].
Yes, by default it uses static linking; however, generating dynamically linked binaries and libraries is also an option.
> First, the meaning of “newest allowed version” can change due to external events, namely new versions being published.
is the root of what's wrong with this proposal, namely conflating the time of dependency resolution with the time of build. Cargo (using it because it's been referenced by the post) solves this problem by telling you to check in the Cargo.lock that it generates whenever you run 'cargo update'. This means that your version control preserves the build dependencies, just as it preserves build recipes (e.g. build.rs or Travis or whatever). In this world, there's no 'newest allowed version', because there is only one version.
wise men before me said 'as simple as possible, but no simpler' and I'm afraid this proposal is too simple.
Yes yes yes yes finally! This has been the #1 reason I've avoided Go for years and even my recent forays back into the language were only after I figured out some hacks that I could use to avoid it.
Agreed, GOPATH is awkward--at first. Because it's different from what most programmers are used to. But once it's adopted it actually makes a lot of sense, and looking back it seems bizarre to ever have such a strong desire to avoid it at all costs--and the costs are high.
The same applies for Gofmt as well.
I don't mind GOPATH, but I did argue that if vendoring was to be a thing then it needed to be dropped entirely. I was also not a fan of the vendoring mechanism.
The approach given here looks much more thought out, getting rid of each; it's the first proposal I've actually liked.
e.g. the most-popular one I can see so far is Facebook's "inject", which has a remarkably small 841 stars: https://github.com/facebookgo/inject
Even Glide, Godep, and Dep have several thousand. And those can barely claim "most", much less "very common".
(I've linked to the exact point in the article where it talks about DI but you might want to read the rest as well.)
But that just means that everyone rolls their own thing by hand, with widely-varying feature-sets, method names, bugs, etc. DI frameworks are very much nicer for interop between a bunch of libs.
But GOPATH was a mistake in the same way Go's hitherto lack of versioning was: it deliberately breaks easy isolation between projects / dependency management.
This mindset only works if you control most of the ecosystem (such as with a monorepo), but it's an irritating mess otherwise.
There's also a mechanism to take control of the same code even when it's outside your GOPATH. Github calls this mechanism a "pull request".
- It’s an environment variable, which makes it relatively easy to modify. To switch your dependencies from one pool to another, you can modify your GOPATH.
- The vendor directory lets you pin your dependencies per-project without modifying GOPATH. It’s not the most beautiful solution (and Go doesn’t often offer the most beautiful solution to any particular thing), but it’s workable.
# Clone a repo into a GOPATH-like tree under ~/src, deriving the directory from the URL.
dir=$(echo "$1" | sed 's/^http\(s*\):\/\///g' | sed 's/^git@//g' | sed 's/\.git$//g' | sed 's/:/\//g')
git clone "$1" "$HOME/src/$dir"
People† at large just want to clone https://github.com/foo/bar.git wherever they see fit, like directly at ~/Workspace/bar, ~/work/bar, ~/projects/contrib/bar or even /tmp/bar.
† "People" includes CI systems that expect the typical "clone and cd straight into" thing to work, resulting in a lot of boilerplate and/or symlink hacks to work around such expectations.
You haven't needed to set a GOPATH since 1.8, which was released over a year ago (we're now at 1.10). Since 1.8, the Go toolchain will use a default GOPATH; the environment variable is only needed as an override.
But on CI servers it could be awkward.
GOPATH is unambiguously terrible.
Cargo does not use the latest version. Saying
toml = "0.4.1"
is shorthand for
toml = "^0.4.1"
which accepts any semver-compatible release at or above 0.4.1 (but below 0.5.0), whereas
toml = "=0.4.1"
pins exactly that version.
This decision was made because ^ is the most reasonable default; over-use of `=` destroys Cargo's ability to help you upgrade, given that you basically asked it to not think about it at all. Further, it would significantly bloat binaries, as this would mean many more duplicate versions of packages in your project. Basically, idiomatic use of = is only in very specific situations.
We could have required that you always specify some operator, but we also like reasonable defaults. Writing `^` everywhere is just more typing.
The transitive dependency issue isn't any different, really: that's due to said packages also not using =.
When you're adding a new direct dependency, taking the latest is almost certainly right. But when you're pulling in a new transitive dependency, I think taking the latest is problematic: it seems much safer to me to take the one closest to what the author of that code tested with.
I'll elaborate more on this in a post tomorrow.
For what it's worth, I think I do understand what cargo is doing, and why, and I completely respect that approach. I just think this other approach might have some different properties worth exploring. As I've been working on this I've been using cargo as the gold standard, really. If the end of this all is that people say Go's package management is as nice as cargo, then I will be very very happy. A long way to go yet for sure.
> Cargo selects the latest version of a new dependency and its own new dependencies
but that's not true; it selects the "latest semver-compatible minor version", which is a pretty different thing. The way you've phrased it makes it seem like cargo just flat out selects the newest version period (which can cause breaking changes and a whole lot of pain).
So of course "people are frequently surprised", your actual statement is incorrect and misleading. "Blindly updating to new versions" is a completely different (and scary) proposition from "updating to semver compatible minor versions". The latter still has its issues (I think the maximally minimal approach that vgo is taking is a pretty neat solution to this and some other issues) but it's a much more contained set of issues.
No, it's exactly the same thing. Russ already established that an incompatible module is not just a later version but, in fact, a totally different thing.
I also very much don't think you should cargo cult! As munificient said above, this is a Hard Problem, and there are a lot of options. In some senses, I'm glad you're not just copying things, as well, that's how we all learn.
But that is only if versions are used this way at all. I've found, at least in the Erlang world, that the model of taking the first "version" found in the dep tree and locking to it works well. In that case a patch version is not considered special anyway.
If someone doesn't respect semantic versioning the difference of one minor or few minor versions is irrelevant.
> For what it's worth, I think I do understand what cargo is doing, and why, and I completely respect that approach.
I think understanding SemVer would help a long way.
Hard-coding a v1 and v2 check? Anyone who understands semver would never make such a mistake!
It's also worth noting that Cargo uses a lock file. You will only get new versions when you add/remove/update a package in your Cargo.toml. Normal builds will all use your Cargo.lock.
Means no security fixes at the price of, well, minor developer inconvenience? What is the inconvenience, exactly?
> ...tomorrow the same sequence of commands you ran today would produce a different result.
I mean, I guess this is technically true. But seems like it shouldn't be an issue in practice as the API you're calling shouldn't have changed, just the implementation. If it has changed, then downgrade the dependency?
I find myself wholly in agreement with the idea that maximal version selection is appropriate for operating systems and general software components, but not necessarily desirable for a build system.
When you consider the evaluation of a dependency graph in the context of a SAT solver, you realize that the solver would consider both a minimal and maximal version as potentially satisfying the general constraints found in a dependency graph. Whether you then optimize the solution for a minimal or maximal version becomes a matter of policy, not correctness.
The security concern can be appropriately addressed by increasing the minimum version required in the appropriate places.
With all of that said, I think that Go's potential decision to optimize for minimal version selection is likely to be considered a bug by many because it will not be the "accepted" behavior that most are accustomed to. In particular, I can already imagine a large number of developers adding a fix, expecting others to automatically pick it up in the next build, and being unpleasantly surprised when that doesn't happen.
This is an interesting experiment at the least, but I hope they make the optimizing choice for minimal/maximal version configurable (although I doubt they will).
I believe rsc is hoping they can avoid the need for a SAT solver entirely by going with minimum versions (and, implied, not allowing a newer package to be published with a lower version number).
My point really was this: both a maximal and a minimal version can exist that satisfy a minimum-version (>=) bound dependency. In such a case, which one is chosen is a matter of policy, not correctness.
As for not allowing a newer package to be published with a lower version number, that is sometimes necessary in a back-publishing scenario. For example, imagine that you publish a newer package with a higher version that has breaking changes and a security fix, and you also publish a newer package of an older version that just has the security fix. It's entirely valid to do so, for what should be obvious reasons.
Using the min version appears to eschew the need for lock files. Want to upgrade? Bump your min version.
This only works if the system also prevents you from publishing a version of a package with a lower number than any previously-published version of that package.
So, for example, after you've shipped foo 1.3.0, if a critical security issue is found in foo 1.2.0, you can't ship foo 1.2.1. Instead, all of your users have to deal with revving all the way up to foo 1.3.1, where you're allowed to publish a fix.
It's not clear to me why they're trying so hard to avoid lockfiles. Lockfiles are great.
What the minimum version is doing is giving our code a way to automatically resolve upgrades if they are necessary. E.g. if module X requires module Z w/ a version >= 1.0.1, while module Y requires Z with a version >= 1.1.0, we clearly CANNOT use v1.0.1, as it won't satisfy the requirements of Y, but we CAN use v1.1.0 because it satisfies both.
The "minimum" stuff basically means that even if a version 1.3.2 of Z is available, our code will still use v1.1.0 because this is the minimal version to satisfy our needs. You can still upgrade Z with vgo, or if you upgrade a module X and it now needs a newer version of Z vgo will automatically upgrade in that case (but to the minimal version that X needs), but random upgrades to new versions don't just occur between builds.
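A rough sketch of that rule in Go, reusing the X/Y/Z numbers above; this is just the "maximum of the minimums" idea, not vgo's actual implementation.

    package main

    import "fmt"

    // newer reports whether version a is greater than b. For brevity this sketch
    // compares strings, which only works for single-digit "vMAJOR.MINOR.PATCH"
    // components; a real resolver would parse semver properly.
    func newer(a, b string) bool { return a > b }

    // selectVersion returns the highest of the declared minimums. Versions newer
    // than every minimum (say v1.3.2) are never even considered.
    func selectVersion(minimums []string) string {
        chosen := minimums[0]
        for _, v := range minimums[1:] {
            if newer(v, chosen) {
                chosen = v
            }
        }
        return chosen
    }

    func main() {
        // X needs Z >= v1.0.1, Y needs Z >= v1.1.0, so the build uses v1.1.0.
        fmt.Println(selectVersion([]string{"v1.0.1", "v1.1.0"}))
    }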
1. I depend on foo with constraint ">1.5.0". The current minimum version of foo that meets that is 1.7.0.
2. Later, foo 1.6.0 is published.
3. I run go get.
If I understand the proposal correctly, that go get will now spontaneously downgrade me to foo 1.6.0. That defies the claim that builds are always reproducible.
It's entirely valid (and interesting! I hadn't thought of this one), but I'm not sure if this would happen even once IRL, except for people trying to break the system. Which can be fun, but isn't a risk.
as always, of course there's a relevant xkcd: https://xkcd.com/1172/
If there is only one constraint against a dependency, then it behaves exactly as version locking.
The "maximum of the minimums" rule kicks in when the same dependency appears more than once in the dependency graph, because the constraints might be different.
vgo won't fail and say "incompatible versions". It will just resolve to the biggest of the lower limits. It's up to the build and test system to judge if the combination works.
That is, a given version that satisfies a version bound may not have necessarily been tested with all of the different combinations of versions of components it is used with. It will be interesting to see how minimal version selection interacts with that particular challenge. For a build system it may matter much less than an operating system.
The "incorporate" dependencies are described here: https://docs.oracle.com/cd/E26502_01/html/E21383/dependtypes...
Lock files also have the major advantage that you have a hash of the thing you're downloading so you're not depending on the maintainer not e.g. force pushing something.
Always run your builds with maximal version selection in CI (unlocked), and lock the new versions in if it passes all tests, otherwise leave the previous locked versions and flag it for review.
And if the new implementation has a new bug, you might be screwed. It worked last week, but not this week. How do I get back to the working version?
I may have inferred something that wasn't in the original comment, by reading too many of the other comments on this page.
I think it’s better to have a package manager that’s not reliant on a particular source control system.
I for one could live without go get altogether. In 99% of cases it boils down to a simple git clone anyway, which could (should?) be done by the package manager anyway.
Maybe it's time to re-evaluate the existence of go get, now that we're seeing that "just clone master and hope nothing breaks" has obviously not worked for everyone outside of Google? Maybe bundling Go (the compiler and linker) with tools for fetching source code wasn't such a good idea after all?
Just my 2 cents...
Plus, now libraries are "modules", so libraries can have dependencies on specific versions. Before, the question was what you do if multiple dependencies have the same dependencies in their own separate vendor directories. This change removes that, as it's handled by the vgo tool for all modules in that build.
go.mod is kind of a cross between lock-file and dependency listing. I think it will work alright.
All together, it seems to be a cross between gb and dep, while also attempting to solve library-packages tracking dependencies too.
I dug into vgo a bit and found the downloaded packages currently reside in `$GOPATH/src/v/cache`, and it seems like vgo won't work without GOPATH for now (https://github.com/golang/vgo/blob/b6ca6ae975e2b066c002388a8...).
> First, the meaning of “newest allowed version” can change due to external events, namely new versions being published. Maybe tonight someone will introduce a new version of some dependency, and then tomorrow the same sequence of commands you ran today would produce a different result.
This is why we have lock files and they work better for this.
While vgo's approach does allow reproducible builds, it doesn't allow you to guarantee reproducibility the way that a lock file with specific commit hashes does. With specific commit hashes you can verify that the version you are building with in production is the exact same code as the one your security team audited prior to release (assuming git has fixed its hash collision vulnerability).
You can get around this by abandoning version constraints in your go.mod file, but then you have to track these version constraints out of band and manually figure out what commit hash to stick in go.mod
You could also get around this by storing the hashes for the approved versions out of band and creating a build script that verifies these hashes prior to building.
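As a rough illustration of that second workaround (the archive name and the recorded digest are placeholders, not anything prescribed by vgo):

    package main

    import (
        "crypto/sha256"
        "encoding/hex"
        "fmt"
        "io"
        "log"
        "os"
    )

    // fileSHA256 returns the hex-encoded SHA-256 digest of the file at path.
    func fileSHA256(path string) (string, error) {
        f, err := os.Open(path)
        if err != nil {
            return "", err
        }
        defer f.Close()
        h := sha256.New()
        if _, err := io.Copy(h, f); err != nil {
            return "", err
        }
        return hex.EncodeToString(h.Sum(nil)), nil
    }

    func main() {
        // Digest recorded out of band by the security team (placeholder value).
        const approved = "0000000000000000000000000000000000000000000000000000000000000000"
        got, err := fileSHA256("vendor/foo-v1.2.0.zip")
        if err != nil {
            log.Fatal(err)
        }
        if got != approved {
            log.Fatalf("dependency hash mismatch: got %s", got)
        }
        fmt.Println("dependency matches the audited version")
    }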
Both of these workarounds seem to defeat the point of having a standard package manager in the first place.
> Second, to override this default, developers spend their time telling the package manager “no, don't use X,” and then the package manager spends its time searching for a way not to use X.
If you are concerned with allowing users to override this default, why not have a directive to override it that can optionally be added to each requirement in the go.mod file? This avoids an unexpected default and doesn't force people who want the industry-standard default to use a script to set which dependencies use or don't use the -u flag with 'go get'.
Can we take a moment and point out that the hash collision vulnerability is still at large with no ETA? This is after years of it being considered insecure.
I feel like this point continues to be glossed over with versioning systems that depend on git commit hashes.
Bruce Schneier warned in February 2005 that SHA needed to be replaced. (https://www.schneier.com/blog/archives/2005/02/cryptanalysis...) Git development didn't start until April 2005. So before git was even developed, SHA was identified as needing to be deprecated.
So now, 13 years later, this is still an issue.
As an example, jteeuwen/go-bindata was deleted this month and somebody made a new repo in its place... hopefully with the same contents.
1. VCS tags are mutable. That's why lock files store revision ids. Go is being used to build immutable infrastructure, but the proposed package management system uses mutable versions.
2. The proposal is less featureful than dep, npm/yarn in JS, composer in PHP, Maven, Cargo for Rust, and others. I wonder how people will react to that.
: it fixes so many readability problems with SHA-pinned lock files, easily shows downgrades in `diff` output, and `sort` likely produces the exact result you wanted.
: which may not be a problem, since you could in theory just re-run the tool to fix it when you enter "v1.2.3" by hand.
Why not do the symbol mangling in the compiler if the goal is to have multiple simultaneous versions built in? This is easy to make backward compatible since it could default to v1.
It will be interesting to see how this plays out in the longer term.
I like Go's distributed package imports (even if everyone just uses Github), but it means you're distributing single points of failure all over the place. Vendoring (whether source or modules) solves this problem.
You want reproducible builds? Vendor your crap. Check it into your repository. Live free.
A build takes like half a second because everything is already there; mvn will first download the world. I'm a Scala developer and sbt will probably take like 10 minutes just downloading stuff on a cache miss (even with a local proxy).
We always have internal Plexus mirrors, and the Jenkins servers have their global .m2 local repository.
Even big JEE applications barely take more than 5 minutes to build.
The only build system I really dislike are Android builds with Gradle, trying to beat C++ compilation times.
Check your shit into your git repository. And if managing that becomes a problem, you've already done goofed up and have too many dependencies, and it's unlikely there's _anything_ reproducible about your software.
And yet, that's exactly what Maven Central artifacts are.
And why Maven Central is rock solid for reproducible builds while supporting versioning.
Where will the actual code be?
I am also firm believer in version-pinning/lockfiles. Updating versions should be a separate workflow with first-class built-in tools to support it. I think that is the area where most package managers fall flat. They basically rely on the developer to do all the heavy lifting.
Unfortunately, it also has a downside, and in go that downside would be noticeable.
Let's take one easy example: a logging library. Let's say I pull in "logrus v1.0.1" and one of my dependencies pulls in "logrus v1.0.2". In my "main" function, I set logrus's default loglevel to debug (logrus.SetLevel(logrus.DebugLevel)).
If go did the thing nodejs does (a different copy of logrus for my dependency than my main), the "logrus.SetLevel" would only affect my package, not any of my dependencies; they'd still log at a default level.
This would be true of other things; "func init" code would run once per dependency that had that library, not just once, maps wouldn't be shared, pools, etc.
This is a lot more memory usage, but it's also really surprising, especially in the case of logging.
I definitely prefer having only one copy of a library in memory and having package-level variables (like log level) work as expected.
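A minimal sketch of why that matters, assuming github.com/sirupsen/logrus (nothing here comes from the comment above): because one copy is shared, the level set in main is visible to every package that logs through logrus.

    package main

    import log "github.com/sirupsen/logrus"

    func main() {
        // One shared logrus package means this setting applies process-wide,
        // including to any dependency that logs through logrus.
        log.SetLevel(log.DebugLevel)

        // If a dependency were compiled against its own private copy of logrus,
        // that copy would still be at the default level and its debug output
        // would silently disappear.
        log.Debug("visible because main and its dependencies share one logrus")
    }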
If that ends up failing and I need two different versions, vendor+import-path rewriting allows an escape hatch
These days, npm actually tries to minimize and flatten dependency versions as much as possible to avoid the huge memory tax.
That would be awesome. GOPATH was an awful idea that works only for distribution package managers.
To name untagged commits, the pseudo-version v0.0.0-yyyymmddhhmmss-commit identifies a specific commit made on the given date.
I believe this was always the plan. The dep readme certainly makes it sound that way, at least:
> dep is a prototype dependency management tool for Go. It requires Go 1.8 or newer to compile. dep is safe for production use.
> dep is the official experiment, but not yet the official tool. Check out the Roadmap for more on what this means!
This seems fraught.
Let's say you care about strict build reproducibility. You want some form of manifest + lock file that defines exactly what versions of dependencies your package expects. Great; more power to you.
The only reason you'd want that is if you aren't storing your dependencies with your source code in source control. Otherwise, what's the point? If you were vendoring and committing it, you already have strict reproducibility, and you have the "supported versions" defined in the git repositories that come alongside your dependencies.
So, adding this manifest+lock file allows you to get away with not vendoring+committing. Awesome. You've gained reproducibility, but certainly not strict reproducibility.
Dependency package maintainers can rewrite git history. They can force push. They can push breaking changes with minor versions. They can delete their repositories. All of these things have already happened in NPM, either maliciously or accidentally; why do we think they wouldn't happen with Go?
If you want strict reproducibility but are expecting it with just a manifest+lock file, without dependency vendoring+committing, you're not getting it. Full stop.
So, really, by adding a manifest+lock file, you're adding a "third level" of reproducibility (#2 on this list).
1. Low Reproducibility: Pull HEAD on build.
2. Partial Reproducibility: Use tooling to pin to a git commitish, pull this on build.
3a. Full Reproducibility (Easy+Dirty): Vendor+Commit all dependencies.
3b. Full Reproducibility (Hard+Clean): Mirror your dependencies into a new self-controlled git repo.
I am struggling to think of a solid use case that would find real value in #2. I have no doubt that people think it would be valuable, but then the moment the left-pad author deletes his repository, you're going to second guess yourself. You didn't want #2; you wanted #3 and settled for #2 because #3 was too hard or dirty and you convinced yourself that adding tooling and version numbers was keeping you safe. Because semver, right?
Moreover, this comes at the cost of complexity, which is strictly against Go's core doctrine.
Moreover, a #2 solution looks strikingly similar to gopkg.in. Instead of referencing Repository:HEAD we reference Repository:Commitish. The main difference is that gopkg.in is an external service whereas this would be controlled tooling. But by choosing #2 you're already exposing yourself to bad actors, accidents, github going down, all of the above. So you're willing to accept that exposure, but aren't willing to add that little extra exposure of one more service?
I agree that the problem today, even with dep, isn't perfect. But I still strongly believe that the answer lies somewhere in the ideas we already have, not by importing ideas from NPM. We need a #3 that is Easy and Clean, not a #2.
What's next, persuading them that generics are pretty cool too?
They're similarly aware that generics are a desirable thing to have, but haven't determined the ideal way to implement generics into the language.
Vendoring needs to be solved and unified, I get it.
But to me, the biggest thorn is generics. I have run into too many cases where generics would have made my life a lot easier.
Thank you for your attention.
Then you propose to introduce the concept of modules and make the world a hell?
I believe versioning should be left to third-party tools. For example, no one is complaining about the lack of versioning built into Node.js.