
I'm going to comment mostly on the parts of the proposal that I think are wrong, but don't take this to be an overall negative response. I'm excited to see smart folks working on this, and package management is a really hard problem. There are no silver bullets to code reuse.

Context for those who don't know: I, along with Natalie Weizenbaum, wrote pub[1], the package manager used for Dart.

> Instead of concluding from Hyrum's law that semantic versioning is impossible, I conclude that builds should be careful to use exactly the same versions of each dependency that the author did, unless forced to do otherwise. That is, builds should default to being as reproducible as possible.

Right on. Another way to state this is: Changing the version of a dependency should be an explicit user action, and not an implicit side effect of installing dependencies.

    import "github.com/go-yaml/yaml/v2"
> Creating v2.0.0, which in semantic versioning denotes a major break, therefore creates a new package with a new import path, as required by import compatibility. Because each major version has a different import path, a given Go executable might contain one of each major version. This is expected and desirable. It keeps programs building and allows parts of a very large program to update from v1 to v2 independently.

It took me several readings to realize that you encode the major version requirement both in the import string and in the module requirements. The former lets you have multiple copies of the "same" module in your app at different major versions. The latter lets you express more precise version requirements like "I need at least 2.3, not just 2.anything".
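
To spell out the two places (a sketch; the v2.3.0 number is invented for illustration), in your source file:

    import "github.com/go-yaml/yaml/v2"
and in your go.mod:

    require (
      "github.com/go-yaml/yaml/v2" v2.3.0
    )
The import path carries the major version; the requirement carries the "at least 2.3" part.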

I think it's really going to confuse users to have the major version in both places. What does it mean if my code has:

    import "github.com/go-yaml/yaml/v2"
But my go.mod has:

    require (
      "github.com/go-yaml/yaml" v1.5.2
    )
I don't know if the goal of this is to avoid lockfiles, or to allow multiple versions of the same package to co-exist, but I think it's going to end up a confusing solution that doesn't cleanly solve any problem.

For what it's worth, Dart does not let you have two versions of the same package in your application, even different major versions. This restriction does cause real pain, but it doesn't appear to be insurmountable. Most of the pain seems to be the performance issues that over-constrained dependencies caused in our old version solver, not in the user's code itself.

In almost all cases, I think there is a single version of a given package that would work in practice, and I think it's confusing for users to have an application that has multiple versions of what they think of as the "same" package inside it. This may be less of an issue in Go because interface satisfaction is structural, but in Dart you could get weird errors like "Expected a Foo but got a Foo" because those "Foo"s are actually from different versions of "foo". Requiring a single version avoids that.
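
In Go terms, the analogous confusion would look something like this (a sketch; the Node type is hypothetical, and the program deliberately does not compile):

    package main

    import (
        yamlv1 "github.com/go-yaml/yaml"
        yamlv2 "github.com/go-yaml/yaml/v2"
    )

    // takesV1 accepts only the v1 type.
    func takesV1(n yamlv1.Node) {}

    func main() {
        var n yamlv2.Node
        // Compile error: cannot use n (type yamlv2.Node) as type yamlv1.Node.
        // At least the Go error names the full paths, unlike
        // "Expected a Foo but got a Foo".
        takesV1(n)
    }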

> I believe this is the wrong default, for two important reasons. First, the meaning of “newest allowed version” can change due to external events, namely new versions being published. Maybe tonight someone will introduce a new version of some dependency, and then tomorrow the same sequence of commands you ran today would produce a different result.

No, I think newest (stable version) is the right default. Every package manager in the world works this way and the odds that they all got this wrong are slim at this point.

At the point in time that the user is explicitly choosing to mess with their dependencies, picking the current state of the art right then is likely what the user wants. If I'm starting a brand-new, from-scratch Ruby on Rails application today, in 2018, there is no reason it should default to having me use Rails 1.0 from 2005.

Every version of the package is new to me because I'm changing my dependencies right now. Might as well give me the version that gets me as up-to-date as possible because once I start building on top of it, it gets increasingly hard to change it. Encouraging me to build my app in terms of an API that may already be quite out of date seems perverse.

> This proposal takes a different approach, which I call minimal version selection. It defaults to using the oldest allowed version of every package involved in the build. This decision does not change from today to tomorrow, because no older version will be published.

I think this is confusing older versions with lower ones. You could, I suppose, build a package manager that forbids publishing a version number lower than any previously published version of the package, and thus declare this to be true by fiat.

But, in practice, I don't think most package managers do this. In particular, it's fairly common for a package to have multiple simultaneously supported major or minor versions.

For example, Python supports both the 2.x and 3.x lines. 2.7 was released two years after 3.0.

When a security issue is found in a package, it's common to see point releases get released for older major/minor versions. So if foo has 1.1.0 and 1.2.0 out today and a security bug that affects both is found, the maintainers will likely release 1.1.1 and 1.2.1. This means 1.1.1 is released later than 1.2.0.

I think preferring minimum versions also has negative practical consequences. Package maintainers have an easier job if most of their users are on similar, recent versions of the package's own dependencies. It's no fun getting bug reports from users who are using your code with ancient versions of its dependencies. As a maintainer, you're spending most of your time ensuring your code still works with the latest, so having your users in a different universe makes it harder to stay in sync with them.

Look at, for example, how much more painful Android development is compared to iOS because Android has a much longer tail of versions still in the wild that app developers need to deal with.

If you do minimum version selection, my hunch is that package maintainers will just constantly ship new versions of their packages that bump the minimum dependencies to forcibly drag their users forward. Or they'll simply state that they don't support older versions beyond some point in time, even when the package's own manifest states that it technically does.

There is a real fundamental tension here. Users — once they have their app working — generally want stability and reproducibility. No surprises when they aren't opting into them. But the maintainers of the packages those users rely on want all of their users in the same bucket on the latest and greatest, not smeared out over a long list of configurations to support.

A good package manager will balance those competing aims to foster a healthy ecosystem, not just pick one or the other.

[1]: https://pub.dartlang.org/




You (and likely everyone else) should look at the tour as well before commenting, as I think many people are misunderstanding some of the subtler points.

> If I'm starting a brand-new, from-scratch Ruby on Rails application today, in 2018, there is no reason it should default to having me use Rails 1.0 from 2005.

In the tour it states, "We've seen that when a new module must be added to a build to resolve a new import, vgo takes the latest one." which means that the newest Rails would be used and set in your `go.mod` file.

From that point onwards the "minimal version" will be used, which means vgo won't upgrade you to a version released tomorrow unless you (or a module you use) explicitly state that they need that newer version.
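
A sketch of that flow (module path and versions invented for illustration):

    $ vgo get github.com/example/rails
    $ cat go.mod
    module "github.com/me/app"
    require "github.com/example/rails" v5.1.4    // the latest release at add time
If someone publishes v5.1.5 tonight, tomorrow's build still uses v5.1.4 until you, or a module you depend on, asks for more.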

This is a much saner default than the one you describe (imo), as people still get recent versions for new projects; but once they are using a specific version, they won't upgrade unless they need to or want to.


> When a security issue is found in a package, it's common to see point releases get released for older major/minor versions. So if foo has 1.1.0 and 1.2.0 out today and a security bug that affects both is found, the maintainers will likely release 1.1.1 and 1.2.1. This means 1.1.1 is released later than 1.2.0.

I should have addressed this in the original reply and it's too late to edit now, but this isn't an issue. I downloaded vgo and verified that you CAN release a 1.1.1 AFTER 1.2.0 and it is treated correctly as far as I can tell.

See github.com/joncalhoun/vgo_main:

    $ vgo list -m -u
    MODULE                          VERSION                    LATEST
    github.com/joncalhoun/vgo_main  -                          -
    github.com/joncalhoun/vgo_demo  v1.0.1 (2018-02-20 18:26)  v1.1.0 (2018-02-20 18:25)
v1.0.1 is newer than v1.1.0 by timestamp, but isn't treated as the latest version. I suspect that RSC didn't mean "older" in the literal datetime sense, but rather in the semantic versioning sense, where "older" means lower in the version ordering: you don't release v1.3.4 AFTER you have released v1.3.5.
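
To make the ordering concrete, here is a sketch of a semver-precedence comparison (release dates play no part):

    package main

    import (
        "fmt"
        "strconv"
        "strings"
    )

    // compare orders "vX.Y.Z" strings by semver precedence and
    // returns -1, 0, or 1. Publication dates are irrelevant.
    func compare(a, b string) int {
        pa := strings.Split(strings.TrimPrefix(a, "v"), ".")
        pb := strings.Split(strings.TrimPrefix(b, "v"), ".")
        for i := 0; i < 3; i++ {
            na, _ := strconv.Atoi(pa[i])
            nb, _ := strconv.Atoi(pb[i])
            if na < nb {
                return -1
            }
            if na > nb {
                return 1
            }
        }
        return 0
    }

    func main() {
        // v1.1.1 can be published after v1.2.0, but it is still
        // "older" in the ordering that matters here.
        fmt.Println(compare("v1.1.1", "v1.2.0")) // -1
    }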


> In the tour it states, "We've seen that when a new module must be added to a build to resolve a new import, vgo takes the latest one." which means that the newest Rails would be used and set in your `go.mod` file.

That works for adding a new dependency. But, as I understand it, if I decide to upgrade my dependency on foo by changing its already-present version in my app's module file, this does not upgrade any of the transitive dependencies that foo has. Instead, it selects the lowest versions of all of those transitive dependencies even though my goal with foo itself is to increase its version.

So now I have to keep in mind that sometimes it picks the latest version and sometimes it doesn't, depending on the kind of change I'm making.


The new release of the dependency can also bump the minimum required versions of its own dependencies as part of its release cycle. If it doesn't, you can upgrade them like any other dependency; after all, transitive dependencies are just dependencies.

That said, you can just upgrade all the dependencies with vgo get -u and get the "always latest" behaviour. This is a desirable result, but it shouldn't happen at each and every fresh build.

You can have automation that periodically tries to bump all the versions and, if all tests pass, sends you a PR with the proposed update.
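
Sketched as a periodic job (assuming the vgo prototype mirrors the usual go subcommands; the PR step is whatever tooling you already use):

    $ vgo get -u          # bump every dependency to its latest version
    $ vgo test ./...      # check the new build list
    $ git diff go.mod     # if green, turn this into a PR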

With the proposed rules you get: 1. repeatable builds, as with lock files, and 2. constraint resolution that is simple to reason about when multiple modules depend on the same module.


Let's say I create a program that uses foo and end up with the following dependencies:

main:

    requires "foo" v1.0.0
foo (v1.0.0):

    requires "bar" v1.0.0
Right now if I check my dependencies, I'll have something like this:

    MODULE    VERSION
    main      -
    bar       v1.0.0
    foo       v1.0.0
Now let's say some time passes, and both foo and bar release new versions:

foo:

    v1.0.0
    v1.1.0
bar:

    v1.0.0
    v1.0.1
    v1.1.0
    v1.1.1
    v1.1.2

And the deps for foo v1.1.0 are:

foo (v1.1.0):

    require "bar" v1.0.1
Realizing that foo has an update, I decide I want to upgrade. I'd do vgo get foo. My updated dependencies (shown with "vgo list -m") are:

    MODULE    VERSION
    main      -
    bar       v1.0.1
    foo       v1.1.0
bar gets its version increased as well, using the version specified by the foo package's module. This makes sense to me - the foo package maintainer has stated that he only needs v1.0.1 to be stable, so we default to what he specified.

Now imagine I want to add another package, say the wham package, which has the following dependencies:

wham (v1.0.0):

    require "bar" v1.1.1
If I add this to my code my versions will now be:

    MODULE    VERSION
    main      -
    wham      v1.0.0
    bar       v1.1.1
    foo       v1.1.0
bar now uses v1.1.1 because it is the minimal version that satisfies all of my modules. vgo DOES upgrade bar for us, but not beyond the lowest version number required to satisfy all of our modules. That said, we can still upgrade it manually with "vgo get bar", after which it will be using v1.1.2 because our main dependencies would become:

main:

    requires "foo" v1.1.0
    requires "wham" v1.0.0
    requires "bar" v1.1.2
In short, upgrading foo WILL upgrade all of foo's dependencies in order to meet its minimum version requirements, but no further. That said, you can still manually upgrade any of those dependencies.
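
"But no further" is the whole algorithm, really: the selected version of each module is the maximum of the minimum versions requested anywhere in the graph. A sketch of that core rule (not vgo's actual implementation; plain string comparison stands in for real semver precedence, which is fine for these fixed-width examples):

    package main

    import "fmt"

    // buildList picks, per module, the maximum of all requested minimums;
    // nothing newer is ever chosen implicitly.
    func buildList(reqs map[string][]string, less func(a, b string) bool) map[string]string {
        out := map[string]string{}
        for mod, mins := range reqs {
            for _, v := range mins {
                if cur, ok := out[mod]; !ok || less(cur, v) {
                    out[mod] = v
                }
            }
        }
        return out
    }

    func main() {
        reqs := map[string][]string{
            "foo":  {"v1.1.0"},           // main requires foo v1.1.0
            "wham": {"v1.0.0"},           // main requires wham v1.0.0
            "bar":  {"v1.0.1", "v1.1.1"}, // foo wants v1.0.1, wham wants v1.1.1
        }
        less := func(a, b string) bool { return a < b }
        fmt.Println(buildList(reqs, less)) // map[bar:v1.1.1 foo:v1.1.0 wham:v1.0.0]
    }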

To me this makes sense. The creator of foo may have avoided upgrading the dependency on bar for performance reasons, say, so this upgrade only happens in your code if it is required by another package, if you initiate it manually, or if the foo package releases a new version with updated dependencies in its go.mod file.

PS - I've tested this all using the prototype of vgo. You can see for yourself by grabbing this code: github.com/joncalhoun/vgo_foo_main, then using vgo to list dependency versions and trying an upgrade of foo, which has a dep on demo.


> For what it's worth, Dart does not let you have two versions of the same package in your application, even different major versions. This restriction does cause real pain, but it doesn't appear to be insurmountable. Most of the pain seems to be the performance issues that over-constrained dependencies caused in our old version solver, not in the user's code itself.

> In almost all cases, I think there is a single version of a given package that would work in practice, and I think it's confusing for users to have an application that has multiple versions of what they think of as the "same" package inside it. This may be less of an issue in Go because interface satisfaction is structural, but in Dart you could get weird errors like "Expected a Foo but got a Foo" because those "Foo"s are actually from different versions of "foo". Requiring a single version avoids that.

I think this makes a strong case for not releasing major version upgrades that use the same package names. The very idea of two incompatible things having the same name should set off alarm bells. Instead of trying to make that work, we should be avoiding it.

In the absence of this principle, the Java ecosystem has developed a compensatory mechanism of packaging "shaded" copies of their dependencies alongside their own code. This is an ugly hack to accomplish the same thing after the fact, so we are already incurring even more complexity than would be imposed by following this rule.


> I think this makes a strong case for not releasing major version upgrades that use the same package names. The very idea of two incompatible things having the same name should set off alarm bells. Instead of trying to make that work, we should be avoiding it.

If you do that, I think you'll find in practice that one of two things happens (or more likely, both, in a confusing mixture):

1. People start releasing packages whose names include version numbers. "markdown2", etc. Then you get really confusing hallways conversations like, "Yeah, you need to use markdown2 1.0.0."

2. People start coming up with weird confusing names for the next major version of packages because the current nice name is taken. Then you get confusing conversations like, "Oh, yeah, you need to upgrade from flippitywidget to spongiform. It's almost exactly the same, but they removed that one deprecated method." Also don't forget to rename all of your imports.


> I think you'll find in practice that one of two things happens

I think the existence of those practices (like Java dependency shading) proves that people are struggling towards this solution on their own, without support from the language or the community. With official support, if major versions work the same way for everybody, it won't need to be so janky.

In practice, I predict that people would start behaving better, doing what they should have been doing (and what many have been doing) all along: avoiding unnecessary breaking changes in non-0.x libraries, choosing function signatures carefully, and growing by accretion and living with their mistakes. Right now, I think some developers see major version bumps as a convenient way to erase their mistakes, without taking into account the cost imposed on users who end up juggling dependency conflicts.


The major version goes into the name, markdown2, but the version numbers should be monotonic, so when viewed on a number line they are in order. This also allows the programmer to import both and have a smooth transition between the deps.


I'm not really enthused by efforts to bring in the same packaging constructs as other languages, which all have the effect of making time-to-compile after git clone longer.

Frankly I think we tend to conflate two separate but related tasks in these discussions: communicating updates, and distributing dependencies.

vendor/ folders are a totally fine distribution system - optimal even. Time-to-compile is 0 because you get the dependencies with git clone.

So really the problem we have is communicating updates (plus some git-server tooling smarts to deduplicate files, but let GitHub solve that).


According to the article, the solution I suggest has been proposed in the Go community, and I don't know of any language where it has actually been adopted, so Go might be the first. I just know that Java has been forced to work around its absence.

As for the tasks that need to be solved here, the primary one I see is reconciling the needs of different libraries. What do you do when you depend on library A and library B and they need two incompatible versions of library C? As I see it, there's no clean way to answer that question if A and B expect the two incompatible versions to be present with the same name.
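
Concretely, with hypothetical modules:

    A require "github.com/example/C" v1.2.0    // expects the v1 API
    B require "github.com/example/C" v2.0.0    // expects the v2 API
If both A and B expect their C to be present under the same name, no single version satisfies them, and every resolution strategy has to pick a loser.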


Yeah, I'm also a bit concerned with vendoring going away. The proposed solution to preserving upstream is imho very elegant (caching proxies) and scales better than vendoring, but it requires a bit more infrastructure. Perhaps vgo could be taught to look in a local directory for an exploded content of what logically is the upstream archive (and that local directory could be checked in git, or preserved as a cache by your CI etc).
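
Something like this might be enough (a sketch, assuming the prototype's $GOPROXY accepts a file:// URL pointing at a directory laid out like the upstream archive):

    $ export GOPROXY=file:///path/to/repo/modcache   # checked into git, or kept by your CI
    $ vgo build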


I think vendoring is very useful and it would be a step back if it becomes harder.

Caching proxies for zip downloads sound nice, but they're more than just "a bit more infrastructure". I think it would be a huge burden on package publishers if each of them had to manage their own dependency zip mirror as a separate piece of infrastructure. You need version control anyway; checking your dependencies into that same version control does not require a new piece of infrastructure.

Coming from Ruby, where rubygems.org is a very painful point of failure, in my eyes the fact that Go dependencies are not a separate download is a big plus.

In fact without a single blessed dependency repository such as rubygems.org, in the Go case you have as many points of failure at build time as there are different code hosting sites in your dependency graph.


I don't think package publishers are the ones who need to manage the caching proxies: everyone who wants stable builds that don't break when some upstream deletes an old version of a package (or when there's a networking error) needs a proxying cache.

Vendoring turned your git repo into a poor man's proxying cache. It also made some people unhappy: in my current company we use Phabricator for code reviews, and it doesn't work well if the commit size is bigger than some threshold.

I love having the option of not checking in dependencies. I'm not sure this option has to be forced on everybody, though.


> In the absence of this principle, the Java ecosystem has developed a compensatory mechanism of packaging "shaded" copies of their dependencies alongside their own code.

Well, that is OK-ish.

However, looking at the XML stuff, that is probably bad: a lot of packages repackage Xerces2 or the java.xml APIs, even though JAXP with streaming has been present in Java since 1.6. And nobody removes this stuff.


Now is a good time to mention the great blog post, "So you want to write a package manager," by Sam Boyer: https://medium.com/@sdboyer/so-you-want-to-write-a-package-m...


>I'm going to comment mostly on the parts of the proposal that I think are wrong, but don't take this to be an overall negative response. I'm excited to see smart folks working on this, and package management is a really hard problem.

And as usual, Golang ignores the progress made in this area by package managers such as npm, cargo, et al., in favor of what seems like a half-hearted solution.

Issues I see: the introduction of modules on top of packages solves no real problem; major version numbers become part of the package identification (thus allowing the same program to use different versions of a package); and "minimal version selection" solves nothing that lock/freeze files wouldn't solve better, while preventing users from getting important minor but compatible updates (e.g. security fixes) by default.


Having had lengthy conversations with Sam Boyer on this topic, I know that at least he deeply knows how these systems work. So it hasn't felt like "ignored" to me, at least, as an outside observer.


Perhaps not ignored in the "didn't know about them sense", but in the "nevertheless went ahead and did its own thing".


That's unconstructive and substance-less.

Could you expand on the progress you mentioned, or explain what parts of the counterexamples you gave the Golang folks should learn from? In what ways is the proposal a half-hearted solution?


Added some issues with the current proposal.


> I think it's really going to confuse users to have the major version in both places. What does it mean if my code has:

After reading the proposal, my understanding is:

The 'import "github.com/go-yaml/yaml/v2"' directive would lead to installing the oldest version of yaml 2.x that is supported by your other dependencies.

Meanwhile, the go.mod file would mean that any dependencies that use the incompatible yaml 1.x library would lead to installing the oldest 1.x version that is at least 1.5.2, which would then be used by all dependencies that import the 1.x version.

> No, I think newest (stable version) is the right default. Every package manager in the world works this way and the odds that they all got this wrong are slim at this point.

Doing this is meant to allow reproducible builds without requiring the use of a lock file. As to why they don't want a lock file... that isn't really addressed in the article. Lock files do seem like the most sane way to provide truly reproducible builds (ones that aren't dependent on repo tags not changing, since they are usually locked to a specific commit hash). I think the decision to avoid a lock file is a bad one and certainly needs to be justified.

> I think this is confusing older versions and lower. You could, I suppose, build a package manager that forbids publishing a version number lower than any previously published version of the package and thus declare this to be true by fiat.

I agree. I also think they meant to say "minimal minor version", since major versions have different import paths and are backwards incompatible.

Ideally, "prefer oldest / prefer newest" should be something that can be configured per requirement in the go.mod file so that people who don't care about reproducibility don't have to go through and bump their minimum versions every time any dependency has a new release. Making this dependent on using a flag every time you run 'vgo get' is silly and doesn't allow you to do this for some packages and not others without having to write your own script to make a bunch of 'vgo get' invocations.

> I think preferring minimum versions also has negative practical consequences. Package maintainers have an easier job if most of their users are on similar, recent versions of the package's own dependencies.

Ideally, the first step you take when you encounter a bug with a library would be to check to see if a more recent version of the library fixes the bug. In practice, I don't know how many people would fail to do this before filing a bug report.


Current Go vendoring already allows two versions of a package to be used at the same time, and it is problematic. Both copies of the package will run their initialization, which works if they are completely self-contained, but if they interact with any other global state, things can start going wrong. Your mutexes protecting external global resources don't work, because they are in different namespaces. These and similar problems are why the common wisdom is that libraries should not pull in vendored dependencies.

So yes, attempting to allow multiple versions of the same module will cause grief.

You would also need a way to override the choices made by dependencies, e.g. to rebuild with security or bug fixes, without requiring me to fork everything.
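
The proposal's go.mod exclude and replace directives would have to carry that weight; a sketch as I understand them, with hypothetical module paths:

    module "github.com/me/app"

    require "github.com/example/bar" v1.0.2
    exclude "github.com/example/bar" v1.0.1                          // skip a broken release
    replace "github.com/example/bar" v1.0.2 => "github.com/me/bar" v1.0.2   // point at a patched copy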


> Dart does not let you have two versions of the same package in your application, even different major versions. This restriction does cause real pain

Dart is primarily targeting web deployment, in which code/executable size is a major concern. For Dart's use case it makes perfect sense to force the developer to sort this out ahead of time, painful as it might be. For lots of other languages (including Go), the primary expected deployment is a binary executable, where bloating it with multiple versions of dependencies, to make builds easier and to make it possible to use two dependencies with mismatched dependencies of their own, is very rarely a problem.


How would it be possible to guarantee reproducible/deterministic compatibility without using something like the hash of the entire library as an implicit "version"? (say, the git SHA)

I believe that NixOS does something along these lines to guarantee deterministic builds.

A number of times in at least a couple of languages, I've seen a library keep the same (exact) version number but make some small "bugfix" change that ended up breaking things. Often, nothing stops someone from doing that.


The proposal doesn't seek to guarantee reproducible builds; it merely seeks to enable them, through the methods they outline.

If you did want to guarantee reproducible builds with SHA-1 hashes, one way would be to introduce those into the .mod files they outlined. But that'd be clunky; it's much easier to reason about a version number than it is a digest hash.
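
That might look something like this (entirely hypothetical syntax, truncated digest, just to show the clunkiness):

    require "github.com/go-yaml/yaml/v2" v2.1.0 sha256:9f86d081884c7d65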

Another method would be to introduce a lock file where those details are kept from plain view, but my sense was that they wanted a little more openness about the mechanism they were using than a lockfile provides (which is why .mod files use Go syntax, save the new "module" keyword they would introduce). After all, that's how dep works right now: they might as well just keep the lock file.

Cases where tags are deleted, or worse, where accounts are deleted and then recreated with (other? same?) code, may be said to break the semver contract the library or binary author has with their users. As such, they may be seen as outside the scope of what they are seeking to accomplish with vgo.


What are the criticisms of lockfiles? I've used lockfiles more or less successfully in Rails, Elixir, and of late, Node. I thought it was a proven (if imperfect) idea...


I should note I am fine with lockfiles myself, so I can only speculate as to what RSC/others feel. It is fair to say that lockfiles are not Go: they would be another set of syntax that has nothing to do with the language aside from its use in package management. So one might argue that it would be desirable to have a solution to Go's package management that was achieved using Go itself, which is what the module files are written in.


So make the lock file a simple Go program that just returns an array?

This approach actually adds tons of flexibility... not sure it’s needed, but you could return different lock data based on whatever logic you needed.
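
A sketch of what that could look like (the deps package, type, and pinned versions here are all invented for illustration):

    // Code generated by the package tool; a "lock file" that happens to be Go.
    package deps

    type Dep struct {
        Path    string
        Version string
    }

    // Locked returns the exact versions to build against.
    func Locked() []Dep {
        return []Dep{
            {Path: "github.com/go-yaml/yaml/v2", Version: "v2.1.0"},
            {Path: "github.com/example/wham", Version: "v1.0.0"},
        }
    }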


> Dart does not let you have two versions of the same package in your application

Not as direct imports, but Rust allows transitive deps to get the version they specify.



