
Package Management: The problem with using version ranges
https://www.lucidchart.com/techblog/2017/03/15/package-management-stop-using-version-ranges
======
yuchi
Please note: I can speak only for the Node.js npm/Yarn ecosystem.

This article is terribly outdated. Version ranges are great and the only way
to correctly manage your dependencies without ending up with tons of
duplicates. They are also a security guarantee, because with them you get
security fixes “automatically”.

Yet there have been problems; that’s why both Yarn and npm@5 implement a lock
file, which gives you all the guarantees of range-less package management with
the power of a range-based dependency tree.
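
For example, with npm@5 the manifest keeps the range while the lock file
records the exact resolution (a minimal sketch; the package name and versions
are invented):

    // package.json: the range you maintain by hand
    { "dependencies": { "some-lib": "^1.3.0" } }

    // package-lock.json: the exact version npm actually installed
    { "dependencies": { "some-lib": { "version": "1.3.7" } } }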

If module developers following this (IMHO not lucid) article start bundling
specific versions in their npm packages, everyone will suffer. It’s as stupid
as it gets.

Please, do your research before calling out on practices.

~~~
marichards
Getting security fixes automatically is not a good thing.

Too many open source projects fail to ensure non-breaking changes or long term
support of multiple release versions.

So your options are: rely on the greatest version (>), which has no guarantee
against breaking changes rather than just patches; restrict yourself to
patches (~), where a new major release is effectively the end of support; or
rely on greatest, fail to do a regular audit, and realise too late that the
dependency is no longer in active development and will never receive any
updates (which, upon typical discovery, would require switching to a
different dependency).
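
Concretely, with npm-style semver (a minimal sketch using the `semver`
package; the versions are invented):

    const semver = require('semver');
    semver.satisfies('2.0.0', '>1.2.0');  // true:  ">" accepts breaking majors
    semver.satisfies('1.2.9', '~1.2.0');  // true:  "~" accepts patch releases only
    semver.satisfies('1.3.0', '~1.2.0');  // false: a minor bump falls outside "~"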

Security isn't just about vulnerability, and I don't like to ask this
(because the answer isn't universal), but isn't a risk of system failure often
worse than a risk of system vulnerability? It depends somewhat on how critical
the system is, and defaulting to explicit versions instead of latest is
probably the better practice in my opinion.

I've done development using Node.js and on the JVM, and found I had far fewer
problems with Java dependency management.

~~~
senko
> Too many open source projects fail to ensure non-breaking changes or long
> term support of multiple release versions.

Relying on those projects is not a good thing.

~~~
digi_owl
When such big names as Gnome can't pull it off, what projects can one rely
on?

~~~
zer0tonin
For example, React never introduces breaking changes by surprise.

~~~
lucisferre
I think that should be qualified with "intentionally". Angular is supposed to
be following semver but often has "regressions" in point releases, which they
fix in a following point release. Usually fairly quickly, but it is still not
guaranteed.

------
infinity0
Yet another package management "advice" article from the point-of-view of a
developer and not a maintainer.

You try maintaining an ecosystem where every dependency constraint is a hard
specific version. You'll spend 80% of your time fiddling with bumping
versions. Fuck that shit.

~~~
k__
Well, someone has to do the work.

But version pinning is probably the best way at the moment.

This way the maintainers can use ranges and the developers can pin what they
really need in the end-product.

There are more developers than maintainers, so this scales much better.

~~~
eeZah7Ux
> There are more developers than maintainers, so this scales much better.

Wrong comparison. The majority of software is built, deployed, maintained and
used by entirely different organizations over years.

Companies doing end-to-end CD and running their software only internally are
the minority.

This kind of bad practice from a few developers turns into countless hours of
work maintaining systems years later.

------
bmn__
Paul Draper from TFA totally ignored the perspective of a packager.

It would be a completely unacceptable situation for application A1 to depend
on DB abstraction library v0.8651, A2 on v0.8653, A3 on v0.8670, A4 on v0.87…

How do you ship this to your distro users? Almost always there is no easy way
to install several libraries of the same name but different versions in
parallel, and it's never supported by the distro auto-packaging tools.

Patching to use the latest version of the dependencies and verifying the
patched version works ok is a huge burden on package maintainers. The
application author knows the code much better!

The current solution of specifying minimum versions of dependencies exists not
because we like the suffering caused by occasional breakage. No, it exists
because it is a practical, working solution for the real world that is the
best balanced trade-off.

~~~
codedokode
The problem is that nobody really tests all the versions they specify in a
dependency list. They just install the dependencies, and if everything works,
commit the file.

In PHP we usually have a version range in libraries' dependencies and a lock
file that specifies exact dependency versions for the entire application. So
you can be sure that you get a tested combination of libraries.

~~~
eeZah7Ux
> The problem is that nobody really tests all the versions they specify in a
> dependency list

Some do. Some others target the versions shipped in a stable Linux
distribution.

And even if the upstream doesn't test some combinations of versions,
distributions will test what they ship.

~~~
codedokode
> Some do.

How can they check future versions of libraries that will match the range?

------
molsson
In npm all packages have their own dependencies, so specifying a specific
version sort of works. In Maven, for example, the dependency tree is flattened
by the package manager so that at runtime there can only be a single version
of libA. This is just how Java 8 and below works: you simply cannot import two
versions of com.example.Foo at the same time. This will change in Java 9 with
Jigsaw though. OSGi was a way to work around this problem in <= Java 8. npm
also has a "flattening" feature to save disk space, but this is a completely
different thing, an implementation detail that is never visible to the
packages themselves.

~~~
SanderMak
Just want to point out that Jigsaw/Java 9 will not support multiple versions
out-of-the-box. (disclosure: I'm currently authoring Java 9 Modularity for
O'Reilly)

------
lucb1e
I can only imagine the number of new version conflicts this would generate
(I've already seen more than I'd like) if we no longer went "1.8.0 has a
feature that we depend on, so 1.8.0 and newer". Every time a dependency got an
update, every dependent package would need updating too (or at least its
metadata). Hell no.

Perhaps go for a range + last tested version, e.g. "1.9.7 worked" so when 2.0
comes out and breaks, you know what to use instead.

~~~
yuchi
That’s how semver ranges work: saying `^1.3.12` means “I tested against
v1.3.12 and I want all non-breaking updates that come afterwards”.
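
For instance (a minimal sketch using the `semver` package; the versions are
invented):

    const semver = require('semver');
    semver.satisfies('1.4.0', '^1.3.12');  // true:  new minor, assumed non-breaking
    semver.satisfies('2.0.0', '^1.3.12');  // false: a new major signals breaking changes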

~~~
HurrdurrHodor
Well, if you already know which updates are non-breaking, of course you have
solved this problem.

~~~
lucb1e
Indeed.

If you change the output of "ls", you know everything that uses ls will break
and you can bump a major version. But if you change e.g. the date format to
include a timezone (visible when -l is specified), does such a change really
warrant a major version bump? It's just a minor change but it might still
break things. Or maybe if you refactor some stuff in curl, the curl-dev
package might suddenly be incompatible with something that depended on it.

------
flukus
If specific versions are used then you can end up with multiple versions of
the same dependency, which can cause additional problems if objects are passed
around. In compiled languages you end up with errors like "can't assign typeA
to typeA because they are different types", when they are really different
versions of the same type.

Not sure how node would handle the same type of error but I'm guessing it
would involve subtle compatibility issues.
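
A rough sketch of how it can surface in Node (the package names are
hypothetical): when two copies of the same package end up in the tree, the
same class is loaded twice and identity checks fail across the copies.

    // dep-a and dep-b each bundle their own copy of a "shapes" package,
    // so Node loads the Shape class twice as two distinct constructors.
    const { Shape } = require('dep-a/node_modules/shapes');
    const shape = new (require('dep-b/node_modules/shapes').Shape)();
    console.log(shape instanceof Shape); // false: same name, different copy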

~~~
yuchi
You would have no errors except when dependencies expect to be “the only one”,
in “peer dependency” style.

------
jacques_chester
This is right: unless you use exact versions, ideally vendored or cached in a
local package server, you do not have even basic reproducibility.

However.

The reason people use ranges is because they don't want to handle the
administrative burden of tracking upstream changes and updating their
software.

The OWASP Top 10 (2017 draft) shows that using known-vulnerable components is
a common security weakness in software. Nobody sets out to do this, but the
flipside of having access to massive troves of dependencies is that you have
dozens, hundreds, perhaps thousands of dependencies you might not know about.

I've worked on Cloud Foundry Buildpacks. A large amount of effort has gone
into exactly this problem: tracking upstream versions, pulling new ones
immediately, building them and testing them. But it's at varying levels of
sophistication.

For some dependencies, you get structured data. NodeJS publishes an index.json
file[0] which is fairly trivial to keep up with. But for other sources, which
I will leave unnamed, it's necessary to do flaky, unreliable HTML parsing to
detect new releases. Data about what's been found so far is kept in a ci-
robots[1] repo to ensure atomicity and simplicity (I'd prefer a Concourse
resource that emits versions one by one, but that's just me).
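
For the structured case, keeping up really can be a few lines (a minimal
sketch; assumes Node 18+ with global fetch, run as an ES module for top-level
await):

    // Read the Node.js release index and report the newest version.
    const releases = await fetch('https://nodejs.org/dist/index.json')
      .then((res) => res.json());
    console.log(releases[0].version); // the index is ordered newest-first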

What's needed is a uniform way to describe available versions of software
packages. Uniform across all the major tributaries of versions: package
managers, distributions, the works. If it becomes possible, as Buildpacks have
after a great deal of effort, to automate fetching and testing new
dependencies, then the need for ranges vanishes. You can rely on your CI/CD to
run each dependency change through its paces and update the firmly fixed list
for you.

[0] https://nodejs.org/dist/index.json

[1] https://github.com/cloudfoundry/public-buildpacks-ci-robots

------
kronos29296
Something like Haskell Stack fixes a lot of problems with versions, but it is
not suitable for all situations. It is a lot of work to produce a curated set
of packages that work together. But most of the time things just work, and
people work around it the rest of the time.

~~~
wereHamster
It's not Stack but Stackage you are referring to.

Stackage demands considerable effort from the whole community to ensure that
packages remain compatible, and it forces package publishers (developers) to
update their packages in a timely manner (otherwise the package will be
dropped from the snapshot). That works in the Haskell community, but I'd be
hard pressed to believe any JavaScript developer who has a package on npm.org
would put the same amount of effort into keeping it up to date.

~~~
kronos29296
The package manager is Stack and the online repo is hosted at Stackage. You
manage packages from Stackage using Stack. You probably can't do it in a
larger community like JS or Python because it is too big. Haskell is at the
right size to do such things.

~~~
wereHamster
Not all Haskell packages (those on hackage.org) are in the Stackage snapshots.
Stackage is opt-in; there must be a maintainer willing to resolve the
occasional build failures.

That means there is no such thing as 'too big', because the number of all
existing packages in an ecosystem doesn't matter. It's the commitment that
counts, and that's what I don't see in the JS community.

------
u801e
What I would like to see is support for existing package managers like rpm and
deb that could be mirrored from these language repositories (e.g. Python,
Ruby, Perl, etc.). This would be more difficult for packages that have
external language bindings, but a source rpm/deb package could work.

------
codedokode
PHP's package manager (composer) has a lock file that contains exact versions
of packages and can be committed to version control. So we don't have problems
with version ranges.

------
ex_amazon_sde
It's sad to see many people quoting npm as a good example while version
ranging has been mastered by package managers on Linux and earlier Unix
systems for decades.

------
baq
There's one solution in the article that's valid: version lock files. These
are a must-have for any reasonable deployment strategy.

Now, when you talk about actually developing the software, that is a
completely different set of requirements, and it is best served by very
coarse-grained (but safe) version ranges. Different tool for a different use
case.

------
ryanbrunner
I feel like this is describing a solved problem, and what's more, the author
knows the solution. Lock files eliminate the issue of mysterious build
failures and allow almost trivial creation of automated canary builds to
detect problems early.

------
_pmf_
This is about using open version ranges. Closed ranges are slightly less evil,
though errors can creep in, too.

~~~
yuchi
Very nice point. If you look at the npm examples in the article, they all show
the _>=M.m.b_ pattern, which is never found in the wild. You’ll find _~M.m.b_
for VERY old packages and _^M.m.b_ for modern ones (last 4 years). Those let
the minor and build increase, but not the major.

------
sjrd
How come this is not already obvious to everyone?

I completely agree with the article. Version ranges are evil. Avoid them at
all cost.

~~~
alexscheelmeyer
One reason is that many developers work with package managers / languages that
are INCAPABLE of handling multiple versions of the same dependency. Without
that capability, you quickly get version deadlocks if you don't use semver
ranges.

Another reason is that many developers obsess over getting the latest version
of their dependencies, for fear of security issues or just of missing out on
the latest and greatest, and they often completely skip retesting the
application since they now have someone to blame if it fails (that other
developer should not have pushed a breaking change with a minor version
bump!).

I agree with you that it should be the standard to have fixed versions and
update your dependencies at a time of your choosing so that everything can get
tested properly - but it seems to be an uphill battle.

~~~
eeZah7Ux
> many developers are obsessing on getting the latest version of their
> dependencies for fear of security issues

Getting the latest version is how you get new vulnerabilities.

Various software distributors, including some Linux distros, let software bake
in for this reason, and they can even be faster than the upstreams in
developing and applying patches for known vulnerabilities.

Also, unfixed but known vulnerabilities are less dangerous: security and
system engineers can work around them, and IDS/IPS can detect and often block
attacks.

