
Semantic Versioning - A technique for avoiding "dependency hell" - mojombo
http://semver.org/
======
joblessjunkie
While this is a reasonable (and common) version numbering scheme, it does
little to address dependency hell.

A trivial bug fix can be backwards incompatible with systems that have been
built to rely on the bug.

The presence of new, unused features in a library can impact the runtime
behavior of the library in important ways.

All changes of any kind must be considered to be potentially backwards
incompatible for some consumers.

So the rules for deciding whether to increment a version number by 0.0.1,
0.1.0, or 1.0.0 don't hold up. A producer cannot predict the impact of a
change to all consumers.
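
For reference, the increment rules being discussed can be sketched roughly as below. This is a minimal Python illustration, not code from semver.org; the names `Version`, `parse`, and `bump` are made up for the example, and it ignores pre-release and build tags. Note that the `change` argument is the producer's own classification of the change, which is exactly the judgement questioned above.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def parse(text: str) -> Version:
    """Parse a plain 'MAJOR.MINOR.PATCH' string (no pre-release or build tags)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return Version(major, minor, patch)


def bump(version: Version, change: str) -> Version:
    """Return the next version for a change the producer has classified.

    'change' is the producer's judgement: "breaking", "feature", or "fix".
    """
    if change == "breaking":
        return Version(version.major + 1, 0, 0)
    if change == "feature":
        return Version(version.major, version.minor + 1, 0)
    if change == "fix":
        return Version(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown change type: {change!r}")
```

The function is only as good as its input: if a "fix" breaks a consumer that relied on the bug, the number still says it was safe.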

~~~
arohner
Yes.

Additionally, this system dies horribly once you introduce branching.

Versioning is another problem I want to introduce to the Test Case Wiki. The
test case wiki is an idea I've had for a public wiki for hairy problems with
numerous, poor implementations. E.g. a page on implementing a library for
handling times and dates. "Have you thought of the following corner cases?".
Software Versioning belongs there too.

~~~
mojombo
Can you elaborate on the kind of branching you're doing, and what your current
favorite solution for that situation is?

~~~
arohner
At the day job, we commonly see situations where the customer has a bug they
need fixed, and they're not willing to wait for the next scheduled release. We
give them a one-off release containing only that fix. What do you label it? If
you label it 1.0.1, then the next "official" release will be 1.0.2. Worse,
what is the label for another release off the branch, after your trunk release
of 1.0.2? I find this unsatisfactory because the two releases are not
necessarily related to each other. AFAICT, all version systems that involve
only numbers and dots will either fail to handle branches, or become
horrifically complex.

I've seen products where releases were branches off branches. This happened
because the customer is extremely risk averse and we had new code in trunk
that they didn't want to test; they wanted known-good code plus a subset of
trunk. Management went along with it because the customer is several orders of
magnitude larger. We strayed from trunk for so long that trunk got dropped,
and one of the branches was designated the new trunk.

My general goals for a version system are:

    
    
        1) provide a unique version for every build
        2) Give an indication of where this code came from.
    

I currently advocate a Git-like solution. I don't promise that this is 100%
free of corner cases, but I believe it's better than everything else I've
seen. Give every build a SHA1 / GUID. If you're running Git, this is just the
SHA1 of the tree, when you build. That is the build's "official version". If
you want to see where the build came from, git can draw you a pretty picture.
When doing the build, allow the user to specify a human-friendly label, as a
string. This can be anything, from "v1.0.1" to "v1.0.0 + one-off fix for
MegaCorp" or "Bob's developer build for testing foo". In the product, display
both versions.
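
A rough sketch of that two-part version, in Python. Everything here is an assumption for illustration: `BuildVersion`, `hash_tree`, and `make_build` are made-up names, and `hash_tree` is only a stand-in for the tree hash Git already computes (it is not Git's object format).

```python
import hashlib
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class BuildVersion:
    build_id: str  # content hash: the build's "official version"
    label: str     # free-form, human-friendly string

    def display(self) -> str:
        """String a product might show, with both versions."""
        return f"{self.label} ({self.build_id[:12]})"


def hash_tree(files: Mapping[str, bytes]) -> str:
    """Stand-in for a Git tree hash: SHA-1 over sorted paths and contents.

    With Git you would simply reuse the SHA-1 Git reports for the tree.
    """
    digest = hashlib.sha1()
    for path in sorted(files):
        digest.update(path.encode("utf-8") + b"\0")
        digest.update(len(files[path]).to_bytes(8, "big"))
        digest.update(files[path])
    return digest.hexdigest()


def make_build(files: Mapping[str, bytes], label: str) -> BuildVersion:
    return BuildVersion(build_id=hash_tree(files), label=label)


trunk = make_build({"main.c": b"v1"}, "v1.0.1")
one_off = make_build({"main.c": b"v1 + fix"}, "v1.0.0 + one-off fix for MegaCorp")
```

The label can collide or be wrong without harm, because identity comes from the hash: two builds of the same source get the same `build_id` whatever they are called.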

~~~
mojombo
Interesting. Can you explain why you're doing one-off fixes instead of
creating a general release with the fix that everyone can use? This seems like
a bit of a complicated edge case that most people don't need a solution for.
Semantic Versioning is very simple and has no intention of solving every
possible versioning problem.

~~~
arohner
I never thought it (one-off fixes and branching) was a good idea. It happened
because one customer constitutes a large percentage of our revenue, and the
business guys made the decision with little care about the effects it would
have on the software.

Yes, this is all a complicated edge case, but in certain contexts (corporate
software, where you aren't calling the shots, or, you _are_ calling the shots
but aren't willing to tell a customer constituting 40% of your revenue to fuck
off), you need a better solution. Additionally, at the start of the project
you don't always know whether you'll need to support branching. Often, once
you figure out you need branching it's too late to fix. Fixing requires
changing the build process, testing time, educating QA and Support about the
new process, etc.

I'd rather have a well-understood, general purpose solution ahead of time.
IMO, "git versioning" accomplishes this, and it's just as simple.

------
kscaldef
Hardly a new concept, and one that many companies and projects have used for
ages. I'm not criticizing this policy at all, but the impression of taking
credit rubs me the wrong way. Yes, there's a couple sentences in the middle:
"This is not a new or revolutionary idea. In fact, you probably do something
close to this already." But, actually, I and many others have done _exactly_
this, not just close to it, for a long time.

~~~
mojombo
> Hardly a new concept, and one that many companies and projects have used for
> ages.

Certainly. And I don't claim to take credit for any of these concepts (and
explicitly say so). The whole point here is to give this idea a name and clear
spec so that I can tell people my software uses Semantic Versioning instead of
writing it all out every time. It's useful to me and my coworkers, so I
decided to share it in case others find it useful as well.

~~~
gleb
This is called Unix shared library version numbering convention.

* <http://apr.apache.org/versioning.html>
* [http://www.freebsd.org/doc/en/books/developers-handbook/poli...](http://www.freebsd.org/doc/en/books/developers-handbook/policies-shlib.html)
* [http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries...](http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html)
* <http://home.bolink.org/ebooks/unix3/mac/ch05_04.htm>

If you can make it popular in the Ruby world you can call it whatever you please
:-) You should be able to find 10 (and probably 20) year old rants about how
technology X does not do version numbering right, and should use Unix shared
library standard.

------
peterwwillis
I think this is a solution without a problem.

Version numbers are effectively meaningless to most package managers. Sure,
they parse the numbers and can do basic comparison about whether something is
less than, equal to or greater than a given version. But all a package manager
cares about (afaik) is its requirements and conflicts. If you have a way to
ensure its requirements exist, and it does not conflict with anything already
installed, everything is fine. (Oh, and that it provides what something else
requires, which is somewhat the same as a requirement but in a different
context)
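
To make the requires / conflicts / provides idea concrete, here is a minimal sketch in Python. It is not how any real package manager is implemented; `Package` and `install_problems` are hypothetical names, and version constraints are left out on purpose.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Package:
    name: str
    requires: FrozenSet[str] = frozenset()
    conflicts: FrozenSet[str] = frozenset()
    provides: FrozenSet[str] = frozenset()


def install_problems(candidate: Package, installed: Iterable[Package]) -> List[str]:
    """Return reasons the candidate cannot be installed (empty list means OK)."""
    installed = list(installed)

    # Everything already on the system, by package name or provided capability.
    available = set()
    for pkg in installed:
        available.add(pkg.name)
        available.update(pkg.provides)

    problems = []
    for need in sorted(candidate.requires - available):
        problems.append(f"missing requirement: {need}")

    offered = {candidate.name} | candidate.provides
    for pkg in installed:
        if candidate.conflicts & ({pkg.name} | pkg.provides):
            problems.append(f"{candidate.name} conflicts with {pkg.name}")
        if pkg.conflicts & offered:
            problems.append(f"{pkg.name} conflicts with {candidate.name}")
    return problems
```

Nothing in the check looks inside a version string; the numbers only matter once a requirement is written as a comparison against them.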

To solve this you simply build your packages so they will never conflict, both
logically and physically on disk. This way you can have any version of any
software installed all at the same time. The software is compiled to link
against the location to the software it depends on. Symlinks create the basic
structure of what the "default" paths should be for a given application, and
an alternate path can always be specified manually to call a different
version.
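
A small sketch of that layout, assuming a made-up directory scheme (`<root>/pkgs/<name>/<version>/` for each install and `<root>/bin/<name>` as the "default" symlink). The function names are hypothetical and the "tool" is just a placeholder shell script.

```python
import os
import tempfile


def install(root: str, name: str, version: str) -> str:
    """Create a private prefix such as <root>/pkgs/<name>/<version>/bin."""
    prefix = os.path.join(root, "pkgs", name, version)
    os.makedirs(os.path.join(prefix, "bin"), exist_ok=True)
    tool = os.path.join(prefix, "bin", name)
    with open(tool, "w") as handle:
        handle.write(f"#!/bin/sh\necho {name} {version}\n")
    os.chmod(tool, 0o755)
    return prefix


def set_default(root: str, name: str, version: str) -> str:
    """Point <root>/bin/<name> at one installed version, replacing any old link."""
    bindir = os.path.join(root, "bin")
    os.makedirs(bindir, exist_ok=True)
    link = os.path.join(bindir, name)
    target = os.path.join(root, "pkgs", name, version, "bin", name)
    temp_link = link + ".new"
    if os.path.lexists(temp_link):
        os.remove(temp_link)
    os.symlink(target, temp_link)
    os.replace(temp_link, link)  # rename over the old link in one step
    return link


# Both versions stay on disk; only the default link changes.
root = tempfile.mkdtemp()
install(root, "foo", "1.0.0")
install(root, "foo", "2.0.0")
default = set_default(root, "foo", "2.0.0")
```

Calling the full path under `pkgs/foo/1.0.0/bin/` still runs the old version, which is the "alternate path specified manually" case.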

~~~
mojombo
It solves a significant problem that I've encountered several times in real
world packaging setups. The problem it solves is outlined on the linked
website.

Semantic Versioning is also about more than just dependencies. It's about
transparently and accurately communicating the impact that a new version will
have on your existing code. As a user of a large number of libraries, a more
rigorous approach to versioning would make my life immensely simpler.

------
christofd
Interesting idea. I could imagine whole environments of development stacks
being defined: e.g. a Rails 2.3.5 production stack, or a Django stack, or a
Clojure webdev/ data mining stack, and all according to exact version numbers.

I could see that for Big Ticket projects such as Rails/ Clojure/ Django etc.
people would maintain an up-to-date 'semver' database.

The only thing is that the name 'semantic' is not really a winner (it sets
people on edge). Maybe something else will survive in the end, unless people
are comfortable enough with the pragmatic value of the idea and will look past
the name.

------
joeyh
Why is there an arbitrary requirement in the middle of it about the format of
tags used in a version control system?

Symbol versioning is a better approach, on systems that support it.

------
thwarted
I think a lot of the issues of "dependency hell" come from the non-obvious way
to specify a specific version of a shared library to link against (in fact, I
can't even find out how to do this in the documentation for gcc or ld now, but
I know I've done it, and that it's possible).

Take libncurses for example; I'm using it because there was a time when
multiple versions were installed on many Linux distros. Both ncurses4 and
ncurses5 could be available, and in fact this used to be the case with a lot
of distros: ncurses4 was often installed for binary compatibility with stuff
that was previously compiled.

So you end up with the following files on your machine:

    
    
       libncurses.so.5.7
       libncurses.so.5 (symlink to .5.7, maintained by ldconfig)
       libncurses.so (symlink to .5.7, maintained by ldconfig)
    

Ideally, you'd be able to install ncurses4 on the same machine, and you'd end
up with these files also:

    
    
       libncurses.so.4.x
       libncurses.so.4 (symlink to .4.x, maintained by ldconfig)
    

The .so entry is a link, maintained by ldconfig, to the latest version. There
is no conflict because shared linkage is against a specific A.B.C version,
resolved at runtime (see below).
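
The link layout described above can be modelled in a few lines of Python. This only reproduces the file lists from this comment; it is not ldconfig's real algorithm, and `plan_symlinks` is a made-up name.

```python
import re
from typing import Dict, List, Tuple

# Real files only: a base name ending in .so plus at least two version parts.
_REAL_FILE = re.compile(r"^(lib[\w+-]+?\.so)\.(\d+(?:\.\d+)+)$")


def version_key(text: str) -> Tuple[int, ...]:
    return tuple(int(part) for part in text.split("."))


def plan_symlinks(real_files: List[str]) -> Dict[str, str]:
    """Map each symlink name to the real file it should point at."""
    best_per_major = {}
    best_overall = {}
    for filename in real_files:
        match = _REAL_FILE.match(filename)
        if not match:
            continue
        base, version = match.group(1), version_key(match.group(2))
        major_link = f"{base}.{version[0]}"
        # libfoo.so.N points at the newest file within major version N.
        if major_link not in best_per_major or version > best_per_major[major_link][0]:
            best_per_major[major_link] = (version, filename)
        # libfoo.so points at the newest file overall.
        if base not in best_overall or version > best_overall[base][0]:
            best_overall[base] = (version, filename)
    links = {link: target for link, (_, target) in best_per_major.items()}
    links.update({link: target for link, (_, target) in best_overall.items()})
    return links
```

Given `libncurses.so.5.7` and `libncurses.so.4.2`, the plan contains a `.so.5` link, a `.so.4` link, and a bare `.so` link to the 5.7 file, so the two majors never fight over a name.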

So if you compile/link with -lncurses, you get the latest one (which is a
reasonable default), as that's what the .so points to. But if you have
ncurses4 installed, and you know you want to link against that API, you need
to link against the specific file (with a full path and version number in it),
rather than use ld.so.cache and gain the advantages of runtime library
location resolution.

So the work around for the inability to specify an exact version number to
link against without having to give the full path to the library is to move
some of the version number to before the .so and make it part of the library
name. So now you link with -llibraryX-2.4, which looks for
liblibraryX-2.4.so. No one calls this, or thinks of it as, libraryX-2.4; they
think of it as libraryX. Since there is no standard way to name things or
indicate version numbers with this workaround, you sometimes end up with these
different formats:

    
    
       libXXX.so.1.2.3
       libXXX-1.so.2.3
       libXXX-1.2.so.3
       libXXX-1.2.so.1.2.3
       libXXX-1.2.3.so
    

Each library maintainer makes up their own version number format. These could
all be the exact same library version.
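
To show how little the filename alone tells you, here is a small Python sketch that splits each of the formats above into a name, a version folded into the name, and a version after `.so`. The regular expression and the `LibFile` / `parse_lib_filename` names are assumptions made for this example, not any tool's real parsing rules.

```python
import re
from typing import NamedTuple, Optional

_PATTERN = re.compile(
    r"^lib(?P<name>.+?)"                  # library name, as short as possible
    r"(?:-(?P<in_name>\d+(?:\.\d+)*))?"   # optional version folded into the name
    r"\.so"
    r"(?:\.(?P<suffix>\d+(?:\.\d+)*))?$"  # optional version after .so
)


class LibFile(NamedTuple):
    name: str
    in_name: Optional[str]  # version part before .so, if any
    suffix: Optional[str]   # version part after .so, if any


def parse_lib_filename(filename: str) -> Optional[LibFile]:
    """Split a shared-library filename; return None if it does not match."""
    match = _PATTERN.match(filename)
    if match is None:
        return None
    return LibFile(match["name"], match["in_name"], match["suffix"])


for example in (
    "libXXX.so.1.2.3",
    "libXXX-1.so.2.3",
    "libXXX-1.2.so.3",
    "libXXX-1.2.so.1.2.3",
    "libXXX-1.2.3.so",
):
    print(example, parse_lib_filename(example))
```

All five parse to the same `name`, but how to combine `in_name` and `suffix` into one version is a per-library convention that the parser cannot know.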

A quick demo to show that when using -llibrary (and -shared), the full path
isn't stored in the binary, just the filename and version number, which is
resolved at runtime by the dynamic linker in ld.so to an actual full path
using the cache created by ldconfig (hardly conclusive, but you get the gist
-- could also use binutils stuff here to see this):

    
    
       $ strings /etc/ld.so.cache | grep libmagic
       libmagic.so.1
       /usr/lib64/libmagic.so.1
       $ ldd `which rpm` | grep magic
            libmagic.so.1 => /usr/lib64/libmagic.so.1 (0x00007f9dfed95000)
       $ strings `which rpm` | grep libmagic
       libmagic.so.1
    

In this case, we see that ld.so.cache maintains a library name to filename
mapping, rpm mentions just the library name (with version number), and ldd
resolves that to a full path, and that there is no mention of the full path in
the rpm binary itself.

I'd really like to see a -llibrary=version option that helps with this,
allowing a specific A, A.B, or A.B.C (or whatever string is after .so in the
filename)... maybe I was thinking of some linker other than gnu ld that works
like this.
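
A sketch of what such a lookup might do, in Python. To be clear, the `-llibrary=version` option is a wish, not an existing GNU ld feature, and `resolve` is a hypothetical helper: it takes a library name, a version prefix (A, A.B, or A.B.C), and a list of installed filenames, and picks the newest file whose version after `.so` starts with that prefix.

```python
from typing import List, Optional, Tuple


def _version_after_so(filename: str, library: str) -> Optional[Tuple[int, ...]]:
    """Return the numeric version after 'lib<library>.so.', or None."""
    prefix = f"lib{library}.so."
    if not filename.startswith(prefix):
        return None
    parts = filename[len(prefix):].split(".")
    if not all(part.isdigit() for part in parts):
        return None
    return tuple(int(part) for part in parts)


def resolve(library: str, wanted: str, installed: List[str]) -> Optional[str]:
    """Pick the newest installed file whose version starts with 'wanted'."""
    wanted_parts = tuple(int(part) for part in wanted.split("."))
    best = None
    for filename in installed:
        version = _version_after_so(filename, library)
        if version is None or version[:len(wanted_parts)] != wanted_parts:
            continue
        if best is None or version > best[0]:
            best = (version, filename)
    return best[1] if best else None


installed = [
    "libncurses.so",
    "libncurses.so.4.0.1",
    "libncurses.so.5.1.1",
    "libncurses.so.5.7",
]
```

With that list, asking for `4` selects the 4.0.1 file and asking for `5.1` selects the 5.1.1 file, with no full path given by the caller.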

~~~
peterwwillis
If you know what version of a library you want to build against and you know
the path to it you can link against it. The compilation and linking is no
longer automatic because you have to specify your dependent library
specifically, but you can put it anywhere you want and it won't conflict with
a different version of the same name.

One way to simplify this is to customize your library builds and associate a
unique pkgconfig .pc file with them. When you build your application,
reference the .pc file for the library version you want. If the application
does not support pkgconfig you could write a wrapper that parses it and
provides the build/link options you desire. If you're building packages you're
doing enough work by hand that this should not be unnecessarily complex.
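
Roughly what I mean, as a Python sketch. `render_pc` writes a minimal .pc file for a library installed under its own prefix, and `link_flags` is a toy stand-in for the wrapper (it only expands `${var}` in the `Libs:` line; the real `pkg-config --libs` does much more). The function names and the `/opt/ncurses-4.2` prefix are made up for the example.

```python
import re


def render_pc(name: str, version: str, prefix: str) -> str:
    """Return the text of a minimal pkg-config file for one install prefix."""
    lines = [
        f"prefix={prefix}",
        "libdir=${prefix}/lib",
        "includedir=${prefix}/include",
        "",
        f"Name: {name}",
        f"Description: {name} {version} installed under its own prefix",
        f"Version: {version}",
        f"Libs: -L${{libdir}} -l{name}",
        "Cflags: -I${includedir}",
    ]
    return "\n".join(lines) + "\n"


def link_flags(pc_text: str) -> str:
    """Toy wrapper: read variables and expand them in the Libs line."""
    variables = {}
    libs = ""
    for line in pc_text.splitlines():
        if line.startswith("Libs:"):
            libs = line[len("Libs:"):].strip()
        elif "=" in line and ":" not in line.split("=", 1)[0]:
            key, value = line.split("=", 1)
            variables[key.strip()] = value.strip()

    def expand(text: str) -> str:
        return re.sub(r"\$\{(\w+)\}", lambda m: variables.get(m.group(1), ""), text)

    for _ in range(10):  # bounded, in case of circular definitions
        expanded = expand(libs)
        if expanded == libs:
            break
        libs = expanded
    return libs


pc_text = render_pc("ncurses", "4.2", "/opt/ncurses-4.2")
```

Saved as, say, `ncurses4.pc` in a directory on `PKG_CONFIG_PATH`, the build would ask for `ncurses4` by name and get flags pointing at that prefix only.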

~~~
thwarted
_If you know what version of a library you want to build against and you know
the path to it you can link against it. The compilation and linking is no
longer automatic because you have to specify your dependent library
specifically, but you can put it anywhere you want and it won't conflict with
a different version of the same name._

But that's my point: we should be able to get automatic linking. Different
versions of the same library already don't conflict because the file names are
different:

    
    
       libncurses.so.4.0.1
       libncurses.so.5.1.1
    

_When you build your application, reference the .pc file for the library
version you want. If the application does not support pkgconfig you could
write a wrapper that parses it and provides the build/link options you
desire._

True, I see what you're saying. It always seemed to me that custom pkg-config
files for this case, which generate full paths to the libraries instead of
-llibrary options, are a band-aid for not having automatic linking against a
specific version.

~~~
peterwwillis
i too hope one day all this will be automatic (though we still have to specify
a version, and keep up to date on what version supports what, so it's hardly
automatic for the dev or packager). the real fix to me is to embed all
relevant information in the binaries and let the dynamic linker figure out
which one should be used at run time. compile time would merely embed what
version was used to build the app. maybe this is already possible? i'm
curious...
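
a rough sketch of the idea in Python, purely hypothetical: the binary carries the version it was built against, and a resolver picks from what is installed at run time. the policy here (same major version, not older than the build version, newest wins) is an assumption for illustration, and `pick_at_runtime` is a made-up name, not something any dynamic linker offers.

```python
from typing import Iterable, Optional, Tuple

Version = Tuple[int, int, int]


def parse(text: str) -> Version:
    """Parse a plain 'A.B.C' version string."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def pick_at_runtime(built_against: str, installed: Iterable[str]) -> Optional[str]:
    """Choose the newest installed version assumed compatible with the build.

    Assumed policy: same major version, and not older than the version
    embedded in the binary at compile time.
    """
    wanted = parse(built_against)
    candidates = [
        version
        for version in installed
        if parse(version)[0] == wanted[0] and parse(version) >= wanted
    ]
    return max(candidates, key=parse, default=None)
```

the whole thing rests on the library's version claims being honest, since the resolver never looks at the code itself.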

~~~
thwarted
While that would be cool, it doesn't seem very pragmatic. I think I'd rather
have control over exactly which version is chosen (where I can use
implementation details in making a decision), rather than have it decide
automatically based on claims expressed in the library metadata.

 _though we still have to specify a version, and keep up to date on what
version supports what, so it's hardly automatic for the dev or packager_

I think the common case is for _new_ development, you'd most likely develop
against the most recent release version, but you don't need to keep up to date
on which version supports what because you can continue to use the old
versions with long-since compiled binaries as long as they can be installed
alongside the latest version. I think people and distros, in general, are too
quick to remove older versions from being installed (or even available), which
increases the on-going maintenance requirements of still popular binaries.

On the other hand, during transition periods, distros have been pretty good
about this, like the libc5 vs libc6 transition, and by providing compat
packages. But this has mainly been an issue with closed source abandonware
(Skype for a long time was still using OSS and needed an ancient version of
some audio libs).

