

Babel – Nimrod's package manager - dom96
http://picheta.me/articles/2014/06/babel--nimrods-package-manager.html

======
tikhonj
This post got me thinking: why must every language reinvent the package
manager, poorly? It seems profoundly wasteful. How much of the logic and
infrastructure involved is actually specific to the language being used? I
could be missing something, but it seems very little.

It would be really cool if languages just started using a common package
manager instead. It could also be more fully-featured, able to do things like
track dependencies that aren't written in the language like C libraries. We
would also avoid having a ton of custom file formats and dependency solvers.

From what I've seen, Nix[1] is the perfect candidate for this. It's self-
contained, can install packages locally and does a good job of simultaneously
managing multiple versions of the same package: all very useful for
development. It's also portable, working on Linux, OS X, FreeBSD and even
Windows with Cygwin.

In a sense, I suppose this would be like reinventing the system package
manager. But I don't think that's a problem. Unfortunately, most Linux package
managers seem configured in a way that is not convenient for development (i.e.
global packages only). It's also a bit of a pain for a language community to
maintain a package for everyone's favorite package manager; if they just
settled on one (like Nix), they could have a single Nix repository similar to
how they have a custom package list now.

[1]: https://nixos.org/nix/

~~~
eudox
We have this discussion literally every time. Different languages have
different conventions and build systems, and packaging a library for every
operating system package manager is just not feasible.

A single package, for a single package manager that runs everywhere and is
tightly integrated with the language, is the only reasonable solution.

~~~
tikhonj
My point is _not_ to use the system package manager, but to replace the
various language-specific ones with, say, Nix. It should be flexible enough to
work with the different languages' build systems and conventions, while
reusing all the existing systems for managing dependencies, conflicts and so
on. Very little of all that logic is language specific, and it's traditionally
done poorly by language-specific package managers.

Basically, I do agree on using a single package manager. I just don't think
that the need to be integrated with the language is severe enough to warrant
writing a new one for each language! It's much less effort for everyone
involved to set integration up for an existing package manager instead.

In other words: replace Pip/Cabal/Babel/whatever with Nix, maybe with some
plugins for the needs of your specific language. You would even still ship it
with your language (optionally), so the only change would be maintaining a
plugin, if necessary, instead of a whole manager.

~~~
munificent
> to replace the various language-specific ones with, say, Nix.

So now, to use packages for programming language X, I also have to install
Nix, be running a Unix-based OS (sorry Windows folks!), and have whatever
other dependencies and toolchain Nix requires installed.

This is the core problem. A programming language _needs_ a package manager
today. Code reuse is too important for a language to succeed without a nice
user experience to get and share code and deal with transitive dependencies.

That puts a package manager on the critical path for a language. Given that,
it's important that the package manager itself have minimal dependencies. It
would suck, for example, if a package manager for Python was written in Ruby
and required Ruby to be installed!

The only thing most languages' package managers are willing to depend on is
the language itself and its core libraries. That means for every programming
language, you get a new package manager.

It's not ideal, but it seems to be the best solution we've found given those
constraints.

~~~
sparkie
You're focusing too much on "THE Nix", and not the idea behind it. Nix is an
idea which departs from traditional package management by concluding that
ranged dependency versions actually create more problems than they solve, so
we should just do away with them, give every package an immutable identity,
and specify the identities as dependencies - thereby making reproduction
significantly more reliable and free of unwanted side effects like environment
variables or user-built packages (outside of the distro's own).
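
As a rough illustration of what "immutable identity" could mean in practice, here is a minimal sketch. It is not how Nix actually computes its store paths; the function name `package_identity`, the choice of SHA-256, and the sample package contents are all assumptions made for the example. The only point is that the identity covers the package's own source plus the identities of everything it depends on.

```python
import hashlib
from typing import Sequence


def package_identity(name: str, source: bytes, dep_ids: Sequence[str]) -> str:
    """Derive an identity from a package's name, source and dependency identities.

    Hypothetical scheme for illustration only.
    """
    h = hashlib.sha256()
    h.update(name.encode("utf-8"))
    h.update(b"\0")
    h.update(hashlib.sha256(source).digest())
    # Sort so the identity does not depend on the order dependencies are listed.
    for dep in sorted(dep_ids):
        h.update(b"\0")
        h.update(dep.encode("utf-8"))
    return h.hexdigest()


# Two releases of A with different sources get different identities.
a_first = package_identity("A", b"source of A, first release", [])
a_second = package_identity("A", b"source of A, with breaking changes", [])

# B built against each of them is, in turn, a different package.
b_on_first = package_identity("B", b"source of B", [a_first])
b_on_second = package_identity("B", b"source of B", [a_second])

print(a_first != a_second, b_on_first != b_on_second)
```

Because a change anywhere in the dependency chain changes the identity of everything above it, "B built against the first A" can never be silently swapped for "B built against the second A".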

Nothing in this idea says you need to run a Unix based OS; only the two
current implementations of the idea (Nix and Guix) require one. The
concept can be applied to a language specific package manager (LSPM) too, and
IMO would be much better than the way many of the existing LSPMs behave -
which is to duplicate dependencies into project directories - almost
completely eliminating the reason to have shared object dependencies over
static linked dependencies in the first place - because you need to manually
update the dependencies of every project when a dependency gets a bug-fix.

The real problem with LSPMs is that they're functionally incomplete PMs - they
require you to (manually) use a system wide package manager to install the
native dependencies which you inevitably need, because it's inefficient to
rewrite everything in your language rather than provide bindings to a native
library. The LSPMs basically spit out their dummy when you try to install
something and a native dependency is not met - often giving a cryptic error
message which doesn't explain the cause of a problem, or how to resolve it.

I've suggested several times what needs to be done to resolve this, and it
doesn't involve throwing away the LSPMs, but augmenting them. The solution is
to provide a system wide daemon which facilitates native installs required
by the language, whereby the LSPM can query and request the installation of
packages over a well defined protocol, and the daemon forwards those requests
to a system-level PM. This could also improve the state of GUIs for package
management - allowing a unified interface to installing them rather than a
dozen ad-hoc CLI interfaces.

Of course, I don't have all of the details on how this would be implemented -
for starters, it would add a new dependency to every package manager using it
(the IPC mechanism and protocol used), although almost any language can use a
socket based IPC. Each LSPM would also need to introduce some new notation for
"external" packages, which it doesn't manage internally, but instead sends off
a query to the daemon, waits for a response which will either fail (the native
package is not installed or won't be installed), which will also improve the
state of error messages, or it may succeed, and the LSPM can continue building
the rest of the project.

If such a system is going to be built, the Nix model of immutable packages seems
really the only sane approach to the problem. The identity-based approach to
dependencies means that when building a package, you're always going to be
building against dependencies the developers intended - because they're the
same ones he/she tested against. Each time a developer builds successfully
against a different dependency version he should add its identity to the
package definition, confirming that it has been tested. If you decide you want
to build against a different version, you can simply adjust the identities,
but you _should not_ expect it to work without some manual fixing - because
people tend to break things. If you happen to have luck, and your modified
dependency works - you should publish the updated package definition so other
people can learn from your testing, and blame you if you published a broken
package (i.e., failure to build implies a _bug_, as it should).

~~~
munificent
> Nix is an idea which departs from traditional package management by
> concluding that ranged dependency versions actually create more problems
> than they solve

That may be the right choice for the kinds of packages that Nix targets but
not for other languages' semantics. For example, I wrote the package manager
for Dart, and it intentionally does use versions, because forked
shared dependencies don't play nice with the language's semantics.

> The solution is to provide a system wide daemon which facilitates native
> installs required by the language, whereby the LSPM can query and
> request the installation of packages over a well defined protocol, and the
> daemon forwards those requests to a system-level PM.

That's a large ball of complexity and dependencies. It isn't clear that that's
actually a more effective solution than just rolling a package manager for
each language.

> The identity-based approach to dependencies means that when building a
> package, you're always going to be building against dependencies the
> developers intended - because they're the same ones he/she tested against.

That doesn't play very well with shared dependencies in many cases.

Ultimately, every package manager has a bunch of policies embedded in it.
Things that it just decides are The Way Stuff Should Work. Those policies are
_heavily_ influenced by how the programming language is used.

Trying to find the One True Package Semantics seems like a lost cause to me.

~~~
sparkie
To me, the motivation for specifying exact dependencies is as much a social
concern as a technical one - it puts responsibility, or blame, where it
belongs. If we take an example scenario for "bad packaging", something like:

* developer X releases package A, version 1.0

* developer Y releases package B, version 1.0, depending on package A > 1.0

* developer X releases package A version 1.1 with breaking changes

* user Z with package A 1.1 installed attempts to build package B and fails

Who is to blame in this case? At first thought you'd blame developer X for
releasing breaking changes without bumping up the major version, however,
there was nothing in his "contract" that says he cannot do this - moreover,
even if one specified up front that they will "not release breaking changes
without bumping the major version", how can you even enforce that at the
software level?

The way we actually model the "blame" in the industry is to blame Z - the
idiot user who can't build software, who failed to follow incorrect
documentation and feels like an idiot because he doesn't know what was wrong -
then he wastes significant time searching for solutions, debugging build
scripts, re-versioning the dependencies by introducing A10 and A11 in place of
A, because he needs both, and his PM can't have two of the same name, and the
names can't contain periods (Arbitrary restriction of some PMs).

The actual blame here lies in Y - he is the one who documented that his
software will work with any version of A greater than 1.0, but I find this
declaration to be idiotic - he cannot possibly uphold this "contract", because
he is not in control of A. The best guarantee he can give is "I've tested this
with 1.0, but it may work with any later version, however, do so at your own
risk."

Ok, so there's not really a "contract" to be held - particularly for public
domain works and "permissive" licensed works where the developer is just
giving stuff away for free, but in the case of copyleft works - such as the
GPL, it specifically requires that you distribute all of the scripts required
to build the software. While perhaps not a violation of the license in the
legal sense - releasing build scripts which simply don't work is a violation
of the spirit of the GPL.

We could argue that "may work with any later version" is precisely what we
mean with "> 1.0", but it confuses me as to why anyone would think releasing
an _untested_ build script would ever work - you would never release software
into production without first testing it. Releasing a package with a
dependency that _doesn't yet exist_ cannot possibly be tested - it's not fit
for production. If a developer has tested a piece of software with all
dependency versions between 1.0 and 2.0, and decides they all work
successfully, then I see this as the only real case where ranged versions are
acceptable - except in this case you have a set of exact dependencies tested
against which could be used instead of the range, so why not specify
dependencies as a set?
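
To make the difference concrete, here is a small sketch of the A/B scenario above. The helper names (`parse`, `satisfies_range`, `satisfies_tested_set`) are invented for this example and do not correspond to any real package manager's API.

```python
def parse(version: str) -> tuple:
    """Turn "1.10" into (1, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies_range(installed: str, lower_exclusive: str) -> bool:
    """The "A > 1.0" style: anything newer than the bound is accepted."""
    return parse(installed) > parse(lower_exclusive)


def satisfies_tested_set(installed: str, tested: frozenset) -> bool:
    """The set style: only versions the developer actually built against."""
    return installed in tested


# Developer Y only ever tested B against A 1.0.
tested_by_y = frozenset({"1.0"})

# User Z has A 1.1, the release with breaking changes.
print(satisfies_range("1.1", "1.0"))             # the range lets it through
print(satisfies_tested_set("1.1", tested_by_y))  # the set flags it as untested
```

With the range, the build is attempted against a version nobody tested and fails somewhere inside the build scripts. With the set, Z is told up front that 1.1 is untested and can decide whether to try it anyway.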

These complaints are not specific to any language semantics - it's just a case
of untested code being unreliable, and tested code, where all the "tests" are
encoded into the definition of the software (thereby removing the "hidden
knowledge" that goes into building it), is close to perfectly reliable - any
failure to reproduce was probably beyond the control of a package publisher
anyway (although when we have ReproducibleBuilds, this should never happen.)

> That's a large ball of complexity and dependencies. It isn't clear that
> that's actually a more effective solution than just rolling a package
> manager for each language.

As I stated, LSPMs are an incomplete solution - they don't provide the
complete "dependency resolution" which we expect from a PM. Take an example:
say I release a Cabal package for Cairo bindings in Haskell, and I run
"cabal install Cairo". What actually happens?

If I'm lucky, a compatible (by magic) version of native cairo is installed,
and it builds fine. If Cairo is not installed, it will fail with some cryptic
message saying ExitFailure 1, something or other about gtkhs-buildtools. If
I'm an expert already I can wave my magic wand and fix the problem, otherwise
I'm googling a solution. Turns out the problem is simple in this case, I don't
have the package "cairo-dev" installed on my OS, and I need to install it
manually through another PM, say, apt.

But then this defeats the purpose - the LSPM, Cabal, has failed at dependency
resolution - it can't resolve Cairo because it's beyond its scope - having
Cairo on the system is the responsibility of the system PM (which is really an
LSPM where the specific language is C, etc).

The point here is to eliminate the need for _manual_ installation of cairo -
by providing the means for Cabal and apt to talk - Cabal can simply ask "do
you have cairo", to which it can reply "yes", or "no, would you like me to
install it", or "no, go away".

This may not be such a concern in Dart, since you don't really have bindings
to native libraries.

I don't see how this is a large ball of complexity, but I think it could
reduce it. As things stand, some LSPMs attempt to somehow resolve native
dependencies through ad-hoc means, by looking for them in specific locations
in the filesystem, or by dumping the native dependencies into the project
folder itself (good luck picking up bug-fixes and making portable projects
this way). Other LSPMs simply shut down and cry when you need a native
dependency, "it's not my problem".

Rather than trying to bridge individual PMs on an ad-hoc basis, it would make
more sense to come up with the right abstraction/verbs for them to talk - and
for each one to have an adapter to a global "package manager manager daemon". This
would be like "QUERY cairo", "INSTALLED", "NOT INSTALLED", etc. (As I said, I
don't have all of the details, as that would require a large social effort.)
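
A minimal in-process sketch of what such an exchange might look like, purely as an illustration. `QUERY`, `INSTALLED` and `NOT INSTALLED` come from the description above; the `INSTALL` verb, the `REFUSED` reply, the class and function names, and the package names `zlib` and `libfoo` are assumptions made up for this example. A real daemon would sit behind a socket and forward to apt, yum, etc. rather than keep sets in memory.

```python
class PackageDaemon:
    """Stand-in for the system-wide daemon (hypothetical)."""

    def __init__(self, installed, installable):
        self.installed = set(installed)
        self.installable = set(installable)

    def handle(self, request: str) -> str:
        verb, _, name = request.strip().partition(" ")
        if verb == "QUERY":
            return "INSTALLED" if name in self.installed else "NOT INSTALLED"
        if verb == "INSTALL":  # hypothetical verb, not from the discussion
            if name in self.installed or name in self.installable:
                self.installed.add(name)
                return "INSTALLED"
            return "REFUSED"  # hypothetical reply: the "no, go away" case
        return "ERROR unknown verb"


def resolve_external(daemon: PackageDaemon, name: str, allow_install: bool) -> bool:
    """What an LSPM might do for a dependency it does not manage itself."""
    if daemon.handle(f"QUERY {name}") == "INSTALLED":
        return True
    if allow_install and daemon.handle(f"INSTALL {name}") == "INSTALLED":
        return True
    # A clear message instead of a cryptic build failure further down the line.
    raise RuntimeError(f"native dependency '{name}' is missing and was not installed")


daemon = PackageDaemon(installed=["zlib"], installable=["cairo"])
print(daemon.handle("QUERY cairo"))
print(resolve_external(daemon, "cairo", allow_install=True))
print(daemon.handle("QUERY cairo"))
```

The LSPM side stays tiny: it only needs to mark a dependency as "external" and know how to speak these few verbs.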

The reason the Nix model would be required for this kind of collaboration of
PMs should be obvious - if I query "cairo", is it going to ask apt to install
cairo, or ask Cabal to install cairo? It can't possibly guess without the user
disambiguating - adding version numbers here can't help either. The only
solution here is to make sure there can be no ambiguity between them -
different names won't work because the different package managers have
distinct repositories managed by separate communities. The solution is an
identity, hash-based, or based on a cryptographic signature - it's required
for any kind of security if we were to have PMs talk to each other - however,
using hash based identities, we can treat the knowledge of the hash as a
capability, and if two different package managers (such as apt and yum)
provided the same package identity, it wouldn't matter which one performed the
installation, because it's the same capability.
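
A toy sketch of the ambiguity, and of how an identity removes it. The repository contents, the `identity` helper, and the idea of hashing a byte string that stands in for the package are all made up for illustration; nothing here reflects how apt, yum or Cabal actually index packages.

```python
import hashlib
from typing import Dict, List


def identity(content: bytes) -> str:
    """Hypothetical hash-based identity of a package's content."""
    return hashlib.sha256(content).hexdigest()


NATIVE_CAIRO = identity(b"cairo C library, some release")
HASKELL_CAIRO = identity(b"Haskell bindings to cairo, some release")

# manager -> {package name -> identity}; contents are invented for the example
REPOS: Dict[str, Dict[str, str]] = {
    "apt": {"cairo": NATIVE_CAIRO},
    "yum": {"cairo": NATIVE_CAIRO},
    "cabal": {"cairo": HASKELL_CAIRO},
}


def candidates_by_name(name: str) -> Dict[str, str]:
    """Everything called `name`, which may be several different packages."""
    return {pm: pkgs[name] for pm, pkgs in REPOS.items() if name in pkgs}


def providers_of(package_id: str) -> List[str]:
    """Managers able to install exactly the package with this identity."""
    return sorted(pm for pm, pkgs in REPOS.items() if package_id in pkgs.values())


# By name, "cairo" refers to two distinct packages across three managers.
print(len(set(candidates_by_name("cairo").values())))

# By identity there is no ambiguity, and either apt or yum may do the install.
print(providers_of(NATIVE_CAIRO))
print(providers_of(HASKELL_CAIRO))
```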

I'm not trying to find the "One True Package Semantics", but rather, create an
environment where each PM can handle its own niche, but bridge the gap that's
missing where they simply stop working - by having some kind of shared
language between them.

~~~
sitkack
This isn't solvable with version numbers. Either use Haskell, or layer a type
system over package interfaces along with the runtime's ability to upgrade and
downgrade dependent packages on a link error.

It would be helpful if every successful package DAG were communicated
somewhere for others to use.

~~~
sparkie
Right. Version numbers can't work, which is why Nix avoids them and organizes
packages by the hash of their full dependencies instead - thereby making every
package unique and immutable - reliable upgrades and downgrades can be
performed because there are no "unfortunate" side effects that can be introduced
- such as the example I gave above where you need to introduce new packages
"A10" and "A11" instead of "A", because you need both, and you already mutated
"A" to mean something other than it originally meant (hence, rollback is a
breaking change with most PMs).

If we're looking for a way to "communicate" working DAGs for other people to
use, I think the best place to look is in existing DVCSs -
submodules/subrepositories basically provide what we need - because every
commit to a repository using submodules refers to exact commits in exact
branches of the submodules. If we managed to use git ubiquitously in such a way
that all of our dependencies were submodules, and in turn, those dependencies
used submodules for their own dependencies, we have the entire DAG that would
otherwise need to be written as Nixpkgs.

This also solves the version explosion issue, where there could exist dozens
or hundreds of derivations of the same packages with slightly different
configurations, bug-fixes or other changes - because a "derivation" just
becomes a specific commit in a specific branch of a repository at a known URL,
and to utilize the different version from a dependent software, you just
update the submodule and point to the desired head.

~~~
sitkack
Yep, we should run/deploy out of the DVCS.

------
tiedemann
I'm just happy that people other than me like hg and Nimrod, which is the sole
reason for this comment.

~~~
dom96
I'm happy you took the time to make this comment :)

------
joeevans
With all the options, it's too bad the authors had to choose the name of a
popular emacs mode.

Just makes it harder to find either.

~~~
MetaCosm
To be fair, "babel" isn't really a unique emacs term. It has meanings
outside of emacs, and a movie... and multiple albums... and a book... and the
open babel chemical project... and the python internationalization library...

Maybe it was confusing already?

~~~
dom96
It really isn't an issue as long as there is only one package manager named
Babel. If you can't find Babel by Googling "babel" then search for "babel
package manager" instead; you will likely be doing that anyway.

