
Tarballs, the ultimate container image format - severus_snape
https://www.gnu.org/software/guix/blog/2018/tarballs-the-ultimate-container-image-format/
======
theamk
I like simple archives, but does it have to be tarballs? For the kind of
applications described in this article, tarballs are pretty bad:

Either you extract it from scratch every time you run an app, paying a large
time penalty...

... or you extract once to a cache, and assume that nothing changes the cache.
This is pretty bad from both an operational and a security perspective:

- backups have to walk through tens of thousands of files, thus becoming much
slower

- a damaged disk or a malicious actor can change one file in the cache,
causing damage that is very hard to detect.

There are plenty of mountable container formats -- ISO, squashfs, even zip
files -- which all provide much faster initial access, and much better
security/reliability guarantees, especially with things like dm-verity.

~~~
2ion
Yes, most tarballs do not support random access (there are some metadata
extensions that allow it). This makes large tarballs annoying to use on
systems with slow disk I/O; even a hard disk may be slow enough to be annoying
to work with. This is by far my biggest gripe with the format. Smaller
tarballs are certainly a very handy format, as long as you stay inside the
Unixy world of computing, and as long as you keep looking out for the various
incompatibilities between the different tar implementations.

~~~
textmode
"... there are some metadata extensions that allow this)."

Where to find these extensions? Are they portable between Linux and BSD?

The 1998 dict project included a utility called "dictzip" for random access to
the contents of _gzip compressed files_.

Dumb question: Is it possible to create a utility or even a hack that performs
"random access" into _tar archives_?

Example use case: the user only wants to untar a small number of selected
files from a large tarball such as a source tree. The user has tried both the
"-T filelist" option and using memory file systems instead of hard disk
drives.
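
It is possible, at least for an uncompressed archive on seekable storage: one
pass over the headers yields a name-to-offset index, after which members can
be read directly. Here is a minimal Python sketch (the archive and member
names are made up); for a .tar.gz you would additionally need dictzip-style
chunked compression so that seeking into the compressed stream stays cheap:

    import tarfile

    def build_index(path):
        # One pass over the headers; tarfile seeks past file data as it goes.
        index = {}
        with tarfile.open(path, "r:") as tf:  # "r:" = uncompressed only
            for member in tf:
                index[member.name] = (member.offset_data, member.size)
        return index

    def read_member(path, index, name):
        # Random access: jump straight to the member's data.
        offset, size = index[name]
        with open(path, "rb") as f:
            f.seek(offset)
            return f.read(size)

    idx = build_index("source-tree.tar")  # hypothetical archive
    data = read_member("source-tree.tar", idx, "project/README")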

~~~
barrkel
A zip file is a concatenation of individually compressed files. A .tar.gz is a
gzip stream of concatenated files. Anything that could do random access into
the contents of a zip file entry could do similar things with a tarball.

~~~
jahewson
Not so simple. A bundle of streams is not the same as a stream of bundles.

~~~
barrkel
With a transparent random access overlay, the difference mostly disappears,
reducing to whether the stream needs to be scanned or whether it's indexed,
which is itself orthogonal (the zip central directory at the end is
technically redundant, since each entry also carries a local header).

~~~
netheril96
So you mean that for each "random access", you actually have to scan the whole
.tar.gz file to find the location? For large tarballs, that will definitely
hinder performance a lot. The difference does not disappear at all.

~~~
barrkel
Apparently you have a comprehension problem.

~~~
netheril96
Then enlighten me.

------
paulfitz
How about sqlar as a container format?
[https://sqlite.org/sqlar.html](https://sqlite.org/sqlar.html) A regular
sqlite database file, with anything you like in it. Mountable as a file system
with sqlarfs. Written by the sqlite guy.

~~~
infogulch
Interesting, I didn't know this existed. Is there a way to layer sqlar like
docker images? (Besides just tarring them up, I guess.)

I wonder if this could be implemented with the WAL/journal system. Make each
layer immutably append to the previous layers, to make restarting at any layer
trivial. I'm not sure there's a way to hook into the journal directly like
that, though.

~~~
zaarn
Should be doable with overlayfs (or similar) or alternatively some extensions
to sqlar.

sqlar is, after all, only a table definition. If you don't need FUSE access,
or are willing to write your own, SQLite3 can go a long way toward providing
arbitrary neat functionality.
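
For reference, the whole sqlar format is a single table (per the sqlar page):
name, mode, mtime, sz, and a data blob that is zlib-compressed unless
compression wouldn't shrink it (signalled by sz == length(data)). A minimal
Python sketch of a random-access read, with hypothetical archive and member
names:

    import sqlite3
    import zlib

    def sqlar_read(archive, name):
        # One indexed B-tree lookup instead of scanning an archive stream.
        con = sqlite3.connect(archive)
        try:
            row = con.execute(
                "SELECT sz, data FROM sqlar WHERE name = ?", (name,)
            ).fetchone()
        finally:
            con.close()
        if row is None:
            raise KeyError(name)
        sz, data = row
        # Stored raw when compression would not have helped.
        return data if sz == len(data) else zlib.decompress(data)

    content = sqlar_read("bundle.sqlar", "app/config.json")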

------
RX14
I really love the work the guix folk are doing. I'd love to run GuixSD on my
laptop if it were easy and supported to run plain upstream Linux instead of
linux-libre. It seems like such a lovely, easy-to-use project from the little
time I've spent playing with it; it's actually a small shame that they're part
of the "unsexy" GNU project and subject to GNU politics.

~~~
jolmg
What GNU politics are you referring to that makes you reconsider using guixsd?

EDIT: Also what's unsexy about GNU? I'm really curious.

~~~
tremon
Its refusal to package firmware binaries, for one, even when that firmware is
required to have a useful machine. I'm looking at AMD specifically here, where
recent graphics cards (including APUs) don't even do text mode without the
firmware.

(edit: I understand the why of it, and even agree in principle, but it still
prevents me from running linux-libre on most of my systems)

~~~
t0nt0n
Next time you build or choose a system, consider one that can run free
software.

I did, and it makes most things quite a bit easier.

Edit: I did so after struggling with hardware requiring nonfree blobs of
different shapes and sizes for a couple of years. Recently I was lucky enough
to get my hands on a system that I can run using linux-libre, and the only
"extra" component I have is a USB wifi card.

~~~
dragontamer
> Next time you build or choose a system, consider one that can run free
> software.

The only workstation that boots with entirely free software is like, the Talos
II PowerPC, with a minimum cost of $5000.

Everyone else requires a binary blob somewhere. Either a UEFI blob, BIOS blob,
some kind of driver somewhere, or whatnot. Raspberry Pi, AMD, Intel,
everybody.

And before the Talos II, I don't think an "Open PC" devoid of proprietary
binary blobs even existed. At least, not one that is reasonably modern (i.e.,
64-bit, decent security, decent support for modern OSes).

~~~
namibj
What about pre-ME thinkpads, after replacing the wifi card with an ath9k one
with open-source firmware? Does the Intel chipset graphics require a blob for
simple framebuffer/text-mode operation? I ask because I can't remember
including any blobs in the libreboot I use there, and IIRC I get output before
the Linux kernel is able to load device firmware.

It is 64-bit, and runs pretty much anything from Windows 10 (from what I can
tell, though I'm not sure, due to CMPXCHG16B) through FreeBSD to Android.
Probably even something like QNX.

Yes, you might not call this reasonably modern, but it ticks all the boxes of
the hard facts you listed as qualifiers for being reasonably modern.

~~~
dragontamer
Unless it is running the "coreboot BIOS" (which very few things are), it has a
binary "non-free" blob booting it up.

~~~
namibj
I don't remember whether the video BIOS was extracted from the old binary or
if it is the open-source replacement, but I'd tend towards the latter as I
don't remember searching for the backup/dump of the original firmware.

And yes, it's running coreboot, and at least CLI/linux-framebuffer Arch Linux
works. I haven't yet gotten around to setting up the rest of the system, but
considering I bought it specifically for high-security operation, as the ME
can be physically removed without losing more than the built-in Ethernet port,
I'm not pressed to do it anytime soon.

Edit: I'm pretty sure I followed [0], which leads me to the new conclusion
that I did use libreboot, a stricter version of coreboot (think coreboot=Arch
Linux, libreboot=GNU Guix), and had to fiddle with the question of whether the
open-source video BIOS would work. This confuses me a little, as I remembered
buying an X61s, not an X60s, but from the fact that it booted after flashing,
I deduce it had to be an X60.

[0]: [https://libreboot.org/docs/hardware/#list-of-supported-thinkpad-x60s](https://libreboot.org/docs/hardware/#list-of-supported-thinkpad-x60s)

------
tannhaeuser
That article made me warm up to guix and its practical side. Are guix app
bundles just bare tar archives with /usr/local prefix semantics or do they
need special metadata files? How are compiled binaries with hardcoded and/or
autoconf'd prefixes handled for relocation (I guess using Linux namespaces
somehow)?

~~~
rekado
In Guix every package ends up in its own directory, which may have references
to other packages in /gnu/store. An application bundle is really just a
package closure, i.e. the directory for the package and all directories it
references, recursively. One way to bundle up things is with `tar` (the
default of `guix pack`), but Guix also supports other bundling targets, such
as Docker. No special metadata files are required.

Relocation currently requires a little C wrapper, which uses Linux namespaces,
as the blog post indicates.

If you want something more advanced, such as a bundle that includes an init
system and services, it's best to use `guix system`, which can build VM
images, among other things.

------
chx
For relocatable ELF binaries, there's also
[https://github.com/intoli/exodus](https://github.com/intoli/exodus)

~~~
foob
The packages that Exodus produces are actually quite similar to those
introduced in this announcement. Both tools generate simple tarballs that can
be extracted anywhere to relocate programs along with their dependencies, and
both tools bootstrap the program execution using small statically compiled
launchers written in C. The blog post contrasts _guix pack_ with Snap,
Flatpak, and Docker, but Exodus would probably make a more apt comparison in
many ways.

------
cyphar
This is remarkably off-beat for the GNU project. Tar files are _far_ from the
ideal tool for container images because they are sequential archives, so
extraction cannot use any parallelism (without adding an index and storing the
archive on a seekable medium; see the rest of this comment). I should really
write a blog post about this.

Another problem is that there is no way to get just the latest entry in a
multi-layered image without scanning every layer sequentially (this can be
made faster with a top-level index, but I don't think anyone has implemented
that yet; I am working on it for umoci, but nobody else will probably use it
even if I implement it). This means you have to extract all of the archives.

Yet another problem is that if you have a layer which includes just a
_metadata_ change (like the mode of a file), you have to include a full copy
of the file in the archive (the same goes for a single-bit change in the file
contents, even if the file is 10GB in size). This balloons the archive size
needlessly, due to a restriction of the tar format (there is no
standards-compliant way of representing a metadata-only entry), and it
amplifies the previous problem I mentioned.

And all of the above ignores the fact that tar archives are not actually
standardised (there are at least 3 "extension" formats: GNU, PAX, and
libarchive), and different implementations produce vastly different archive
outputs and structures (causing problems with making them
content-addressable). To be fair, this is a fairly solved problem at this
point (though sparse archives are sort of unsolved), but it requires storing
the metadata of the archive structure in addition to the archive.

Despite all of this, Docker and OCI (and AppC) all use tar archives, so this
isn't really a revolutionary blog post (it's sort of what everyone does, but
nobody is really happy about it). In the OCI we are working on switching to a
format that solves the above problems by having a history for each file (so
the layering is implemented in the archiving layer rather than on top of it)
and having an index where we store all of the files in the content-addressable
storage layer. I believe we will also implement content-based chunking for
deduplication, to let us handle minor changes in files without blowing up
image sizes. These are things you cannot do in tar archives; they are
fundamental limitations of the format.

I appreciate that tar is a very good tool (and we shouldn't reinvent good
tools), but not wanting to improve the state of the art over literal _tape
archives_ seems a bit too nostalgic to me. Especially when there are _clear_
problems with the current format, with obvious ways of improving them.

~~~
JdeBP
It sounds like you are progressing along the same road that led Rahul Dhesi to
invent the ZOO file format.

~~~
cyphar
As far as I can tell, the only thing ZOO has over tar archives is keeping a
history of each file (using the VMS concept of file versions), meaning that it
probably still has some of the problems I outlined above. While that is
useful, it is still not as good as it could be. Also, you don't really want
file versions with container images; you want conceptual "layers" (which would
be sort of like versioned files, but more like snapshot IDs, or like ZFS's
birth times).

~~~
JdeBP
One needs to give it more than a superficial glance. ZOO was designed to be
randomly accessible, with the directory headers forming a linked list. It
actually _has_ an uncompressed index and _can_ take advantage of seekable
files. It also supports both long and short filenames; CRCs of the metadata
structures (cf. the recent kerfuffle about xz); and an extensible, versioned
header mechanism that not only _could_ be extended but actually _already once
was_ extended, to add long filename support amongst other things.

~~~
cyphar
Is there an actual paper or some high-level summary of the format, not to
mention a modern implementation? The only summary I could find was the one on
Wikipedia. I also found the source code of "unzoo", but it's a bit difficult
to understand the benefits of a file format if I first have to understand its
implementation.

I didn't take a superficial glance out of laziness; it's that I couldn't find
any more information about it. But I think you also missed that I mentioned
that the style of versioning implemented in ZOO (as far as I can tell from the
Wikipedia page) is not the right style for snapshot-like versioning.

------
geofft
I realize the title is just a hook for the (very cool!) work in the article,
but a couple things that tarballs don't/can't specify that Docker containers
can:

- environment variables like locales. If your software expects to run with
English sorting rules and UTF-8 character decoding, it shouldn't run with
ASCII-value sorting and reject input bytes over 127.

- Entrypoints. If your application expects all commands to run within a
wrapper, you can't enforce that from a tarball.

You can make conventions for both of these, like "if /etc/default/locales
exists, parse it for environment variables" and "if /entrypoint is executable,
prepend it to all command lines", but then you have a convention on top of
tarballs. (Which, to be fair, might be easier than OCI, as I have no
particular love for the OCI format, but the problem is harder than just "here
are a bunch of files.")

~~~
catern
It's not necessarily a good thing for the container to be able to specify
locale. Locale should be picked up from the surrounding system; it's just that
unfortunately the surrounding system is usually not configured correctly.

And entrypoints/wrappers are definitely possible from a tarball: just wrap the
executables in bin/, replacing them with shell script (or whatever) wrappers
pointing to the real executables. That's what Nix/Guix do for languages like
Python, which require dependencies to be provided via environment variables
(as they don't have a way to "close over" the locations of their
dependencies).
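
A minimal sketch of that wrapper technique: move the real executable aside and
drop a shell script in its place that sets the environment and exec's it. The
paths and the choice of PYTHONPATH are hypothetical; the wrappers Nix/Guix
actually generate are more elaborate than this:

    import os
    import stat

    WRAPPER = """#!/bin/sh
    export PYTHONPATH="{pythonpath}"
    exec "{real}" "$@"
    """

    def wrap(bin_path, pythonpath):
        # Move the real binary aside and write a wrapper in its place.
        real = os.path.abspath(bin_path) + ".real"
        os.rename(bin_path, real)
        with open(bin_path, "w") as f:
            f.write(WRAPPER.format(pythonpath=pythonpath, real=real))
        mode = os.stat(bin_path).st_mode
        os.chmod(bin_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)

    wrap("bundle/bin/myscript", "/bundle/lib/python2.7/site-packages")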

~~~
oconnore
> Locale should be picked up from the surrounding system; it's just that
> unfortunately the surrounding system is usually not configured correctly.

And around and around we go

------
matthewbauer
Nix has a very similar tool called nix-bundle[1].

[1]: [https://github.com/matthewbauer/nix-bundle](https://github.com/matthewbauer/nix-bundle)

------
nerpderp83
Tarballs don't have a TOC and can't easily index into individual entries.

One _could_ create a utility that makes tarballs with a TOC and the ability to
index into them while still remaining compatible with tar and gzip. Pigz is
one step in that direction.

~~~
tejtm
To list what is in a tarball: `tar -vtf tarball.tar`

To extract a particular entry: `tar -vxf tarball.tar path_in_tarball_to_entry`

edit: good points on it not being efficient for large archives; I was just
demonstrating that it is possible.

~~~
nerpderp83
A tar is a linked list of file paths and contents; it cannot be indexed to
reach a particular file. A compressed tar has to be decompressed first, and
then the chain of links traversed. Accessing a file in a compressed tar is
O(n) in where the file is placed within the compressed stream.

It isn't that it is impossible; it is that it is horribly inefficient.

Zips, on the other hand, unify storage and compression such that one has
random access to a particular file, which is why most modern file formats are
zips with XML or JSON inside.
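
A minimal Python sketch of the difference, using only the standard library
(archive and member names hypothetical): zipfile consults the central
directory and opens one member directly, while the .tar.gz must be
decompressed and scanned from the front:

    import tarfile
    import zipfile

    # Random access: seek to the central directory, then to the member.
    with zipfile.ZipFile("bundle.zip") as zf:
        data = zf.read("docs/manual.xml")

    # Sequential access: gunzip and walk the headers until the name matches.
    with tarfile.open("bundle.tar.gz", "r:gz") as tf:
        data = tf.extractfile("docs/manual.xml").read()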

------
digi_owl
A quick FYI: GoboLinux operates much the same way.

1. Binary packages are simply compressed archives (tarballs) of the relevant
branch in the /Programs tree.

2. Branches do not have to actually live inside the /Programs tree. There are
tools available to move branches in and out of /Programs.

All this because GoboLinux leverages symbolic links as much as possible.

~~~
matthewbauer
GoboLinux sort of does this. The main difference is that GoboLinux uses
“version numbers” while Nix & Guix use hashes. That makes a lot of difference
for more complicated stuff.

~~~
digi_owl
True.

I suspect there are ways to introduce hashes to Gobo, if one were so inclined.
But so far nobody has.

------
kuwze
Does anyone know how this would apply to, for example, sharing a Guile 2.2
application with Debian/Red Hat-based distributions? I want to use Guile 2.2
for development, but I am worried because it was only recently released for
major distros (at least with Ubuntu, I know it was released with 18.04), and
it doesn't seem to support the creation of executables.

~~~
sitkack
See this older discussion on statically linking Guile [0]; you should be able
to bake your source into a C program that statically links Guile 2.2 to create
a self-contained executable. If that is too cumbersome, I would use a
container.

[0] [https://lists.gnu.org/archive/html/bug-guile/2013-03/msg00002.html](https://lists.gnu.org/archive/html/bug-guile/2013-03/msg00002.html)

------
stuaxo
Please, can we move to an archive format that isn't so sprawlingly massive?

~~~
cpburns2009
Are you complaining about the complexity of the file format itself? My
understanding is that it's pretty simple: a linked list of headers, with the
contents of each file following its header. Or are you complaining that it
doesn't do compression itself like ZIPs do?

~~~
GrayShade
One thing I dislike about tarballs is the lack of random access support.

~~~
cpuguy83
You can build random access around tarballs, just need to index the header
data.

Shameless plug, this is what
[https://github.com/cpuguy83/tarfs](https://github.com/cpuguy83/tarfs) does.

Granted, you do have to traverse the entire tarball.

~~~
masklinn
> You can build random access around tarballs, just need to index the header
> data.

> Granted, you do have to traverse the entire tarball.

So you can't randomly access a tarball, you can cache the linear access you've
already done.

~~~
Vendan
You can read all the headers without reading the whole file by just seeking
over the file data....
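
A minimal Python sketch of that, assuming an uncompressed tar on seekable
storage with classic octal size fields (archive name hypothetical): only the
512-byte headers are read, and the file data is skipped with seeks. As the
replies note, this falls apart once the archive is compressed, because the
stream has to be decompressed to reach the next header anyway:

    import os

    def list_entries(path):
        entries = []
        with open(path, "rb") as f:
            while True:
                header = f.read(512)
                if len(header) < 512 or header == b"\0" * 512:
                    break  # end-of-archive marker
                name = header[0:100].rstrip(b"\0").decode()
                size = int(header[124:136].rstrip(b"\0 ") or b"0", 8)
                entries.append((name, size))
                # Skip the data, which is stored in 512-byte blocks.
                f.seek((size + 511) // 512 * 512, os.SEEK_CUR)
        return entries

    print(list_entries("bundle.tar"))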

~~~
sitkack
But you would still have to decompress it. How much support for legacy systems
do we need, versus just making a slightly better version of a jar?

~~~
cpuguy83
Decompression has nothing to do with tar, though. But I agree, it is painful
to deal with tar+gz.

~~~
sitkack
> Decompression has nothing to do with tar

Which is exactly the problem. The same issue occurred with volume and
filesystem management, and the result was ZFS. We need systems that compose
and also elide.

So: something that both archives a set of files and compresses them, without
losing affordances over the layer below it in the process.

------
justinsaccount
Articles like this are pointless. I get that guix and nix are neat, and I
think that every single time something about one of them is posted, but I
don't have the slightest clue how to use either one of them.

Do you want to convince people that something like guix is better than docker?
Then take something that is currently distributed using docker and actually
show how the guix approach is simpler.

i.e. I have a random app I recently worked on where the dockerfile was
something like

    
    
      FROM python:2.7
      WORKDIR /app
      ADD requirements.txt /app
      RUN pip install -r requirements.txt
      ADD . /app
    
      RUN groupadd -r notifier && useradd --no-log-init -r -g notifier notifier
      USER notifier
      EXPOSE 8080/tcp
      CMD ./notify.py
    

How do I actually take a random application like that and build a guix package
of it?

Another project I work on is built on top of zeromq, and it would be great to
use something like guix to define all the libsodium+zeromq+czmq+zyre
dependencies and be able to spit out an 'ultimate container image' of all of
that, but all this post shows me how to do is install an existing guile
package.

~~~
t0nt0n
With Guix you get full introspection of your entire package dependency graph;
you can check and manipulate every aspect, and it is still simple and easy to
work with. With GuixSD you get this same introspection and overview, but of
your entire system. Creating a container, vm, or even a docker image is a
simple '$ guix system <container|vm> config.scm' away. And your config.scm is
as complex as you like it to be.

The simplest way would be to package the app for guix; then you could just run
'$ guix environment <name-of-package>' and you would be dropped into an
environment with all your dependencies and whatever else the application
requires in your path, ready for hacking: get your sources and editor and
start working.

If you need a vm or similar, though, I'd translate your example above into a
system config where:

- packages include python-2.7 and whatever is in requirements.txt (this may
mean you have to package a few things, but again, this is usually super easy)

- users and groups are added to the config, as they always are; no extra step
necessary.

- exposed ports and networking are available as options for the qemu script
guix produces to launch the vm.

- CMD ./notify.py: create a "simple" service that can be autostarted by the
system on boot.

- filesystem access is also handled by arguments to the qemu script.

As always though there are several paths to Rome, and these are just two of
them.

Zeromq and libsodium are already packaged on guix, and czmq and zyre look like
they would be simple to package. Guix is really quite simple to work with,
which I think is the reason so many of the users and devs are running it as
their daily driver, even though it is strictly beta (0.14 is the last release,
I think).

And pointless, come on - what does that even mean? Does it mean you don't
value them? I was quite happy to read about a neat new thing I can use my
favorite tool for.

~~~
justinsaccount
> With Guix you get full introspection of your entire package dependency graph

Yes, I know all that. It's neat. I would like to learn more about it.

> The simplest way would be to package the app for guix

I was asking how to package the app for guix, and your response is that the
simplest way would be to package the app for guix...

> If you need a vm or similar though I'd translate your example above into a
> system config where: - packages include python-2.7 and whatever is in
> requirements.txt (this may mean you have to package a few things, but again
> this is usually super easy) - users and groups are added to the config, as
> they always are, no extra step necessary. - exposing ports and networking is
> available as options for qemu script guix produces to launch the vm. - CMD
> ./notify.py: create a "simple" service that can be autostarted by the system
> on boot. - filesystem access is also handled by arguments to the qemu
> script.

Yes, I'm sure it is super easy. How do I do it?

Do you know how to use the dockerfile I posted above? You run

    
    
      docker build -t myapp .
      docker run myapp
    

that's super easy. 9 lines and 2 commands. You can now add docker expert to
your resume.

> Zeromq and libsodium are already packaged on guix, and czmq and zyre look
> like they would be simple to package,

Well, I was working on a fork of things, so I would have needed to install my
forks.

> guix is really quite simple to work with

I'm sure it is!

> And pointless, come on - what does that even mean? Does it mean you don't
> value them? I was quite happy to read about a neat new thing I can use my
> favorite tool for.

You are correct, I don't really value posts saying how cool and easy something
is and how much better it is than other solutions, when they don't actually
present a complete solution someone can actually use.

I get that it is not other people's job to teach me how to use something like
guix, but do people not understand why things like Docker won?

~~~
t0nt0n
Right, your dockerfile contains a requirements.txt of unknown complexity and
number of packages, and your app has no name and no link to its code.

I'd be happy to provide some examples. Say you want your fork of libsodium:

    
    
      (define-public my-libsodium
        (package
          (inherit libsodium) ; now anything not defined in this package will be inherited from libsodium
          (source (origin (method url-fetch)
                    (uri "url-to-your-sources")
                    (sha256 (base32 "hash"))))
         ; Add whatever other fields your fork needs.
      ))
    

Sure, it's slightly more verbose. That's part of the cost of having something
you can actually rely on, with that degree of hackability.

If you actually want help packaging these things, ask on our mailing list or
IRC; we're happy to help with specifics. But you're basically complaining that
I didn't give you a concrete solution to a problem with several important
missing details. Docker would not be able to instantiate your python project
either if it did not know the contents of your requirements.txt.

The thing is, docker is huge and bloated; is far from secure, and will
probably stay that way for the foreseeable future; has a more or less complete
lack of introspection; and is not strictly reproducible (sure, it gets quite
far along the way, but it really is not).

Guix, on the other hand, is rather lightweight, and you have a fair amount of
control over how lightweight it should be; builds from source, and has a sort
of hot-patching system for security fixes; has introspection; and is quite
close to bit-reproducible.

Sure, docker is _easy_, as long as it works. And I'd argue that because of its
complexity and obscurity it is not practically free software.
~~~
cyphar
Regarding your concerns about Docker, I agree with that (even though I've been
working on Docker and in the wider container community for almost 5 years
now). However, there are plenty of tools that are compatible with Docker but
provide similar benefits.

For instance, (from the openSUSE community which I'm a part of) we have KIWI
that provides builds with full introspection on a package level (similar to
what you're doing with Guix). If you build the image inside OBS (our build
system) then if a dependency of your image is updated then your image will be
rebuilt automatically and published in OBS (where it can be further pushed to
any Docker/OCI registry you like). The packages are signed, and the image is
also "signed" (though it currently signs the image artifact and doesn't use
image signing since that is still not standardised). And most packages in
openSUSE are bitreproducible (we build everything in OBS).

The above is far and away better than the current standard in the "official"
world of Docker, but unfortunately, because OBS has a UI from the early 2000s
(which is when it was written), it doesn't get enough attention outside of the
communities that use it (and enjoy using it a lot). Everyone wants
Dockerfiles, even though they cannot provide these features (and you cannot
get package manifests of your images without running a package manager in the
image, which means you cannot get vulnerability information from the
manifest).

[ Though I'm mostly talking about openSUSE here, I also happen to work for
SUSE on the containers team. ]

~~~
pxc
> However, there are plenty of tools that are compatible with Docker but
> provide similar benefits.

And Guix is one of them, remember? From the article:

> Add -f docker [to your `guix pack` command] and, instead of a tarball, you
> get an image in the Docker format that you can pass to docker load on any
> machine where Docker is installed.

:-)

> The above is far and away better than the current standard in the
> "official" world of Docker, but unfortunately, because OBS has a UI from the
> early 2000s (which is when it was written), it doesn't get enough attention
> outside of the communities that use it (and enjoy using it a lot).

This is so true! I've mostly moved on from traditional, imperative package
managers and associated distros in favor of the functional package management
paradigm exemplified by Guix, but I still recommend openSUSE to my friends who
prefer a more traditional/mainstream distro because of the love I have for the
Open Build Service and Zypper.

The web interface for OBS does feel clunky these days, but it's a wonderful
tool, not just for improving the reliability and quality of software packages
but for distributing them. Zypper is hands-down the most powerful and complete
high-level package management tool I've ever used as part of a binary-based
GNU+Linux distro. I love that openSUSE provides an instance of OBS that anyone
can use for free to build packages for not just openSUSE but a TON of
different distros.

I wish more people would explore, take advantage of, and celebrate OBS just
like I wish they'd do the same with Nix and Guix!

------
AdmiralAsshat
Do tarballs still have that unfixed/unfixable bug where the extracted files
will have the permissions of the person who untarred the file?

~~~
master-litty
That seems sensible to me; what else would you expect?

~~~
AdmiralAsshat
I expect them to preserve the ownership and permissions of the original file
if I tell them to.

~~~
rakoo
How can it have the same owner if it's a different machine and the users
aren't the same?

~~~
spookthesunset
Do tarballs store the user/group names as strings, or do they store the
uid/gid instead?

One of the goofy things about Unix systems is that most tools speak uid/gid,
and woe is you if two machines on the network have “bob” but with different
uids.

Not entirely sure if Windows has the same problem; to be honest, if you use
Active Directory most of that stuff is auto-magic.

My hunch is that going with the ID vs. the “friendly name” has a bunch of
trade-offs, and whichever you pick will come with serious drawbacks.

~~~
justincormack
They can do either - traditional tar formats have the uid as a number, but the
newer pax format has both numeric and named values.

~~~
Hello71
in fact, names are the default: "--numeric-owner" must be passed to use
numeric values.
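
A minimal Python sketch showing that ustar/pax entries carry both: tarfile
exposes the numeric ids and the names on every member (archive name
hypothetical). Which of the two an extracting tar honors is then the
extractor's choice, e.g. via GNU tar's --numeric-owner flag:

    import tarfile

    with tarfile.open("bundle.tar") as tf:
        for member in tf:
            # Both representations live in the header.
            print(member.name, member.uid, member.gid,
                  member.uname, member.gname)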

------
AnIdiotOnTheNet
Reinventing Application Bundles only 30 years after NeXTStep, poorly.

~~~
matthewbauer
Why poorly? I don’t see anything worse about this.

~~~
AnIdiotOnTheNet
Really? It seems like an awful lot of tooling for what is essentially "put
binary and dependencies in folder; move folder around at will" in sane
environments.
~~~
pxc
The tooling already existed because it's part of a stack that goes from build
tool to package manager to operating system configuration manager, with all
kinds of features for developers floating around along the periphery. It
handles all of these things uniformly, reliably, reproducibly, and in a way
that deduplicates shared dependencies.

This article is just showcasing a relatively small bit of tooling on top of
all that, which makes it possible to reuse that work to produce containers out
of the very same stuff, in a whole range of formats.

`guix pack` and `nix-bundle` are illustrations of how a novel solution
(functional package management) to the very problem to which app bundling
constitutes utter capitulation (dependency management) can not only retain the
virtues the app bundle approach throws away in the hopes of making deployment
simple, but even match it in ease of deployment when _none_ of the
infrastructure of the package management system is expected to be present on
the deployment target.

From where I stand, that's damn impressive.

All of this was achieved without the kind of ‘standardization from above’ that
Apple gets to do on its platform. It's true that app bundling could have been
a lot simpler if the Linux community lived in a locked box at the mercy of a
Vampire King bearing the power to upgrade users' kernels in the dead of night
without bothering to ask them, who preempted any diversity or choice in
operating system components with a uniform common runtime, and gleefully
ripped unseemly APIs out from under developers with every OS release. But
instead— thank God!— we have such a wide range of environments under the name
‘Linux’ that I'm ready to agree with you and call it insane. Yet here we see
that hackers made it work anyway, without bossing anyone around or
compromising on the strengths of proper package management. And that's fucking
awesome.

~~~
AnIdiotOnTheNet
Boy, you sure make fragmentation, constant wheel reinventing, and the
necessity of complex tooling to perform simple tasks almost sound like a good
thing. I suppose it must be for the small percentage of people who value those
things over actually being able to do stuff.

Given the near-complete lack of non-oss software support Linux has, it seems
like both developers and users rather prefer uniform common runtimes and a
lack of diversity in their operating system components. It's almost like a
whole lot of things get much easier if there's some kind of standardization.

~~~
pxc
> Boy, you sure make fragmentation, constant wheel reinventing, and the
> necessity of complex tooling to perform simple tasks almost sound like a
> good thing.

Why, thank you!

Redundancy of efforts in F/OSS is of course a bad thing. It's perhaps even
more tragic in free software than in proprietary software, because in free
software, developers have fewer formal barriers to drawing upon the work of
others. But it's something free software projects can't simply disable by
exerting brute control over their users and contributors. The point is that
with tech like this, the hackers behind projects like Guix have triumphed in a
tougher struggle than NeXT or Apple ever picked. And they've built technology
that copes with a wider range of environments, not via ugly hacks on edge
cases, but through a thoughtfully designed build system which renders the
whole dependency tree of every program it builds transparent, reproducible,
and portable. That they had to build a vehicle for such wild and varied
terrain is not what I'm celebrating; the cool thing is that they _did_.

> Given the near-complete lack of non-oss software support Linux has, it seems
> like both developers and users rather prefer uniform common runtimes and a
> lack of diversity in their operating system components.

Alternatively, when you refuse to distribute source code, compatibility for
you involves greater demands on your platform, because you can't leave
downstream distributors to recompile and you refuse to allow your more capable
users to fix your software's incompatibilities. It's almost like a whole lot
of things get easier when you distribute source code with your application.

Regardless, I think there are a lot of factors that together explain the
predominance of free software on free operating systems. Proprietary software
companies aiming to hit as large a market as possible with a single codebase
turning away from perceived fragmentation in the ‘Linux market’ is certainly
one of those many factors.

~~~
AnIdiotOnTheNet
> Alternatively, when you refuse to distribute source code, compatibility for
> you involves greater demands on your platform, because you can't leave
> downstream distributors to recompile and you refuse to allow your more
> capable users to fix your software's incompatibilities.

And yet Windows still manages to run software written for decade-plus-old
versions of it, and users often make compatibility patches for now-unsupported
software, all without the source or recompilation. I think a big misstep by
the OSS community has been its reliance on the crutch of "you have the source,
do it yourself", and that includes making its software even work on a system
in the first place. It leads to thinking like "it's ok if we break backwards
and forwards compatibility, everyone can just recompile!".

