I don't get why we would want to allow packages to run any scripts before/after the installation.
I get why it's necessary at this point, but the true solution should get away without executing any code.
IMHO, a package should deliver a set of files to certain directories. That's it.
It should not overwrite existing files that were installed by other packages. It should not change existing files in any way.
It might advise the system to trigger certain reindexing actions (systemd daemon reload, update man-db, etc.), but doing this should be the duty of the package manager, not the package itself.
AFAIK, nix and Solaris' pkg are pretty close to this ideal.
A big advantage that this has, on top of security, is that:
- packages can be uninstalled safely and without side-effects
- package contents can be inspected (pkg contents)
- corrupted installations can be detected using checksums (pkg fix); see the sketch after this list
- package updates/installs can be rolled back using file system snapshots.
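On the checksum point, a minimal sketch of the idea, assuming the dpkg-style convention of recorded checksums in /var/lib/dpkg/info/<package>.md5sums (roughly what debsums or Solaris' `pkg fix` automate):

```python
#!/usr/bin/env python3
"""Sketch: verify a package's installed files against recorded checksums.

Assumes the dpkg convention of /var/lib/dpkg/info/<package>.md5sums,
where each line is "<md5>  <path relative to />".
"""
import hashlib
import sys
from pathlib import Path

def verify(package: str) -> int:
    bad = 0
    md5sums = Path(f"/var/lib/dpkg/info/{package}.md5sums")
    for line in md5sums.read_text().splitlines():
        expected, relpath = line.split(None, 1)
        path = Path("/") / relpath
        try:
            actual = hashlib.md5(path.read_bytes()).hexdigest()
        except FileNotFoundError:
            print(f"MISSING  {path}")
            bad += 1
            continue
        if actual != expected:
            print(f"MODIFIED {path}")
            bad += 1
    return bad

if __name__ == "__main__":
    sys.exit(1 if verify(sys.argv[1]) else 0)
```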
Because the scope of 'certain reindexing actions' is potentially unbounded. Suppose you have a large piece of software that uses plugins, which it finds by reading a 'plugins.xml' file. Now, when you install a plugin from the package manager, you need to modify 'plugins.xml' in some arbitrary fashion. You simply cannot constrain the entire software world to operate in a file-oriented, declarative manner.
> You simply cannot constrain the entire software world to operate in a file-oriented, declarative manner.
You can, if the community wants to go in that direction.
Ideally the software which loads "plugins.xml" is updated to support a conf.d-like directory full of "plugin-pkg1.xml" through "plugin-pkgN.xml", each owned by separate packages. Many OSS projects have done this.
If that's not possible (closed-source, too much work, whatever) then it seems simple enough to build a separate XML-merging tool that the package manager invokes which combines the separate XML files during package installation.
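A rough sketch of what such a merge step could look like; the paths here are hypothetical, and it assumes each package drops a fragment whose children are `<plugin>` entries:

```python
#!/usr/bin/env python3
"""Sketch: merge per-package plugin fragments into a single plugins.xml.

Hypothetical layout: each package owns one file
/usr/share/app/plugins.d/plugin-<pkg>.xml, and the package manager
invokes this tool after every install/remove.
"""
import glob
import xml.etree.ElementTree as ET

FRAGMENT_GLOB = "/usr/share/app/plugins.d/plugin-*.xml"  # owned by individual packages
MERGED_FILE = "/usr/share/app/plugins.xml"               # generated; owned by no package

def merge_fragments():
    merged = ET.Element("plugins")
    for path in sorted(glob.glob(FRAGMENT_GLOB)):
        fragment = ET.parse(path).getroot()
        # Copy every <plugin> entry from the fragment into the merged document.
        for plugin in fragment.findall("plugin"):
            merged.append(plugin)
    ET.ElementTree(merged).write(MERGED_FILE, encoding="utf-8", xml_declaration=True)

if __name__ == "__main__":
    merge_fragments()
```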
... and hopefully programmers realise that building all those things and shipping them is harder than just supporting a directory with the plugins in it. Or maybe even just punt with a wrapper that does it at startup.
Honestly: the defence of shit software and user-hostile developers never ceases to amaze me.
Running arbitrary code may be needed if you want to upgrade an existing installation in-place. Imagine that you are upgrading to a new major version and want to migrate the existing configuration to a new format.
This need not be an automatic action, though; you might choose to postpone the migration, and offer a separate tool for the user to run.
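A sketch of what such a standalone migration tool could look like; the formats and paths are entirely made up (an old flat key=value file converted to an INI-style layout):

```python
#!/usr/bin/env python3
"""Hypothetical one-shot migration tool shipped alongside a new major version.

The user runs it when ready; package installation itself never touches
the existing configuration. Both file formats here are invented.
"""
import configparser
from pathlib import Path

OLD_CONF = Path("/etc/myapp/myapp.conf")  # old flat key=value format (hypothetical)
NEW_CONF = Path("/etc/myapp/myapp.ini")   # new INI-style format (hypothetical)

def migrate():
    new = configparser.ConfigParser()
    new["general"] = {}
    for line in OLD_CONF.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, _, value = line.partition("=")
        new["general"][key.strip()] = value.strip()
    with NEW_CONF.open("w") as fh:
        new.write(fh)
    print(f"Migrated {OLD_CONF} -> {NEW_CONF}; the old file was left untouched.")

if __name__ == "__main__":
    migrate()
```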
This prevents any packages from depending on such a package, though: they can no longer be automatically installed, because their dependency cannot be, either.
That changes the concept of package management quite significantly.
Well, for me this is a question of who owns the config file:
* If the package owns the file, and the user is not supposed to make changes to it, then package updates may do whatever they like to the file on update. It's good practice, though, to leave a copy of the old file in /lost+found.
* If the user is allowed to make modifications, then the user owns it. The configuration syntax becomes an API and should be treated as such: breaking changes should be rare, preceded by a deprecation period, and announced as major versions. As you say, it's questionable whether such updates can/should be automated.
True.
And the only effective solutions I can think of are extremely resource intensive (man hours) and therefore unrealistic. (Like wrapping non-compliant apps and using mandatory access control for enforcement.)
I'd like to see the dpkg/apt system enhanced with an audit feature - one able to log all the files added/modified/deleted as part of the installation process, ideally in HTML format.
Any admin could then view the contents to debug or spot potential security issues.
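A rough sketch of that kind of audit, limited to the file list the .deb itself declares (via `dpkg-deb --contents`), rendered as a minimal HTML report:

```python
#!/usr/bin/env python3
"""Sketch: turn a .deb's declared file list into a simple HTML audit report.

Uses `dpkg-deb --contents`, which prints a tar-style listing of everything
the package would place on the filesystem. It cannot show what maintainer
scripts would additionally do -- which is rather the point of the article.
"""
import html
import subprocess
import sys

def audit(deb_path: str) -> str:
    listing = subprocess.run(
        ["dpkg-deb", "--contents", deb_path],
        check=True, capture_output=True, text=True,
    ).stdout
    rows = "\n".join(
        f"<tr><td><code>{html.escape(line)}</code></td></tr>"
        for line in listing.splitlines()
    )
    return (f"<html><body><h1>Files shipped by {html.escape(deb_path)}</h1>"
            f"<table>{rows}</table></body></html>")

if __name__ == "__main__":
    print(audit(sys.argv[1]))
```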
> I don't get why we would want to allow packages to run any scripts before/after the installation.
That's because you've probably never built a package management system. If you limit the package management system to the actions you propose, it's only usable for iOS-type apps. That could be useful, but then you still need a solution for managing the rest of the system.
You might have misunderstood "postInstall" -- rather understandably so, since elsewhere (Debian packages, ...) the term refers to precisely the issue discussed in the article: arbitrary scripts executed after package installation.
In Nix, postInstall (and preInstall, as well as preBuild/postBuild, etc.) specifies commands to execute before/after the corresponding build "phase" -- so if a package is "almost" good to go with just "make install", you could use postInstall to do something like copy a file omitted by upstream's installation target.
The point is that postInstall in Nix is part of how the package itself is constructed -- in contrast to commands run after installing the package. There is no equivalent for the latter in Nix, in a fundamental way (not merely by policy or for technical reasons).
Adding a new systemd service, for example, just requires me to put a file into `/etc/systemd/system`.
Many other applications work like this as well: Apache configs, /etc/sudoers.d on Debian, and more examples under `/etc/*.d`.
It's generally a good practice to make your system open for extension but closed to modifications [1].
There is no reason the user database cannot work the same way. Just have the system scan `/etc/passwd.d/*` on boot/reload and drop files there.
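A minimal sketch of that idea (note that `/etc/passwd.d` is hypothetical -- no mainstream distro ships it): a boot/reload step that folds the fragments into `/etc/passwd`, skipping user names that already exist:

```python
#!/usr/bin/env python3
"""Hypothetical: assemble /etc/passwd from drop-in fragments.

Each package would own one file in /etc/passwd.d/ containing ordinary
passwd(5) lines; this merge step would run at boot or on reload.
A real implementation would need vipw-style locking and removal handling.
"""
import glob

PASSWD = "/etc/passwd"
FRAGMENT_DIR = "/etc/passwd.d"  # hypothetical drop-in directory

def merge():
    with open(PASSWD) as fh:
        existing = {line.split(":", 1)[0] for line in fh if line.strip()}
    new_lines = []
    for path in sorted(glob.glob(f"{FRAGMENT_DIR}/*")):
        with open(path) as fh:
            for line in fh:
                if not line.strip():
                    continue
                name = line.split(":", 1)[0]
                if name not in existing:
                    new_lines.append(line.rstrip("\n"))
                    existing.add(name)
    if new_lines:
        with open(PASSWD, "a") as fh:
            fh.write("\n".join(new_lines) + "\n")

if __name__ == "__main__":
    merge()
```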
Then you need to add support for these features to the package manager. I imagine the package manager will never cover all use cases, so for rare cases you would still need the ability to run a script.
Your suggestion only seems to work if the software in the package is stateless. In the case of a DBMS the approach would just put data upgrade problems in another place and in some cases make them harder to solve as well.
I'm having slight trouble understanding the threat vector this is supposed to be protecting against. If you don't trust a package's install script, why would you trust any of the binaries installed by that package?
If you're unsure about bugs in a package's install script, why aren't you equally unsure about bugs in the binaries installed by the package?
In fact, install scripts are auditable; third-party compiled binaries aren't (at least not easily).
I see other advantages in declarative approaches - for example, more freedom for Debian to change the underlying file system layout, or to give the user some information about what the package is going to change for easier troubleshooting - but I do not see any advantage security-wise.
I mostly agree with you, but it is worth noting that most software is installed by root but run by some "less privileged" user. That said, I still agree with you, as there is no data of value on any computer I own that is only accessible to root: all of the data that matters is in fact accessible to whatever the least privileged user that is using all of the software on that machine happens to be... on my laptop, that's me, and on my database server that is my database. So really there is no difference between running software as root and running it as the "less privileged" user.
Like, the idea that I care that Chromium can run software as root is nonsensical: yes, root can modify all of the software on my computer... to what end? The only thing of value on my computer is in my home directory, owned by me... hell, thanks to a bunch of people who (incorrectly) think they can make their computers safer by running fewer things as root, a ton of executable files are in my home directory thanks to userspace package managers provided by Rust and node.js, so you can even modify other software without even having to be root anymore :/. There is simply no security advantage to any of this.
There are still a handful of multi-user Linux systems out there. Chromium as non-root cannot read other users' files when you log into some evil site that knows of a backdoor. (Assuming no other root backdoor.)
There is one other point: if my OS is sound and my user files are corrupted - well at least I can restore my files from backup without first trying to reinstall my machine. It saves a little effort.
Docker is a nice idea. It's one tool, one system, for easily packaging software and running it, in an isolated environment. But Docker includes a lot of crap most people don't need. Do we need an isolated network for our apps? Do we need an isolated cgroup? Do we need to install a complete base image of an OS? Do we need root to run those apps? The answer, for most cases, is no.
Then there's things like Flatpak. They also want to make it easy to package and distribute software. And they see all the features of Docker and go, "Hey, a sandbox! That sounds cool! Let's make it mandatory!" In order to simply distribute software in a compatible way, they include a lot of restrictions they don't need to just distribute and run software.
All you need to distribute a software package is files, and a subsystem that maps files into the user's environment, and links together the files needed to run the software. We can accomplish this with a copy-on-write, overlay filesystem, and some software to download dependent files and lay them out in the right way. It should be incredibly simple, and it should work on any operating system that supports those filesystems. And those filesystems should be simple enough to implement on any operating system!
So what the hell is the deal here? Why has nobody come along and just provided the bare minimum needed to just distribute software (edit: in a way that also allows it to be run with all its dependencies in one overlay filesystem view)? Why is it always some ass-backwards, incompatible crap that is "controversial"? Why can't we just make something that works for any software?
It's literally what you described in the third paragraph:
> All you need to distribute a software package is files, and a subsystem that maps files into the user's environment, and links together the files needed to run the software. We can accomplish this with a copy-on-write, overlay filesystem, and some software to download dependent files and lay them out in the right way. It should be incredibly simple, and it should work on any operating system that supports those filesystems. And those filesystems should be simple enough to implement on any operating system!
It's literally, and exactly that. I'm not sure how it can be not what you're looking for.
Well, first of all, there's almost no documentation. I can see how to install it and run it, but I have no idea how to use it. How am I supposed to package software for it, or with it? How does it work internally? How is it supposed to provide an overlay for an individual application, much less multiple? How does it link dependencies? What's even the package manifest format? I skimmed through the C code and the Tcl code and none of it was readily apparent to me, and I could find no more documentation than a Getting Started guide. The manual is literally just the arguments to the app. I'm not sure how anyone other than the author could figure out how to do what I'm looking to do with it.
Also, you may be missing that I'm looking for Docker-like functionality. That is to say, have, say, 3 trees of files, one built on top of the other. When launching the application, lay the first tree down on an overlay, then the second, then the third, then run the application. By having the third tree link to the first two, I just pick the tree I want to run (the third tree, the one with the app), and it creates the correct overlays with the correct dependencies and runs it. No special paths needed by the application. What I'm asking for could be done with existing packaged software, with no need to change existing packages - the same way Docker does it now.
1. To use it to run an application:
a. Get AppFSd running; then
b. Run the application you actually wanted to run
2. To package an application:
a. Create a package manifest
b. Run the build script to create a CPIO archive
c. Upload that CPIO archive to a webserver
d. Run the script to publish the archive
3. The package manifest format is described in the README ( http://appfs.rkeene.org/web/doc/trunk/README.md ) -- it's CSV with the format "type,time,extraData,name", where extraData depends on the value of "type"; for type==file, it's "size,perms,sha1".
If you're looking for something Docker-like then it's different since this is a filesystem based approach which doesn't require any of the containerization techniques used by Docker... which is how this conversation got started. It would also not be available on every platform since not every platform supports the same containerization mechanisms, and for platforms that do they often require escalated privileges.
The Overlay2 filesystem driver is Linux-native, but you could implement it as a FUSE module. It would provide basically all the functionality I'm looking for, minus the network code. The idea would be to unpack software packages on the filesystem (e.g. chroot environment) and then overlay the directories of the packages needed before running a particular version of an app.
In order to make custom paths transparent to the application, you need some custom system calls that Linux provides, such as mount namespaces, which is essentially a containerization technology - but you don't need to use "containers" per se, just a particular system call. If you don't use mount namespaces, you have to use complicated hacks to make an application's view of the filesystem unique, such as chroot with bind mounts, or an LD_PRELOAD filter, but all that's too hacky for a general solution.
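A rough sketch of the three-tree idea using only stock Linux pieces -- `unshare` from util-linux for a private mount namespace, plus an overlayfs mount with stacked lowerdirs; every path here is made up for illustration:

```python
#!/usr/bin/env python3
"""Sketch: run an app from three stacked package trees via overlayfs.

Linux-only and needs root (or a suitable user-namespace setup); all paths
are illustrative and must exist beforehand (upper/work on the same fs).
`unshare --mount` gives a private mount namespace, so the overlay
disappears with the process.
"""
import subprocess

LOWERS = [
    "/pkgs/myapp-1.2",     # application tree (top layer; hypothetical)
    "/pkgs/libfoo-3.4",    # dependency tree
    "/pkgs/base-runtime",  # base runtime tree (bottom layer)
]
UPPER, WORK, MNT = "/tmp/overlay/upper", "/tmp/overlay/work", "/tmp/overlay/root"

def run(app_cmd: str):
    opts = f"lowerdir={':'.join(LOWERS)},upperdir={UPPER},workdir={WORK}"
    script = (
        f"mount -t overlay overlay -o {opts} {MNT} && "
        f"exec chroot {MNT} {app_cmd}"
    )
    subprocess.run(["unshare", "--mount", "sh", "-c", script], check=True)

if __name__ == "__main__":
    run("/usr/bin/myapp")
```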
Plan9 had mount namespaces (among other things) decades ago, but good luck getting modern OSes to implement useful features in a standard way
I've used Debian since 1994 and can't recall a single package installation or removal which exhibited any problem this discussion claims to be solving.
I build many small packages and have made a few postinst files which occasionally break and leave apt in an upset state, typically when auto-installed by the preseed while trying to do horrible things to files (like editing snmpd.conf). These aren't packages that follow the guidelines, though; they're just internal ones I build for a small number of installs, and it only ends up with an unconfigured package. I've never seen an external package that accidentally causes problems.
Postinsts and other control files should on the whole be really simple.
This post seems to be more about protecting users from accidental problems with package installation scripts, not about deliberate problems with packages, nor accidental problems with packages after the installation script has run.
When I take an untrusted .deb, I look inside at the files it's dumping out, and the scripts it's going to run. Making apt more declarative wouldn't stop that.
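For reference, that kind of inspection needs nothing beyond `dpkg-deb`: `--contents` lists the files it would drop on disk, and `--control` extracts the maintainer scripts so they can be read. A small helper, as a sketch:

```python
#!/usr/bin/env python3
"""Sketch: inspect an untrusted .deb before deciding to install it."""
import subprocess
import sys
import tempfile
from pathlib import Path

def inspect(deb: str):
    # Files the package would drop on disk.
    subprocess.run(["dpkg-deb", "--contents", deb], check=True)
    # Maintainer scripts (preinst, postinst, prerm, postrm) and other control files.
    with tempfile.TemporaryDirectory() as tmp:
        subprocess.run(["dpkg-deb", "--control", deb, tmp], check=True)
        for item in sorted(Path(tmp).iterdir()):
            print(f"--- {item.name} ---")
            print(item.read_text(errors="replace"))

if __name__ == "__main__":
    inspect(sys.argv[1])
```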
Packages in the official Debian archives are generally well behaved. Unofficial packages, sometimes less so. There was an incident a couple of months ago where a package (from Microsoft, incidentally) replaced your /bin/sh with bash. Wouldn't it be neat if such an atrocity wasn't possible?
"Unofficial packages" aren't really in the set of supported dpkg use cases. They're a hack that fail in the general case because there are various things they can't express because the expression of them comes from the other side of the package relationship (eg. Breaks/Replaces), and official packages don't declare such things for packages outside the Debian archive. Therefore external packages are fundamentally broken, but third parties keep using them because better alternatives haven't existed in the past.
> Wouldn't it be neat if such an atrocity wasn't possible?
An easier way might be to make it difficult for users to install "unofficial packages", but that would be against the philosophy of users having ultimate control over their systems.
Your suggestion puts us in the rather interesting situation that you're requesting a new feature for something that primarily affects third party packages that aren't actually supported in the first place.
I don't have the link at hand, but a few years ago (so I'm probably wrong about the details) I think a Debian developer was so annoyed by the misuse of his personal repository that he modified his deb packages to change the user's desktop background, mostly to demonstrate that installing random unknown packages from the Interweb could be dangerous.
I recall that episode made some noise at the time...
Yes, but we now live in a world of weaponized package takeovers, such as the Kite incident with IDE plugins, or the security mishaps that have afflicted npm. The "I'm a good guy, you're a good guy, we're all good guys" approach doesn't really cut it any longer.
I'm unfamiliar with 'the Kite incident' but the NPM incidents have been due to
a) the ass-backwards approach the company took to building a package manager.
b) the new development paradigm of "fuck it, find a library". Flying Spaghetti Monster forbid these hipster developers actually have to write some fucking code themselves; it's easier just to duct-tape together 200 different 'libraries', half of which do what you could do in 5 lines or less, producing a dependency tree like fucking crab grass.
I'd love it if some more common things that are done in postinst scripts could be done in a declarative way, like adding system users.
And then there are things that are declarative in the debian/ dir (like auto-(re)starting the installed services) that end up as generated, procedural code in the postinst script.
When such things can be done in a declarative manner, it's much easier to reason about them programmatically, and maybe you could completely disable postinst scripts for a whole category of packages.
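As a toy sketch of what the system-user case could look like if the package manager processed a declarative field itself (the metadata format here is made up) instead of the package shipping a postinst that calls useradd:

```python
#!/usr/bin/env python3
"""Toy sketch: the package manager creates a system user from a declarative
field in the package metadata (the field and its format are invented here),
instead of the package shipping a postinst that calls useradd itself.
"""
import pwd
import subprocess

def ensure_system_user(name: str, home: str, comment: str):
    try:
        pwd.getpwnam(name)
        return  # already present: the operation stays idempotent
    except KeyError:
        pass
    subprocess.run(
        ["useradd", "--system", "--home-dir", home,
         "--shell", "/usr/sbin/nologin", "--comment", comment, name],
        check=True,
    )

# e.g. declared in (hypothetical) package metadata as:
#   System-User: mydaemon /var/lib/mydaemon "My daemon service"
if __name__ == "__main__":
    ensure_system_user("mydaemon", "/var/lib/mydaemon", "My daemon service")
```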
When I first learned how Debian (and other) packages were put together I was somewhat surprised. Not just by the security implications but also by the inherent complexity. Even if you’re using templates and generators you can gain a lot by moving all of that out of the package and into the package manager/installer. Distribution-level changes, user OS customization and possibly even reduced privileges would become easier to handle. On the package manager side it seems like maintainability would (in the general case) vastly increase.
There is a downside though: if you forget something in the declarative tool, it cannot be done. Using bash scripts lets a good programmer get what they need done. Using a declarative approach makes it easier to do everything we thought of, but some things become accidentally impossible.
I think an escape to bash might always be required, but I think it should be a restricted thing. That is, packages with that flag should require a special flag passed to dpkg. Also, any packages using bash should automatically get extra reviewers.
I agree. I would imagine that the vast majority of packages could easily be marked as "doesn't run anything as root, doesn't install any SUID executables", with enforcement from APT. This then reduces the attack/bug surface to those packages that do need to do dangerous things.
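A sketch of how the SUID half of that could be enforced at the archive level: scan the mode column of `dpkg-deb --contents` and refuse packages that ship setuid/setgid files (what a maintainer script might do at run time is, of course, exactly what this can't see):

```python
#!/usr/bin/env python3
"""Sketch: refuse a .deb that ships setuid/setgid files.

Parses the mode column of `dpkg-deb --contents` output,
e.g. "-rwsr-xr-x root/root ... ./usr/bin/tool".
"""
import subprocess
import sys

def has_setuid_or_setgid(deb: str) -> bool:
    listing = subprocess.run(
        ["dpkg-deb", "--contents", deb],
        check=True, capture_output=True, text=True,
    ).stdout
    refused = False
    for line in listing.splitlines():
        mode = line.split()[0]
        if "s" in mode.lower():  # setuid/setgid shows up as s or S in the mode string
            print(f"refusing: {line}")
            refused = True
    return refused

if __name__ == "__main__":
    sys.exit(1 if has_setuid_or_setgid(sys.argv[1]) else 0)
```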
I addressed this in my packages with a roughly declarative language that turned into shell functions in a Debian, Arch, or FreeBSD package (by dint of prefixing with a shell function library) and into the packing list format for OpenBSD (by dint of an awk script). Examples:
There are different shell function mappings for pre/post-install/remove/update, making what (say) "login_service_with_dedicated_logger" does vary according to the action being taken and whether the package is a "tools" package or a "run" package. In the post-install of a "tools" package it creates dedicated user accounts and log file directories. In the post-install of a "run" package it conditionally enables and starts services.
It still needed the ability to run general-purpose commands here and there, especially to perform tidying up as things were renamed and errors were corrected over the years, and the number of declarative verbs that one turns out to need in practice is quite large.