“Shared libraries are not a good thing in general” (kernel.org)
792 points by pengaru 6 days ago | 500 comments





As Linus points out, the utility of shared libraries depends on the situation. Widely used libraries like libxcb, libGL, and zlib that tend to have reasonably stable ABIs are good candidates for sharing, while the "couple of libraries that nobody else used. Literally nobody" should obviously be statically linked. Software engineering is mostly about learning how to choose which method is appropriate for the current project.

However, the important advantage of using .so files instead of building into a big statically linked binary is NOT that a library can be shared between different programs! The important benefit is dynamic linking. The ability to change which library is being used without needing to rebuild the main program is really important. Maybe rebuilding the program isn't practical. Maybe rebuilding the program isn't possible because it's closed/proprietary or the source no longer exists. If the program is statically linked, then nothing can happen - either it works or it doesn't. With dynamic linking, changes are possible by changing which library is loaded. I needed to use this ability last week to work around a problem in a closed, proprietary program that is no longer supported. While the workaround was an ugly LD_LIBRARY_PATH hack, at least it got the program to run.


I don't think that being able to replace a library at runtime is a useful enough feature to justify the high maintenance cost of shared libraries. Like I complain about in a comment below, the cost of shared libraries is that an upgrade is all-or-nothing. If one program on your system depends on a quirk of libfoobar-1.0.3, and another program on your system depends on a quirk of libfoobar-1.0.4, you're fucked. You can't have both programs on your system, and Linux distributors will simply stop updating one of them until the author magically fixes it, which happens approximately never. And, the one they stop updating is the one that you want to update to get a new feature, 100% of the time.

It's just not worth it. A binary is a closure over the entire source text of the program and its libraries. That's what CI tested, that's what the authors developed against, and that's what you should run. Randomly changing stuff because you think it's cool is just going to introduce bugs that nobody else on Earth can reproduce. Nobody does it because it absolutely never works. You hear about LD_PRELOAD, write some different implementation of printf that tacks on "OH COOL WOW" to every statement, and then never touch it again. Finally, dynamic loading isn't even the right surface for messing with the behavior of existing binaries. It has the notable limitation of not being able to change the behavior of the program itself; you can only change the results of library calls.

I never want to see a dynamic library again. They have made people's lives miserable for decades.


>Nobody does it because it absolutely never works.

As an intern at a data science company, I noticed that a certain closed-source shared library we were using was calling the same trigonometric functions with the same arguments over and over again. So, I tried to use LD_PRELOAD to "monkey-patch" libm with caching, and that reduced the running time of the data pipeline by 90%.

>I never want to see a dynamic library again.

And replace it with what? Completely statically linked binaries? How much disk space would the Linux userland require if every binary were statically linked?


I like the anecdote, but I don't see what argument it offers. Usually, the core maintainers are the ones with the strongest context on the project, and thus would have the best knowledge of where optimizations are needed.

Telling maintainers to dynamically link just in case someone wants to change the libraries is like having replaceable CPUs on motherboards. Useful for 0.1%, but requires a lot of working around.


I was just giving a real-world example where dynamic patching via LD_PRELOAD was quite useful, not offering it as an argument for maintainers to use dynamically linked libraries.

Of course, there are many independent arguments for using dynamically linked libraries, which is why most linux distributions use them.


> replaceable CPUs on motherboards

What do you mean by this? You can generally replace the CPU on a desktop or laptop motherboard.


> As an intern at a data science company, I noticed that a certain closed-source shared library we were using was calling the same trigonometric functions with the same arguments over and again. So, I tried to use LD_PRELOAD to "monkey-patch" libm with caching and that reduced the running time of the data pipeline by 90%.

This is a very edgy edge case, though.

> How much disk space would linux userland require if every binary was to be statically linked?

Megabytes more?

I have never been on a project or a machine since the 70s where executables took a majority of the disk space. On this disk, about 15% of it is executables, and I have a lot of crap in my executable directory and just deleted a lot of junk data.

Also, no one's seriously proposing that core libraries that everyone uses be statically linked.


>Megabytes more?

Megabytes more in total, or megabytes more per binary?

> On this disk, about 15% of it is executables, and I have a lot of crap in my executable directory and just deleted a lot of junk data.

Is there any study on how this might change if we moved from dlls to statically linking everything?

> no one's seriously proposing that core libraries that everyone uses be statically linked

If we agree that shared libraries are fine for the common case, and there are n number of existing solutions for distributing applications with tricky dependencies, what are we even talking about in this thread? :\


> Is there any study on how this might change if we moved from dlls to statically linking everything?

Drew DeVault did an informal one: https://drewdevault.com/dynlib.html

TL;DR: not substantially at all.


The disk space issue could be fixed in other ways. Have the filesystem automatically find and share common blocks.

into, say ..... common block libraries?

no, into common block maps. it's filesystem compression and deduplication, and it works great, at a lower level than libraries. I use it with ZFS and btrfs, works wonders.

That’s a dependency problem, which is orthogonal. A model like Nix can make this work splendidly. The programs that share an exact version of a shared lib do share it, but you are free to use a custom dynamic lib for only one package. In a way, Nix makes your notion of a “closure” the way of life everywhere (not just binaries, but scripts as well, referring to interpreters and other programs by explicit hash of derivation).

> I don't think that being able to replace a library at runtime is a useful enough feature to justify the high maintenance cost of shared libraries.

We’re moving to a world where every program is containerized or sandboxed. There is no more real “sharing” of libraries, everything gets “statically linked” in a Docker image anyway.


I bet someone will invent shared content for Docker containers in the next few years as a disk-saving measure.

That's called volume mounts and/or base images ?

If I do an `apt-get install` of the same packages in different containers, with anything different at all before it in the Dockerfiles, there's no way to share a common layer that includes the package contents between those images.

You could volume mount /usr and chunks of /var to help, but I'm not sure that would be sane.


There are storage systems which do this, deduplicating at the disk level.

In the cloud. Otherwise it is very seldom justified.

Not using Docker, but Snap/Flatpak use similar approaches.

The flip side is that it is easy to track security vulnerabilities at the level of a shared library - once the distribution pushes an update, all the dependent applications are fine.

Imagine you have 10 programs statically linked against the same library. How many will be promptly updated? How do you, as a system admin, track which ones were not?


You solve two unrelated problems (tracking library dependencies and security patching) with one complex interdependent solution and then you pretend no other solution can possibly exist. This is a trap.

In production you very often end up maintaining stuff that isn't from the distribution; you have to track and rebuild that anyway.


The solution may be "complex and interdependent" but it already exists, and has been used for 35+ years.

Sure, conceptually there may be a better solution to both problems, but it has not been deployed by any GNU/Linux distribution I know.

And I am not sure if dockerizing or building everything statically and hoping a solution will eventually show up one day is the path forward.


It's not.

At least when the distro updates its shared libraries it fixes it for all users, imagine how many rebuilds + time + effort it would take for every docker container that had the same shitty small bug.


> You solve two unrelated problems (tracking library dependencies and security patching) with one complex interdependent solution (...)

Linking to a library is not what I would describe as "complex", nor does it pose any meaningful challenge regarding "interdependence".

This issue makes no sense on Windows, where you are free to drop any DLL into the project folder, and for different reasons it also makes no sense on UNIX-like distributions, which already provide officially maintained packages.

I'm sure that there are plenty of problems with distributing packages reliably, but dynamic libs ain't one of them.


It seems to me like the better way to solve this would be for distributions to publish which versions of dependencies each package uses, and provide an audit tool that can analyze this list and notify of any vulnerabilities.

>If one program on your system depends on a quirk of libfoobar-1.0.3, and another program on your system depends on a quirk of libfoobar-1.0.4, you're fucked.

Why? You can have both, e.g. /lib/libfoobar-1.0.3.so and /lib/libfoobar-1.0.4.so

Usually you would have /lib/libfoobar-1.so -> /lib/libfoobar-1.0.4.so, but it doesn't prevent you from linking the problematic program directly with libfoobar-1.0.3
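Concretely, that layout can be sketched with stand-in files (libfoobar is the thread's hypothetical library, not a real one):

```shell
# Both concrete versions sit side by side; the soname symlink selects the
# default, and nothing is deleted when the default moves.
d=$(mktemp -d)
touch "$d/libfoobar.so.1.0.3" "$d/libfoobar.so.1.0.4"   # stand-in version files
ln -s libfoobar.so.1.0.4 "$d/libfoobar.so.1"            # default resolution
readlink "$d/libfoobar.so.1"   # shows where the soname symlink points
ls "$d"                        # 1.0.3 is still present for direct linking
```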


> but it doesn't prevent you from linking the problematic program directly with libfoobar-1.0.3

I feel that patchelf[0] is an ingenious little tool for exactly that purpose that isn't getting enough love. Not as useful for FOSS stuff, but it's been really useful for the times I had to relink proprietary things on our cluster in order to patch a security vulnerability or two.

[0] https://github.com/NixOS/patchelf
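For reference, the kind of relink described above uses patchelf's actual flags. The binary name `./legacy-app`, the libfoobar sonames, and the `/opt/compat/lib` path are all hypothetical, and the sketch is guarded so it is a no-op where patchelf or the binary are absent:

```shell
# Retarget a prebuilt binary from the 1.0.4 soname back to 1.0.3 without
# rebuilding it. No-op unless patchelf and the (hypothetical) binary exist.
if command -v patchelf >/dev/null 2>&1 && [ -f ./legacy-app ]; then
    patchelf --print-needed ./legacy-app        # list current DT_NEEDED entries
    patchelf --replace-needed libfoobar.so.1.0.4 libfoobar.so.1.0.3 ./legacy-app
    patchelf --set-rpath /opt/compat/lib ./legacy-app   # where the old .so lives
fi
done_msg="patchelf sketch finished"
echo "$done_msg"
```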


> You can't have both programs on your system

That's incidental and not a matter of principle, isn't it? You should be able to have multiple versions of a library installed, and the dynamic linker would pick the appropriate version.

Heck you could even have multiple versions in the same process (if you depend on A and B, and B uses an older version of A internally). Unfortunately I think the current dynamic linker on Linux has a shared namespace for symbols.

I also disagree that it is not useful to be able to update libraries separately. The example that I like to give is a PyGtk app written against an early Gtk2. Without modifications, it ran against the very last Gtk2, now supporting newest themes, draggable menubars, better IME, window size grippers, and all kinds of other improvements to Gtk.

In the days of ActiveX (and I realize this is a bad example because in reality you often had breakage), you could update a control and suddenly your app would have new functionality (like being able to read newer file format versions or more options to sort your grids and so on).


> Like I complain about in a comment below, the cost of shared libraries is that an upgrade is all-or-nothing.

It really isn't. At all. For example, let's take the notoriously egregious case of C++, which doesn't really have a stable ABI, and a major framework like Qt, which provides pretty much the whole platform. Since the introduction of Qt5, you can pretty much upgrade even between LTS releases without risking any problem: replace libs built with the same compiler tool chain, and you're done.

This is not an isolated case. Shared libraries plus semantic versioning are not a major technical hurdle, and in practice the combination only has upsides.


> you can pretty much upgrade even between LTS releases without risking any problem: replace libs built with the same compiler tool chain, and you're done

as someone who develops Qt apps for a living, this is not true at all. Yes, Qt promises no ABI / API break, sure. But it doesn't matter at all when a new release introduces a bug that breaks my app. It's super common to have to pin software to Qt versions if one wants to minimize bugs and, more importantly, to have a precise behaviour for all the users of the software. For instance, the way hidpi has been implemented has changed something like 3 times across the Qt5 lifetime, and there are still things to fix; this means that on a hidpi screen the software won't look the same if linked against Qt 5.6 / 5.10 / 5.15, which is pretty bad: assets may have been made specifically to fit the "Qt 5.6" look and may appear broken / incorrectly positioned / etc. on later versions (and conversely).


> But it doesn't matter at all when a new release introduces a bug that breaks my app.

I fail to see the relevance of your case. Semantic versioning and shared libraries is a solution to the problem of shipping bugfixes by updating a single component that's deployed system-wide. Semantic versioning and shared libraries do not address the problem of a developer shipping buggy code. Shared libraries ensure that you are able to fix the bug by updating a single library, even without having to wait for the package or distro maintainers to notice the bug exists.

> It's super common to have to pin software to Qt versions if one wants to minimize bugs (and more importantly have a precise behaviour for all the users of the software - for instance, the way hidpi has been implemented has changed something like 3 times across the Qt5 lifetime, and there are still things to fix ;

I've worked on a long-lived project based on a Qt application supported on all platforms, and we upgraded seamlessly from Qt 5.4 up to 5.12 without touching a single line of code.

In fact, the only discussions we had to have regarding rebuilding the app was due to a forced update to Visual Studio, which affected only Windows.

YMMV of course, especially if no attention is paid to the contract specified by the library/framework, but I disagree that your personal anecdote is representative of the reality of using shared libraries, even in languages which are notoriously challenging to work with, such as C++.


> Shared libraries ensure that you are able to fix the bug by updating a single library,

but that assumes that the only thing that changes between library_v1.0.0.so and library_v1.0.1.so is the bug. There are always tons of regressions and unrelated changes between minor Qt versions - hell, I even had patch versions break, for instance I was hit by this which was introduced in 5.11.1: https://bugreports.qt.io/browse/QTBUG-70672 or this which worked before 5.9 and failed afterwards (for good reasons) : https://bugreports.qt.io/browse/QTBUG-60804 or this in 5.4: https://bugreports.qt.io/browse/QTBUG-44316

> and we've upgraded seamlessly from Qt 5.4 up to 5.12 without touching a single line of code.

and that worked on mac, windows, linux with X11 & wayland, in low-dpi and high-dpi alike ? I don't see how if you care about your app's looks (e.g. your UI designers send you precise positioning & scale of every widget, text, image, etc).

hell, if you have a QtQuick app today, resizing its window has been broken since Qt 5.2 (since then, layouting happens asynchronously which gives the impression the app wobbles - and that is the case for every Qt Quick app under Linux, even the simplest hello world examples ; this worked much better in 5.0 / 5.1, see e.g. https://bugreports.qt.io/browse/QTBUG-46074 ).


>If one program on your system depends on a quirk of libfoobar-1.0.3, and another program on your system depends on a quirk of libfoobar-1.0.4, you're fucked. You can't have both programs on your system, and Linux distributors will simply stop updating one of them until the author magically fixes it, which happens approximately never.

I've not experienced that. If the package is maintained at all, then it will get updated. People are still somehow maintaining Perl 5 libraries and putting out bugfix releases, which are functionally equivalent to dynamic linking.

If it's not maintained, then of course it will get removed from the distribution, for that and for a number of other reasons.


> People are still somehow maintaining Perl 5 libraries and putting out bugfix releases, which are functionally equivalent to dynamic linking.

Not to be pedantic, but having multiple versions of any Perl library, or multiple versions of Perl on a single server without things getting stepped on is trivially easy. There's utils to switch Perl versions as well. I mean, it's Perl.

Perl has a long, long, long history and culture of testing, backwards compatibility, Kwalitee Assurance, and porting to every system imaginable behind it as well, so if some Perl script from the 90's still runs with little to no modification, that should not exactly be seen as a rarity.


I don't mean modules installed from cpan, this is specifically for things like Perl and Python modules packaged in Debian. On purpose, there is usually no way to install more than one version of those with the system package manager. Maintainers seem to be doing a fine job of keeping them up-to-date. One of the major reasons to do this is so packages that depend on those libraries actually DO get tested with updates and don't get neglected and stuck on some old buggy version.

There are a lot of reasons to prefer static linking, but to me the argument of "it means you don't need to keep your package up to date with bug and security fixes" never held up, from a distro point of view anyway.


Nix OS/Packages allow several applications to use different versions of shared libraries, which IMHO is the way to go.

> Finally, dynamic loading isn't even the right surface for messing with the behavior of existing binaries.

Then what is? Dropping a custom library next to a binary tends to be way easier than modifying the binary.


If libfoobar-1.0.4 is not a drop-in upgrade for libfoobar-1.0.3 then it should have a different SONAME, which would allow distributions to package both of them.
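The SONAME is a field recorded inside the ELF file, distinct from the file name on disk, and it is the string the dynamic linker keys compatibility on. It can be inspected directly; a sketch, assuming a glibc-based Linux with binutils installed:

```shell
# Find the libc that /bin/sh actually loads, then print its SONAME field
# (on glibc systems this is typically "libc.so.6").
libc_path=$(ldd /bin/sh | awk '/libc\./ { print $3; exit }')
readelf -d "$libc_path" | grep SONAME
```

A library whose breaking change ships under a new SONAME (say, libfoobar.so.2) can be packaged alongside the old one with no conflict.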

But a breaking change probably doesn't impact all programs that depend on it, so that approach still ends up with many installed versions that could probably have been shared.

If you have gifted your code under a FOSS license, then people are going to run it in ways that you don't like.

The distributions will take all front-line bug reports and the maintainer will forward them to upstream only if applicable. Or at least, that's how Debian operates.


>If one program on your system depends on a quirk of libfoobar-1.0.3, and another program on your system depends on a quirk of libfoobar-1.0.4, you're fucked. You can't have both programs on your system, and Linux distributors will simply stop updating one of them until the author magically fixes it, which happens approximately never.

It's not true at all. Libfoobar-1.0.4 is backward compatible with libfoobar-1.0.3, so both programs will work. If one of the programs or the lib has a bug, then it's easier to just fix the bug than to add burden on maintainers.

Moreover, both libfoobar.so.1.0.3 and libfoobar.so.1.0.4 can be installed at the same time, e.g. libfoobar.so.1.0.4 can be packaged as libfoobar-1.0.4, while libfoobar.so.1.0.3 can be packaged as compat-libfoobar-1.0.3.

If you have more questions, try to find answers in packaging guidelines for your distribution, or ask maintainers for help with packaging of your app.


It seems to me that several circumstances have changed since the idea of dynamic linking first came about:

- A whole lot of software is now open and freely available

- The code-changes-to-binary-updates pipeline has been greased by a couple orders of magnitude, from internet availability to CI to user expectations around automatic updates (for both commercial and non-commercial software)

- Disk space, and arguably even RAM, have become very cheap by comparison

Given these, it seems worthwhile to re-evaluate the tradeoffs that come with dynamic linking


> Given these, it seems worthwhile to re-evaluate the tradeoffs that come with dynamic linking

Perhaps this is just a Linux distro thing, but as someone who closely monitors the Void packages repo, dynamic linking burdens distro maintainers with hundreds of additional packages which need to be individually tracked, tested and updated.

(Packages which could otherwise be vendored by upstream and statically linked during the build.)

Dynamic linking also adds complexity to distro upgrades, because dependent packages often need to be rebuilt when the libraries they dynamically link to are changed or upgraded. For example, Void’s recent switch from LibreSSL to OpenSSL borked my install script, which wasn’t aware of the change. It resulted in a built system whose XBPS package manager couldn’t verify SSL certs. On Arch, AUR packages were notoriously prone to dynamic linking errors (back when I still used it).

Personally, I don’t find the bandwidth efficiency and CVE arguments for dynamic linking to be all that convincing [1]:

    Will security vulnerabilities in libraries that have been statically
    linked cause large or unmanagable updates?

    Findings: not really

    Not including libc, the only libraries which had "critical" or "high"
    severity vulnerabilities in 2019 which affected over 100 binaries on
    my system were dbus, gnutls, cairo, libssh2, and curl. 265 binaries
    were affected by the rest.

    The total download cost to upgrade all binaries on my system which
    were affected by CVEs in 2019 is 3.8 GiB. This is reduced to 1.0
    GiB if you eliminate glibc.

    It is also unknown if any of these vulnerabilities would have
    been introduced after the last build date for a given statically
    linked binary; if so that binary would not need to be updated. Many
    vulnerabilities are also limited to a specific code path or use-case,
    and binaries which do not invoke that code path in their dependencies
    will not be affected. A process to ascertain this information in
    the wake of a vulnerability could be automated.
[1]: https://drewdevault.com/dynlib

> Perhaps this is just a Linux distro thing, but as someone who closely monitors the Void packages repo, dynamic linking burdens distro maintainers with hundreds of additional packages which need to be individually tracked, tested and updated.

> (Packages which could otherwise be vendored by upstream and statically linked during the build.)

The effort is approximately the same for distro maintainers. Shared libraries might mean more individual packages but the same number of projects need to be actively maintained because if a vulnerability is patched in libssl (for example) you'd need to rebuild all projects that are statically linked rather than rebuilding libssl.

Upstream vendoring doesn't really help here. If anything, it adds more issues than it solves with regards to managing a distro where you want to ensure every package has the latest security patches (vendoring is more useful for teams managing specific applications, to protect them against breaking changes happening upstream, than it is at a distro level, where you can bundle different versions of the same .so if you absolutely need to and link against those)


Additionally I would say at least `dbus`, `gnutls`, `libssh2` and maybe?? `curl` are the kind of libraries which count as "widely used system libraries".

I.e. the few for which Linus thinks dynamic linking still can make sense ;=)


Shared library and .DLL usage is like freedom, and writing a constitution.

Where you can agree, you share. Where you cannot agree, you table and hold.

Perhaps we are trying to share too much.

I have removed shared libraries from internal code for similar reasons.


Some kind of Linux Standard Base, perhaps?

I think none of those (at least two of them, nearly surely) qualify for that exemption. On the vast majority of systems I bet libcurl, for example, is probably linked against by 1 other piece of software - curl(1). “Really useful” isn’t the qualifier here. I bet ~same for libssh. dbus and gnutls I’d have to see more examples to understand.

I think you misunderstand how widely used libcurl is. On Arch Linux, it has 291 reverse dependencies: https://archlinux.org/packages/core/x86_64/curl/.

It is e.g. used by git, cargo (Rust's package manager) and cmake (C++ build system). It's literally installed on 99.99% of all Arch Linux systems: https://www.archlinux.de/packages/core/x86_64/curl.


I think that the kind of "knowledge" you are correcting is informing much of the discussion here. And it seems like the motivation is "I want to make MY thing as easy to ship as possible!".

There are hundreds of binaries in a Linux system and no one wants to rebuild all of them when an important library is updated.


> There are hundreds of binaries in a Linux system and no one wants to rebuild all of them when an important library is updated.

Oh, I dunno... How certain are you of this, and why?


VERY certain

    $ find /{,usr/}{,s}bin/ -maxdepth 1 -type f \
      | wc -l
    5616

    $ du -h --total --no-dereference \
      $(find /{,usr/}{,s}bin/ -maxdepth 1 -type f) \
      | tail -n 1
    1.3G    total
That's with shared libraries and debug symbols stripped. If I upgrade libc, why would I want to also waste a massive amount of time rebuilding all 5616 binaries?

Of course, even if we ignore the insane build time, I don't have the source for some of those binaries, and in a few cases the source no longer exists. In those cases, should I be forced to only use those programs with the (buggy) versions of the libraries they were originally built with?


1.3 GB? It's 2021, I can read that from SSD into memory in less than the blink of an eye. Hard disk is fast and cheap enough that it's worth keeping builds around to incremental rebuild against new libraries. Also, typically tests require a build tree, and you are going to rerun tests on the relinked binaries, aren't you?

You’re absolutely right, I did misunderstand :/

Colour me a bit surprised, and corrected.


Libcurl is the only sane way I know to access the web from a C or C++ environment. I expect that most C programs needing to do web requests will link in libcurl.

If I am to install Pandoc (on Arch Linux) I have to install 95 Haskell dependencies besides the ones I already have installed.

That's the result of a perverse maintainer; there used to be a version that didn't depend on the development libraries.

Fortunately there is pandoc-bin at the AUR.

Thank you! I had noticed the same problem, but was not aware of this excellent solution. I just applied it, and it works perfectly... and I removed the dozens of Haskell libraries that were there just for Pandoc.

For anyone else interested, the same solution applies to shellcheck - just install shellcheck-bin from AUR.


I discovered that quite quickly after they switched.

More importantly (for any haskell developers using Arch), there are also static builds for stack, cabal and ghcup too although you then need to maintain your own set of GHC and libraries.

I can't imagine what it must be like to use the official packages since the haskell library ecosystem is quite fast moving.


The Haskell maintainers in Arch have an almost fanatical fascination with dynamic linking. Dynamic linking is otherwise not the norm in Haskell.

Haskell has poor support for dynamic linking. The Haskell (GHC) ABI is known to break on every compiler release. This means that every dynamically linked executable in the system needs to be recompiled. Any programs you compiled locally against dynamic system libs will break, of course.

This stance makes it much harder than necessary for Arch users to get started with Haskell.


This is due to the maintainers on Arch and I've complained about it before too. They also update them almost daily so they just add noise to system updates. I just end up running the official Pandoc Docker container.

> If I am to install Pandoc (on Arch Linux) I have to install 95 Haskell dependencies

OMG, you really spent 95x more time by installing the dependencies one by one? Why not just use a package manager?


I'm assuming they already are using the package manager. The issue is that there are so many dependencies for a single package, and almost every system update consist of updating a bunch of Haskell packages which just clutters maintenance.

The distribution does its job: it distributes upstream changes. It's like blaming a zip file because it contains many small files instead of a few big ones, so it clutters maintenance. If you see this as a problem, then fix it upstream. For me, it's important to receive all upstream changes in one update.

You still could. With proper build systems they could be pushing a new package whenever a dependency gets updated, or daily.

Daily would be a huge improvement but preferably less if it is non critical and I am not using testing repos.

No, I didn't. But, before uninstalling Pandoc, I was wasting time with frequent useless updates of minor Haskell dependencies that should be statically linked in the first place, as no other package, at least in general, makes use of them.

Why is this a problem?

As someone who runs a Linux distribution, please don't vendor your dependencies for the benefit of distribution maintainers, it makes it much more difficult to build a coherent system. Running different versions of common libraries causes real annoyances.

It's a double-edged sword, though. I find that with Debian I often can't have up-to-date versions of programs I care about, because some program that nobody cares about won't build against an upgraded library version that they mutually depend on.

The requirement that every binary on the system depend on the same shared library basically led to things like app stores and Snap, which I think we can all agree that nobody wants.


It looks like you care about these programs, but you want to have a free ride.

Yeah, and I get my free ride by just downloading the binary from the project's website. Linux distributions add a bunch of arbitrary rules that I don't care about and that don't help me.

a free ride that you indeed can get if you link statically

Yes this can't be emphasized enough. static linking is fine, I don't really care either way. But please, please don't vendor.

The point of software is to be composable, and vendoring kills that. Vendoring will cause free software to drown in its own complexity and is completely irresponsible.


Software is composable - at compile/package time. It doesn't have to be composable at use time.

I agree entirely, but vendoring breaks composition at compile/package time by definition!

If you have compositional packages you bake into one giant executable or something, that's still not vendoring.


Could someone explain what is meant by “vendoring”?

Vendoring is when you just copy the dependencies into your own source tree, like in a third_party directory.

Personally, I'm a big fan of just taking what you need. A little copying is better than a little dependency[1].

[1] https://www.youtube.com/watch?v=PAAkCSZUG1c&t=9m28s


It's when a package contains its dependencies, and often their dependencies too (transitive deps).

Packages are supposed to be nodes in a dependency graph (or better, a partial order), and there are nice composition rules. But once the nodes themselves represent graph/order closures (or almost-closures), we lose that nice property.

People that are very application oriented tend to like vendoring --- "my application is its own island, who gives a shit". But the distro's point isn't just to gather the software --- that's an anachronism --- but to plan a coherent whole. And applications that only care about themselves are terrible for that.

The larger point is free and open source software really should be library-first. Siloed applications are a terrible UI and a holdover from the economics of proprietary shrink-wrapped software, which itself is a skeuomorphism from the physical world of nonreplicable goods.


> Siloed applications are a terrible UI and holdover from the economics of proprietary shrink-wrapped software which itself is a skeuomorphism from the physical world of nonreplicable goods

Siloed applications are UI neutral and represent decoupling, which enables progress and resilience. Excessive interdependence is how you lose Google Reader because other services have changed requirements.


What? Getting rid of Google Reader was a business decision.

> Siloed applications are UI neutral and represent decoupling, which enables progress and resilience.

The medieval manorial economy is resilient. Is that progress?

What happened to unified look and feel? and rich interactions between widgets? The smalltalk and Oberon GUI visions? All that requires more integration.

I get that we need to make the here and now work, but that is done via the en-masse translation of language-specific package-manager packages to distro packages (static or shared, don't care), not vendoring.


> What? Getting rid of Google Reader was a business decision

It has been widely reported that a significant factor in that business decision was Google's massively coupled codebase and the fact that Reader had dependencies on parts of it that were going to have breaking changes to support other services.

> The medieval manorial economy is resilient.

It’s not, compared to capitalist and modern mixed economies, and unnecessary and counterproductive coupling between unrelated functions is a problem there, too.

> Is that progress?

Compared to its immediate precedents, generally, yes.


Fewer and fewer people are interested in the benefits provided by "coherent whole" distributions. And more and more people are interested in the benefits provided by "it's own island" applications.

The future is statically linked and isolated applications. Distros need to adapt.


Nope.

1. Have the users been asking for everything to be a laggy electron app? I don't think so.

2. Within these apps are language-based package managers that don't encourage vendoring; it's just when people go to package them for distros that they vendor everything away. Distros do need to make it easier to automatically convert language-specific package-manager packages.

The future is making development builds and production builds not wildly different, and both super incremental, so one can easily edit any dependency, no matter how deep, and then put their system back together.

Again, I am fine with static linking. My distro, NixOS, is actually great at that and I've helped work on it. But vendoring ruins everything. I hate waiting for everyone to build the same shit over and over again with no incrementality, and I don't have time to reverse-engineer whatever special-snowflake incremental dev-build setup each package uses.


The number of people who consider the system/distribution the atomic unit rather than the application, is probably about equal to the number of people who "edit dependencies and put their system back together" -- they are in total statistically zero. The overriding concern is the user need for a working application. Everything else is a distant secondary concern.

I'm not trying to convince you of anything, here, I'm just describing the facts on the ground. If you're not into them, that's fine!


The number of people who have a "theory of distribution" one way or the other is pretty low.

But

- many people seem to like unified look and feel

- many people complain about per-app auto-update

- many people love to complain software is getting worse

Are these people connecting these complaints to the debate we're having? Probably not. Can these views be connected to that? Absolutely.

---

I work on distros and do edit the root dependencies; I also contribute to many libraries I use at work, during work; finally, I use the same distro at work and on my own, and everything is packaged the same way. So yes, it's a "unified workflow for yak shaves" and I quite like it.

I hope there can be more people like me; there would be, if this stuff weren't so obnoxiously difficult.


> many people seem to like unified look and feel

All else equal, sure. But they'll sacrifice that in a minute if it means elimination of other toil.

> many people complain about per-app auto-update

Facts not in evidence.

> many people love to complain software is getting worse

Okay.

> I work on distros...

Well, there you go.


(In response to: >> The future is statically linked and isolated applications. Distros need to adapt. )

> 1. Have the users been asking for everything to be a laggy electron app? I don't think so.

Humongous strawman. As if Electron apps are the only ones that can be statically linked?!? For shame!


I have zero problem with static linking. I've said this multiple times in this thread. Strawman right back at you.

Yeah, sorry, I saw that later. (May have seen it earlier too, but not connected it to your name as I was replying.)

But that still doesn't make my comment a strawman (because I was talking about this one specific comment), or AFAICS yours less of one: Why would you jump to "bloated Electron apps"? Sure, they may suck, but the comment you replied to was about statically linked apps; no mention of Electron at all. Unless you're saying there was, originally, and had been edited out before I saw it? If not, your reply was... OK, more charitably, at least a non sequitur.

If not, please explain how, and I'll apologise.


The "coherent whole" is more in demand than ever before. Just look at the Android and iOS ecosystems with super hard rules how things have to behave and look in order to be admitted. They just put the burden on the app dev instead of a crew of distro maintainers.

If you define "demand" as hard orders from the warden of your walled garden, yes. But that's not how the concept is normally used.

Personally, for instance, I'd have been perfectly happy if at least a few apps had stayed with the Android UI of a few versions back[1], before they went all flat and gesture-based. There was no demand from me, as a consumer, to imitate Apple's UI.

___ [1]: And no, that's not outmoded "skeuomorphism". That concept means "imitation of physical objects", like that shell on top of Windows that was a picture of a room, with a picture of a Rolodex on top of a desktop etc etc. In the decades since ~1985 a separate visual grammar had developed, where a gray rounded rectangle with "highlights" on the upper and left edges, "shadows" on the lower and right edges, and text in the middle meant "button" in the sense of "click here to perform an action", not any more in the original skeuomorphic sense of "this is a picture of a bit of a 1980s stereo receiver".


> packages are supposed to be nodes in a dependency graph (or better, partial order), and there are nice composition rules. But one the nodes themselves represent graph/order closures (or almost-closure) we use that nice property.

A) "Lose", not "use", right?

B) Sounds like a lot of gobbledy-gook... Who cares about this?

C) Why should I?

> People that are very application oriented tend to like vendoring --- "my application is its own island, who gives a shit".

No, people that are very application oriented tend to like applications that work without a lot of faffing around with "dependencies" and shit they may not even know what it means.

> But the distro's point isn't just gather the software --- that's an anachronism --- but plan a coherent whole.

A) Sez who?

B) Seems to me it would be both easier and more robust to build "a coherent whole" from bits that each work on their own than from ones that depend on each other and further other ones in some (huge bunch of) complex relationship(s).

> And applications who only care about themselves are terrible for that.

As per point B) above, seems to me it's exactly the other way around.

> The larger point is free and open source software really should be library-first.

Again, sez who?

> Siloed applications are a terrible UI

WTF does the UI have to do with anything?

> and holdover from the economics of proprietary shrink-wrapped software

[Citation needed]

> which itself is a skeuomorphism from the physical world of nonreplicable goods.

Shrink-wrapped diskettes or CDs are physical goods, so they can't be skeuomorphic.


That's easy to say, but sometimes I need to fix bugs in downstream packages, and I am not willing to wait for 6 months (or forever in some cases) for a fix to be released.

Then Linux distros, a year later, strip out my patched vendored version and link to the buggy upstream, and I have to keep telling bug reporters not to use the distro version.


A useful thing to do when forking software is to give it a new name, to make it clear that it's a separate thing. It sounds like you copied some specific version of software you depend on, then forked it, but left the name the same -- which caused confusion by package builders at the distribution since it's a bit of work to determine if you forked the dependency or are just including it for non-distribution user's convenience.

> (...) dynamic linking burdens distro maintainers with hundreds of additional packages which need to be individually tracked, tested and updated.

To me this assertion makes no sense at all, because the whole point of a distro is to bundle together more than hundreds of packages individually tracked, tested, and updated. By default. That's the whole point of a distro, and the reason developers target specific distros. A release is just a timestamp that specifies a set of packages bundled together, and when you target a distro you're targeting those hundreds of packages that are shipped together.


Also, compilers nowadays are smarter and can perform link-time optimizations, meaning that if you only use a single function from a library, the final executable only gets that single function. In practice, code that uses static linking can be more efficient than code that uses dynamic linking.

And you have to consider some performance penalty when using shared libraries. First, there's a time penalty when loading the executable, since the interpreter (ld-linux) has to run before your actual code. And for each function call into the library, you have to make an indirect jump.


> But also for each function call you have to make an indirect jump into it.

Only the first call has a penalty. Then you call into the PLT and the PLT contains a direct jump to the function.


It still has a cost over potential inlining (that allows for more optimizations as well).

Of course if it is not a hotspot, it is meaningless.


The call into the PLT is a penalty.

Sure but it's not an indirect jump.

One tradeoff is security. If you're patching vulnerabilities, then just a single .so needs to be patched. With static linking every binary needs to be investigated and patched.

You can also argue that it is impossible to update dynamic libraries because they are used by multiple applications and you can't afford that any application breaks. So instead of being able to patch that one application where the security is needed, you now have to patch all of them.

> You can also argue that it is impossible to update dynamic libraries because they are used by multiple applications and you can't afford that any application breaks.

That's where maintenance branches comes in. You fix only the security issue, and push out a new version.


Isn’t this especially true in the world of containerization? We literally ship entire images or configurations of OS’s to run a single application or system function.

Although, I have mixed feeling about containers, because I fully appreciate that Unix is the application and the software that runs on it are just the calculations. In that world, sure, a shared library makes sense the same way a standard library makes sense for a programming language. Thus, a container is “just” a packaged application.

Regardless, this concept is so out of the realm of “performance” that it’s worth noting that the idea of trying to reduce memory use or hard disk space is not a valid argument for shared libraries.


Google's distroless containers attempt to fix this issue: https://github.com/GoogleContainerTools/distroless

> Disk space, and arguably even RAM, have become very cheap by comparison

CPU frequency and cache sizes remain limited, so smaller binaries, which fit in the cache, run faster, use less energy, and waste fewer resources overall.

> Given these, it seems worthwhile to re-evaluate the tradeoffs that come with dynamic linking

Create your own distribution and show us benefits of static linking.


To underscore the gp's point, I have had to work with a 3rd party-provided .so before. No sources. A binary. And strict (if unenforceable) rules about what we could do with the binary.

I dealt with a situation like this for a ROS driver for a camera, where the proprietary SO was redistributable, but the headers and the rest of the vendor's "SDK" was not. The vendor clarified for me repeatedly that it would not be acceptable for us to re-host the SDK, that it was only accessible from behind a login portal on their own site.

The solution we eventually came to was downloading the SDK on each build, using credentials baked into the downloader script:

https://github.com/ros-drivers/pointgrey_camera_driver/blob/...

The vendor was fine with this approach, and it didn't violate their TOU, though it sure did cause a bunch of pain for the maintainers of the public ROS CI infrastructure, as there were various random failures which would pop up as a consequence of this approach.

I think eventually the vendor must have relented, as the current version of the package actually does still download at build time, but at least it downloads from a local location, though if that is permissible, I don't know why the headers themselves aren't just included in the git repo.


Have things changed at all since Pointgrey were acquired by FLIR?

To be honest, I haven't really tracked it— the product I work on dropped stereo vision in favour of RGBD, so I don't really know where it's landed. I suppose it's not a great sign that the current generation SDK still requires a login to access:

https://www.flir.ca/products/spinnaker-sdk/

And at least one spinnaker-based driver seems to have inherited the "download the SDK from elsewhere" approach, though who knows if that's due to genuine need or just cargo-culting forward what was implemented years ago in the flycapture driver:

https://github.com/neufieldrobotics/spinnaker_sdk_camera_dri...

The "proper" approach here would of course be for Open Robotics (the ROS maintainers) to pull the debs and host them on the official ROS repos, as they do for a number of other dependencies [1], but that clearly hasn't happened [2].

I think a lot of hardware vendors who cut their teeth in the totally locked down world of industrial controls/perception still think they're protecting some fantastic trade secret or whatever by behaving like this.

[1]: https://github.com/ros-infrastructure/reprepro-updater/tree/...

[2]: http://packages.ros.org/ros/ubuntu/pool/main/s/


I don't think that was their point at all. They were saying that dynamic linking creates flexibility and modularity that doesn't exist otherwise.

I think this was precisely the point. Shared libraries create seams you can exploit if you need to modify behavior of a program without rebuilding it - which is very useful if you don't want to or can't rebuild the program, for example because it's proprietary third-party software.

> The ability to change which library is being used without needing to rebuild the main program is really important.

Having spent many hours avoiding bugs caused by this anti-feature, I have to disagree. The library linked must almost always be the one that the software was built against. Swapping it out is not viable for the vast majority of programs and libraries.

Just as an example, there is no good technical reason why I shouldn't be able to distribute the same ELF binary to every distro all the time. Except the fact that distros routinely name their shared objects differently and I can't predict ahead of time what they will be, and I can't feasibly write a package for every package manager in existence. So I statically link the binary and provide a download, thereby eliminating this class of bug.

Despite the rants of free software advocates, this solution is the one preferred by my users.


Not defending it for general use, but dynamic linking can be very useful for test and instrumentation. Also sometimes for solvers and simulators, but that's even more niche.

Can you please elaborate? Why would the ability to change the library version at runtime be useful for testing? and what aspect of this is useful for simulators and solvers?

You compile a version of the dependency which intentionally behaves differently (e.g. introduces random network errors, random file parsing errors, etc.) and "inject" it into an otherwise functional setup to check that all the other parts, including error reporting and similar, still work.

There are always other ways to achieve this, but using (abusing?) dynamic linking is often the simplest way to set it up.

> and what aspect of this is useful for simulators and solvers?

Dunno, but some programs allow plugging in different implementations of the same performance-critical functionality, so that you can distribute a binary version and then recompile only that part with platform-specific optimizations. If that is what the author is referring to, I would argue there are better ways to do it today. But it still works either way, and can be much easier to set up/maintain in some cases. (And it probably falls into the "shared libraries as extension system" category Linus excluded from his commentary.)


That's part of it, which I agree falls under the excluded category. I was thinking more about code generation and program modification.

Curious what the better way to swap implementations would be?


I should clarify - it's at the start of runtime when the library is initially loaded. Not afterwards. You'd have to restart the program to swap a library.

For testing and simulation, it's a way to mock dependencies. You get to test the otherwise exact binary you'll deploy. And you can inject tracing and logging.

Solver installations tend to be long-lived, but require customizations, upgrades, and fixes over time. Dynamic linking lets you do that without going through the pain of reinstalling it every time. Also with solvers you're likely to see unconventional stuff for performance or even just due to age.


> ... when the library is initially loaded. Not afterwards. You'd have to restart the program to swap a library.

When you close the library with dlclose() you can swap it during runtime, too.


Dependency injection is useful for mock testing in C/C++ although there are more programmatic ways to do it for the latter

You're forgetting about one common case where libraries are replaced.

This is security vulnerabilities. If your application depends on a common library that had a vulnerability, I can fix it without you having to recompile your app.

With glibc or the X libraries, a vulnerability there would essentially require reinstallation of the entire OS.


You could, but you would be doing yourself two disservices: trusting vendors that aren't providing security updates for dependencies in a timely manner, and running applications on top of dependencies they haven't been tested with.

Vendors could ship applications with dependencies and package managers could tell which of those applications and dependencies have vulnerabilities. This would clarify the responsibility of vendors and pressure them to provide security updates in a timely manner.

One big obstacle is that it's fairly common for vendors to take a well known dependency and repackage it. It's difficult to keep track of repackaged dependencies in vulnerability databases.


What vendors? SCO UNIX? HP-UX? IBM AIX?

I'm not, actually. If the libraries need to be replaced the software should be rebuilt.

And yes, bandwidth and disk are cheap today. Reinstalling a large number of programs on disk is not that big of a problem today.


You seem to be assuming rebuilding is possible. What about the (still very useful) proprietary binaries from a company that hasn't existed in a decade? What about the binaries where the original source no longer exists?

I say that presents market opportunities. Every piece of technology faces a time of obsolescence.

If the original source no longer exists and rebuilding is no longer possible then replacing dependencies is no longer feasible without manual intervention to mitigate problems. ABIs and APIs are not as stable as you'd think.


> Every piece of technology faces a time of obsolescence.

That is true for most technologies that experience entropy and the problems of the real world. Real physical devices of any kind will eventually fail.

However, anything built upon Claude Shannon's digital circuits does not degrade. The digital CPU and the software that runs on it are deterministic. Some people see a lack of updates in a project as "not maintained", but for some projects that lack of updates means the project is finished.

> obsolescence

What you label as obsolete I consider "the last working version". The modern attitude that old features need to be removed results in software that is missing basic features.

> replacing dependencies is no longer feasible without manual intervention to mitigate problems

This is simply not true. Have you even tried? I replaced a library to get proprietary software working last week. In the past, I've written my own replacement library to add a feature to a proprietary program. Of course this required manual intervention; writing that library took more than a week of dev time. However, at least it was possible to replace the library and get the program working. "Market opportunities" suggests you think I should have bought replacement software? I'm not sure that even exists?

> ABIs and APIs are not as stable as you'd think.

I'm very familiar with the stability of ABIs and APIs. I've been debugging and working around this type of problem for ~25 years. Experience suggests that interface stability correlates with the quality of the dev team. A lot of packages have been very stable; breaking changes are rare and usually well documented.


> there is no good technical reason why I shouldn't be able to distribute the same ELF binary to every distro

Oh, your app also works on every single kernel, every different version of external applications, and supports every file tree of every distro? Sounds like you added a crap-ton of portability to your app in order to support all those distros.

> I can't feasibly write a package for every package manager in existence.

But you could create 3 packages for the 3 most commonly used package managers of your users. Or 3 tarballs of your app compiled for 3 platforms using Docker for the build environment. Which would take about a day or two, and simultaneously provide testing of your app in your users' environments.


Yes, this is not that hard if you statically link as much as possible. The magic of stable syscalls. Variance between glibc versions is the biggest headache, but musl libc solves a lot of the problems.

Have you ever actually tried that last step that you're suggesting? It's actually really time consuming and expensive to maintain that infrastructure due to oddities between distros, like glibc or unsupported compiler versions. Statically linking is easier than redoing all the work of setting up and tearing down developer environments just because one platform has a different #define in libc. It's also cheaper when your images are not small and you're paying for your bandwidth/storage.


> really important.

Except it isn't, at least not for open source.

Most libraries do not have stable ABIs; even in C there are many ways you can mess that up. Even "seemingly clear-cut cases" like some libc implementations have run into accidental ABI breakage in the past.

And just because the ABI didn't change doesn't mean the code is compatible.

It's not rare for open-source libraries to get bug reports because dynamic linking is used to force them to run against versions of dependencies which happen to be ABI-compatible (enough) but don't actually work/have subtle bugs. It sometimes gets to the point where it's a major annoyance/problem for some open-source projects.

Then there is the fact that LD_LIBRARY_PATH and similar are MAJOR security holes, and most systems would be best off using hardening techniques to disable them (not to be confused with `/etc/ld.so.conf`).

Though yes, without question, for not-properly-maintained closed-source programs it is helpful. But then for them, things like container images able to run older versions of Linux (besides the kernel) in a sandbox can be an option, too. Though a less ergonomic one, and not applicable in all cases.


> Then there is the thing that LD_LIBRARY_PATH and similar are MAJOR security holes, and most systems would be best of to use hardening techniques to disable it (not to be confused with `/etc/ld.so.conf`).

I do not consider LD_LIBRARY_PATH or LD_PRELOAD more a security hole than PATH itself.

There is two scenarios:

- you control exactly how your program is launched (environment variables, absolute path) and it's a non-issue

- you do not control the environment properly, and then everything is a security hole.

That said: DT_RUNPATH and RPATH are, however, beautiful security holes. They allow hardcoding load paths in the binary itself, even with a controlled environment.

And many build tools unfortunately leave garbage in these paths (e.g. /tmp/my_build_dir).


I can only agree.

From a desktop point of view Linux needs some major improvement about how it handles applications.

It also has all the tools to do so, but it would break a lot of existing applications.

In the past I thought Flatpak and Snap were steps in the right direction. But now I'm not so sure about that anymore (Snap made some steps in the right direction but also many in wrong directions; Flatpak seems to not care about anything but making things run more easily; in both cases, moving from a kinda-curated repo to a not-really-curated one turned out horribly).

From a server point of view these things matter much less, especially wrt. modern setups (containers, VMs in the cloud, cloud providers running customized and hardened Linux container/VM hosts, etc).

And in the end, most companies paying people to work on Linux are primarily interested in server-ish setups, and only secondarily in desktop setups (for the people developing the server software). One exception would be Valve, I guess, for which Linux is an escape hatch in case bad lock-in patterns from phone app stores take hold on Windows.


> "Most libraries do not have stable ABI's, even for C"

I think the mess we created in ABI space is one of the failures of our industry.


For comparison, AmigaOS was built on the assumption of binary compatibility and people still replace and update libraries today, 35 years later.

It's a cultural issue, not a technical one - in the Amiga world breaking ABI compatibility is seen as a bug.

If you need to, you add a new function. If you really can't maintain backwards compatibility, you write a new library rather than break an old one.

As a result 35 year old binaries still occasionally get updates because the libraries they use are updated.

And using libraries as an extension point is well established (e.g. datatypes to allow you to load new file formats, or xpk, which lets any application that supports it handle any compression algorithm there's a library for).

But it requires a discipline around it.


Oh man, that brings memories. It's so sad that things like datatypes or xpk didn't make it to modern OSes (well, there's just a fraction of it; I guess video codecs are the closest thing, but that targets just one area).

I also wanted to point out that this standardization made it possible to "pimp" your AmigaOS to make individual desktops somewhat unique. There were basically libraries that substituted system libraries and changed how the UI looked or even how it worked. I kind of miss that. Now the only personalization I see is how the terminal prompt looks like :)


It's a side effect of abstraction. Even a language like C makes it extremely hard to figure out the binary interfaces of the compiled program. There's no way to know for sure the effects any given change will create in the output.

The best binary interface I know is the Linux kernel system call interface. It is stable and clearly documented. It's so good I think compilers should add support for the calling convention. I wish every single library was like this.

https://man7.org/linux/man-pages/man2/syscall.2.html


"It's a side effect of abstraction."

We have an entire language-on-top-of-a-language in the C++ preprocessor, but we could not figure out a way to specify to the compiler what we want in an ABI?

I think an abstraction is when a tool takes care of something for you, this situation is just neglect.


I have maintained some mini projects which try to have strong API stability.

And even though keeping API stability is much easier than ABI stability, I already ran into gotchas.

And that was simple stuff compared to what some of the more complex libraries do.

So I don't think ABI FFI stability ever had a good chance (outside of some very well maintained big system libraries where a lot of work is put into making it work).

I think the failure was to not realize this earlier and instead move to a more "message passing" + "error kernel" based approach for the libraries where this is possible (which are surprisingly many), using API stability only for the rest (except system libraries).

EDIT: Like using pipes to interconnect libraries, with well-defined (but potentially binary) message passing between them. Being able to reload libraries, resetting all global state, or to run multiple versions of them at the same time, etc. But without question this isn't nice for small utility libraries or language frameworks and similar.


Isn't that just ABI stability with extra steps?

it has the slight benefit of not corrupting your memory if your make an error

> I think the failure was to not realize it earlier and instead move to a more "message passing" + "error kernel" based approach for libraries where this is possible (which are surprisingly many) and use API stability only for the rest (except system libraries).

Sounds pretty sweet as far as composability is concerned, but there is the overhead caused by serialization and the loss of passing parameters in registers.


Maybe this is in-line with what Linus said (very standard libraries), but I think when there's a vulnerability in libssl.so being able to push a fix instead of having to fix many things is a huge win.

The dependency hell thing seems to come up when libraries change their APIs and then you have to scramble. Last one I recall struggling with was libpng.so but there are plenty of others.


If you need to change a statically built executable, you can always patch it manually. This was how no-CD cracks worked back in the day.

Yes, dynamic linking makes this process easier, but it makes it so easy that both users and distro maintainers regularly break software without even realising it (hence why Red Hat ships 5-year-old versions of everything).


> If the program is statically linked, then nothing can happen - either it works or it doesn't. With dynamic linking, changes are possible by changing which library is loaded.

These problems are almost purely caused by dynamic linking, though. The devs release something that was tested on some old version of Ubuntu, and now on your new Fedora things work differently and the program is broken. With static linking, it all just works.


You seem to be assuming that a statically linked library won't have a subtle, edge-case bug, when in fact it doesn't just work.

Even that could be env-related, so maybe it surfaced once you moved to a new Fedora release.

Basically, the problems are pretty much the same with either approach, one has a more complex runtime environment, other has a more complex upgrade story.

While I'd agree with Linus' take on shared libraries, I'd still say that "statically linked libraries are not a good thing in general either" (the stress is on general).


It's not the program that is broken, it is the environment that is broken.

If by broken you mean having the latest features and security patches. Dynamically linked binaries are very fragile and not portable.

Latest features and security patches rarely go hand-in-hand: it's usually latest features and new security bugs instead.

Developers want to maintain a single branch because that's much cheaper, not because it's the ultimate solution to all the problems of maintaining software.


> The important benefit is dynamic linking. The ability to change which library is being used without needing to rebuild the main program is really important.

Or just having the option of loading the library at all. If you don’t need the functionality offered by libxyz, you’re not required to use it. One then has no end of language extensions that can be loaded into a generic interpreter to fit a script to whatever job they have at hand.

Edit: Linus touched on this as a last point:

> Or, for those very rare programs that end up dynamically loading rare modules at run-time - not at startup - because that's their extension model.

My question: is it actually rare?


Most applications don't have runtime extensions. The few that do really need them in order to be useful. I'm convinced that there's a Pareto distribution where 20% of libraries really benefit from being dynamically loaded, while the remaining 80% are better served by static linking. Given that, it seems to me that dynamic linking should be both opt-in and properly supported, but the one who decides that should be the library maintainer.
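For the extension-model case, runtime loading boils down to dlopen(3)/dlsym(3). Here is a minimal sketch using Python's ctypes (which wraps dlopen on Linux), with libm standing in for a hypothetical plugin; the fallback library name assumes a glibc-style system:

```python
import ctypes
import ctypes.util

# ctypes.CDLL wraps dlopen(3): the library is resolved and loaded at
# run time, not at program startup -- the "extension model" case.
# libm here stands in for a hypothetical plugin .so.
path = ctypes.util.find_library("m") or "libm.so.6"
libm = ctypes.CDLL(path)

# Look up the symbol (dlsym under the hood) and describe its C signature.
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

print(libm.cos(0.0))  # → 1.0
```

A real plugin system would do the same thing with its own .so files discovered in a plugin directory, typically behind a versioned C entry point.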

It is rare compared to the usual pulling in of required dependencies during build.

> If the program is statically linked, then nothing can happen

Static libs are embedded into most object-file formats (e.g. ELF) as isolated compilation units. Could you not just replace the static library within the object-file at the linkage level, in about the same way that old "resource editors" could replace individual resources within object-files, or the same way that programs like mkvmerge can replace individual streams within their target media-container format?

Static vs. dynamic linking is just about whether the top-level symbol table of the executable can be statically precomputed for that linkage. It doesn't impede you from re-computing the symbol table. That's what the linker already does, every time it links two compilation units together!
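As a point of reference for the "isolated compilation units" claim: before linking, a Unix static library (.a) really is just an ar(1) archive of .o members, so replacing one member is plain file surgery (`ar r libfoo.a bar.o`). A sketch that builds a toy archive in memory and walks its members; the member names are made up:

```python
# A .a static library is an ar(1) archive: "!<arch>\n" magic followed by
# members, each with a fixed 60-byte text header. This toy builds such
# an archive by hand and lists its members, illustrating that members
# stay isolated until the final link. Member names are hypothetical.

def ar_member(name: bytes, data: bytes) -> bytes:
    # Header fields: name(16) mtime(12) uid(6) gid(6) mode(8) size(10) magic(2)
    hdr = b"%-16s%-12s%-6s%-6s%-8s%-10s`\n" % (
        name, b"0", b"0", b"0", b"644", str(len(data)).encode())
    if len(data) % 2:
        data += b"\n"  # members are padded to 2-byte alignment
    return hdr + data

archive = (b"!<arch>\n"
           + ar_member(b"foo.o/", b"AAAA")
           + ar_member(b"bar.o/", b"BBBBB"))

def list_members(buf: bytes):
    assert buf.startswith(b"!<arch>\n")
    off, members = 8, []
    while off < len(buf):
        name = buf[off:off + 16].rstrip().rstrip(b"/").decode()
        size = int(buf[off + 48:off + 58])  # size excludes padding
        members.append((name, size))
        off += 60 + size + (size % 2)
    return members

print(list_members(archive))  # [('foo.o', 4), ('bar.o', 5)]
```

Once the linker has resolved those members into a final executable, though, the units are no longer cleanly separable, which is the part a "relinker" would have to deal with.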


Does a tool that can do this exist? A “relinker”?

It’s a neat idea, but one obvious downside is the potential need to relink many executables when a shared component changes.


It seems you are getting downvoted, but for those of us who do not normally dive into the guts of executables it's a fair question.

There are, but nearly all of the ones I've seen are for reverse-engineering purposes.

More and more often they're not just isolated units. Link-time optimisation can mess up that assumption and, for example, inline things across library borders.

> that tend to have reasonably stable ABIs

Today I learned about Application Binary Interfaces.

(and thanks for the generally insightful comment)


> (...) while the "couple of libraries that nobody else used. Literally nobody" should obviously be statically linked.

I disagree. The main value proposition of shared libraries is being able to upgrade them without requiring to rebuild the application. Sure, libraries need to be competently maintained to ensure that patch releases are indeed compatible, but that still opens the door to fixing vulnerabilities on the fly by simply upgrading the lib that sneaks in the vulnerability, which shared libs allow even to end-users by simply dropping in a file.


Perhaps then the OS should figure out which parts of an executable are common, so we can still have shared libraries except they are smarter and not visible to the user.

Except "being visible to the user" is a useful thing to have, as GP explained :).

Or put in a different way: the problem is the "shared" part, not the "dynamic linking" part. Instead of static linking, you can avoid the issue of library versions by just shipping your shared libraries along with your program[0] - and this way, users still retain the ability to swap out components if the need arises.

--

[0] - Well, you could, if you could rely on LD_LIBRARY_PATH to be set to "." by default. Windows is doing it right, IMO, by prioritizing program location over system folders when searching for DLLs to load.


> Well, you could, if you could rely on LD_LIBRARY_PATH to be set to "." by default.

It's actually supported without LD_LIBRARY_PATH hacks, using DT_RPATH. You can do that by passing -rpath '$ORIGIN' to linker IIRC.


Thanks! I forgot all about it, in particular about '$ORIGIN' thing! Yes, I'm happy to see the person building the executable has at least some control over this.

It's actually relevant to a project I'm working on (proprietary, Windows/Linux, uses shared libraries for both mandatory components and optional plugins) - I'm gonna check if and how we're setting RPATH for the Linux builds, it might need some tweaking.


It's definitely technically doable: you'd have to checksum every read-only page of a program and see if you already have one in cache.

But is it worth it though? Even "big" statically linked Rust programs are a few dozen MBs of executable (and not all of that is read-only, and even less will be shared with any other executable at any given time). With things like LTO identical source files can result in different machine code as well.

In the end it would be a lot of trouble for potentially very little gains.

At first I was annoyed that Rust defaulted to static linking; it felt inefficient and overkill. But the more I think about it, the less I can really justify doing it any other way: there are very few benefits to dynamic linking these days. The only one I can think of is overriding an application's behaviour through LD_PRELOAD, but few people know how to do that these days.


> there are very few benefits to dynamic linking these days

I think you might underestimate it. E.g. what if every program uses Electron under the hood?


DLLs and shared libraries are different things though, no? Isn't it common in windows to use DLLs and simply pack in everything you need? Is that not the best of both worlds minus a bit of ram and disk?

> DLLs and shared libraries are different things though, no?

No; they're just different names for the same concept.

> Isn't it common in windows to use DLLs and simply pack in everything you need?

Yes.

> Is that not the best of both worlds minus a bit of ram and disk?

No; it's more like the worst of both worlds. When it is feasible, static linking is strictly superior to this approach. (You've pointed out that it may not be feasible in some cases in sibling comments, and I agree.)


DLLs are the same kind of file as ELF shared objects. Thing is that in Linux/BSD world they are indeed also commonly shared system-wide, on a package management level.

Bundling shared libraries is kinda the worst of both worlds: not getting any sharing between applications (/system-wide security patches) but not getting the performance benefits of static linking either. The only good thing is that you could manually replace e.g. a vulnerable version of a library inside the bundle.

And yeah… Flatpak/AppImage/snap distribution is kinda like that Windows-style one.


Bundling shared libraries is also used in big apps for faster build times and for more physical separation between modules.

Arguably the final release version could be statically linked, but that could be subtly different than what the devs are using.


For these cases, you can just use a modern build system (Bazel) and incremental linker (lld / gold).

Not everyone is in a position to vendor source for all their dependencies and use a new build system to solve a solved problem.

I do the same with good ol' cmake, I have a "developer" preset which will build my app with clang, lld, PCH, split into small shared libraries. And a release preset where everything is statically linked

DLLs are the Windows version of shared libraries.

Shared Objects (.so) are the Linux version of shared libraries.

I.e. they are an implementation detail, and if you don't use them as a "shared library" (in general) then they won't really fall under this category.


Sure but the point is the predominant style on windows is to copy an exclusive version of needed DLLs and not share them.

The now-predominant style. Windows DLLs weren't always named Whatever_5_0_1.DLL; it used to be just Whatever.DLL.

Which led to "DLL Hell": the version of Whatever.DLL you had installed was not the same as the one the developer of your app had, so the app wouldn't work, and if you switched to the one he had, some other app that used Whatever.DLL stopped working instead. Which led to the "exclusive version" / version-named DLL situation we have on Windows today... And, increasingly, on Linux.

I don't know exactly how shared libraries are loaded -- there are some tantalising mentions in sibling comments -- but it all seems to point towards two opposite solution paths:

1) Version-named libraries like on Windows; and/or, possibly, some file-link magic where /std_shared_libs/somelibrary/somelib_x points to /actual_libs/somelibrary/somelib_x_y_z, etc, etc... Then you could also have .../somelibrary/somelib_latest point to, well, the actual latest version installed, and somelibrary/somelib_default point to the one you want as default, etc, etc. (Hmm... Shouldn't stuff like this have been hammered out decades ago by, Idunno, some kind of Linux standardisation initiative... if this doesn't exist already, then WTF is the LSB for?!?)

2) Just fucking static-link everything already.

Friar William of Ockham seems to be pointing more towards one than the other of the above.

Anyway, what certainly doesn't look like a great solution to me is

3) Containerization. That just feels like "Instead of fixing DLL Hell on Linux, let's just ignore it, replicate ~half the OS -- but with our preferred version of every library -- inside a (pseudo-)VM, and pretend that running boxes within boxes within boxes is a solution". No, that's not a solution, that's a kludge.


When was the last time you were able to swap a dynamic library and have things just work?

Today, to get a game to work on an older system.

Several times last week when I needed to use an old proprietary program that used several outdated libraries.

Every single time I want to run a program (usually a game) that has a runtime dependency on pulseaudio[1]. (apulse[2] usually works to translate the libpulse ABI back to ALSA).

One time I had to write my own version of a library that specifically emulated the ABI used by an important program from an outside vendor. Obviously this didn't "just work"; it required a couple weeks of work to write the new version. The point isn't that future versions of a library will magically "just work". With a dynamically linked dependency replacement is at least possible. If the program in question was statically linked, nothing could be changed.

The question isn't which method is less work or easier to maintain. The question that matters in the long run is whether you want the basic ability to modify a program's dependencies in the future to fix an important problem.

[1] As of a few years ago, the stupid design decisions in pulseaudio make it highly incompatible with my needs. Just having it installed makes runtime linking issues even worse. It also adds insane amounts of latency (>5ms is bad; >50ms is insane) by design.

[2] https://github.com/i-rinat/apulse
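For readers unfamiliar with the mechanism being used here: the dynamic loader consults LD_LIBRARY_PATH and LD_PRELOAD before its default search path, which is what makes a drop-in replacement library win without touching the binary. A sketch of the launcher side in Python; the override paths are purely hypothetical:

```python
import os
import subprocess

# ld.so checks LD_LIBRARY_PATH / LD_PRELOAD before the default search
# path, so a replacement .so wins without modifying the executable.
# Both paths below are illustrative, not real files.
env = dict(os.environ)
env["LD_LIBRARY_PATH"] = "/opt/replacement/lib"    # hypothetical drop-in dir
# env["LD_PRELOAD"] = "/opt/replacement/shim.so"   # interpose single symbols

# Any dynamically linked program launched with this environment resolves
# its libraries through the override locations first.
result = subprocess.run(["/bin/echo", "program still runs"],
                        env=env, capture_output=True, text=True)
print(result.stdout.strip())
```

The same effect can be baked in per-invocation with a one-line wrapper script, which is roughly what the "ugly LD_LIBRARY_PATH hack" in the parent comment amounts to.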


I'm not in the pro-shlib camp (it's what brought us the whole Docker->k8s fiasco in the first place), but it might be worth noting that the LGPL requires shared libs, i.e. requires that you, as an end user, be able to swap LGPL libs for newer versions, alternate implementations, or your own.

Also, there are legitimate use cases for shlibs-as-plugins, such as ODBC.


I agree with the core of your comment, but I would note that containerization has other advantages over simple static binaries. In particular, sandboxing even at the network layer. Also, containers often bundle multiple programs together to ensure compatibility, not just libraries, which is a step that is almost never discussed. Applications often depend on system utilities whose API is even more poorly defined/maintained than system libraries, so bundling your own ls or shell may be a safer option.

Even further, k8s really has nothing to do with this, it is a tool for submitting workloads to a group of a computers through a single API, and it critically depends on containerization for network and storage isolation. K8s would have looked more or less identical to how it does today even if shared libraries had never existed (though probably the exact format of OCI container images could have been much simpler).


> it might be worth noting that the LGPL requires shared libs

nope, not at all. https://www.gnu.org/licenses/gpl-faq.en.html#LGPLStaticVsDyn...


But it says

> If you dynamically link against an LGPLed library already present on the user's computer, you need not convey the library's source. On the other hand, if you yourself convey the executable LGPLed library along with your application, whether linked with statically or dynamically, you must also convey the library's sources, in one of the ways for which the LGPL provides.

Heads up to gnu.org: site won't load via https in FF due to HSTS policy (broken certificate chain?).


It happens almost every day when I update my rolling release distribution.

You wouldn't need to do that if everything (or everything besides libc, maybe) were statically linked, though.

So that's actually an argument for static linking, as doing that every day is just a PITA and not really an option for those who need to get actual work done on paid time.


I successfully upgraded the OpenSSL-DLLs used by a no longer maintained Windows program when some servers started refusing connections (presumably due to outdated SSL/TLS versions and/or encryption schemes used by the old DLLs). After upgrading to a recent version of OpenSSL, everything worked fine again.

Last time I ran apt upgrade?

All the time.

You are right on track here. The complexity is real, but so are the benefits of a modular executable vs. a huge monolith.

I fully understand why languages like Rust demand static linking (their compilation model doesn't allow dynamic linking). But once you encounter a C-API boundary, you might as well make that a dynamic C-ABI boundary.


It's not the default, but Rust absolutely allows dynamic linking [1]. For example, Bevy encourages building the engine as a dynamic library during development to improve iteration turnaround times for the app code [2].

[1] https://doc.rust-lang.org/reference/linkage.html

[2] https://github.com/bevyengine/bevy/issues/791


If I am not completely mistaken, if you produce a dynamic library with rust, you are limited to the C-ABI. For instance, you cannot import a polymorphic function from a dynamic library.

The Rust ABI isn't stable between compiler versions, but it does exist. Bevy can get away with using it because it's just to speed things up in development.

Polymorphism is a separate issue. One way Rust does polymorphism is monomorphization, where your polymorphic function is compiled to a specific version, with no generics, per caller. If you don't know the callers ahead of time this can't work. Another way is dynamic dispatch, where you have one polymorphic function that chooses what code to run per type at runtime. This can work with dynamic linking.


I haven't used it, but I don't believe that's the case. I think what you're describing is `#[crate_type = "cdylib"]` ("used when compiling a dynamic library to be loaded from another language"), whereas `#[crate_type = "dylib"]` produces a "dynamic Rust library".

> However, there important advantage of using .so files instead of building into a big statically linked binary is NOT that a library can be shared between different programs! The important benefit is dynamic linking.

One could reasonably view the kernel use of loadable modules as an example of this utility.


> The ability to change which library is being used without needing to rebuild the main program is really important.

In 40 years in the field, I've never needed that. Every single time, we just emitted an entirely new build - because it's _much easier._

> Maybe rebuilding the program isn't practical. Maybe rebuilding the program isn't possible because it's closed/proprietary or the source no longer exists

These are edge cases.

Nearly all the time when we develop, we are making changes in an existing codebase.


> Every single time, we just emitted an entirely new build - because it's _much easier._

For you, the application developer. For end users of old or proprietary software, replacing libraries is much more feasible.


> have reasonably stable ABIs

So what libraries have this property? I've read in old threads about even glibc making changes that break binary interface compatibility.

I get the feeling not much attention is paid to binary interfaces in the free software world.


> So what libraries have this property? I've read in old threads about even glibc making changes that break binary interface compatibility.

You can request different ABI versions from glibc. And glibc doesn’t introduce breaking changes every day but rather every few years.


Aside from rare bugs (which get fixed), glibc does not break its ABI. However, glibc does extend the ABI with new versions, so compatibility only goes one way: your runtime glibc (generally) needs to be at least as new as the one used for linking.
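That one-way compatibility is observable at run time: glibc exports gnu_get_libc_version(), so a process can check whether the loaded glibc is new enough. A sketch via ctypes, assuming a glibc-based Linux system:

```python
import ctypes

# glibc only adds new symbol versions over time; a binary linked against
# a newer glibc fails to start on an older one (the familiar
# "version `GLIBC_2.xx' not found" error). gnu_get_libc_version() is a
# real glibc entry point reporting what the runtime actually provides.
libc = ctypes.CDLL("libc.so.6")
libc.gnu_get_libc_version.restype = ctypes.c_char_p

version = libc.gnu_get_libc_version().decode()
major, minor = (int(x) for x in version.split(".")[:2])
print(version)  # e.g. "2.35" -- depends on the host
```

An installer or launcher can compare this against the version its binaries were linked on and fail with a clear message instead of a loader error.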

Another important consideration is security. If there is an exploit in the library code and the original program authors do not release updates regularly, then it is really important to be able to patch the security issue.

Note to 1-bit people: saying that "X is not a good thing in general", is not the same as saying that "X is a bad thing in general". All that's being said here is "tradeoffs exist" and highlighting some issues some people apparently like to ignore.

One of my favorite libraries is SDL. Of course, when SDL2 came out, they moved away from the LGPL license to a more liberal zlib license, which allows static linking without infecting the license of your project. Great, except now as years go by and games are abandoned, they stop working in new environments, or at least don't work as well, and you can't just force in a new dynamic version of SDL to save things like you used to be able to.

The solution they came up with: even if the program is statically linked, the user can set an environment variable pointing to a dynamic lib, and that will be used instead. Best of both worlds, at least going forward for programs that use a version of SDL2 that incorporates the feature. (Some more details: https://www.reddit.com/r/linux_gaming/comments/1upn39/sdl2_a...)


The "1-bit people" part made me laugh. Thanks ;)

I'm definitely stealing it

I think this is still the hallmark of API volatility.

I personally think that, with the stalling of single-threaded CPU performance, we will eventually have to optimize and stabilize interfaces.

I think some of that will require stabilization of many APIs and libraries such that they don't change over the span of a decade.

This era is basically unthinkable in the current age, but I think that is because we have been living in the "free lunch" era of faster better CPUs every year for too long.

Much like all of the 20th century was an era of resource availability. The 21st will be an era of succeeding within resource constraints: with efficiency.

Those libraries will be excellent candidates for shared libraries.


As the reply to Linus's post points out, the increase in storage/network (~9MB vs 40+MB) can be multiples of the shared library cost, and memory cost can be multiples too if you're not calling the same binary over and over again like Clang.

A middle ground could be for distros to keep control over the version that's being statically linked at build time. It would do away with the dependency conflicts that happen now and result in a better user experience. The app/libraries can be updated together as needed. This is what's happening in a venv or container too: once you've built the venv or container, it's effectively static.

You do need to keep track of the library versions ultimately at build time. This effectively means handing things off to the build system, and distros like Nix have effectively wrapped build systems to some success I think.


Nix doesn't really "wrap" anything besides compilers, or when it's absolutely necessary for an application. And the wrapping is only done so that there's a way to shoehorn certain CFLAGS, some nix-specific behavior, and other configuration options which an upstream may not have a great way to configure.

But for most mature build systems we largely just use the toolchain as-is. CMake, pkg-config, autotools, and other toolchains or technologies generally allow enough flexibility to enable non-FHS builds to be done in a consistent and easy manner. Actually, creating nix packages for upstreams which have good release management discipline is trivial once you're familiar with nix.

What makes nix special is that you when you go to build the package, you reference the dependencies through their canonical name, but nix translates these dependencies to a unique paths which are hashed to capture all inputs (including configuration flags) used to create the dependency. So you can use multiple conflicting versions of a dependency in a package's dependency tree without issue. Dependencies can be shared if a dependency matches exactly, but can differ if needed. And that's something that FHS distros cannot do.

Actually, since nix never makes any assumptions about what's installed on the system, you're free to use it on any distro. And it's hermetic, so the host system wouldn't be changed outside of the `/nix` directory. You can even use it on macOS, although it's not as well supported, but it should work fine for common packages.

> A middle ground could be distros can keep control over the version that's being statically linked at build time.

Nixpkgs already does this, even if the dependency is dynamically linked. So there's no additional overhead with switching it out to the static variants; other than upstreams may not support that scenario as well as "traditional" dynamic linking.


I was trying to get at cargo2nix, gradle2nix, etc..

Static linking along with non-FHS does enable multiple versions of apps to co-exist easier, but then this is what AppImage/Flatpak enable too.

I've played around with Nix and source-based atomic upgrades, and modules are more impressive to me than non-FHS, though I see how it enables the first. I think shared system libs with AppImage for apps would be fine for any distro, if only everyone could agree.


"Note to the 1-bit people" is an odd redirect given Linus is known for his 1-bit opinions.

I wouldn't say that, his opinions are more notorious for being... "strongly worded", not for being black or white.

The main supposed benefit is not disk or memory economy, but central patching.

The theory is as follows: If a security flaw (or other serious bug) is found, then you can fix it centrally if you use shared libraries, as opposed to finding every application that uses the library and updating each separately.

In practice this doesn't work because each application has to be tested against the new version of the library.


> then you can fix it centrally if you use shared libraries, as opposed to finding every application that uses the library and updating each separately.

This comes up a lot, but how often do you end up in a scenario where there's a critical security hole and you _can't_ patch it because one program somewhere is incompatible with the new version? Maybe even a program that isn't security critical.

Plus what you mentioned about testing. If you update each application individually, you can do it piecemeal. You can test your critical security software and roll out a new version immediately, and then you can test the less critical software second. In some systems you can also do partial testing, and then do full testing later because you're confident that you have the ability to roll back just one or two pieces of software if they break.

It's the same amount of work, but you don't have to wait for literally the entire system to be stable before you start rolling out patches.

I don't think it's 100% a bad idea to use shared libraries, I do think there are some instances where it does make sense, and centralized interfaces/dependencies have some advantages, particularly for very core, obviously shared, extremely stable interfaces like Linus is talking about here. But it's not like these are strict advantages and in many instances I suspect that central fixes can actually be downsides. You don't want to be waiting on your entire system to get tested before you roll out a security fix for a dependency in your web server.


> This comes up a lot, but how often do you end up in a scenario where there's a critical security hole and you _can't_ patch it because one program somewhere is incompatible with the new version? Maybe even a program that isn't security critical?

Talking about user use cases: every time I play a game on Steam. At the very least there are GNU TLS versioning problems. That's why Steam packages its own library system containing multiple versions of the same library -- thus rendering all of the benefits of shared libraries completely null.

One day, game developers will package static binaries, the compiler will be able to rip out all of the functions that they don't use, and I won't have to have 5-20 copies of the same library on my system just sitting around -- worse if you have multiple Steam installs in the same /home partition because you're a distro hopper.

One day...


There's a big difference between production installs and personal systems or hobbyist systems. Think, there are lots of businesses running Cobol on mainframes, there are machine shops running on Windows XP, there are big companies still running java 8 and python2. When you have a system that is basically frozen, you end up with catastrophic failure where to upgrade X you need to upgrade Y which requires upgrading Z, etc. You'd be surprised what even big named companies are running in their datacenters, stuff that has to work, is expensive to upgrade, and by virtue of being expensive to upgrade it ends up not being upgraded to the point where any upgrade becomes a breaking change. And at the rate technology changes, even a five year old working system quickly becomes hopelessly out of date. Now imagine a 30 year old system in some telco.

These are such different use cases that I think completely different standards and processes as well as build systems are going to become the norm for big critical infrastructure versus what is running on your favorite laptop.


Well, you will have multiple copies of the same library in your system, but they will be just exclusive for each application.

Well, not really. The compiler is able to optimize the contents of the library and integrate it with the program. i.e. some functions will just be inlined, and that means that those functions won't exist in the same form after other optimizations are applied (Like, maybe the square root function has specific object code, but after inlining the compiler is able to use the context to minify and transform it further).

So it's actually less wasteful.


> the compiler will be able to rip out all of the functions that they don't use

Link-time optimization is already a thing. It's sad it doesn't seem to be used more often.


Yes, but LTO doesn't apply across shared object libraries. Suppose I write a video game that uses DirectX for graphics, but I don't use the DirectX Raytracing feature at all. Because of DLL hell, I'm going to be shipping my own version of the DirectX libraries, ones that I know my video game is compatible with. Those are going to be complete DirectX libraries, including the Raytracing component, even though I don't use it at all in my game. No amount of LTO can remove it, because theoretically that same library could be used by other programs.

On the other hand, if I am static linking, then there are no other programs that could use the static library. (Or, rather, if they do, they have their own embedded copy.) The LTO is free to remove any functions that I don't need, reducing the total amount that I need to ship.


DirectX is a bad example, it's now a part of Windows and you don't ship it anymore.

Good point (and shows that I am not a video game developer). I had tried to pick DirectX as something that would follow fao_'s example of game developers. The point still holds in the general case, though as you pointed out, not in the case of DirectX in particular.

Even without LTO, the linker will discard object files that aren't used (on Linux a static library is just an ar archive of object files). It's just a different level of granularity.

You can get better granularity – build with -ffunction-sections, link with --gc-sections.

LTO is really a much bigger deal, the main reason to use it is not throwing unused stuff away, but inlining across all the things.


> One day, game developers will package static binaries

Wouldn't you still need at least runtime linking (dlopen) to link to OpenGL/Vulkan/etc.?


I've seen independent OpenGL libraries packaged along with games, so, I wouldn't think so no.

I doubt that you did since OpenGL implementations are hardware-specific. Perhaps you mean utility libraries building on top of OpenGL such as GLEW or GLUT.

For some libraries (OpenGL, Vulkan, ALSA, ...), the shared library is the lowest stable cross-hardware interface there is, so statically linking the library makes no sense.


> This comes up a lot, but how often do you end up in a scenario where there's a critical security hole and you _can't_ patch it because one program somewhere is incompatible with the new version? Maybe even a program that isn't security critical.

That’s not the point. The point is having to find and patch multiple copies of a library in case of a vulnerability, instead of just one.

Giving up the policy to enforce shared libraries would just make the work of security teams much harder.


Then you'll be trading a security hole that might not even be exploitable for undefined behavior.

In practice this works really well as long as the change is to eliminate an unwanted side effect rather than to change a previously documented behavior.

But it doesn't really matter. What matters is that whatever system is in use needs to have a control system that can quickly and reliably tell you everything that uses a vulnerable library version, and can then apply the fixes where available and remind you of the deficiencies.

That metadata and dependency checking can be handled in any number of ways, but if it's not done, you are guaranteed not to know what's going on.

If a library is used inside a unikernel, inside a container, inside a virtual machine, inside venvs or stows or pex's or bundles, the person responsible for runtime operations needs to be able to ask what is using this library, and what version, and how can it be patched. Getting an incomplete answer is bad.


I strongly agree that the reporting and visibility you're talking about are important.

But there's one other advantage of the shared library thing, which is that when you need to react fast (to a critical security vulnerability, for example), it is possible to do it without coordinating with N number of project/package maintainers and getting them all to rebuild.

You still do want to coordinate (at least for testing purposes), but maybe in emergencies it's more important to get a fix in place ASAP.


>In practice this works really well as long as the change is to eliminate an unwanted side effect rather than to change a previously documented behavior

...and then you deploy and discover that somebody was depending on that "unwanted" side effect.


> In practice this works really well as long as the change is to eliminate an unwanted side effect rather than to change a previously documented behavior.

I fully agree with that, didn't understand the rest.


Let's say you are building a web-based file upload/download service. You're going to write some code yourself, but most components will come from open-source projects. You pick a base operating system, a web server, a user management system, a remote storage system, a database, a monitoring system and a logging system. Everything works!

Now it's a month later. What do you need in ongoing operations, assuming you want to keep providing reasonable security?

You need to know when any of your dependencies makes a security-related change, and then you need to evaluate whether it affects you.

You need to know which systems in your service are running which versions of that dependency.

You need to be able to test the new version.

You need to be able to deploy the new version.

It doesn't matter what your underlying paradigm is. Microservices, unikernels, "serverless", monoliths, packages, virtual machines, containers, Kubernetes, OpenStack, blah blah blah. Whatever you've got, it needs to fulfill those functions in a way which is effective and efficient.

The problem is that relatively few such systems do, and of those that do, some of them don't cooperate well with each other.

It's plausible that you have operating system packages with a great upstream security team, so you get new releases promptly... and at the same time, you use a bunch of Python Packages that are not packaged by your OS, so you need to subscribe individually to each of their changefeeds and pay attention.

Does that help?


Thank you

Absolutely correct.

The whole "shared libraries are better for security" idea is basically stuck in a world where we don't really have proper dependency management and everyone just depends on "version whatever".

This is interestingly also holding back a lot of ABI breaking changes in C++ because people are afraid it'll break the world... which, tbf, it will in a shared library world. If dependencies are managed properly... not so much.

I wish distros could just go back to managing applications rather than worrying so much about libraries.

EDIT: There are advantages to deduplicating the memory for identical libraries, etc., but I don't see that as a major concern in today's world unless you're working with seriously memory-constrained systems. Even just checksumming memory pages and deduplicating that way might be good enough.


So if there is 'proper dependency management' (what do you propose? are we too fixed in versioning, too loose?) how will you fix the next Heartbleed? Pushing updates to every single program that uses OpenSSL is a lot more cumbersome (and likely to go wrong because there is some program somewhere that did not get updated) than simply replacing the so/dll file and fixing the issue for every program on the system.

And in case your definition of proper dependency management is 'stricter', then you simply state that you depend on a vulnerable version, and fixing the issue will be far more cumbersome as it requires manual intervention as well, instead of an automated update and rebuild.

If it is looser, then it will also be far more cumbersome, as you have to watch out for breakage when trying to rebuild, and you need to update your program for the new API of the library before you can even fix the issue at all.


No, it is not cumbersome to reinstall every program that relies on OpenSSL. My /usr/bin directory is only 633 MB. I can download that in less than a minute. The build is handled by my distro's build farm and it would have no problem building and distributing statically linked binaries if they ever became the norm.

That is going back to the same issues with containers, where everything works just fine... as long as you build it from your own statically-configured repo and you rebuild the whole system every update. It's useless once you try to install any binaries from an external package source. And IMO, a world where nobody ever sends anyone else a binary is not a practical or useful one.

Yes? Rebuilding and (retesting!) the system on every major update is not a bad idea at all. I rarely install binaries from out-of-repo sources so that is not a great problem for me. And those I do install tend to be statically linked anyway.

> In practice this doesn't work because each application has to be tested against the new version of the library.

In practice it works: see autopkgtest and https://ci.debian.net, which reruns the tests for each reverse dependency of a package, every time a library gets updated.

I know for a fact that other commercial, corporate-backed distributions are far away from that sophistication, though.


> I know for a fact that other commercial, corporate-backed distributions are far away from that sophistication, though.

No, they're not. Both Fedora/RHEL and openSUSE/SLE do similar checks. Every update submission goes through these kinds of checks.

Heck, the guy who created autopkgtest works at Red Hat and helped design the testing system used in Fedora and RHEL now. openSUSE has been doing dependency checks and tests for almost a decade now, with the combination of the openSUSE Build Service and OpenQA.


Ex SUSE employee here. OBS and OpenQA are nowhere close to Debian CI.

Having worked with both systems, I'd say Debian CI does not focus on system integration. Fedora CI and Debian CI are more similar than different, but Fedora also has an OpenQA instance for doing the system integration testing, as openSUSE does. openSUSE's main weakness is that they don't do deeper inspections of RPM artifacts, the dependency graph, etc. They don't feel they need it because OBS auto-rebuilds everything on every submission anyway. The Fedora CI tooling absolutely does this, since auto-rebuilds on build aren't a thing there, and it's done on PRs as well as update submissions.

If you can retest everything, you can rebuild everything. But that doesn't help the network bandwidth issue. Debian doesn't have incremental binary diff packages (delta debs) in the default repos anyway, so there's room for improvement there.

> If you can retest everything, you can rebuild everything

No, rebuilding is way heavier.

Even worse, some languages insist on building against fixed versions of their dependencies.

It forces distributions to keep multiple versions of the same library, and this leads to a combinatorial explosion of packages to fix, backport, compile and test.

It's simply unsustainable and it's hurting distributions already.


Distros have been known to patch packages that wanted a fixed version of a dependency. In fact, most have it as their default policy.

Yes, that's my point.

To rephrase what I meant; those two aren't exclusive:

- statically link a library

- have all binaries in the system use the same version of the library, and updated when the library needs to be updated.


So if the CVE says that you need to update library L, and program A uses the thing that's broken and it's ok to update, but program B doesn't use the broken thing, but the L upgrade breaks it, CI would let you know that you're screwed before you upgrade... but you're still screwed.

It's worse if the program that needs the update also is broken by the update, of course.


So, what's the alternative?

Now it's your choice... you either lose B but protect the rest of the infrastructure from hackers... or you think the CVE doesn't apply to your usecase (internal thing on a testing server), and don't upgrade L to keep B working.

You can also install both versions of L. You can also patch the broken part of L out at the old version, if it's not mission critical. There's a lot of things you can do.

Having one giant binary file with everything statically compiled in is worse in every way, except for distribution-as-a-single-file (but you can already do this now, by putting the binary and the libraries in a single zip, dumping everything in /opt/foo, and letting the user find the vulnerable library manually... which, again, sucks).


If it were static libraries, you'd upgrade the package for A (which would need to be recompiled with updated L) and leave B alone. As a low priority followup, fix either B or L so they can work together again (or wait for someone else to fix and release).

Installing both versions of L is usually hard. It's one thing if it's OpenSSL[1] 1.1 vs 1.0, but if 1.0.0e is needed for security and 1.0.0d is needed for other applications, how do you make that work in Debian (or any other system that's at least somewhat mainstream)?

[1] Not to pick on OpenSSL, but it's kind of the poster child for important to pick up updates that also break things; but at least they provide security updates across branches that are trying to stay compatible.


For a rough measure of how many packages will break if you perform a minor update to a dependency, try searching “Dependabot compatibility score” on GitHub (or just author:dependabot I suppose), each PR has a badge with how many public repo’s CI flows were successful after attempting to apply the patch. By my eye it seems to be generally anywhere from 90-100%. So the question is would you be happy if every time a security vulnerability came out approximately 1 in 20 of your packages broke but you were “instantly” protected, or would you rather wait for the maintainers to get around to testing out and shipping the patch themselves. Given the vast majority of these vulnerabilities only apply in specific circumstances and the vast majority of my packages are only used in narrow circumstances with no external input, I’d take stability over security.

Security patches are typically much smaller scoped than other library updates. Also, Dependabot does not support C or C++ packages so its stats are not that useful for shared libraries, which are most commonly written in C.

> as opposed to finding every application that uses the library and updating each separately.

Now with Docker, it becomes, find all the Docker containers that have that shared library, which may even be a bigger pain.

Not all uses, but it seems as if a significant use of Docker is just a very complicated replacement for a single statically linked binary.


Fully agree. I have a collection of tensorflow docker containers because installing multiple versions in parallel won't work.

Linux distributions fix vulnerabilities in shared libraries all the time.

Imagine statically linking OpenSSL and having to rebuild tens of thousands of packages every time there's an update!


Rebuilding tens of thousands of packages on a dependency upgrade is not a big deal.

Rebuilding and running tests on that number of packages every time there's a security update in a dependency is completely unsustainable for Linux distributions, as well as for the internal distributions in large companies.

I worked on these systems in various large organizations and distros. We did the math.

On top of that, delivering frequent very large updates to deployed systems is very difficult in a lot of environments.


> In practice this doesn't work because each application has to be tested against the new version of the library.

Debian's security team issues thousands of patched shared libraries every year without testing them against every program, and without causing failures. They do that by backporting the security fix to the version of the library Debian uses.

I gather you are a developer (as am I), and I'm guessing that scenario didn't occur to you as no developer would do it. But without it Debian possibly wouldn't be sustainable. There are 21,000 packages linked against libc in Debian. Recompiling all of them when a security problem happens in libc may well be beyond the resources Debian has available.

In fact, while it's true backward compatibility can't be trusted for some libraries, it can for many. That's easily verified - Debian ships about 70,000 packages, yet typically Debian ships just one version of each library. Again the standout example is libc - which is why Debian can fearlessly link 21,000 packages against the same version.

I'm guessing most of the people here criticising shared libraries are developers, and it's true shared libraries don't offer application developers much. But they weren't created by application developers or for application developers. They were created by the likes of SUN and Microsoft, who wanted to ship one WIN32.DLL so they could update just that when they found a defect in it. In Microsoft's case recompiling every program that depended on it was literally impossible.


Works the exact same way without shared libraries, just at the source level instead of the binary level. "Finding every application" is simple. The problem is the compile time and added network load. Both are solvable issues, via fast compiling and binary patching, but the motivation isn't there as shared libraries do an OK job of mitigating them.

Fast compiling isn't really compatible with reproducible builds, is it?

On the contrary. Reproducible builds are an absolute prerequisite for fast compiling, by enabling incremental compiles that can be trusted.

But is this still as important if the executable in question is part of the Linux distribution? In theory, Fedora knows that Clang depends on LLVM and could automatically rebuild it if there was a change in LLVM.

To me that is an argument that doesn't make any sense, at least on Linux. It could make sense if we talk about Windows or macOS, where you typically install software by downloading it from a website and you have to update it manually.

On Linux, the only thing that should change is that if a vulnerability is discovered, let's say in OpenSSL, all the software that depends on OpenSSL must be updated, and that could potentially be half the binaries of your system. It's a matter of download size, which in reality doesn't matter that much (and in theory can be optimized with package managers that apply binary patches to packages).

It's the maintainers of the distribution who note the vulnerability in OpenSSL and decide to rebuild all packages that are statically linked to the vulnerable library. But for the final user the only thing to do is still an `apt upgrade` or `pacman -Syu` or whatever, and he would still get all the software patched.


That's on the assumption that all software on Linux comes through the official repos of the distribution. I would bet that there are almost no systems where this holds entirely true, as I've seen numerous software packages whose installation instructions for Linux are either `curl | sudo bash` or `add this repo, update, install`.

Can't we just link statically but still maintain the list of dependencies as if everything was linked dynamically?

Problem if you're not set up to build everything yourself.

Is it? Just update everything regularly.

Actually now that I think about it, building by yourself might put you at a disadvantage here, as you'd have to remember to rebuild everything. I'm kinda lazy when it comes to updates so not sure if I like the idea anymore with having to rebuild all the software I built myself lol, but it probably could be solved by also shipping .so versions of libraries.


Automatic updates are themselves a security risk, which is something that I rarely hear talked about. For example, the FBI's 2015/2016 dispute with Apple about unlocking phones. The FBI's position relied on the fact that Apple was technically capable of making a modified binary, then pushing it to the phones through automatic updates. If Apple were not capable of doing so (e.g. if updates needed to be approved by a logged-in user), then that vector of the FBI's attack wouldn't be possible.

I don't have the best solution for it, but the current trend I see on Hacker News of supporting automatic updates everywhere, sometimes without even giving users an opt-out let alone an opt-in, is rather alarming.


I don't argue for automatic updates. It's pretty much whatever we already have, but instead of updating a single library, you'd have to update every package that depends on that library.

I'm just throwing ideas around so you should definitely take what I'm saying with a grain of salt. It just would be interesting to see a distro like that and see what the downsides of this solution are. Chances are that there probably already is something like this and I'm just not aware of it and I'm reinventing the wheel.


Ah, got it. Sorry, I misinterpreted "update everything regularly" to imply developers forcing automatic updates on every user.

I'm in the same boat, as somebody who isn't in the security field. I try to keep up with it, but will occasionally comment on things that I don't understand.


> Is it? Just update everything regularly.

On stable production systems? Never.

The CIP project https://www.cip-project.org/ aims to backport security fixes to Linux for *25* years after each CIP release.

No sensible organization runs bleeding edge OSes on their airplanes, power plants, payment processors, industrial plants...


Nix will solve this for you in a breeze.

sure, some packages in some of the popular distros are indeed like that. If the package is important enough (like firefox) and the required dependencies are a bit out of step with what's currently used by the rest of the distribution you will sometimes see that for at least some of the dependencies.

Most distros dislike doing so, and scale it back as soon as it becomes technically possible.


But they just dislike packages requiring different versions of libraries, right? My point is to do literally everything as it is right now, but simply link it statically. You can still have the same version of a library across all packages, you just build a static library instead of a dynamic one.

This is odd to me, because surely they have to maintain that list anyway so they know which dependent packages need to be tested before releasing bugfixes?

Or is that step just not happening?

I just feel like, one of the big points of a package manager is that I can look up "what is program X's dependencies, and what packages rely on it?"


Another issue with security patches is that specific CVEs are not of equal severity for all applications. This likewise changes the cost/benefit ratio.

You also get bugs introduced into a program by changes to shared libraries. I've even seen a vulnerability introduced when glibc was upgraded (glibc changed the copy direction of memcpy, and the program was using memcpy on overlapping memory).

The memcpy API says that is Undefined Behavior, that program was never valid. Not much different from bitbanging specific virtual addresses and expecting they never change.

For overlapping memory, use memmove()


Yes, the program was invalid, but it was also accidentally bug free. The two are not mutually exclusive.

It was never valid against the generic shared interface to libc, but against a particular statically linked version it would have been.

The C language makes a distinction between Undefined Behavior and Implementation-Defined Behavior. In this case it's the former (n1570 section 7.24.2.1).

Any code that invokes UB is not a valid C program, regardless of implemention.

More practically, ignoring the bug that did happen, libc also has multiple implementations of these functions and picks one based on HW it is running on. So even a statically linked glibc could behave differently on different HW. Always read the docs, this is well defined in the standard.


The source code may have had UB, but the compiled program could nevertheless have been bug-free.

> Any code that invokes UB is not a valid C program, regardless of implemention.

I disagree. UB is defined (3.4.3) merely as behavior the standard imposes no requirements upon. The definition does not preclude an implementation from having reasonable behavior for situations the standard considers undefined.

This nuance is very important for the topic at hand, because many programs are written for specific implementations, not specs, and doing this is completely reasonable.


> You also get bugs introduced into a program by changes to shared libraries. I've even seen a vulnerability introduced when glibc was upgraded (glibc changed the direction of memory, and the program was using memcpy on overlapping memory).

Did glibc change the behavior of memcpy (the ABI symbol) or memcpy (the C API symbol) which currently maps to the memcpy@GLIBC_2.14 ABI symbol?


Central patching had greater utility before operating systems had centralized build/packaging systems connected via the internet. When you had to manually get a patch and install it via a tape archive or floppy drive, or manually copy files on a lan, only having to update one package to patch a libc issue was a big help. Today it would require a single command either way, and so doesn't really have any significant benefit.

Doesn't work? This is what Debian and Fedora have been doing for decades!

An update is just as likely to break something as it is to fix something.

I don't know the term for this pattern.

- shared libs

- name indirection (dns domains, css classes)



Late binding?

overloading?
