
An alternative to shared libraries - Fenume
http://www.kix.in/2008/06/19/an-alternative-to-shared-libraries/
======
stormbrew
This just illustrates that the real problem is versioning, not linking. It's
easy to hand-wave away the versioning problems of this approach, but in
practice it isn't really different from the mechanisms used to version
shared libraries and suffers all the same pitfalls. You've just replaced ld.so
with a bunch of running daemons and turned it into a distributed computing
problem. Don't get me wrong, I love Plan 9 and its namespaces and long for
them in the practical world, but this is six of one, half a dozen of the other.

Also, filesystem namespaces in Linux are a privileged operation, so these
kinds of approaches don't work at all like they do in Plan 9.
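
To make that concrete, here is a minimal sketch (not from the article) of the
limitation being described: unshare(2) with CLONE_NEWNS is the real Linux call
for getting a private mount namespace, and for an ordinary user it typically
fails with EPERM.

    /* Minimal sketch: a per-process mount namespace on Linux needs privilege
     * (CAP_SYS_ADMIN); unshare(2) and CLONE_NEWNS are real, the rest is just
     * error reporting. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <errno.h>
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        if (unshare(CLONE_NEWNS) == -1) {
            /* Typically EPERM for an unprivileged user - the limitation the
             * parent comment is pointing at. */
            fprintf(stderr, "unshare(CLONE_NEWNS): %s\n", strerror(errno));
            return 1;
        }
        puts("got a private mount namespace; mounts are now per-process");
        return 0;
    }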

------
vrdabomb5717
What fascinates me is how close this synthetic filesystem is to making REST
calls with a small client library. All your application needs to know is the
path to call and what HTTP verb to use. Even the way the author introduces
versioning sounds familiar: using a versioning file is similar to having a
version at the beginning of your URL.

I appreciate how good architectures all resemble each other at some point,
with only the transport layers differing between applications.
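
As a hedged sketch of that parallel (the /n/crypt/v1 path and the service
behind it are invented for illustration; only the open/write/read calls are
real), talking to such a synthetic filesystem looks a lot like issuing a verb
against a versioned URL:

    /* Sketch only: a hypothetical crypto service mounted at /n/crypt, with the
     * API version encoded in the path, roughly "PUT http://crypt/v1/encrypt". */
    #include <fcntl.h>
    #include <unistd.h>
    #include <string.h>

    int main(void) {
        int fd = open("/n/crypt/v1/encrypt", O_RDWR);  /* path = endpoint + version */
        if (fd < 0)
            return 1;

        const char msg[] = "hello";
        char out[256];

        write(fd, msg, strlen(msg));   /* request body  */
        read(fd, out, sizeof out);     /* response body */

        close(fd);
        return 0;
    }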

~~~
robmccoll
REST is just CRUD over HTTP. The file interface and the Unix
everything-is-a-file concept are among the earliest well-defined CRUD
interfaces in computing. The author (and Plan 9) are taking these concepts to
their logical conclusion.
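
For what it's worth, a tiny sketch of that correspondence (the path is
arbitrary and the HTTP mapping in the comments is only approximate):

    /* The Unix file API already is a CRUD interface; the verbs map roughly
     * onto their HTTP counterparts. */
    #include <fcntl.h>
    #include <unistd.h>
    #include <string.h>

    int main(void) {
        const char *path = "/tmp/resource";           /* ~ a REST resource URL */
        const char body[] = "state";

        int fd = open(path, O_CREAT | O_RDWR, 0600);  /* Create  ~ POST        */
        write(fd, body, strlen(body));                /* Update  ~ PUT / PATCH */

        char buf[64];
        lseek(fd, 0, SEEK_SET);
        read(fd, buf, sizeof buf);                    /* Read    ~ GET         */

        close(fd);
        unlink(path);                                 /* Delete  ~ DELETE      */
        return 0;
    }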

------
rsp1984
An excellent read. Moreover, most linkers these days have the ability to strip
out unused code from your executable if you are using static linkage.

That means if your tiny executable uses just some methods of a gigantic
library, it can do so and it will still stay tiny. Contrast that with dynamic
linkage where in practice the whole gigantic library has to ship with your
executable because you can never be sure that the host will already have the
right version of libGigantic installed. What a mess.

Shared libraries are the reason a lot of app installations are several GB
these days. And I dare say that 99.x% of the code that's shipped with an
installation package _never_ gets used by the installed app, and that most
compiled app code would comfortably fit on a couple of floppy disks if static
linkage were used rigorously.
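
For reference, a hedged sketch of the dead-code stripping being described,
using GCC/binutils flags that do exist (the "libGigantic" stand-in and file
names are invented):

    /* Stand-in for a gigantic library: only used_fn() is ever called. */
    int used_fn(int x)   { return x + 1; }
    int unused_fn(int x) { return x * 2; }   /* dropped when sections are GC'd */

    int main(void) { return used_fn(41); }

    /*
     * Built with per-function sections and linker garbage collection,
     *
     *   cc -Os -ffunction-sections -fdata-sections -Wl,--gc-sections -o tiny tiny.c
     *
     * the linker can discard unused_fn entirely, whereas a shared
     * libGigantic.so has to carry every exported symbol whether or not any
     * caller uses it.
     */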

~~~
jjoonathan
> if your tiny executable uses just some methods of a gigantic library, it can
> do so and it will still stay tiny

How many other functions do the ones you explicitly call drag into the
executable? How do you propose to deduplicate them and their resources between
processes? I'm specifically thinking about UI code. Having 30 different copies
of your UI toolkit and its resources would be silly, even if each was stripped
to 1/3 the size of the original. "Just use IPC and do UI in the window server"
sounds an awful lot like X11, whose architecture we were just starting to
migrate away from!

> Shared libraries are the reason a lot of app installations are several GB
> these days.

What are you referring to? vcredist is 7MB. DirectX runtimes are ~100MB. Qt
is 20MB. Not small, but not the majority of several GB. Sure it's silly to
ship an app with a shared library rather than statically link it, but I think
you're exaggerating the scale of the problem and understating the number of
profitable examples of library sharing, especially when it comes to UI
libraries.

~~~
hyc_symas
The X Window System architecture solves a specific problem - allowing clients
to use graphical applications no matter where the app actually runs. This was
important in the 1980s and is still worthwhile today - no matter how much
compute power you can stash under your desk or wear on your wrist, you can
cram orders of magnitude more into a data center. Having a uniform interface
that's independent of where the heavy lifting actually happens is a crucial
accomplishment for X, and throwing that away with the focus on DirectX is
ultimately a mistake.

~~~
hbogert
Now, if only X actually worked when you want to run it in a data center and
see it on your laptop. X is horrible when the latency goes above 5ms.

~~~
hyc_symas
Not saying X is anywhere near perfect. But losing network transparency is a
mistake.

Personally I always preferred MGR. I suppose that evolved into Plan 9's window
system, but I've never tried the latter. My attempt to bring MGR into the
modern age is clunky at best.
[https://github.com/hyc/mgr](https://github.com/hyc/mgr) But there's a lot to
be said for a lightweight network-transparent protocol with a braindead-simple
runtime.

~~~
jjoonathan
> losing network transparency is a mistake.

Why? X11 competes with VNC to provide remote interaction. VNC wins, and not by
a small factor, on the apps I use day-to-day.

------
jeorgun
"If you think about it, if your code is small and clean, you wouldn’t feel the
need for shared libraries."

Yes you would, for any sane definition of "small and clean". Code can't be
made arbitrarily small; some problem spaces are fundamentally complex.

For what it's worth I have played around some with dynamic libraries and with
FUSE, and I found the former incomparably easier to work with. Maybe that
speaks more to FUSE in particular than to the idea in general (or maybe it's
just me being bad at FUSE), but that's been my experience.
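
For comparison's sake, here's roughly what the dynamic-library side boils down
to; a minimal sketch using the real dlopen/dlsym API (the library and function
names are made up):

    /* Minimal dlopen sketch; link with -ldl on Linux.  "libexample.so" and
     * example_fn are hypothetical names used only for illustration. */
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void) {
        void *handle = dlopen("libexample.so", RTLD_NOW);
        if (!handle) {
            fprintf(stderr, "dlopen: %s\n", dlerror());
            return 1;
        }

        int (*example_fn)(int) = (int (*)(int))dlsym(handle, "example_fn");
        if (!example_fn) {
            fprintf(stderr, "dlsym: %s\n", dlerror());
            dlclose(handle);
            return 1;
        }

        printf("example_fn(2) = %d\n", example_fn(2));
        dlclose(handle);
        return 0;
    }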

------
ghshephard
OS X never[1] seems to require more than 8 GB, no matter how many applications
you have running - I've got 28 right now, including the Office apps, Google
Earth, various browsers, etc. - and the system is ticking along nicely. I've
got to believe that is a testimony to the power of shared libraries.

[1] Yes, I'm aware there is an infinite range of workloads (video, audio,
Photoshop, virtualization, Oracle, etc.) that can use up as much memory as you
throw at them - I'm talking about the Joe Average user workloads here.

------
101914
"In summary, the answer is to write lean, efficient and small pieces of
code..."

What if the user could avoid "non trivial" programs, i.e. the ones that
purportedly make it impossible to avoid shared libraries?

To put it another way, what if a user could have a system containing only
trivial programs that each do one thing and then use them in combination to do
"complex" tasks?

The term "non trivial software" is one I see continuously used as an
underlying assumption and hence a justification for maintaining the status quo
of all manner of existing software problems.

I do not want more "non trivial" software. I want simplicity and reliability.
Not to mention comprehensibility. I get those things from so-called "trivial"
software.

When some of the "non trivial" software I am forced to use becomes too reliant
on too many resources or too many dependencies, I stop using it and find an
alternative.

This strategy has worked beautifully for me over the years.

Shared libraries were a useful concept in their day.

In my humble opinion, those days have passed. A GB of memory is more than
enough for me personally.

I like to use crunched binaries in my systems. As such, I do not seek out
"non-trivial" software and am always looking to eliminate any existing
dependencies on it.

~~~
guard-of-terra
A web browser is an extremely non-trivial program, and yet I am assured we
both use one.

~~~
falcolas
Not all of them. Some folks still browse the web via Emacs, Lynx, and Surf -
all rather simplistic browsers which work fine for a large number of sites,
including this one.

~~~
guard-of-terra
Simplistic, but still non-trivial. Neither of them uses curl-the-binary for
HTTP requests, I think.

------
guard-of-terra
I'm afraid that VFS-based APIs will be even more fragile over their life
cycle, and even harder to reason about.

------
geofft
Abstracting away the parts that are Plan-9-implementation-specific, this
article seems to be advocating replacing shared libraries with remote
procedure calls / a network API, or more fundamentally, calls that can cross
address spaces. It's worth noting that this was an approach that, as I
understand it, _predated_ the advent of shared libraries on UNIX. Terminal
handling (termios) and the X Window System protocol both come to mind, and
we've been slowly moving away from that at least for X (libGL, Wayland/Mir,
etc.). It's also strongly reminiscent of Mach's approach of message-passing
between daemons, which was a decent idea, but ultimately failed because of
performance.

There are definitely advantages to address-space isolation: an unintentional
mistake in one component is much less likely to affect the other, the two
components can pull in conflicting versions of dependencies, etc. But
versioning and ABI compatibility remain issues. I think this post briefly
touches on the versioning problem and assumes that providing both the old and
new version of the library-daemon would solve it: that's probably technically
true, but you'd need to keep every version of the library around to avoid the
problem of libc introducing bugs in the process of fixing other bugs (the only
concrete problem mentioned here). So yes, there's definitely more flexibility
to solve problems than in the current implementations of dynamic linkers, but
the problems themselves remain hard.

Meanwhile, you've also introduced the difficult constraint that libraries have
to operate on copies of all your data. The hypothetical crypto library here is
copying every block of ciphertext over an inter-process call, decrypting it,
and copying it back to the original program. Apart from making security folks
generically twitchy at all the copies of secret data running around, this is
going to be awful for performance. And each side either has to trust the other
side not to be trying to exploit it (which reduces the benefits of address-
space isolation), or verify the data structures' integrity (which makes things
even slower). It's possible that with good implementations of cross-process
shared memory and low-overhead, secure message encodings (like Cap'n Proto),
you could make this better, but it'll be a bit of a project.
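
A hedged sketch of the cross-process shared-memory route mentioned above,
using the real POSIX shm_open/mmap calls (the segment name and sizes are
arbitrary): it avoids copying each block over a pipe, but both sides still
have to agree on, and trust, the buffer's layout.

    /* Producer-side sketch; a hypothetical library-daemon would shm_open the
     * same name, map it, and encrypt in place.  Link with -lrt on older glibc. */
    #include <sys/mman.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <string.h>

    #define SHM_NAME "/crypto-scratch"   /* arbitrary example name */
    #define SHM_SIZE 4096

    int main(void) {
        int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0600);
        if (fd < 0) return 1;
        if (ftruncate(fd, SHM_SIZE) < 0) return 1;

        char *buf = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (buf == MAP_FAILED) return 1;

        memcpy(buf, "plaintext block", 16);   /* no per-call copy of the data */
        /* ... signal the daemon, wait for it to transform the buffer ... */

        munmap(buf, SHM_SIZE);
        close(fd);
        shm_unlink(SHM_NAME);
        return 0;
    }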

I'm happy to admit that the implementations of dynamic linking are all less
than awesome. Fundamentally, there's no reason that you can't design a shared-
library system with all of the properties in this design, including the
ability to load two copies of the same library that differ only by minor
version, to satisfy dependencies of two different components. Even the current
GNU linker (which is not my favorite dynamic linker) supports symbol
versioning, so it _could_ offer both the GLIBC_2.18 and GLIBC_2.19 versions of
a function in the same library, although this facility isn't used very much.
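
For reference, a hedged sketch of that GNU symbol-versioning mechanism; the
.symver directive and version scripts are real, but the function and version
names here are invented:

    /* libdemo.c: export two implementations of demo_fn under different version
     * tags, the way glibc keeps old ABI entry points alive. */
    int demo_fn_old(int x) { return x + 1; }   /* old behaviour */
    int demo_fn_new(int x) { return x + 2; }   /* new behaviour */

    __asm__(".symver demo_fn_old, demo_fn@DEMO_1.0");
    __asm__(".symver demo_fn_new, demo_fn@@DEMO_2.0");  /* @@ marks the default */

    /*
     * demo.map version script:
     *   DEMO_1.0 { };
     *   DEMO_2.0 { } DEMO_1.0;
     *
     * Build: cc -shared -fPIC -Wl,--version-script=demo.map -o libdemo.so libdemo.c
     * Binaries already bound to demo_fn@DEMO_1.0 keep the old code; new links
     * pick up DEMO_2.0.
     */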

------
malkia
My experience from Windows is that DLLs are both a blessing and a pain. I'll
give a concrete example: a not-so-small Qt application.

If you use the DLL version of Qt you get the following benefits:

  - Faster link time (more on this later). This is the big winner.

  - Minor versions can be updated independently of the executables or other DLLs using the library. This requires the library itself to be well written and to respect that (Qt follows good procedures, sqlite is an awesome example, but there are some terrible ones - like the P4 C++ API, which constantly adds/removes virtual members, enums, etc.).

  - Fully optimized (/LTCG) DLLs; if Qt lets you mix everything into one DLL, even better (calls between QtCore and QtGui could be further reduced, code inlined, etc.). Okay, you don't get full whole-program optimization (only a full static link would give you that), but you still get good link times with overall good optimization.

  - Exceptions kept there, not propagated (controversial whether this is a good idea, but I like it).

  - No clashes with other (usually) statically linked libraries - like png, zlib, etc. Unless you really want both QtCore and your app to use exactly the same versions (for one reason or another).

  - Smaller executable size (this does not matter much lately, but that may change).

Minuses of DLLs, pluses of static linking:

  - Deployment madness no more. You push one executable and everyone is happy. You don't need to make sure that the pushed DLLs (.so) won't break other executables. You might be able to roll back or work on a specific version if things go badly (this could also be done with DLLs/.so, but they would have to reside alongside the executable).

  - Somewhat faster execution time (less time spent resolving symbols, loading DLLs, etc.).

  - You get RTTI, exceptions, and on older versions of certain systems __declspec(thread) and other things working correctly.

  - Real full whole-program optimization. But you should have other release targets (for development).

My biggest pain with DLLs (and executables) was on Windows, where people would
sync directly from P4 (Perforce) and run the executables straight from where
they were synced (this way there is no extra step - "press or run this thing
after syncing"). But this comes with a price - on Windows you can't replace a
running executable (or DLL) with another. I've tried the various hacks where
you set the executable/DLL to be loaded from the network or a CD so that it's
fully put in the swap, but it still doesn't work. The correct way seems to be
a proper deployment - you sync, you run some kind of "deployment" tool, and
then you work.

From the Windows point of view I really liked the idea of looking for a DLL
first in the executable's directory, then in other places. UNIX doesn't seem
to work this way, but then on UNIX people have had stable places and locations
for things (/usr/lib, /usr/local/lib, etc.).

Another problem is if you want to have 32-bit and 64-bit DLLs/.so files
simultaneously. I like OS X's solution of fat binaries the most, but it seems
the Linux folks do not like it (there was a proposal some time ago), and on
Windows it's out of the question.

It doesn't really scale when you get more architectures/models, but if you
have mainly two it might work pretty well (apart from being a pain for the
build system).

~~~
david-given
Regarding deployment via a single executable: that's only true if your
application consists solely of code (or resources which can be deployed in
code, such as XBMs). As soon as you start having resources which aren't code,
you end up having to deploy them as well, which means you have to start
versioning them, etc, etc. The version of Chrome I'm typing this into is 120
files, one of which is the executable.

Regarding faster execution time: that's debatable. You certainly won't
necessarily get faster application startup, because you're going to have to
page in all that code, whereas with a DLL it's probably already in memory and
being used by another process. You're also likely to use a lot more memory,
because each process will have its own Qt instance, and they won't be
shareable.

Re RTTI and exceptions: works for me on Linux! Does this really not work on
Windows?

Re whole code optimisation: that I'll grant you. And DLL code is typically
terrible (because of hacks needed to allow text pages to be shared between
processes).

