
One of the main reasons for dynamic linking has become irrelevant, I believe: disk and memory capacity have grown faster than the size of the binaries.



IMHO, static linking, be it via a mechanism like what npm is doing or plain old static linking of binaries, also means "you link it, you own it".

For every package you link statically, you as the parent package owner become responsible for all of its security flaws.

And by "responsible" I mean: Every dependent security announcement of a dependent package also becomes your security announcement.

If you're AWESOMY 1.1 and you link statically against openssl, and openssl announces a security flaw, then you had better quickly release AWESOMY 1.1.1 with an accompanying security announcement of your own.

Are you willing to do this? Do you trust the chain of dependencies all the way down to also be willing to do the same?

As a responsible developer, I'd much rather delegate that responsibility away to a packager or even the user, especially with well-known libraries like openssl.

If I own AWESOMY 1.1 and link dynamically against openssl, then I don't have to do anything when openssl releases a security announcement. I can, if I want to, tell my users to update openssl, but quite likely they are already doing that anyway for some other package.

For me as a developer, this is considerably more convenient.

Yes. Static linking has huge advantages for me as a developer too, but it also comes with a great many additional responsibilities I'm personally not willing to take on, not least because I don't trust my dependencies to be as diligent about their own dependencies.


At least on Linux, (commercial) people are packaging shared libs with their binary and using LD_PRELOAD wrappers anyway. "Own-it-by-proxy"-except-you're-shipping-it-so-you-kinda-own-it-anyway.

Dynamically linked, dlopen()-or-equivalent plugin designs are an exception. Otherwise, I'd like to see more statically-linked applications.


For every package you link statically, you as the parent package owner become responsible for all of its security flaws

Well, speaking as an administrator rather than a developer: a developer shouldn't be making that kind of policy decision (time of symbol resolution) to begin with barring a strong technical need (plug-in based architecture, etc.).

I mean, 90% of packages don't care whether their libraries are dynamic or static, but the ones that do can still be annoying. (Coreutils, of all things, requires dynamic linkage for stdbuf, though that seems to be considered a bug by the maintainer; on the other end of the spectrum, getting Perl to even build statically is like pulling teeth, and that was a design decision.)

Do you trust the chain of dependencies all the way down to also be willing to do the same?

The only time that is a problem is those cases where the package developer literally plunks down a bunch of code from his upstream into his own tree (think the embedded glib inside pkg-config, or the hideous monstrosity that is gnulib). I think all sane people agree this is bad -- if you aren't significantly altering the code (at which point it's "yours") there's no sense in doing that and risking a sync problem. But that's not even static linkage; that's just literal code-sharing.


The embedded glib inside pkg-config solves a chicken-and-egg problem: pkg-config requires glib and glib requires pkg-config. This way, you can build pkg-config first with its embedded glib, then glib itself, and then go solve your own problems, instead of butchering the build system to make them build at all.


Oh, true, and I didn't mean that as a criticism of pkg-config; they only did that because it was literally the only option. In general you should only duplicate actual code if it's the only option, was my point; pkg-config is the pattern and gnulib is the anti-pattern in that.


I believe that what you describe is the remaining argument in favor of dynamic linking.

That said, I'd like to point out that it is convenient mainly for the developer: customers are ultimately interested in knowing whether your software is vulnerable, and you'll have to explain how the vulnerability affects it either way.

It's also a double-edged sword: openssl has a good backward-compatibility record, but that can't be said about all user-space libraries. IOW, it's tricky to guarantee that your software will work flawlessly across all the incarnations of CentOS 6.x, if you have a lot of external dependencies, for example.


IOW, it's tricky to guarantee that your software will work flawlessly across all the incarnations of CentOS 6.x,

It's not so tricky, since Red Hat nowadays specifies what guarantees can be expected for which packages:

https://access.redhat.com/articles/rhel-abi-compatibility#Ap...


Dynamic linking is required for anything resembling a plugin architecture (such as PAM).
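
(A dlopen()-based host is only a few lines, for reference; the path "./plugin.so" and the entry point "plugin_init" below are made-up names for the sketch, not PAM's actual interface.)

    /* Minimal sketch of a dlopen()-based plugin loader.
     * Plugin path and entry-point name are hypothetical.
     * Build with something like: cc host.c -ldl */
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void)
    {
        void *handle = dlopen("./plugin.so", RTLD_NOW | RTLD_LOCAL);
        if (!handle) {
            fprintf(stderr, "dlopen: %s\n", dlerror());
            return 1;
        }

        /* Look up the plugin's entry point by name at run time. */
        int (*plugin_init)(void) = (int (*)(void))dlsym(handle, "plugin_init");
        if (!plugin_init) {
            fprintf(stderr, "dlsym: %s\n", dlerror());
            dlclose(handle);
            return 1;
        }

        printf("plugin_init returned %d\n", plugin_init());

        dlclose(handle);
        return 0;
    }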


Not really—one could use processes and some form of IPC (shared memory, pipes, whatever). The efficiency could be pretty bad, but the safety and reliability could be better.
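
A rough sketch of that out-of-process flavour, with a made-up "./plugin-helper" binary that writes its answer to stdout:

    /* Sketch: run the "plugin" as a child process and read its reply
     * over a pipe instead of loading it into our own address space.
     * "./plugin-helper" is a hypothetical helper binary. */
    #include <stdio.h>

    int main(void)
    {
        /* popen() gives a one-way pipe; a real design would use
         * pipe()/socketpair() plus fork()/exec() for two-way traffic. */
        FILE *plugin = popen("./plugin-helper", "r");
        if (!plugin) {
            perror("popen");
            return 1;
        }

        char reply[256];
        if (fgets(reply, sizeof reply, plugin))
            printf("plugin said: %s", reply);

        printf("plugin exited with status %d\n", pclose(plugin));
        return 0;
    }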


In fact, OpenBSD delegates logins to a login_<foo> binary, e.g. http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man8/.... That works. OpenBSD's system is not as flexible as PAM, with the attendant upsides and downsides.


Yeah. There's a reason I stick with OpenBSD.

I have absolutely no need for my authentication system to be "flexible".


"And by "responsible" I mean: Every dependent security announcement of a dependent package also becomes your security announcement."

Yes and no, because as the developer you can look at the security vulnerability and decide whether it's actually exploitable in your application. That assumes you can determine it, but in a lot of cases it's simple, especially with a huge library like OpenSSL. Say, for example, the only thing I'm using OpenSSL for is some limited functionality, say SHA256; then I can probably ignore 99.99% of the security issues because they just won't apply.
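
Something like this, for instance, touches only a sliver of libcrypto (a sketch using the old one-shot SHA256() helper, deprecated in OpenSSL 3.0 in favour of EVP but fine for illustration):

    /* Sketch: using nothing from OpenSSL's libcrypto except SHA-256.
     * Build with something like: cc sha.c -lcrypto */
    #include <openssl/sha.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        const unsigned char msg[] = "hello, world";
        unsigned char digest[SHA256_DIGEST_LENGTH];

        SHA256(msg, strlen((const char *)msg), digest);

        for (int i = 0; i < SHA256_DIGEST_LENGTH; i++)
            printf("%02x", digest[i]);
        printf("\n");
        return 0;
    }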

I've been in this situation with an embedded platform that ships as part of the product I work on. It's pretty much got daily security updates, and yet we rarely get hit by any of them because our usage of the platform is like 1% of its functionality.


This is an oft-repeated mantra, but is it really true? I don't have a Linux desktop running here, but try an experiment: start up a desktop environment such as KDE, run "free -m", and see how much is listed under the "shared" heading. Without shared libraries, some extra memory corresponding to some multiple of that number would be used.

To calculate exactly how much memory is saved by shared libraries, you'd need to write a kernel module to walk the internal structures describing physical pages and summing reference counts of used pages. Maybe it's already been done?


some extra memory corresponding to some multiple of that number would be used

Depends. If I have 7 instances of my terminal emulator loaded (say I hadn't discovered tmux yet or something), the loader can share their .rodata and .text segments, and in many cases it does (YMMV; heuristics apply; void where prohibited; etc.). So a lot of things that are in shared libraries right now might still be only loaded into memory once if their binaries are segmented correctly.

The Plan9 people (Plan9 doesn't do dynamic linking) claim that the memory savings they get from skipping the relocation overhead are greater than the memory hit from the times that the same stuff does get loaded multiple times, though obviously always take self-promotion with a grain of salt.

Use case probably matters -- my servers run few processes to begin with, and it's often a lot of versions of the same process, whereas my laptop runs a ton of very different processes. Same sort of argument that makes me happy with udev on my laptop while also very happy with a static /dev tree on my servers.


I agree, sharing will (most probably) still work at the segment level [0]. But this means that if I have two different executables, each statically linking an identical copy of, say, Qt into its .text segment, those copies will not be shared.

[0] In fact, certainly. Program loading works by mmapping the executable read-only into the process's address space; unmodified pages of those (private) file mappings are all backed by the same page-cache pages, which is what keeps them coherent and shared across processes. Sharing the backing storage that way is also the foundation of CoW.


You could do that calculation in userspace with a stock kernel, by summing up the sizes of the non-writable mappings of the various library files in /proc/*/maps, then using mincore() to find out how much of the libraries are resident.


mincore is a system call and does not take a pid parameter. This means it has to be executed by each process individually, which would require injecting code into each running process and executing it in some way.

Unless there's a tool which makes this extremely simple (maybe Intel's Pin?), I believe that writing a kernel module is simpler: the module's init function tallies up the pages and writes the result to the kernel log, then the module exits.


You don't need to run mincore() on target pids - you just need to write a tool that opens(O_RDONLY) and mmaps(PROT_READ) each library file, then calls mincore() on each page in the mapping to find out which pages of the library are loaded shared.

The results of mincore() in one process with a shared mapping of a file are enough to tell you how much of that file is loaded shared system-wide.
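
A minimal sketch of such a tool (it reports residency for a single file; error handling kept short):

    /* fincore-ish sketch: report how much of FILE is in the page cache.
     * Because this is a shared, read-only mapping of the file, the
     * residency mincore() reports is the system-wide state of its pages. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s FILE\n", argv[0]);
            return 1;
        }

        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

        long page = sysconf(_SC_PAGESIZE);
        size_t npages = (st.st_size + page - 1) / page;

        void *map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (map == MAP_FAILED) { perror("mmap"); return 1; }

        unsigned char *vec = malloc(npages);
        if (!vec) { perror("malloc"); return 1; }
        if (mincore(map, st.st_size, vec) < 0) { perror("mincore"); return 1; }

        size_t resident = 0;
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;   /* LSB set = page is resident */

        printf("%s: %zu of %zu pages resident\n", argv[1], resident, npages);
        return 0;
    }

Loop that over every library path that shows up in /proc/*/maps and sum the numbers.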


Just found out that physical page information is exposed through /proc: https://www.kernel.org/doc/Documentation/vm/pagemap.txt
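
For what it's worth, here's a sketch of reading those interfaces directly: it looks up the physical frame of one of its own pages in /proc/self/pagemap and then that frame's system-wide map count in /proc/kpagecount (needs root; on recent kernels the PFN field reads as zero for unprivileged users):

    /* Sketch: virtual address -> PFN -> number of times that physical
     * page is mapped anywhere on the system. */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        long page = sysconf(_SC_PAGESIZE);
        /* Any resident address will do; use this function's own code page. */
        uintptr_t vaddr = (uintptr_t)&main;

        int pm = open("/proc/self/pagemap", O_RDONLY);
        int kc = open("/proc/kpagecount", O_RDONLY);
        if (pm < 0 || kc < 0) { perror("open"); return 1; }

        uint64_t entry;
        if (pread(pm, &entry, sizeof entry, (vaddr / page) * 8) != sizeof entry) {
            perror("pread pagemap");
            return 1;
        }
        if (!(entry & (1ULL << 63))) {          /* bit 63: page present */
            fprintf(stderr, "page not present\n");
            return 1;
        }
        uint64_t pfn = entry & ((1ULL << 55) - 1);  /* bits 0-54: PFN */

        uint64_t count;
        if (pread(kc, &count, sizeof count, pfn * 8) != sizeof count) {
            perror("pread kpagecount");
            return 1;
        }
        printf("vaddr %#lx -> pfn %#llx, mapped %llu time(s)\n",
               (unsigned long)vaddr, (unsigned long long)pfn,
               (unsigned long long)count);
        return 0;
    }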


Ah, good point! Though I think I'll do a dlopen on each file to mimic whatever the loader usually does. I'll do it as a weekend-project :)


In case it helps, here's my little utility (which just shows how much of a file is in core):

https://github.com/keaston/fincore


Are you suggesting that every single calculator, file manager, terminal, editor, package manager and music player ship with their own copy of libQt5* (around 78M) or gtk?


Static linking doesn't pull in the entire library; it pulls in only the parts that are actually used.

That said, Qt and GTK probably should be broken into smaller libraries in a perfect world.


So I take it you're offering to buy me some more RAM, then? Unfortunately, my motherboard is already full at 4GB, so this upgrade would be more expensive than you might expect at first. What about my friends who have older 2GB laptops? Do they get an upgrade too?

I'm joking, of course, but there IS a lot of variation in computer hardware, which makes this kind of broad generalization even more problematic.

Also, memory size is probably not that useful a metric for modern hardware, where the penalty for overflowing the CPU caches can be huge.

edit:

Why statically link when you can prelink(8) instead?

( http://linux.die.net/man/8/prelink )



