
Linux and GCC today have the ability to compile and run fully static executables; I don't understand why this isn't done...





> I don't understand why this isn't done

Because when there's a security update to (say) OpenSSL, it's better for the maintainers of just that library to push an update, as opposed to forcing every single dependent to rebuild & push a new release.


My main issue with this rationale is that, in the vast majority of production environments (at least the ones I've seen in the wild, and indeed the ones I've built), updating dependencies for dynamically-linked dependents is part of the "release" process just like doing so for a statically-linked dependent, so this ends up being a distinction without a difference; in either circumstance, there's a "rebuild" as the development and/or operations teams test and deploy the new application and runtime environment.

This is only slightly more relevant for pure system administration scenarios where the machine is exclusively running software prebuilt by some third-party vendor (e.g. your average Linux distro package repo). Even then, unless you're doing blind automatic upgrades (which some shops do, but it carries its own set of risks), you're still hopefully at least testing new versions and employing some sort of well-defined deployment workflow.

Also, if that "security update" introduces a breaking change (which Shouldn't Happen™, but something something Murphy's Law something something), then - again - retesting and rebuilding a runtime environment for a dynamically-linked dependent v. rebuilding a statically-linked dependent is a distinction without a difference.


I would have agreed with this statement about five years ago. (Even though you would have had to restart all the dependent binaries after updating the shared libs.)
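
(Finding the long-running processes that still have the old copy mapped after such an update is scriptable, by the way; a rough sketch that works on most Linux systems, with the exact library filename varying by distro:)

    $ # each match is the /proc/<pid>/maps of a process still using the replaced library
    $ grep -l 'libssl.*(deleted)' /proc/[0-9]*/maps 2>/dev/null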

Today, with containers becoming increasingly the de facto means of deploying software, it's not so important anymore. The upgrade process is now: (1) build an updated image; (2) upgrade your deployment manifest; (3) upload your manifest to your control plane. The control plane manages the rest.

The other reason to use shared libs is for memory conservation, but except on the smallest devices, I'm not sure the average person cares about conserving a few MB of memory on 4GB+ machines anymore.


> Today, with containers becoming increasingly the de facto means of deploying software

I think that's something of an exaggeration.

Yes, containers are popular for server software, but even then it's a huge stretch to claim they are becoming de facto.


App bundles on MacOS and iOS are basically big containers, though there is some limited external linking through Apple's frameworks scheme.

And obviously video game distribution has looked like this since basically forever as well.


> App bundles on MacOS and iOS are basically big containers, though there is some limited external linking through Apple's frameworks scheme.

There's a file hundreds of megabytes large containing all the dynamically-linked system libraries on iOS to make your apps work.


Video games do not run on/as containers. Quite the opposite, in fact.

In addition to pjmlp's list, Steam is pushing toward this for Linux games (and one could argue that Steam has been this for as long as it's been available on Linux, given that it maintains its own runtime specifically so that games don't have to take distro-specific quirks into account).

Beyond containers / isolated runtime environments, the parent comment is correct about games (specifically of the console variety) being historically nearly-always statically-linked never-updated monoliths (which is how I interpreted that comment). "Patching" a game after-the-fact was effectively unheard of until around the time of the PS3 / Xbox 360 / Wii (when Internet connectivity became more of a norm for game consoles), with the sole exception of perhaps releasing a new edition of it entirely (which would have little to no impact on the copies already sold).


Kind of.

They do on Xbox, Switch, iOS, and Android sandboxes.


> Today, with containers becoming increasingly the de facto means

This assertion makes no sense at all and entirely misses the whole point of shared/dynamic libraries. It's like a buzzword is a magic spell that makes some people forget the entire history and design requirements up to that very moment.


Sometimes buzzwords make sense, in the right context. This was the right context.

Assuming you use containers, you're unlikely to log into them and keep them up to date and secure by running apt-get upgrade.

The most common workflow is indeed: build your software in your CI system, in the last step create a container with your software and its dependencies. Then update your deployment with a new version of the whole image.
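
As a rough sketch of that workflow (the image name, tag, and deployment name are all made up; your CI system runs the equivalent of):

    $ docker build -t registry.example.com/myapp:1.2.4 .    # rebuild the image with updated deps
    $ docker push registry.example.com/myapp:1.2.4
    $ kubectl set image deployment/myapp myapp=registry.example.com/myapp:1.2.4    # roll the deployment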

A container image is for all intents and purposes the new "static binary".

Yes, technically you can look inside it, yes technically you can (and you do) use dynamic linking inside the container itself.

But as long as the workflow is the one depicted above, the environment no longer has the requirements that led to the design of dynamic linking.

It's possible to have alternative workflows for building containers: you could fiddle with layers and swap an updated base OS under a layer containing your compiled application. I don't know how common that is, but I'm sure somebody will want/have to do it.

It all boils down to whether developers still maintain control over the full deployment pipeline as containers penetrate the enterprises (i.e. whether we retain the "shift to the left", another buzzword for you).

Containers are not just a technical solution, they are the embodiment of the desire of developers to free themselves from the tyranny of filing tickets and waiting days to deploy their apps. But that leaves the security departments in enterprises understandably worried as most of those developers are focused on shipping features and often neglecting (or ignoring) security concerns around things that live one layer below the application they write.


Shared libraries have largely proven that they aren't a good idea, which is why containers are so popular. Between conflicts and broken compatibility between updates, shared libraries have become more trouble than they are worth.

I think they still make sense for base-system libraries, but unfortunately there is no agreed upon definition of 'base-system' in the wild west of Linux.


Some of us use linux as a desktop environment, and like having the security patches be applied as soon as the relevant package has updated.

As a user of the Linux desktop, I really love it when library updates break compatibility with the software I use too. Or can't be installed because of dependency conflicts.

Containers are popular because shared libraries cause more trouble than they are worth.


Containers most likely wouldn't have existed if we had a proper ecosystem around static linking and resolution of dependencies. Containers solve the problem of the lack of executable state control, mostly caused by dynamic linking.

More broadly, containers solve the problem of reproducibility. No longer does software get to vomit crap all over your file system in ways that make reproducing a functioning environment frustrating and troublesome. They have the effect of side-stepping the dependencies problem, but that isn’t the core benefit.

But the images themselves are not easily reproducible with standard build tooling.

True—but that's far less of a problem, because it rarely occurs unexpectedly and under a time crunch.

Diffing two docker images to determine the differences between builds would be far less onerous than attempting to diff a new deployment against a long-lived production server.


Dynamic linking isn't the issue. Shared libraries are the issue. You could bundle a bunch of .so files with your executable & stick it in a directory, and have the executable link using those. That's basically how Windows does it, and it's why there's no "dependency hell" there despite having .dlls (dynamically linked libraries) all over the place.
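
For what it's worth, you can get that Windows-style layout on Linux too, via an rpath relative to the executable; a minimal sketch with made-up library and directory names:

    $ mkdir -p myapp/libs && cp libfoo.so* myapp/libs/    # ship the exact .so versions you tested with
    $ gcc -o myapp/myapp main.c -Lmyapp/libs -lfoo \
          -Wl,-rpath,'$ORIGIN/libs'                       # search next to the executable at runtime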

Shared libraries are shared (obviously) and get updated, so they're mutable. Linux systems depend on a substantial amount of shared mutable state being kept consistent. This causes lots of headaches, just as it does in concurrent programming.


And the reason we're using containers in the first place is precisely because we've messed up and traded shared libs for having a proven-interworking set of them, something that can trivially be achieved using static linking.

Actually the main selling point of containers has nothing to do with "proven interworking", but the ability to deploy and run entire applications in a fully controlled and fully configurable environment.

Static libraries do nothing of the sort. In fact, they make it practically impossible to pull it off.

There's far more to deploying software than mindlessly binding libraries.


On Windows, I don't need to use Docker in order to run a program in a reproducible way. I just download a program, and in 90% of cases it "just works" whether I'm running Windows 10, Windows 8, or the decade-old Windows 7.

Furthermore, installing that program will (again, in 90% of cases at least) not affect my overall system configuration in any way. I can be confident that all of my other programs will continue to work as they have.

Why? Because any libraries which aren't included in the least-common-denominator version of Windows are included with the download, and are used only for that download. The libraries may be shipped as DLLs next to the executable, which are technically dynamic, but it's the same concept—those DLLs are program-specific.

This ability is what I really miss when I try to switch to desktop Linux. I don't want to set up Docker containers for random desktop apps, and I don't want a given app to affect the state of my overall system. I want to download and run stuff.

---

I realize there's a couple of big caveats here. Since Windows programs aren't sandboxed, misbehaving programs absolutely can hose a system—but at least that's not the intended way things are supposed to work. I'm also skipping over runtimes such as Visual C++, but as I see it, those can almost be considered part of the OS at this point. And I can have a ridiculous number of versions of MSVC installed simultaneously without issue.


> On Windows, I don't need to use Docker in order to run a program in a reproducible way. I just download a program, and in 90% of cases it "just works" whether I'm running Windows 10, Windows 8, or the decade-old Windows 7.

One program? How nice. How about 10 or 20 programs running at the same time, and communicating between themselves over a network? And is your program configured? Can you roll back changes not only in which versions of the programs are currently running but also in how they are configured?

> This ability is what I really miss when I try to switch to desktop Linux. I don't want to set up Docker containers for random desktop apps,

You're showing some ignorance and confusion. You're somehow confusing application packages and the natural consequence of backward compatibility with containers. In Linux, deploying an application is a solved problem, unlike Windows. Moreover, Docker is not used to run desktop applications at all. At most, tools like Canonical's Snappy are used, which enable you to run containerized applications in a completely transparent way, from installation to running.


> the ability to deploy and run entire applications in a fully controlled and fully configurable environment

But isn't the reason to have this fully controlled and fully configurable environment to have a proof of interworking? Because when the environment differs in any way, you can (and people already do) say that it's not supported.


> But isn't the reason to have this fully controlled and fully configurable environment to have a proof of interworking?

No, because there's far more to deploying apps than copying libraries somewhere.


> Actually the main selling point of containers has nothing to do with "proven interworking", but the ability to deploy and run entire applications in a fully controlled and fully configurable environment.

Which is exactly the same selling point as for static linking.


Based on my experience this is very rarely the case unless you have an extremely disciplined SecOps team.

> Based on my experience this is very rarely the case

You must have close to zero experience then, because that's the norm for any software that depends on, say, third-party libraries that ship with an OS/distro.

Recommended reading: Debian's openssl package.

https://tracker.debian.org/pkg/openssl


You are talking about a FOSS project, I am talking about a company that has a service that uses OpenSSL in production.

These are not diametrically opposed. Your company can have a service that uses OpenSSL in production that runs on Debian to automatically take advantage of Debian patches and updates if it's linked dynamically to the system provided OpenSSL.

You can either employ an extremely disciplined SecOps team to carefully track updates and CVEs (you'd need this whether you're linking statically or dynamically) or you can use e.g. Debian to take advantage of their work to that end.


Every single company that I used to work for had an internal version of Linux that they approved for production. Internal release cycles are disconnected from external release cycles. On top of that, some of these companies were not using system-wide packages at all; you had to reference a version of each package (like OpenSSL) during your build process. We had to do emergency patching for CVEs and bump the versions in every service. This way you can have 100% confidence that a particular service is running with a particular version of OpenSSL. This process does not depend on Debian's (or other FOSS vendors') release cycles, and the dependencies are explicit, therefore the vulnerability assessment is simpler (as opposed to going to every server and checking which version is installed). Don't you think?

If you need that level of confidence - sure. But it's going to cost a lot more resources and when you're out of business your customers are fully out of updates. I wouldn't want to depend on that (then again a business customer will want to maintain a support contract anyway).

Isn't a containerized solution a good compromise here? You could use Debian on a fixed major release, be pretty sure what runs and still profit from their maintenance.


What I'm saying is that the only way you can get away with not having an "extremely disciplined SecOps team" is to depend on someone else's extremely disciplined SecOps team. Whether you link statically or dynamically is orthogonal.

> Every single company that I used to work for had an internal version of Linux that they approved for production.

I can't deny your experience, but meanwhile I've been seeing plenty of production systems running Debian and RHEL, and admins asking us to please use the system libraries for the software we deployed there.

> Internal release cycles are disconnected from external release cycles.

That seems to me like the opposite of what you'd want if you want to keep up with CVEs. If you dynamically link system libraries you can however split the process into two: the process of installing system security updates doesn't affect your software development process for as long as they don't introduce breaking changes. Linking statically, your release cycles are instead inherently tied to security updates.

> We had to do emergency patching for CVEs and bump the versions in every service.

What is that if not tying your internal release cycles to external release cycles? The only way it isn't is if you skip updates.

> This process does not depend on Debian's (or other FOSS vendors') release cycles, and the dependencies are explicit, therefore the vulnerability assessment is simpler (as opposed to going to every server and checking which version is installed). Don't you think?

I don't know, going to every server to query which versions of all your software they are running seems similarly cumbersome. Of course, if you aren't entirely cowboying it you'll have automated the deployment process whether you're updating Debian packages or using some other means of deploying your service. Using Debian also doesn't make you dependent on their release cycles. If you feel like Debian isn't responding to a vulnerability in a timely manner, you can package your own version and install that.


> You are talking about a FOSS project

I'm talking about the operating system that's pretty much a major component of the backbone of the world's entire IT infrastructure, whether it's directly or indirectly through downstream distros that extend Debian, such as Ubuntu. Collectively they are reported to serve over 20% of the world's websites, and consequently they are the providers and maintainers of the OpenSSL that's used by them.

If we look at containers, Docker Hub lists that Debian container images have been downloaded over 100M times, and Ubuntu container images have been downloaded over 1B times. These statistics don't track how many times derived images are downloaded.


I know the ability exists, but I'm pretty sure that it's not exactly easy to get it working. Last time I tried, it immediately failed because my distribution wasn't shipping .a files (IIRC) for my installed libraries. There's a lot of little things that don't quite work because nobody's using them so they're harder to use so nobody uses them...

It's easy to get working provided that you compile _everything_ from source. You can either omit glibc from this, or accept that it will still dynamically load some stuff at runtime even when "statically" linked, or switch to musl. A nice benefit is that LTO can then be applied to the entire program as a whole.
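
A minimal sketch of the musl route, assuming the musl-gcc wrapper (e.g. from Debian's musl-tools) is installed:

    $ musl-gcc -static -flto -O2 -o hello hello.c    # whole program statically linked against musl
    $ ldd hello
            not a dynamic executable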

Yep, exactly this. And quirks of glibc and friends make fully static compilation likely to produce odd failures, unfortunately.

Glibc does not really support static linking.

Kind of my point — so much software depends on it, it’s difficult to statically link more things.

Musl is a pretty good replacement, I have been using it for years without any troubles.

I like Musl, but it is a source of pain at times too: https://github.com/kubernetes/kubernetes/issues/64924 https://github.com/kubernetes/kubernetes/issues/33554

Admittedly you could put that on the Kubernetes folks, but the same problem doesn't exist with glibc.


I don't see how zig cc would help with that. Your distribution probably also doesn't ship all the source files for your packages either, and there's no other way to statically link.

Gentoo does.

I've used Gentoo for almost a decade now, and no, that's not true. emerge doesn't just randomly keep source files on disk, and certainly not in a form easy to link to. In fact, Gentoo is worse than Debian for static linking, because not all packages have IUSE=static-libs. If it doesn't, you need to patch that ebuild and potentially many dependencies to support it. On the other hand, on Debian, the standard is for -dev packages to come with both headers and static libraries.

N°1 most harmful post on the internet : https://akkadia.org/drepper/no_static_linking.html

I am convinced that Drepper's insistence on dynamic linking has set Linux desktop usability and developer friendliness back literal decades.


I work on an embedded linux system that has 256 MB of RAM. That can get eaten up really fast if every process has its own copy of everything.

~15 years ago my "daily driver" had 256MB of RAM and it was perfectly usable for development (native, none of this new bloated web stuff) as well as lots of multitasking. There was rarely a time when I ran out of RAM or had the CPU at full usage for extended periods.

Now it seems even the most trivial of apps needs more than that just to start running, and on a workstation less than a year old with 4 cores of i7 and 32GB of RAM, I still experience lots of lag and swapping (a fast SSD helps, although not much) doing simple things like reading an email.


I'm on a Mac with 32 GB of Ram.

According to Activity Monitor, right now:

• 4.26 GB are being used by apps

• 19.52 GB are cached files

• 8.22 GB are just sitting idle (!)

Now, I'm not running anything particularly intensive at the moment, and I make a point of avoiding Electron apps. I also rebooted just a few hours ago for an unrelated reason.

But the fact is that I've monitored this before—I very rarely manage to use all my RAM. The OS mostly just uses it to cache files, which I suppose is as good a use as any.


> and I make a point of avoiding Electron apps

I do that personally too, but in a work environment that is unfortunately not always possible --- and also responsible for much of the RAM usage too.


Slack is the biggest culprit IME. If there was a native client, I'd take it like a shot


> ~15 years ago my "daily driver" had 256MB of RAM and it was perfectly usable for development

What I failed to mention was that the rootfs is also eating into that (ramdisk). In your case I'm guessing your rootfs was on disk.


Oh my god, just try running a modern OS on a spinning rust drive. It's ridiculous how slow it is. It's obvious that modern developers assume everything is running on SSD.

Are you sure? I've been running Linux for a long time with no page file, from 4GB to 32GB (the amount of RAM I have now), and have literally only run out of RAM once (and that was because of a bug in an ML program I was developing). I find it very hard to believe that you experience any swapping at all with 32GB, much less "lots".

You've likely not experienced the amazing monstrosity that is Microsoft Teams:

https://answers.microsoft.com/en-us/msoffice/forum/all/teams...

There's a screenshot in there showing it taking 22GB of RAM. I've personally never seen it go that high, but the 10-12GB of RAM that I have seen is absolutely ludicrous for a chat app. Even when it's initially started it takes over 600MB. Combine that with a few VMs that also need a few GB of RAM each, as well as another equally-bloated Electron app or two, and you can quickly get into the swapping zone.


How is that possible, 22GB? Fucking electron. You would think, at least, that Microsoft would code a fucking real desktop app.. I hate web browsers.

I also experience the same thing with Mattermost (the client also being an Electron app). The memory bloat usually comes from switching back and forth from so many channels, scrolling up to load more chat history, and lots and lots of image attachments (and of course, the emoticons).

> scrolling up to load more chat history, and lots and lots of image attachments (and of course, the emoticons).

I remember comfortably browsing webpages with lots of large images and animated GIFs in the early 2000s, with a fraction of the computing power I have today. Something has become seriously inefficient with browser-based apps.


You said yourself you managed to find a case where you ran out of memory. Why do you find it "very hard to believe", knowing nothing about his use cases, that his job doesn't involve exactly the sort of situations that consume vast amounts of RAM? Why do people insist with such conviction that "it doesn't happen to me, therefore it's inconceivable that it happens to someone else, doing something totally different than what I'm doing, into which I have no insight"? Baffling.

> Why do you find it "very hard to believe", knowing nothing about his use cases, that his job doesn't involve exactly the sort of situations that consume vast amounts of RAM?

Probably because the GGP said they experience lag while "doing simple things like reading an email." Now, maybe GGP meant to add "while I'm sequencing genes in the background", but since that was left out I can see how it would be confusing! :)


That's fair. Good point.

Then don't statically link. "Embedded systems" requirements shouldn't dictate "Desktop" or "Server" requirements.

You should look into FDPIC as a format to store your binaries in. I think it might lessen your concerns.

So dynamic linking makes sure that each dependency is loaded into memory just once?

Could someone estimate how much software nowadays is bloated by duplicated modules?


Note that when using static linking, you don't get a copy of everything, just everything you actually use.

It doesn't alter the fundamental point: shared libraries save both persistent storage and runtime memory.


> Note that when using static linking, you don't get a copy of everything, just everything you actually use.

Which is a significant fraction of everything, even if you call something as simple as printf.

> It doesn't alter the fundamental point: shared libraries save both persistent storage and runtime memory.

I fail to see the argument for this. Dynamic linking deduplicates dependencies and allows code to be mapped into multiple processes "for free".
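
You can eyeball that sharing on a running box; a rough sketch (the libc filename varies by distro and version, and without root you'll only see your own processes):

    $ grep -l 'libc[.-]' /proc/[0-9]*/maps 2>/dev/null | wc -l    # processes sharing one mapped copy of libc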


Have you measured it? How much is dynamic linking saving you? How many processes are you running on embedded systems with 256MB of RAM?

Ok, so I just measured with a C "hello world".

My dynamically-linked executable is 8296 bytes on disc. My statically-linked executable is 844,704 bytes on disc.

So if I had a "goodbye world" program as well, that's a saving of about 800KB on disc.

Now one can argue the economics of saving a bit under a megabyte in a time where an 8GB microSD card costs under USD5 in single quantities, but you can't argue that it's a relatively big saving.

At runtime, the dynamic version uses (according to top) 10540 KB virtual, 540 KB resident, and 436 KB shared. The static version uses 9092 KB virtual, 256 KB resident, and 188 KB shared.

I haven't investigated those numbers.


256MB of RAM is a fairly large amount–this is how much iPhone 3GS had, for instance. It relied on dynamic linking to system libraries heavily and ran multiple processes.

That proves the point: with multi-GiB memory nowadays you can fit many times over all the space that the iPhone saved using dynamic linking.

I would rather not be limited on my laptop to iPhone apps from ten years ago.

My second sentence was arguing for dynamic linkage (I called them "shared libraries", but I think that's a fairly common nomenclature).

Fear not, static linking is back on the rise!

https://stackoverflow.com/questions/3430400/linux-static-lin...

I've also linked this Zig post into that list (and happy to add further languages if you can provide a link that shows that they have good out-of-the-box static linking support).


Rust has some really good static linking support. If you compile with the `musl` targets (e.g. x86_64-unknown-linux-musl), it will give you a fully statically linked binary.
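
Roughly, for anyone who hasn't tried it (the crate name here is made up):

    $ rustup target add x86_64-unknown-linux-musl
    $ cargo build --release --target x86_64-unknown-linux-musl
    $ file target/x86_64-unknown-linux-musl/release/myapp    # should report it as statically linked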

LD_PRELOAD is really useful, especially if you want to change memory allocators without having to recompile.
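
For example, something like this swaps in jemalloc without touching the binary (the .so path is distro-specific):

    $ LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 ./myapp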

I never realised people were moaning about shared libraries.


Also if you want to hook all the `open` style calls to make them work in a very tight sandbox :)

(e.g. I have Firefox running under Capsicum: https://bugzilla.mozilla.org/show_bug.cgi?id=1607980)


Dynamic linking is what has allowed avoiding even longer backwards-compatibility commitments and even more cruft, like Windows has.

Just for readers, do you mean avoiding incompatibility?

Whether they did or not, they spoke truth. Linux's (userland, not kernel) backward compatibility is ridiculously bad unless you're compiling from source. This is not the case on Windows.

No, I mean Linux has been able to avoid long-term backwards compatibility.

By that logic, from the late '90s to the late aughts it's unlikely anyone could have used a desktop for any work. 256MB of memory was a lot of memory not so long ago.

Looking at how a browser, an IDE, and a few compilation processes will gladly chew through 8GB of memory... it’s not necessarily horrible, but this is a modern contrivance.


One thing I think people forget about is ASLR. What symbols are you going to shuffle? At least with dynamically linked dependencies the linker can just shove different shared objects into different regions without much hassle.

Others have mentioned the other points: runtime loading (plugins), CoW deduplication, and thus less memory and storage use.


In addition to the other excellent reasons mentioned here, there's also the fact that some libraries deliberately choose to use runtime dynamic linkage (dlopen) to load optional/runtime-dependent functionality.

If you want to make a program that supports plugins, you have only two real options: non-native runtimes or dynamic linking. And the latter gets you into a lot of trouble quickly. The former trades performance and memory usage for ease of use and a zoo of dependencies.
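
A minimal sketch of the dlopen route (the plugin path and entry-point name are made up):

    #include <dlfcn.h>
    #include <stdio.h>

    int main(void) {
        /* Load an optional plugin at runtime; the program still runs if it's absent. */
        void *h = dlopen("./plugins/frobnicate.so", RTLD_NOW | RTLD_LOCAL);
        if (!h) {
            fprintf(stderr, "no plugin loaded: %s\n", dlerror());
            return 0;
        }
        /* Look up the (hypothetical) entry point by name and call it. */
        int (*plugin_init)(void) = (int (*)(void))dlsym(h, "plugin_init");
        if (plugin_init)
            plugin_init();
        dlclose(h);
        return 0;
    }

(Compile with -ldl on older glibc; newer glibc has folded libdl into libc proper.)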

> some libraries deliberately choose to use runtime dynamic linkage (dlopen) to load optional/runtime-dependent functionality.

Also known as plugins.

It's not a design flaw, it's a feature.


Ironically that is how they have been implemented since the dawn of time.

Dynamic linking was added around the Slackware 2.0 timeframe.


You can't statically compile in glibc, right?

sure you can:

    $ cat hello.c
    #include <stdio.h>

    int main() {
        printf("hello world!\n");
    }
    $ gcc -o hello hello.c
    $ ldd hello
            linux-vdso.so.1 (0x00007ffff9da0000)
            libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fed449d0000)
            /lib64/ld-linux-x86-64.so.2 (0x00007fed45000000)
    $ gcc -o hello hello.c -static
    $ ldd hello
            not a dynamic executable

You can, but it is not supported. Certain features (NSS, iconv) will not work.

> Certain features (NSS, iconv) will not work.

If you're the kind of person who wants static linking then you really don't want these features.

The real problem is that statically linked programs under Linux don't (didn't?) support VDSO, which means that syscalls like gettimeofday() are suddenly orders of magnitude slower.

In the end, we had to do a kind of pseudo-static linking - link everything static except glibc.
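
The usual incantation for that is roughly the following (the statically linked libraries named here are just examples):

    $ gcc -o app app.o -Wl,-Bstatic -lssl -lcrypto -lz -Wl,-Bdynamic
    # libraries listed between -Bstatic and -Bdynamic are pulled in as .a archives;
    # glibc (and anything after -Bdynamic) stays dynamically linked as usual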


I think the vDSO page is mapped into every process regardless of how the program is linked, although you may have difficulty using it.
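
(Easy enough to check from a shell; every process's maps should show a [vdso] entry regardless of how its binary was linked:)

    $ grep vdso /proc/$$/maps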

vDSO is supported with static linking in musl libc, at least.

I don't think the parent comment's point was if this was technically possible.

glibc is GPL licensed, and the GPL explicitly forbids statically linking to it unless your code is GPL too.

Thus any non-GPL project has its license tainted by the GPL if you statically link it.

It's not a technical limitation, it's a legal one.


This is false. The license for glibc is the LGPL, not the GPL, and the LGPL has an exception to allow static linking without the whole code having to be under the LGPL, as long as the .o files are also distributed to allow linking with a modified glibc ("As an exception to the Sections above, you may also combine or link a "work that uses the Library" with the Library to produce a work containing portions of the Library, and distribute that work under terms of your choice [...] and, if the work is an executable linked with the Library, with the complete machine-readable "work that uses the Library", as object code and/or source code, so that the user can modify the Library and then relink to produce a modified executable containing the modified Library.")

Sounds like I got conned by my (poor) memory. I should have re-googled this before posting, thanks for the correction.

Yes, but for many projects it means no static linking in practice.

This isn't a legal limitation.

It's FUD.

See here -

https://www.gnu.org/licenses/gpl-faq.html#LGPLStaticVsDynami...


Thanks for the link, that's very useful. I should have re-googled this stuff instead of commenting "from memory", sorry for spreading the FUD :/

Also, right above that: https://www.gnu.org/licenses/gpl-faq.html#GPLStaticVsDynamic

Dynamically linking a GPL library is the same as statically linking a GPL library; the resulting executable must be GPL-licensed.


That's also a good reason to avoid glibc and switch to a musl-based system.

I might be uneducated but wasn't there a "system library exception" or something like that in GPL to prevent these problems?


