
On building portable Linux binaries - sagargv
http://sagargv.blogspot.com/2014/09/on-building-portable-linux-binaries.html
======
hamburglar
I've always thought it would be interesting to have a statically linked
executable that embeds externally visible, signed copies of the shared
libraries it was linked against, so that the dynamic linker could recognize
when two such statically linked processes can still share the same pages in
memory, because they both linked in the same lib+version. In other words, you
get the memory and io savings of dynamic libs, but none of the dependency
resolution problems (and none of the disk space savings).

~~~
ahomescu1
I think the same could be done much easier using KSM [1] (Kernel SamePage
Merging), which Linux now has.

1.
[https://en.wikipedia.org/wiki/Kernel_SamePage_Merging_%28KSM...](https://en.wikipedia.org/wiki/Kernel_SamePage_Merging_%28KSM%29)

~~~
hamburglar
Except KSM is not as well-suited to this problem as a dynamic linker would be,
since KSM doesn't have as much insight into the semantics of the pages it's
analyzing. I don't know all the details of how KSM works, but it must apply
heuristics to _predict_ whether a page is going to get written to, whereas the
dynamic linker knows exactly which sections of dynamic libs can be shared. Not
to mention that KSM works less efficiently by recognizing duplicate pages
after the fact and merging them, rather than skipping the duplication in the
first place.

That's not to say that the KSM approach wouldn't be a big benefit (and you're
right, it's much easier given that it's completely transparent), but it's
definitely a less precise way to approach this specific problem.

~~~
ahomescu1
> Except KSM is not as well-suited to this problem as a dynamic linker would
> be, since KSM doesn't have as much insight into the semantics of the pages
> it's analyzing.

Why would it need those insights? If two pages are identical, KSM de-
duplicates them. The linker-assisted method would do the same, only earlier.

> [...] it must apply heuristics to predict whether a page is going to get
> written to, whereas the dynamic linker knows exactly which sections of
> dynamic libs can be shared.

I'm not sure which heuristics you're referring to. I think KSM de-duplicates
the pages and sets the single remaining copy as copy-on-write, to detect
writes (I wouldn't call this a heuristic). Both KSM and the dynamic linker
would only be able to de-duplicate entire pages, since that's the granularity
at which the OS shares memory between processes.

> Not to mention that KSM works less efficiently by recognizing duplicate
> pages after the fact and merging them, rather than skipping the duplication
> in the first place.

OK, fair enough.

~~~
hamburglar
> I'm not sure which heuristics you're referring to.

As I said, the details are somewhat hazy for me, but my understanding is that
KSM maintains a notion of how recently and how frequently a page changes, in
order to determine whether it's a good candidate for deduplication or whether
that would just be a waste of time that causes unnecessary page faults. I
could be completely wrong on this, but I don't think it's quite as simple as
"if two pages are identical, KSM de-duplicates them."

------
ryao
You can use Gentoo Prefix for this:

[https://www.gentoo.org/proj/en/gentoo-alt/prefix/](https://www.gentoo.org/proj/en/gentoo-alt/prefix/)

It builds everything except libc. Support for building libc is under
development by the project. I used an early version of it to provide a build
environment for the ZFSOnLinux kernel modules on CoreOS:

[https://github.com/ClusterHQ/flocker/blob/zfs-on-coreos-tuto...](https://github.com/ClusterHQ/flocker/blob/zfs-on-coreos-tutorial-667/docs/experimental/zfs-on-coreos.rst)

------
kpc
Another option:

Most/all modern distributions can run programs built with lsbcc[1] from the
Linux Standard Base workgroup. The lsbdev environment links your program
against "stub" .so files that advertise appropriate symbol versions and
prevents it from referencing library symbols that are not defined in the
standard.

It can be somewhat of a hassle to get programs to build in this environment,
but it greatly reduces the chances of finding an incompatibility when the
program is later run under an unexpected Linux distribution. LSB includes an
app checker[2] which can generate reports[3] showing compatibility with
various distributions.

[1]
[https://wiki.linuxfoundation.org/en/Book/HowToDevel](https://wiki.linuxfoundation.org/en/Book/HowToDevel)

[2]
[http://ispras.linuxbase.org/index.php/About_Linux_Applicatio...](http://ispras.linuxbase.org/index.php/About_Linux_Application_Checker)

[3]
[http://ispras.linuxbase.org/index.php/File:LAC_TestReportDis...](http://ispras.linuxbase.org/index.php/File:LAC_TestReportDist.png)

------
davidgerard
By far the most reliable way I've found to run old binaries on Linux is ... to
run a Windows binary under Wine.

We have the portable all-free-software binary handler, yay! It just has Win32
in the middle ...

~~~
userbinator
Perhaps it's a side-effect of the open-source (or "source available but under
a restrictive license") nature of *nix, and of the larger number of systems it
runs on, that has led to the "just compile the source on your system" way of
doing things: it makes the source more portable, but customises the binaries
to the specific system(s) they're compiled for. Hardcoding the path to the
dynamic linker in the executable certainly doesn't help...

It's definitely far easier to make small, self-contained binaries for, e.g.,
x86 Windows that will work from Win95 onwards than for all x86 PC Linux
distros of the last 20 years. A binary compiled on one version of Windows, if
linked only against the relatively stable set of system libraries and APIs,
and not needing anything else, will simply "just work" with no installation,
which I think is one of the nicer aspects of the Windows platform; the idea of
"portable apps" seems far less developed on Linux.

(I suppose you could also use Java...)

~~~
xorcist
I keep hearing that, but Linux has quite good backwards compatibility. I run a
couple of 15 year old statically built executables on Linux, not because I
have to but simply because I've kept them around. I believe Linux also keeps
optional support for pre-ELF binaries, and that's even older. So I think Linux
fares quite well in this regard.

Most things that create problems are auxiliary programs that people no longer
keep around, such as mail transport software that has changed behaviour, or
protocols that have changed. Those are things that would give you trouble on
any platform.

~~~
dottrap
Linus Torvalds was just at DebConf14 and specifically addressed how broken
shipping application binaries for the Linux desktop is.

[http://www.youtube.com/watch?v=5PmHRSeA2c8](http://www.youtube.com/watch?v=5PmHRSeA2c8)

The most relevant parts start at about 6min and 41min.

Highlights:

- .deb vs .rpm misses the whole point. The problem is that application writers
just want to ship an application binary for "Linux", but that is a nightmare.

- Except for the kernel, which strives for ABI stability, everything else in
the Linux distros constantly breaks binary compatibility, including the most
important library, glibc.

- Package maintainers are forced to use shared libraries for everything, even
packages that are unstable and not widely used, which means apps will break.

- You can't install packages under these systems as non-root.

- He ships binaries for his SCUBA diving app for Windows and OS X. He only
ships source for Linux. That's sad.

- Static linking is a possible solution, but that is sad too.

(For the record, he points to Valve as a potential savior but says they will
probably do the sad thing of static linking. They aren't doing quite this.
They ship a set of dynamic libraries called Steam Runtime that all Steam games
can draw from. Better than static linking, but still kind of sad.)

At the 41-minute segment, a user celebrates the "good backwards compatibility"
for a 19-year-old binary he still has, and Linus points out that this is
"patting ourselves on the back" and doesn't address the real compatibility
issues, which are with modern binaries.

I can attest to the glibc hell. Another simple example is clock_gettime().
Just trying to ship a binary between Ubuntu 14.04 and 12.04 is enough to make
you bang your head. (I think this problem appears between 12 and 13 too.) On
one version it is found in glibc, but on the other it is in librt, which
prevents the same binary from working on both.

Linus also criticizes Java (mentioned in the parent). That didn't work for him
either.

As for Windows, Windows still has DLL hell. It's sad that Microsoft won't ship
their Visual Studio common C and C++ runtimes with the OS (MSVC*.dll), even
though they finally put version numbers in the file names now. And they make
it harder than it should be to deploy your app with them. However, this is
still a vast improvement over what Linux distros are doing (as Linus points
out).

(edit: formatting)

------
fineIllregister
I know it's not a panacea, but the Open Build Service is helpful in building
packages for multiple distributions.

------
giancarlostoro
I recently started using Fedora: love it, hate the lack of software. I could
always contribute by packaging software that's not yet available, like e.g.
Atom, but in the time I'd waste learning to do that I could have installed
Ubuntu. I'll keep it for now, though. Maybe I'll figure out how to package and
somehow release some lovely RPMs.

~~~
loudmax
I wonder if containers on the desktop would be useful for this kind of
situation. You could have containers for different distributions, or even
different versions of a distribution. For example, one for Ubuntu 14.04 and a
different one for 13.10. The containers wouldn't do much, other than take up
filesystem space, until they're needed.

This wouldn't work if the software requires a kernel module that isn't
packaged for your distribution. It also wouldn't be particularly useful for a
file system navigator or anything else that needs to see the rest of the
system. That should still leave quite a bit of useful software.

~~~
pyre
> I wonder if containers on the desktop would be useful for this kind of
> situation.

Wouldn't that negate the benefits of things like shared libraries?

~~~
sarnowski
[http://harmful.cat-v.org/software/dynamic-linking/](http://harmful.cat-v.org/software/dynamic-linking/)

...comes to mind. Shared libraries and their advantages are, at the very
least, controversial.

~~~
pyre
What about the benefit of being able to (e.g.) update OpenSSL across the
board? Imagine if every application had needed its own Heartbleed update
because every application had its own copy of OpenSSL. How would that be
preferable?

------
mwilliamson
I made a tool called Whack [1] as an attempt to solve a small part of the
problem. The idea was that you could write scripts that would build
statically-linked versions of binaries, such as Apache HTTPD [2]. Then, you
could run a command like:

    whack install git+https://github.com/mwilliamson/whack-package-apache2-mod-php5 apps/apache2

to install to any location you wanted, and the installation would still work
if you mv'd the entire directory somewhere else, or even onto another
distribution (provided it didn't have an older libc).

I never really pushed the idea further, since I think doing a decent job
requires a lot of time and effort, and it solves something that is a minor
annoyance for me rather than a significant pain. I do still use it when I need
to quickly get a WordPress server up and running for development, since it's
one command to deploy from scratch.

It feels to me like there's lots of scope for something similar to apt-get
that builds and installs static binaries (or, at the very least, lets you
install to different directories without root, similar to, say, virtualenv for
Python or RVM for Ruby).

EDIT: actually, from a quick glance, it seems like Gentoo Prefix [3], as
mentioned by ryao, allows you to install things as you normally would on a
Gentoo machine, but without being root and with a prefix of your choosing.

[1]
[https://github.com/mwilliamson/whack](https://github.com/mwilliamson/whack)

[2]
[https://github.com/mwilliamson/whack-package-apache2-mod-php...](https://github.com/mwilliamson/whack-package-apache2-mod-php5)

[3]
[https://www.gentoo.org/proj/en/gentoo-alt/prefix/](https://www.gentoo.org/proj/en/gentoo-alt/prefix/)

------
IshKebab
It's sad that this is still a problem. Every time it comes up, the old "just
open source it", "just dynamically link everything", "just make a package for
every distribution and version of libc" people come out and stop any sort of
progress.

Autopackage gave up because of this lack of progress.

So we are left with two equally awful "solutions":

1. Do all your development on years-old Linux distros.

2. Use Docker and essentially package an entire distro with your app.

This is never a problem on Windows.

~~~
e12e
> ... never a problem on Windows

Hello? DLL hell? Outdated DLLs bundled with an app? This is essentially _the_
problem distros solve: a fixed target against which to compile.

~~~
angersock
Erm, no?

Try installing a copy of Half-Life 1 from '98 on a modern machine (go on, I'll
wait).

Note that it runs just fine, other than being confused if you've got more than
4GB of RAM (which you can fake and fix in other ways).

Windows _had_ a lot of DLL issues, but the VC redistributables and vendors
packaging their own blessed libraries solve almost all of those issues. I'll
take that over the crackheaded Linux story any day.

