As input to a linker, Mach-O and ELF files look pretty similar: they both split their contents into named "sections", and then there's a symbol table, which is basically a list of (name, address) pairs.
In C code, each symbol generally represents (the start of) a separate function or variable. All references between them are explicitly marked as relocations; thus, the linker should be free to reorder them, and any function/variable that isn't explicitly referenced is unused and can be removed (unless you're linking a shared library and it's meant to be publicly exported from that).
On the other hand, in assembly code, symbols are not necessarily independent. You can have things like a function that "falls through" into another function. For example, here's a hypothetical assembly file that implements both bzero() and memset(), and has the former fall through into the latter:
// void bzero(void *s, size_t n)
// (zeroes memory)
_bzero:
    // Set up arguments for memset
    mov r2, r1  // 3rd argument to memset = 2nd argument to bzero
    mov r1, #0  // 2nd argument to memset is 0
    // (1st argument to memset = 1st argument to bzero, no move needed)
    // Fall through into memset

// void *memset(void *b, int c, size_t len)
_memset:
    ...memset implementation...
As for sections: traditionally, all code gets put into the same section (named ".text" for ELF, "__text" for Mach-O); all data gets put into the same section (".data" / "__data"); etc.
The Darwin (Mach-O) linker is optimized for the properties of C code, and implicitly treats the data between each symbol and the symbol following it as a separate unit ("atom" or "subsection"). Or, more specifically, it does this if the SUBSECTIONS_VIA_SYMBOLS flag is set in the Mach-O header, which is always the case for object files compiled from C (as of 2005 or so). Thus, it always has the ability to remove unused functions/variables, though it doesn't actually bother to do so unless you pass -dead_strip.
ELF linkers are more traditional and treat each section as an indivisible unit; symbols aren't taken into consideration at all. So, by default, it's not possible to strip unused functions and variables, nor to reorder multiple symbols that came from a single object file. However, you can pass "-ffunction-sections -fdata-sections" to GCC (the compiler, not the linker) to make it put every single function and variable, respectively, in its own section in the .o file. For example, a function named "foo" would appear in a section called ".text.foo". Then the linker will coalesce all the ".text.*" sections back into a single ".text" output, and similarly for other types of sections. But first it can strip unused sections (if you pass --gc-sections) – which is equivalent to stripping unused functions/variables, since each section contains only a single function/variable.
These are basically two different ways to accomplish the same thing, which probably explains what userbinator said. Both approaches feel kind of hacky to me. On the ELF side, object files with a bazillion sections are annoying to look at (if you examine them with readelf or other tools), and putting everything in its own section is not really how sections were originally intended to work. On the Mach-O side, well, the symbol table wasn't originally meant to be used to split up the input data; in particular, unlike with ELF, Mach-O symbols don't have a size field (which is why the atom implicitly lasts until the next symbol). And it feels wrong that object files compiled from assembly have to be treated differently from everything else (they don't have SUBSECTIONS_VIA_SYMBOLS, unless you explicitly ask for it in the assembly file).
Personally I prefer the Mach-O approach just because it requires fewer flags to enable stripping of unused functions/data. Heck, I don't understand why -dead_strip isn't enabled by default.
But if you were to design a new object file format from scratch, it could probably handle this much more elegantly than either ELF or Mach-O.
> Heck, I don't understand why -dead_strip isn't enabled by default.
My guess would be to facilitate debugging, where you might want to invoke a symbol that wouldn't otherwise be used at runtime (e.g. some sort of debug print).
Yes, as someone with a Windows background I think this is one of the most unusual aspects of the dynamic linking system on Unix-likes --- everything is "exported" by default and the basic system has no concept of linking to symbols from a specific module, it only looks at the symbol name itself.
In Windows you have to explicitly specify which symbols to export, and only those exported ones can be imported.
On Windows this has the consequence that if you have a DLL, it might use a different malloc than your application. As a result, you can’t safely free objects that were created in a DLL unless you go to all sorts of trouble. This can mean passing objects back into a DLL to free them or using some other mechanism to ensure that you use the same allocator everywhere. This is especially problematic with C++ because of inlining.
The “everything is exported” default can be fixed with a flag; you then set symbol visibility with attributes (the GCC/Clang counterpart of __declspec) or use a linker script. Unlike with DLLs, this doesn’t change the calling convention.
What you said about malloc on Windows is true, but I wouldn’t blame it on symbols not being exported by default – but rather on Windows’ choice to ship N different C runtime libraries, one for each MSVC version, as separate DLLs. Most applications and libraries link to one of those DLLs, and if two images link to the same DLL, they do share C runtime state, including the malloc heap [1]. But that’s only possible if they were compiled with the same MSVC version. (There is also an option, -MT, to fully statically link the C runtime, but it’s less commonly used.)
In contrast, both Linux and Darwin have only one C runtime library for the system, in libc.so.6 (typically) or libSystem.dylib respectively, which maintains backwards compatibility over a longer time period. Sometimes there’s a need to make changes to libc that would normally be ABI-breaking, e.g. when off_t was changed to be 64 bits to support >4GB file sizes on 32-bit platforms. But to avoid breakage, special mechanisms are used to provide two different versions of the same symbol within the same library, for each affected symbol; existing binaries will use the old version, while programs compiled and linked on a newer system will automatically use the new version. (On Linux, a complicated mechanism called symbol versioning is used for this, while Darwin just renames symbols with the asm() directive in header files.)
On Linux, it is possible to statically link any of various libcs, but programs that do that typically can’t use shared libraries *at all*, so the issue of malloc across library boundaries doesn’t come up.
Oh, and for the record, apparently Windows 10 has a new “universal” CRT DLL for new code that will be maintained in place going forward, but it’s still separate from all the pre-existing versioned DLLs, and there’s a debug variant that’s a separate DLL from the normal one and probably doesn’t share state with it.
There is one CRT that has existed since Win95 and is still there in Win10, it's called MSVCRT.DLL and getting versions of MSVC other than 6 to link to it is possible and has been done (although not trivial). MS rather strongly discourages this, but as evidenced by all the apps out there that do it and continue to work, it's pretty much the only way to get a single small dynamically-linked binary that'll run on every 32-bit version of Windows ever released.
In contrast the "universal" CRT is not very universal at all, and a horrible bloated mess. But certainly not unusual of MS...
Only if the symbol is exported, but in that case it won’t be removed even if -dead_strip is used.
As userbinator noted, Unix has an awful tradition of exporting most symbols by default, which is also in effect on macOS. The modern approach is to pass -fvisibility=hidden to the compiler, which changes the default to unexported, and then mark symbols that are supposed to be exported with __attribute__((visibility("default"))). Alternatively, there’s a Darwin-specific ld option (-exported_symbols_list) where you write a list of names to export in a separate text file, and everything else is forced to be unexported.
Hmm… well, when it comes to identifier mangling, object formats aren't really the problem. Both ELF and Mach-O use NUL termination for symbol (and section) names, so they can't contain NUL bytes, but there's nothing in the binary format preventing them from containing any other bytes. So you could make a symbol named
foo::bar(int,int)
…and most likely, everything that deals with binaries would have no problem with it.
A bigger obstacle might be the assembler, whose input is text. Assembly files usually write symbol names without any escaping or quoting, so non-alphanumeric characters could be misinterpreted. But in fact, it seems that both GNU as and LLVM's assembler (currently used on macOS) allow optionally surrounding symbol names in quotes, allowing those characters to be used:
"foo::bar(int,int)":
jmp "foo::bar(int,int)"
Also, it seems that Clang will use this syntax where necessary when generating assembly files. This compiles:
int bar(int a, int b) asm("foo::bar(int,int)");
int bar(int a, int b) {
    return a + b;
}
…but GCC apparently doesn't use it; I just tried it on the latest version of GCC (8.1.0) and it produces assembly that uses the name unquoted, which then makes the assembler spit out errors.
However, I've left one thing out. I think C++ symbol mangling mainly originated as a hack to support existing assemblers, but it also achieves a basic form of compression. For example, built-in types are represented by a single character, and there's a "substitution" syntax for reusing the same token sequence more than once. C++ symbol names already tend to be crazy long, and having a less succinct mangling would make them even longer, which would make binaries larger and might make dynamic linking slower – though to be honest, I have no idea how much (if at all) this would be noticeable in the grand scheme of things.
Also, there has to be a single canonical mangling of any given declaration, so even if a platform decided to use C++ syntax directly in symbol names, it would probably omit spaces, unnecessary parentheses, etc., and the result might be harder to read than what you get after demangling. So a demangler might still be desirable. Still, it would certainly be more readable than the current mangling!
But that's all assuming that the overall compilation scheme would still look like today, with a 'dumb' linker that only knows about symbols and assembly code, not types or anything about C++ semantics.
You could go a step further and design an object format with native support for C++, even things like templates. Imagine being able to define a template in one .cpp file, link it into a library, and then instantiate that template from another executable! That would be enormously cool. In fact, the C++ spec used to define an 'export template' syntax that was supposed to do this, but essentially no compilers implemented it, and it was removed in C++11. (C++ modules are also kind of a form of this, but they're meant to be compiler-specific, private build artifacts rather than something defined at the system level.)
I can think of three distinct drawbacks, though:
1. C++ template semantics are very tightly bound to its syntax; there's little you can say about a template definition without knowing what it's instantiated with. Indeed, if you're going to encode template definitions in object files, the format would probably be nothing more complicated than pre-tokenized source code. More modern languages do this somewhat better – in fact, Swift actually plans to have a stable ABI for generics.
2. Similarly, C++ template semantics are very C++-specific; other languages would probably require separate support in the format rather than being able to reuse the C++ functionality. In comparison, existing 'dumb' object formats are basically language-agnostic.
3. The biggest problem: If you allowed templates to be exported from dynamic libraries, the dynamic linker's functionality would have to be transformed from a series of quick name lookups and fixups that can be done every time a binary is launched, to a full-fledged C++ compiler with an expensive code generation step. Even if you cached the output it would still be slow on first launch, especially on low-powered platforms like mobile devices…
And yet despite all those drawbacks, I still dream of a system that has… at least some form of this. (I've thought a bit about possible designs: perhaps it could be designed as a component of the package manager rather than of the linker directly.) Why?
Well, Debian just started packaging Rust code, and look at how that's going. Each library package ("crate") gets a libfoo-dev that just contains a copy of the package source code, with no libfoo binary package; each executable is statically linked, and a new package version will be released whenever any of its dependencies update. Which is going to mean a lot of redundant upgrades. That's Rust, not C++, but to the extent C++ libraries avoid this problem, it's usually by eschewing templates altogether for anything that's meant to have a stable ABI. If the API does include templates (at least ones that clients instantiate with their own parameters), then clients have to be rebuilt whenever the library changes, same as Rust. I find this quite annoying, since I think the future should be full of ergonomic, strongly-typed APIs taking full advantage of the features of modern languages… yet I don't want to burden sysadmins with pointless upgrades. :) And it just feels wrong that linkers have basically never progressed beyond C.
Why would you want run-time code generation for things that you can also generate at compile time? A linker is there to link, not to compile.
One of the big problems with templates is that you're not required to instantiate them explicitly, so you end up with lots of duplicate instantiations. Maybe try doing it explicitly, always: have a well-defined place where the template is instantiated. You can also put the compiled code in a library, I think. (There seems to be an extern template feature since C++11.)