As input to a linker, Mach-O and ELF files look pretty similar: they both split their contents into named "sections", and then there's a symbol table, which is basically a list of (name, address) pairs.
In C code, each symbol generally represents (the start of) a separate function or variable. All references between them are explicitly marked as relocations; thus, the linker should be free to reorder them, and any function/variable that isn't explicitly referenced is unused and can be removed (unless you're linking a shared library and it's meant to be publicly exported from that).
On the other hand, in assembly code, symbols are not necessarily independent. You can have things like a function that "falls through" into another function. For example, here's a hypothetical assembly file that implements both bzero() and memset(), and has the former fall through into the latter:
// void bzero(void *s, size_t n)
// (zeroes memory)
_bzero:
    // Set up arguments for memset
    mov r2, r1  // 3rd argument to memset = 2nd argument to bzero
    mov r1, #0  // 2nd argument to memset is 0
    // (1st argument to memset = 1st argument to bzero, no move needed)
    // Fall through into memset

// void *memset(void *b, int c, size_t len)
_memset:
    ...memset implementation...
As for sections: traditionally, all code gets put into the same section (named ".text" for ELF, "__text" for Mach-O); all data gets put into the same section (".data" / "__data"); etc.
The Darwin (Mach-O) linker is optimized for the properties of C code, and implicitly treats the data between each symbol and the symbol following it as a separate unit ("atom" or "subsection"). Or, more specifically, it does this if the SUBSECTIONS_VIA_SYMBOLS flag is set in the Mach-O header, which is always the case for object files compiled from C (as of 2005 or so). Thus, it always has the ability to remove unused functions/variables, though it doesn't actually bother to do so unless you pass -dead_strip.
ELF linkers are more traditional and treat each section as an indivisible unit; symbols aren't taken into consideration at all. So, by default, it's not possible to strip unused functions and variables, nor to reorder multiple symbols that came from a single object file. However, you can pass "-ffunction-sections -fdata-sections" to GCC (the compiler, not the linker) to make it put every single function and variable, respectively, in its own section in the .o file. For example, a function named "foo" would appear in a section called ".text.foo". Then the linker will coalesce all the ".text.*" sections back into a single ".text" output, and similarly for other types of sections. But first it can strip unused sections (if you pass --gc-sections) – which is equivalent to stripping unused functions/variables, since each section contains only a single function/variable.
These are basically two different ways to accomplish the same thing, which probably explains what userbinator said. Both approaches feel kind of hacky to me. On the ELF side, object files with a bazillion sections are annoying to look at (if you examine them with readelf or other tools), and putting everything in its own section is not really how sections were originally intended to work. On the Mach-O side, well, the symbol table wasn't originally meant to be used to split up the input data; in particular, unlike with ELF, Mach-O symbols don't have a size field (which is why the atom implicitly lasts until the next symbol). And it feels wrong that object files compiled from assembly have to be treated differently from everything else (they don't have SUBSECTIONS_VIA_SYMBOLS, unless you explicitly ask for it in the assembly file).
Personally I prefer the Mach-O approach just because it requires fewer flags to enable stripping of unused functions/data. Heck, I don't understand why -dead_strip isn't enabled by default.
But if you were to design a new object file format from scratch, it could probably handle this much more elegantly than either ELF or Mach-O.
> Heck, I don't understand why -dead_strip isn't enabled by default.
My guess would be to facilitate debugging, where you might want to invoke a symbol that wouldn't otherwise be used at runtime (e.g. some sort of debug print).
Yes, as someone with a Windows background I think this is one of the most unusual aspects of the dynamic linking system on Unix-likes --- everything is "exported" by default and the basic system has no concept of linking to symbols from a specific module, it only looks at the symbol name itself.
In Windows you have to explicitly specify which symbols to export, and only those exported ones can be imported.
On Windows this has the consequence that if you have a DLL, it might use a different malloc than your application. As a result, you can’t safely free objects that were created in a DLL unless you go to all sorts of trouble. This can mean passing objects back into a DLL to free them or using some other mechanism to ensure that you use the same allocator everywhere. This is especially problematic with C++ because of inlining.
The “everything is exported” default can be fixed with a flag; you then set symbol visibility with attributes (the GCC/Clang counterpart of __declspec) or use a linker script. Unlike with DLLs, this doesn’t change the calling convention.
What you said about malloc on Windows is true, but I wouldn’t blame it on symbols not being exported by default – but rather on Windows’ choice to ship N different C runtime libraries, one for each MSVC version, as separate DLLs. Most applications and libraries link to one of those DLLs, and if two images link to the same DLL, they do share C runtime state, including the malloc heap [1]. But that’s only possible if they were compiled with the same MSVC version. (There is also an option, -MT, to fully statically link the C runtime, but it’s less commonly used.)
In contrast, both Linux and Darwin have only one C runtime library for the system, in libc.so.6 (typically) or libSystem.dylib respectively, which maintains backwards compatibility over a longer time period. Sometimes there’s a need to make changes to libc that would normally be ABI-breaking, e.g. when off_t was changed to be 64 bits to support >4GB file sizes on 32-bit platforms. But to avoid breakage, special mechanisms are used to provide two different versions of the same symbol within the same library, for each affected symbol; existing binaries will use the old version, while programs compiled and linked on a newer system will automatically use the new version. (On Linux, a complicated mechanism called symbol versioning is used for this, while Darwin just renames symbols with the asm() directive in header files.)
On Linux, it is possible to statically link any of various libcs, but programs that do that typically can’t use shared libraries *at all*, so the issue of malloc across library boundaries doesn’t come up.
Oh, and for the record, apparently Windows 10 has a new “universal” CRT DLL for new code that will be maintained in place going forward, but it’s still separate from all the pre-existing versioned DLLs, and there’s a debug variant that’s a separate DLL from the normal one and probably doesn’t share state with it.
There is one CRT that has existed since Win95 and is still there in Win10, it's called MSVCRT.DLL and getting versions of MSVC other than 6 to link to it is possible and has been done (although not trivial). MS rather strongly discourages this, but as evidenced by all the apps out there that do it and continue to work, it's pretty much the only way to get a single small dynamically-linked binary that'll run on every 32-bit version of Windows ever released.
In contrast the "universal" CRT is not very universal at all, and a horrible bloated mess. But certainly not unusual of MS...
Only if the symbol is exported, but in that case it won’t be removed even if -dead_strip is used.
As userbinator noted, Unix has an awful tradition of exporting most symbols by default, which is also in effect on macOS. The modern approach is to pass -fvisibility=hidden to the compiler, which changes the default to unexported, and then mark symbols that are supposed to be exported with __attribute__((visibility("default"))). Alternatively, there’s a Darwin-specific ld option (-exported_symbols_list) where you write a list of names to export in a separate text file, and everything else is forced to be unexported.
Hmm… well, when it comes to identifier mangling, object formats aren't really the problem. Both ELF and Mach-O use NUL termination for symbol (and section) names, so they can't contain NUL bytes, but there's nothing in the binary format preventing them from containing any other bytes. So you could make a symbol named
foo::bar(int,int)
…and most likely, everything that deals with binaries would have no problem with it.
A bigger obstacle might be the assembler, whose input is text. Assembly files usually write symbol names without any escaping or quoting, so non-alphanumeric characters could be misinterpreted. But in fact, it seems that both GNU as and LLVM's assembler (currently used on macOS) allow optionally surrounding symbol names in quotes, allowing those characters to be used:
"foo::bar(int,int)":
jmp "foo::bar(int,int)"
Also, it seems that Clang will use this syntax where necessary when generating assembly files. This compiles:
int bar(int a, int b) asm("foo::bar(int,int)");
int bar(int a, int b) {
    return a + b;
}
…but GCC apparently doesn't use it; I just tried it on the latest version of GCC (8.1.0) and it produces assembly that uses the name unquoted, which then makes the assembler spit out errors.
However, I've left one thing out. I think C++ symbol mangling mainly originated as a hack to support existing assemblers, but it also achieves a basic form of compression. For example, built-in types are represented by a single character, and there's a "substitution" syntax for reusing the same token sequence more than once. C++ symbol names already tend to be crazy long, and having a less succinct mangling would make them even longer, which would make binaries larger and might make dynamic linking slower – though to be honest, I have no idea how much (if at all) this would be noticeable in the grand scheme of things.
Also, there has to be a single canonical mangling of any given declaration, so even if a platform decided to use C++ syntax directly in symbol names, it would probably omit spaces, unnecessary parentheses, etc., and the result might be harder to read than what you get after demangling. So a demangler might still be desirable. Still, it would certainly be more readable than the current mangling!
But that's all assuming that the overall compilation scheme would still look like today, with a 'dumb' linker that only knows about symbols and assembly code, not types or anything about C++ semantics.
You could go a step further and design an object format with native support for C++, even things like templates. Imagine being able to define a template in one .cpp file, link it into a library, and then instantiate that template from another executable! That would be enormously cool. In fact, the C++ spec used to define an 'export template' syntax that was supposed to do this, but essentially no compilers implemented it, and it was removed in C++11. (C++ modules are also kind of a form of this, but they're meant to be compiler-specific, private build artifacts rather than something defined at the system level.)
I can think of three distinct drawbacks, though:
1. C++ template semantics are very tightly bound to its syntax; there's little you can say about a template definition without knowing what it's instantiated with. Indeed, if you're going to encode template definitions in object files, the format would probably be nothing more complicated than pre-tokenized source code. More modern languages do this somewhat better – in fact, Swift actually plans to have a stable ABI for generics.
2. Similarly, C++ template semantics are very C++-specific; other languages would probably require separate support in the format rather than being able to reuse the C++ functionality. In comparison, existing 'dumb' object formats are basically language-agnostic.
3. The biggest problem: If you allowed templates to be exported from dynamic libraries, the dynamic linker's functionality would have to be transformed from a series of quick name lookups and fixups that can be done every time a binary is launched, to a full-fledged C++ compiler with an expensive code generation step. Even if you cached the output it would still be slow on first launch, especially on low-powered platforms like mobile devices…
And yet despite all those drawbacks, I still dream of a system that has… at least some form of this. (I've thought a bit about possible designs: perhaps it could be designed as a component of the package manager rather than of the linker directly.) Why?
Well, Debian just started packaging Rust code, and look at how that's going. Each library package ("crate") gets a libfoo-dev that just contains a copy of the package source code, with no libfoo binary package; each executable is statically linked, and a new package version will be released whenever any of its dependencies update. Which is going to mean a lot of redundant upgrades. That's Rust, not C++, but to the extent C++ libraries avoid this problem, it's usually by eschewing templates altogether for anything that's meant to have a stable ABI. If the API does include templates (at least ones that clients instantiate with their own parameters), then clients have to be rebuilt whenever the library changes, same as Rust. I find this quite annoying, since I think the future should be full of ergonomic, strongly-typed APIs taking full advantage of the features of modern languages… yet I don't want to burden sysadmins with pointless upgrades. :) And it just feels wrong that linkers have basically never progressed beyond C.
Why would you want run-time code generation for things that you can also generate at compile time? A linker is there to link, not to compile.
One of the big problems with templates is that you're not required to instantiate them explicitly, so you end up with lots of duplicate instantiations. Maybe try doing it explicitly, always: have a well-defined place where the template is instantiated. You can also put the compiled code in a library, I think. (There seems to be an extern template feature since C++11.)