I've been wondering about qualitative differences between Mach-O and ELF, after ...

haberman · on May 16, 2019

That is a deep and interesting question. I'm not sure I can give a great answer, but here are a few thoughts.

If we look at the file format itself (separate from the features/semantics of the linker and loader), I think ELF is simpler and more orthogonal. You can iterate over the section/segment tables of an ELF file without knowing anything about what each section/segment means. ELF nicely decouples the high-level "container" aspect of the file format from the lower-level semantics of how you interpret each section/segment in the linker and loader.
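To make the "generic container" point concrete, here is a minimal sketch in Python. It assumes a little-endian ELF64 file and ignores corner cases such as extended section numbering; the function name `iter_elf64_sections` is made up for illustration. The struct layouts follow the standard ELF64 header and section header.

```python
import struct

# Standard ELF64 layouts, little-endian, no padding (64 bytes each).
ELF64_EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # file header
ELF64_SHDR = struct.Struct("<IIQQQQIIQQ")        # section header


def iter_elf64_sections(data):
    """Yield (sh_type, sh_offset, sh_size) for every section header.

    Hypothetical helper: it never interprets sh_type, it only reports it.
    """
    fields = ELF64_EHDR.unpack_from(data, 0)
    ident = fields[0]
    # ident[4] == 2 means 64-bit, ident[5] == 1 means little-endian.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    e_shoff, e_shentsize, e_shnum = fields[6], fields[11], fields[12]
    for i in range(e_shnum):
        # Every entry has the same shape, so the stride is all we need.
        sh = ELF64_SHDR.unpack_from(data, e_shoff + i * e_shentsize)
        yield sh[1], sh[4], sh[5]
```

The loop never branches on the section type: the stride comes from `e_shentsize` in the file header, so unknown section types cost nothing to walk past.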

Mach-O, on the other hand, couples these two concepts together. The top-level table is an array of "load commands", each with its own type, but you can't parse the contents of a load command until you know what type it is. Unlike ELF, the entries of this table do not have a generic format or even a consistent size. If you haven't written code to specifically recognize a given command type, all you can do as fallback behavior is skip it. To me ELF feels like a refactoring of Mach-O to make it a little more general and layered.
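For comparison, a rough sketch of walking Mach-O load commands, again in Python and assuming a thin (non-fat) little-endian 64-bit file. The only part shared by every command is the `cmd`/`cmdsize` prefix; the function name `walk_load_commands` is invented here, and it only knows how to decode `LC_SEGMENT_64`.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF
LC_SEGMENT_64 = 0x19
MACH_HEADER_64 = struct.Struct("<IiiIIIII")  # 32-byte mach_header_64
LOAD_COMMAND = struct.Struct("<II")          # cmd, cmdsize: the only shared prefix


def walk_load_commands(data):
    """Yield (cmd, detail) per load command; detail is None if unrecognized.

    Hypothetical helper for illustration only.
    """
    magic, _, _, _, ncmds, _, _, _ = MACH_HEADER_64.unpack_from(data, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a little-endian 64-bit Mach-O file")
    offset = MACH_HEADER_64.size
    for _ in range(ncmds):
        cmd, cmdsize = LOAD_COMMAND.unpack_from(data, offset)
        if cmdsize < LOAD_COMMAND.size:
            raise ValueError("corrupt load command size")
        if cmd == LC_SEGMENT_64:
            # Type-specific layout: a 16-byte segment name follows the prefix.
            raw = data[offset + 8:offset + 24]
            yield cmd, raw.split(b"\0", 1)[0].decode("ascii")
        else:
            # Unknown type: nothing to interpret, so just step over it.
            yield cmd, None
        offset += cmdsize
```

Every new command type needs its own branch here, whereas the ELF loop stays the same no matter how many section types exist.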

If we consider the actual semantics and features of the file formats, there are pros and cons to both. Mach-O has built-in support for fat (multi-architecture) binaries, which is kind of nifty, though I've never actually used it myself. Mach-O distinguishes between "dylib" and "bundle" for shared libraries (https://docstore.mik.ua/orelly/unix3/mac/ch05_03.htm) -- for the life of me I can never remember the difference between these two -- whereas ELF just has one type of shared library. The distinction seems to add complexity and I'm not sure I understand the benefit. Mach-O has two-level namespaces (dynamic symbols are resolved by both name and the library they come from) -- colliding symbols aren't generally a problem I've seen with ELF, but maybe it's useful in some cases. ELF makes symbol interposition easy with LD_PRELOAD, though Mach-O seems to have its own version of this that I've never tried: https://stackoverflow.com/questions/12609728/changing-functi.... Overall I prefer ELF.
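On the fat-binary point, the wrapper format is small enough to sketch. This assumes the classic 32-bit fat header (big-endian fields, magic `0xCAFEBABE`) followed by one `fat_arch` record per slice; `list_fat_slices` is a made-up name, and the 64-bit fat variant is not handled.

```python
import struct

FAT_MAGIC = 0xCAFEBABE
FAT_HEADER = struct.Struct(">II")    # magic, nfat_arch (big-endian on disk)
FAT_ARCH = struct.Struct(">iiIII")   # cputype, cpusubtype, offset, size, align


def list_fat_slices(data):
    """Return [(cputype, offset, size)] for each architecture slice.

    Hypothetical helper; each slice is a complete thin Mach-O at `offset`.
    """
    magic, nfat_arch = FAT_HEADER.unpack_from(data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a fat Mach-O file")
    slices = []
    for i in range(nfat_arch):
        pos = FAT_HEADER.size + i * FAT_ARCH.size
        cputype, _subtype, offset, size, _align = FAT_ARCH.unpack_from(data, pos)
        slices.append((cputype, offset, size))
    return slices
```

A loader would pick the slice whose `cputype` matches the host and then parse the thin Mach-O starting at that offset.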

AceJohnny2 · on May 16, 2019

Thanks!

Regarding multi-arch support, Ryan C. Gordon (Linux game porter extraordinaire, icculus.org) had proposed FatELF [1] back in 2009 (LWN coverage [2]). It seemed simple enough to implement, but never really picked up steam (IMHO for reasons that speak of the culture of the Linux ecosystem).

[1] http://icculus.org/fatelf/

[2] https://lwn.net/Articles/359070/

yjftsjthsd-h · on May 16, 2019

> reasons that speak of the culture of the Linux ecosystem

"Everyone ships source; just recompile"? It would be convenient, but with source and a compiler you can hit everything anyways.

AceJohnny2 · on May 16, 2019

Yep, that's indeed my perspective, and I think that mindset dismisses the effort required to deliver closed-source binaries with long-term support.

glandium · on May 16, 2019

Mach-O also has a more compact bytecode-like representation for relocations, while ELF just wastes tons of space. See https://glandium.org/blog/?p=1177
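As a toy illustration of why a run-based or bytecode-like encoding can be much smaller than a flat table: the sketch below collapses consecutive pointer-sized relocation offsets into (start, count) runs. This is not the actual encoding used by Mach-O or by any ELF tool; `pack_runs` and the 8-byte stride are assumptions made for the example.

```python
def pack_runs(offsets, stride=8):
    """Collapse relocation offsets into (start, count) runs.

    Toy model: offsets exactly `stride` bytes apart share one run.
    """
    runs = []
    for off in sorted(offsets):
        if runs and off == runs[-1][0] + runs[-1][1] * stride:
            runs[-1][1] += 1          # extends the current run
        else:
            runs.append([off, 1])     # starts a new run
    return [tuple(r) for r in runs]


# Four table entries become two runs.
flat = [0x1000, 0x1008, 0x1010, 0x2000]
print(pack_runs(flat))
```

For scale, a single `Elf64_Rela` entry is 24 bytes, so a long array of adjacent pointers costs 24 bytes per pointer in the flat table but only one run in this toy scheme.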

monocasa · on May 16, 2019

Not haberman, but I've written loaders for both Mach-O and ELF, and really prefer ELF.

ELF is mainly structured like descriptive tables of how the relevant pieces look in memory; Mach-O is more structured like a script of commands that you run to load the binary. There are a couple of places where the model breaks down for ELF: DWARF and GNU_STACK both feel more Mach-O. But if you're playing with binaries for nonstandard uses, ELF just feels a lot cleaner IMO.

I'd love to hear the Mach-O developers' arguments though.

haberman · on May 16, 2019

Interesting, what was your job that required writing both a Mach-O and ELF loader?

monocasa · on May 16, 2019

Binary analysis and introspection tools.

jcranmer · on May 16, 2019

I only know the ELF format in detail, but I think the major complaint is that ELF uses a flat namespace for symbols whereas Mach-O has a two-level namespace. Furthermore, ELF lets you preload dynamic libraries such that you can override calls even to symbols provided in the same shared object.
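A toy model of the two lookup styles, in Python. It is not how either dynamic linker is implemented; the library names (`libshim`, `libc`) and both function names are made up to show the difference in what the lookup key is.

```python
def resolve_flat(name, load_order, exports):
    """Flat namespace: the first loaded library exporting `name` wins."""
    for lib in load_order:
        if name in exports[lib]:
            return lib
    raise LookupError(name)


def resolve_two_level(name, recorded_lib, exports):
    """Two-level namespace: the binary records which library it expects."""
    if name in exports.get(recorded_lib, ()):
        return recorded_lib
    raise LookupError(f"{recorded_lib}:{name}")


# Hypothetical libraries: a preloaded shim that also defines malloc.
exports = {"libshim": {"malloc"}, "libc": {"malloc", "free"}}
load_order = ["libshim", "libc"]  # the preloaded library comes first

print(resolve_flat("malloc", load_order, exports))     # libshim
print(resolve_two_level("malloc", "libc", exports))    # libc
```

With a flat namespace the preloaded shim silently takes over `malloc`, which is what makes interposition easy and accidental collisions possible. With a two-level lookup the reference stays bound to the library recorded at link time.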

saagarjha · on May 16, 2019

Note that it's possible to force a flat namespace for Mach-O through a variety of linker flags and DYLD environment variables. And I'm not sure if it does everything you'd want it to, but you can use DYLD_INSERT_LIBRARIES to preload Mach-O dynamic libraries as well.