How Fat Does a Fat Binary Need to Be? (justine.lol)
171 points by jart on Feb 11, 2021 | 67 comments



> That's how big a statically-linked executable needs to be to run natively without dependencies on Linux, Mac, Windows, FreeBSD, OpenBSD, NetBSD, and BIOS.

* on x86.

There was a time period when it was reasonable to just assume everything that mattered was x86. It started sometime in the early 2000s, as non-x86 servers died out and Macs switched to x86. And it ended last November, when Apple announced Macs would switch to ARM.

The author has suggested sticking with x86 and handling other architectures using emulation. But x86 is the worst possible architecture to emulate, because its strong memory ordering model means that emulators on other architectures essentially must limit the emulated process to a single core. (Unless you have special hardware support like M1 Macs do… for now.) Also, any hardware CPU architecture is going to be slower to emulate than a purpose-built bytecode – and we already have a good enough bytecode, WebAssembly.

Emulating x86 might make sense anyway if you assume that x86 is and will be the norm, and non-x86 an edge case that can afford to take a performance hit. But that sure doesn't seem to be the way the wind is blowing.

[1] https://justine.lol/ape.html


Author here. These binaries will re-exec themselves under Qemu if you run them on different architectures. So they truly are build-once run-anywhere.
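To make the re-exec idea concrete, here is a minimal sketch of the dispatch decision, not the project's actual code. It assumes a user-mode emulator named `qemu-x86_64` is on the PATH, and `launch_command` is a hypothetical helper that only builds the command line.

```python
import platform

def launch_command(binary, args, machine=None):
    """Build the argv to run an x86-64 binary natively or under an emulator."""
    # Allow the machine name to be injected so the logic can be exercised anywhere.
    machine = machine or platform.machine()
    if machine.lower() in {"x86_64", "amd64"}:
        # Native host: run the binary directly.
        return [binary, *args]
    # Assumption: qemu-x86_64 (QEMU user-mode emulation) is installed.
    return ["qemu-x86_64", binary, *args]
```

On an ARM host this would yield something like `qemu-x86_64 ./hello.com arg`, while an x86-64 host gets the plain command.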

I think improving performance on other architectures is great. There's nothing that inherently limits the Actually Portable Executable format from baking in native support for other architectures. It's simply a question of focus.

What Cosmopolitan does, it does really well. No other C library is able to target multiple operating systems and all the x86 microarchitectures. You can now easily distribute software that can be easily run on the vast majority of PCs and servers. Want it to do more? Consider becoming a sponsor. The impact the project is already having should be all the proof we need that it could be doing much more. In order for that to happen it needs your support.


Fair enough. I'm not against your project. I realize that open-source projects have time/funding constraints and that you don't owe me anything.

By "baking in native support for other architectures", I guess you're thinking of an approach similar to Apple's "fat binaries", which are just multiple independent binaries built for different architectures bundled together? That's a good pragmatic option, and definitely the best choice size-wise for small programs. On the other hand, it lacks forwards-compatibility: you're limited to the set of architectures the original developer decided on. And for large programs, the cost of N copies of the program can add up.
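As a rough illustration of "multiple independent binaries bundled together", here is a toy container, assuming a made-up layout (a count, then name/offset/size entries, then the raw blobs). It is not the Mach-O universal format or anything Cosmopolitan uses; `pack_fat` and `select_slice` are hypothetical names.

```python
import struct

ENTRY = "<16sII"   # 16-byte arch name, uint32 offset, uint32 size (24 bytes)
ENTRY_SIZE = struct.calcsize(ENTRY)

def pack_fat(slices: dict) -> bytes:
    """Bundle per-architecture blobs behind a small lookup table."""
    entries = []
    # Data starts right after the 4-byte count and the entry table.
    offset = 4 + ENTRY_SIZE * len(slices)
    for arch, blob in slices.items():
        entries.append(struct.pack(ENTRY, arch.encode(), offset, len(blob)))
        offset += len(blob)
    return struct.pack("<I", len(slices)) + b"".join(entries) + b"".join(slices.values())

def select_slice(fat: bytes, arch: str) -> bytes:
    """Return the blob for `arch`, or raise LookupError if it was never bundled."""
    (count,) = struct.unpack_from("<I", fat, 0)
    for i in range(count):
        name, offset, size = struct.unpack_from(ENTRY, fat, 4 + ENTRY_SIZE * i)
        if name.rstrip(b"\x00").decode() == arch:
            return fat[offset:offset + size]
    raise LookupError(f"no slice for {arch}")
```

The `LookupError` path is the forwards-compatibility problem in miniature: an architecture the developer did not bundle simply has nothing to run, and the file grows by one full copy per architecture that was bundled.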

Personally, I'm a fan of bytecode VMs such as WebAssembly. That way the binary can be truly architecture-independent, capable of supporting many different architectures or special architecture variants without bloating the size of the executable. On the other hand, if you wanted the binary to run out of the box with no dependencies, you would have to bundle an entire VM implementation into the binary for at least some architectures, and then you'd have your own share of "hello world" bloat. (I'm assuming there would be an option for the user to supply their own VM for less common architectures.)

I do wonder how small one can make a WebAssembly VM, if its design prioritizes minimal size over maximal speed, while still aiming to get reasonable speed (i.e. not an interpreter)…


In order to solve the “forward” compatibility problem you could have both specialized binaries and a generic version in (wasm) bytecode. Then you can have all the performance optimisations you care about and retain compatibility with all platforms.

Optionally you could consider a binary that on first run (after a prompt?) removes the binaries irrelevant to the current host.


Semi off-topic, but having just come across this project, I just started looking back at some of your stuff and saw a comment about an ape chibicc fork. What's the state of that? Is it usable yet?


It's usable but not polished yet. Lots of GNU extensions and an integrated assembler were added. You can read about them here: https://github.com/jart/cosmopolitan/blob/master/third_party... I have automated tests which do a two-stage build and test on every platform each time I run make, so it's going to stay stable. The two other things I want are an integrated linker and VLA parameters. Rui is helping me implement those. So I anticipate we can start distributing an APE build of the complete chibicc toolchain very soon that you can use to build APE binaries on any platform.

Keep in mind this compiler doesn't optimize. So if you're OK taking a 2x performance hit, you're going to find that your code compiles significantly faster than GCC or Clang and you're going to have the convenience of the fact that the whole toolchain will be a single .com file that you can just scp on any server for a fully deterministic reproducible build experience. It's also the most readable and hackable compiler codebase I've ever seen, so if you've ever wanted to pull out the dragon book and try your hand at optimizing, you'll find that chibicc is fertile ground to do it.


Very exciting stuff! I'll try to take it for a spin tonight :)


Here's something you can paste into your Linux terminal to get started:

    git clone https://github.com/jart/cosmopolitan
    cd cosmopolitan
    make -j8
    o/third_party/chibicc/chibicc.com -include libc/integral/normalize.inc -o hello.com.dbg examples/hello.c o/cosmopolitan.a
    o/third_party/gcc/bin/x86_64-linux-musl-objcopy -SO binary hello.com.dbg hello.com
    ./hello.com


> Want it to do more? Consider becoming a sponsor.

How?


https://github.com/sponsors/jart

Thank you Hacker News for your support!



> And it ended last November, when Apple announced Macs would switch to ARM.

This is revisionism. It ended with the smartphone and netbook era, both of which put more ARM devices in homes than Macs ever will.


Well. I haven't heard of ARM netbooks being a big thing. But yes, smartphones and other non-desktop-operating-system devices have been ARM for a long time. And other architectures still show up on other embedded devices, like MIPS on routers.

By "everything that mattered" I guess I should have said "everything that mattered that you have a shell on".

For some of those non-desktop-operating-system devices, getting a shell is impossible without bypassing lockdown measures. For others, it's possible, but not particularly common as far as I know, for a variety of reasons. They tend to run stripped-down OSes; they don't have great keyboards; they were pretty slow compared to desktops until recently.

Yes, this is a huge overgeneralization. Plenty of people do run shells on their phones or routers. But I do believe that a major desktop platform switching to ARM is qualitatively different from what we'd seen before.


Many chromebooks are on ARM, and chromebooks all have the native ability to run a true linux shell at this point, and have for several years now.


I know a guy who long ago gave up MacBooks for Chromebooks for development. He programs the systems that keep HVAC/large refrigeration systems running. He buys a new machine every year and his company pays for it. Cheap. His last Chromebook had 8 GB RAM and his current one 16. For all intents and purposes, it's faster than any Mac or PC that you would use for programming and the horsepower needed is on a remote server. He can never lose any code or other project notes, photos, since they, too, live in the cloud. Not for everyone, obviously.


Oh, yeah, I have a chromebook with 8 GB of ram. It's a good dev machine.


I thought most "netbooks" had Intel Atom processors?


I agree. It might depend on where you start the 'netbook' definition and how you count 'most'.

The EeePC was x86, and the Chromebook Pixel was x86 along with many that followed, with the balance being ARM variants.

More recently you have convertibles/ultrabooks and their stripped-down cousins (e.g. Microsoft Surface Pro vs. Surface 3/Go/Go 2, and Surface 2, Surface RT, and Surface Pro X, the latter three being ARM).

Ultimately the netbook niche has been squeezed by ultrabooks and tablets down to a few offerings, if anyone is even bothering to market computers that way anymore.


I'll admit to being biased up front, as I run a lot of ARM and POWER stuff, so I have a vested interest, but I have to agree that this feels short sighted. ARM at the very least has a big potential to take a much bigger slice of the market, with ARM servers and ARM Macs "finally" coming out in force, while ARM has been dominating smartphone and mobile devices for at least the last, say, 15 years, and has gotten a lot of mindshare through SBCs like the Raspberry Pi.

It's a fun project, but I don't think the assumption that everything is x86 is as safe as the author proposes.


Has anyone actually benchmarked top-tier WebAssembly engines against good emulators? I'd be very curious to hear what the results are like. Although, Rosetta might have to be excluded because it cheats in fidelity by using the host MMU…


The time period still continues until something other than x86 is consistently runnable (emulation or not) or most apps have done significant rewrites, not when different hardware shows up.


I'm in awe at the existence of this thing. Cannot say anything.

EDIT: Alright, I found something to say. Everything is stunningly perfect about this project except for one thing: the disturbing "mock greek" (using d as o and m as u, etc). Is it a sort of modern utf8 version of 1337-speek? Maybe it is too ironical to understand?

EDIT2: OK, now I get it. The "actually portable executables" use an amazing hack where the PE header itself gets executed as machine code, it turns out to be mostly harmless, and execution is captured from that point forward, using other data parts of the header as code. This is the same thing your brain does when reading letters in mixed alphabets. It is confusing when you know both alphabets and try to read in each, but if you squint your eyes, then only the actual shape of the characters matters, and you can read it for its intended meaning. Thus it is a sort of self-referential title which is perfectly appropriate here.


Seems impossible on the face of it. Linux binaries have to start with 0x7F ELF and Windows binaries have to start with MZ or ZM so I don't get how it could work with both.


As explained by the author https://justine.lol/ape.html


In short: The start of the file is MZ but also a shell script. If you run it on linux it replaces its header with the ELF header, then execs itself.
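For anyone who wants to poke at the magic-number question themselves, here is a small sketch that labels a file by its leading bytes. The signature table is deliberately tiny and `identify` / `identify_file` are hypothetical helper names; real loaders check far more than the first few bytes.

```python
# Leading byte signatures for the formats mentioned in this thread.
MAGICS = [
    (b"\x7fELF", "ELF"),
    (b"MZ", "PE/DOS (MZ)"),
    (b"\xcf\xfa\xed\xfe", "Mach-O 64-bit (little-endian)"),
    (b"#!", "script with shebang"),
]

def identify(header: bytes) -> str:
    """Return a label for the first matching magic prefix, or 'unknown'."""
    for magic, label in MAGICS:
        if header.startswith(magic):
            return label
    return "unknown"

def identify_file(path: str) -> str:
    """Read just enough of the file to compare against the table."""
    with open(path, "rb") as f:
        return identify(f.read(8))
```

A file that begins with `MZ` and no shebang is labelled as PE/DOS here, which matches the point above: nothing in the first bytes says ELF until the header is rewritten.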


And the reason Windows requires MZ is so that the program is also a valid DOS program (that prints an error and exits). According to Wikipedia, MZ was picked because it is the initials of one of the leading DOS developers. Turtles all the way down.


Based on the generated code, you may need to broaden the definition of "binary" to include anything that is executable (shell script.)


It's not really a shell script though, it just looks enough like one that a Bourne shell can modify it to be directly executable.

I'd probably be most worried about the Bourne shell part. It seems like it'll be easy for this to break on any "non-standard" shell that Linux might execute.


The behavior that Cosmopolitan needs is required by POSIX because UNIX Sixth Edition didn't have shebang lines. It's normally an implementation detail of execve() in C libraries and the behavior is implemented by nearly all shells too. So far the only issue has been that some shells, like zsh, have a binary safety check where it refuses to run scripts that have embedded NUL characters. I worked with the FreeBSD team to get POSIX revised to clarify the correct behavior w.r.t. these restrictions. FreeBSD's /bin/sh has already been patched and zsh is pending upstream. https://github.com/jart/zsh/commit/94a4bc14bb2e415ec3d10cf71...
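A sketch of that fallback, as I understand the POSIX description of `execvp`-style functions: if the kernel rejects the file with ENOEXEC, the file is handed to `sh` as a script. `exec_with_sh_fallback` is a hypothetical name, the `/bin/sh` path is an assumption, and the `execv` parameter exists only so the logic can be exercised without replacing the current process.

```python
import errno
import os

def exec_with_sh_fallback(path, argv, execv=os.execv):
    """Try to exec `path`; if the kernel rejects the format, hand it to /bin/sh."""
    try:
        execv(path, argv)
    except OSError as e:
        if e.errno != errno.ENOEXEC:
            # Missing file, permission denied, etc. are not our concern here.
            raise
        # Run the file as a shell script, passing along the original arguments.
        execv("/bin/sh", ["sh", path, *argv[1:]])
```

With the real `os.execv` a successful call never returns, so the `except` branch only runs when the first attempt fails.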


I appreciate the thoroughness with which you are pursuing this. The amount of work you're doing outside of the technical aspect is definitely something I can admit that I would've never pursued. It's encouraging to see.


This is true, but the shell is executing it "as if" it were a script, right? Until the binary part takes over...


/bin/sh execution is supported at the kernel level iirc? Like, the minimum requirement for a linux distribution is /bin/sh, the rest of the system doesn't have to be there necessarily.


No, the Linux kernel only requires an init executable to run. It otherwise makes no assumptions about the programs available in userland. Pretty much any sane/usable Linux distribution will package a shell, though.


It does, however, hardcode /bin/sh as an init of last resort: https://github.com/torvalds/linux/blob/dcc0b49040c70ad827a7f...


Exactly what I was talking about, thank you


The behavior you are likely thinking of (execvp on a file without a shebang) is implemented in libc.


No. As someone else mentioned, the kernel hardcodes /bin/sh as a last resort: https://github.com/torvalds/linux/blob/dcc0b49040c70ad827a7f...


Is it possible to build real applications with Cosmopolitan?

I'm thinking stuff that requires other dependencies, e.g. building a universal Python interpreter by using Cosmopolitan. Would that be possible?


It's been formally verified that Cosmopolitan lets you compute anything that's computable. It's able to build real apps too! For example, JavaScript is the most popular language and Cosmopolitan has an embedded JavaScript interpreter. See https://github.com/jart/cosmopolitan/blob/master/examples/he... where we built a 300kb Actually Portable Executable + ZIP file polyglot that reflectively loads interpreted sources from within its own ZIP file structure. You can download the compiled executable here: https://justine.lol/hellojs.com It would be trivial to support Python too, along with any other language.


First of all, cool project! I've done some less ambitious stuff in regards to mashing together scripts and zip archives to create what I'd describe as "dynamically linked python executables", and what you've managed to achieve with compiled code is awe inspiring.

Pure computation and file/socket based IO is enough to solve many problems, but I'm curious what the plan is down the road to support more exotic forms of IO like high performance graphics, audio, human input devices, etc. in a portable way?

It seems like the fully-statically-linked approach can only get as far as the common subset of all supported platforms, which is probably a tautology, but means that the cosmopolitan library will have to not only provide APIs for any possible type of IO someone might want to do, but also bake in all the userspace code to implement it on every platform. For a lot of things that seems currently very difficult, as many hardware vendors only provide dynamic libraries as their interface. I suppose statically linking isn't mutually exclusive with dynamically loading, though?

The other route seems to be targeting a virtual machine of some sort, whether it's running the code and providing a consistent ABI like qemu, or communicating over files/sockets like rendering graphics and sound in a web browser. I get the impression that's the less interesting route to go down though.

I appreciate the holistic nature of the project, and it seems like you've approached it with a pragmatic mindset. Myself thinking pragmatically, it just seems like you'll eventually want to at least support opengl or a GUI library or something, to appeal to the types of programs that really benefit most from easy portability.


One of Cosmopolitan's demo programs is an NES emulator that runs in the terminal: https://justine.lol/nesemu1.html It plays audio by piping raw audio samples into an ffplay or sox subprocess. Another demo program is PRINTVIDEO.COM which can do things like play 4K MPEG videos in the terminal. https://justine.lol/printvideo.html Cosmopolitan is also able to dynamically link things like KERNEL32.DLL so you can create GUI applications. See https://justine.lol/apelife/index.html See also this issue where people have requested hardware accelerated graphics: https://github.com/jart/cosmopolitan/issues/35

Cosmopolitan can be doing more if game studios are willing to back the project. Remember how John Carmack dipped his toes in the water with Quake on Linux decades ago and was promptly scared off by the chaos and complexity of the open source development process? Cosmopolitan brings the kind of order to Unix interfaces that game developers need. Games can't rely on people like distro maintainers to patch and recompile their codebases every time there's a change to some dynamic library. They need to be able to compile their programs once and have game enthusiasts be able to enjoy those programs for years to come. Right now anything you build with Cosmo will run on every Linux distro since 2007. That means there's a good chance your programs will continue to work perfectly and natively on every Linux distro in the future. You might not be able to run shaders on Nvidia chips yet. But what we've accomplished so far is a big leap forward compared to the status quo.


Thank you very much for your work. It is Bellard-level magnificent!


I'm deeply honored you should say that. I have a lot of respect for Bellard.


This looks great! Combining this with Nim would be extremely powerful. Is it a matter of changing compilers like musl or would some tweaks need to be done for it to work with Nim?


On the subject of bloated executables, git on windows is epic - it's broken into something like 17 separate executables, 3.1mb each, over 500mb in total.


That's why more and more things are builtins now.


Why?


In a world of accidental complexity explosion when you think all hope is lost, a gem like this appears to demonstrate what technological progress looks like.


Shout out to the cosmopolitan project. It looks super cool. Though I can imagine there might be more ABI issues when you move to other languages like C++


What is the history of the term "fat binary"? I first heard of it as a term for Mac applications that combined 68k and PowerPC code, and wasn't really aware of any wider use.

Can you make a program these days that runs natively on M1, x86, PowerPC, and 68k Macs?


Not sure about the origin, but before Mac fat binaries (which were formally called “universal binaries”), NeXTstep had quad-fat binaries with m68k, i386, SPARC, and PA-RISC.

Edit: Just realized this all might have been happening at the same time. The ‘90s are a blur!


I believe so, yes. I've gotten three fourths of the way there, if anyone has the last one (though, it would be a stretch to call them "Macs" at that point) feel free to post it here: https://github.com/saagarjha/dummy_thicc


Reminds me of .kkrieger, a full 3D shooter with sounds, textures, enemies, levels, etc. packed in ~96kb.

https://en.wikipedia.org/wiki/.kkrieger


Sizecoding is a major part of the demoscene.

.kkrieger may be the most notorious in the game category but 96k is actually on the heavy side for such productions. One other example of a really small game is BootChess, a chess engine in less than 512 bytes to fit in a BIOS boot sector.

And you should look what people are fitting into 4k intros nowadays.


It's amazing what compression and Perlin's noise generator can do.


How is it possible to build once and run on both Windows and Linux if Windows uses the PE executable format and Linux uses ELF?

Is there some kind of trickery to make a file valid as both PE and ELF? Don't the magic numbers conflict?


APE executables are PE executables that embed a shell script in the MS-DOS stub. The GNU or LLVM linker is configured to generate a printf '\177ELF...' >$0 statement which inserts the ELF or Mach-O header into the first 64-bytes depending on where the executable is run. See https://justine.lol/ape.html and https://raw.githubusercontent.com/jart/cosmopolitan/master/a...
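The header swap can be imitated in a few lines. This is only a sketch under loud assumptions: the 64-byte "ELF header" below is filler that starts with the right magic (`\177` in octal is 0x7F), the "MZ" file is a stand-in rather than a real APE binary, and `patch_header` is a hypothetical helper. The one real point is that the file is opened for update, so only the leading bytes change.

```python
import os
import tempfile

def patch_header(path: str, new_header: bytes) -> None:
    """Overwrite the first len(new_header) bytes of a file, leaving the rest intact."""
    # "r+b" opens for update without truncating, unlike "wb".
    with open(path, "r+b") as f:
        f.seek(0)
        f.write(new_header)

def demo() -> bytes:
    """Patch a throwaway file and return its final contents."""
    # Hypothetical stand-in for the original file: "MZ", filler, then a payload.
    original = b"MZ" + b"x" * 62 + b"PAYLOAD"
    # Hypothetical 64-byte replacement that begins with the ELF magic.
    elf_header = b"\x7fELF" + b"\x00" * 60
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(original)
        patch_header(path, elf_header)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)
```

After `demo()` the file starts with the ELF magic and the payload past byte 64 is untouched, which is the property the trick relies on.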


.. actually, wait, are you the author? Thanks for open sourcing this!

Can I use it to make a Rust executable? Maybe it can post-process the executable built by Rust somehow? (I guess Rust needs to link against the cosmopolitan libc and otherwise not make any syscalls)

Maybe it's possible to make a cosmopolitan Rust target, like, x86-unknown-cosmopolitan?

Also, does it support x86_64? (and maybe a fat x86/x86_64 binary as well?)


This is so freaking cool!


IL code. .NET compiles to an intermediate language which runs on a local interpreter.

https://en.wikipedia.org/wiki/Common_Intermediate_Language

https://www.i-programmer.info/programming/other-languages/93...


Now that there is a possibility that ARM machines may become more common, it would be interesting to be able to build multi-arch AppImages. Maybe Ryan Gordon's FatELF could be part of it.


Using Docker and Python I can only say: very, very fat. The Tensorflow image is something like 5 GB.

On the other hand, Docker images are literally the only thing taking up space on my hard drive by now.


There will be other architectures after x86 has died. Mark my words.


Probably developed and run by intelligent rabbits to whom humans are fairy tales.


It wasn't that long ago that programmers had to be reminded that there were non-VAX processors.


Or you can ship Java or .NET binary (obviously)


I wanted to see if I could use this with Python, so I Googled "python cosmopolitan", and this was the first result: Passion Python Sex Position - Cosmopolitan [0]. I had completely forgotten about that magazine.

[0](https://www.cosmopolitan.com/sex-love/positions/a26874/passi...)



