Hacker News new | past | comments | ask | show | jobs | submit login
Tiny ELF Files: Revisited in 2021 (nathanotterness.com)
105 points by jandeboevrie on July 10, 2023 | hide | past | favorite | 12 comments



Last November, I looked into this problem a bit more thoroughly, studying all the conditions that the Linux kernel actually checks. My conclusion was that the absolute smallest possible 64-bit ELF file in Linux was 80 bytes. (The smallest 32-bit ELF file was indeed 45 bytes, as in the original article.) This version was written to simply call _exit(0), but I recall that there are a few more opcode bytes to spare:

  00000000: 7f45 4c46 b0e7 0f05 0000 0000 0000 0000  .ELF............
  00000010: 0200 3e00 0000 0000 0100 0000 0100 0000  ..>.............
  00000020: 1800 0000 0000 0000 1800 0000 0100 0000  ................
  00000030: 0000 0000 0000 3800 0100 0000 0000 0000  ......8.........
  00000040: 0100 0000 0000 0000 0000 0000 0000 0000  ................
I wonder if this version could be modified to create a Hello World smaller than 105 bytes? Unfortunately, I can't find where I put all my notes for this, though I did keep the snippet I used to generate it:

  alignas(Elf64_Ehdr) char data[80] = {0};
  Elf64_Ehdr *elf_ex = (Elf64_Ehdr *)data;
  memcpy(elf_ex->e_ident, ELFMAG, SELFMAG);
  elf_ex->e_type = ET_EXEC;
  elf_ex->e_machine = EM_X86_64;
  elf_ex->e_entry = 0x0000000100000001;
  elf_ex->e_phoff = 24;
  elf_ex->e_phentsize = sizeof(Elf64_Phdr);
  elf_ex->e_phnum = 1;
  Elf64_Phdr *elf_ppnt = (Elf64_Phdr *)(data + 24);
  // elf_ppnt->p_type = PT_LOAD;
  // elf_ppnt->p_flags = PF_X;
  // elf_ppnt->p_offset = 24;
  elf_ppnt->p_vaddr = 0x0000000100000018;
  // elf_ppnt->p_paddr = 0x0038000000000000;
  // elf_ppnt->p_filesz = 1;
  elf_ppnt->p_memsz = 1;
  // elf_ppnt->p_align = 0;
  memcpy(data + 4, "\260\347\17\5", 4);


Alright, through some trial and error, I've found all the bytes that are fully modifiable without the program crashing:

  00000000: 7f45 4c46 cccc cccc cccc cccc cccc cccc  .ELF............
  00000010: 0200 3e00 cccc cccc 0100 0000 0100 0000  ..>.............
  00000020: 1800 0000 0000 0000 1800 0000 0100 0000  ................
  00000030: cccc cccc cccc 3800 0100 0000 0000 0000  ......8.........
  00000040: 0100 0000 0000 0000 cccc cccc cccc cccc  ................
(The words at 0x1c-0x1d and 0x2c-0x2d can also be modified, but they have to be equal to each other, and there are other limitations on their values.)

From there, some careful assembly golfing gives me a 91-byte Hello World that exits successfully:

  00000000: 7f45 4c46 b001 488d 3540 0000 0050 eb38  .ELF..H.5@...P.8
  00000010: 0200 3e00 0f05 eb18 0100 0000 0100 0000  ..>.............
  00000020: 1800 0000 0000 0000 1800 0000 0100 0000  ................
  00000030: b0e7 31ff 0f05 3800 0100 0000 0000 0000  ..1...8.........
  00000040: 0100 0000 0000 0000 b20e 5feb c748 656c  .........._..Hel
  00000050: 6c6f 2c20 776f 726c 6421 0a              lo, world!.
The main annoyance is that I have to use a 7-byte instruction to load the RIP-relative address of the string into RSI, since the program can't be placed any lower than 0x000100000000 in virtual memory with this offset. There may yet be some way to shorten it that I haven't thought of, though. The four blocks of instructions are

  0x01:
    .ascii "ELF"
    mov al, 1
    lea rsi, [rip+0x40]
    push rax
    jmp 0x48
  0x14:
    syscall
    jmp 0x30
  0x30:
    mov al, 231
    xor edi, edi
    syscall
  0x48:
    mov dl, 14
    pop rdi
    jmp 0x14
  0x4d:
    .ascii "Hello, world!\n"
Perhaps I should make a write-up about the exact constraints imposed by the kernel, and how they limit the possible offsets between the ELF header and program header.


A variation, with a CALL/POP in place of the LEA/JMP, also clocks in at 91 bytes:

  00000000: 7f45 4c46 50b0 01b2 0e89 c7e8 2000 0000  .ELFP....... ...
  00000010: 0200 3e00 0f05 eb30 0100 0000 0100 0000  ..>....0........
  00000020: 1800 0000 0000 0000 1800 0000 0100 0000  ................
  00000030: 5e40 b64d ebde 3800 0100 0000 0000 0000  ^@.M..8.........
  00000040: 0100 0000 0000 0000 b0e7 5f0f 0548 656c  .........._..Hel
  00000050: 6c6f 2c20 776f 726c 6421 0a              lo, world!.
  
  0x01:
    .ascii "ELF"
    push rax
    mov al, 1
    mov dl, 14
    mov edi, eax
    call 0x30
  0x14:
    syscall
    jmp 0x48
  0x30:
    pop rsi
    mov sil, 0x4d
    jmp 0x14
  0x48:
    mov al, 231
    pop rdi
    syscall
  0x4d:
    .ascii "Hello, world!\n"
I no longer think it's within my ability to get this any shorter, at least not without wackier tricks, like using the fixed header bytes within instructions, or interleaving instructions within each other, or pushing some wild code snippet onto the stack. (Any variation of pushing the string onto the stack takes too many bytes to pad the immediate operands and to put RSP into RSI; running any trivial snippets on the stack simply doesn't make the code any shorter; and in both the last variation and this one, there's no way I can find to put the string address 0x10000004d into RSI and jump to the next block in fewer than 9 instruction bytes. Note that 1 instruction byte in the first block can be saved in both of these variations by popping argc into RAX rather than moving a literal 1 into it, but that extra byte can't be used to save any instructions in later blocks.)


As it turns out, the idea of using the fixed header bytes in the instructions isn't so wacky after all! We can move the 0x00000001 at 0x18 into EDI, then use the next two bytes as a JMP to the next block. (Also, I'd totally forgotten that the SYSCALL instruction saves the return address in RCX, which lets us use a no-op read(2) to shave a byte off the address calculation and avoid the assumption that argc == 1.) This technique takes the program down to 86 bytes, the smallest it can become without breaking up the "Hello, world!\n" string:

  00000000: 7f45 4c46 500f 05b0 01b1 4851 b20e eb04  .ELFP.....HQ....
  00000010: 0200 3e00 5eb3 e7bf 0100 0000 eb12 0000  ..>.^...........
  00000020: 1800 0000 0000 0000 1800 0000 eb12 0000  ................
  00000030: 0f05 935f 0f05 3800 0100 0000 0000 0000  ..._..8.........
  00000040: 0100 0000 0000 0000 4865 6c6c 6f2c 2077  ........Hello, w
  00000050: 6f72 6c64 210a                           orld!.
  
  0x01:
    .ascii "ELF"
    push rax
    syscall
    mov al, 1
    mov cl, 0x48
    push rcx
    mov dl, 14
    jmp 0x14
  0x14:
    pop rsi
    mov bl, 231
    mov edi, 1
    jmp 0x30
  0x30:
    syscall
    xchg ebx, eax
    pop rdi
    syscall
  0x48:
    .ascii "Hello, world!\n"
I can't think of any possible trick that would allow the string to be broken up while also saving bytes overall. Perhaps there exists some wild self-modifying code that somehow produces the correct behavior, or perhaps this program truly is the shortest.


The guys at Be Inc created a compressed ELF format for BeIA (the internet appliance OS they created just before the company folded.) I believe this is the patent (there may be others) that covers the process: https://patents.google.com/patent/US6883087

I think the Patent is just a general method for creating "crushed" executable code... The BeIA ELF implementation used the magic number "CEL" rather than "ELF" and is used a common table to look up certain symbols - so the exe only had the non common parts in it (or something like that, I haven't got a good handle on how it functioned, just a developers view of what it was like to use.) The process was called "crushing" and a "crushed" exe was massively smaller than the "uncrushed" version. I went over a load of the info I have in this thread over on the Haiku OS discussion forums: https://discuss.haiku-os.org/t/imagining-beia-like-concept-i...


> I haven't been able to track down the original date, but it was already around in the early 2000's.

I remember reading the original article when working on our 4k intro Sesamstr [1] for a demoparty in September 1999. (The ELF Kickers [2] changelog lists July 1999 for the initial release.) We wanted to dynamically link libvga.so, which requires some more features of the ELF header to play with. It turns out that the linker hash table can overlap with other parts of the header, to shave off some bytes.

As an additional shot of nostalgia, I remember some other sceners (Wedge, Viznut, XeF4?) trying to avoid SVGAlib entirely, so that one could use the smaller "a.out" format. Because one could not simply use movl $0x13, %eax / int $0x10 in Linux, there were some crude attempts to set all the required VGA registers "by hand". I think that required a couple of hundred bytes to pull off. Back then, you could probably get away with not restoring text mode at the end :)

[1] https://www.pouet.net/prod.php?which=1318

[2] http://www.muppetlabs.com/~breadbox/software/elfkickers.html


Nice. One thing I liked about the original MuppetLabs article is that it included a dl example with a call to libc.so (which adds a whole additional layer of complexity.) I'd be very interested in seeing a 'modern', 64 bit nasm example doing this in "raw" binary mode (nasm -f bin), tiny or not so tiny.


its not too bad, but you do need much more bytes for some additional elf structures like for example dynamic table entry for the library, dyn strings to go in there for the lib name and perhaps some relocation and things like GOT/PLT depending on if you want PIC code or not etc. fun excersize you can do yourself with this kind of thing as a base / template. i'd recommend to start with the non-overlapped-headers version :') saves some headaches. readelf and objdump can help a lot. (compare with normal compiled binary which runs 'printf' or some libc function...). once it works properly u can see what stuff u can overlap into places with bunches of 0s. (not a lot :P). everything u need is in elf specifications and x86_64 abi specification.


> excersize

If we'd transfer this discussion to DOS .exe files, there's bound to be an even better pun in there ;)


I agree with the pun potential, but surely on DOS the smallest binaries would be COM executables? So perhaps for exe we should restrict ourselves to modern Windows?


Ah interesting, thank you. Agree with sibling "excersize" is a really good one. I remember having a lot of fun excersizing with PE files under Windows95. More recent OS's, well, they are different.


And here I was thinking we were talking about those files included on old PS2 discs...




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: