The missing link: explaining ELF static linking, semantically (2016) [pdf] (researchgate.net)
31 points by signa11 29 days ago | 8 comments

Despite all the complexity that goes into _creating_ a statically linked ELF executable as the PDF describes, I was reminded exactly how easy it is to unwrap and load a static ELF binary into memory when I was writing a static ELF loader the other week.

You literally end up with a table ("section header table") in your executable with sections of type "SHT_PROGBITS" and "SHT_NOBITS". You copy <n> bytes from the offset in the ELF image for the former to the address specified, and zero <n> bytes at the address specified for the latter.
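To make the copy/zero rule concrete, here is a minimal sketch in Python, assuming a 64-bit little-endian ELF and modelling memory as a bytearray whose first byte sits at address `base`. The names `load_sections`, `memory` and `base` are made up for illustration. The `SHF_ALLOC` check is an addition not mentioned above: it skips sections such as `.comment` that are `SHT_PROGBITS` but are not part of the in-memory image.

```python
import struct

SHT_PROGBITS = 1   # section contents come from the file
SHT_NOBITS = 8     # section occupies memory but no file space (e.g. .bss)
SHF_ALLOC = 0x2    # section belongs to the in-memory image

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, little-endian
SHDR = struct.Struct("<IIQQQQIIQQ")        # Elf64_Shdr, little-endian


def load_sections(image, memory, base=0):
    """Copy or zero every allocated section; return the entry address.

    `memory` is a bytearray standing in for RAM (an assumption of this
    sketch); byte 0 of it corresponds to address `base`.
    """
    ehdr = EHDR.unpack_from(image, 0)
    if ehdr[0][:6] != b"\x7fELF\x02\x01":
        raise ValueError("expected a 64-bit little-endian ELF image")
    e_entry, e_shoff = ehdr[4], ehdr[6]
    e_shentsize, e_shnum = ehdr[11], ehdr[12]

    for i in range(e_shnum):
        (_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size,
         *_rest) = SHDR.unpack_from(image, e_shoff + i * e_shentsize)
        if not sh_flags & SHF_ALLOC:
            continue  # not part of the memory image
        dst = sh_addr - base
        if dst < 0 or dst + sh_size > len(memory):
            raise ValueError("section does not fit in memory")
        if sh_type == SHT_PROGBITS:
            if sh_offset + sh_size > len(image):
                raise ValueError("section extends past end of file")
            memory[dst:dst + sh_size] = image[sh_offset:sh_offset + sh_size]
        elif sh_type == SHT_NOBITS:
            memory[dst:dst + sh_size] = bytes(sh_size)
        # Other section types are ignored in this sketch.
    return e_entry
```

A real loader would then jump or call to the returned entry address.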

Obviously, in a system with virtual memory you need to account for page table mappings, and for a multiprocessing system you need to set up a process vector with the initial instruction pointer, the address of the page tables, etc., so that you can task switch.

But for the most basic case (i.e., a DOS or CP/M workalike) you can pretty much wallop the program into memory and jump or call to the entry point.

Even creating ELF executables is not particularly hard. As the article says, the complexity is in the linker. If you keep track of dependencies yourself, creating an ELF binary boils down to writing a header and aligning the text and data segments properly.
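As a rough illustration of "writing a header", here is a sketch in Python that emits a tiny executable image: one ELF header, one `PT_LOAD` program header, then the machine code. It assumes x86-64 Linux and a load address of 0x400000, and `build_elf` is a name invented for this example. The whole file is mapped as a single read/execute segment, so there is no separate data segment to align.

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, little-endian
PHDR = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr, little-endian

ET_EXEC, EM_X86_64 = 2, 62
PT_LOAD, PF_X, PF_R = 1, 1, 4


def build_elf(code, base=0x400000):
    """Return the bytes of a minimal static ELF64 executable.

    Assumption of this sketch: the file is mapped from offset 0 at
    `base`, so the code starts right after the two headers.
    """
    headers = EHDR.size + PHDR.size
    total = headers + len(code)
    # Magic, ELFCLASS64, little-endian, version 1, System V ABI, padding.
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    ehdr = EHDR.pack(
        ident, ET_EXEC, EM_X86_64, 1,
        base + headers,   # e_entry: first byte of the code
        EHDR.size,        # e_phoff: program headers follow the ELF header
        0, 0,             # e_shoff, e_flags: no section headers
        EHDR.size, PHDR.size, 1,
        0, 0, 0,          # e_shentsize, e_shnum, e_shstrndx
    )
    # p_offset and p_vaddr must be congruent modulo p_align.
    phdr = PHDR.pack(PT_LOAD, PF_R | PF_X, 0, base, base,
                     total, total, 0x1000)
    return ehdr + phdr + code


# Example payload: mov edi, 42; mov eax, 60 (exit); syscall
EXIT_42 = bytes.fromhex("bf2a000000" "b83c000000" "0f05")
```

Calling `build_elf(EXIT_42)` gives the file contents; writing them to disk and marking the file executable is left out here.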

Here is a full compiler that emits ELF in about 1600 lines of code (the ELF emitter is about 50 lines in size, most of which deals with generating the header):


That is a fascinating project, thanks for sharing.

An ELF loader doesn't need to look at sections; that's too fine-grained anyway. An ELF loader should instead use the program headers' PT_LOAD "instructions".
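For comparison, a segment-based loader might look roughly like this in Python. It assumes a 64-bit little-endian ELF and models memory as a bytearray whose first byte sits at address `base`; `load_segments`, `memory` and `base` are hypothetical names. Each `PT_LOAD` entry says: copy `p_filesz` bytes from `p_offset` to `p_vaddr`, then zero the rest up to `p_memsz` (which is where .bss ends up).

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, little-endian
PHDR = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr, little-endian
PT_LOAD = 1


def load_segments(image, memory, base=0):
    """Apply every PT_LOAD program header; return the entry address."""
    ehdr = EHDR.unpack_from(image, 0)
    if ehdr[0][:6] != b"\x7fELF\x02\x01":
        raise ValueError("expected a 64-bit little-endian ELF image")
    e_entry, e_phoff = ehdr[4], ehdr[5]
    e_phentsize, e_phnum = ehdr[9], ehdr[10]

    for i in range(e_phnum):
        (p_type, _flags, p_offset, p_vaddr, _paddr,
         p_filesz, p_memsz, _align) = PHDR.unpack_from(
            image, e_phoff + i * e_phentsize)
        if p_type != PT_LOAD:
            continue  # PT_NOTE, PT_GNU_STACK, etc. need no copying
        dst = p_vaddr - base
        if p_filesz > p_memsz or p_offset + p_filesz > len(image):
            raise ValueError("malformed PT_LOAD entry")
        if dst < 0 or dst + p_memsz > len(memory):
            raise ValueError("segment does not fit in memory")
        # File-backed part of the segment.
        memory[dst:dst + p_filesz] = image[p_offset:p_offset + p_filesz]
        # Zero-filled tail (p_memsz - p_filesz bytes).
        memory[dst + p_filesz:dst + p_memsz] = bytes(p_memsz - p_filesz)
    return e_entry
```

A typical static binary has only a handful of `PT_LOAD` entries, so the loop body runs a few times regardless of how many sections the linker produced.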

Ha, good to know! My ELF loader is 228 source lines and I would love to make it smaller. I will look into that.

Perhaps for CP/M. To load an MS-DOS .EXE file (not a .COM file), you'll need to do some fixups to adjust for the actual address the program is loaded at [1][2].

[1] https://github.com/spc476/NaNoGenMo-2015/blob/master/C/msdos...

[2] So I could run an old MS-DOS executable and redirect its I/O to the Linux host running the program. Dosbox did not support that feature.
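For readers who have not seen the fixups, here is a sketch in Python of the relocation step for an MZ-format .EXE. It is not taken from the linked code; `load_mz` and `load_seg` are names invented here, and `load_seg` stands for the segment at which the load module is placed. Each relocation entry points at a 16-bit word inside the module, and the loader adds the load segment to that word.

```python
import struct

MZ_HEADER = struct.Struct("<14H")  # first 28 bytes of an MZ .EXE header


def load_mz(image, load_seg):
    """Return (module, (cs, ip), (ss, sp)) with segment fixups applied."""
    (magic, cblp, cp, crlc, cparhdr, _minalloc, _maxalloc,
     ss, sp, _csum, ip, cs, lfarlc, _ovno) = MZ_HEADER.unpack_from(image, 0)
    if magic != 0x5A4D:  # the bytes "MZ" read as a little-endian word
        raise ValueError("not an MZ executable")

    # File size is counted in 512-byte pages; cblp is the number of
    # bytes used in the last page (0 means the last page is full).
    file_size = cp * 512 - ((512 - cblp) if cblp else 0)
    # The load module is everything after the header, whose size is
    # given in 16-byte paragraphs.
    module = bytearray(image[cparhdr * 16:file_size])

    for i in range(crlc):
        off, seg = struct.unpack_from("<HH", image, lfarlc + 4 * i)
        pos = seg * 16 + off  # position of the word inside the module
        (word,) = struct.unpack_from("<H", module, pos)
        struct.pack_into("<H", module, pos, (word + load_seg) & 0xFFFF)

    # Initial CS and SS are stored relative to the load segment too.
    return (module,
            ((cs + load_seg) & 0xFFFF, ip),
            ((ss + load_seg) & 0xFFFF, sp))
```

A .COM file has no header and no relocation table, which is why it can simply be copied into place.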

Sorry, I should have clarified: "for a DOS or CP/M workalike which uses ELF binaries", which is what I'm currently working on for fun.

That is an interesting note about DOS though. From memory, CP/M binaries are raw binary images hard-coded to be loaded at address 0x100, at least for CP/M-80 anyway.
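In other words, loading a CP/M-80 style .COM image needs no parsing at all. A sketch in Python, assuming a flat 64 KiB address space modelled as a bytearray (`load_com` and `TPA_BASE` are names made up here, and a real system would also have to keep the image below whatever the OS reserves at the top of memory):

```python
TPA_BASE = 0x100  # fixed load address for the raw image


def load_com(image, memory_size=0x10000):
    """Copy a raw binary image to 0x100; return (memory, entry address)."""
    if TPA_BASE + len(image) > memory_size:
        raise ValueError("image does not fit in memory")
    memory = bytearray(memory_size)
    memory[TPA_BASE:TPA_BASE + len(image)] = image
    # Execution starts at the first byte of the image.
    return memory, TPA_BASE
```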

This (https://www.youtube.com/watch?v=dOfucXtyEsU) is also pretty good.
