a.out was the name of the output file from the Unix linker, so the format of that file became the "a.out" format. All the other Unix toolchains did the same thing for compatibility.
a.out is pretty limited, though fine for small PDP-11 binaries: some text (executable), initialized data, and an amount of RAM to preallocate, plus some symbols.
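To make "pretty limited" concrete, here is a minimal sketch of decoding such a header. It assumes the classic PDP-11 layout of eight 16-bit little-endian words, with field names from the historical a.out.h; the example values are made up and not taken from any real binary.

```python
import struct

# Assumed layout: classic PDP-11 a.out header, eight 16-bit little-endian words.
HEADER = struct.Struct("<8H")
FIELDS = ("a_magic", "a_text", "a_data", "a_bss",
          "a_syms", "a_entry", "a_unused", "a_flag")


def parse_aout_header(raw: bytes) -> dict:
    """Decode the first 16 bytes of a PDP-11 style a.out image."""
    if len(raw) < HEADER.size:
        raise ValueError("too short for an a.out header")
    return dict(zip(FIELDS, HEADER.unpack_from(raw)))


# Hypothetical values: magic 0407, 512 bytes of text, 64 of data, 128 of bss.
example = struct.pack("<8H", 0o407, 512, 64, 128, 0, 0, 0, 1)
header = parse_aout_header(example)
print(header)
```

That is the whole description of the program: a few sizes, an entry point, and a flag word, with no room for extra named sections.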
Even when I was designing bfd back in 1989 it was clear that a.out, while pervasive, just wasn't powerful enough, especially when you added C++ constructors, destructors, atexit stuff etc. System V by then had COFF to be more flexible (Microsoft's PE is a COFF derivative). But COFF was kinda crufty (not bad for a first effort!) and was clumsy for things like shared libraries.
ELF is a good example of a standards committee designing something to handle known problems. It's not perfect, but pretty good.
One can have e_phoff or e_shoff or both... or I suppose neither. Because you can have both, they can conflict. If you have e_phoff, then within that there is the issue of p_vaddr and p_paddr. You could have one or the other or both, or I suppose neither. They can thus conflict. The various flags associated with these things can also conflict.
This comes up often with boot loaders and kernels. There are versions of Linux that are described by p_vaddr, p_paddr, and sh_addr. Sometimes one or another has an offset added/subtracted, or is zero, or is simply missing or worse.
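For readers who have not looked at these fields, here is a rough sketch of where they live. It assumes a little-endian ELF64 file and builds a synthetic header plus one program header in memory. The addresses are hypothetical, chosen only to show p_vaddr and p_paddr disagreeing the way they might for a kernel image.

```python
import struct

EHDR64 = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 file header, 64 bytes
PHDR64 = struct.Struct("<IIQQQQQQ")          # ELF64 program header, 56 bytes

EHDR_FIELDS = ("e_ident", "e_type", "e_machine", "e_version", "e_entry",
               "e_phoff", "e_shoff", "e_flags", "e_ehsize", "e_phentsize",
               "e_phnum", "e_shentsize", "e_shnum", "e_shstrndx")
PHDR_FIELDS = ("p_type", "p_flags", "p_offset", "p_vaddr",
               "p_paddr", "p_filesz", "p_memsz", "p_align")


def parse_ehdr(raw: bytes) -> dict:
    """Decode the file header, accepting only little-endian ELF64."""
    hdr = dict(zip(EHDR_FIELDS, EHDR64.unpack_from(raw)))
    ident = hdr["e_ident"]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 header")
    return hdr


def parse_phdr(raw: bytes, offset: int) -> dict:
    """Decode one program header at the given file offset."""
    return dict(zip(PHDR_FIELDS, PHDR64.unpack_from(raw, offset)))


# Synthetic file: program headers at offset 64, no section header table.
ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
blob = EHDR64.pack(ident, 2, 62, 1, 0xFFFFFFFF80100000,
                   64, 0, 0, 64, 56, 1, 64, 0, 0)
blob += PHDR64.pack(1, 5, 0x1000, 0xFFFFFFFF80100000, 0x100000,
                    0x2000, 0x2000, 0x1000)

ehdr = parse_ehdr(blob)
phdr = parse_phdr(blob, ehdr["e_phoff"])
has_segments = ehdr["e_phoff"] != 0   # program header table present
has_sections = ehdr["e_shoff"] != 0   # section header table present
addresses_differ = phdr["p_vaddr"] != phdr["p_paddr"]
print(has_segments, has_sections, addresses_differ)
```

Nothing in the format forces the two views, or the two addresses, to agree with each other, which is the complaint above.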
None of the above even makes sense once you add ASLR. I have no idea how that is expressed, but I'm pretty certain it is a gross hack. There just isn't a proper way to express such a thing.
Needlessly, there are two types of symbols. They could conflict. Better formats just use a flag bit.
BTW, when I hand-generate ELF files, I hit bugs in BFD. I recall hitting one that relates to string tables, with BFD refusing to accept an index of 0. I had to add a useless NUL byte.
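As background on index 0: ELF string tables are NUL-terminated strings packed end to end, referenced by byte offset, and the generic ABI defines the byte at index 0 as a NUL so that index 0 names the empty string. A minimal sketch of the lookup, using a hypothetical table:

```python
def strtab_lookup(strtab: bytes, index: int) -> str:
    """Return the NUL-terminated string starting at the given byte offset."""
    end = strtab.index(b"\x00", index)
    return strtab[index:end].decode("ascii")


# Hypothetical table: leading NUL, then ".text" and ".data".
strtab = b"\x00.text\x00.data\x00"

print(repr(strtab_lookup(strtab, 0)))  # the empty name
print(strtab_lookup(strtab, 1))
print(strtab_lookup(strtab, 7))
```

An offset may also point into the middle of an entry, which is how tools share suffixes between names.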
ASLR is, relatively speaking, a recent invention (when compared to ELF). ELF describes memory layout, so if you want to randomize it you pretty much have to do your relocation at runtime, which you can do except for the entry point, but I suppose you could make an extension to relocate even that. For that one small thing I think you could propose a revision to the ELF spec.
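As a toy model only, runtime relocation of a whole image can be pictured as adding one page-aligned bias to every link-time address. The names, addresses, and page size below are assumptions for illustration, not a description of how any particular loader implements it.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model


def apply_load_bias(link_addrs: dict, bias: int) -> dict:
    """Shift every link-time address by one page-aligned runtime bias."""
    if bias % PAGE_SIZE:
        raise ValueError("bias must be page aligned")
    return {name: addr + bias for name, addr in link_addrs.items()}


# Hypothetical link-time addresses for an image linked at base 0.
image = {"text": 0x1000, "data": 0x3000, "entry": 0x1040}
loaded = apply_load_bias(image, bias=0x7F00_0000_0000)
print({name: hex(addr) for name, addr in loaded.items()})
```

In this model the entry point is just one more address that moves with the rest, so relative distances inside the image are preserved.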
And I'm sure there are still bugs in bfd! But with ELF being pretty pervasive these days, that 30-year-old (just noticed that: 1989-2019) software is less useful than it used to be. But feel free to send in a bug fix if you can't have a string table index of 0 (it would surprise me if that bug really exists, but stranger things have happened).
The same mostly goes for having both physical and virtual addresses. Has there ever even existed a loader that would generate page tables according to that? There should have been one kind of address, and possibly header flags to indicate allowable MMU states.
If there is no such bug, then you can stop wasting that byte at the beginning of the string tables.
All high end language features drop down to binary formats -- that's what ABIs are.
In the case of things like constructors and destructors you can of course just stick them all in the text section and have the linker construct calls to them in some fashion. You'd have to use some reserved name mangling to distinguish them so that the linker can even find them.
If instead you have separate sections the linker can just run through the symbol table; the dynamic loader can even do so, which means the compile-time linker can ignore those sections (just copy them). You can even play tricks like unloading them after running (or only loading exit code upon exit) if memory games like that matter, and so on.
It's all about how you communicate semantic issues from compile time to run time, as with all ABI issues.
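The separate-section approach described above can be modelled very roughly as follows. This is a toy sketch: the section names "ctors" and "dtors" and the dictionary-based "object files" are hypothetical stand-ins, not a real linker interface.

```python
def link(objects: list) -> dict:
    """Concatenate same-named sections without interpreting their contents."""
    image = {}
    for obj in objects:
        for name, entries in obj.items():
            image.setdefault(name, []).extend(entries)
    return image


def run_section(image: dict, name: str) -> None:
    """What a loader or startup routine would do: call each entry in order."""
    for fn in image.get(name, []):
        fn()


log = []
obj_a = {"ctors": [lambda: log.append("a.ctor")],
         "dtors": [lambda: log.append("a.dtor")]}
obj_b = {"ctors": [lambda: log.append("b.ctor")]}

image = link([obj_a, obj_b])
run_section(image, "ctors")   # before main
log.append("main")
run_section(image, "dtors")   # at exit
print(log)
```

The point of the design is that link() never needs to know what a constructor is; it only copies sections, and the meaning is supplied at run time.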
This needs to happen regardless of how the binary or dynamic library gets loaded.
It is not directly related to the filename used by gcc. It is related to the a.out binary executable file format (https://en.wikipedia.org/wiki/A.out). The confusion arises because the file format and gcc's default output name are the same.
> Also, I thought bitrot was bits in files changing over time due to the storage medium failing;
That is the usual meaning for "bitrot".
> is this a different use of the term?
To some extent yes. In this context "bitrot" refers to parts of the source code drifting out of step with changes made in other areas of the source. So in a very broad sense it is a similar meaning, but also a very context-dependent one.
Yes, it is historically related. The file format is the descendant of the binary format of the first C compiler, which already produced a.out files by default. Today on Linux you typically get an ELF file, but it is still named a.out.
My understanding is that it's more to do with things moving on around a project that's not actively maintained. That piece of software you wrote 15 years ago might not run any more because the Linux kernel has changed, or the libc API. It may not compile straight off because a dependency is gone, or a header file has moved, or all sorts of little niggly things.
It's not literal rot, but a sort of metaphor for what happens when you leave stuff for ages.
(answer: historically related, but the gcc output isn't in this format today)
No, that's not what "bit rot" means.
"Bit rot" refers to software that stops working because of non-backward-compatible changes in the hardware/software environment.
For example, if you try to play an old Windows 95 game on Windows 10, you will probably have a bad time.
I have encountered COFF executables, and been part of a COFF->ELF transition on an embedded system, but Linux never supported COFF afaik.
But if it's broken for 3 years and you didn't say anything, nobody will make a fix for it. Or it's unlikely.
A.out isn't used anymore. Like, not even the most ancient Linux machines are likely to use a.out format. For now it's going to be hidden behind a flag, default off; if nobody complains, it'll be completely removed in a few releases.
I still have the file system from a Linux host I installed in 1999 or 2000. Debian; I made a VM copy when I retired the iron. It doesn't have a.out shared libraries, so even if the kernel's support worked in 2000, I couldn't run that binary for lack of shared libraries.
If the kernel's support hasn't really been tested for 19-20 years and the shared libraries haven't even been built, then I think running that binary has very poor chances of working.
Now, if you agree that shared libraries haven't been available for a long time and that coredumps haven't worked for a long time, that doesn't mean that nothing works. That doesn't mean that every case has been broken. But I think it means that the support has been very lightly tested for a long time now, or not tested at all. And assuming that something works when it hasn't been tested for a long time is ugh.
I think the reason I care about this is dogmatic. I believe, dogmatically, that untested code is broken. If you accept that dogma and do not test a.out support now, then the coming change does not break a.out. If you do test, fine. If you disagree that untested code is broken, fine. If you say it "might imaginably work" or somesuch, fine.
a.out is a very old format superseded by ELF.
I would like to ask whose idea it was to name GCC's default output after a completely different file format?!
So the command-line arguments and defaults are older than the ELF format. When ELF support was added, the question was: Should the default value for -o depend on other arguments? There didn't seem to be a good reason to have a complicated default, and keeping the simple default provided compatibility with old scripts and makefiles.
cd tests ; for a in *.c ; do gcc "$a" && ./a.out ; done
That's a great blog post; it's not just about the policy, but talks about some people who disagree, provides a lot of links, and is just generally better than anything I could hope to write here. I'll only observe that I'm not endorsing anything except that that's a pretty good post. I'm just pointing out the context, not arguing for or against "don't break userspace" personally.
A.out is so positively ancient that it's likely been broken for years; nobody is testing or using it in live systems anymore.
He's also pragmatic. He will break userspace in 100 ways if it's needed.
When Linus says "never break userspace", I understand it as: never break userspace without warning and consensus over an alternative. Don't break userspace to force it to evolve or fix things. Fix it there first and then deprecate.
It is not the same thing.
The following disk sets are available:
A   The base system. Enough to get up and running and have elvis and
    comm programs available. Based around the 1.0.9 Linux kernel,
    and the new filesystem standard (FSSTND).
    These disks are known to fit on 1.2M disks, although the rest of
    Slackware won't. If you have only a 1.2M floppy, you can still
    install the base system, download other disks you want and
    install them from your hard drive.
AP  Various applications and add ons, such as the manual pages,
    groff, ispell (GNU and international versions), term, joe, jove,
    ghostscript, sc, bc, and the quota patches.
D   Program development. GCC/G++/Objective C 2.5.8, make (GNU and
    BSD), byacc and GNU bison, flex, the 4.5.26 C libraries, gdb,
    kernel source for 1.0.9, SVGAlib, ncurses, clisp, f2c, p2c, m4,
E   GNU Emacs 19.25.
F   A collection of FAQs and other documentation.
I   Info pages for GNU software. Documentation for various programs
    readable by info or Emacs.
N   Networking. TCP/IP, UUCP, mailx, dip, deliver, elm, pine, smail,
    cnews, nn, tin, trn.
Object Oriented Programming. GNU Smalltalk 1.1.1, and the
    Smalltalk Interface to X (STIX).
Q   Alpha kernel source and images (currently contains Linux
Tcl, Tk, TclX, blt, itcl.
Y   Games. The BSD games collection, and Tetris for terminals.
X   The base XFree86 2.1.1 system, with libXpm, fvwm 1.20, and xlock
X applications: X11 ghostscript, libgr13, seyon, workman,
    xfilemanager, xv 3.01, GNU chess and xboard, xfm 1.2, ghostview,
    and various X games.
XD  X11 program development. X11 libraries, server linkkit, PEX
XV  Xview 3.2 release 5. XView libraries, and the Open Look virtual
    and non-virtual window managers.
IV  Interviews libraries, include files, and the doc and idraw apps.
    These run unreasonably slow on my machine, but they might still
    be worth looking at.
OI  ParcPlace's Object Builder 2.0 and Object Interface Library 4.0,
    generously made available for Linux developers according to the
    terms in the "copying" notice found in these directories. Note
    that these only work with libc-4.4.4, but a new version may be
    released once gcc 2.5.9 is available.
T   The TeX and LaTeX2e text formatting systems.
> Because I think the likelihood that anybody cares about a.out core dumps is basically zero. While the likelihood that we have some odd old binary that is still a.out is slightly above zero.
You can do it the other way around.
At the time I could run an entire firewall on a 1.44 MB floppy, kernel, libs and enough userland for ipfw scripting. With FreeBSD 2.x. FreeBSD 3.x with ELF put that right out of reach.
Of course now that I spend my days parking silly stuff in random ELF sections I am pretty happy.
“Linux's transition to ELF was more or less forced due to the complex nature of building a.out shared libraries on that platform, which included the need to register the virtual address space at which the library was located with a central authority, as the a.out ld.so in Linux was unable to relocate shared libraries.”