
ABI vs. API (2004) - hiq
https://lists.debian.org/debian-user/2004/02/msg00648.html
======
nachtigall
Best and shortest explanation I've ever read.

It's a pity that Colin Watson left Debian's Technical Committee
([https://lists.debian.org/debian-ctte/2014/11/msg00052.html](https://lists.debian.org/debian-ctte/2014/11/msg00052.html)), but his blog is still worth reading:
[http://www.chiark.greenend.org.uk/~cjwatson/blog/](http://www.chiark.greenend.org.uk/~cjwatson/blog/)

------
ajuc
Great resource about C++ ABI compatibility (what changes ABI, what doesn't,
and how to change code without changing ABI):

[https://community.kde.org/Policies/Binary_Compatibility_Issu...](https://community.kde.org/Policies/Binary_Compatibility_Issues_With_C%2B%2B)

------
jwatte
Having maintained a C++ API and ABI based on implementation inheritance with
virtual functions, I have deep respect for both kinds of compatibility.
Someone even used placement new to put objects with vtable pointers in shared
memory, expecting destructors to work in a different process!

Check out the "fragile base class problem" some time for more things that can
go wrong.

Something that wasn't made explicit: if you support closed-source, shipped
binaries, then the ABI can't change, or you break all the compiled binaries
users have installed. If rebuilding the world from source is an option,
changing the ABI is an option.

(Introducing a new ABI for new subsystems is still OK - hence renaming libc
when it changes)

(Say what you want about Microsoft, but COM solved that problem pretty well in
the '90s...)

~~~
teacup50
> _If rebuilding the world from source is an option, changing the ABI is an
> option._

It's an option ... but not a very good one.

It tends to create an unstable environment that requires a centralized
organization just to keep it working, precluding a healthy diverse software
ecosystem.

This is a major reason why it's very difficult to distribute binary packages
when operating externally to the distribution packagers, and why commercial
software distribution on Linux is a massive pain in the ass.

------
hendzen
The ultimate guide on writing shared libraries, ABI compatibility, etc, is
this document written by Ulrich Drepper:
[https://www.akkadia.org/drepper/dsohowto.pdf](https://www.akkadia.org/drepper/dsohowto.pdf)

------
twic
Hmm. I've always understood 'ABI' to mean something like 'the conventions for
turning an API into machine code'. So the fact that floating-point arguments
go on the stack is part of the ABI, whereas the fact that some function takes
two floating-point arguments is part of the API. This seems to be the meaning
used in the System V Application Binary Interface [0].

[0]
[https://www.x86-64.org/documentation/abi.pdf](https://www.x86-64.org/documentation/abi.pdf)

------
Someone
30 years ago, this article could have mentioned the OSI model
([https://en.m.wikipedia.org/wiki/OSI_model](https://en.m.wikipedia.org/wiki/OSI_model))

Although described in terms of networks, it is flexible/abstract/content-free
(1) enough to describe communication between software components on a single
machine.

An API describes what programs/libraries say to each other; an ABI is a lower
layer that describes how they should do that. I would place them on levels 4
and 3. Level 2 is the hardware in the CPU, and level 1 is the electrons moving
through tiny wires, but one can change that to something different (light
flowing through fibers, a human keeping track of state on paper, etc.) without
affecting either the ABI or the API.

Looking at it this way, it is clear that there are levels above an API. For
example, one could place the semantics of the API (you can only close a file
that is open, free memory you allocated, etc) at level 5.

(1) I think it is a mix of these, but feel free to pick an adjective you like.

------
dunkelheit
That's why developing with shared libraries is such a PITA. Got to preserve
those precious ABIs.

~~~
viraptor
What exactly do you mean by PITA? In most cases, in my experience, you have two
very simple choices: 1. don't modify the existing API / public structures as
you change things, and the ABI will stay safe; 2. just change things and bump
the major version number when necessary.

The third way of carefully doing only additive modifications to the library to
preserve ABI even over upgrades is hard - that's true. But I don't think it's
needed that often once most of the features are in. And again - it applies
only if you've got some really popular software that gains anything by doing
it this way, rather than just bumping up the version.

~~~
dunkelheit
What I mean is that in addition to rules of the language I program in (say C
or C++) there is another bunch of seemingly arbitrary rules that I must
constantly be aware of to ensure nothing breaks. Rules like "don't reorder
fields of structs" or "you can add new methods to a class as long as they are
not virtual; and sometimes you can add virtual methods too, as long as they end
up at the end of the vtable".

How do I ensure that I conform to these rules apart from being disciplined
about them? If everything compiles, am I good? No! If everything links, am I
good? No! Depending on linker options and the nature of incompatibility the
program using incompatible ABIs can blow up at runtime or just silently
corrupt data.

And don't get me started on the venerable "pimpl idiom". A page of boilerplate
just to ensure the most basic thing.

Hope that clarifies my short sentiment a bit :) I agree that once you grasp
the rules following them is not that hard but it is just another bit of
incidental complexity that we agreed to maintain.

~~~
rectang
C++ ABI compat rules _really_ hurt when they prevent you from evolving a
library in a sensible way. Say that you really need to add a method to a
virtual class. Unless you planned ahead by adding placeholder methods, you're
SOL. What a lousy design constraint!

This problem is solvable at the language/runtime level by resolving vtable
offsets at runtime, but that doesn't help the existing C++ userbase.

~~~
toolslive
It hurts so much that the boost people decided they wouldn't even bother
trying.

------
amelius
In my opinion, with every change in ABI, the build system should transparently
update all binaries. The only thing the developer should notice is that the
build takes longer.

~~~
tremon
That would require the build system to have access to the source code of all
binaries compiled against it. That is doable within projects (make
dependencies), and within centralized open-source distributions such as Debian
(through archive rebuild triggers).

But it is impossible in general.

~~~
jschwartzi
Wouldn't there be idioms in the assembly which represent compatibility with
the old ABI? Couldn't you translate those directly in the assembly without
having to re-run the compilation?

ABI describes things like how the stack works, but at the assembly level the
only stack is the one you implement, which is generally compatible with an ABI
version.

~~~
tremon
Not easily. What you are describing is annotating every entry point in a
library with symbol tags, and annotating every function call with the same
symbols. It basically requires you to have another JIT compilation phase or
runtime library to resolve function calls. We do have examples of that (e.g.
.NET assemblies), but it requires a lot of infrastructure to support it.

And there's still more than one stack implementation choice at the assembly
level. For example, what is the correct order for passing parameters on the
stack? What is the memory alignment for off-size stack parameters? Is the
caller or the callee responsible for reclaiming the stack space at function
completion? How are function return values passed back?

------
alistproducer2
Can someone explain how changing a dependency's ABI breaks things both at
compile and run time?

~~~
filereaper
A good recent real-life example of this is when the POWER architecture moved
to little-endian. The APIs stayed the same (think the glibc C API), but LE
introduced a new ABI where registers (e.g. r12) that had a specific purpose in
BE changed in LE. This all has to do with linking/compiling.

At runtime, if you're reading/writing memory, you had to ensure that you read
and write using the same "stride", i.e. reading and writing back in 32 bits is
fine, but if you read in 32 bits, do some transforms in a register, and write
it out in 16 bits, you'd have to make sure the internal register BE
representation and the written-out-to-memory LE representation made sense.

The GCC and other compiler folks can tell you lots of war stories about SIMD
(VMX/VSX on POWER).

Anyways, just something I thought of off the top of my head :)

~~~
alistproducer2
Thanks. I've been reading up on OS Dev and static and dynamic libraries and
have been trying to wrap my head around why an ABI change could blow things
up.

------
georgiecasey
very tenuously related, but I forgot that Ian Murdock died over Xmas. crazy.
any updates on what happened there?

------
shiggerino
One of the worst design decisions of Unix to fundamentally separate the source
from the binaries. And Debian and those of their ilk had to make it worse with
that -dev package nonsense.

~~~
csl
Unless I misunderstand you, I thought the exact _opposite_ was true: The
separation of source from binaries is what made UNIX a success. When C was
designed, it allowed one to take the source from one machine to another,
disregarding the underlying machine architecture.

~~~
mitchty
I assume (presume, perhaps) they're a Windows user and the -dev mention is more
about the case where debug symbols and packages are separated from non-debug
binaries.

I only know that Windows does something different, more akin to debug symbols
in each binary by default, with chicanery to not have it impact things in the
general case. But I'm not a Windows programmer; this is about as much as I know
about Windows in that regard.

~~~
shiggerino
Never been a Windows user or programmer, but like you I've heard they do quite
a lot of clever stuff under the hood. Probably a lot of VMS heritage, another
system I've never had a chance to use but heard good things about. But every
version I've tried since Windows 3.1 has had an abysmal user experience, so
I've never once considered switching.

And that leaves no current viable alternatives to the Unix-like systems. Sad,
really.

------
hitlin37
How different is ABI from XPCOM? Didn't XPCOM have the same goal of providing
binary compatibility?

~~~
treve
ABI is not a specific technology in the same way 'API' is not. So the ABI
doesn't have any specific goals, it's just something that emerges when working
with dynamic linking.

------
tener
A decent explanation, but one part strikes me as odd:

> Changing a library's API in a backwards-incompatible way is in general bad
> form because it means that developers of programs using the library have to
> change their source code to port the programs to the new library. However,
> it does happen sometimes in the case of major libraries, such as some of
> those in GNOME and KDE.

So only the big guys are allowed to break the APIs? Really?

~~~
awalton
Nothing in that comment said you can't do whatever you want - it said, quite
specifically, it's bad form to do it, and then explained exactly why that was
the case.

Plenty of libraries break API/ABI all the time; FFmpeg was/is notoriously
terrible about it. But depending on libraries that are hugely unstable in this
way is terrible, so in most cases people go out of their way to avoid doing
so (most frequently by using another library, or in harder-to-avoid cases by
creating an interposer library to smooth out the instabilities of the
underlying library).

OSS developers probably know more than anyone in the world that the most
expensive thing to do in software engineering is maintaining a piece of
software as time moves forward. Anything that makes that job more difficult
needs to have an enormous upside to make it worthwhile, and in almost every
case, using an unstable library is just not worth the extra maintenance costs
attached.

~~~
tener
All true, but for some reason he sees the major libraries as a different beast,
and I just don't see why; that is why I wrote that it strikes me as odd.

~~~
curryhoward
I think you are interpreting it as "Major libraries are allowed to do it
sometimes."

I think what he meant to say is "Even major libraries do it sometimes." It is
already understood that minor libraries do it.

------
invernomut0
I think you should indicate somewhere in the link title that this is an
off-topic discussion from the 2004 debian-user mailing list.

~~~
Noseshine
"Off topic" to what? The title chosen here matches the contents it links to.

~~~
Rexxar
"[OT]" in the original title probably means "off topic", but I don't know why
we should care about that here.

