
Tips for C libraries on GNU/Linux - qznc
https://git.kernel.org/?p=linux/kernel/git/kay/libabc.git;a=blob_plain;f=README
======
bjourne
Most of those tips are great, but _don't_ use autotools! The points they give
for choosing autotools are exactly the same as one would have read maybe a
decade ago when people said that CVS was "good enough" and that it "just
works." Subversion, Git and several others are much better alternatives and
were developed because people weren't happy with CVS's quirks and crappiness.
One example is waf (<http://docs.waf.googlecode.com/git/book_16/single.html>),
arguably just as well documented as autotools but without all the cruft and
generated files madness autotools forces you to go through. CMake
(<http://www.cmake.org/>) is another, used by KDE 4. SCons
(<http://www.scons.org/>) is another nice build tool. These also have the
advantage of working much better on Windows, a platform autotools is
completely alien to.

~~~
yorhel
Please _do_ use autotools! You only need two files: configure.ac and
Makefile.am, both at the top-level of your project. The autogenerated stuff
can be ignored, you don't have to learn M4 to use autoconf. And (if you are a
bit careful, but that's not too hard) you'll have many nice features such as
out-of-source builds, proper feature checking, amazing portability, 'make
distcheck', and acceptable cross-compiling (still hard to get right, but the
alternatives tend to be even harder). Don't switch to another build system
purely based on the idea that it's "more elegant".
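
A minimal setup along the lines yorhel describes might look like this (the
project name, file contents and options here are purely illustrative, not
from any real package):

```
# configure.ac -- hypothetical minimal example
AC_INIT([libfoo], [0.1])
AM_INIT_AUTOMAKE([foreign])
AC_PROG_CC
LT_INIT
AC_CONFIG_FILES([Makefile])
AC_OUTPUT

# Makefile.am
lib_LTLIBRARIES = libfoo.la
libfoo_la_SOURCES = foo.c
include_HEADERS = foo.h
```

Run "autoreconf -i" once and you get configure, out-of-source builds and
'make distcheck' essentially for free; the generated files never need to be
edited by hand.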

~~~
ambrop7

      $ tar xf autotools-using-package.tar.bz2
      $ cd autotools-using-package
    

Configure, compile, install. Oh, I found this little bug. Let's try to fix it
... edits configure.ac ....

    
    
      $ ./autogen.sh
      Error: possibly undefined macro AC_BLABLABA
    

Spend some hours figuring this out... Oh, I need to install an old version of
auto*! How do I get the old one but keep the new one around? Spend another 30
minutes to figure that out.

    
    
      $ ./autogen.sh
      checking for build system type...
      ^C
    

No damn, I wanted to generate configure, not run it! How do I clean up the
mess it made just now?

    
    
      $ make clean
      $ make distclean
      $ ./configure --prefix=$HOME/my_app ...
      $ make -j9 install
      ...
      install: no such file or directory blabla.la
    

WTF!?!?! Spend an hour or so googling this mess. Ah, it's a parallel-make bug.

    
    
      $ make install
    

HOLY SHIT, IT INSTALLED!!!

Let's submit this fix upstream. No problem, use diff.

    
    
      $ mkdir temp
      $ tar xf autotools-using-package.tar.bz2 -C temp
      $ mv temp/autotools-using-package autotools-using-package.orig
      $ diff -urN autotools-using-package.orig autotools-using-package
    

WTF IS ALL THIS MESS IN THE DIFF I NEVER TOUCHED?!!??!

I know you're going to say I should be using the VCS checkout in the first
place, which would hopefully be configured to ignore the autogenerated files.
But as a user, or distribution maintainer, most of the time the bug you find
is with a specific, packaged version of the software, and it may be quite an
effort to figure out how to get the exact same version from the VCS server.

~~~
JoshTriplett
> $ ./autogen.sh
> checking for build system type...
> ^C
>
> No damn, I wanted to generate configure, not run it! How do I clean up the
> mess it made just now?

I always run "./autogen.sh --help" for exactly that reason; then if I see
--help output from configure, I know that autogen.sh "helpfully" ran configure
for me.

You can also usually just run "autoreconf -v -f -i" directly, unless the
package does something unusual that requires extra steps in autogen.sh.

------
radarsat1
> _Avoid callbacks in your API_

This is good advice, especially if you want to write bindings for other
languages; supporting callbacks (as I've found) adds a lot of difficulty at
that stage.

However, two main projects I work on use callbacks extensively. The reason is
that they are essentially event-based, and it would seem much less user-
friendly to force the user to implement a huge switch statement, particularly
when user-defined events are involved.

How else could callbacks be avoided? In some cases, they just seem like the
most user-friendly option.

Although for bindings they are more difficult, using callbacks can be great
when binding to higher-level languages, where the user can specify what should
happen using a short lambda function (e.g. Python, etc.) A switch statement or
whatever is much more obnoxious in those cases, so what other solutions are
there?

~~~
justincormack
Interfaces like epoll let you pass in, and get back, a 64-bit value which can
be the address of a callback or a key to look up in a dispatch table. The user
can choose. This is very flexible.

~~~
ambrop7
A very nice feature of epoll is that an epoll file descriptor (which monitors
a set of file descriptors) can itself be monitored via epoll/select/poll. This
means that if epoll is available, a library can abstract all its I/O via a
single file descriptor. Think of a complex operation that involves multiple
sockets and timer events. All the user needs to do is monitor this one file
descriptor in his event loop, and call a function in the library which
determines which of its own file descriptors have become ready, and reacts
appropriately, possibly by calling callbacks if the user needs to be notified.

Unfortunately if you're trying to be portable, you can't do this, and instead
have to implement a complete event notification abstraction (unless you know
you will only ever be dealing with this one file descriptor). E.g. user of the
library needs to implement functions like addWait(fd, io_type, callback),
delWait(wait_id), addTimeout(milliseconds, callback), delTimeout(timeout_id).

------
qznc
For a counter-point you could read <http://sta.li/faq>

Most prominently, Plan 9 and friends are opposed to dynamically linked
libraries.

~~~
emillon
It's also a security and maintenance nightmare. I totally respect the suckless
people for trying this experiment, though.

------
emillon
Nice set of tips. I'd like to put an emphasis on symbol versioning, SONAMEs
and symbol visibility. That's what allows distributions to upgrade your
library without recompiling everything, and too few developers are aware of
this (simple) mechanism.
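
The mechanism is small enough to sketch. For a hypothetical libfoo, a linker
version script plus hidden-by-default visibility keeps internal symbols
private and stamps the public ones with a version node (names and paths here
are made up):

```
/* libfoo.map -- GNU ld version script */
LIBFOO_1.0 {
    global: foo_public;
    local:  *;
};

/* Build with an SONAME and hidden default visibility:
 *
 *   gcc -shared -fPIC -fvisibility=hidden \
 *       -Wl,--version-script=libfoo.map \
 *       -Wl,-soname,libfoo.so.1 \
 *       -o libfoo.so.1.0 foo.c
 */
```

Everything not listed under "global" disappears from the dynamic symbol
table, and the SONAME lets the distribution swap in libfoo.so.1.1 later
without relinking dependent packages.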

------
jfaucett
"Don't write your own LISP interpreter and do not include it in your library.
:)" - lol

~~~
chj
I don't get it..

~~~
qznc
As we are in GNU-land here, you should use the official GNU extension
language: Guile (Scheme)

<http://www.gnu.org/software/guile/>

~~~
justincormack
No one seems to live in that part of GNUland any more.

~~~
noahl
A few of us do, but not many.

I work on Guile because I enjoy messing with compilers. I absolutely agree
that very few projects use it (exceptions include Lilypond and GNUCash).
However, that might change - Guile 2.0 switched from a simple interpreter to a
virtual machine and a compiler to that VM. That means that Guile is
competitive in speed with other scripting languages (and actually faster than
some of them, I believe). It also means that it now supports multiple
high-level languages. There is currently a good Emacs Lisp implementation and
about half of an ECMAScript implementation.

I don't know if it'll become more widely used, but on the mailing lists you
certainly see people writing libraries and contributing code, so I think there
is a real chance of it. My sense is that Guile is now coming out of a period
of stagnation. I don't know where it's going.

~~~
fafner
Isn't GNU Make adding support for Guile scripting?

edit: <http://lists.gnu.org/archive/html/guile-user/2012-01/msg00060.html>

~~~
noahl
You appear to be right, but I don't know anything about it.

That's exciting, though - it would be really cool if Makefiles could do more
computation. Maybe then we'd get an easier-to-use autoconfiguration system!

~~~
fafner
On one hand I'm a bit afraid that this might add more complexity to an already
complex tool. But on the other hand the current tools in Make are very ugly to
use. So Guile could certainly be an improvement.

And an easier-to-use autotools? Bring it on!

------
andrewcooke
<http://udrepper.livejournal.com/20407.html> explains the O_CLOEXEC issue.

~~~
ambrop7
The solution I have adopted in my software is to close all unneeded file
descriptors right after fork() in the child [1].

It can be argued that this is not the best solution because every place where
fork() is called needs to be patched, and this could be in libraries. But the
same applies to O_CLOEXEC flags; every place where file descriptors are
created needs to be patched. Further, there are probably many more places
where fd's are created than where fork() is called.

So if you want your library to be super careful, you should do both. Yes, I
know the article advises against calling fork() from libraries. But sometimes
you really need it. It's not bad per se, just bad when done on *nix because of
the broken design of the OS interfaces.

[1] <http://code.google.com/p/badvpn/source/browse/trunk/system/BProcess.c#241>

~~~
alexlarsson
Yeah, and then some other thread opens a file descriptor while your loop is
busy closing the last few file descriptors...

CLOEXEC approaches are the only race free solutions.

~~~
premchai21
fork() only leaves the one thread running in the child, and at that point the
fd tables are no longer shared, so trying to detect and close unwanted
descriptors in the child after fork is not racy by itself as a way of
mitigating the possibility of uncontrollable non-CLOEXEC opens elsewhere in
the process (though this doesn't preclude it being a bad idea for other
reasons).

~~~
alexlarsson
true, sorry.

------
meaty
I came here to have a rant about autotools being mentioned and found a lot of
people doing the same.

That's a sign something needs to be taken out in the yard and shot, if there
ever was one.

~~~
dlitz
No, seriously, please just use autotools. It's way better for users and
packagers of your library.

If you're not going to do that, then use an alternative that provides the same
command-line interface as autotools (so that things like "./configure
--prefix=FOO && make -j4 && make install DESTDIR=BAR" still work). As far as I
know, no such thing exists yet.

------
sigjuice
<https://twitter.com/timmartin2/status/23365017839599616>

------
olalonde
> \- Make your library threads-aware, but _not_ thread-safe!

Can anyone explain this piece of advice?

~~~
nn2
Don't do your own locking. Let the caller pass in state. Push locking to the
caller. Don't have your own global state that would need hidden locks, but
instead let the caller handle it with arguments.

That is similar to how the STL does it, but not like stdio.

I'm not sure I fully agree:

\- For simple libraries it's likely good advice.

\- But it encourages big locks and poor scaling. It may be right for desktop
apps, but not necessarily for server code that needs to scale. For some things
that's fine, but you don't want that for the big tree or hash table that your
multi-threaded server is built around.

\- It avoids the problem of locks being non-composable, i.e. that the caller
may need to know the order in which locks must be taken to avoid deadlock.
Actually it doesn't avoid it, just pushes it to someone else. However, if you
make sure the library is always the leaf and never calls back, the library's
locks will generally be at the bottom of the lock hierarchy.

~~~
KMag
If this advice is taken the wrong way, then it "just pushes [the locking
problem] to someone else", but often locking is a crutch. Sure, there are some
programs that have a natural need for a lot of globally mutable state, but not
many.

Let's be honest. Most multithreaded programs evolve from programs that are
more-or-less single threaded. Then, threads are added in an attempt to improve
performance, and high-contention locks are broken into finer grained locks
when profiling shows lock contention in the critical path. I would argue it's
better to design for minimal mutable global state from the start. Failing
that, it's often better to refactor the code when you start scaling up the
number of threads, before you invest a lot of time in locking and breaking
your big locks into finer and finer grained locks.

I'm sure you're not one of those programmers who often leans on
mutexes/semaphores/etc. as a crutch to prop up poor design, but there are a
lot of programmers who do.

