

Dependency Inversion in C Using Function Pointers - ernstsson
http://ernstsson.net/post/26821666317/dependency-inversion-in-c-using-function-pointers

======
kabdib
... annnnd, we've reinvented callbacks. They've been around a long time,
they're practically primordial. Cavemen were knapping callbacks out of flint
in anticipation of the first computers.

I have some rules for these that have served me well:

1. Never hold locks when you're calling someone back. You have NO idea how
the caller is going to abuse you.

2. Be prepared to handle recursion (usually with deferral of some kind, or
possibly an error), because at some point the callee is going to call you
back.

3. Always provide a 'void*' or some other context to be passed along with the
function pointer (or the callee is doomed to use a global).

4. Document what the callee can do. For instance, if you're a timer object
making "alarm clock" callbacks, forbid callees from taking too much time
before they return; assert if they blow it.

~~~
ernstsson
Well, not reinventing, to be honest; rather redocumenting. There's a lot of
caveman behavior that still needs to be taught to "cave-kids". You're correct
that an experienced C programmer should know this, but I do expect more to
join the ranks. Good list of rules! Actually, it's good to keep in mind even
with the original tangled design, but resolving the tangle clarifies these
responsibilities even further (the documentation, making sure there's a
void*, etc.).

~~~
jerryr
Agreed. I find articles like this great for passing along to junior engineers
to help extend their vocabulary, making our pairing more productive. So, thank
you for this nice little example of using dependency injection to remove
static dependencies, which, besides reducing "smell", makes unit testing more
tractable.

I do agree with the parent's list--especially the "void*" pointer for passing
around context. Unless the injected routine is doing something very simple,
some context is almost always required. Providing that along with the function
pointer helps avoid globals--and thus avoid unnecessary singletons.

I could see how providing a complete example that illustrates the use of this
context might muddy the core focus of your article. Maybe a follow-up article?
:) Thanks again for creating teaching material for me.

~~~
kabdib
I'll add: The times that I've left off a void* context, I've always wanted one
later. Just put one there. Honest. Don't think about functions without also
thinking about their environments.

(In languages with closures, you'd just use a closure. Passing a void* around
is C's meatball way of expressing an execution context).

------
gioele
Two problems: performance and premature abstraction.

First, using function pointers instead of direct calls drastically reduces
the compiler's ability to optimize function calls. Second, on most
architectures, calling a function through a function pointer incurs
noticeable overhead compared to a simple function call.

The other problem is premature abstraction. The code in this example is
tangled because the "server" has only one "client". If there were at least
one more client to be supported at runtime, it would surely be turned into
something similar to what has been suggested. But removing the coupling
before you know the similarities and dissimilarities between "clients" can
only bring problems.

Obviously all this applies only to internal calls, not stable APIs.

~~~
ernstsson
Normally compilers don't optimize function calls between files anyway, so in
this specific case it wouldn't matter. Within the same file it would have
been different, of course, which brings us to the second point: yes, in this
simplified example it's really very silly to have two files at all. The code
itself isn't bad, just badly abstracted. The best refactoring for this
limited problem would of course be to combine the files, but then the
description of how to invert dependencies would have been lost. The cost of
keeping examples easy to grasp in a short blog post, I guess.

~~~
exDM69
> Normally compilers don't optimize function calls between files anyway, so
> in this specific case it wouldn't matter.

This is why function pointers are poison to the optimizer. There can be a
10-100x perf difference between C's qsort and C++'s std::sort, because the
function pointer kills performance, while in the C++ case the sort function
is a template in a header file and can be inlined and then further optimized.

Thankfully, all major C compilers have at least some link time optimization
efforts going on. When link time optimization becomes more widely available,
we can finally stop thinking about translation units and use function pointers
as much as we like.

~~~
ernstsson
So, as a summary of the comments above: function pointers between files
aren't bad for performance if we can't do link time optimization. And when
link time optimization really becomes widely available, they're still not
bad? So it's just the step in the middle that could be affected? Thinking
out loud here; any thoughts on this? I usually opt for structure over
performance until profiling proves otherwise, but I still find this to be an
interesting topic.

~~~
exDM69
> So, as a summary of the comments above: function pointers between files
> aren't bad for performance if we can't do link time optimization.

No. Function pointers between files are horrible for performance. Do not use
them if you're on the fast path. A function pointer (across translation
units) not only destroys the compiler's ability to optimize, it also kills
your CPU's instruction cache. You can solve the problem by using inline
functions and applying __attribute__((always_inline)) or similar if
required. If in doubt, check the assembly output of your compiler to verify
that all calls have been inlined.

Link time optimization is the cure. It's not widely available at the moment,
but maybe in a few years it will be more important. If you work in a
relatively isolated piece of software, it's possible that you can use a
compiler with link time optimization and get the perf gains today.

Meanwhile, don't use function pointers across translation units if you know
you're on the fast path. If you're optimizing something that is profiled to be
slow, look for function pointers because getting rid of them can give a big
boost.

> I usually opt for structure over performance before profiling proves
> otherwise, but I still find this to be an interesting topic.

You should always go for structure over performance except when you should
not.

------
irahul
Injection by parameters (setters, constructors, and simple params) has been
in use long before DI became a fad.

    void qsort(void *base, size_t nmemb, size_t size,
               int (*compar)(const void *, const void *));

Here, `compar` is being injected. When it is needed, it's sweet. But when you
start going down Java's way (i.e. the way Java frameworks (over)do it), it's
irritating. The point of DI is to inject dependencies rather than hard-coding
them. That doesn't mean you have to use IoC containers, or that you have to
inject everything.

~~~
hendzen
Curious to hear which Java DI frameworks you dislike. I've used Google Guice
before on a large project and I found it to be great: it doesn't get in your
way and it makes your code a lot more modular and testable.

------
shadowmint
You're almost always better off using an ABI and shared libraries than doing
this.

That said, there are cases where it's useful; but remember: dependency
injection is an _anti-pattern_ that _increases code complexity_ to facilitate
run-time configuration and testing.

Without a helper IOC framework (that does the hard work of maintaining
singleton instances and injecting the right ones in the right places at the
right time) doing this is a lot of work in a large project.

...not saying it's a bad thing. It's totally a good thing, especially making
your C code testable.

Just be aware that there are limitations and downsides to this _beyond_ merely
the potential speed cost in using pointers and optimization issues mentioned
above.

~~~
ernstsson
Yes, I agree, it's about facilitating run-time configuration and testing. I'm
not sure I agree about the increase in code complexity, though. Maybe there's
an aspect of code complexity that increases with dependency injection, but I
personally feel that code complexity correlates more with explicit branches,
something that I usually eliminate using DI or polymorphism. And yeah, any
way of writing code has its downsides and upsides; it's just a matter of
finding balance.

~~~
shadowmint
Yeah, absolutely. Re: code complexity, if it's only one level deep it's not
really an issue, but you get into nasty situations when you have recursive
calls taking place.

A <--- B, C

C <--- D

D <--- E, F

Now if you want to have a function G that calls A, it needs to have a call
like this:

    void G(void *E, void *F, void *D, void *B, void *C) { A(B, C(D(E, F))) }

(where void * --> whatever function pointer type fits)

Not pretty. There are totally ways around this, to do with grouping injected
values or separating APIs so the chain is never deeper than one or two, but
it really does lead to some hideous code if you're not careful. :)

As a caveat to what I said, though, I will admit: it only applies if you're
on a sane system that lets you use dynamically linked libraries. If you're on
a stupid OS with artificial restrictions (I'm looking at you, iOS), this is
probably actually not the worst way to go for cleaner code, if you're
writing plain C.

~~~
ernstsson
Well written! I think that potentially hideous code, putting together groups
of injected code is one of the hardest type of component get right. My
background is in embedded and mobile so I usually expect a stupid OS or
perhaps not even an OS at all (or perhaps writing part of the OS). Artificial
restrictions is just silly tough.

------
TorKlingberg
The second solution looks like a good old callback to me. The other two are
just complicating things with additional state.

In general, I try to avoid function pointers in C unless I am sure they make
the code clearer. It is often difficult to debug code that tries to be clever
with them. Like preprocessor macros, they are powerful, but tempting to
misuse.

~~~
ernstsson
Good old callback, exactly what it is! In a very simple case like this perhaps
the additional state is complicating things, it does have it's uses is many
cases though, as described in the post. I personally think function pointers
doesn't make code more difficult to debug than any dependency injection used
in many of the higher level programming languages as Java. It is usually the
main argument against dependency injection as well. Usually it goes; Since it
makes it harder to debug I'd rather make it more coupled. When it's more
coupled it's harder to unit test. Since it's harder to unit test, I need to
debug more, etc etc. Personally I'd rather break that chain early. In the end,
debugging code with function pointers (or dependency injection) gets easier
with time.

------
Jacobi
The article describes an implementation of the observer pattern in C... It's
well known and used in many open source C projects. But I'm not sure that
replacing simple tangled dependencies in the Linux kernel with function
pointers is a good idea, because you may introduce overhead in some
performance-critical code, and you may lose locality of reference when
abusing function pointers.

~~~
ernstsson
Yes the example used to explain dependency inversion indeed happens to be an
observer pattern. Well known yes, at least amongst experienced C programmers,
but as mentioned in a comment on the page; "all the more reason that it should
be explained in blog posts.".

------
qznc
This article makes me doubt this Arqua tool. Using a callback does not remove
coupling. It just makes it dynamic, so static analysis cannot find a direct
link.

Arqua would have to find out what values the global gNotifier variable could
contain. An exact points-to analysis would find out that it is clientNotify
and insert the dependency. A less powerful analysis might approximate that
it could be anything, in which case there should be a dependency to _every
other compilation unit_. Since Arqua reports "no dependency", it seems to do
a non-conservative approximation, which means it does not report safe
results.

That is not really a good quality metric. So why should one optimize against
it?

~~~
ernstsson
As previously mentioned in one of the other comments; to "facilitate run-time
configuration and testing.". Removing the static coupling has a value in
itself, making the component isolated to facilitate unit-testing.

