
One (non-generalizable) solution is to avoid using writable function pointers, to help maintain control flow integrity.

For example, in libopus we have platform-specific SIMD optimizations that are chosen at runtime.

The common solution is just to have a function pointer that points to the accelerated routine. But there are only a limited number of choices for each routine. So we put pointers to them in a static const array, which gets stored on a read-only page. Then instead of storing a pointer, we just store an index into this array. This index is chosen based on the available instruction sets detected at runtime. This lets us use the same index for every accelerated function.

Then, we pad the array size out to a power of two. To call an accelerated function, we mask off the upper bits of the index. Even if the index gets corrupted, we still call one of the functions in the array. So there's only so much damage it can do.

Obviously this doesn't apply if the set of functions you need to call is open-ended (e.g., a callback, like nginx was using). But it seems like a good pattern to follow when it applies.
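A minimal sketch of the pattern described above, assuming a made-up routine that sums integers. The names (sum_fn, SUM_IMPL, ARCH_MASK, sum_dispatch) are hypothetical and not taken from libopus, and the "SIMD" variants are plain stand-ins so the example compiles anywhere. Filling the padding slot with a repeat of the last real entry is also an assumption.

```c
#include <stddef.h>

/* Hypothetical signature shared by every version of the routine. */
typedef int (*sum_fn)(const int *x, size_t n);

/* Portable fallback. */
static int sum_c(const int *x, size_t n) {
    int acc = 0;
    for (size_t i = 0; i < n; i++) acc += x[i];
    return acc;
}

/* Stand-ins for accelerated versions. In real code each would live in its
   own compilation unit, built with the matching instruction-set flags. */
static int sum_sse2(const int *x, size_t n) { return sum_c(x, n); }
static int sum_avx2(const int *x, size_t n) { return sum_c(x, n); }

/* Table size is 4 (a power of two), so the mask is size - 1. */
#define ARCH_MASK 3u

/* static const, so the toolchain can place the table on a read-only page.
   The last slot is padding and repeats a valid entry (an assumption). */
static const sum_fn SUM_IMPL[ARCH_MASK + 1] = {
    sum_c, sum_sse2, sum_avx2, sum_avx2
};

/* Masking keeps any index value, corrupted or not, inside the table. */
static int sum_dispatch(unsigned arch, const int *x, size_t n) {
    return SUM_IMPL[arch & ARCH_MASK](x, n);
}
```

The arch value would be picked once at startup from CPU feature detection and then passed to every accelerated routine. The only writable state is a small integer, and the mask bounds what a corrupted value can reach.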



Is there a reason you did it that way other than to avoid writable function pointers?

Also, that just sounds like how switch/case is sometimes implemented. Have you considered just using a switch/case statement instead of manually managing function pointers and the like?


That was the main reason.

It also means that you don't need to pass around (a pointer to) a big table of function pointers throughout all of your functions that need to use accelerated routines. Instead you just pass a single integer index. But that is pretty minor.

It can be pretty similar to a switch in practice, but I think there are a few differences. One is function argument handling. Each version of an accelerated routine has to be an independent function, compiled in a separate compilation unit, because you have to limit the instruction sets available. So you will duplicate all of the function call setup overhead in each branch of the switch. This probably gets optimized away by a decent compiler. Even if it does, you still wind up essentially duplicating the function table at every call site. That does not seem likely to get optimized away. You could dispatch to a common thunk which selects the accelerated routine to call, but now you have two function calls instead of one. I am also not sure that switch statements handle the default case as cheaply as doing a single bitwise AND with a small, compile-time constant.

But if you have a use of function pointers that can be replaced by a simple switch statement, that is probably the better approach.
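For comparison, here is a rough sketch of the switch-based alternative, written as the "common thunk" variant mentioned above. All names (dot_c, dot_sse2, dot_avx2, dot_switch, the ARCH_* constants) are hypothetical, and the accelerated versions are plain stand-ins for routines that would normally be built in separate compilation units.

```c
#include <stddef.h>

/* Portable fallback: dot product of two int arrays. */
static int dot_c(const int *a, const int *b, size_t n) {
    int acc = 0;
    for (size_t i = 0; i < n; i++) acc += a[i] * b[i];
    return acc;
}

/* Stand-ins for accelerated versions. */
static int dot_sse2(const int *a, const int *b, size_t n) { return dot_c(a, b, n); }
static int dot_avx2(const int *a, const int *b, size_t n) { return dot_c(a, b, n); }

/* Hypothetical architecture indices chosen at startup. */
enum { ARCH_C = 0, ARCH_SSE2 = 1, ARCH_AVX2 = 2 };

/* Thunk: every target is listed statically, and any unexpected
   index falls through to the portable version. */
static int dot_switch(unsigned arch, const int *a, const int *b, size_t n) {
    switch (arch) {
    case ARCH_SSE2: return dot_sse2(a, b, n);
    case ARCH_AVX2: return dot_avx2(a, b, n);
    default:        return dot_c(a, b, n);
    }
}
```

Here the list of targets lives in the code of dot_switch rather than in a data table. Without the thunk, the same switch would have to be repeated at each call site, which is the duplication described above.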


Your approach is kind of how indirect function calls are implemented in WebAssembly, if I am not mistaken.


Using a switch requires that you statically determine and list every target. Implementing your own jump table lets you build it at runtime.


Even in the dynamic case, you can still use a (possibly malloc'd) array and verify that the index is valid, which is a nice improvement over leaving function pointers lying around in the vicinity of buffers.
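A minimal sketch of that dynamic variant, assuming a fixed-capacity table filled at runtime. The type and function names (callback_table, table_register, table_call, and so on) are hypothetical. Note that this table lives in writable heap memory, so it only validates the index; it does not give the read-only guarantee of a static const array.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical callback signature. */
typedef void (*callback_fn)(int *counter);

typedef struct {
    callback_fn *slots;   /* heap-allocated array of targets */
    size_t count;         /* number of registered callbacks */
    size_t capacity;      /* allocated slots */
} callback_table;

/* Returns 0 on success, -1 if allocation fails. */
static int table_init(callback_table *t, size_t capacity) {
    t->slots = malloc(capacity * sizeof *t->slots);
    t->count = 0;
    t->capacity = t->slots ? capacity : 0;
    return t->slots ? 0 : -1;
}

/* Returns the new callback's index, or -1 if the table is full. */
static long table_register(callback_table *t, callback_fn fn) {
    if (t->count >= t->capacity) return -1;
    t->slots[t->count] = fn;
    return (long)t->count++;
}

/* Callers hold an index, not a pointer; it is checked before every call. */
static int table_call(const callback_table *t, size_t index, int *counter) {
    if (index >= t->count) return -1;
    t->slots[index](counter);
    return 0;
}

static void table_free(callback_table *t) {
    free(t->slots);
    t->slots = NULL;
    t->count = t->capacity = 0;
}

/* Two example callbacks. */
static void add_one(int *c) { *c += 1; }
static void add_ten(int *c) { *c += 10; }
```

Structures that sit next to buffers then store only the index returned by table_register, and an out-of-range index is rejected by table_call instead of being jumped through.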



