Does anyone have a simple example of how this would work? I can't wrap my head a...

DSMan195276 · on Feb 11, 2015

I took a quick look at the accepted patch. While I can't guarantee I know what's actually going on, my understanding is that patching individual functions works by placing the replacement function's code somewhere new in memory, getting a pointer to it, and then overwriting the code in the old function to jump to the new one. (Kind of like short-circuiting the old function: all the old code still calls the old function's location, but that location simply says 'jump to this new location over here'.)
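
To make the "overwrite the old function with a jump" idea concrete, here is a toy simulation in Python. It is only a sketch of the concept, not how the kernel patch is implemented: the `memory` dict, the `("jmp", target)` entries, the addresses, and the function names are all made up for illustration.

```python
# Toy model: "memory" maps an address to either real code (a Python callable)
# or a ("jmp", target) entry standing in for a jump instruction.
memory = {}

def call(addr, *args):
    """Execute whatever lives at addr, following jump entries."""
    entry = memory[addr]
    while isinstance(entry, tuple) and entry[0] == "jmp":
        entry = memory[entry[1]]
    return entry(*args)

def old_check(n):
    return n >= 0          # buggy: lets oversized values through

def new_check(n):
    return 0 <= n < 100    # fixed replacement

OLD_ADDR, NEW_ADDR = 0x1000, 0x2000   # hypothetical addresses
memory[OLD_ADDR] = old_check

def caller(n):
    # Existing code was "compiled" against OLD_ADDR and never changes.
    return call(OLD_ADDR, n)

def live_patch(old_addr, new_addr, replacement):
    memory[new_addr] = replacement        # 1. place the new code somewhere new
    memory[old_addr] = ("jmp", new_addr)  # 2. overwrite the old entry with a jump

before = caller(500)                      # runs old_check
live_patch(OLD_ADDR, NEW_ADDR, new_check)
after = caller(500)                       # same call site, now runs new_check
print(before, after)
```

Note that `caller` is never touched: it still calls the old address, which now just redirects to the new code.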

It looks like, however, kernel modules are in ELF format (don't quote me on that, I'm just going from the code), and ELF includes a 'relocation table', which is basically a table that says "this function is located here, and this next function is located here, and ..." for every function in the module. Ignoring why that is actually there, they can take advantage of the relocation table and replace a function's location with the location of the replacement function, effectively overriding the old one. Even if the old code is still in memory (I can't tell if it gets removed or not), it will never be called again.
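
The table-based idea can be sketched the same way. This models the table as described in the comment above (a name-to-location lookup), which is a simplification and not the actual ELF layout; `table`, `parse_v1`, `parse_v2`, and `module_code` are hypothetical names.

```python
def parse_v1(data):
    return data.split(",")                              # old behaviour

def parse_v2(data):
    return [field.strip() for field in data.split(",")]  # replacement

# Hypothetical lookup table: name -> current location of the function.
table = {"parse": parse_v1}

def module_code(data):
    # Callers resolve the function through the table on every call.
    return table["parse"](data)

old_result = module_code("a, b")
table["parse"] = parse_v2      # swap the entry; parse_v1 still exists in memory
new_result = module_code("a, b")
print(old_result, new_result)
```

After the swap, `parse_v1` is still defined but nothing reaches it through the table any more.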

From there, the discussion mostly seems to be about how to 'stop' the kernel enough to replace the function without making a mess because something was using that function at the moment it was replaced.

donavanm · on Feb 11, 2015

Basically yes. Check out the kprobes docs for a nice description of how these frameworks work: https://www.kernel.org/doc/Documentation/kprobes.txt. Being able to intercept (and mangle) kernel function calls is awesome. With uprobes the same techniques work in userland as well.

detaro · on Feb 11, 2015

(Just what I gathered from a bit of mailing-list reading; if any of this is wrong, please correct me.)

I don't think it can make arbitrary changes; it "only" applies specially prepared changes to the running system by replacing function call targets while tasks are sleeping. The biggest limitation is probably that old and new code run in parallel, so you can only change data structures in ways that won't confuse the old code. The "simplest" use case might be adding guards against exploits to syscalls.

I can't tell how viable it would be for the new code to build an entire parallel structure and to only switch to this after everything is migrated, or how deep into the kernel these changes could go. Could one fix a file system driver while using the file system?

AlyssaRowan · on Feb 11, 2015

Yes, it's just that the fix won't happen for you when you're halfway through a patched function.
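
A rough way to picture the "halfway through a patched function" case, using Python generators to stand in for a task that is suspended mid-call. This is an analogy under made-up names (`old_handler`, `new_handler`, `current`), not a model of the kernel's actual consistency mechanism.

```python
def old_handler():
    yield "old: step 1"
    yield "old: step 2"

def new_handler():
    yield "new: step 1"
    yield "new: step 2"

# Hypothetical call target that the patch replaces.
current = {"handler": old_handler}

task_a = current["handler"]()        # task A enters the function
first = next(task_a)                 # ... and gets halfway through

current["handler"] = new_handler     # patch applied while task A is mid-call

rest_a = list(task_a)                # task A finishes in the old code
task_b = list(current["handler"]())  # a fresh call sees the patched code
print(first, rest_a, task_b)
```

Task A completes with the old body even though the patch has landed; only calls made after the swap run the new code.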

lemonade · on Feb 11, 2015

The original academic paper from MIT describing how it works:

http://www.ksplice.com/doc/ksplice.pdf

bchallenor · on Feb 12, 2015

This paper is very readable - thanks for the link.