

Get to know Ksplice: kernel patching was never this easy. - tabbott
http://www.ibm.com/developerworks/aix/library/au-spunix_ksplice/index.html

======
tzs
Cool technology--which I'm not letting anywhere near any production server I
have control over.

One of the important things rebooting does when updating a kernel is verify
that the updated system still boots. If a kernel change turns out to be
incompatible with something in your configuration, you really really really
want to find that out right away, under controlled circumstances, in a
scheduled maintenance window.

You do not want to find it out six months down the road, when you've had an
unexpected outage from a power failure or other hardware problem, and can't
get your system back up, and have no idea how long it's actually been broken
(and so no hint as to what broke).

~~~
danudey
My company has just purchased 150-200 ksplice licenses, and it's really a
fantastic technology. It loads up a kernel module that does all the runtime
patching for you, but if you don't run it, it doesn't update.

The benefit I see here is that it provides you with a kernel that has all the
latest security updates without having to recompile, upgrade, etc. When you
reboot next, you have the same kernel you had last time you rebooted (unless
you've done an upgrade in the meantime), so you know it's going to boot.

ksplice doesn't modify the kernel on-disk, only in memory. In the unlikely (so
far for us) situation that one of those patches is incompatible, causes
problems, crashes your system, etc. and you want to reboot to a known-good
configuration, just log into their system and deactivate the server, and it
won't be able to do any updates.

Your post seems to assume that ksplice changes your booting kernel, and thus
who knows if it will boot next time. This doesn't happen. Unless you upgrade
your kernel yourself (e.g. through yum or apt), you're always booting from the
same (insecure) kernel, and the ksplice daemon reapplies the patches after
you boot to get you up-to-date.

It's really a marvellous technology, and it's worked very well for us so far.

------
chwahoo
As a researcher in the area of runtime software upgrades, I'm excited to see
this post here and am curious what the HN community thinks of Ksplice. Does
your product demand extremely high availability with the further constraint
that you can't just shift the workload to another machine to perform an OS (or
application) upgrade? If so what are the circumstances? (e.g., Does your
program maintain long-running connections that can't be broken?) Are there
times when you have been unable to upgrade your OS or server program because
you couldn't sacrifice availability?

I'd love to get in touch with people who are working under these constraints.

------
malkia
Cool. Correct me if I'm wrong, but this is how I think the article describes
it:

- Stop the machine (one CPU running, the others stopped).
- Check every thread's stack to verify that nothing on it references the
  address range (beginning to end) of the "C" function to be patched, e.g. a
  function that still has to return into it (there might be false positives).
- Put the new code in free memory for execution. Replace the first 5 bytes
  (x86) of the old function with a JMP to the new function (granted, the
  function has to be at least 5 bytes long).
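
The steps above can be sketched in a few lines of Python. This is only a toy model of the idea, not Ksplice's code: the names `encode_jmp` and `safe_to_patch` are made up for illustration, and stacks are modelled as plain lists of integer words. The one hard fact it relies on is the x86 encoding of a near jump: opcode `0xE9` followed by a 32-bit little-endian offset measured from the end of the 5-byte instruction.

```python
import struct

JMP_REL32 = 0xE9  # x86 opcode: near jump with a 32-bit relative offset
JMP_LEN = 5       # 1 opcode byte + 4 offset bytes


def encode_jmp(old_addr: int, new_addr: int) -> bytes:
    """Return the 5 bytes that redirect execution from old_addr to new_addr."""
    # The offset is relative to the address of the instruction after the JMP.
    rel = new_addr - (old_addr + JMP_LEN)
    return struct.pack("<Bi", JMP_REL32, rel)


def safe_to_patch(stacks, start: int, end: int) -> bool:
    """True if no word on any thread stack falls inside [start, end).

    Hypothetical model: each stack is a list of integers. The check is
    conservative, so a data word that merely looks like an address in the
    range also blocks the patch (the false positives mentioned above).
    """
    return not any(start <= word < end for stack in stacks for word in stack)
```

For example, `encode_jmp(0x1000, 0x2000).hex()` gives `e9fb0f0000`, since the offset is 0x2000 - 0x1005 = 0xFFB.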

Function changes that won't work:

- Changes to the function's return type.
- Changes to the function's argument list.

Things to be aware of:

- If the same function is patched again, you have to check the stacks for
  all previous versions of the function (a thread might be in the old
  function, doing a JMP to a newer version that's doing another JMP to a
  still newer one, etc.).
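
A rough sketch of that repeated-patch check, again as a toy Python model rather than anything taken from Ksplice: `safe_to_repatch` and the `(start, end)` range tuples are hypothetical, and stacks are plain lists of integer words. The point is only that the scan has to cover the original function and every earlier replacement, not just the version currently being jumped to.

```python
def safe_to_repatch(stacks, version_ranges) -> bool:
    """True if no stack word lands in any earlier version of the function.

    version_ranges is a list of (start, end) address pairs, one for the
    original function and one for each replacement loaded so far
    (hypothetical representation, half-open ranges).
    """
    return not any(
        start <= word < end
        for stack in stacks
        for word in stack
        for (start, end) in version_ranges
    )
```

With two earlier versions at `(0x1000, 0x1040)` and `(0x5000, 0x5080)`, a thread whose stack holds `0x5010` blocks the update even though the original function is no longer on any stack.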

What else might I be missing?

Still don't know how they deal with data structures...

In Common Lisp, one can update CLOS objects, but these beasts are full of
metadata, and only the compiler knows the inner parts.

Still quite interesting.

~~~
chwahoo
I believe changes to a function's return type and argument list should be
supported. The caller of such a function would need to have been modified, so
if the caller is also not "active" (on the callstack), then the update should
be typesafe.

They do have limited support for type changes to data - e.g., adding a field
to a struct. These must be performed using a "shadow data structure" which
holds fields that were added to the original structure. This would cause
patches to diverge a bit from the original kernel versions.
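
To make the "shadow data structure" idea concrete, here is a minimal sketch in Python. It is not Ksplice's interface, which this thread doesn't show: the `ShadowFields` class and its methods are invented for illustration, and `id(obj)` stands in for the address of the original struct. The original object keeps its layout, and any field the patch adds lives in a side table keyed by that object.

```python
class ShadowFields:
    """Side table holding fields a patch adds to existing objects."""

    def __init__(self):
        # Maps the identity of an original object to its added fields.
        self._table = {}

    def attach(self, obj, **fields):
        """Create or update the shadow fields for obj."""
        self._table.setdefault(id(obj), {}).update(fields)

    def get(self, obj, name, default=None):
        """Read an added field; objects created before the patch have none."""
        return self._table.get(id(obj), {}).get(name, default)

    def detach(self, obj):
        """Drop the shadow entry; must be called when obj is freed."""
        self._table.pop(id(obj), None)
```

Patched code then reads `shadow.get(conn, "timeout", 30)` where the unpatched source would have read a real struct member, which is why such patches drift from the upstream versions.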

However, since Ksplice aims primarily to support security patches, I suspect
that both function signature and type changes are less common than they would
be for general OS (or application) evolution.

