
Dynamic Software Updating: Linux 4.0 and Beyond - awruef
http://www.pl-enthusiast.net/2015/04/14/dynamic-software-updating/
======
contingencies
While live patching is interesting for a subset of use cases, the reality is
that software must be tested in any new environment before it is trusted in
production.

It would be foolish at this point to trust such upgrades to be a '100% clean
slate' (i.e., equivalent to rebooting into the new kernel). Case in point: if
boot-time initialization code has changed, an eventual reboot may fail.

An interesting question, then, is which operations workflow allows for the
proactive testing of all three cases: (1) live upgrade kernel/restart
services; (2) live upgrade/reboot/restart services; (3) traditional
upgrade/reboot/restart services.

Given the increase in complexity this creates, and given that the sole
benefit appears to be a nominally faster upgrade with reduced downtime on a
per-node basis (in an era where most people running serious services have
either outsourced kernel management or are running multi-node HA clustering
anyway), this seems a fairly edge-case feature.

------
vezzy-fnord
DSU has always been an interesting field to me, and I've wondered why the idea
of hotpatching systems while they're running has been so underrated.

Related is application checkpointing, where you can freeze the state of a
process (fds, sockets, pipes, sigmasks, open ttys, etc.) into binary images
and overlay them back. Useful for all sorts of nifty cases. See CRIU for a
Linux implementation: [http://criu.org/Main_Page](http://criu.org/Main_Page)
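
As a rough sketch of what that freeze/restore cycle looks like with CRIU's
command-line tool, the snippet below only builds the two commands and does not
run them. The PID and image directory are made-up values, and the flags
(`-t`, `-D`, `--shell-job`) are the ones I understand the CRIU CLI to accept;
`--shell-job` should only be needed for processes started from an interactive
shell, and CRIU generally needs root.

```python
# Sketch only: builds the argv lists for a CRIU dump/restore pair without running them.
from typing import List


def dump_cmd(pid: int, images_dir: str) -> List[str]:
    """Command that checkpoints the process tree rooted at `pid` into `images_dir`."""
    return ["criu", "dump", "-t", str(pid), "-D", images_dir, "--shell-job"]


def restore_cmd(images_dir: str) -> List[str]:
    """Command that restores the process tree from the images in `images_dir`."""
    return ["criu", "restore", "-D", images_dir, "--shell-job"]


if __name__ == "__main__":
    # Hypothetical PID and directory, for illustration only.
    print(" ".join(dump_cmd(1234, "/tmp/ckpt")))
    print(" ".join(restore_cmd("/tmp/ckpt")))
```

Passing either list to `subprocess.run(..., check=True)` would actually perform
the step; I've left that out so nothing gets frozen by accident.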

------
frozenport
I wonder if we have a new problem: silently breaking stuff?

~~~
aceperry
This is cutting-edge software, so of course things will break. But once it
has been tested out, it'll be fine.

~~~
yeukhon
I think what the parent comment is referring to is the patched process, or a
process that depends on the patched process.

------
choffee
I think I would prefer to see applications be more fault tolerant, so that
rebooting for a kernel upgrade is no problem because outages are expected. If
your software demands 100% kernel uptime, then you are just waiting for a
failure.

------
sengork
How long before apps run off the trunk itself?

