Not necessarily... I maintain servers which offer ssh access to researchers to run long-running computations. They perform their computations any way they want (using interactive tools under screen, by running their own code, etc.), and there is no way I can checkpoint their computations, so I cannot reboot the system without killing their work, which is precisely what I am supposed to prevent. Planned power outages and the like can be handled through suspend-to-disk, but updating the kernel requires a reboot.
Since we provide ssh access to a number of users, local privilege escalations are a problem, but I cannot just reboot the system whenever I want.
Of course I would need a high level of trust in ksplice before using it.
Posted Dec 20, 2008 1:35 UTC (Sat) by giraffedata (subscriber, #1954)
What you have isn't actually an aversion to downtime; it's an aversion to reboots. Which is another reason to like patching a kernel on the fly.
People who are averse to downtime (i.e. they can't afford to be offline for three minutes) usually do have some kind of redundant setup, so they can reboot one system, then the other, and thereby install a kernel patch without ever being offline.
People who are averse to a reboot (i.e. they don't want to lose the state gathered by the past four days of calculation) might use checkpointing in order to tolerate reboots, but you're at least one example of someone who doesn't. Because your users use generic tools, the only way I know that checkpointing could eliminate the pain of a reboot is those new virtual machine-based things.
Patching runtime kernel
Posted Jun 25, 2009 12:08 UTC (Thu) by epa (subscriber, #39769)
I think in ten years' time we will look back and see how silly it is to require a reboot every time the kernel is patched. Nowadays it's obvious that waiting for fsck after an unclean shutdown is unacceptable, even though that's the way it was for many years. Anything which can cut the number of reboots is a step forward for desktop usability. We don't want Linux to be that annoying system that wants to restart itself all the time, a title currently held by Windows, but by a thin margin given the frequency of kernel updates by many distros.
So yes, ksplice is wanted; for the remaining 30% of kernel updates that can't be spliced into a running system, working suspend/resume should help to keep downtime to a minimum.