Why does checkpointing need kernel support at all? A process can dump its core to a file, along with details of the file descriptors it has open. In general, any action a process took to get into a particular state, it took by calling normal kernel APIs, so those same APIs should be usable to restore the saved state later. There might be some missing kernel interfaces for querying the current state ('what file descriptors do I have?'), but adding those as needed seems fairly straightforward and not intrusive.
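As a rough illustration of the 'what file descriptors do I have?' query, here is a minimal sketch that assumes a Linux system with /proc mounted. The function name list_open_fds is made up for this example. It only recovers the descriptor numbers and what they point at. File offsets, flags, and socket state would need further interfaces.

```python
import os

def list_open_fds():
    """Return {fd: target} for the current process.

    Assumption: Linux, where /proc/self/fd holds one symlink per open descriptor.
    """
    fds = {}
    for name in os.listdir("/proc/self/fd"):
        fd = int(name)
        try:
            fds[fd] = os.readlink(f"/proc/self/fd/{fd}")
        except OSError:
            # The descriptor used to scan the directory is already closed.
            continue
    return fds

if __name__ == "__main__":
    for fd, target in sorted(list_open_fds().items()):
        print(fd, target)
```

Restoring from such a listing is the harder half: regular files can be reopened by path, but pipes and sockets show up only as opaque names.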
Posted Feb 26, 2009 16:16 UTC (Thu) by lwithers (subscriber, #23379)
[Link]
Perhaps of further interest is the description of Crash-only software by Valerie
Henson (now Aurora). Software written with this paradigm in mind, combined
with something like daemonitor or
OpenRC tricks, can be used to build a system with a certain amount of
resilience.
Can't checkpointing be done in user space?
Posted Feb 26, 2009 20:13 UTC (Thu) by nix (subscriber, #2304)
[Link]
Well, yes, but one application I'd like to see (when they get
suspension/resumption of network connections working) is the ability to
suspend/resume a system which is displaying X apps, some of which are
running on another machine, without using some sort of proxying layer like
xpra.
It's likely to be tricky...
Can't checkpointing be done in user space?
Posted Feb 27, 2009 3:32 UTC (Fri) by spotter (subscriber, #12199)
[Link]
Posted Feb 27, 2009 18:26 UTC (Fri) by giraffedata (subscriber, #1954)
[Link]
A user space program can checkpoint itself. Many do. This project is about checkpointing an application that wasn't designed for checkpointing, which I suppose saves the enormous engineering effort of building application-specific checkpointing into all the applications.
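To make 'a program can checkpoint itself' concrete, here is a minimal sketch of application-specific checkpointing. Everything in it is hypothetical (the function names, the JSON file format, the toy computation). It shows the general pattern only, not how any particular application does it.

```python
import json
import os
import tempfile

def save_checkpoint(path, state):
    """Write state to a temp file, then atomically rename it over the old checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)

def load_checkpoint(path, default):
    """Return the saved state, or default if no checkpoint exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default

def sum_squares(n, path, stop_after=None):
    """Sum i*i for i in range(n), checkpointing after every step.

    stop_after simulates a crash: give up after that many steps in this run.
    """
    state = load_checkpoint(path, {"i": 0, "total": 0})
    steps = 0
    while state["i"] < n:
        if stop_after is not None and steps >= stop_after:
            return None  # "crashed"; progress so far is in the checkpoint file
        state["total"] += state["i"] ** 2
        state["i"] += 1
        save_checkpoint(path, state)
        steps += 1
    return state["total"]
```

Calling sum_squares again with the same path picks up where the interrupted run stopped. The catch is that the program has to know which of its own state matters, and that is the per-application engineering effort in question.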
Can't checkpointing be done in user space?
Posted Mar 6, 2009 8:14 UTC (Fri) by TRS-80 (subscriber, #1804)
[Link]
CryoPID is a user-space application that can checkpoint other processes without any special support. It doesn't work for all cases, although it's good enough for the "D'oh! I forgot to start this application inside screen(1)" use case.