
Can't checkpointing be done in user space?


Posted Feb 26, 2009 11:19 UTC (Thu) by epa (subscriber, #39769)
Parent article: Checkpoint/restart tries to head towards the mainline

Why does checkpointing need kernel support at all? A process is able to dump its core to a file, along with details of file descriptors it has open. In general, any action a process took to get into a particular state, it did by calling normal kernel APIs - so those same APIs should be usable to restore the saved state later. There might be some missing kernel interface to query the current state ('what file descriptors do I have?') but adding those as needed seems fairly straightforward and not intrusive.
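As a rough illustration of the idea above, here is a minimal Python sketch, assuming a Linux system with /proc mounted. It lists a process's open descriptors and saves and restores one regular file using only ordinary system calls. The function names (list_open_fds, snapshot_regular_file, restore_regular_file) are made up for this example, and it handles regular files only; it does not attempt pipes, sockets, or any other kind of descriptor.

```python
import fcntl
import os


def list_open_fds(pid="self"):
    """Return {fd: target} for a process, read from /proc/<pid>/fd (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor went away between listdir() and readlink(),
            # e.g. the one listdir() itself used to read the directory.
            continue
    return result


def snapshot_regular_file(fd):
    """Record what is needed to reopen a regular file: path, flags, offset."""
    return {
        "fd": fd,
        "path": os.readlink(f"/proc/self/fd/{fd}"),
        "flags": fcntl.fcntl(fd, fcntl.F_GETFL),
        "offset": os.lseek(fd, 0, os.SEEK_CUR),
    }


def restore_regular_file(snap):
    """Reopen the file and put it back on the same descriptor number."""
    # Assumption: the file still exists at the same path and is unchanged.
    # Creation-time flags are masked out so restore never creates or truncates.
    flags = snap["flags"] & ~(os.O_CREAT | os.O_EXCL | os.O_TRUNC)
    new_fd = os.open(snap["path"], flags)
    os.lseek(new_fd, snap["offset"], os.SEEK_SET)
    if new_fd != snap["fd"]:
        # Move the new descriptor onto the number the program expects.
        os.dup2(new_fd, snap["fd"])
        os.close(new_fd)
    return snap["fd"]
```

A process could call snapshot_regular_file() on each regular file it finds in list_open_fds(), write the results out next to its memory dump, and call restore_regular_file() on each entry after a restart.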

Why exactly is kernel support needed?



Can't checkpointing be done in user space?

Posted Feb 26, 2009 16:16 UTC (Thu) by lwithers (subscriber, #23379)

Perhaps of further interest is the description of Crash-only software by Valerie Henson (now Aurora). Software written with this paradigm in mind, combined with something like daemonitor or OpenRC tricks, can be used to build a system with a certain amount of resilience.

Can't checkpointing be done in user space?

Posted Feb 26, 2009 20:13 UTC (Thu) by nix (subscriber, #2304)

Well, yes, but one application I'd like to see (when they get
suspension/resumption of network connections working) is the ability to
suspend/resume a system which is displaying X apps some of which are
running on another machine, without using some sort of proxying layer like
xpra.

It's likely to be tricky...

Can't checkpointing be done in user space?

Posted Feb 27, 2009 3:32 UTC (Fri) by spotter (subscriber, #12199)

See the original Zap paper, section 4.5:

http://www.ncl.cs.columbia.edu/publications/osdi2002_zap.pdf

Can't checkpointing be done in user space?

Posted Feb 26, 2009 16:46 UTC (Thu) by spotter (subscriber, #12199)

Can't checkpointing be done in user space?

Posted Feb 27, 2009 18:26 UTC (Fri) by giraffedata (subscriber, #1954)

A user-space program can checkpoint itself, and many do. This project is about checkpointing applications that were not designed for checkpointing, which I suppose saves the enormous engineering effort of building application-specific checkpointing into every application.
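For a sense of what application-specific checkpointing looks like, here is a minimal Python sketch of a program that checkpoints itself. Everything in it is hypothetical and chosen for illustration: the function names, the JSON state format, and the toy computation. The stop_after parameter exists only to simulate an interruption.

```python
import json
import os


def save_checkpoint(path, state):
    """Write the state to a temporary file, then rename it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace() is atomic, so a crash leaves either the old or new checkpoint.
    os.replace(tmp, path)


def load_checkpoint(path, default):
    """Return the saved state, or the default if no checkpoint exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default


def sum_squares(n, path, stop_after=None):
    """Sum i*i for i in range(n), checkpointing after every step.

    stop_after simulates an interruption after that many steps in this run;
    in that case the function returns None and leaves the checkpoint behind.
    """
    state = load_checkpoint(path, {"next": 0, "total": 0})
    steps = 0
    while state["next"] < n:
        if stop_after is not None and steps >= stop_after:
            return None
        i = state["next"]
        state = {"next": i + 1, "total": state["total"] + i * i}
        save_checkpoint(path, state)
        steps += 1
    return state["total"]
```

The application has to know which pieces of its own state matter (here, just two integers) and has to be written around the save and load calls, and that is the per-application effort in question.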

Can't checkpointing be done in user space?

Posted Mar 6, 2009 8:14 UTC (Fri) by TRS-80 (subscriber, #1804)

CryoPID is a user-space application that can checkpoint other processes without any special support. It doesn't work for all cases, although it's good enough for the "D'oh! I forgot to start this application inside screen(1)" use case.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds