Checkpoint/restart: it's complicated

Posted Nov 15, 2010 20:28 UTC (Mon) by mhelsley (subscriber, #11324)
In reply to: Checkpoint/restart: it's complicated by daglwn
Parent article: Checkpoint/restart: it's complicated

Which is why you need a c/r implementation that aggressively avoids checkpointing shared data multiple times and avoids unnecessary IO as much as possible.

linux-cr does the former using its objhash, so that we don't checkpoint shared state more than once. It avoids doing disk IO by, whenever possible, not bundling file/directory contents into the checkpoint image (generic_file_checkpoint()). Instead, it relies on userspace to do IO-bandwidth-friendly optimizations like using filesystem snapshots.
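
To make the "checkpoint shared state only once" idea concrete, here is a minimal userspace sketch in Python. It is not linux-cr code: the ObjHash class, lookup_or_add(), checkpoint(), and the record layout are all hypothetical names chosen for illustration. The only thing it borrows is the idea of mapping each shared object to a tag the first time it is seen and emitting just the tag afterwards.

```python
class ObjHash:
    """Hypothetical table mapping live objects to small integer tags."""

    def __init__(self):
        self._tags = {}   # id(obj) -> tag
        self._keep = []   # keep objects alive so id() values are not reused

    def lookup_or_add(self, obj):
        """Return (tag, is_new). is_new is True only on the first sighting."""
        key = id(obj)
        if key in self._tags:
            return self._tags[key], False
        tag = len(self._tags) + 1
        self._tags[key] = tag
        self._keep.append(obj)
        return tag, True


def checkpoint(tasks):
    """Build an image in which every shared object is written exactly once.

    tasks is a list of {"name": str, "files": [dict, ...]}; the same dict
    object appearing in two tasks stands in for shared kernel state.
    """
    objhash = ObjHash()
    image = []
    for task in tasks:
        refs = []
        for f in task["files"]:
            tag, is_new = objhash.lookup_or_add(f)
            if is_new:
                # First sighting: save the full state under its tag.
                image.append(("object", tag, dict(f)))
            # Every task records only the tag, never a second copy.
            refs.append(tag)
        image.append(("task", task["name"], refs))
    return image


# Two tasks sharing one open file (illustrative data only).
shared = {"path": "/tmp/log", "pos": 128}
private = {"path": "/tmp/b", "pos": 0}
tasks = [
    {"name": "parent", "files": [shared]},
    {"name": "child", "files": [shared, private]},
]
image = checkpoint(tasks)
```

The shared file ends up in the image once, and both tasks refer to it by tag, which is what keeps the image size and the IO from growing with the number of sharers.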

That said, file "contents" are necessary for anon_inode-based interfaces such as eventfd, epoll, signalfd, and timerfd because those can't be "backed up" and restored like normal files.
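
As a rough illustration of why such descriptors carry state that has to travel in the image, here is a userspace sketch using Python's os.eventfd (Python 3.10+, Linux only). The helper names save_eventfd() and restore_eventfd() are hypothetical, and this is not how an in-kernel implementation would read the counter; it also assumes the descriptor was created with EFD_NONBLOCK and without EFD_SEMAPHORE, and that nothing else touches it between the read and the write-back.

```python
import os


def save_eventfd(fd):
    """Return the eventfd counter, leaving the descriptor's value unchanged.

    Reading an eventfd resets its counter to zero, so the value is written
    straight back. Assumes fd is non-blocking and not in semaphore mode.
    """
    try:
        value = os.eventfd_read(fd)
    except BlockingIOError:
        return 0  # a zero counter cannot be read without blocking
    os.eventfd_write(fd, value)
    return value


def restore_eventfd(value):
    """Create a fresh eventfd whose counter equals the saved value."""
    fd = os.eventfd(0, os.EFD_NONBLOCK)
    if value:
        os.eventfd_write(fd, value)
    return fd


if hasattr(os, "eventfd"):
    src = os.eventfd(0, os.EFD_NONBLOCK)
    os.eventfd_write(src, 3)
    os.eventfd_write(src, 4)       # counter accumulates writes
    saved = save_eventfd(src)      # the only "contents" this file has
    dst = restore_eventfd(saved)
    restored = os.eventfd_read(dst)
    os.close(src)
    os.close(dst)
```

There is no file on disk to snapshot here: the counter exists only in kernel memory, so it has to be stored in the checkpoint itself.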


Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds