One billion files on Linux

Posted Aug 19, 2010 20:32 UTC (Thu) by mhelsley (guest, #11324)
Parent article: One billion files on Linux

"Finally, application developers must bear in mind that processes which run this long will invariably experience failures, sooner or later. So they will need to be designed with some sort of checkpoint and restart capability."

Was that exactly Ric's point -- that the applications had to checkpoint themselves? Or did he just say that being able to checkpoint applications was necessary? I ask because there's a big difference. Expecting all applications that might be run in these environments to explicitly checkpoint themselves just isn't practical. Look at how many non-HPC applications use BLCR for example.

The alternative is to enable "external" checkpointing: checkpoints that don't require rewriting the application, LD_PRELOAD libraries, etc. There is already an effort underway to push this to mainline:

https://ckpt.wiki.kernel.org/index.php/Main_Page

One billion files on Linux

Posted Aug 20, 2010 18:12 UTC (Fri) by ricwheeler (subscriber, #4980)

My general point was that anything that takes days or weeks to complete will break eventually. Think of using rsync to mirror a billion files over a wide area network, for example. After a network issue or a power outage, you do not want to have to start from the first file.

How you checkpoint/restart is less critical to me. I would expect some applications (like rsync itself) to be checkpoint-aware and restartable by design. Others would certainly benefit from external checkpointing.
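
To make the "restartable by design" case concrete, here is a minimal Python sketch of application-level checkpointing for a long-running job. It is not how rsync works internally. The function names, the JSON state file, and the `every` interval are all assumptions made up for this illustration. The idea is only that progress is written out periodically and atomically, so a restart resumes near the point of failure instead of at the first file.

```python
import json
import os
import tempfile


def load_checkpoint(state_path):
    """Return the index of the next unprocessed item, or 0 if no checkpoint exists."""
    try:
        with open(state_path) as f:
            return json.load(f)["next_index"]
    except FileNotFoundError:
        return 0


def save_checkpoint(state_path, next_index):
    """Record progress atomically: write a temp file, fsync it, rename over the old one."""
    directory = os.path.dirname(os.path.abspath(state_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"next_index": next_index}, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace is atomic on POSIX when both paths are on the same filesystem,
    # so a crash leaves either the old checkpoint or the new one, never a torn file.
    os.replace(tmp_path, state_path)


def run(items, handle, state_path, every=1000):
    """Process items in order, resuming from the last saved checkpoint (hypothetical helper)."""
    start = load_checkpoint(state_path)
    for i in range(start, len(items)):
        handle(items[i])
        if (i + 1) % every == 0:
            save_checkpoint(state_path, i + 1)
    save_checkpoint(state_path, len(items))
```

Items handled after the last saved checkpoint are handled again on restart, so `handle` has to be idempotent (copying a file that is already in place qualifies). A smaller `every` means less repeated work after a failure, at the cost of more fsync traffic.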

One billion files on Linux

Posted Aug 20, 2010 21:54 UTC (Fri) by mhelsley (guest, #11324)

Thanks for the clarification.

This use of rsync presents an interesting case for the userspace portion of checkpoint/restart.

During checkpoint we often need to checkpoint the contents of the filesystems. One way to do that is with a frozen filesystem and rsync. Obviously if we're rsync'ing to mirror the filesystem in the first place then we shouldn't attempt to checkpoint the rsync task's filesystem(s) with rsync -- we'd want to do a "local" snapshot if possible.

Since the kernel does not force userspace to save the filesystem contents, userspace can choose if and how it will do so. In other words, this case requires no special changes to the checkpoint syscall.