My general point was that anything that takes days or weeks to complete, will break eventually. Think of using rsync to mirror a billion files over a wide area network for example. After a network issue or a power outage, you do not want to have to start from the first file.
How you checkpoint/restart is less critical to me. I would see that some applications (like rsync itself) should be aware and restartable in their design. Others would certainly benefit from external checkpointing.