
The copy problem is really the backup problem


Posted May 30, 2019 15:01 UTC (Thu) by mcr (subscriber, #99374)
Parent article: The Linux "copy problem"

In the old days of Unix, with a single file system, we used "dump" to get a good copy. It had all sorts of ridiculous issues that came from a userspace program trying to decipher file system contents from raw reads of the disk. On the other hand, when it worked, it got all the metadata, did so without destroying the buffer cache, and was often able to back up disks which were in the process of dying. When it failed, it failed, and the backups were sometimes garbage. And it didn't work for many things. So people mostly use tar for backup. And that should be the most common copy problem, which is not just about data centers or cluster environments. And tar fails for any file system that does something innovative.
My claim is that our VFS layer is incomplete: it should include an atomic backup and an atomic restore operation, at least on a file level, but optionally on a directory basis. If we had that, then cp would always usefully be backup file | restore file2. This means that file systems have to serialize file contents and metadata, and have to deserialize them too. Were Linux a microkernel architecture, then probably much of this deserialization could be done in some system-provided, non-ring0 context. Whether we should pick tar for serialization, or something more modern like CBOR, is a bike shed for a design team.
I would just be happy if we could agree that we need this functionality.
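
To make the "backup file | restore file2" idea concrete, here is a rough userspace sketch in Python. It is only an approximation under stated assumptions: the function names backup() and restore() and the stream layout (a length-prefixed JSON header followed by the raw contents) are invented for illustration, the copy is not atomic, and it captures only the metadata that userspace can see (mode, timestamps, extended attributes). A real VFS-level operation would be implemented by each file system and could carry everything.

```python
import json
import os
import shutil
import stat
import struct


def backup(path, out):
    """Serialize one file (some metadata, then contents) to binary stream `out`.

    Hypothetical format: 4-byte big-endian header length, JSON header, raw data.
    Not atomic: the file may change between the stat() and the read.
    """
    st = os.stat(path)
    xattrs = {}
    if hasattr(os, "listxattr"):  # extended attributes are Linux-only here
        try:
            for name in os.listxattr(path):
                xattrs[name] = os.getxattr(path, name).hex()
        except OSError:
            pass  # file system without xattr support: nothing to record
    header = json.dumps({
        "mode": stat.S_IMODE(st.st_mode),
        "atime_ns": st.st_atime_ns,
        "mtime_ns": st.st_mtime_ns,
        "xattrs": xattrs,
    }).encode()
    out.write(struct.pack(">I", len(header)))
    out.write(header)
    with open(path, "rb") as src:
        shutil.copyfileobj(src, out)


def restore(inp, path):
    """Recreate a file at `path` from a stream produced by backup()."""
    raw_len = inp.read(4)
    if len(raw_len) != 4:
        raise ValueError("truncated stream: missing header length")
    (header_len,) = struct.unpack(">I", raw_len)
    raw_header = inp.read(header_len)
    if len(raw_header) != header_len:
        raise ValueError("truncated stream: incomplete header")
    meta = json.loads(raw_header)
    with open(path, "wb") as dst:
        shutil.copyfileobj(inp, dst)  # everything after the header is data
    os.chmod(path, meta["mode"])
    if hasattr(os, "setxattr"):
        for name, value in meta["xattrs"].items():
            try:
                os.setxattr(path, name, bytes.fromhex(value))
            except OSError:
                pass  # best effort: some attributes need privileges
    # Set timestamps last so the earlier writes do not disturb them.
    os.utime(path, ns=(meta["atime_ns"], meta["mtime_ns"]))
```

Calling backup(path, sys.stdout.buffer) in one process and restore(sys.stdin.buffer, path2) in another gives the pipeline shape described above. Everything this sketch misses (ownership, ACLs on some systems, sparse regions, reflinks, file-system-specific flags) is exactly the part that only the file system itself can serialize.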



The copy problem is really the backup problem

Posted Jun 4, 2019 9:18 UTC (Tue) by jezuch (subscriber, #52988)

At least on btrfs that's:

btrfs subvolume snapshot -r
btrfs send
btrfs receive

But it does not work on a per-file basis, unfortunately. And yes, btrfs defines its own serialization format.

The copy problem is really the backup problem

Posted Jun 4, 2019 14:06 UTC (Tue) by mcr (subscriber, #99374)

What happens if a file is open (ETXTBSY), or open with O_DIRECT, or in any of these other states that might be mutually exclusive with regular I/O?

The copy problem is really the backup problem

Posted Jun 19, 2019 21:32 UTC (Wed) by nix (subscriber, #2304)

btrfs send and receive are not regular I/O, so they work fine. (Though I'm not sure what happens in conjunction with O_DIRECT, which is a bit... hard to grasp the semantics of on a CoW filesystem in any case.)

(You don't get -ETXTBSY if you read a file in any case, only if you try to modify it.)

