
The Btrfs inode-number epic (part 2: solutions)


Posted Aug 24, 2021 16:17 UTC (Tue) by zblaxell (subscriber, #26385)
In reply to: The Btrfs inode-number epic (part 2: solutions) by mtu
Parent article: The Btrfs inode-number epic (part 2: solutions)

This description of ZFS snapshots sounds so much less flexible than the btrfs version that I wonder if it's even an accurate description of ZFS.

btrfs snapshots are a lazy version of 'cp -a --reflink'. Users can drop a subvol anywhere in the filesystem and snapshot it anywhere else. This is part of the current problem--there isn't a single administrator-managed tree of subvols or snapshots, because ordinary applications can create and use subvols the same way they make directories (*). An existing NFS export can wake up one morning after an application software upgrade and suddenly find itself hosting a lot of subvols it didn't plan for. This proliferation of subvols is why the obvious solution (create distinct mount points for each and every subvol) isn't very popular (nor is the other obvious solution, lock down subvols so they aren't as trivial to use).

Unlike other popular snapshot systems, btrfs has no distinction between "base" and "clone" subvols. There is a notion of an "original" subvol and a "snapshot" subvol, but it's not part of the implementation; it's only a hint that lets administrators label before-and-after snapshots for incremental send/receive. After a snapshot, both subvols are fully writable equal peers sharing ownership of their POSIX tree and data blocks, the same as if you had done 'cp -a --reflink' atomically. Snapshots have a read-only bit that can be turned on or off (but turning it off means the subvol is no longer synchronized with copies on other filesystems, so it can't be used as a basis for incremental send/receive any more). You can chain snapshots (snap A to B, snap B to C, snap C to D...), with equal cost to write any subvol in the chain. You can also delete any of the snapshots in the chain with equal cost and without disrupting any other snapshot; other snapshot systems will have up to O(n) extra cost if there are n snapshots, or may not be able to delete the original subvol before deleting all snapshots. These properties greatly improve the usability of snapshots for applications, since they can freely switch between treating them as subvol units or as individual files.
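To make the "equal peers" point concrete, here is a toy model in Python. It is only a sketch and not how btrfs is implemented: btrfs shares copy-on-write B-tree nodes so that taking a snapshot is cheap, whereas this model copies a flat path-to-block map at snapshot time. The names BlockStore and Subvol are made up for illustration. What it does show is that reference-counted sharing has no "base" or "clone" role: any subvol in a chain can be written or deleted without touching the others.

```python
class BlockStore:
    """Reference-counted data blocks shared by all subvols (hypothetical toy)."""

    def __init__(self):
        self.data = {}      # block id -> payload
        self.refs = {}      # block id -> number of subvols referencing it
        self._next_id = 0

    def alloc(self, payload):
        block_id = self._next_id
        self._next_id += 1
        self.data[block_id] = payload
        self.refs[block_id] = 1
        return block_id

    def get(self, block_id):
        self.refs[block_id] += 1

    def put(self, block_id):
        self.refs[block_id] -= 1
        if self.refs[block_id] == 0:   # last owner gone: free the block
            del self.refs[block_id]
            del self.data[block_id]


class Subvol:
    """A subvol is just a map from path to block id. No parent pointer."""

    def __init__(self, store, files=None, read_only=False):
        self.store = store
        self.files = dict(files or {})
        self.read_only = read_only

    def read(self, path):
        return self.store.data[self.files[path]]

    def write(self, path, payload):
        if self.read_only:
            raise PermissionError("subvol is read-only")
        old = self.files.get(path)
        # Copy-on-write: never modify a block in place, allocate a new one.
        self.files[path] = self.store.alloc(payload)
        if old is not None:
            self.store.put(old)

    def snapshot(self, read_only=False):
        # Toy simplification: O(files) here; the real thing shares tree nodes.
        for block_id in self.files.values():
            self.store.get(block_id)
        return Subvol(self.store, self.files, read_only)

    def delete(self):
        for block_id in self.files.values():
            self.store.put(block_id)
        self.files.clear()


store = BlockStore()
a = Subvol(store)
a.write("f", b"v1")
b = a.snapshot()        # snap A to B
c = b.snapshot()        # snap B to C
b.write("f", b"v2")     # write to the middle of the chain
a.delete()              # delete the "original" first
print(b.read("f"), c.read("f"))
```

Deleting a before its snapshots works because nothing in the model records which subvol came first; the block holding v1 stays alive only because c still references it.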

(*) If that seems weird, observe that a long time ago 'mkdir' required root privileges (**).

(**) OK there were different reasons for that. Still, ideas about what is "normal" for a filesystem and what is "privileged" do change over time.



