The Btrfs inode-number epic (part 2: solutions)
Posted Aug 29, 2021 9:22 UTC (Sun) by NYKevin (subscriber, #129325)
In reply to: The Btrfs inode-number epic (part 2: solutions) by neilbrown
Parent article: The Btrfs inode-number epic (part 2: solutions)
Why is it necessary to use something that has the potential for collisions at all? Why not just hand out arbitrary or sequential numbers in a centralized fashion (like every other filesystem that isn't FAT)? Is there some rule that says you're not allowed to look at subvolume X when you make a new file in subvolume Y? Why would such a rule be necessary?
Posted Aug 29, 2021 12:31 UTC (Sun)
by foom (subscriber, #14868)
Posted Aug 30, 2021 15:05 UTC (Mon)
by zblaxell (subscriber, #26385)
To get globally unique and stable inode numbers without a separate subvol ID, the filesystem would have to dynamically remap duplicate inode numbers from subvol-local values to globally unique values every time a readdir() or stat() happened. That adds overhead to every read operation, and filesystem maintainers are reluctant to implement it. They would prefer a more efficient way to tell an application "this is a distinct inode-number namespace but not a distinct filesystem", so that applications that rely on uniqueness bear some of the costs (including opportunity costs) of implementing it, without imposing new costs (such as an O(log(N)) search on every stat(), or exploding the size of /proc/mounts and `df` output) on applications that don't care about inode uniqueness.
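As a rough illustration of what "applications that rely on uniqueness bear some of the costs" can look like, here is a minimal Python sketch of an application that identifies files by the pair (st_dev, st_ino) rather than by inode number alone. It assumes each inode-number namespace is reported with its own st_dev, which is what Btrfs subvolumes do for local stat() calls; the helper names `file_identity` and `count_distinct_files` are made up for this example.

```python
import os

def file_identity(path):
    """Return a key for the inode behind `path`.

    Assumption: every inode-number namespace has its own st_dev, so the
    (st_dev, st_ino) pair is unique even when st_ino alone is not.
    """
    st = os.stat(path)
    return (st.st_dev, st.st_ino)

def count_distinct_files(paths):
    """Count distinct inodes among `paths`; hard links count once."""
    return len({file_identity(p) for p in paths})
```

An application that never compares inode numbers pays nothing extra here; only the ones that care about identity carry the wider key.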
The NFS server could maintain its own persistent unique inode numbers in a mapping table outside of the filesystem, and not send the filesystem's inode numbers to clients at all, but that has obvious and onerous runtime costs (the NFS server would have to maintain persistent state proportional to filesystem size).
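To make the cost concrete, here is a minimal sketch of such a mapping table in Python. It is purely illustrative and not how any real NFS server works: the class name `InodeRemapTable` and its methods are hypothetical, and the table lives in memory, whereas a real one would have to be persisted so the exported numbers stay stable across restarts.

```python
import itertools

class InodeRemapTable:
    """Hypothetical server-side table: (subvol_id, local_ino) -> exported ino."""

    def __init__(self):
        self._next = itertools.count(1)   # next unused exported number
        self._table = {}                  # one entry per file ever exported

    def exported_ino(self, subvol_id, local_ino):
        """Return a stable, globally unique number for this file."""
        key = (subvol_id, local_ino)
        if key not in self._table:
            self._table[key] = next(self._next)
        return self._table[key]

    def __len__(self):
        return len(self._table)
```

Two files with the same inode number in different subvolumes get different exported numbers, but the table gains one entry per file seen, which is the "state proportional to filesystem size" problem.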