The Btrfs inode-number epic (part 2: solutions)
Posted Aug 23, 2021 18:38 UTC (Mon) by martin.langhoff (guest, #61417)
Parent article: The Btrfs inode-number epic (part 2: solutions)
A tough tradeoff it seems. Questions...
What's the fallout if the inodes are not unique? Given that large modern systems can be really large, inode collisions might be just a fact of life.
And... the solution is an intermediate "let's limit the repercussions on other software" solution. Sure. So then... is there a clear correct way to check for unique inode that is sane, clear of collisions and portable (across filesystems)?
In other words, if I was a maintainer of a deduplicator utility, or developing the next version of NFS, and I'm alert enough to be reading this article, is there a clear way to DTRT?
While today we want to not break the world, we're also building tomorrow...
Posted Aug 23, 2021 22:04 UTC (Mon)
by neilbrown (subscriber, #359)
This is unknowable in general - it depends on exactly what assumptions various applications make.
We know some specific problems.
There are probably others. However most code would never notice.
> So then... is there a clear correct way to check for unique inode that is sane, clear of collisions and portable
Probably not. Even the current best-case behaviour of file-systems like ext4 does not provide the guarantees that I have described tar as requiring (it is possible I've misrepresented 'tar' - I haven't checked the code).
Tracking the identity of filesystems (to detect these mounts) is not well supported. st_dev is, as I say, transient for some filesystems. The statfs() system call reports an "fsid", but this is poorly specified. The man page for statfs() says "Nobody knows what f_fsid is supposed to contain". Some filesystems (btrfs, xfs, ext4 and others) provide good values. Other filesystems do less useful things. Some just provide st_dev in a different encoding.
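To see what a given filesystem actually reports, a small sketch along these lines can print st_dev next to the fsid. It rests on some assumptions: the helper name `filesystem_ids` is made up for illustration, and Python's `os.statvfs()` wraps statvfs(3) rather than statfs(2), so the `f_fsid` it returns is a single integer that may not carry exactly the same information as the statfs() value.

```python
import os

def filesystem_ids(path):
    """Collect the filesystem identifiers visible for `path` (illustrative helper).

    st_dev comes from stat(); f_fsid comes from statvfs(), which may
    encode the filesystem's fsid differently from statfs().
    """
    st = os.stat(path)
    vfs = os.statvfs(path)
    return {
        "st_dev": st.st_dev,
        "major": os.major(st.st_dev),
        "minor": os.minor(st.st_dev),
        "f_fsid": vfs.f_fsid,
    }

if __name__ == "__main__":
    for key, value in filesystem_ids(".").items():
        print(f"{key}: {value}")
```

Comparing the output across mounts shows which filesystems give an fsid that is independent of st_dev and which simply re-encode it.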
Posted Aug 24, 2021 5:40 UTC (Tue)
by ibukanov (subscriber, #3942)
Posted Aug 24, 2021 16:33 UTC (Tue)
by jonesmz (subscriber, #130234)
- if a directory has the same inode number as an ancestor, find/du etc will refuse to enter that directory.
- if a 'tar' archive is being created of a tree, and two *different* files both have multiple links and both have the same inode number, then the second one found will not be included in the archive (I *think* tar doesn't track inode numbers for dirs or for objects with only one link).
- Other tools that collect files, like rsync and cpio, will have similar problems.
- Various tools probably cache a dev/ino against a name, and if a subsequent stat shows that same dev/ino, they assume it is the same object. So if a given name referred to two different inodes over time, which happen to have the same inode number, such tools would behave incorrectly. (All of these are unlikely with my overlay scheme - this one more so than most.)
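The kind of bookkeeping described in the list above can be sketched as follows. This is not the code of tar, find, or du; the function names are hypothetical and the logic is only a guess at the general technique: remember (st_dev, st_ino) for multiply-linked files, and for the directories on the current path.

```python
import os

def find_hardlink_duplicates(root):
    """Return (path, first_path) pairs that an archiver using this scheme
    would treat as extra links to an already-seen file.

    Only files with more than one link are tracked, keyed by
    (st_dev, st_ino). Two *different* files that share a key would be
    wrongly reported here, which is the collision problem.
    """
    seen = {}          # (dev, ino) -> first path found with that identity
    duplicates = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if st.st_nlink < 2:
                continue   # single-link files are never looked up
            key = (st.st_dev, st.st_ino)
            if key in seen:
                duplicates.append((path, seen[key]))
            else:
                seen[key] = path
    return duplicates

def looks_like_loop(ancestor_keys, st):
    """Loop check in the style attributed to find/du: refuse to descend
    if the directory's (dev, ino) matches one of its ancestors."""
    return (st.st_dev, st.st_ino) in ancestor_keys
```

With unique inode numbers both checks are exact; with collisions, the first drops a file from the archive and the second skips a directory.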
The "compare st_dev and st_ino" approach is only completely reliable when you have both files open. If you don't, it is possible for the first file to be deleted after you 'stat' it, and then for the second file to be created with the same inode number.
Use of "ctime", or even "btime" where supported, would help here.
So comparing dev, ino, and btime should be sufficient, provided btime is supported. Almost.
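As a rough illustration of both ideas, the sketch below compares two files while holding them open, and builds a (dev, ino, btime) key for later comparison. The function names are hypothetical. It assumes that `st_birthtime` is only present on platforms where Python's `os.stat()` exposes it; where it is missing, the key falls back to `None` for that field and is correspondingly weaker.

```python
import os

def same_open_file(path_a, path_b):
    """Compare identity while both files are held open, so neither inode
    can be freed and reused between the two fstat() calls."""
    fd_a = os.open(path_a, os.O_RDONLY)
    try:
        fd_b = os.open(path_b, os.O_RDONLY)
        try:
            # os.path.samestat compares st_dev and st_ino.
            return os.path.samestat(os.fstat(fd_a), os.fstat(fd_b))
        finally:
            os.close(fd_b)
    finally:
        os.close(fd_a)

def identity_key(st):
    """Key for comparing stat results taken at different times.

    btime is None when the platform or filesystem does not report it.
    """
    btime = getattr(st, "st_birthtime", None)
    return (st.st_dev, st.st_ino, btime)
```

A tool that cannot keep files open could store `identity_key(os.stat(path))` and compare keys later, accepting the remaining gaps.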
Another possible (though unlikely) problem is that these objects might be on auto-mounted filesystems.
If you stat a file, get busy with something else and the filesystem gets unmounted, then some other filesystem gets mounted, the second filesystem *might* get the same st_dev as the first filesystem. So if you then stat a file on the new filesystem, it could be a completely different file on a different filesystem, but might have the same st_dev, st_ino, and st_btime.