
XFS parent pointers

By Jake Edge
May 7, 2018


At the 2018 Linux Storage, Filesystem, and Memory-Management Summit (LSFMM), Allison Henderson led a session to discuss an XFS feature she has been working on: parent pointers. These would be pointers stored in extended attributes (xattrs) that would allow various tools to reconstruct the path for a file from its inode. In XFS repair scenarios, that path will help with reconstruction as well as provide users with better information about where the problems lie.


The patch set has had a "bumpy history", she said. Lots of issues were identified with earlier versions of the patch set, which have now been addressed. Historically there were problems with locking order, but now the goal is to not have to lock the parent inode when creating the parent pointer. The xattr name will be the parent inode number and generation, along with the directory offset of the file. The xattr value will be the file name.
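As a rough illustration of that name/value split, here is a minimal sketch in Python. The field widths and byte order (64-bit inode number, 32-bit generation, 32-bit directory offset, big-endian) and the names `ParentPointer`, `encode`, and `decode` are assumptions made for the example; the session did not describe the actual on-disk encoding.

```python
import struct
from typing import NamedTuple

# Hypothetical key layout: inode number (u64), generation (u32),
# directory offset (u32), big-endian. The real format may differ.
_KEY = struct.Struct(">QII")


class ParentPointer(NamedTuple):
    parent_ino: int   # inode number of the parent directory
    parent_gen: int   # generation of the parent inode
    dir_offset: int   # offset of this entry within the parent directory
    name: bytes       # file name as it appears in the parent directory


def encode(pp):
    """Return the (xattr name, xattr value) pair for one parent pointer."""
    key = _KEY.pack(pp.parent_ino, pp.parent_gen, pp.dir_offset)
    return key, pp.name


def decode(key, value):
    """Rebuild a ParentPointer from an (xattr name, xattr value) pair."""
    ino, gen, off = _KEY.unpack(key)
    return ParentPointer(ino, gen, off, value)
```

Because the directory offset is part of the name, two links to the same file from the same directory get distinct keys, while the file name itself travels in the value.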

Jeff Layton said he sees how it would be useful to be able to walk the tree back to the root to recreate the path, but wondered about hard links. Dave Chinner said that each link would create its own parent pointer attribute. Al Viro asked about rename operations during the tree walk, but Chinner said there is no real problem there. The walk is done in user space (using ioctl() calls); the idea is that if there is a problem in inode X, sector Y, a reverse lookup can be done to provide the user with the path. If the path changes during the walk, the user-space program should redo it.
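The reverse lookup described here can be sketched with a toy in-memory model. Everything in this example is hypothetical: the `parents` dictionary stands in for whatever the ioctl() interface would return, `ROOT_INO` is an assumed root inode number, and comparing two snapshots is just one possible way for user space to notice that the tree changed mid-walk.

```python
ROOT_INO = 128  # assumed root inode number for this toy model


def paths_for(ino, parents, root=ROOT_INO):
    """Return every path from the root to ino.

    parents maps an inode number to a list of (parent_ino, name) pairs,
    one pair per hard link, mirroring one parent pointer per link.
    """
    if ino == root:
        return ["/"]
    found = []
    for parent_ino, name in parents.get(ino, []):
        for prefix in paths_for(parent_ino, parents, root):
            found.append(prefix.rstrip("/") + "/" + name)
    return sorted(found)


def stable_paths(ino, read_parents, max_tries=5):
    """Redo the walk until the snapshots taken before and after it agree."""
    for _ in range(max_tries):
        before = read_parents()
        result = paths_for(ino, before)
        if read_parents() == before:
            return result
    raise RuntimeError("tree kept changing during the walk")
```

For a file with two hard links, `paths_for()` returns two paths, which matches Chinner's point that each link carries its own parent pointer.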

Henderson said that one use case is for online scrub and repair. It will allow inodes that have been orphaned to be reconnected correctly. The error reporting will also be better because there will be a path associated with the inode where problems were found. She is trying to gather information on other use cases so that she can ensure that the feature supports them. Chinner said that filesystem repair is an important use; simply dumping a million files into the lost+found directory is useless.

Ted Ts'o asked about the performance of the feature. Chinner said it simply added an xattr operation to each file create, rename, link, and unlink operation. That should be fine if the xattr fits in the inode, Ts'o said, but Chinner noted that xattrs are being used everywhere these days, so xattr operations are generally expected.


Index entries for this article
Kernel: Filesystems/XFS
Conference: Storage, Filesystem, and Memory-Management Summit/2018



XFS parent pointers

Posted May 7, 2018 18:08 UTC (Mon) by k8to (guest, #15413) [Link] (10 responses)

Could you label your file with a particular xattr to cause online scrub to move your file somewhere you normally wouldn't be able to place it?

XFS parent pointers

Posted May 7, 2018 20:09 UTC (Mon) by bfields (subscriber, #19510) [Link]

No XFS expert, but my understanding is that this parent pointer is represented as an xattr on disk, but is not something you could modify through the normal xattr interface.

XFS parent pointers

Posted May 8, 2018 1:39 UTC (Tue) by dgc (subscriber, #6611) [Link] (8 responses)

> Could you label your file with a particular xattr to cause online scrub to move your
> file somewhere you normally wouldn't be able to place it?

No. These parent pointer xattrs will be in a private xattr namespace that only the kernel code can access. Essentially they are part of the metadata of the filesystem, and users cannot access filesystem metadata directly.

-Dave.

XFS parent pointers

Posted May 8, 2018 6:52 UTC (Tue) by dgm (subscriber, #49227) [Link] (2 responses)

Unless they can modify the filesystem on disk, that is.

XFS parent pointers

Posted May 8, 2018 14:52 UTC (Tue) by k8to (guest, #15413) [Link]

At which point, presumably they could put the file anywhere anyway.

XFS parent pointers

Posted May 8, 2018 22:32 UTC (Tue) by dgc (subscriber, #6611) [Link]

> Unless they can modify the filesystem on disk, that is.

And that, folks, is why we don't let anyone other than root access block devices directly or mount filesystem images.

Because if anyone can modify the filesystem on disk then we're completely and utterly screwed, parent pointers or not. And that goes for any filesystem that doesn't have a cryptographically secure on-disk format (e.g. XFS, ext4, btrfs, f2fs, etc). i.e. no filesystem except maybe bcachefs is robust against such tampering.

-Dave.

XFS parent pointers

Posted May 8, 2018 19:07 UTC (Tue) by nix (subscriber, #2304) [Link] (3 responses)

This doesn't mean an extra block allocated per file, right? For files without (many) xattrs, presumably this is small enough to get packed in with existing metadata? (The space usage for an extra block for filesystems containing many very small files might be quite painful, not to mention the disk seeks -- though XFS is damn good at keeping those down these days. :) )

XFS parent pointers

Posted May 8, 2018 22:27 UTC (Tue) by dgc (subscriber, #6611) [Link] (2 responses)

> This doesn't mean an extra block allocated per file, right?

In most cases there will be no extra allocation - the xattr will easily fit inside the inode for typical single parent, short name files on a default 512 byte inode filesystem. It's not until you have multiple hard links or filenames > 100 bytes that the xattrs will tend to go out of line. Or you have lots of other xattrs, in which case they're at risk of being moved out of line, anyway. If you're really worried about xattrs being kept in line, then you can always format the filesystem with 1kB or 2kB inodes....
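A back-of-the-envelope version of that reasoning, as a sketch only: the three constants below are illustrative assumptions, not figures from the XFS on-disk format, and the real inline budget depends on what else lives in the inode.

```python
KEY_BYTES = 16      # assumed: 8-byte inode number + 4-byte generation + 4-byte offset
ENTRY_OVERHEAD = 3  # assumed per-xattr header bytes
ATTR_BUDGET = 128   # assumed inline space left for xattrs in a 512-byte inode


def inline_bytes(link_names):
    """Estimated inline space used by one parent pointer per link name."""
    return sum(KEY_BYTES + ENTRY_OVERHEAD + len(name) for name in link_names)


def fits_inline(link_names, budget=ATTR_BUDGET):
    """True if all parent pointers fit within the assumed inline budget."""
    return inline_bytes(link_names) <= budget
```

Under these assumed numbers a single short name fits comfortably, while a 120-byte name or four 20-byte names do not; passing a larger `budget` models formatting with bigger inodes.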

-Dave.

XFS parent pointers

Posted May 9, 2018 15:51 UTC (Wed) by nix (subscriber, #2304) [Link] (1 responses)

Oh right. I'm only worrying because I'm hacking up a replacement for GNU Stow that uses hardlinks where possible, unstowing by comparing the st_dev/st_ino of appropriately-located files in the target and stow tree, to reduce the visible behaviour change GNU stow and graft can cause with their huge symlink farms (since lots of programs care whether things are symlinks or not but almost none check their own read-only files to see what their link count is). If this works, systems that use it would end up with hundreds of thousands of files with link count 2.

However, I suspect that for your average file, two hardlinks is fine for inline storage of the parent pointers as well (inodes aren't *that* small). If a file has ten, twenty, or fifty hardlinks, I'd frankly *expect* parent pointers to be moved out of line. :)
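The st_dev/st_ino comparison mentioned above can be sketched as follows. The function names `same_file` and `unstow` are hypothetical and this is not the commenter's tool, only an illustration of the check using standard `os` calls.

```python
import os


def same_file(path_a, path_b):
    """True if both paths name the same inode on the same device."""
    a = os.lstat(path_a)  # lstat: do not follow symlinks
    b = os.lstat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def unstow(target, stowed):
    """Remove target only if it is a hard link to the stowed file."""
    if os.path.lexists(target) and same_file(target, stowed):
        os.unlink(target)
        return True
    return False
```

A file installed this way has a link count of 2 (visible as `st_nlink`), so with parent pointers it would carry two parent pointer xattrs, one per directory entry.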

XFS parent pointers

Posted May 21, 2018 1:54 UTC (Mon) by njs (subscriber, #40338) [Link]

FYI in case you're looking for other projects that make similar demands, conda is also a heavy user of hardlinks, for a somewhat similar use case.

XFS parent pointers

Posted May 9, 2018 3:54 UTC (Wed) by k8to (guest, #15413) [Link]

Thank you for the clarification!

XFS parent pointers

Posted May 7, 2018 18:50 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link] (10 responses)

Look on the bright side of this change - it'll allow directory hardlinks!

/me runs and hides.

XFS parent pointers

Posted May 8, 2018 22:00 UTC (Tue) by Paf (subscriber, #91811) [Link] (9 responses)

Wait, how? And come to think of it, why aren’t those allowed today?

XFS parent pointers

Posted May 8, 2018 22:05 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (3 responses)

I refer you to a previous discussion: https://lwn.net/Articles/681685/

XFS parent pointers

Posted May 14, 2018 21:10 UTC (Mon) by viro (subscriber, #7872) [Link] (2 responses)

Still no go - you'd need to scan all "ancestor" chains (with all the IO it would imply) and do that while the graph structure is guaranteed not to change under you. Good luck with the locking, or with DoS potential in that...

XFS parent pointers

Posted May 14, 2018 23:16 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

Shouldn't all of the parents' dentries already be in RAM, though? Locking will be more complicated, but a parallel "rename stack locking" structure specifically for renames should suffice.

XFS parent pointers

Posted May 14, 2018 23:54 UTC (Mon) by viro (subscriber, #7872) [Link]

Why would they? You've looked up foo/bar/baz and found that oh, BTW, it also has links from a/b/c/d/e/never/been/there/whatever/the/hell/it/might/be and a dozen other directories. Each of those having a bunch of ancestors of its own, etc.

Ancestors are guaranteed to be in dcache when there's only one path from root to it... Even that takes some work to maintain in the face of open_by_handle() - fs/exportfs/expfs.c is where it's dealt with. With multiple paths, that becomes a non-starter; do you really want stat(2) capable of sucking in thousands of directories from disk, all of that - with rename/link/unlink blocked on the entire filesystem? Or doing that already joyful work on a graph that keeps changing under you...

Sure, you could pull *everything* into dcache on mount and keep it there all along. Then everything will be in dcache at all times, but that means memory footprint from hell and hash chains' lengths from the same place. And you still have a potentially enormous graph (remember, you've got everything in dcache) and need to answer questions like "will it remain a connected directed graph if we remove this edge?" and "will adding such an edge create a loop in it?", atomically wrt graph modifications...
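To make the "will adding such an edge create a loop?" question concrete, here is a minimal sketch on a toy in-memory graph. The `graph` dictionary (directory name mapped to its child directories) and both function names are assumptions for illustration; the sketch deliberately ignores the hard part raised above, which is answering the question atomically while the graph is being modified and without reading thousands of directories from disk.

```python
def reachable(graph, start, goal):
    """Iterative depth-first search: is goal reachable from start?"""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return False


def link_would_loop(graph, new_parent, child):
    """Would adding the edge new_parent -> child create a cycle?"""
    return reachable(graph, child, new_parent)
```

With `graph = {"tmp": {"p"}, "p": {"c"}}`, linking `p` into `c` is rejected because `c` is already reachable from `p`. Even in this toy form the check may visit every directory below `child`, which hints at the I/O and locking cost described in the comment.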

XFS parent pointers

Posted May 8, 2018 22:29 UTC (Tue) by nevyn (guest, #33129) [Link] (4 responses)

Answering backwards:

> And come to think of it, why aren’t those allowed today?

Because it's impossible to stop loops. Eg. /tmp/p/c and then you hardlink p into c. gg.
Lots of bad things happen if you do that.

> Wait, how?

Because, in theory, if you have a single "parent" for each directory entry you don't have loops. I'm less sure this solves all the problems though. At best it seems like another way to do mount --bind, at worst you'd have all the same problems.

XFS parent pointers

Posted May 10, 2018 0:32 UTC (Thu) by Paf (subscriber, #91811) [Link]

Thanks nevyn, and cyberax as well.

XFS parent pointers

Posted May 14, 2018 12:55 UTC (Mon) by cortana (subscriber, #24596) [Link] (2 responses)

I believe Mac OS X supports directory hardlinks on HFS+... does anyone know how software that isn't prepared for this behaves if you set up an infinite loop in this way?

XFS parent pointers

Posted May 16, 2018 1:15 UTC (Wed) by foom (subscriber, #14868) [Link]

HFS+'s directory hardlink support doesn't let you make loops. It has a bunch of restrictions which make that impossible, although they also prohibit making some directory hardlinks which _don't_ create a loop. (That's okay, it wasn't intended to be a generally-used feature, only really for Time Machine backups' use.)

See the comment starting "Source parent and" here, for what it actually checks: https://opensource.apple.com/source/hfs/hfs-407.30.1/core...

XFS parent pointers

Posted Jun 8, 2018 23:58 UTC (Fri) by JanC_ (guest, #34940) [Link]

You can make directory hardlinks on NTFS too, by doing direct NT kernel API calls, but it's not a good idea to actually do that (I don't think anything in Windows userspace can actually handle that properly).


Copyright © 2018, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds