XFS: the filesystem of the future?

Posted Jan 24, 2012 0:03 UTC (Tue) by dgc (subscriber, #6611)
In reply to: XFS: the filesystem of the future? by dlang
Parent article: XFS: the filesystem of the future?

> block numbers will need to change, but I don't see why inode numbers
> would have to change (and if you don't change those, then lots of other
> problems vanish), they are already independent of where on the disk the
> data lives.

Inode numbers in XFS are an encoding of their location on disk. To shrink, you have to physically move inodes and so their number changes.
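
To make the point concrete, here is a minimal sketch of a location-encoding inode number: allocation group number, block within the group, and slot within the block, packed into one integer. The field widths (AGBLKLOG, INOPBLOG) are example values chosen for illustration, not taken from any real superblock, and the helper names are hypothetical.

```python
# Illustrative model only: the inode number *is* the inode's address.
from typing import NamedTuple

AGBLKLOG = 16   # assumed: log2 of blocks per allocation group
INOPBLOG = 4    # assumed: log2 of inodes per filesystem block


class InodeLocation(NamedTuple):
    agno: int    # allocation group number
    agbno: int   # block number within that allocation group
    offset: int  # inode slot within that block


def encode_ino(loc: InodeLocation) -> int:
    """Pack a location into an inode number."""
    return (loc.agno << (AGBLKLOG + INOPBLOG)) | (loc.agbno << INOPBLOG) | loc.offset


def decode_ino(ino: int) -> InodeLocation:
    """Recover the location from an inode number."""
    offset = ino & ((1 << INOPBLOG) - 1)
    agbno = (ino >> INOPBLOG) & ((1 << AGBLKLOG) - 1)
    agno = ino >> (AGBLKLOG + INOPBLOG)
    return InodeLocation(agno, agbno, offset)


old = InodeLocation(agno=7, agbno=1234, offset=3)
new = old._replace(agno=1)  # moved out of an allocation group being removed

assert decode_ino(encode_ino(old)) == old
assert encode_ino(old) != encode_ino(new)  # new location, new inode number
```

Because no lookup table sits between the number and the location, relocating the inode cannot preserve its number.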

> this seems fairly obvious to me, what am I missing that makes the
> simple approach of

[snip description of what xfs_fsr does for files]

> not work? (at least for file data)

Moving data and inodes is trivial - most of that is already there with the [almost finished] xfs_reno (moves inodes) and xfs_fsr (moves data) tools. It's all the other corner cases that are complex and very hard to get right.

The "identify something to move" operation is not trivial in the case of random metadata blocks in the regions that will be shrunk. A file may have all its data in a safe location, but it may have metadata in some place that needs to be moved (e.g. an extent tree block). Same for directories, symlinks, attributes, etc. That currently requires a complete metadata tree walk, which is rather expensive. It will be easier and much faster when the reverse mapping tree goes in, though.
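
The difference between the two lookups can be sketched with a toy model. Everything here is hypothetical: the `forward` table stands in for per-inode metadata, and a sorted list stands in for a reverse mapping tree. It is not how XFS stores either structure; it only shows why an index keyed by physical block makes the search cheap.

```python
import bisect

# Hypothetical forward view: each owner lists the blocks it uses.
forward = {
    "inode 101": [(40, "data"), (41, "data"), (950, "extent tree block")],
    "inode 102": [(60, "directory block"), (61, "data")],
    "inode 103": [(910, "data"), (70, "attribute block")],
}


def find_by_tree_walk(forward, first_doomed_block):
    """Visit every block of every owner; cost grows with the whole filesystem."""
    hits = []
    for owner, blocks in forward.items():
        for block, kind in blocks:
            if block >= first_doomed_block:
                hits.append((block, owner, kind))
    return sorted(hits)


def build_reverse_map(forward):
    """Index keyed by physical block, standing in for a reverse mapping tree."""
    return sorted(
        (block, owner, kind)
        for owner, blocks in forward.items()
        for block, kind in blocks
    )


def find_by_reverse_map(rmap, first_doomed_block):
    """Binary-search to the first doomed block, then read off the tail."""
    start = bisect.bisect_left(rmap, (first_doomed_block,))
    return rmap[start:]


rmap = build_reverse_map(forward)
assert find_by_tree_walk(forward, 900) == find_by_reverse_map(rmap, 900)
```

Note that inode 101 has all its data below block 900 and still shows up, because its extent tree block lives in the region to be removed.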

The biggest piece of work is metadata relocation. For each different type of metadata that needs to be relocated, the action is different - reallocating the metadata block and then updating all the sibling, parent and multiple index blocks that point to it is not a simple thing to do. It's easy to get wrong and hard to validate. And there are a lot of different types, e.g. there are 6 different types of metadata blocks with multiply interconnected indexes in the directory structure alone.
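
As a rough illustration of why relocation touches so many blocks, the sketch below moves one block in a toy tree whose nodes carry parent, child and sibling pointers. The node layout and the `relocate` and `check` helpers are invented for this example and do not correspond to any real XFS metadata format; a real implementation would also need to make all of these updates in one transaction.

```python
def make_node(parent=None, left=None, right=None, children=()):
    """A toy metadata block: pointers are block addresses."""
    return {"parent": parent, "left": left, "right": right, "children": list(children)}


def relocate(blocks, old, new):
    """Move the block at address `old` to `new` and repair every pointer to it."""
    if new in blocks:
        raise ValueError("target address already in use")
    node = blocks.pop(old)
    blocks[new] = node
    if node["parent"] is not None:          # parent's child pointer
        kids = blocks[node["parent"]]["children"]
        kids[kids.index(old)] = new
    if node["left"] is not None:            # left sibling's forward link
        blocks[node["left"]]["right"] = new
    if node["right"] is not None:           # right sibling's back link
        blocks[node["right"]]["left"] = new
    for child in node["children"]:          # children's parent pointers
        blocks[child]["parent"] = new


def check(blocks):
    """Return True if every pointer is matched by its counterpart."""
    for addr, node in blocks.items():
        if any(blocks[c]["parent"] != addr for c in node["children"]):
            return False
        if node["right"] is not None and blocks[node["right"]]["left"] != addr:
            return False
        if node["left"] is not None and blocks[node["left"]]["right"] != addr:
            return False
    return True


blocks = {
    10: make_node(children=[20, 900, 30]),
    20: make_node(parent=10, right=900),
    900: make_node(parent=10, left=20, right=30),   # lives in the doomed region
    30: make_node(parent=10, left=900),
}
relocate(blocks, 900, 25)
assert check(blocks)
```

Even in this toy, one move rewrites three other blocks; miss any one of them and `check` fails.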

> if you try to do this on a live filesystem

If we want it to be a fail-safe operation then it can only be done online. xfs_fsr and xfs_reno already work online and are fail-safe. Essentially, every metadata change must be atomic and recoverable and that means it has to be done through the transaction subsystem. We don't have a transaction subsystem implemented for offline userspace utilities, so a failure during an offline shrink would almost certainly result in a corrupted filesystem or data loss. :(

In case you hadn't guessed by now, one of the reasons we haven't implemented shrinking is that we know *exactly* how complex it actually is to get it right. We're not going to support a half-baked implementation that screws up, so either we do it right the first time or we don't do it at all. But if someone wants to step up to do it right then they'll get all the help they need from me. ;)

Dave.


XFS: the filesystem of the future?

Posted Jan 24, 2012 0:41 UTC (Tue) by dlang (subscriber, #313)

> Inode numbers in XFS are an encoding of their location on disk. To shrink, you have to physically move inodes and so their number changes.

If I understand this correctly, this means that a defrag operation would have the same problems. Does this mean that there is no way (other than backup/restore) to defrag XFS?

As for the rest of the problems (involving moving metadata), would a data-only shrink that couldn't move metadata make any sense at all?

XFS: the filesystem of the future?

Posted Jan 24, 2012 2:04 UTC (Tue) by dgc (subscriber, #6611)

xfs_fsr doesn't change the inode number. It copies the data to another temporary file and if the source file hasn't changed once the copy is complete, it atomically swaps the extents between the two inodes via a special transaction. It uses invisible IO, so not even the timestamps on the inode being defragged get changed.
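
The copy-then-swap pattern described above can be modelled in a few lines. This is a toy simulation, not xfs_fsr's code: the `Inode` class, the `generation` change counter and the `during_copy` hook are all hypothetical, and the tuple swap merely stands in for the single atomic extent-swap transaction.

```python
from dataclasses import dataclass


@dataclass
class Inode:
    ino: int
    extents: list        # list of (start_block, length) tuples
    mtime: int = 0
    generation: int = 0  # hypothetical change counter, bumped on every write

    def write(self, now):
        self.generation += 1
        self.mtime = now


def defrag(target, temp, during_copy=None):
    """Swap in temp's contiguous extents only if target did not change meanwhile."""
    seen = target.generation
    if during_copy is not None:
        during_copy()                    # simulates a writer racing with the copy
    if target.generation != seen:
        return False                     # source changed: abandon, nothing swapped
    # Stand-in for the atomic extent swap; ino and mtime are never touched.
    target.extents, temp.extents = temp.extents, target.extents
    return True


f = Inode(ino=131, extents=[(10, 2), (50, 3), (90, 1)], mtime=100)
tmp = Inode(ino=999, extents=[(200, 6)])   # contiguous copy of the same 6 blocks
assert defrag(f, tmp)
assert (f.ino, f.mtime, f.extents) == (131, 100, [(200, 6)])

g = Inode(ino=132, extents=[(300, 1), (400, 1)])
tmp2 = Inode(ino=998, extents=[(500, 2)])
assert not defrag(g, tmp2, during_copy=lambda: g.write(now=5))
assert g.extents == [(300, 1), (400, 1)]
```

The inode keeps its number because only the extent lists change hands; the inode itself never moves.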

As to a data-only shrink, that makes no sense because metadata like directories will pin the blocks high up in the filesystem, and so you won't be able to shrink it anyway.

XFS: the filesystem of the future?

Posted Jan 24, 2012 8:13 UTC (Tue) by tialaramex (subscriber, #21167)

OK, so XFS doesn't support full defrag, it can't move metadata to improve performance - but it does have a data-only defrag which will be enough for some people.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds