XFS online filesystem scrubbing and repair

By Jake Edge
May 16, 2018

In a filesystem track session at the 2018 Linux Storage, Filesystem, and Memory-Management Summit (LSFMM), Darrick Wong talked about the online scrubbing and repair features he has been working on. His target has mostly been XFS, but he has concurrently been working on scrubbing for ext4. Part of what he wanted to discuss was the possibility of standardizing some of these interfaces across different filesystem types.

Filesystem scrubbing is typically an ongoing activity to try to find corrupted data by periodically reading the data on the disk. Online repair attempts to fix the problems found by using redundant information (or metadata that can be calculated from other information) stored elsewhere in the filesystem. As described in Wong's patch series, both scrubbing and repair are largely concerned with filesystem metadata, though scrubbing data extents (and repairing them if possible) is also supported. Wong said that XFS now has online scrubbing support, but does not quite have the online repair piece yet.

Btrfs has support for online scrubbing and ext4 will eventually as well. Wong wondered if there was an opportunity to create a common wrapper for user space. Ted Ts'o said that it would help if there was some clarity about the goals and requirements of a scrubber tool. He asked, is it a cron job that scrubs all the filesystems or might there be individual crontab entries for ext4 and XFS? Clearly the goal should be to make the system administrator's life better.

Chris Mason brought up the CRC checks that the filesystems currently do. When those CRC checks fail, each filesystem logs its own message to dmesg, and there is no consistency between the filesystems in what that message looks like. Wong recommended that Btrfs return a "filesystem corrupt" error status to user space as ext4 and XFS do, but Mason pointed out that CRC errors are not only found during filesystem scrubbing.

Kent Overstreet said that he had a framework that could be used for long-running jobs in the kernel. It returns a file descriptor that can be used to monitor the job. Wong said that the XFS scrubbing consists of many ioctl() commands that are called from user space. Overstreet said that sounded harder to deal with. Josef Bacik said that Btrfs is similar to XFS, but that having a single file descriptor might be better.

Dave Chinner wondered if there was a way to have a single scrubbing command that handled any kind of filesystem, so that users do not have to remember how to do it for each type. No one seemed opposed to the idea, but getting there may take some time.

When data errors are found, some users may not really want to have the filesystem try to repair things, Ric Wheeler said. Instead they will just want the name of the file containing the error so that they can simply get a copy from another server. That requires mapping the blocks back to a path. He also said that a recent paper showed that, while SSDs will last a lot longer than rotating storage, they will generate many more errors (on the order of 10-15 times more) than rotating storage over that time. So these kinds of problems will become more prevalent.

Another thing that needs to be standardized is the I/O priority that these scanners will run with, Mason said.
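One way to picture a shared I/O priority policy is to have every scrubber launched through the same low-priority prefix. The following is only a sketch and was not proposed in the session: it assumes the util-linux ionice tool is installed, and the helper name with_idle_io is hypothetical. Class 3 is ionice's "idle" class; whether it has any effect depends on the I/O scheduler in use.

```python
import shutil

# ionice scheduling class 3 is "idle": the command gets disk time
# only when nothing else is asking for it.
IDLE_CLASS = "3"


def with_idle_io(argv):
    """Return a new argv that runs the given command in the idle I/O class.

    Hypothetical helper for illustration; it only builds the command line
    and does not execute anything.
    """
    return ["ionice", "-c", IDLE_CLASS] + list(argv)


if __name__ == "__main__":
    # Example: wrap a check-only XFS scrub of /mnt (path is an assumption).
    cmd = with_idle_io(["xfs_scrub", "-n", "/mnt"])
    if shutil.which(cmd[0]) is None:
        print("ionice is not installed; cannot set I/O priority")
    else:
        print(" ".join(cmd))
```

Building the prefix in one place means the priority could be changed for all filesystem types at once, which is the kind of consistency Mason was asking for.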

Wong suggested starting with a simple common scrubbing wrapper that would do the right thing for each filesystem type. It would just report whether the metadata had errors and whether the data had errors. From that, administrators could then decide how to fix the errors. Chinner said that there needs to be some standard on what errors get returned, but Wong suggested starting with something simple: 0 for OK, 1 to indicate a problem and that the administrator should check the logs for more information. It was generally agreed that would be a reasonable place to start, though Ts'o cautioned there would be a need to eventually standardize more pieces at multiple levels.
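The simple wrapper Wong described could look roughly like the sketch below. It is an illustration under stated assumptions, not the interface discussed at the summit: the SCRUBBERS table, the function names, and the choice to report an unsupported filesystem as 1 are all hypothetical. The commands in the table are the existing xfs_scrub and btrfs scrub tools; ext4 is left out because it had no online scrubber at the time.

```python
import subprocess
import sys

# Hypothetical table: filesystem type -> scrub command prefix.
# The mount point is appended as the final argument.
SCRUBBERS = {
    "xfs": ["xfs_scrub", "-n"],                  # -n: check only, no repair
    "btrfs": ["btrfs", "scrub", "start", "-B"],  # -B: stay in the foreground
}


def fstype_of(mountpoint, mounts_text):
    """Look up the filesystem type of a mount point.

    mounts_text is expected in /proc/self/mounts format:
    device, mount point, type, options, ...
    The last matching line wins, since later mounts shadow earlier ones.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mountpoint:
            found = fields[2]
    return found


def scrub(mountpoint, mounts_text, run=subprocess.call):
    """Scrub one filesystem; return 0 for OK, 1 for "check the logs".

    The run parameter exists so the command runner can be replaced
    in tests; by default the real command is executed.
    """
    fstype = fstype_of(mountpoint, mounts_text)
    prefix = SCRUBBERS.get(fstype)
    if prefix is None:
        # Assumption: a filesystem we cannot scrub is reported as a problem.
        print(f"no scrubber known for {mountpoint} ({fstype})", file=sys.stderr)
        return 1
    # Collapse whatever the tool returns into the two agreed-upon values.
    return 0 if run(prefix + [mountpoint]) == 0 else 1
```

A caller would pass the contents of /proc/self/mounts as mounts_text and use the return value as the process exit status. Anything richer, such as separate results for metadata and data, would need the error-code standardization that Chinner and Ts'o mentioned.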

XFS online filesystem scrubbing and repair

Posted May 16, 2018 17:11 UTC (Wed) by pj (subscriber, #4506)

Shouldn't this functionality be rolled into fsck, and not another wrapper program? What am I missing?

Posted May 16, 2018 19:28 UTC (Wed) by rahulsundaram (subscriber, #21946)

fsck is just a frontend. Each filesystem has its own fsck utility.

Examples:
fsck.cramfs fsck.ext3 fsck.fat fsck.msdos fsck.xfs
fsck.btrfs fsck.ext2 fsck.ext4 fsck.minix fsck.vfat

Posted May 16, 2018 21:58 UTC (Wed) by nix (subscriber, #2304)

Also, of course, scrubbing is *online*, unlike fsck. It's not really quite the same thing. (But dispatching from a fscrub utility to individual fscrub.* tools seems like a good idea to me too.)

Posted May 17, 2018 4:20 UTC (Thu) by unixbhaskar (guest, #44758)

Yup. fscrub.* would be a good piece to have. :)

Posted May 17, 2018 13:32 UTC (Thu) by ehiggs (subscriber, #90713)

fsscrub, surely.

Posted May 17, 2018 15:15 UTC (Thu) by ju3Ceemi (subscriber, #102464)

A command which matches the "discard" feature (via fstrim) would be nice

Shades of mdraid ...

Posted May 24, 2018 17:13 UTC (Thu) by Wol (subscriber, #4433)

It seems most systems have difficulty passing an error back up the chain when there's a problem short of an out-and-out failure.

Raid, for example, could check majority voting if there's a mirror, or parity for raids 5 and 6. But at present, on a read, it just assumes everything is hunky-dory if the data blocks read okay. There is no mechanism for checking and indicating a problem. Okay, with a two-disk mirror or raid 5, there's nothing that can be done to fix an integrity error, but it would be nice to know it's happened. And if you know a disk is flaky it would be nice to be able to tell any fix utility to assume that's the problem. At present, the fix utilities assume by default it's the parity that's dud, so even with raid-6 an ill-informed "fix" will simply trash your data if that assumption isn't true.

Cheers,
Wol

maybe just maybe look at other systems ?

Posted May 25, 2018 12:02 UTC (Fri) by johnjones (guest, #5462)

seems strange that this is much of a debate

how does AIX/OS2 with JFS2 online scrub do it ?
how does Solaris/FreeBSD with ZFS do it ?
etc