Barriers and journaling filesystems

Posted May 23, 2008 16:55 UTC (Fri) by jlokier (guest, #52227)
Parent article: Barriers and journaling filesystems

A couple of clarifications.

> ...contiguous space is easy to come by. Keeping the journal together will be good for performance, but it also helps to prevent reordering. In normal usage, the commit record will land on the block just after the rest of the journal data, so there is no reason for the drive to reorder things. The commit record will naturally be written just after all of the other journal log data has made it to the media.

That helps only with the first barrier, the one before the commit block. There's a better way to eliminate that barrier: put a checksum in the commit block, which ext4 does. You still need the second barrier, somewhere after the commit block, because it orders the journal write against writes elsewhere on the disk, and those are never contiguous with the journal.
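
To illustrate why a checksum can stand in for the first barrier, here is a minimal sketch in Python. It is not ext4's actual on-disk journal format; the record layout, the 16-byte block size, and the names `make_commit` and `transaction_is_valid` are all assumptions made up for this example. The idea is only that, at replay time, a commit record whose checksum does not match the journal blocks on disk is treated as if it had never been written.

```python
import zlib

BLOCK_SIZE = 16  # toy block size, chosen only to keep the example short


def checksum(blocks):
    """CRC32 over all journal blocks of one transaction, in order."""
    crc = 0
    for block in blocks:
        crc = zlib.crc32(block, crc)
    return crc


def make_commit(seq, blocks):
    """Build a (hypothetical) commit record covering the given blocks."""
    return {"seq": seq, "nblocks": len(blocks), "crc": checksum(blocks)}


def transaction_is_valid(commit, blocks_on_disk):
    """Replay-time check: trust the commit record only if the journal
    blocks it covers really reached the media."""
    if len(blocks_on_disk) != commit["nblocks"]:
        return False
    return checksum(blocks_on_disk) == commit["crc"]


blocks = [b"ino7".ljust(BLOCK_SIZE, b"\0"), b"bmp3".ljust(BLOCK_SIZE, b"\0")]
commit = make_commit(7, blocks)

# Case 1: journal blocks and commit record all reached the disk.
complete = transaction_is_valid(commit, blocks)

# Case 2: the drive wrote the commit record early and power failed before
# the second journal block was written, so that block still holds old data.
torn = transaction_is_valid(commit, [blocks[0], b"\0" * BLOCK_SIZE])

print(complete, torn)
```

With this check in place, it no longer matters whether the commit record lands before the journal data. It does nothing for ordering the journal against the later in-place writes, which is why the second barrier is still needed.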

> Disabling the write cache avoids this whole barrier problem, because writes can't be reordered then

It's not clear that disabling the write cache is enough when ext3 is mounted with barrier=0 (the current default). Disabling the cache stops the disk from reordering writes, but with barrier=0 the kernel elevator can still reorder writes before sending them to the disk. Setting barrier=1 has the dual effect of telling the kernel not to reorder requests around barrier writes and, ideally, passing that constraint on to the disk as well.
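
The elevator point can be shown with a toy model. This is not the kernel's elevator code; `elevator_dispatch` is a hypothetical function that sorts requests by sector, and the sector numbers are invented. The only rule it models is that no request may be moved across a request flagged as a barrier.

```python
def elevator_dispatch(requests):
    """Return the dispatch order for a list of (sector, is_barrier) requests.

    Requests between two barriers are sorted by sector, as a seek-minimizing
    elevator might do. A barrier request stays in place and nothing is
    moved across it.
    """
    order, batch = [], []
    for sector, is_barrier in requests:
        if is_barrier:
            order.extend(sorted(batch))  # flush everything queued before it
            order.append(sector)         # then the barrier request itself
            batch = []
        else:
            batch.append(sector)
    order.extend(sorted(batch))
    return order


# Journal blocks at sectors 1000-1002, commit record at 1003, and an
# in-place metadata write at sector 50 submitted after the commit.
submitted = [1000, 1001, 1002, 1003, 50]

# barrier=0: the commit record is an ordinary write.
no_barrier = elevator_dispatch([(s, False) for s in submitted])

# barrier=1: the commit record (sector 1003) is flagged as a barrier.
with_barrier = elevator_dispatch([(s, s == 1003) for s in submitted])

print(no_barrier)
print(with_barrier)
```

In the first case the in-place write at sector 50 is dispatched ahead of the whole journal transaction, even though the disk itself never reorders anything. In the second case it stays behind the commit record.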

> Disabling barriers on xfs increases performance very very much in some situations, especially when deleting a directory tree with many small files, eg the linux-2.6 tree. With barriers it takes something like 2-3 minutes and without barriers around 20 seconds. (These numbers are from my memory).

That suggests a flaw in the way XFS implements deletions. There is no reason to require so many barriers. The only things that should be able to cause a high rate of barriers are a high rate of fsync() calls (which doesn't happen in this case) or a journal that is too small.


Barriers and journaling filesystems

Posted May 24, 2008 20:31 UTC (Sat) by giraffedata (subscriber, #1954)

> It's not clear that disabling the write cache is enough when ext3 is mounted with barrier=0 (the current default). Disabling the cache stops the disk from reordering writes, but with barrier=0 the kernel elevator can still reorder writes before sending them to the disk. Setting barrier=1 has the dual effect of telling the kernel not to reorder requests around barrier writes and, ideally, passing that constraint on to the disk as well.

But wouldn't it be impossibly naive for ext3 to assume that writes it submits to the block layer reach the disk in the order submitted? I assume the designers weren't that naive and that, when working without barriers, ext3 withholds writes from the block layer until every prerequisite write has completed.

The value of barriers is supposed to be that ext3 doesn't have to let the queue run dry, with the attendant loss of throughput. Ext3 can submit the writes that must precede the commit record, the commit record itself, and the writes that must follow it, all at once, with barriers placed appropriately in the stream, and the block layer will take care of enforcing the required ordering.
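
The difference between the two approaches can be sketched as follows. This is only a model of the submission pattern, not ext3 or block-layer code; the function names are hypothetical, and flagging the commit record itself as the barrier request is an assumption of the sketch.

```python
def submit_without_barriers(journal, commit, in_place):
    """The filesystem enforces ordering itself: each stage is submitted only
    after the previous stage has completed, so the queue drains in between."""
    batches = [list(journal), [commit], list(in_place)]
    waits = len(batches) - 1  # points where the queue runs dry
    return batches, waits


def submit_with_barriers(journal, commit, in_place):
    """Everything is queued at once; the commit record carries the barrier
    flag and the block layer is trusted to keep the three groups in order."""
    stream = [(blk, False) for blk in journal]
    stream.append((commit, True))
    stream += [(blk, False) for blk in in_place]
    return [stream], 0


journal = ["J1", "J2", "J3"]
in_place = ["M1", "M2"]

plain_batches, plain_waits = submit_without_barriers(journal, "COMMIT", in_place)
barrier_batches, barrier_waits = submit_with_barriers(journal, "COMMIT", in_place)

print(plain_waits, barrier_waits)
```

Without barriers the filesystem has to stop and wait twice per transaction. With barriers it hands over one stream and never has to let the queue empty.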

The fact that "write completed" doesn't imply the data will survive a disk drive power failure (and, furthermore, that writes don't become persistent in any particular order) is an orthogonal issue. Which code deals with it depends on whether ext3 uses barriers or not. (And ISTR the block layer doesn't provide any mechanism separate from barriers to deal with it, so if you don't use barrier=1, it doesn't get dealt with.)


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds