An alternative to the application barrier() call

Posted Sep 13, 2009 20:23 UTC (Sun) by dlang (subscriber, #313)
In reply to: An alternative to the application barrier() call by anton
Parent article: POSIX v. reality: A position on O_PONIES

my point is that enforcing a barrier through all these layers can be expensive (on a multi-disk array you would need to make sure that one disk has completed its work before submitting the write to the next disk)

this isn't always needed, so don't try to do it for every write (and I've straced a lot of code that does lots of write() calls)

do it when the programmer says that it's important. 99+% of the time it won't be (either a partially written file is not significantly more usable after a crash than one that is missing entirely, or the code really is performance sensitive enough to take the risk)
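A minimal Python sketch of the "only when the programmer says so" idea. The `write_file` helper and its `durable` flag are hypothetical names made up for illustration; `os.fsync()` is the standard call, and what it actually guarantees depends on the file system and the hardware underneath.

```python
import os


def write_file(path, data, durable=False):
    """Write data to path; pay for fsync() only when the caller asks for it.

    Hypothetical helper: 'durable' is an illustrative flag, not an
    existing API.
    """
    with open(path, "wb") as f:
        f.write(data)
        if durable:
            f.flush()             # move Python's userspace buffer into the kernel
            os.fsync(f.fileno())  # ask the kernel to push the data to stable storage
```

The common case, `write_file(path, data)`, stays cheap; only the callers that care pass `durable=True`.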

you would be amazed at the amount of risk that people are willing to take to get performance. talk to the database gurus at MySQL or postgres about the number of people they see disabling f*sync on production databases in the name of speed.



An alternative to the application barrier() call

Posted Sep 14, 2009 22:16 UTC (Mon) by anton (subscriber, #25547)

Fortunately writes on the file system level can be merged across file system barriers, resulting in few barriers that have to be passed to the block device level. So there is no need to pass a block device barrier down for every file system barrier.

And since it is possible to implement these implicit barriers between each write efficiently (by merging writes), why burden programmers with inserting explicit file system barriers? Look at how long the Linux kernel hackers needed to use block device barriers in the file system code. Do you really expect application developers to do it at all? And if they did, how would they test it? This has the same untestability properties as asking application programmers to use fsync.

Concerning the risk-loving performance freaks, they will use the latest and greatest file system by Ted Ts'o instead of one that offers either implicit or explicit barriers, but of course they will not use fsync() on that file system :-).

BTW, if you also implement block device writes by avoiding overwriting live sectors and by using commit sectors, then you can implement mergeable writes at the block device level, too (e.g., for making them cheaper in an array). However, the file system will not request a block device barrier often, so there is no need to go to such complexity (unless you need it for other purposes, such as when your block device is a flash device).

An alternative to the application barrier() call

Posted Sep 20, 2009 5:22 UTC (Sun) by runekock (subscriber, #50229)

> Fortunately writes on the file system level can be merged across file system barriers, resulting in few barriers that have to be passed to the block device level.

But what about eliminating repeated writes to the same place? Take this contrived example:

repeat 1000 times:
    write first byte of file A
    write first byte of file B

A COW file system may well be able to merge the writes, but it would require a lot of intelligence for it to see that most of the writes could actually be skipped. And a traditional file system would be even worse off.
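For concreteness, the contrived loop above could look like this in Python. This is only a sketch: the function name and the byte values written are assumptions, and `os.pwrite()` is available on Unix-like systems only.

```python
import os


def hammer_first_bytes(path_a, path_b, rounds=1000):
    """Overwrite the first byte of two files, alternating, 'rounds' times.

    Returns the number of write calls issued. The value written
    (i % 256) is an arbitrary choice for illustration.
    """
    fd_a = os.open(path_a, os.O_WRONLY | os.O_CREAT, 0o644)
    fd_b = os.open(path_b, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        for i in range(rounds):
            value = bytes([i % 256])
            os.pwrite(fd_a, value, 0)  # write first byte of file A
            os.pwrite(fd_b, value, 0)  # write first byte of file B
    finally:
        os.close(fd_a)
        os.close(fd_b)
    return 2 * rounds
```

With implicit barriers between every write, the file system has to preserve the order of all 2000 logical writes, even though only the last value of each byte matters once the loop has finished.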

An alternative to the application barrier() call

Posted Sep 20, 2009 18:38 UTC (Sun) by anton (subscriber, #25547)

For a copy-on-write file system that example would be easy: Do all the writes in memory (in proper order), and when the system decides that it's time to commit the stuff to disk, just do a commit of the new logical state to disk (e.g., by writing the first block each of file A and file B and the respective metadata to new locations, and finally a commit sector that makes the new on-disk state visible).
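A toy model of that commit scheme, as a sketch only. `ToyCowFS` and everything in it are invented for illustration; it assumes each file fits in a single block and models the commit sector as one atomic reference swap, with no metadata.

```python
class ToyCowFS:
    """Toy copy-on-write model: writes go to memory, commit() publishes them.

    Assumption for illustration: one block per file, and the 'commit
    sector' is modelled as a single atomic swap of self.disk.
    """

    def __init__(self):
        self.memory = {}         # current logical state: name -> bytearray
        self.disk = {}           # last committed state: name -> bytes
        self.dirty = set()       # files changed since the last commit
        self.blocks_written = 0  # data blocks written to "new locations"
        self.commits = 0         # commit sectors written

    def write(self, name, offset, data):
        buf = self.memory.setdefault(name, bytearray())
        end = offset + len(data)
        if len(buf) < end:
            buf.extend(b"\0" * (end - len(buf)))
        buf[offset:end] = data   # applied in program order, in memory only
        self.dirty.add(name)

    def commit(self):
        new_disk = dict(self.disk)
        for name in self.dirty:
            new_disk[name] = bytes(self.memory[name])  # block goes to a new location
            self.blocks_written += 1
        self.disk = new_disk     # the commit sector makes the new state visible
        self.commits += 1
        self.dirty.clear()


fs = ToyCowFS()
for i in range(1000):
    fs.write("A", 0, bytes([i % 256]))
    fs.write("B", 0, bytes([i % 256]))
fs.commit()
print(fs.blocks_written, fs.commits)
```

In this model a crash before `commit()` leaves the old state on disk and a crash after it leaves the new one, so the disk always holds a state the program actually passed through, at a cost of two data blocks and one commit for the 2000 logical writes.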

An update-in-place file system (without journal) would indeed have to perform all the writes in order to have the on-disk state reflect one of the logical POSIX states at all times (assuming that there are no repeating patterns in the two values that are written; if there are, it is theoretically possible to skip the writes between two equal states).
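The "skip the writes between two equal states" idea can be sketched as follows. `writes_needed` is a hypothetical helper, and the model is deliberately simplified: each write sets one file's first byte, and the on-disk state must equal some logical state after every device write.

```python
def _key(state):
    """Hashable form of a state dict."""
    return tuple(sorted(state.items()))


def writes_needed(initial, writes):
    """Return the subsequence of writes an update-in-place FS must perform.

    initial: dict mapping file name -> byte value
    writes:  list of (file name, byte value) in program order
    Whenever the current on-disk state recurs later in the history,
    everything in between is skipped.
    """
    # states[i] is the logical state before writes[i] is applied.
    states = [dict(initial)]
    for name, value in writes:
        nxt = dict(states[-1])
        nxt[name] = value
        states.append(nxt)

    # Remember the last position at which each state occurs.
    last = {}
    for i, state in enumerate(states):
        last[_key(state)] = i

    performed = []
    i = 0
    while True:
        i = last[_key(states[i])]  # jump to the last occurrence of this state
        if i == len(writes):       # already at the final state
            break
        performed.append(writes[i])
        i += 1
    return performed
```

If the values never repeat, every write is returned and nothing can be skipped, which is the general case described above.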


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds