Goodness gracious, are we fixing the right problem?

Posted Mar 17, 2012 21:32 UTC (Sat) by giraffedata (guest, #1954)
In reply to: Goodness gracious, are we fixing the right problem? by davecb
Parent article: The trouble with stable pages

I think there are any number of other reasons a writer of file pages might want only a consistent set of data to get hardened to disk, so a checksum-computing callback isn't a very general solution.

Apparently, the way it works with stable pages is that something locks the page out of being scheduled for writeback while the page is being updated and having its checksum calculated. So it sounds like a better solution is to have that thing lock the page out not from being scheduled, but from having I/O actually started. The page could move through the I/O queue while it is locked and being updated. If it is still locked (in the middle of an update) when it reaches the head of the queue, the scheduler starts something else instead, and the locked page keeps its position at the head of the queue.

You don't want to waste your time writing out a page that's just going to get dirty again immediately anyway.
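
To make the proposal concrete, here is a minimal sketch in Python. It is a toy model, not kernel code: the Page class, its locked flag, and next_page_to_write() are all hypothetical names invented for illustration. It only shows the queueing rule described above, where a locked page is skipped but keeps its place.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for a page sitting in the I/O queue."""
    name: str
    locked: bool = False  # True while a writer is in the middle of an update


def next_page_to_write(queue):
    """Remove and return the first unlocked page, or None if all are locked.

    Locked pages are skipped, not requeued, so they keep their position.
    """
    for i, page in enumerate(queue):
        if not page.locked:
            del queue[i]  # safe: we return immediately after mutating
            return page
    return None


queue = [Page("A", locked=True), Page("B"), Page("C")]
first = next_page_to_write(queue)    # A is locked, so B is started instead
queue[0].locked = False              # the writer finishes updating A
second = next_page_to_write(queue)   # A never left the head, so it goes next
```

In this model the locked page loses nothing by being skipped: as soon as it is unlocked it is the next thing written.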

Goodness gracious, are we fixing the right problem?

Posted Mar 17, 2012 22:53 UTC (Sat) by davecb (subscriber, #1574)

I quite agree: there should be several other ways to meet our needs than the one we first tried. My father used to say "if you can't think of at least three ways to do something, you're not thinking hard enough". We've suggested two, perhaps others can suggest some more.

A minor niggle about deferring writes of locked pages: you need to delay not just the locked page but also any that depend upon it. When updating files, for example, you need to write the file data, then the inode data, and then the directory (if the file is new). Delaying the file write until after the inode write breaks the critical ordering we depend upon for consistency.
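
As a rough illustration of that constraint, here is a toy Python sketch. The depends_on map and the startable() helper are hypothetical, made up for this example, and real filesystems track ordering quite differently. The point is only that a write becomes eligible when it is unlocked and everything it depends on has already been written.

```python
def startable(queue, depends_on, locked, written):
    """Return the queued writes that may be started right now.

    A write is eligible only if it is not locked, has not been written yet,
    and every write it depends on is already on disk.
    """
    return [
        req for req in queue
        if req not in locked
        and req not in written
        and all(dep in written for dep in depends_on.get(req, ()))
    ]


queue = ["file data", "inode", "directory", "unrelated page"]
# Assumed ordering for a new file: data first, then inode, then directory.
depends_on = {"inode": ["file data"], "directory": ["inode"]}

# While the file data page is locked, the inode and directory wait with it.
while_locked = startable(queue, depends_on, locked={"file data"}, written=set())

# Once it is unlocked, the data page can go; its dependents still wait.
after_unlock = startable(queue, depends_on, locked=set(), written=set())
```

Here a single locked page holds back the whole chain behind it, while unrelated writes still proceed.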

Of course, one might also change the logic to achieve consistency during writes by something other than critical orderings at this low a level: a good commit log of both metadata and data would allow us to enthusiastically reorder writes so much we could start risking starvation (;-))

--dave

Goodness gracious, are we fixing the right problem?

Posted Mar 17, 2012 23:49 UTC (Sat) by giraffedata (guest, #1954)

"you need to delay not just the locked page but also any that depend upon it."

Where such ordering is required, it must be implemented today with write barriers, because otherwise the device driver, not to mention the device, is free to do I/Os from the queue in any order it pleases. But I don't think anyone would be updating a page that is scheduled for I/O and is in front of a write barrier - it would defeat the purpose.
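
For what it's worth, the barrier rule can be modelled in a few lines of Python. This is a sketch under simplifying assumptions: the BARRIER marker and both helper functions are hypothetical, and a real driver does not represent its queue this way. It checks whether an issue order only reorders requests between barriers and never across one.

```python
BARRIER = object()  # hypothetical marker standing in for a write barrier


def segments(queue):
    """Split the queue into the runs of requests between barriers."""
    result, current = [], []
    for req in queue:
        if req is BARRIER:
            result.append(current)
            current = []
        else:
            current.append(req)
    result.append(current)
    return result


def respects_barriers(queue, issued):
    """True if `issued` reorders requests only within barrier-delimited runs."""
    pos = 0
    for seg in segments(queue):
        chunk = issued[pos:pos + len(seg)]
        if sorted(chunk) != sorted(seg):
            return False
        pos += len(seg)
    return pos == len(issued)


queue = ["data 1", "data 2", BARRIER, "inode", BARRIER, "directory"]
ok = respects_barriers(queue, ["data 2", "data 1", "inode", "directory"])
bad = respects_barriers(queue, ["inode", "data 1", "data 2", "directory"])
```

In this model a locked page deferred past a barrier would show up as an order like the second one, which is why updating a page in front of a barrier defeats the purpose.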

Goodness gracious, are we fixing the right problem?

Posted Mar 18, 2012 0:03 UTC (Sun) by davecb (subscriber, #1574)

I fear one might do so unintentionally (;-))

--dave (exceedingly fallible) c-b