Multiple drives

Posted Jun 3, 2017 9:53 UTC (Sat) by jlayton (subscriber, #31672)
In reply to: Multiple drives by pbonzini
Parent article: Improved block-layer error handling

That's really not related to the changes we're making here, but it is possible to do so.

Ultimately, an fsync syscall returns whatever the filesystem's fsync operation returns, so if the filesystem wants to check for O_DIRECT and always return 0 without flushing, then it can do so today.

Now, that said...one wonders why an application would call fsync on an O_DIRECT fd?



Multiple drives

Posted Jun 4, 2017 4:02 UTC (Sun) by neilbrown (subscriber, #359) [Link] (4 responses)

> Now, that said...one wonders why an application would call fsync on an O_DIRECT fd?

To ensure that the metadata is safe? I think you need O_SYNC|O_DIRECT if you want to not use fsync at all.
See "man 2 open"
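As a rough illustration of the O_SYNC|O_DIRECT combination mentioned above, here is a minimal Python sketch. The helper name `write_block`, the 4096-byte block size, and the `direct` switch are assumptions made for this example; the real alignment requirement depends on the device and filesystem, and some filesystems reject O_DIRECT with EINVAL.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the real requirement depends on device and filesystem


def write_block(path, payload, direct=True):
    """Write one zero-padded block with O_SYNC, optionally adding O_DIRECT."""
    if len(payload) > BLOCK:
        raise ValueError("payload must fit in one block")
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    if direct:
        flags |= os.O_DIRECT  # Linux-only flag; may fail with EINVAL on some filesystems
    # An anonymous mmap is page-aligned, which O_DIRECT transfers generally need.
    buf = mmap.mmap(-1, BLOCK)
    buf.write(payload.ljust(BLOCK, b"\0"))
    fd = os.open(path, flags, 0o600)
    try:
        # With O_SYNC, write() returns only once the data and the metadata
        # needed to read it back have been reported stable.
        return os.write(fd, buf)
    finally:
        os.close(fd)
        buf.close()
```

Calling `write_block("/some/file", b"hello")` on a filesystem that supports O_DIRECT needs no separate fsync for that write; passing `direct=False` keeps the O_SYNC behaviour but goes through the page cache.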

Multiple drives

Posted Jun 4, 2017 19:32 UTC (Sun) by pbonzini (subscriber, #60935) [Link] (3 responses)

Also to ensure that the data is safe, because writes can stop at the disk cache and an fsync is needed to ensure they reach the platters or the flash. This is represented as a REQ_FLUSH request (while metadata writes are often REQ_FUA, i.e. force unit access). REQ_FLUSH applies to all writes completed *before* the flush, while FUA applies only to the write that carries the flag.
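The difference between the two request types can be sketched with a toy model. This is not kernel code: the `ToyDisk` class and its method names are invented for illustration, and a real drive cache is far more involved.

```python
class ToyDisk:
    """Toy model of a drive with a volatile write cache (illustration only)."""

    def __init__(self):
        self.cache = {}   # completed writes that are still volatile
        self.stable = {}  # data that would survive a power loss

    def write(self, sector, data, fua=False):
        if fua:
            # FUA: only this one write is forced to stable media.
            self.stable[sector] = data
            self.cache.pop(sector, None)
        else:
            # Ordinary write: completes once it reaches the volatile cache.
            self.cache[sector] = data

    def flush(self):
        # FLUSH: every write completed before this point becomes stable.
        self.stable.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        # Whatever was only in the cache is gone.
        self.cache.clear()
        return dict(self.stable)
```

In this model, a plain write to sector 0 followed by a FUA write to sector 1 and then `power_loss()` leaves only sector 1; calling `flush()` before the power loss preserves both.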

Multiple drives

Posted Jun 5, 2017 11:44 UTC (Mon) by jlayton (subscriber, #31672) [Link] (2 responses)

Thanks, that makes sense.

I don't quite see why you'd want to avoid reporting errors on an O_DIRECT fd in either case though. In both cases, it's possible that data previously written via that O_DIRECT file descriptor didn't make it to disk, so wouldn't you want to inform the application?

The big change here is that reporting those errors on the O_DIRECT fd won't prevent someone else from seeing those errors via another fd. So, I don't quite see why it'd be desirable to avoid reporting it on the O_DIRECT one.

Multiple drives

Posted Jun 5, 2017 11:55 UTC (Mon) by pbonzini (subscriber, #60935) [Link] (1 responses)

> I don't quite see why you'd want to avoid reporting errors on an O_DIRECT fd in either case though. In both cases, it's possible that data previously written via that O_DIRECT file descriptor didn't make it to disk, so wouldn't you want to inform the application?

I certainly would. :) However, I'm worried about the application using O_DIRECT seeing errors that happened while accessing the file via another fd.

In fact, if I understand correctly, those errors could even have happened before the O_DIRECT file descriptor was opened, if they have never been reported to userspace.

Multiple drives

Posted Jun 5, 2017 16:15 UTC (Mon) by jlayton (subscriber, #31672) [Link]

The patchset actually initializes the errseq_t in struct file to the value of the mapping's errseq_t at open time. So, in principle you shouldn't see errors that occurred prior to your open.
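The open-time sampling described here can be modelled in a few lines. This is a simplified sketch, not the kernel's errseq_t: the `Mapping` and `OpenFile` classes are invented for illustration, and the model keeps the counter and the error code as separate fields rather than packing them into one value.

```python
import errno


class Mapping:
    """Stands in for the per-inode mapping: one error sequence shared by all fds."""

    def __init__(self):
        self.seq = 0   # bumped every time a writeback error is recorded
        self.err = 0   # most recent error code

    def set_error(self, err):
        self.seq += 1
        self.err = err


class OpenFile:
    """Stands in for struct file: remembers the sequence value it last saw."""

    def __init__(self, mapping):
        self.mapping = mapping
        self.seen = mapping.seq  # sampled at open time

    def fsync(self):
        if self.seen == self.mapping.seq:
            return 0  # nothing new since open or since the last report
        self.seen = self.mapping.seq  # advance only this fd's cursor
        return self.mapping.err
```

In this model, a file opened after `set_error(errno.EIO)` gets 0 from `fsync()`, while a file that was already open gets EIO once and 0 on the next call.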

How mixed buffered and direct I/O are handled is not really addressed (or changed for that matter) in this set. Yes, you will quite likely see an error on an O_DIRECT fsync, but it's quite likely that you'll see that today anyway. Most filesystems make no distinction about whether you opened the fd with O_DIRECT or not. They flush the pagecache and inode anyway just like they would with a buffered fd.

The flip side of this (and the scarier problem) is that with the current code, it's likely that that fsync on the O_DIRECT fd would end up clearing the error such that a later fsync on the buffered fd wouldn't ever see it. That problem at least should be addressed with these changes.
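The problem with the current code can be shown with a toy model of a single per-mapping error flag that is cleared by whichever descriptor checks it first. The class and function names below are invented for illustration and greatly simplify what filesystems actually do.

```python
import errno


class FlagMapping:
    """Toy model: one error flag per mapping, shared by every open fd."""

    def __init__(self):
        self.error = 0

    def record_error(self, err):
        self.error = err


def fsync_test_and_clear(mapping):
    """Report the pending error and clear it, whoever the caller is."""
    err, mapping.error = mapping.error, 0
    return err


mapping = FlagMapping()
mapping.record_error(errno.EIO)                 # buffered writeback fails

direct_result = fsync_test_and_clear(mapping)   # fsync on the O_DIRECT fd
buffered_result = fsync_test_and_clear(mapping) # later fsync on the buffered fd
```

Here `direct_result` is EIO and `buffered_result` is 0: the descriptor that actually issued the buffered writes never learns that they failed.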


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds