LWN: Comments on "Fixing error reporting—again" https://lwn.net/Articles/752613/ This is a special feed containing comments posted to the individual LWN article titled "Fixing error reporting—again". en-us Wed, 12 Nov 2025 07:20:53 +0000 Wed, 12 Nov 2025 07:20:53 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Fixing error reporting—again https://lwn.net/Articles/791216/ https://lwn.net/Articles/791216/ jlayton <div class="FormattedComment"> sync() returns void. syncfs() returns an int, and so could (in principle) return an error if there is a problem with writeback. syncfs() is not defined by POSIX, so it's not "broken" per se, but I think it'd probably be more helpful to have it return an error if there was an issue with writeback.<br> </div> Sun, 16 Jun 2019 01:56:11 +0000 Fixing error reporting—again https://lwn.net/Articles/785579/ https://lwn.net/Articles/785579/ quocanh1897 <div class="FormattedComment"> <font class="QuotedText">&gt; syncfs() is "really broken" in its error reporting. He plans to fix that, probably by using another errseq_t in the superblock, since reporting from syncfs() requires a separate cursor on the error state.</font><br> I thought sync() always returns success, so how is it "really broken"? And what is a "separate cursor on the error state"?<br> Thanks.<br> </div> Thu, 11 Apr 2019 11:39:29 +0000 Fixing error reporting—again https://lwn.net/Articles/764074/ https://lwn.net/Articles/764074/ Trol1024 <div class="FormattedComment"> The crucial thing may be that a read() after a successful open()-write()-close() may return old data. <br> <p> That may happen when an async writeback error occurs after close() and the inode/mapping gets evicted before read().<br> <p> That violates POSIX, as POSIX requires that a read() that can be proved to occur after a write() has returned will return the new data. 
<br> </div> Tue, 04 Sep 2018 05:15:58 +0000 Fixing error reporting—again https://lwn.net/Articles/752992/ https://lwn.net/Articles/752992/ bfields <div class="FormattedComment"> "I wonder how many people don't fsync on NFS because they know close() is enough and are about to find out that it isn't."<br> <p> Do you think that's really likely?<br> <p> Linux knfsd doesn't support write delegations, but I believe that both the client and some popular servers have supported them for a while, and I don't recall seeing such a bug report.<br> <p> So, I'm optimistic, but I suppose it's something to keep an eye on. (Possibly also worth checking that the man pages don't provide any false guarantees here.)<br> </div> Fri, 27 Apr 2018 21:29:49 +0000 Fixing error reporting—again https://lwn.net/Articles/752917/ https://lwn.net/Articles/752917/ mjg59 <div class="FormattedComment"> <font class="QuotedText">&gt; Remember when ext3 had that wonderful "rename causes fsync" semantic, so nobody bothered to fsync</font><br> <p> No? The behaviour people were expecting was that doing a write and then a rename would result in those operations happening in order and that you'd either end up with the old file or the new file. 
People weren't fsyncing because they didn't care *which* file ended up on disk, not because they were expecting rename to cause an implicit fsync.<br> </div> Fri, 27 Apr 2018 06:24:42 +0000 Fixing error reporting—again https://lwn.net/Articles/752916/ https://lwn.net/Articles/752916/ donald.buczek <div class="FormattedComment"> There is a difference: The Caught Fire error class would be reported by a read error.<br> </div> Fri, 27 Apr 2018 05:34:26 +0000 Fixing error reporting—again https://lwn.net/Articles/752908/ https://lwn.net/Articles/752908/ neilbrown <div class="FormattedComment"> <font class="QuotedText">&gt; Unless you have a write delegation, I believe....</font><br> <p> uh-oh.<br> Remember when ext3 had that wonderful "rename causes fsync" semantic, so nobody bothered to fsync, and when ext4 had more sane semantics people complained?<br> I wonder how many people don't fsync on NFS because they know close() is enough and are about to find out that it isn't.<br> <p> </div> Fri, 27 Apr 2018 02:37:42 +0000 Fixing error reporting—again https://lwn.net/Articles/752906/ https://lwn.net/Articles/752906/ bfields <div class="FormattedComment"> "NFS (and possibly other similar filesystems) is a bit different as close() always does an internal fsync() first - so a lack of an error there means that all the data is safe."<br> <p> Unless you have a write delegation, I believe....<br> </div> Fri, 27 Apr 2018 00:56:26 +0000 Fixing error reporting—again https://lwn.net/Articles/752837/ https://lwn.net/Articles/752837/ MarcB <div class="FormattedComment"> <font class="QuotedText">&gt; The thing about close() is that a lack of an error doesn't tell you anything about the data. It just tells you that writeback hasn't hit an error *yet*. 
I don't see how you can depend on something that is already unreliable.</font><br> <p> That is exactly the question I ask as an application developer: What does an error on close() mean, and why should I check it?<br> As I see it, *not* doing fsync(), but then checking close(), only catches errors in an unreliable way.<br> <p> As an example:<br> <p> Let's assume I do a doomed write() that will hit a bad block, followed directly by a close().<br> <p> Now, after the write(), I get preempted for some time, and when my process runs again and can submit the close(), it will get the error that occurred while other processes were running. Fine.<br> <p> But now, I am on an idle system and will be scheduled immediately once my write() returns. The error has not occurred yet, and I won't see it. Not so fine.<br> <p> (Alternatively: The first write() happens shortly before the automatic ext4 filesystem sync, the second shortly after).<br> <p> So, if I get an error from close(), something is wrong. But if I don't get the error, exactly the same thing might be wrong; it's just that no one has noticed yet.<br> <p> I find it hard to come up with a scenario where that would be truly useful, but perhaps I am missing something. (Quotas and NFS are obvious candidates; they might add failure classes that close() catches reliably).<br> <p> </div> Thu, 26 Apr 2018 13:54:13 +0000 Fixing error reporting—again https://lwn.net/Articles/752822/ https://lwn.net/Articles/752822/ epa <div class="FormattedComment"> Well yes, and even if writeback has succeeded that doesn't promise you that the hard disk won't spontaneously catch fire and destroy your data tomorrow. 
The idea is to report, reliably, any errors that have occurred so far.<br> </div> Thu, 26 Apr 2018 08:38:26 +0000 Fixing error reporting—again https://lwn.net/Articles/752796/ https://lwn.net/Articles/752796/ neilbrown <div class="FormattedComment"> <font class="QuotedText">&gt; But it is documented that the close() call can return errors, so some users will be dependent on that behavior, Chinner said. </font><br> <p> The thing about close() is that a lack of an error doesn't tell you anything about the data. It just tells you that writeback hasn't hit an error *yet*. I don't see how you can depend on something that is already unreliable.<br> <p> NFS (and possibly other similar filesystems) is a bit different as close() always does an internal fsync() first - so a lack of an error there means that all the data is safe. For other filesystems, we don't need to go out of our way to report an error that cannot be relied upon anyway.<br> <p> </div> Wed, 25 Apr 2018 20:51:16 +0000
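
For application developers reading this thread, the point made by neilbrown and MarcB (a clean close() says nothing reliable about the data, while fsync() is the call that waits for writeback) can be sketched roughly as below. This is only an illustration in Python, not code from the article or the kernel; `durable_write` is a hypothetical helper name, and the sketch assumes an ordinary local file where an `OSError` from `os.fsync()` is the signal the caller cares about.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data to path, raising OSError if any step reports a failure.

    Hypothetical helper for illustration only. The error the caller should
    trust comes from os.fsync(); os.close() is still checked, but its
    success alone is not treated as proof that the data reached storage.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        # os.write() may write fewer bytes than requested, so loop.
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # Wait for writeback and surface any error it hit.
        os.fsync(fd)
    finally:
        # Any OSError raised by close() also propagates to the caller.
        os.close(fd)


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "example.dat")
    durable_write(target, b"hello")
    with open(target, "rb") as f:
        print(f.read())
```

Whether close() on NFS is sufficient by itself, as debated above around write delegations, is not something this sketch tries to settle; calling fsync() explicitly sidesteps the question.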