Errors on close

Posted Feb 13, 2025 17:54 UTC (Thu) by NYKevin (subscriber, #129325)
In reply to: Errors on close by farnz
Parent article: Maintainer opinions on Rust-for-Linux

> but it still closes the FD, so there is no way to recover.

Of course error-on-close is recoverable, you just delete the file and start over. Or if that doesn't work, report the error to the user so that they know their data has not been saved (and can take whatever action they deem appropriate, such as saving the data to a different filesystem, copying the data into the system clipboard and pasting it somewhere to be preserved by other means, etc.).


Errors on close

Posted Feb 13, 2025 18:12 UTC (Thu) by Wol (subscriber, #4433)

> Of course error-on-close is recoverable, you just delete the file and start over.

Until you can't start over ... which is probably par for the course in most data entry applications ...

Cheers,
Wol

Errors on close

Posted Feb 14, 2025 10:49 UTC (Fri) by farnz (subscriber, #17727)

But, by the nature of Linux's close syscall, error on close means one of three things:
  1. You supplied a bad file descriptor to close. No data loss, nothing to do.
  2. A signal came in mid-close. No data loss, nothing to do.
  3. You're on NFS, and a previous operation failed - but you don't know which one, or whether the data is safe.
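The three cases above can be told apart by errno. A minimal sketch in C, assuming Linux semantics (the descriptor is released even when close fails, so close is never retried); the enum and both function names are hypothetical, made up for illustration:

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical labels for the three cases in the list above. */
enum close_outcome {
    CLOSE_OK,
    CLOSE_BAD_FD,         /* EBADF: caller bug, no data involved */
    CLOSE_INTERRUPTED,    /* EINTR: signal mid-close; on Linux the fd is gone anyway */
    CLOSE_DEFERRED_ERROR  /* anything else (e.g. EIO): an earlier operation may have failed */
};

/* Map an errno value from a failed close() onto one of the cases. */
static enum close_outcome classify_close_errno(int err)
{
    switch (err) {
    case 0:     return CLOSE_OK;
    case EBADF: return CLOSE_BAD_FD;
    case EINTR: return CLOSE_INTERRUPTED;
    default:    return CLOSE_DEFERRED_ERROR;
    }
}

/* Close fd exactly once and report which case applied.
 * Never call close() again on the same fd after a failure: on Linux the
 * descriptor has already been released and the number may have been reused. */
static enum close_outcome close_and_classify(int fd)
{
    if (close(fd) == 0)
        return CLOSE_OK;
    return classify_close_errno(errno);
}
```

Only CLOSE_DEFERRED_ERROR says anything about the data, and even then it does not say which earlier operation failed.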

Deleting the file is the worst possible thing to do with an error on close - two of the three are cases where the data has been saved, and it's only an oddity of your code that resulted in an error being reported on close. The third is the case where the file is on an NFS mount, the NFS server is set up to write data immediately upon receiving a command (rather than on fsync, since you won't get a delayed error for a write if the NFS server is itself caching), and you didn't fsync before close (which is required on NFS to guarantee that you get errors).

But even in that third case, close is not enough to guarantee that you get a meaningful error telling you that the data has not been saved - you need fsync, since the NFS server is permitted to return success to all writes and closes, and only report an error on fsync.

And just to be completely clear, I think this makes error on close useless, because all it means in most cases is either "your program has a bug" or "a signal happened at a funny moment". There's a rare edge case if you have a weird NFS setup where an error on close can mean "data lost", but if you're not in that edge case (which cannot be detected programmatically, since it depends on the NFS server's configuration), the two worst possible things you can do if there's an error on close are "delete the file (containing safe data) and start over" and "report to the user that you've saved their data, probably, so that they can take action just in case this is an edge case system".

On the other hand, fsync deterministically tells you either that the data is as safe as can reasonably be promised, or that it's lost, and you should take action.
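As a rough sketch of that pattern in C: write everything, take the verdict from fsync, and treat close as pure cleanup. The helper name save_durably and its return convention are assumptions for illustration, not anything from an existing library:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes from buf to path, then fsync.
 * Returns 0 if fsync reported success, or -1 with errno set to the
 * first failure. */
static int save_durably(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted before writing anything: retry */
            int saved = errno;
            close(fd);              /* cleanup only; keep the write error */
            errno = saved;
            return -1;
        }
        p += n;                     /* short write: carry on with the remainder */
        len -= (size_t)n;
    }

    if (fsync(fd) != 0) {
        int saved = errno;          /* this is the error worth reporting to the user */
        close(fd);
        errno = saved;
        return -1;
    }

    close(fd);                      /* result ignored on purpose: fsync already answered */
    return 0;
}
```

A caller would report failure to the user only when save_durably returns -1, which is the point at which "your data has not been saved" is actually known.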

