How to cope with hardware-poisoned page-cache pages

Posted May 5, 2022 17:55 UTC (Thu) by jlayton (subscriber, #31672)
In reply to: How to cope with hardware-poisoned page-cache pages by willy
Parent article: How to cope with hardware-poisoned page-cache pages

Thanks. It looks like most of the callers that end up in that function do call mapping_set_error() first if the page was still dirty. I'm not sure of the exact scenario that would lead to silent data corruption, so I'd be interested to understand how that can occur.

ISTM that while we would lose the data on the page in these situations, it wouldn't be silent. You'd get an error back on the next fsync/msync. If there are any gaps in that coverage though, we should fix them.

Posted May 5, 2022 19:37 UTC (Thu) by yang.shi (subscriber, #133088)

It does set AS_EIO, and the first fsync does return an error, but a subsequent read will return the old data from disk because the page has been truncated. No error is returned on the read path, and the write syscall also succeeds.

For example, here is a simple test: create a file and write to it, inject a memory error into one of the dirty pages, then reread the range. All of the written data is lost, and you get the old data back (zeros in this simple test).
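
A minimal Python sketch of that test, under some loudly stated assumptions: the error is injected with madvise(MADV_HWPOISON), which needs CAP_SYS_ADMIN and a kernel built with CONFIG_MEMORY_FAILURE, and the function name and the `inject` switch are made up for illustration. With `inject=False` it only does the write and the reread. What comes back after a real injection depends on the kernel version, so no expected output is given.

```python
import mmap
import os

PAGE = mmap.PAGESIZE


def dirty_poison_reread(path, inject=False):
    """Dirty one page-cache page of a file, optionally poison it, then reread it.

    Hypothetical helper for illustration only. inject=True really poisons a
    page of memory, so try it only on a disposable test machine.
    """
    payload = b"A" * PAGE
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # Extend the file to one page; until written, it reads as zeros.
        os.ftruncate(fd, PAGE)
        with mmap.mmap(fd, PAGE, mmap.MAP_SHARED,
                       mmap.PROT_READ | mmap.PROT_WRITE) as m:
            m[:] = payload  # the page is now dirty in the page cache
            if inject:
                # Assumption: this is how the memory error gets injected.
                m.madvise(mmap.MADV_HWPOISON, 0, PAGE)
        result = {"written": payload, "reread": None,
                  "read_errno": None, "fsync_errno": None}
        try:
            # Reread through the file descriptor, not the (now gone) mapping.
            result["reread"] = os.pread(fd, PAGE, 0)
        except OSError as exc:
            result["read_errno"] = exc.errno
        try:
            os.fsync(fd)
        except OSError as exc:
            result["fsync_errno"] = exc.errno
        return result
    finally:
        os.close(fd)
```

Comparing `result["reread"]` with `result["written"]` after an injection shows whether the reader silently got stale data, and `read_errno` and `fsync_errno` show which of the two paths, if either, reported the problem.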

Posted May 5, 2022 19:54 UTC (Thu) by jlayton (subscriber, #31672)

> It does set AS_EIO, and the first fsync does return an error, but a subsequent read will return the old data from disk because the page has been truncated. No error is returned on the read path, and the write syscall also succeeds.

Which is expected behavior. Once you call fsync and get back an error, any data written since the last fsync is now suspect -- some writes may have succeeded and some may not.

It's up to the application to make sense of the state (unfortunately). That's not a trivial task, but that's really all we can guarantee in the face of this sort of problem.
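
One way an application can act on that rule, sketched in Python. This is a swapped-in, common pattern (write to a temporary file, fsync, then rename), not something prescribed in this thread; the function name and the ".tmp" suffix are hypothetical. The idea is that nothing is treated as committed until fsync has succeeded, and a failed fsync discards the whole batch.

```python
import os


def replace_file_durably(path, data):
    """Replace `path` with `data`; return True only if fsync reported success.

    Hypothetical helper for illustration. On any write or fsync error, the
    data written since the last successful fsync is treated as suspect and
    thrown away; the caller still holds `data` and can retry or give up.
    """
    tmp = path + ".tmp"  # assumed naming scheme, not a convention from the thread
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # write() may be partial
            view = view[written:]
        os.fsync(fd)
    except OSError:
        # We cannot tell which writes reached storage, so keep none of them.
        os.close(fd)
        os.unlink(tmp)
        return False
    os.close(fd)
    os.replace(tmp, path)  # atomically switch to the new contents
    # Sync the containing directory so the rename itself is persistent.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return True
```

This only covers what fsync can report. It does nothing for a reader that consumes the page-cache contents before fsync is called.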

Posted May 6, 2022 23:19 UTC (Fri) by yang.shi (subscriber, #133088)

> Which is expected behavior. Once you call fsync and get back an error, any data written since the last fsync is now suspect -- some writes may have succeeded and some may not.

IIUC, that means the data on disk is suspect rather than the data in the page cache, right? That doesn't bother readers, who still consume consistent data from the page cache. A memory error is different: it means that even the data in the page cache is suspect, so waiting for fsync() may already be too late.

> It's up to the application to make sense of the state (unfortunately). That's not a trivial task, but that's really all we can guarantee in the face of this sort of problem.

I agree it is not a trivial task, particularly if we want to handle this at page granularity.