From: "Darrick J. Wong" <email@example.com>
To: "Theodore Ts'o" <firstname.lastname@example.org>
Subject: [PATCH v3 0/3] data integrity: Stabilize pages during writeback for ext4
Date: Wed, 4 May 2011 10:37:04 -0700
Cc: Christoph Hellwig <email@example.com>,
    Chris Mason <firstname.lastname@example.org>, Jeff Layton <email@example.com>,
    Jan Kara <firstname.lastname@example.org>, Dave Chinner <email@example.com>,
    Joel Becker <firstname.lastname@example.org>,
    "Martin K. Petersen" <email@example.com>,
    Jens Axboe <firstname.lastname@example.org>,
    Mingming Cao <email@example.com>,
    Dave Hansen <firstname.lastname@example.org>, email@example.com

This is v3 of the stable-page-writes patchset for ext4. A lot of code has been
cut out since v2. The large hairy function that walked the page tables of every
process is gone, since Chris Mason pointed out that page_mkclean does what I
need. The set_memory_* hack is also gone, since (I think) the only time the
kernel maps a file's data blocks for writing is in the buffered IO case. That
leaves us with some surgery to ext4_page_mkwrite to return locked pages and to
be careful about re-checking the writeback status after dropping and
re-grabbing the page lock, plus a slight modification to the mm code to wait
for page writeback when grabbing pages for buffered writes. There are also some
cleanups for wait_on_page_writeback use in ext4.

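The wait-then-modify rule described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: it is not kernel code and not taken from the patches. The names `model_page`, `model_write`, `model_writeback` and `model_run` are hypothetical, a mutex stands in for the page lock, a boolean stands in for the writeback flag, and a toy checksum stands in for the DIF guard tag.

```c
/* Userspace model only; every name here is illustrative, not a kernel API. */
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096
#define MODEL_MAX_WRITERS 64

struct model_page {
    pthread_mutex_t lock;      /* stands in for the page lock */
    pthread_cond_t  wb_done;   /* signalled when writeback finishes */
    bool            writeback; /* stands in for the writeback flag */
    uint8_t         data[MODEL_PAGE_SIZE];
};

/* Toy checksum standing in for the DIF guard tag. */
static uint32_t model_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = sum * 31u + buf[i];
    return sum;
}

/* Writer: take the lock, then wait until no writeback is in flight. */
static void model_write(struct model_page *p, size_t off, uint8_t byte)
{
    pthread_mutex_lock(&p->lock);
    while (p->writeback)            /* re-check after every wakeup */
        pthread_cond_wait(&p->wb_done, &p->lock);
    p->data[off % MODEL_PAGE_SIZE] = byte;
    pthread_mutex_unlock(&p->lock);
}

/* Writeback: checksum the page, "do the I/O", then checksum it again.
 * Returns true if the page stayed stable for the whole writeback. */
static bool model_writeback(struct model_page *p)
{
    uint32_t before, after;

    pthread_mutex_lock(&p->lock);
    p->writeback = true;
    before = model_checksum(p->data, sizeof(p->data));
    pthread_mutex_unlock(&p->lock);

    /* The device would read the page here. Reading without the lock is
     * safe only because writers wait while the flag is set. */
    after = model_checksum(p->data, sizeof(p->data));

    pthread_mutex_lock(&p->lock);
    p->writeback = false;
    pthread_cond_broadcast(&p->wb_done);
    pthread_mutex_unlock(&p->lock);
    return before == after;
}

struct model_writer_arg { struct model_page *page; int iterations; };

static void *model_writer(void *argp)
{
    struct model_writer_arg *arg = argp;
    for (int i = 0; i < arg->iterations; i++)
        model_write(arg->page, (size_t)i, (uint8_t)i);
    return NULL;
}

/* Runs writers against repeated writebacks; returns the number of
 * writebacks that saw the page change underneath them. */
static int model_run(int nwriters, int rounds)
{
    static struct model_page page;
    struct model_writer_arg arg = { &page, 10000 };
    pthread_t tids[MODEL_MAX_WRITERS];
    int started = 0, unstable = 0;

    if (nwriters > MODEL_MAX_WRITERS)
        nwriters = MODEL_MAX_WRITERS;
    pthread_mutex_init(&page.lock, NULL);
    pthread_cond_init(&page.wb_done, NULL);
    page.writeback = false;
    memset(page.data, 0, sizeof(page.data));

    while (started < nwriters &&
           pthread_create(&tids[started], NULL, model_writer, &arg) == 0)
        started++;
    for (int r = 0; r < rounds; r++)
        if (!model_writeback(&page))
            unstable++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return unstable;
}
```

Calling `model_run(4, 200)` runs four writer threads against 200 modelled writebacks. Because every writer waits on the flag before touching the data, the count of unstable writebacks stays at zero. Removing the `while (p->writeback)` wait from `model_write` is the modelled equivalent of the pre-patch behaviour.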
I ran my write-after-checksum ("wac") reproducer program to try to provoke DIF
checksum errors by madly rewriting the same memory pages. I tried the following
combinations:

a. 64 write() threads + sync_file_range
b. 64 mmap write threads + msync
c. 32 write() threads + sync_file_range + 32 mmap write threads + msync
d. Same as c, but with all threads in directio mode
e. Same as a, but with all threads in directio mode
f. Same as b, but with all threads in directio mode

After some 44 hours of safety testing across 4 machines, I saw zero errors.
Before the patchset, I could run any of a-f for 10 seconds or less and have a
screen full of errors.

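For readers who want a feel for the mmap-plus-msync combination listed above, here is a minimal sketch. It is not the wac program, whose source is not part of this message. The helper name `scribble_mmap_msync`, its parameters, and the choice to give each thread its own slice of the page (so the sketch itself has no C-level data race) are all assumptions made for illustration. The sketch only generates the rewrite-during-writeback load; any checksum errors would be reported by DIF-capable storage, not by this code.

```c
/* Hypothetical sketch: threads rewrite one shared mmap()ed page while the
 * caller forces writeback with msync(). Not the wac reproducer. */
#include <fcntl.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SKETCH_PAGE_LEN 4096
#define SKETCH_MAX_THREADS 64

struct scribbler {
    unsigned char *slice;   /* this thread's part of the shared page */
    size_t len;
    int iterations;
};

static void *scribble(void *argp)
{
    struct scribbler *s = argp;
    for (int i = 0; i < s->iterations; i++)
        memset(s->slice, i & 0xff, s->len);   /* keep dirtying the page */
    return NULL;
}

/* Returns 0 if every thread started and every msync() succeeded, else -1. */
static int scribble_mmap_msync(const char *path, int nthreads,
                               int iterations, int syncs)
{
    pthread_t tids[SKETCH_MAX_THREADS];
    struct scribbler args[SKETCH_MAX_THREADS];
    void *map;
    size_t slice;
    int fd, started = 0, rc = -1;

    if (nthreads < 1 || nthreads > SKETCH_MAX_THREADS)
        return -1;
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, SKETCH_PAGE_LEN) != 0)
        goto out_close;
    map = mmap(NULL, SKETCH_PAGE_LEN, PROT_READ | PROT_WRITE,
               MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out_close;

    slice = SKETCH_PAGE_LEN / (size_t)nthreads;
    for (; started < nthreads; started++) {
        args[started].slice = (unsigned char *)map + slice * (size_t)started;
        args[started].len = slice;
        args[started].iterations = iterations;
        if (pthread_create(&tids[started], NULL, scribble,
                           &args[started]) != 0)
            break;
    }
    rc = (started == nthreads) ? 0 : -1;

    /* Force writeback repeatedly while the threads are still writing. */
    for (int i = 0; i < syncs && rc == 0; i++)
        if (msync(map, SKETCH_PAGE_LEN, MS_SYNC) != 0)
            rc = -1;

    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    munmap(map, SKETCH_PAGE_LEN);
out_close:
    close(fd);
    return rc;
}
```

A call such as `scribble_mmap_msync("/tmp/stable_page_sketch.dat", 4, 1000, 16)` starts four threads and issues sixteen synchronous msync() calls; the path is only an example and should point at the filesystem under test.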
To assess the performance impact of stable page writes, I moved to a disk that
doesn't have DIF support so that I could measure just the impact of waiting for
writeback. I first ran wac with 64 threads madly scribbling on a 64KB file and
saw about a 12% performance decrease. I then reran the wac program with 64
threads and a 64MB file and saw about the same performance numbers. I will of
course be testing a wider range of hardware now that I have a functioning patch
set, though, as I suspected, the patchset only seems to impact workloads that
rewrite the same memory page frequently.

As always, questions and comments are welcome; and thank you to all the
previous reviewers of this patchset!