allow GFP_NOFS allocators to perform swapcache writeout

(Requires the direct-to-BIO-for-swap patch)

One weakness which was introduced when the buffer LRU went away was
that GFP_NOFS allocations became equivalent to GFP_NOIO, because all
writeback goes via writepage/writepages, which requires entry into the
filesystem.

However, now that swapout no longer calls bmap(), we can honour
GFP_NOFS's intent for swapcache pages.  So if the allocation request
specifies __GFP_IO and !__GFP_FS, we can wait on swapcache pages and we
can perform swapcache writeout.
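
As a rough illustration of the rule above, here is a standalone sketch of
the test which the patch adds to shrink_cache().  The SKETCH_* flag values
and the sketch_may_enter_fs() helper are made up for this example and are
not kernel definitions; only the shape of the predicate follows the patch.

```c
#include <stdbool.h>

/* Hypothetical flag values for illustration; the real ones are in the kernel headers. */
#define SKETCH_GFP_IO 0x1u	/* caller permits starting/waiting on block I/O */
#define SKETCH_GFP_FS 0x2u	/* caller permits entry into the filesystem */

/*
 * Same shape as the may_enter_fs test in the patch: writeout (or waiting on
 * writeback) is allowed if the caller may enter the filesystem, or if the
 * page is swapcache and the caller at least permits I/O.
 */
static bool sketch_may_enter_fs(unsigned int gfp_mask, bool page_is_swapcache)
{
	return (gfp_mask & SKETCH_GFP_FS) ||
		(page_is_swapcache && (gfp_mask & SKETCH_GFP_IO));
}
```

So a GFP_NOFS-style mask (I/O allowed, filesystem entry not allowed) may
write out swapcache pages but still skips ordinary pagecache, and a
GFP_NOIO-style mask (neither bit set) skips both.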

This should strengthen the VM somewhat: GFP_NOFS allocators can now
clean and reclaim swapcache pages instead of skipping over them.




--- 2.5.22/mm/vmscan.c~GFP_IO-swap	Sun Jun 16 23:12:53 2002
+++ 2.5.22-akpm/mm/vmscan.c	Sun Jun 16 23:12:53 2002
@@ -391,7 +391,8 @@ shrink_cache(int nr_pages, zone_t *class
 	spin_lock(&pagemap_lru_lock);
 	while (--max_scan >= 0 &&
 			(entry = inactive_list.prev) != &inactive_list) {
-		struct page * page;
+		struct page *page;
+		int may_enter_fs;
 
 		if (need_resched()) {
 			spin_unlock(&pagemap_lru_lock);
@@ -426,10 +427,17 @@ shrink_cache(int nr_pages, zone_t *class
 			goto page_mapped;
 
 		/*
+		 * swap activity never enters the filesystem and is safe
+		 * for GFP_NOFS allocations.
+		 */
+		may_enter_fs = (gfp_mask & __GFP_FS) ||
+				(PageSwapCache(page) && (gfp_mask & __GFP_IO));
+
+		/*
 		 * IO in progress? Leave it at the back of the list.
 		 */
 		if (unlikely(PageWriteback(page))) {
-			if (gfp_mask & __GFP_FS) {
+			if (may_enter_fs) {
 				page_cache_get(page);
 				spin_unlock(&pagemap_lru_lock);
 				wait_on_page_writeback(page);
@@ -450,7 +458,7 @@ shrink_cache(int nr_pages, zone_t *class
 		mapping = page->mapping;
 
 		if (PageDirty(page) && is_page_cache_freeable(page) &&
-				page->mapping && (gfp_mask & __GFP_FS)) {
+				page->mapping && may_enter_fs) {
 			/*
 			 * It is not critical here to write it only if
 			 * the page is unmapped beause any direct writer
@@ -479,6 +487,15 @@ shrink_cache(int nr_pages, zone_t *class
 		 * If the page has buffers, try to free the buffer mappings
 		 * associated with this page. If we succeed we try to free
 		 * the page as well.
+		 *
+		 * We do this even if the page is PageDirty().
+		 * try_to_release_page() does not perform I/O, but a page
+		 * can have PageDirty set while actually being clean (all
+		 * of its buffers are clean).  This happens if the buffers
+		 * were written out directly with submit_bh().  ext3 does
+		 * this, as does the blockdev mapping.  try_to_release_page()
+		 * will discover that the buffers are clean, drop them and
+		 * mark the page clean, so that it can be freed.
 		 */
 		if (PagePrivate(page)) {
 			spin_unlock(&pagemap_lru_lock);
