
[PATCH] slab: use order 0 for vfs caches

From:  Linux Kernel Mailing List <linux-kernel-AT-vger.kernel.org>
To:  bk-commits-head-AT-vger.kernel.org
Subject:  [PATCH] slab: use order 0 for vfs caches
Date:  Tue, 27 Apr 2004 06:01:44 +0000

ChangeSet 1.1567, 2004/04/26 23:01:44-07:00, akpm@osdl.org

	[PATCH] slab: use order 0 for vfs caches
	
	We see deadlocks when slab decides to use order-1 allocations for
	ext3_inode_cache, because ext3_alloc_inode() then has to perform an
	order-1 GFP_NOFS allocation.
	
	Sometimes the order-1 allocation needs to free a huge number of pages (tens
	of megabytes) before an order-1 grouping becomes available.  But a GFP_NOFS
	allocation cannot free dcache (and hence icache) due to the deadlock
	problems identified in shrink_dcache_memory().
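
The fragmentation problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the free-page map and the helpers `toy_free_pages()` and `toy_has_order1()` are hypothetical and are not kernel code. It shows that plenty of pages can be free while no aligned pair of adjacent free pages exists, which is why an order-1 request may have to reclaim far more memory than an order-0 request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical, not kernel code): free_map[i] is true when
 * page i is free. */

/* Count how many individual pages are free. */
static size_t toy_free_pages(const bool *free_map, size_t npages)
{
	size_t count = 0;

	assert(free_map != NULL);
	for (size_t i = 0; i < npages; i++)
		if (free_map[i])
			count++;
	return count;
}

/* An order-1 block is modelled as two free pages whose first index is
 * even, mimicking buddy alignment.  Returns true if one exists. */
static bool toy_has_order1(const bool *free_map, size_t npages)
{
	assert(free_map != NULL);
	for (size_t i = 0; i + 1 < npages; i += 2)
		if (free_map[i] && free_map[i + 1])
			return true;
	return false;
}
```

With a map such as free, used, free, used, used, free, used, free, half of the eight pages are free, so any order-0 request succeeds, yet `toy_has_order1()` reports no usable pair until a further page is reclaimed.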
	
	So change slab to force order-0 allocations for shrinkable VFS objects.
	We can handle those OK.
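
The decision the patch makes can be sketched in userspace C. This is a simplified illustration, not the code in mm/slab.c: every `SKETCH_*` constant and `sketch_*` function is a hypothetical stand-in, the constant values are illustrative, the estimate ignores slab management overhead and alignment, and the off-slab limit handling in the real loop is omitted. The point is only the shape of the logic: a reclaim-accounted cache whose objects fit in one page is pinned to order 0, and everything else goes through the fragmentation-driven search.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the values are illustrative only. */
#define SKETCH_PAGE_SIZE	4096u
#define SKETCH_MAX_ORDER	5u
#define SKETCH_BREAK_ORDER	1u
#define SKETCH_RECLAIM_ACCOUNT	0x1u

/* Simplified estimate: objects per slab and leftover bytes, ignoring
 * management overhead and alignment. */
static void sketch_estimate(unsigned order, size_t size,
			    size_t *left_over, unsigned *num)
{
	size_t slab_bytes = (size_t)SKETCH_PAGE_SIZE << order;

	*num = (unsigned)(slab_bytes / size);
	*left_over = slab_bytes - (size_t)*num * size;
}

/* Pick a page order for a cache of objects of the given size. */
static unsigned sketch_pick_order(size_t size, unsigned flags, unsigned *num)
{
	size_t left_over;
	unsigned order = 0;

	assert(size > 0);

	/* Reclaimable cache whose objects fit in one page: force order 0. */
	if ((flags & SKETCH_RECLAIM_ACCOUNT) && size <= SKETCH_PAGE_SIZE) {
		sketch_estimate(0, size, &left_over, num);
		return 0;
	}

	/* Otherwise raise the order until the waste is acceptable. */
	for (;;) {
		sketch_estimate(order, size, &left_over, num);
		if (order >= SKETCH_MAX_ORDER)
			break;
		if (*num != 0) {
			if (order >= SKETCH_BREAK_ORDER)
				break;
			/* Accept at most 1/8 of the slab as waste. */
			if (left_over * 8 <= ((size_t)SKETCH_PAGE_SIZE << order))
				break;
		}
		order++;
	}
	return order;
}
```

Under these assumptions, a 1400-byte object without the reclaim flag lands on order 1, because two objects per 4096-byte page waste too much, while the same object with the flag set stays at order 0. An object larger than a page still takes the general path even when the flag is set.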


# This patch includes the following deltas:
#	           ChangeSet	1.1566  -> 1.1567 
#	           mm/slab.c	1.130   -> 1.131  
#

 slab.c |   74 ++++++++++++++++++++++++++++++++++++++---------------------------
 1 files changed, 44 insertions(+), 30 deletions(-)


diff -Nru a/mm/slab.c b/mm/slab.c
--- a/mm/slab.c	Tue Apr 27 00:23:45 2004
+++ b/mm/slab.c	Tue Apr 27 00:23:45 2004
@@ -1220,41 +1220,55 @@
 
 	size = ALIGN(size, align);
 
-	/* Cal size (in pages) of slabs, and the num of objs per slab.
-	 * This could be made much more intelligent.  For now, try to avoid
-	 * using high page-orders for slabs.  When the gfp() funcs are more
-	 * friendly towards high-order requests, this should be changed.
-	 */
-	do {
-		unsigned int break_flag = 0;
-cal_wastage:
+	if ((flags & SLAB_RECLAIM_ACCOUNT) && size <= PAGE_SIZE) {
+		/*
+		 * A VFS-reclaimable slab tends to have most allocations
+		 * as GFP_NOFS and we really don't want to have to be allocating
+		 * higher-order pages when we are unable to shrink dcache.
+		 */
+		cachep->gfporder = 0;
 		cache_estimate(cachep->gfporder, size, align, flags,
-						&left_over, &cachep->num);
-		if (break_flag)
-			break;
-		if (cachep->gfporder >= MAX_GFP_ORDER)
-			break;
-		if (!cachep->num)
-			goto next;
-		if (flags & CFLGS_OFF_SLAB && cachep->num > offslab_limit) {
-			/* Oops, this num of objs will cause problems. */
-			cachep->gfporder--;
-			break_flag++;
-			goto cal_wastage;
-		}
-
+					&left_over, &cachep->num);
+	} else {
 		/*
-		 * Large num of objs is good, but v. large slabs are currently
-		 * bad for the gfp()s.
+		 * Calculate size (in pages) of slabs, and the num of objs per
+		 * slab.  This could be made much more intelligent.  For now,
+		 * try to avoid using high page-orders for slabs.  When the
+		 * gfp() funcs are more friendly towards high-order requests,
+		 * this should be changed.
 		 */
-		if (cachep->gfporder >= slab_break_gfp_order)
-			break;
+		do {
+			unsigned int break_flag = 0;
+cal_wastage:
+			cache_estimate(cachep->gfporder, size, align, flags,
+						&left_over, &cachep->num);
+			if (break_flag)
+				break;
+			if (cachep->gfporder >= MAX_GFP_ORDER)
+				break;
+			if (!cachep->num)
+				goto next;
+			if (flags & CFLGS_OFF_SLAB &&
+					cachep->num > offslab_limit) {
+				/* This num of objs will cause problems. */
+				cachep->gfporder--;
+				break_flag++;
+				goto cal_wastage;
+			}
+
+			/*
+			 * Large num of objs is good, but v. large slabs are
+			 * currently bad for the gfp()s.
+			 */
+			if (cachep->gfporder >= slab_break_gfp_order)
+				break;
 
-		if ((left_over*8) <= (PAGE_SIZE<<cachep->gfporder))
-			break;	/* Acceptable internal fragmentation. */
+			if ((left_over*8) <= (PAGE_SIZE<<cachep->gfporder))
+				break;	/* Acceptable internal fragmentation. */
 next:
-		cachep->gfporder++;
-	} while (1);
+			cachep->gfporder++;
+		} while (1);
+	}
 
 	if (!cachep->num) {
 		printk("kmem_cache_create: couldn't create cache %s.\n", name);



Copyright © 2004, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds