From: Lee Schermerhorn <lee.schermerhorn@hp.com>
To: linux-mm@kvack.org, linux-numa@vger.kernel.org
Subject: [PATCH 0/4] hugetlb: V3 constrain allocation/free based on task mempolicy
Date: Wed, 29 Jul 2009 13:54:50 -0400
Cc: <akpm@linux-foundation.org>, Mel Gorman <mel@csn.ul.ie>,
    Nishanth Aravamudan <nacc@us.ibm.com>, andi@firstfloor.org,
    David Rientjes <rientjes@google.com>,
    Adam Litke <agl@us.ibm.com>,
    Andy Whitcroft <apw@canonical.com>, <eric.whitney@hp.com>
PATCH 0/4 hugetlb: constrain allocation/free based on task mempolicy
I'm sending these out again, slightly revised, for comparison
with a 3rd alternative for controlling where persistent huge
pages are allocated which I'll send out as a separate series.
Against: 2.6.31-rc3-mmotm-090716-1432
atop previously submitted "alloc_bootmem_huge_pages() fix"
[http://marc.info/?l=linux-mm&m=124775468226290&w=4]
This is V3 of a series of patches to constrain the allocation and
freeing of persistent huge pages using the NUMA mempolicy of
the task modifying "nr_hugepages". This series is based on Mel
Gorman's suggestion to use task mempolicy. One of the benefits
of this method is that it does not *require* modification to
hugeadm(8) to use this feature.
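As a rough illustration of why hugeadm(8) needs no changes: on a kernel with
this series applied, the writer of "nr_hugepages" can simply be run under
numactl(8), and its mempolicy then selects the nodes. The sketch below only
builds such a command line; the helper name hugepage_command() is
hypothetical, and actually running the result requires root and a NUMA system.

```python
import shlex

NR_HUGEPAGES = "/proc/sys/vm/nr_hugepages"


def hugepage_command(count, nodes, policy="membind"):
    """Build an argv list that writes `count` to nr_hugepages under numactl.

    Hypothetical helper for illustration; it does not execute anything.
    `policy` is one of the numactl memory policies "membind" or "interleave".
    """
    if policy not in ("membind", "interleave"):
        raise ValueError("unsupported policy: %r" % (policy,))
    if not nodes:
        raise ValueError("at least one node is required")
    # numactl takes a comma-separated node list, e.g. --membind=0,2
    nodelist = ",".join(str(n) for n in sorted(set(nodes)))
    # The redirect must happen inside the process that carries the policy,
    # so the write is wrapped in "sh -c".
    write = "echo {} > {}".format(int(count), shlex.quote(NR_HUGEPAGES))
    return ["numactl", "--{}={}".format(policy, nodelist), "sh", "-c", write]


if __name__ == "__main__":
    argv = hugepage_command(16, [0, 1], policy="interleave")
    print(" ".join(shlex.quote(a) for a in argv))
```

Without this series the same command would still run, but the policy would
not influence where the persistent huge pages are placed.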
V3 factors the "rework" of the hstate_next_node_to_{alloc|free}
functions out of the patch to derive huge pages nodes_allowed
from mempolicy, and moves it before the patch to add nodemasks
to the alloc/free functions. See patch 1/4.
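The effect of passing a nodemask to the next-node selection can be pictured
with a toy model. This is not the kernel code: next_node_allowed() and
distribute() are hypothetical names, and the model assumes only that pages
are handed out round-robin over the allowed nodes.

```python
def next_node_allowed(nid, nodes_allowed):
    """Return the first allowed node after `nid`, wrapping around.

    Toy model of round-robin node selection limited to a set of node ids.
    """
    allowed = sorted(nodes_allowed)
    if not allowed:
        raise ValueError("nodes_allowed must not be empty")
    for n in allowed:
        if n > nid:
            return n
    # Walked past the highest allowed node: wrap to the lowest one.
    return allowed[0]


def distribute(count, nodes_allowed, start=-1):
    """Spread `count` pages round-robin over `nodes_allowed`.

    Returns a dict mapping node id to the number of pages placed there.
    """
    per_node = {n: 0 for n in nodes_allowed}
    nid = start
    for _ in range(count):
        nid = next_node_allowed(nid, nodes_allowed)
        per_node[nid] += 1
    return per_node


if __name__ == "__main__":
    # Five pages over nodes 1 and 3 only; nodes 0 and 2 are never touched.
    print(distribute(5, {1, 3}))
```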
A couple of limitations [still] in this version:
1) I haven't implemented a boot time parameter to constrain the
boot time allocation of huge pages. This can be added if
anyone feels strongly that it is required.
2) I have not implemented a per node nr_overcommit_hugepages as
David Rientjes and I discussed earlier. Again, this can be
added and specific nodes can be addressed using the mempolicy
as this series does for allocation and free. However, after
some experience with the libhugetlbfs test suite, specifically
attempting to run the test suite constrained by mempolicy and
a cpuset, I'm thinking that per node overcommit limits might
not be such a good idea. This would require an application
[or the library] to sum the per node limits over the allowed
nodes and possibly compare to global limits to determine the
available resources. Per cpuset limits might work better.
This area requires more investigation, but this patch series
doesn't seem to make things worse than they already are in
this regard.
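To make the concern in limitation 2 concrete, here is a minimal sketch of
the bookkeeping an application [or the library] would need if per node
overcommit limits existed. No such per node interface exists in this series;
the function name and its inputs are hypothetical.

```python
def available_overcommit(per_node_limit, nodes_allowed, global_limit=None):
    """Sum hypothetical per node overcommit limits over the allowed nodes.

    per_node_limit: dict mapping node id to that node's overcommit limit.
    nodes_allowed:  node ids permitted by the task's mempolicy and cpuset.
    global_limit:   optional system-wide cap to compare against.
    """
    total = sum(per_node_limit.get(n, 0) for n in nodes_allowed)
    if global_limit is not None:
        # The usable amount cannot exceed the global limit.
        total = min(total, global_limit)
    return total


if __name__ == "__main__":
    limits = {0: 4, 1: 8, 2: 16}
    print(available_overcommit(limits, {1, 2}))
    print(available_overcommit(limits, {1, 2}, global_limit=20))
```

A single per cpuset limit would replace the summation with one lookup, which
is part of why that approach might work better.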