Christoph Lameter <email@example.com>
[PATCH 00/14] Zoned VM counters V2
Thu, 8 Jun 2006 16:02:39 -0700 (PDT)
firstname.lastname@example.org, Hugh Dickins <email@example.com>,
Nick Piggin <firstname.lastname@example.org>, email@example.com,
Andi Kleen <firstname.lastname@example.org>,
Marcelo Tosatti <email@example.com>,
Christoph Lameter <firstname.lastname@example.org>
Zone based VM statistics are necessary to be able to determine what the state
of memory in one zone is. In a NUMA system this can be helpful for local
reclaim and other memory optimizations that may be able to shift VM load
in order to get more balanced memory use.
It is also helpful to know how the computing load affects the memory
allocations on various zones.
The patchset introduces a framework for counters that is a cross between the
existing page_stats --which are simply global counters split per cpu-- and
the approach of deferred incremental updates implemented for nr_pagecache.
Small per cpu 8 bit counters are added to struct zone. If a counter
exceeds a certain threshold then its value is accumulated in an array of
atomic_long in the zone and in a global array that sums up all
Access to VM counter information for a zone and for the whole machine
is then possible by simply indexing an array (Thanks to Nick Piggin for
pointing out that approach). Reading the total number of pages of
various types no longer requires summing up all per cpu counters.
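
A minimal userspace sketch of the counter scheme described above, not the
code from the patches. All names (sketch_mod, sketch_zone_read,
sketch_global_read, SKETCH_THRESHOLD and so on), the CPU count and the
threshold value are hypothetical. Each cpu keeps a small signed 8 bit delta
per counter. Once the delta passes the threshold it is folded into the per
zone atomic_long and into the global array, so readers only index an array.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_NR_CPUS   4   /* hypothetical number of processors */
#define SKETCH_THRESHOLD 32  /* hypothetical fold threshold, fits in 8 bits */

enum sketch_item { SK_NR_MAPPED, SK_NR_PAGECACHE, SK_NR_ITEMS };

struct sketch_zone {
	int8_t diff[SKETCH_NR_CPUS][SK_NR_ITEMS]; /* small per cpu deltas */
	atomic_long stat[SK_NR_ITEMS];            /* folded per zone totals */
};

/* Machine wide totals, updated together with the zone totals. */
static atomic_long sketch_global[SK_NR_ITEMS];

/* Apply delta on behalf of one cpu; fold only when the threshold is passed. */
static void sketch_mod(struct sketch_zone *z, int cpu,
		       enum sketch_item item, int delta)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS && item < SK_NR_ITEMS);

	int x = z->diff[cpu][item] + delta;

	if (x > SKETCH_THRESHOLD || x < -SKETCH_THRESHOLD) {
		atomic_fetch_add(&z->stat[item], x);
		atomic_fetch_add(&sketch_global[item], x);
		x = 0;
	}
	z->diff[cpu][item] = (int8_t)x; /* always within +/- threshold here */
}

/* Readers index an array; no loop over processors is needed. */
static long sketch_zone_read(struct sketch_zone *z, enum sketch_item item)
{
	return atomic_load(&z->stat[item]);
}

static long sketch_global_read(enum sketch_item item)
{
	return atomic_load(&sketch_global[item]);
}
```

In this sketch a value read from the arrays can lag the true count by up to
the threshold for each cpu, since deltas below the threshold have not been
folded yet. That is the price of keeping the update path cheap.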
Benefits of this patchset right now:
- zone_reclaim_interval vanishes since VM stats can now determine
when it is worth scanning for reclaimable pages.
- loops over all processors are avoided in writeback and
- Get rid of the nr_pagecache atomic for the single processor case
- Accurate counters in /sys/devices/system/node/node*/meminfo. The current
counters are based on where the pages were allocated, so they are not
useful for showing the actual use of pages on a node.
- Detailed VM counters available in more /proc and /sys status files.
References to earlier discussions:
Earlier approaches: http://marc.theaimsgroup.com/?l=linux-kernel&m=113460...
Performance test with AIM7 did not show any regressions. Seems to be a tad
faster even. Tested on ia64 / NUMA. Builds fine on i386, SMP / UP.
- Cleanup code, resequence and base patches on 2.6.17-rc6-mm1
- Reduce interrupt holdoffs
- Add zone reclaim interval removal patch
- Rename EVENT_COUNTER to VM_EVENT_COUNTERS (also all variables and functions)
The patchset consists of 14 patches. These are:
01/14 Per zone counter infrastructure
Sets up the functionality to handle per zone counters but does not
02/14 Add zoned counters to /proc/vmstat
Adds the display of zoned counters
03/14 Conversion of nr_mapped to a per zone counter
Converts nr_mapped and sets up the first per zone counters. This allows
optimizations in various places that avoid looping over counters from all
04/14 Conversion of nr_pagecache to a per zone counter
Replace the single atomic variable with a per cpu counter. For UP this means
that no atomic operations have to be used for nr_pagecache anymore. Remove
special nr_pagecache code.
05/14 Use zoned counters instead of zone_reclaim_interval
Replace the zone_reclaim_interval logic with a check for
06/14 Extend proc per node, per zone stats by adding per zone counters
Adds new counters to various places where we display counters.
07/14 Conversion of nr_slab to a per zone counter
This avoids looping over processors in the reclaim code and allows accurate
accounting of slab use per zone.
08/14 Conversion of nr_pagetable to a per zone counter
Allows accurate accounting of pagetable pages per zone.
09/14 Conversion of nr_dirty to a per zone counter
Avoids loop over processors in writeback state determination.
10/14 Conversion of nr_writeback to a per zone counter
Avoids loop over processors in writeback state determination.
11/14 Conversion of nr_unstable to a per zone counter
Avoids loop over processors in writeback state determination.
12/14 Remove get_page_state functions
There is no need anymore for the get_page_state function. So remove it.
13/14 Convert nr_bounce to a per zone counter
nr_bounce also counts a type of page.
14/14 Remove writeback structures
There is really no need anymore to cache writeback information since
the counters are readily available. Remove the writeback information
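
For patches 09/14 to 11/14, the difference between looping over processors
and reading already folded totals can be sketched as follows. This is an
illustration only. The names (sketch_percpu, sketch_total,
sketch_wb_pages_old, sketch_wb_pages_new) and the choice to add up dirty,
writeback and unstable pages are assumptions made for the example, not the
code in the patches.

```c
#include <assert.h>
#include <stdatomic.h>

#define SKETCH_NR_CPUS 4 /* hypothetical number of processors */

enum sketch_wb_item {
	SK_NR_DIRTY, SK_NR_WRITEBACK, SK_NR_UNSTABLE, SK_NR_WB_ITEMS
};

/* Old style: one plain counter per cpu and item; readers must sum them. */
static long sketch_percpu[SKETCH_NR_CPUS][SK_NR_WB_ITEMS];

/* New style: totals that the counter framework has already folded. */
static atomic_long sketch_total[SK_NR_WB_ITEMS];

/* Walk every processor to get one total. */
static long sketch_sum_cpus(enum sketch_wb_item item)
{
	assert(item < SK_NR_WB_ITEMS);

	long sum = 0;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sum += sketch_percpu[cpu][item];
	return sum;
}

/* Writeback state the old way: three loops over all processors. */
static long sketch_wb_pages_old(void)
{
	return sketch_sum_cpus(SK_NR_DIRTY) +
	       sketch_sum_cpus(SK_NR_WRITEBACK) +
	       sketch_sum_cpus(SK_NR_UNSTABLE);
}

/* Writeback state the new way: three array reads, independent of cpu count. */
static long sketch_wb_pages_new(void)
{
	return atomic_load(&sketch_total[SK_NR_DIRTY]) +
	       atomic_load(&sketch_total[SK_NR_WRITEBACK]) +
	       atomic_load(&sketch_total[SK_NR_UNSTABLE]);
}
```

The cost of the first variant grows with the number of processors, while
the second stays constant, which matters most on large NUMA machines.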