
sched/numa: introduce numa locality

From:  王贇 <yun.wang-AT-linux.alibaba.com>
To:  Ingo Molnar <mingo-AT-redhat.com>, Peter Zijlstra <peterz-AT-infradead.org>, Juri Lelli <juri.lelli-AT-redhat.com>, Vincent Guittot <vincent.guittot-AT-linaro.org>, Dietmar Eggemann <dietmar.eggemann-AT-arm.com>, Steven Rostedt <rostedt-AT-goodmis.org>, Ben Segall <bsegall-AT-google.com>, Mel Gorman <mgorman-AT-suse.de>, Luis Chamberlain <mcgrof-AT-kernel.org>, Kees Cook <keescook-AT-chromium.org>, Iurii Zaikin <yzaikin-AT-google.com>, Michal Koutný <mkoutny-AT-suse.com>, linux-fsdevel-AT-vger.kernel.org, linux-kernel-AT-vger.kernel.org, linux-doc-AT-vger.kernel.org, "Paul E. McKenney" <paulmck-AT-linux.ibm.com>, Randy Dunlap <rdunlap-AT-infradead.org>, Jonathan Corbet <corbet-AT-lwn.net>
Subject:  [PATCH v3 0/2] sched/numa: introduce numa locality
Date:  Tue, 3 Dec 2019 13:59:14 +0800
Message-ID:  <040def80-9c38-4bcc-e4a8-8a0d10f131ed@linux.alibaba.com>

Since last patch set:
  Because the locality region concept was too complicated, we have now
  tried to decouple the time factor from it, as Michal commented.

  Now numa_stat just exposes the local/remote page access counters,
  gathered from all the tasks in the hierarchy. This should be much
  easier to understand, and the meaning of each counter is
  straightforward.

  Now we have just one locality percentage for each cgroup, which
  represents how well NUMA Balancing is working and indicates NUMA
  efficiency.

Modern production environments may use hundreds of cgroups to control
the resources for different workloads, along with complicated
resource binding.

On NUMA platforms with multiple nodes, things become even more
complicated. We want more memory accesses to be local in order to
improve performance, and NUMA Balancing works hard to achieve that;
however, a wrong memory policy or node binding can easily waste the
effort and result in a lot of remote page accesses.

We need to notice such problems so that we get a chance to fix them
before too much damage is done; however, there is no good monitoring
approach yet to help catch the mouse that introduced the remote
accesses.

This patch set tries to fill in the missing piece by introducing
per-cgroup NUMA locality info. With these new statistics, we can
monitor NUMA efficiency on a daily basis and give a warning when
things go too wrong.
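
As a rough illustration of the monitoring idea, the sketch below turns
per-cgroup local/remote page access counters into a locality percentage
and lists the cgroups that fall below a warning threshold. It is only a
sketch under stated assumptions: the formula local / (local + remote),
the 80% threshold, and the function names are hypothetical and are not
taken from the patches, and the counters are passed in as plain numbers
rather than read from the real per-cgroup file, whose format is
described in the second patch.

```python
# Hypothetical sketch: the formula, the threshold, and all names here
# are assumptions for illustration, not part of the patch set.

def locality_percent(local, remote):
    """Return the share of local page accesses as a percentage.

    Returns None when no accesses were counted, so that an idle
    cgroup is not reported as having 0% locality.
    """
    total = local + remote
    if total == 0:
        return None
    return 100.0 * local / total


def cgroups_to_warn(counters, threshold=80.0):
    """Return (name, percent) pairs for cgroups below the threshold.

    `counters` maps a cgroup name to a (local, remote) tuple of page
    access counts gathered by some other means.
    """
    warn = []
    for name, (local, remote) in sorted(counters.items()):
        pct = locality_percent(local, remote)
        if pct is not None and pct < threshold:
            warn.append((name, pct))
    return warn
```

A daily monitoring job could feed the sampled counters of every cgroup
into cgroups_to_warn() and raise an alert for each entry returned.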

Please check the second patch for more details.

Michael Wang (2):
  sched/numa: introduce per-cgroup NUMA locality info
  sched/numa: documentation for per-cgroup numa statistics

 Documentation/admin-guide/cg-numa-stat.rst      | 176 ++++++++++++++++++++++++
 Documentation/admin-guide/index.rst             |   1 +
 Documentation/admin-guide/kernel-parameters.txt |   4 +
 Documentation/admin-guide/sysctl/kernel.rst     |   9 ++
 include/linux/sched.h                           |  15 ++
 include/linux/sched/sysctl.h                    |   6 +
 init/Kconfig                                    |  11 ++
 kernel/sched/core.c                             |  75 ++++++++++
 kernel/sched/fair.c                             |  62 +++++++++
 kernel/sched/sched.h                            |  12 ++
 kernel/sysctl.c                                 |  11 ++
 11 files changed, 382 insertions(+)
 create mode 100644 Documentation/admin-guide/cg-numa-stat.rst

-- 
2.14.4.44.g2045bb6


