
cgroup-aware OOM killer

From:  Roman Gushchin <guro-AT-fb.com>
To:  <linux-mm-AT-kvack.org>
Subject:  [v7 0/5] cgroup-aware OOM killer
Date:  Mon, 4 Sep 2017 15:21:03 +0100
Message-ID:  <20170904142108.7165-1-guro@fb.com>
Cc:  Roman Gushchin <guro-AT-fb.com>, Michal Hocko <mhocko-AT-kernel.org>, Vladimir Davydov <vdavydov.dev-AT-gmail.com>, Johannes Weiner <hannes-AT-cmpxchg.org>, Tetsuo Handa <penguin-kernel-AT-I-love.SAKURA.ne.jp>, David Rientjes <rientjes-AT-google.com>, Andrew Morton <akpm-AT-linux-foundation.org>, Tejun Heo <tj-AT-kernel.org>, <kernel-team-AT-fb.com>, <cgroups-AT-vger.kernel.org>, <linux-doc-AT-vger.kernel.org>, <linux-kernel-AT-vger.kernel.org>

This patchset makes the OOM killer cgroup-aware.

v7:
  - __oom_kill_process() drops reference to the victim task
  - oom_score_adj -1000 is always respected
  - Renamed oom_kill_all to oom_group
  - Dropped oom_prio range, converted from short to int
  - Added a cgroup v2 mount option to disable cgroup-aware OOM killer
  - Docs updated
  - Rebased on top of mmotm

v6:
  - Renamed oom_control.chosen to oom_control.chosen_task
  - Renamed oom_kill_all_tasks to oom_kill_all
  - Per-node NR_SLAB_UNRECLAIMABLE accounting
  - Several minor fixes and cleanups
  - Docs updated

v5:
  - Rebased on top of Michal Hocko's patches, which changed how
    OOM victims gain access to the memory reserves. Dropped the
    corresponding part of this patchset
  - Separated the oom_kill_process() splitting into a standalone commit
  - Added debug output (suggested by David Rientjes)
  - Some minor fixes

v4:
  - Reworked per-cgroup oom_score_adj into oom_priority
    (based on ideas by David Rientjes)
  - Tasks with oom_score_adj -1000 are never selected if
    oom_kill_all_tasks is not set
  - Memcg victim selection code is reworked, and
    synchronization is based on finding tasks with the OOM victim marker,
    rather than on a global counter
  - Debug output is dropped
  - Refactored TIF_MEMDIE usage

v3:
  - Merged commits 1-4 into 6
  - Separated oom_score_adj logic and debug output into separate commits
  - Fixed swap accounting

v2:
  - Reworked victim selection based on feedback
    from Michal Hocko, Vladimir Davydov and Johannes Weiner
  - "Kill all tasks" is now an opt-in option, by default
    only one process will be killed
  - Added per-cgroup oom_score_adj
  - Refined oom score calculations, suggested by Vladimir Davydov
  - Converted to a patchset

v1:
  https://lkml.org/lkml/2017/5/18/969


Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: kernel-team@fb.com
Cc: cgroups@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org

Roman Gushchin (5):
  mm, oom: refactor the oom_kill_process() function
  mm, oom: cgroup-aware OOM killer
  mm, oom: introduce oom_priority for memory cgroups
  mm, oom, docs: describe the cgroup-aware OOM killer
  mm, oom: cgroup v2 mount option to disable cgroup-aware OOM killer

 Documentation/admin-guide/kernel-parameters.txt |   1 +
 Documentation/cgroup-v2.txt                     |  56 +++++
 include/linux/memcontrol.h                      |  36 +++
 include/linux/oom.h                             |  12 +-
 mm/memcontrol.c                                 | 293 ++++++++++++++++++++++++
 mm/oom_kill.c                                   | 209 +++++++++++------
 6 files changed, 537 insertions(+), 70 deletions(-)

-- 
2.13.5


