| From: | Johannes Weiner <hannes-AT-cmpxchg.org> |
| To: | Andrew Morton <akpm-AT-linux-foundation.org> |
| Subject: | [PATCH v4 0/8] mm: switch THP shrinker to list_lru |
| Date: | Thu, 21 May 2026 11:02:06 -0400 |
| Message-ID: | <20260521150330.1955924-1-hannes@cmpxchg.org> |
| Cc: | David Hildenbrand <david-AT-kernel.org>, Lorenzo Stoakes <ljs-AT-kernel.org>, Shakeel Butt <shakeel.butt-AT-linux.dev>, Michal Hocko <mhocko-AT-kernel.org>, Dave Chinner <david-AT-fromorbit.com>, Roman Gushchin <roman.gushchin-AT-linux.dev>, Muchun Song <muchun.song-AT-linux.dev>, Qi Zheng <qi.zheng-AT-linux.dev>, Yosry Ahmed <yosry.ahmed-AT-linux.dev>, Zi Yan <ziy-AT-nvidia.com>, "Liam R . Howlett" <liam-AT-infradead.org>, Usama Arif <usama.arif-AT-linux.dev>, Kiryl Shutsemau <kas-AT-kernel.org>, Vlastimil Babka <vbabka-AT-kernel.org>, Kairui Song <ryncsn-AT-gmail.com>, Mikhail Zaslonko <zaslonko-AT-linux.ibm.com>, Vasily Gorbik <gor-AT-linux.ibm.com>, Baolin Wang <baolin.wang-AT-linux.alibaba.com>, Barry Song <baohua-AT-kernel.org>, Dev Jain <dev.jain-AT-arm.com>, Lance Yang <lance.yang-AT-linux.dev>, Nico Pache <npache-AT-redhat.com>, Ryan Roberts <ryan.roberts-AT-arm.com>, cgroups-AT-vger.kernel.org, linux-mm-AT-kvack.org, linux-kernel-AT-vger.kernel.org |
This is version 4 of switching the THP shrinker to list_lru.

Changes in v4:
- guard folio_memcg_alloc_deferred() with mem_cgroup_disabled() to fix
  NULL deref in __memcg_list_lru_alloc() when booting with
  cgroup_disable=memory (e.g., kdump capture kernel) -- reported and
  tested by Mikhail Zaslonko on s390 and x86
- flatten if (folio) branches in alloc_swap_folio() and alloc_anon_folio()
  in a prep patch so the list_lru allocation additions are a clean minimal
  diff (Lorenzo)
- folio_memcg_alloc_deferred() moved out of alloc_charge_folio() into the
  anon-only collapse_huge_page() path; collapse_file() shares that helper
  but its pages don't go on the THP shrinker queue (David)
- guard folio_memcg_alloc_deferred() with order > 1; mTHPs below order-2
  can't be queued on the deferred split list (David)
- make deferred_split_lru static, hide behind folio_memcg_alloc_deferred()
  wrapper with GFP_KERNEL (Lorenzo)
- rename l -> lru throughout huge_memory.c (Lorenzo)
- kdoc for folio_memcg_list_lru_alloc() (Lorenzo)
- list_lru_lock_irq()/unlock_irq()/add_irq() irq-disabling variants;
  use list_lru_add_irq() in deferred_split_scan() (Lorenzo)
- reorder shrinker_free() before list_lru_destroy() (Lorenzo)

Changes in v3:
- dedicated lockdep_key for irqsafe deferred_split_lru.lock (syzbot)
- conditional list_lru ops in __folio_freeze_and_split_unmapped() (syzbot)
- annotate runs of inscrutable false, NULL, false function arguments (David)
- rename to folio_memcg_list_lru_alloc() (David)

Changes in v2:
- explicit rcu_read_lock() in __folio_freeze_and_split_unmapped() (Usama)
- split out list_lru prep bits (Dave)

The open-coded deferred split queue has issues. It's not NUMA-aware
(when cgroup is enabled), and the callsites interacting with it are
more complicated than they need to be. Switching to list_lru fixes the
NUMA problem and streamlines things. It also simplifies planned
shrinker work.
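
To illustrate what NUMA-awareness buys here, the following is a minimal userspace sketch of per-node queueing. It is not the list_lru implementation; `MAX_NODES`, `struct node_lru`, `node_lru_add()` and `node_lru_count()` are hypothetical names invented for this model. The point is only that items are filed under the node their memory lives on, so a reclaim pass aimed at one node sees that node's items and nothing else.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4	/* hypothetical node count for this model */

struct item {
	int nid;		/* NUMA node the item's memory lives on */
	struct item *next;
};

/* One list per node instead of one shared queue. */
struct node_lru {
	struct item *head[MAX_NODES];
	unsigned long nr_items[MAX_NODES];
};

/* File the item under its own node. */
static void node_lru_add(struct node_lru *lru, struct item *it)
{
	assert(it->nid >= 0 && it->nid < MAX_NODES);
	it->next = lru->head[it->nid];
	lru->head[it->nid] = it;
	lru->nr_items[it->nid]++;
}

/* What a reclaim pass targeting a single node would have to look at. */
static unsigned long node_lru_count(const struct node_lru *lru, int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	return lru->nr_items[nid];
}
```

With a single shared queue, pressure on one node would have to walk entries belonging to every node; with the per-node layout above, the walk is bounded by that node's own count.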

Patches 1-4 are cleanups and small refactors in list_lru code. They're
basically independent, but make the THP shrinker conversion easier.

Patch 5 extends the list_lru API to allow the caller to control the
locking scope. The THP shrinker has private state it needs to keep
synchronized with the LRU state.
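
The reason a caller would want to control the locking scope can be sketched in userspace as follows. This is a model under stated assumptions, not the API added by patch 5: every `model_*` name and the `private_flag` field are hypothetical, and a plain boolean stands in for the spinlock. It shows the caller taking the lock once, then updating both its private state and the list membership before releasing it, so the two can never be observed out of sync.

```c
#include <assert.h>
#include <stdbool.h>

struct model_lru {
	bool locked;		/* stands in for a spinlock in this model */
	long nr_items;
};

struct model_folio {
	bool on_list;
	bool private_flag;	/* caller-private state tied to list membership */
};

static void model_lru_lock(struct model_lru *lru)
{
	assert(!lru->locked);
	lru->locked = true;
}

static void model_lru_unlock(struct model_lru *lru)
{
	assert(lru->locked);
	lru->locked = false;
}

/* Add variant that expects the caller to already hold the lock. */
static bool model_lru_add_locked(struct model_lru *lru, struct model_folio *f)
{
	assert(lru->locked);
	if (f->on_list)
		return false;
	f->on_list = true;
	lru->nr_items++;
	return true;
}

/* Private state and list state change inside one critical section. */
static void model_defer_split(struct model_lru *lru, struct model_folio *f,
			      bool set_flag)
{
	model_lru_lock(lru);
	if (set_flag)
		f->private_flag = true;
	model_lru_add_locked(lru, f);
	model_lru_unlock(lru);
}
```

If the add operation took and dropped the lock internally, the flag update would need a second critical section, leaving a window in which the flag and the list disagree.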

Patch 6 extends the list_lru API with a convenience helper to do
list_lru head allocation (memcg_list_lru_alloc) when coming from a
folio. Anon THPs are instantiated in several places, and with the
folio reparenting patches pending, folio_memcg() access is now a more
delicate dance. This avoids having to replicate that dance everywhere.
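
The two guards the v4 changelog adds around folio_memcg_alloc_deferred() can be modelled as a pair of early exits. This is a hypothetical userspace sketch, not the kernel code: `model_alloc_deferred()` and `enum model_result` are invented here, the memcg state is passed in as a plain boolean, and whether the real checks sit inside the wrapper or at its call sites is not something this sketch claims to know.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome codes for this model. */
enum model_result { MODEL_SKIPPED, MODEL_ALLOCATED };

static enum model_result model_alloc_deferred(bool memcg_disabled,
					      unsigned int order)
{
	/*
	 * Booting with cgroup_disable=memory: the changelog reports a
	 * NULL deref on the allocation path without this guard.
	 */
	if (memcg_disabled)
		return MODEL_SKIPPED;

	/* Folios below order-2 can't be queued on the deferred split list. */
	if (order <= 1)
		return MODEL_SKIPPED;

	/* Only here would the per-memcg list_lru allocation be attempted. */
	return MODEL_ALLOCATED;
}
```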

Patch 7 flattens the folio allocation retry loops in alloc_swap_folio()
and alloc_anon_folio() without functional change, in preparation for
patch 8.

Patch 8 finally switches the deferred_split_queue to list_lru.

Based on mm-unstable.

 include/linux/huge_mm.h    |   7 +-
 include/linux/list_lru.h   |  68 +++++++++
 include/linux/memcontrol.h |   4 -
 include/linux/mmzone.h     |  12 --
 mm/huge_memory.c           | 355 ++++++++++++++-----------------------------
 mm/internal.h              |   2 +-
 mm/khugepaged.c            |   3 +
 mm/list_lru.c              | 220 ++++++++++++++++++---------
 mm/memcontrol.c            |  12 +-
 mm/memory.c                |  52 ++++---
 mm/mm_init.c               |  15 --
 11 files changed, 374 insertions(+), 376 deletions(-)