
bpf, mm: Introduce __GFP_TRYLOCK

From:  Alexei Starovoitov <alexei.starovoitov-AT-gmail.com>
To:  bpf-AT-vger.kernel.org
Subject:  [PATCH bpf-next v2 0/6] bpf, mm: Introduce __GFP_TRYLOCK
Date:  Mon, 09 Dec 2024 18:39:30 -0800
Message-ID:  <20241210023936.46871-1-alexei.starovoitov@gmail.com>
Cc:  andrii-AT-kernel.org, memxor-AT-gmail.com, akpm-AT-linux-foundation.org, peterz-AT-infradead.org, vbabka-AT-suse.cz, bigeasy-AT-linutronix.de, rostedt-AT-goodmis.org, houtao1-AT-huawei.com, hannes-AT-cmpxchg.org, shakeel.butt-AT-linux.dev, mhocko-AT-suse.com, willy-AT-infradead.org, tglx-AT-linutronix.de, tj-AT-kernel.org, linux-mm-AT-kvack.org, kernel-team-AT-fb.com

From: Alexei Starovoitov <ast@kernel.org>

Hi All,

This is a more complete patch set that introduces __GFP_TRYLOCK
for opportunistic page allocation and lockless page freeing.
It's usable for bpf as-is.
The main motivation is to remove bpf_mem_alloc and make
page and slab allocation reentrant.
This patch set is a first step. Once try_alloc_pages() is available,
new_slab() can be converted to it, followed by the rest of kmalloc/slab_alloc.

I started hacking kmalloc() to replace bpf_mem_alloc() completely,
but ___slab_alloc() is quite complex to convert to trylock.
Mainly the deactivate_slab() part: it cannot fail, but with only trylock
available I'm running out of ideas.
So far I'm thinking to limit it to:
- USE_LOCKLESS_FAST_PATH
  Which would mean that we would need to keep bpf_mem_alloc only for RT :(
- slab->flags & __CMPXCHG_DOUBLE, because the various debug options cannot work in
  trylock mode. The bit-spinlock-based slab_lock() cannot be made to work with trylock either.
- simple kasan poison/unpoison, since kasan_kmalloc and kasan_slab_free are
  too fancy with their own locks.

v1->v2:
- fixed buggy try_alloc_pages_noprof() in PREEMPT_RT. Thanks Peter.
- optimized all paths by doing spin_trylock_irqsave() first
  and only then checking for gfp_flags & __GFP_TRYLOCK,
  falling back to spin_lock_irqsave() if it's a regular mode.
  So the new gfp flag will not add performance overhead.
- patches 2-5 are new. They introduce lockless and/or trylock free_pages_nolock()
  and memcg support. So it's in usable shape for bpf in patch 6.
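
For readers skimming the changelog, the "trylock first, check the flag only
on contention" ordering can be sketched in userspace roughly as below. This is
an illustration only, not code from the patches: a pthread mutex stands in for
the zone spinlock, and DEMO_GFP_TRYLOCK, struct demo_zone, demo_zone_lock() and
demo_alloc_page() are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the new gfp flag; not the kernel's value. */
#define DEMO_GFP_TRYLOCK 0x1u

/* Hypothetical miniature "zone": one lock and a free page counter. */
struct demo_zone {
	pthread_mutex_t lock;	/* stands in for the zone spinlock */
	int free_pages;
};

/*
 * Returns true with the lock held, or false if the caller asked for
 * trylock-only behaviour and the lock was contended.
 */
static bool demo_zone_lock(struct demo_zone *z, unsigned int gfp_flags)
{
	/* Same first step for every caller: try to take the lock. */
	if (pthread_mutex_trylock(&z->lock) == 0)
		return true;

	/* Only the contended path pays for looking at the flag. */
	if (gfp_flags & DEMO_GFP_TRYLOCK)
		return false;

	/* Regular mode: wait for the lock as before. */
	pthread_mutex_lock(&z->lock);
	return true;
}

/* Returns 1 if a page was taken, 0 on failure (no pages or lock busy). */
static int demo_alloc_page(struct demo_zone *z, unsigned int gfp_flags)
{
	int got = 0;

	if (!demo_zone_lock(z, gfp_flags))
		return 0;

	assert(z->free_pages >= 0);
	if (z->free_pages > 0) {
		z->free_pages--;
		got = 1;
	}
	pthread_mutex_unlock(&z->lock);
	return got;
}
```

In the uncontended case both modes run the same instructions, which is the
reason the flag check is placed after the trylock. A caller passing
DEMO_GFP_TRYLOCK has to be prepared for demo_alloc_page() to return 0 even when
pages are available.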

v1:
https://lore.kernel.org/bpf/20241116014854.55141-1-alexei...

Alexei Starovoitov (6):
  mm, bpf: Introduce __GFP_TRYLOCK for opportunistic page allocation
  mm, bpf: Introduce free_pages_nolock()
  locking/local_lock: Introduce local_trylock_irqsave()
  memcg: Add __GFP_TRYLOCK support.
  mm, bpf: Use __GFP_ACCOUNT in try_alloc_pages().
  bpf: Use try_alloc_pages() to allocate pages for bpf needs.

 include/linux/gfp.h                 | 25 ++++++++
 include/linux/gfp_types.h           |  3 +
 include/linux/local_lock.h          |  9 +++
 include/linux/local_lock_internal.h | 23 +++++++
 include/linux/mm_types.h            |  4 ++
 include/linux/mmzone.h              |  3 +
 include/trace/events/mmflags.h      |  1 +
 kernel/bpf/syscall.c                |  4 +-
 mm/fail_page_alloc.c                |  6 ++
 mm/internal.h                       |  1 +
 mm/memcontrol.c                     | 21 +++++--
 mm/page_alloc.c                     | 94 +++++++++++++++++++++++++----
 tools/perf/builtin-kmem.c           |  1 +
 13 files changed, 177 insertions(+), 18 deletions(-)

-- 
2.43.5



