From: Alexei Starovoitov <alexei.starovoitov-AT-gmail.com>
To: bpf-AT-vger.kernel.org
Subject: [PATCH bpf-next v2 0/6] bpf, mm: Introduce __GFP_TRYLOCK
Date: Mon, 09 Dec 2024 18:39:30 -0800
Message-ID: <20241210023936.46871-1-alexei.starovoitov@gmail.com>
Cc: andrii-AT-kernel.org, memxor-AT-gmail.com, akpm-AT-linux-foundation.org, peterz-AT-infradead.org, vbabka-AT-suse.cz, bigeasy-AT-linutronix.de, rostedt-AT-goodmis.org, houtao1-AT-huawei.com, hannes-AT-cmpxchg.org, shakeel.butt-AT-linux.dev, mhocko-AT-suse.com, willy-AT-infradead.org, tglx-AT-linutronix.de, tj-AT-kernel.org, linux-mm-AT-kvack.org, kernel-team-AT-fb.com
From: Alexei Starovoitov <ast@kernel.org>
Hi All,
This is a more complete patch set that introduces __GFP_TRYLOCK
for opportunistic page allocation and lockless page freeing.
It's usable for bpf as-is.
The main motivation is to remove bpf_mem_alloc and make
page and slab allocation reentrant.
This patch set is a first step. Once try_alloc_pages() is available,
new_slab() can be converted to it, followed by the rest of kmalloc/slab_alloc.
I started hacking kmalloc() to replace bpf_mem_alloc() completely,
but ___slab_alloc() is quite complex to convert to trylock.
Mainly the deactivate_slab() part. It cannot fail, but when only trylock
is available I'm running out of ideas.
So far I'm thinking to limit it to:
- USE_LOCKLESS_FAST_PATH
Which would mean that we would need to keep bpf_mem_alloc only for RT :(
- slab->flags & __CMPXCHG_DOUBLE, because various debug options cannot work in
trylock mode. The bit-spinlock slab_lock() cannot be made to work with trylock either.
- simple kasan poison/unpoison, since kasan_kmalloc and kasan_slab_free are
too fancy with their own locks.
v1->v2:
- fixed buggy try_alloc_pages_noprof() in PREEMPT_RT. Thanks Peter.
- optimized all paths by doing spin_trylock_irqsave() first
and only then checking for gfp_flags & __GFP_TRYLOCK.
Then spin_lock_irqsave() if it's a regular mode.
So the new gfp flag will not add performance overhead.
- patches 2-5 are new. They introduce lockless and/or trylock free_pages_nolock()
and memcg support. So it's in usable shape for bpf in patch 6.
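
The trylock-first ordering in the second v1->v2 item can be sketched in
userspace roughly as follows. This is only an illustration, not the code from
the patches: it uses a pthread mutex in place of the zone spinlock, and
SKETCH_GFP_TRYLOCK, sketch_zone_lock, sketch_free_pages and sketch_take_page()
are all hypothetical names made up for the example. The point it tries to show
is that the flag is only examined after the trylock has already failed, so the
uncontended path pays nothing for it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for __GFP_TRYLOCK; the bit value is arbitrary. */
#define SKETCH_GFP_TRYLOCK 0x1u

/* Hypothetical stand-ins for a zone lock and the free pages it protects. */
static pthread_mutex_t sketch_zone_lock = PTHREAD_MUTEX_INITIALIZER;
static int sketch_free_pages = 4;

/*
 * Take one "page". Returns true on success, false if no page is left or
 * if the lock is contended and the caller asked for trylock-only behavior.
 */
static bool sketch_take_page(unsigned int gfp_flags)
{
	bool ok;

	/* Common case: try the lock first, without looking at the flags. */
	if (pthread_mutex_trylock(&sketch_zone_lock) != 0) {
		/* Contended: only now check whether the caller may block. */
		if (gfp_flags & SKETCH_GFP_TRYLOCK)
			return false;	/* opportunistic caller gives up */
		/* Regular mode: fall back to the blocking lock. */
		pthread_mutex_lock(&sketch_zone_lock);
	}

	assert(sketch_free_pages >= 0);
	ok = sketch_free_pages > 0;
	if (ok)
		sketch_free_pages--;

	pthread_mutex_unlock(&sketch_zone_lock);
	return ok;
}
```

A caller that cannot block would pass SKETCH_GFP_TRYLOCK and treat a false
return as "no page right now"; a regular caller passes 0 and waits for the lock
as before.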
v1:
https://lore.kernel.org/bpf/20241116014854.55141-1-alexei...
Alexei Starovoitov (6):
mm, bpf: Introduce __GFP_TRYLOCK for opportunistic page allocation
mm, bpf: Introduce free_pages_nolock()
locking/local_lock: Introduce local_trylock_irqsave()
memcg: Add __GFP_TRYLOCK support.
mm, bpf: Use __GFP_ACCOUNT in try_alloc_pages().
bpf: Use try_alloc_pages() to allocate pages for bpf needs.
include/linux/gfp.h | 25 ++++++++
include/linux/gfp_types.h | 3 +
include/linux/local_lock.h | 9 +++
include/linux/local_lock_internal.h | 23 +++++++
include/linux/mm_types.h | 4 ++
include/linux/mmzone.h | 3 +
include/trace/events/mmflags.h | 1 +
kernel/bpf/syscall.c | 4 +-
mm/fail_page_alloc.c | 6 ++
mm/internal.h | 1 +
mm/memcontrol.c | 21 +++++--
mm/page_alloc.c | 94 +++++++++++++++++++++++++----
tools/perf/builtin-kmem.c | 1 +
13 files changed, 177 insertions(+), 18 deletions(-)
--
2.43.5