locking: Introduce nested-BH locking.
From: Sebastian Andrzej Siewior <bigeasy-AT-linutronix.de>
To: linux-kernel-AT-vger.kernel.org, netdev-AT-vger.kernel.org
Subject: [PATCH v2 net-next 00/15] locking: Introduce nested-BH locking.
Date: Fri, 03 May 2024 20:25:04 +0200
Message-ID: <20240503182957.1042122-1-bigeasy@linutronix.de>
Cc: "David S. Miller" <davem-AT-davemloft.net>, Boqun Feng <boqun.feng-AT-gmail.com>, Daniel Borkmann <daniel-AT-iogearbox.net>, Eric Dumazet <edumazet-AT-google.com>, Frederic Weisbecker <frederic-AT-kernel.org>, Ingo Molnar <mingo-AT-redhat.com>, Jakub Kicinski <kuba-AT-kernel.org>, Paolo Abeni <pabeni-AT-redhat.com>, Peter Zijlstra <peterz-AT-infradead.org>, Thomas Gleixner <tglx-AT-linutronix.de>, Waiman Long <longman-AT-redhat.com>, Will Deacon <will-AT-kernel.org>
Disabling bottom halves acts as a per-CPU BKL. On PREEMPT_RT, code within a local_bh_disable() section remains preemptible. As a result, high-priority tasks (or threaded interrupts) can be blocked by long-running lower-priority tasks (or threaded interrupts), and that includes softirq sections.

The proposed way out is to introduce explicit per-CPU locks for resources which are protected by local_bh_disable() and to use them only on PREEMPT_RT, so there is no additional overhead for !PREEMPT_RT builds.

The series introduces the infrastructure and converts large parts of networking, which is the largest stakeholder here. Once this is done, the per-CPU lock from local_bh_disable() on PREEMPT_RT can be lifted.

v1…v2 https://lore.kernel.org/all/20231215171020.687342-1-bigea...
- Jakub complained about touching networking drivers to make the additional
  locking work. Alexei complained about the additional locking within the
  XDP/eBPF case. This led to a change in how the per-CPU variables are
  accessed for the XDP/eBPF case: on PREEMPT_RT the variables are now stored
  on the stack and a pointer to the structure is saved in task_struct, while
  everything stays unchanged for !RT. This was proposed as an RFC in v1:
  https://lore.kernel.org/all/20240213145923.2552753-1-bige...
  and then updated in v2:
  https://lore.kernel.org/all/20240229183109.646865-1-bigea...
- Renamed the container struct from xdp_storage to bpf_net_context.
  Suggested by Toke Høiland-Jørgensen.
- Use the container struct also on !PREEMPT_RT builds. Store the pointer to
  the on-stack struct in a per-CPU variable. Suggested by Toke
  Høiland-Jørgensen. This reduces the initial queue from 24 to 15 patches.
- There were complaints about scoped_guard(), which indents the whole block
  and makes the patches harder to review because the whole block shows up as
  removed and added again. Its usage has been replaced with
  local_lock_nested_bh() plus its unlock counterpart (see the sketch at the
  end of this message).

Sebastian
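
For illustration, here is a minimal sketch of the locking pattern referred to
above. Only local_lock_t, INIT_LOCAL_LOCK(), local_lock_nested_bh() and
local_unlock_nested_bh() are the interfaces the series deals with; the struct,
variable and field names (bh_stats, bh_lock, packets) are invented for this
example and do not appear in the series.

	/*
	 * Sketch only: bh_stats, bh_lock and packets are made-up names.
	 * The bh_lock member protects the per-CPU data on PREEMPT_RT.
	 */
	#include <linux/local_lock.h>
	#include <linux/percpu.h>

	struct bh_stats {
		local_lock_t	bh_lock;	/* protects @packets on PREEMPT_RT */
		unsigned long	packets;
	};

	static DEFINE_PER_CPU(struct bh_stats, bh_stats) = {
		.bh_lock = INIT_LOCAL_LOCK(bh_lock),
	};

	/* Caller runs with bottom halves disabled (e.g. softirq context). */
	static void bh_stats_inc(void)
	{
		/*
		 * On !PREEMPT_RT this is essentially a lockdep annotation;
		 * on PREEMPT_RT it takes a real per-CPU lock because the
		 * BH-disabled section itself remains preemptible there.
		 */
		local_lock_nested_bh(&bh_stats.bh_lock);
		this_cpu_inc(bh_stats.packets);
		local_unlock_nested_bh(&bh_stats.bh_lock);
	}

Compared with scoped_guard(), the explicit lock/unlock pair leaves the
protected statements at their original indentation, so conversions show up as
small diffs instead of whole blocks being removed and re-added.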