From: Eduard Zingerman <eddyz87-AT-gmail.com>
To: bpf-AT-vger.kernel.org, ast-AT-kernel.org
Subject: [PATCH RFC bpf-next 0/6] bpf: better error reporting when verifier hits 1M instructions limit
Date: Tue, 26 May 2026 02:30:01 -0700
Message-ID: <20260526-better-1m-reporting-v1-0-51e4f2c59780@gmail.com>
Cc: andrii-AT-kernel.org, daniel-AT-iogearbox.net, martin.lau-AT-linux.dev, kernel-team-AT-fb.com, yonghong.song-AT-linux.dev, Eduard Zingerman <eddyz87-AT-gmail.com>
When the BPF verifier exceeds the 1M instruction budget, the current
error output shows whichever execution trace happens to be active at
that moment, which is of little help for debugging.
This series improves the error report using a profiler-inspired
approach: collect and count "callchain" stack traces that the verifier
visits during program validation, and report the top 3 hottest traces
when the budget is exhausted. To minimize the performance and memory
impact of such profiling, only collect samples when the verifier visits
loop headers, iterator next, may_goto and callback-calling instructions.
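For illustration only, the counting scheme described above might be
sketched in userspace C as follows. The names used here (struct
callchain, struct profile, profile_hit(), profile_top()) are
hypothetical and are not taken from the patches; a fixed-size table
with linear search stands in for whatever data structure the series
actually uses.

```c
#include <assert.h>
#include <string.h>

#define MAX_DEPTH  8	/* assumed limit on call depth for this sketch */
#define MAX_CHAINS 64	/* assumed table size */
#define TOP_N      3	/* number of hottest callchains to report */

/* Hypothetical sample key: instruction indices, outermost frame first. */
struct callchain {
	int depth;
	int insn_idx[MAX_DEPTH];
	unsigned long hits;
};

struct profile {
	int cnt;
	struct callchain chains[MAX_CHAINS];
};

/* Record one visit of the callchain described by insns[0..depth-1]. */
static void profile_hit(struct profile *p, const int *insns, int depth)
{
	int i;

	assert(depth > 0 && depth <= MAX_DEPTH);
	for (i = 0; i < p->cnt; i++) {
		struct callchain *c = &p->chains[i];

		if (c->depth == depth &&
		    memcmp(c->insn_idx, insns, depth * sizeof(int)) == 0) {
			c->hits++;
			return;
		}
	}
	if (p->cnt == MAX_CHAINS)
		return;	/* table full: drop the sample */
	p->chains[p->cnt].depth = depth;
	memcpy(p->chains[p->cnt].insn_idx, insns, depth * sizeof(int));
	p->chains[p->cnt].hits = 1;
	p->cnt++;
}

/*
 * Fill top[] with indices of up to TOP_N chains, hottest first.
 * Returns the number of entries written.
 */
static int profile_top(const struct profile *p, int top[TOP_N])
{
	int n = 0, i, j, k;

	for (i = 0; i < p->cnt; i++) {
		/* find the slot this chain would occupy in top[] */
		for (j = 0; j < n; j++)
			if (p->chains[i].hits > p->chains[top[j]].hits)
				break;
		if (j == TOP_N)
			continue;	/* colder than all current entries */
		if (n < TOP_N)
			n++;
		for (k = n - 1; k > j; k--)
			top[k] = top[k - 1];
		top[j] = i;
	}
	return n;
}
```

In this sketch profile_hit() would be called only at the sampling
points listed above, and profile_top() only once, when the budget is
exhausted, so the cost of selecting the top entries does not matter.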
For callchains ending at iterator next, may_goto, or callback-calling
instructions, identify which registers or stack slots most frequently
differ between cached and current states.
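As a rough sketch of the idea, and not the code from this series, the
per-register accounting could look like the following. The verifier's
register state is far richer than a single value per register (types,
bounds, ids), and stack slots are omitted here; struct reg_snapshot,
struct diff_stats, diff_account() and diff_most_varying() are
hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdint.h>

#define NR_REGS 11	/* BPF registers R0..R10 */

/* Simplifying assumption: one scalar stands in for each register state. */
struct reg_snapshot {
	uint64_t regs[NR_REGS];
};

/* Per-register count of mismatches seen at one hot callchain. */
struct diff_stats {
	unsigned long diffs[NR_REGS];
};

/* Compare a cached state with the current one and count differing regs. */
static void diff_account(struct diff_stats *s,
			 const struct reg_snapshot *cached,
			 const struct reg_snapshot *cur)
{
	int i;

	assert(s && cached && cur);
	for (i = 0; i < NR_REGS; i++)
		if (cached->regs[i] != cur->regs[i])
			s->diffs[i]++;
}

/* Return the register that differed most often, or -1 if none did. */
static int diff_most_varying(const struct diff_stats *s)
{
	unsigned long best_cnt = 0;
	int best = -1, i;

	for (i = 0; i < NR_REGS; i++) {
		if (s->diffs[i] > best_cnt) {
			best_cnt = s->diffs[i];
			best = i;
		}
	}
	return best;
}
```

A register that differs on nearly every comparison is a likely reason
why cached states fail to prune the current one, which is what a line
such as "Most varying: R7" in the report is meant to point at.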
Here is an example of the report for scx lavd_dispatch, with the
verifier limited to 200K instructions to trigger the error:
lavd_dispatch():
; void BPF_STRUCT_OPS(lavd_dispatch, s32 cpu, struct task_struct *prev) @ main.bpf.c:889
... disassembly ...
consume_task():
; bool consume_task(u64 cpu_dsq_id, u64 cpdom_dsq_id) @ balance.bpf.c:410
... disassembly ...
#1 most visited simulated stacktrace (visited 1807 times):
lavd_dispatch/124 (.../scx/scheds/rust/scx_lavd/src/bpf/main.bpf.c:1107)
consume_task/2715 (.../scx/scheds/rust/scx_lavd/src/bpf/balance.bpf.c:316)
#2 most visited simulated stacktrace (visited 1682 times):
lavd_dispatch/124 (.../scx/scheds/rust/scx_lavd/src/bpf/main.bpf.c:1107)
consume_task/2994 (.../scx/scheds/rust/scx_lavd/src/bpf/balance.bpf.c:386)
#3 most visited simulated stacktrace (visited 8 times):
lavd_dispatch/255 (.../scx/scheds/rust/scx_lavd/src/bpf/main.bpf.c:1022)
Most varying: R7 (frame 0)
BPF program is too large. Processed 200001 insn
---
Eduard Zingerman (6):
bpf: move live registers and scc printout to a standalone function
bpf: compute loops hierarchy
selftests/bpf: test cases for loop hierarchy computation
bpf: report hot simulated callchains when 1M instructions limit is met
bpf: report register diff summary for hot callchains
selftests/bpf: test budget exhaustion profiling report
include/linux/bpf_verifier.h | 38 ++++
kernel/bpf/Makefile | 2 +-
kernel/bpf/fixups.c | 5 +
kernel/bpf/liveness.c | 22 +-
kernel/bpf/loops.c | 184 +++++++++++++++++
kernel/bpf/states.c | 174 ++++++++++++++--
kernel/bpf/verifier.c | 230 +++++++++++++++++++++
tools/testing/selftests/bpf/prog_tests/verifier.c | 4 +
.../selftests/bpf/progs/verifier_budget_report.c | 175 ++++++++++++++++
.../selftests/bpf/progs/verifier_live_stack.c | 2 +-
.../selftests/bpf/progs/verifier_loop_hierarchy.c | 223 ++++++++++++++++++++
11 files changed, 1014 insertions(+), 45 deletions(-)
---
base-commit: 8496d9020ff37a33c2a7b2fc84350fd03ffbde78
change-id: 20260525-better-1m-reporting-1d795a21cf72