Some thoughts
Some thoughts
Posted Jul 17, 2025 5:19 UTC (Thu) by irogers (subscriber, #121692)Parent article: SFrame-based stack unwinding for the kernel
Some other thoughts:
- Why a new encoding format, why not have a subset of dwarf that matches the limited stack frame encoding supported by sframes? Dwarf is compressed and there may be potential for reuse of the ELF section in areas like C++ exception delivery.
- When perf is invoked to record system wide for a period of time like `perf record -e cycles -a sleep 10` any system call that hasn't transitioned to user code will only have the kernel side of the stack trace.
- The bpf helper bpf_get_stackid with BPF_F_USER_STACK will just return a placeholder value which may break or cause additional memory usage in BPF programs.
- JITs need to be taught to generate/register/remove SFrames or perhaps some kind of JIT interface like GDB's can be devised.
"Unfortunately, frame pointers hurt performance; they occupy a CPU register, and must be saved and restored on each function call." A register can be freed by holding the frame pointer in thread-local storage. Function call cost is addressed by inlining. x86 lacks any callee-save floating point registers which is likely a greater performance issue and unfortunately not addressed by the upcoming APX changes. A greater performance issue is the -fno-omit-frame-pointers is less well optimized by C compilers, for example, missing tail call optimizations.
Last year the return of frame pointers was heralded:
https://www.brendangregg.com/blog/2024-03-17/the-return-o...
There have been calling conventions like ARM32's APCS that have very dense debug info but also would be reasonably easy to reverse engineer by walking the code forward until the function return is seen. Perhaps the code itself is the best debug format.