Comparing SystemTap and bpftrace

Posted Apr 20, 2021 1:22 UTC (Tue) by ringerc (subscriber, #3071)
In reply to: Comparing SystemTap and bpftrace by ringerc
Parent article: Comparing SystemTap and bpftrace

I split the details into a child comment so the main one wouldn't be too long.

SystemTap is widely available even for older systems, though the packaged versions are usually old, so it's a bit of a pain to write scripts that work with them. It's easy to compile from source if you're allowed to install the needed toolchain and dependencies on the target and you have the time, but that adds to the hassle, especially when you're not hands-on and you just want the other end (a customer, or whoever) to run a tapscript for you.

Additionally, for its default and most fully featured runtime (kmod), SystemTap requires kernel headers and preferably debuginfo. These are frequently unavailable for whatever older kernel point release happens to be running on the target system at the time you need to run some tracing tools. At best, you have to go digging manually through some archive of old packages that have aged out of the main repositories for the OS; the stap-prep tool usually can't find them for you. So to reliably use SystemTap's kmod runtime you need to plan ahead and install kernel headers and debuginfo whenever you update the kernel, which nobody ever does. This drastically limits its practical utility.
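As a rough illustration of the "plan ahead" check, here is a minimal Python sketch that reports whether headers and debuginfo for the running kernel appear to be installed. The function name kmod_prereqs and its root parameter are made up for this example, and the debuginfo locations are only the two layouts I know of (Fedora/RHEL-style and Debian/Ubuntu-style); other distributions may put the files elsewhere, so treat a negative result as a hint rather than proof.

```python
import os
import platform


def kmod_prereqs(release=None, root="/"):
    """Report which kmod-runtime prerequisites appear to be installed.

    `release` defaults to the running kernel (same string as `uname -r`).
    `root` is a hypothetical knob so the check can be pointed at a chroot.
    """
    release = release or platform.release()

    # The kernel build tree (headers) is conventionally reachable here,
    # usually via a symlink; isdir() follows symlinks.
    headers = os.path.join(root, "lib/modules", release, "build")

    # Assumed locations of the debuginfo vmlinux; not exhaustive.
    debug_candidates = [
        os.path.join(root, "usr/lib/debug/lib/modules", release, "vmlinux"),
        os.path.join(root, "usr/lib/debug/boot", "vmlinux-" + release),
    ]

    return {
        "release": release,
        "headers": os.path.isdir(headers),
        "debuginfo": any(os.path.isfile(p) for p in debug_candidates),
    }


if __name__ == "__main__":
    print(kmod_prereqs())
```

Something like this could be run from a post-kernel-update hook to flag a missing package while it is still in the main repositories.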

eBPF has its own problems: lots of eBPF features and helper functions are only available in much newer kernels. On widely deployed "enterprise" system kernels, eBPF is basically useless for nontrivial userspace tracing and analysis. It is also quite fragile in the face of kernel version changes as soon as you step outside the canned tracepoints, and the set of helper functions is extremely limited.

Even if you can run your bpf scripts, your userspace stacks are going to look like "-" most of the time, because everything is compiled with -fomit-frame-pointer. AFAICS most bpf tools don't handle external DWARF debuginfo or use tools like libunwind to help them out. So you end up having to recompile with -fno-omit-frame-pointer and use unstripped binaries with debuginfo in the main binary. This basically means you can't do much tracing of the packaged userspace binaries that are the norm on production systems.

SystemTap, on the other hand, will not only get you your userspace stacks using detached DWARF debuginfo, it'll now even talk to a debuginfod server to download symbols for you during probe compilation. It'll walk userspace pointer chains, examine struct members, recursively print structs, handle unions and so much more using simple built-in syntax. So it's currently infinitely more powerful for userspace probing and analysis ...
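For anyone unfamiliar with how detached debuginfo gets located, here is a minimal Python sketch of the usual build-ID conventions: the on-disk path under /usr/lib/debug/.build-id/ and the debuginfod web API path /buildid/BUILDID/debuginfo. This is not SystemTap's own lookup code, just the naming scheme; the function names and the example.org server are hypothetical.

```python
def detached_debuginfo_path(build_id, debug_root="/usr/lib/debug"):
    """Conventional on-disk path of the detached debuginfo for a build ID.

    The first two hex digits become a directory name and the rest becomes
    the file name, with a ".debug" suffix.
    """
    build_id = build_id.lower()
    if len(build_id) < 3 or any(c not in "0123456789abcdef" for c in build_id):
        raise ValueError("build ID must be a hex string of at least 3 digits")
    return f"{debug_root}/.build-id/{build_id[:2]}/{build_id[2:]}.debug"


def debuginfod_url(server, build_id):
    """URL a debuginfod client would request for the same build ID."""
    return f"{server.rstrip('/')}/buildid/{build_id.lower()}/debuginfo"


if __name__ == "__main__":
    # Hypothetical, shortened build ID and server, for illustration only.
    print(detached_debuginfo_path("ABCDEF0123"))
    print(debuginfod_url("https://debuginfod.example.org/", "abcdef0123"))
```

The build ID itself can be read from a binary with `readelf -n`, which prints the GNU build ID note if the binary has one.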

... or it would be if only you could find and install the kernel headers.

SystemTap also has 'dyninst' and 'bpf' runtimes, which entirely avoid the need for kernel headers and can often be used without kernel debuginfo. But a considerable number of the built-in SystemTap "tapsets" rely on embedded-C code written for kernelspace, which simply won't work for a dyninst or bpf tapscript, or they rely on helper functions exported by the kmod runtime that are not implemented for the dyninst or bpf runtimes. So in practice most of your existing SystemTap scripts won't work, and scripts are more difficult to write for the dyninst or bpf runtimes.

Additionally, the dyninst runtime requires that you wrap the target using LD_PRELOAD. So it's cool for development and QA work, but for a production system it's often impractical, as you frequently want to non-intrusively trace an already-running server process.

This means you usually can't count on being able to apply eBPF, or SystemTap with any of its runtimes, to an arbitrary system you encounter in the wild.