LWN: Comments on "Relief for insomniac tracepoints" https://lwn.net/Articles/835426/ This is a special feed containing comments posted to the individual LWN article titled "Relief for insomniac tracepoints". en-us Sat, 04 Oct 2025 10:39:08 +0000 Sat, 04 Oct 2025 10:39:08 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Relief for insomniac tracepoints https://lwn.net/Articles/835754/ https://lwn.net/Articles/835754/ ringerc <div class="FormattedComment"> <font class="QuotedText">&gt; There would be value, though, in the ability to look at user-space data in a tracepoint handler as well.</font><br> <p> Gee, really?<br> <p> Kernel people have been saying &quot;Systemtap is obsolete! Use ebpf-tools! Use bpftrace!&quot;<br> <p> So I tried to convert a few of my simplest trace programs from systemtap to run as bpftrace scripts and found that it was *not possible to read the value of a short null-terminated C string from userspace in bpftrace*. At least when I tried it.<br> <p> perf isn&#x27;t a lot better there either. Got static tracepoints with char* arguments? Good luck with that. It doesn&#x27;t know how to capture the duration of a syscall without a lot of help or post-processing either. (strace -c can, but is expensive and limited).<br> <p> I can&#x27;t help but wish various interested parties would actually finish one trace framework before replacing it with another that - again - only services the needs of kernel hackers.<br> <p> So yeah. I can&#x27;t say I&#x27;m entirely shocked that it might be desirable to read userspace memory when doing full-system tracing, complex program flow analysis, targeted performance work, etc.<br> </div> Sun, 01 Nov 2020 13:44:02 +0000 Access user space data without sleeping https://lwn.net/Articles/835711/ https://lwn.net/Articles/835711/ simcop2387 <div class="FormattedComment"> I don&#x27;t know how it relates to tracepoints and other things, but this is mostly true. 
eBPF can do the dereferencing you mentioned, but cBPF can&#x27;t. This is one of the things that differs for seccomp()&#x27;s cBPF programs.<br> </div> Sat, 31 Oct 2020 02:17:46 +0000 Access user space data without sleeping https://lwn.net/Articles/835690/ https://lwn.net/Articles/835690/ danobi <div class="FormattedComment"> <font class="QuotedText">&gt; Thus, for example, a tracepoint running on entry to the openat2() system call can see the pointer to the open_how structure passed by user space, but is unable to examine the contents of the structure itself. </font><br> <p> IIUC, this is incorrect. BPF programs can dereference userspace memory with bpf_probe_read{,_user}. It&#x27;s just that if that access would fault memory, the helper returns an error and the read memory is 0s. Unless the system is under memory pressure, I&#x27;ve usually only seen bpf_probe_read* fail on immutable strings stored in rodata.<br> <p> For example, if you run the following bpftrace one-liner:<br> <p> # bpftrace -e &#x27;tracepoint:syscalls:sys_enter_openat2 { printf(&quot;0x%x\n&quot;, args-&gt;how-&gt;flags) }&#x27; --btf -kk<br> ...<br> 0x40<br> 0x410002<br> 0x200000<br> <p> against `tools/testing/selftests/openat2/openat2_test`, things seem to work right.<br> <p> (the --btf flag resolves the tracepoint types, the -kk flag reports if bpf helpers return an error).<br> </div> Fri, 30 Oct 2020 21:48:01 +0000 Access user space data without sleeping https://lwn.net/Articles/835674/ https://lwn.net/Articles/835674/ compudj <div class="FormattedComment"> As far as my own comment is concerned, I&#x27;m discussing using a trace post-processing approach (or live trace streaming) through LTTng to analyze the behavior of a system either after the fact or in real-time (shortly after it has happened). 
There it is possible to reconstruct a model of the entire filesystem mounts and path hierarchy anywhere within the trace from a trace post-processing analysis tool.<br> <p> I did not have the eBPF vs KRSI use-cases in mind when writing that comment.<br> </div> Fri, 30 Oct 2020 16:10:11 +0000 Access user space data without sleeping https://lwn.net/Articles/835673/ https://lwn.net/Articles/835673/ walters <div class="FormattedComment"> Using BPF for security intersects at KRSI, right?<br> <a href="https://lwn.net/Articles/813261/">https://lwn.net/Articles/813261/</a><br> <p> Also doing things like looking at file paths should be known to be fairly flawed in general even if it weren&#x27;t racy just *loading* the path - bind mounts etc. can obscure what you&#x27;re seeing. The SELinux model of e.g. having `etc_t` for /etc avoids all races and problems with comparing file paths.<br> <p> </div> Fri, 30 Oct 2020 15:27:45 +0000 Access user space data without sleeping https://lwn.net/Articles/835618/ https://lwn.net/Articles/835618/ compudj <div class="FormattedComment"> One clarification: we do access user-space memory already from LTTng at system call enter/exit by using __copy_from_user_inatomic(). However, if the userspace pages are not paged in memory, the access fails and we either truncate (if our userspace strnlen fails when calculating a string length) or write zeroes into the trace rather than the user-space data.<br> <p> We found out however that for things like security-related tooling which rely on grabbing the open(2) file name argument, this behavior where we cannot read the user-space data in specific conditions (which I suspect can be controlled by user-space by carefully making sure the page is _not_ paged in memory) is bad for security-related system behavior analysis through tracing.<br> <p> Moreover, it&#x27;s bad for our continuous integration, because we have test-cases where system call parameters are expected in the trace. 
This makes the tests flaky because they can then fail spuriously depending on what is present or not in the page table.<br> <p> Of course, there are plenty of use-cases where it&#x27;s good enough that the user-space data show up when available, and it&#x27;s not a big deal if it happens to be missing in a few cases, but there are lots of tracing use-cases for live system monitoring which depend on having reliable data, and those require taking the page fault at system call entry/exit to fetch the user-space data.<br> </div> Fri, 30 Oct 2020 13:14:35 +0000 Access user space data without sleeping https://lwn.net/Articles/835615/ https://lwn.net/Articles/835615/ wEddy <div class="FormattedComment"> copy_from_user_nofault() maybe? bpftrace can use it via the bpf_probe_read_user() helper.<br> </div> Fri, 30 Oct 2020 11:23:39 +0000 Access user space data without sleeping https://lwn.net/Articles/835614/ https://lwn.net/Articles/835614/ epa <div class="FormattedComment"> Is there a way for kernel code to try accessing user space data, and either succeed immediately if it&#x27;s in RAM, or fail immediately if it would need to be paged in? That might be good enough for most tracepoints.<br> </div> Fri, 30 Oct 2020 07:54:16 +0000
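The "succeed immediately or fail immediately" behaviour discussed in this thread can be illustrated with a toy model. This is only a user-space sketch in Python, not kernel code: `ModelAddressSpace`, `read_user_nofault`, and the 4096-byte `PAGE_SIZE` are hypothetical names and values made up for illustration. It mimics what the comments above describe for non-faulting reads: if any touched page is not resident, the read returns an error and the caller sees zeroes instead of the user-space data.

```python
import errno

PAGE_SIZE = 4096  # assumed page size for this model only


class ModelAddressSpace:
    """Toy address space: page number -> page bytes, or None if paged out."""

    def __init__(self):
        self.pages = {}

    def map_page(self, page_no, data=None):
        # data=None models a mapped page that is not currently resident.
        if data is not None:
            data = data.ljust(PAGE_SIZE, b"\0")[:PAGE_SIZE]
        self.pages[page_no] = data


def read_user_nofault(space, addr, size):
    """Return (0, data) if all touched pages are resident.

    Otherwise return (-EFAULT, zero-filled buffer) without "sleeping"
    to bring the page in, which is the trade-off the thread discusses.
    """
    if size <= 0:
        return 0, b""
    first = addr // PAGE_SIZE
    last = (addr + size - 1) // PAGE_SIZE
    for page_no in range(first, last + 1):
        if space.pages.get(page_no) is None:
            return -errno.EFAULT, bytes(size)

    out = bytearray()
    pos, end = addr, addr + size
    while pos < end:
        page = space.pages[pos // PAGE_SIZE]
        off = pos % PAGE_SIZE
        take = min(PAGE_SIZE - off, end - pos)
        out += page[off:off + take]
        pos += take
    return 0, bytes(out)
```

In this model a tracer that calls `read_user_nofault()` on a path string gets the string when the page is resident and a zero-filled buffer otherwise, which is why the result is adequate for best-effort tracing but not for tooling that needs the data to be reliable.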