|
|
Subscribe / Log in / New account

Uprobes: 11th time is the charm?

By Jonathan Corbet
March 16, 2011
Last week's Kernel Page included an article about improving the ptrace() interface; the author of that work, Tejun Heo, was quoted as saying that part of the problem with ptrace() is that it has been starved of developer attention in favor of efforts to replace it entirely. One of those efforts is uprobes, which has also been featured on this page a few times. A new uprobes patch was posted on March 14; so this seems like a good time to have a look at it and further deprive ptrace() of attention. Uprobes looks like it is getting closer to acceptance, but it seems unlikely that the 11th revision will be the last.

The purpose of the uprobes subsystem is what one might expect: to enable the placement of probes into user-space executable process memory. These probes might be used to support a debugger like gdb (though uprobes is said to be unsuitable for use by gdb in its current form) or to support user-space tracing. This feature does thus duplicate some of the functionality provided by ptrace(), which will make its acceptance harder, especially since ptrace() is (more or less) a standardized interface. To succeed, uprobes will clearly have to do things better than ptrace() does.

The ptrace() interface is tied to processes; uprobes, instead, works with files. A probe is placed at a certain offset within a specific file; it will then trigger for every process which executes through the probe's location. If the code placing the probe is only interested in specific processes, it will need to filter the events itself. The interface may seem a little strange - users will probably almost always be interested in specific processes - but there are some advantages to doing things this way.

Underneath the hood, uprobes works by faulting in the page which will contain the probe. The instruction at the probe location is copied aside and replaced by a breakpoint. Every process which has that file mapped then gets a pointer in its mm structure pointing to the data describing the probe(s) for that file. When a process executes the breakpoint, the probe's handler function will be called; on that handler's return, the kernel will single-step the displaced instruction, then return to the location following the probe.

This "execute out of line" (XOL) mechanism has been controversial in the past because it requires the injection of a new virtual memory area (VMA) into every process which encounters probes. That VMA is seen as a distortion of the process's behavior which could have strange effects. The alternatives, though, are not entirely appealing either. The ptrace() approach is to put the original instruction back into its original location, execute it, then replace the breakpoint; that only works if every process which has the file mapped is stopped for the duration of the operation (otherwise they might execute the affected code while the breakpoint is missing). Uprobes, instead, is able to handle breakpoint hits without perturbing other processes. Another alternative discussed in the past is emulating the displaced instruction in the kernel; that requires having a full x86 emulator in kernel space, which is not entirely appealing either. So the current plan appears to be to stick with XOL.

Not having to stop the world when a breakpoint is hit is one of the advantages of uprobes, but there are others. It dispenses with the whole ptrace() mechanism involving signals, reparenting processes, and so on. Handling a probe hit does not require a context switch unless the probe itself does; many types of tracing tasks, for example, would never have to switch to another process. Uprobes also allows multiple applications to be tracing the same set of processes at the same time. All of these make the interface appealing to some users.

Who those users are is not clear to everybody, though. There is clearly some interest in the SystemTap camp, but the needs of SystemTap do not necessarily carry a lot of weight on linux-kernel. Thomas Gleixner put it this way:

And it does not matter at all whether systemtap can use this or not. If the main debuggers used like gdb are not going to use it then it's a complete waste. We don't need another debugging interface just for a single esoteric use case.

At times, gdb developers have indicated that they might be open to using a Linux-specific interface if there were advantages to doing so. Such use seems distant at the moment, though. More immediate users are likely to be found in the tracing community; uprobes opens the possibility of getting single stream of trace data covering both user and kernel space. ptrace() is not a useful interface for tracing, so something needs to be done (though there is still some disagreement over whether user-space tracing needs to involve the kernel at all). Uprobes might be that something.

In fact, this version of the uprobes patch includes an ftrace-based interface. Part 20 of the patch contains the entirety of the documentation for this feature, quoted below:

    # cd /sys/kernel/debug/tracing/
    # cat /proc/`pgrep  zsh`/maps | grep /bin/zsh | grep r-xp
    00400000-0048a000 r-xp 00000000 08:03 130904 /bin/zsh
    # objdump -T /bin/zsh | grep -w zfree
    0000000000446420 g    DF .text  0000000000000012  Base        zfree
    # echo 'p /bin/zsh:0x46420 %ip %ax' > uprobe_events
    # cat uprobe_events
    p:uprobes/p_zsh_0x46420 /bin/zsh:0x0000000000046420
    # echo 1 > events/uprobes/enable
    # sleep 20
    # echo 0 > events/uprobes/enable
    # cat trace

An actual document is listed as a "TODO" item. The current interface looks a bit painful to use, and it appears to be limited to printing register contents for now. A more flexible and better documented interface could prove useful, though, especially if (as planned) it also can be made to work with the perf events subsystem.

The comments on the patch set indicate some concern about whether the kernel needs the feature or not. But even the more critical reviewers have been going over the code pointing out small things - the kind of review one does when one wants to help the author get the code into shape for merging. This code will not be merged for 2.6.39, and, for this type of code, making predictions for merging at any definite time is a hazardous affair. But, given sufficient will, it seems like uprobes could be made ready for inclusion sometime this year.

Index entries for this article
KernelTracing
KernelUprobes


to post comments

Uprobes: 11th time is the charm?

Posted Mar 24, 2011 6:06 UTC (Thu) by kevinm (guest, #69913) [Link] (1 responses)

So, if I want to put a probe in libc.so, it ends up being executed by every process on the system?

Presumably I would have to have write permissions to libc to do so - so if gdb was to use this, I wouldn't be able to set breakpoints on library functions unless I'm root? That sounds like a nonstarter to me.

Uprobes: 11th time is the charm?

Posted Mar 24, 2011 17:41 UTC (Thu) by fuhchee (guest, #40059) [Link]

A future gdb interface for this functionality would not require nor give you any new privileges. Probes would be naturally restricted to your processes. No actual writing to the disk files is done.

Systemtap has the same properties w.r.t. the older utrace/uprobes it carries: a root user can operate systemwide, and an unprivileged user can operate upon her own processes.


Copyright © 2011, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds