
Uprobes in 3.5

By Jonathan Corbet
May 30, 2012
Uprobes is a kernel patch with a long story and many contentious discussions behind it. This code has its roots in utrace, a user-space tracing and debugging API that was first covered here in early 2007. Utrace ran into various types of opposition (only partly related to its own origin in SystemTap) and has never been merged, but a piece of it lives on in the form of uprobes, which is charged with the placement of probes into user-space code. After several mailing-list rounds of its own, uprobes was finally merged for the 3.5 kernel development cycle. Just how this facility will be used remains to be seen, however.

At the core of uprobes is this function:

    #include <linux/uprobes.h>

    int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *uc);

The inode structure represents an executable file; the probe is to be placed at offset bytes from the beginning. The uprobe_consumer structure tells the kernel what is to be done when a process encounters the probe; it looks like:

    struct uprobe_consumer {
	int (*handler) (struct uprobe_consumer *self, struct pt_regs *regs);
	bool (*filter) (struct uprobe_consumer *self, struct task_struct *task);
	struct uprobe_consumer *next;
    };

The filter() function is optional; if it exists, it determines whether handler() is called for each specific hit on the probe. The handler returns an int, but the return value is ignored in the current code.
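The relationship between filter() and handler() can be modeled in ordinary user-space C. What follows is a sketch rather than kernel code: the stub struct pt_regs and struct task_struct, the model_probe_hit() function, and the demo_* consumer are all invented for illustration. It also assumes that consumers attached to the same probe are chained through the next pointer and visited in order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-ins for the kernel types; the real ones are far richer. */
struct pt_regs { unsigned long ip; };
struct task_struct { int pid; };

/* Same layout as the structure shown in the article. */
struct uprobe_consumer {
	int (*handler)(struct uprobe_consumer *self, struct pt_regs *regs);
	bool (*filter)(struct uprobe_consumer *self, struct task_struct *task);
	struct uprobe_consumer *next;
};

/*
 * Hypothetical model of a probe hit: walk the consumer list, skip any
 * consumer whose filter() rejects the task, and ignore the value that
 * handler() returns. Returns the number of handlers that ran.
 */
int model_probe_hit(struct uprobe_consumer *list,
		    struct task_struct *task, struct pt_regs *regs)
{
	int ran = 0;

	for (struct uprobe_consumer *uc = list; uc != NULL; uc = uc->next) {
		assert(uc->handler != NULL);	/* handler is mandatory */
		if (uc->filter != NULL && !uc->filter(uc, task))
			continue;		/* filter said no */
		(void)uc->handler(uc, regs);	/* return value ignored */
		ran++;
	}
	return ran;
}

/* Example consumer: count hits, but only for one made-up PID. */
int demo_hits;

int demo_handler(struct uprobe_consumer *self, struct pt_regs *regs)
{
	(void)self;
	(void)regs;
	demo_hits++;
	return 0;
}

bool demo_filter(struct uprobe_consumer *self, struct task_struct *task)
{
	(void)self;
	return task->pid == 1234;
}
```

A consumer built from demo_handler() and demo_filter() would count hits from PID 1234 only; clearing the filter pointer makes every hit count.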

Since probes are associated with files, they affect all processes that run code from those files. A special copy is made of the page to contain the probe; in that copy, the instruction at the specified offset is copied and replaced by a breakpoint. When the breakpoint is hit by a running process, filter() will be called if present, and handler() will be run unless the filter said otherwise. Then the displaced instruction is executed (using the "execute out of line" mechanism described in this article) and control returns to the instruction following the breakpoint.
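The copy-and-replace step can be pictured with a small user-space model. This is only a sketch under simplifying assumptions: the names (model_probe, model_install_probe(), MODEL_PAGE_SIZE) are made up, the breakpoint is taken to be the one-byte x86 int3 opcode, and only the first byte of the displaced instruction is saved, whereas the kernel must keep the whole instruction for out-of-line execution.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096
#define X86_INT3 0xcc	/* one-byte x86 breakpoint opcode */

/* Hypothetical record of one installed probe. */
struct model_probe {
	size_t offset;	/* offset of the probed instruction in the page */
	uint8_t saved;	/* first byte of the displaced instruction */
};

/*
 * Copy the original page, remember the byte at the probe offset, and
 * write a breakpoint there in the copy only. The original page is left
 * untouched.
 */
void model_install_probe(const uint8_t *orig, uint8_t *copy,
			 struct model_probe *p, size_t offset)
{
	assert(offset < MODEL_PAGE_SIZE);
	memcpy(copy, orig, MODEL_PAGE_SIZE);
	p->offset = offset;
	p->saved = copy[offset];
	copy[offset] = X86_INT3;
}
```

Keeping the saved byte alongside the offset is what would allow the original instruction to be run (or restored) later; the page backing the file itself never changes.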

Uprobes thus implements a mechanism by which a kernel function can be invoked whenever a process executes a specific instruction location. One could imagine a number of things that said kernel function could do; there has been talk, for example, of using uprobes (and, perhaps someday, something derived from utrace) as a replacement for the much-maligned ptrace() system call. Tools like GDB could place breakpoints with uprobes; it might even be possible to load simple filters for conditional breakpoints into the kernel, speeding their execution considerably. Uprobes could also someday be a component of a DTrace-like dynamic tracing facility. For now, though, the interfaces for that kind of feature have not been added to the kernel; none have even been proposed.

What the current implementation does have is integration with the perf events subsystem. New dynamic "events" can be added to any file location via an interface similar to that used for dynamic kernel tracepoints. In particular, there is a new file called uprobe_events in the tracing directory (/sys/kernel/debug/tracing/ on most systems) that is used to add and remove events. As an example, a line like:

    echo 'p:bashme /bin/bash:0x4245c0' > /sys/kernel/debug/tracing/uprobe_events

would place a new event (called "bashme") at location 0x4245c0 in the bash executable. The event would then appear with all other events in /sys/kernel/debug/tracing/events, in the uprobes subdirectory. Like other events, it is not actually turned on until its enabled attribute is set. See Documentation/trace/uprobetracer.txt for details on the interface at this level.
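A program that wants to create such events needs to build the same definition string that the echo command writes. A minimal sketch follows; format_uprobe_event() is a hypothetical helper that covers only the simple "p:NAME PATH:0xOFFSET" form used in the example above, not the full syntax described in uprobetracer.txt.

```c
#include <stdio.h>

/*
 * Build "p:NAME PATH:0xOFFSET" into buf. Returns 0 on success, or -1 if
 * the result did not fit in len bytes (or snprintf() itself failed).
 */
int format_uprobe_event(char *buf, size_t len, const char *name,
			const char *path, unsigned long offset)
{
	int n = snprintf(buf, len, "p:%s %s:0x%lx", name, path, offset);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Called with "bashme", "/bin/bash", and 0x4245c0, the helper produces the string from the example; a suitably privileged program could then write it to the uprobe_events file.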

Placing uprobes is, by default, a privileged operation requiring the CAP_SYS_ADMIN capability. One can remove the privilege requirement by setting the perf_event_paranoid sysctl knob to -1, but doing so will allow the placement of dynamic tracepoints anywhere in the system, in kernel or user space. Thus, one need not be overly paranoid to leave perf_event_paranoid at its default setting.

The perf tool has been enhanced to make working with dynamic user-space tracepoints easy. One can, for example, set a tracepoint at the entry to the C library's malloc() implementation with:

    perf probe -x /lib64/libc.so.6 malloc

That tracepoint can then be treated like any other event understood by perf. See the explanatory text from Ingo Molnar's pull request for examples of what can be done.

Most kernel patches are conceived, implemented, reviewed, and merged into the mainline over a fairly short period of time. But some of them seem to languish for years without making much progress. Uprobes was such a patch set. It must have been frustrating for the developers to keep revising and posting this code, only to see it shot down over and over again. But the kernel community can be supportive of developers who show both persistence and a willingness to listen to criticism. The result, in this case, is a user-space probing mechanism that has been simplified, made more robust, and integrated into the existing events infrastructure. Hopefully it was worth the wait.




Uprobes in 3.5

Posted Jun 3, 2012 20:11 UTC (Sun) by razb (guest, #43424) [Link] (1 responses)

I think it would be nice to have a TUI/GUI for this tool. It would be difficult for a regular user to set a trace.

Uprobes in 3.5

Posted Dec 12, 2012 9:49 UTC (Wed) by andreoli (subscriber, #20174) [Link]

Hi,

you might want to try out fulltrace, available at the following address:
https://github.com/andreoli/fulltrace

Fulltrace is a complete program, library and kernel tracer. Given a command, it dynamically finds all functions invoked by it, by any library it uses and by the kernel. It only requires a recent Linux kernel (>=3.5) compiled with ftrace and uprobes support. Note: this is still very experimental (consider it "proof-of-concept" code) and needs a lot of work.
Any suggestion is more than welcome.

Uprobes in 3.5

Posted Jun 5, 2012 13:45 UTC (Tue) by fuhchee (guest, #40059) [Link]

"it has been simplified, made more robust, and integrated into the existing events infrastructure"

While the last of those is definitely true, the first may just be due to equivalent functionality being deferred, and the second is way too early to say.

Uprobes in 3.5

Posted Jun 9, 2012 22:54 UTC (Sat) by slashdot (guest, #22014) [Link]

Isn't using perf_paranoid for this a disastrous choice?

AFAICT uprobe insertion is equivalent to root, since you can modify bytes in the middle of instructions, and thus alter the behavior of any process.

On the other hand, currently perf_paranoid only gives access to the PMU, which can normally only be used to DoS the system at worst.

Uprobes in 3.5

Posted May 19, 2019 9:31 UTC (Sun) by uronce (guest, #102007) [Link] (1 responses)

Setting a uprobe for a binary does not modify the on-disk binary file directly.
It sets the probe for (inode, offset) in a kernel-internal data structure, where inode is the inode of the binary file and offset is the offset of the instruction within that file.
Since the binary is memory mapped for execution, when the probed instruction is accessed through mmap, https://elixir.bootlin.com/linux/latest/source/kernel/eve... will replace the original instruction with "0xcc" (int3).
So if a uprobe is set for binary /tmp/test, then mv /tmp/test /home/test, then run /home/test, the probe still works, because the inode does not change after mv.



Copyright © 2012, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds