The perf tool for performance analysis is adding functionality quickly. Since being added to the mainline in 2.6.31, primarily as a means to access various CPU performance counters, it has expanded its scope. Support for treating kernel tracepoint events like performance counter events came into the kernel at around the same time. More recently, though, Tom Zanussi has added support for using perl and python scripts with the perf tool, making it even easier to do sophisticated processing of perf events.
The perl support is already in the mainline, but Zanussi added a python scripting engine more recently. Interpreters for both perl and python can be embedded into the perf executable, which allows processing the raw perf trace data stream in either of those languages.
The perl scripting can be used from the 2.6.33-rc series, but the python support is only available by applying Zanussi's patches to the tip tree. Building perf in the tools/perf directory, which requires development versions of various libraries and tools (glibc, elfutils, libdwarf, perl, python, etc.), then gives access to the new functionality.
Several example scripts are provided with perf; they can be listed from perf itself:
# perf trace -l
List of available trace scripts:
syscall-counts [comm] system-wide syscall counts
syscall-counts-by-pid [comm] system-wide syscall counts, by pid
failed-syscalls-by-pid [comm] system-wide failed syscalls, by pid
workqueue-stats workqueue stats (ins/exe/create/destroy)
check-perf-trace useless but exhaustive test script
failed-syscalls [comm] system-wide failed syscalls
wakeup-latency system-wide min/max/avg wakeup latency
rw-by-file <comm> r/w activity for a program, by file
rw-by-pid system-wide r/w activity
This list is a mix of perl and python scripts that live in the
tools/perf/scripts/{perl,python} directories and get installed in
the proper location (/root/libexec by default) after a make
install.
The scripts themselves are largely generated by the perf trace command. Zanussi's documentation for perf-trace-perl and perf-trace-python explains how to use perf trace to create skeleton scripts, which can then be edited to add the required functionality. Adding two helper shell scripts (one for recording, one for reporting) to the appropriate directory makes a new script show up in the list produced by perf trace -l, as shown above.
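As a rough sketch of what such a script might look like once the skeleton has been filled in, the Python below counts failed system calls per command. The layout (one handler function per event, named subsystem__event, plus optional trace_begin() and trace_end() hooks) follows the convention described in the perf-trace-python documentation, but the specific event and field names used here are assumptions that depend on the kernel in use. The block at the bottom feeds hand-made events to the handler, so the file also runs outside of perf.

```python
# Sketch of a perf trace Python script. The event and field names are
# assumptions modeled on the raw_syscalls:sys_exit tracepoint.
from collections import defaultdict

errors_by_comm = defaultdict(int)


def trace_begin():
    # Called once before the first event is processed.
    errors_by_comm.clear()


def raw_syscalls__sys_exit(event_name, context, common_cpu, common_secs,
                           common_nsecs, common_pid, common_comm, id, ret):
    # Called once per sys_exit event; a negative return value is a failure.
    # ("id" shadows the builtin, but mirrors the tracepoint's field name.)
    if ret < 0:
        errors_by_comm[common_comm] += 1


def trace_end():
    # Called once after the last event; print the summary table.
    print("%-20s %10s" % ("comm", "# errors"))
    for comm, count in sorted(errors_by_comm.items(),
                              key=lambda kv: kv[1], reverse=True):
        print("%-20s %10d" % (comm, count))


if __name__ == "__main__":
    # Hand-made (comm, pid, syscall id, return value) tuples standing in
    # for the event stream that perf would deliver.
    fake_events = [("firefox", 10144, 0, -11), ("firefox", 10144, 0, 5),
                   ("emacs", 2211, 5, -2), ("firefox", 10147, 240, -110)]
    trace_begin()
    for comm, pid, nr, ret in fake_events:
        raw_syscalls__sys_exit("raw_syscalls__sys_exit", None, 0, 0, 0,
                               pid, comm, nr, ret)
    trace_end()
```

Keeping all state in module-level variables and printing only in trace_end() matches the post-processing model: the handler is invoked once per recorded event, and the report is produced after the whole stream has been consumed.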
The installed scripts can then be used as follows:
# perf trace record failed-syscalls
^C[ perf record: Woken up 11 times to write data ]
[ perf record: Captured and wrote 1.939 MB perf.data (~84709 samples) ]
This captures the perf data into the appropriately named perf.data
file, which can then be processed by:
# perf trace report failed-syscalls
perf trace started with Perl script \
/root/libexec/perf-core/scripts/perl/failed-syscalls.pl
failed syscalls, by comm:
comm # errors
-------------------- ----------
firefox 1721
claws-mail 149
konsole 99
X 77
emacs 56
[...]
failed syscalls, by syscall:
syscall # errors
------------------------------ ----------
sys_read 2042
sys_futex 130
sys_mmap_pgoff 71
sys_access 33
sys_stat64 5
sys_inotify_add_watch 4
[...]
# perf trace report failed-syscalls-by-pid
perf trace started with Python script \
/root/libexec/perf-core/scripts/python/failed-syscalls-by-pid
syscall errors:
comm [pid] count
------------------------------ ----------
firefox [10144]
syscall: sys_read
err = -11 1589
syscall: sys_inotify_add_watch
err = -2 4
firefox [10147]
syscall: sys_futex
err = -110 7
[...]
This simple example uses the failed-syscalls script to
gather the data, then processes it with the corresponding perl script as
well as a compatible python script (failed-syscalls-by-pid) that slices the same data somewhat
differently. The first report counts the system calls that
failed during the few seconds the trace was active, broken down
by process as well as by system call.
The second report combines the two: it shows each process along with the system calls that failed for it, the error returned, and how many times that happened. There are also corresponding scripts that count all system calls, not just those that failed, and report on them similarly. Wakeup latency, file read/write activity, and workqueue statistics are the focus of some of the other provided scripts.
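The per-process breakdown in the second report comes down to a nested count keyed by command, pid, system call, and error value. The following is a minimal sketch of that aggregation, not the code of the shipped script; the function names are hypothetical and the sample events are made up, though shaped like the output above.

```python
from collections import defaultdict


def new_counts():
    # Nested mapping: comm -> pid -> syscall -> error value -> count.
    return defaultdict(
        lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(int))))


def record_failure(counts, comm, pid, syscall, ret):
    # Only negative return values are treated as failures.
    if ret < 0:
        counts[comm][pid][syscall][ret] += 1


def format_report(counts):
    # Flatten the nested counts into indented report lines.
    lines = []
    for comm in sorted(counts):
        for pid in sorted(counts[comm]):
            lines.append("%s [%d]" % (comm, pid))
            for syscall in sorted(counts[comm][pid]):
                lines.append("  syscall: %s" % syscall)
                for err, n in sorted(counts[comm][pid][syscall].items()):
                    lines.append("    err = %d  %d" % (err, n))
    return lines


if __name__ == "__main__":
    counts = new_counts()
    # Made-up events: (comm, pid, syscall, return value).
    events = [("firefox", 10144, "sys_read", -11)] * 3 + [
        ("firefox", 10144, "sys_inotify_add_watch", -2),
        ("firefox", 10147, "sys_futex", -110),
        ("firefox", 10147, "sys_futex", 0),  # success, not counted
    ]
    for comm, pid, syscall, ret in events:
        record_failure(counts, comm, pid, syscall, ret)
    print("\n".join(format_report(counts)))
```

The same structure, minus the check on the return value, would also serve for scripts that count all system calls rather than only the failed ones.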
These scripting features will make it that much easier for kernel hackers—or possibly those who aren't—to access the perf functionality. The state of tracing and instrumentation in the kernel has been quick to develop over the last few development cycles. It doesn't look to be slowing down anytime soon.
Scripting support for perf
Posted Feb 11, 2010 6:34 UTC (Thu) by prasadkr (subscriber, #44457) [Link]
live reporting of data
Posted Feb 11, 2010 9:42 UTC (Thu) by mjw (subscriber, #16740) [Link]
You would need pre-filtering/aggregating for that, not post-process scripting. You can do something like that with for example systemtap which filters and can aggregate values at probe point hit time, so the only data being recorded is that which is needed for the live reporting. e.g. live (top like) reporting failed syscalls with argument strings would be done by errsnoop.stp
$ stap errsnoop.stp
SYSCALL PROCESS PID HITS ERRSTR ARGSTR
inotify_add_watch gdm-simple-gree 2569 2 13 (EACCES) 18, "/home/mark", 16789454
open hald-addon-stor 2178 1 123 (ENOMEDIUM) "/dev/sdh", O_RDONLY
open hald-addon-stor 2175 1 123 (ENOMEDIUM) "/dev/sde", O_RDONLY
open hald-addon-stor 2177 1 123 (ENOMEDIUM) "/dev/sdg", O_RDONLY
open hald-addon-stor 2174 1 123 (ENOMEDIUM) "/dev/sdd", O_RDONLY
open hald-addon-stor 2176 1 123 (ENOMEDIUM) "/dev/sdf", O_RDONLY
open sendmail 2291 1 6 (ENXIO) "/proc/loadavg", O_RDONLY
And you can then simply let it run to see live what silly things user space programs are doing. Some other systemtap process examples.
Scripting support for perf
Posted Feb 11, 2010 16:23 UTC (Thu) by trz (subscriber, #7752) [Link]
$ perf trace record myscript | perf trace report myscript
or more simply just get rid of the two steps and combine them into one:
$ perf trace myscript
Currently, perf isn't pipe-friendly mainly because of the header-read/write code, which pre-allocates space in the file and does a lot of seeking to fill in length and offset fields later. That works nicely for a file, but presents problems if you want to feed it into a pipe.
I think if some changes were made to that part of the code, the rest would follow naturally. I plan to look into it soon and hopefully post some patches to enable it.
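The seek-back pattern described in this comment can be illustrated with a small Python sketch. It is not perf's header code: the one-field header layout and the function names are hypothetical, and a POSIX system is assumed. Writing a placeholder, appending the data, and then seeking back to fill in the length works on a regular file, while the same sequence on a pipe fails at the seek with ESPIPE.

```python
import errno
import os
import struct
import tempfile

# Hypothetical header: a single 64-bit little-endian data length.
HEADER = struct.Struct("<Q")


def write_with_backpatch(fd, payload):
    # Reserve header space, write the data, then seek back and fill in
    # the length once it is known.
    os.write(fd, HEADER.pack(0))
    os.write(fd, payload)
    os.lseek(fd, 0, os.SEEK_SET)
    os.write(fd, HEADER.pack(len(payload)))


def backpatch_on_file(payload):
    # On a regular file the seek succeeds; return the patched length.
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        write_with_backpatch(fd, payload)
        os.lseek(fd, 0, os.SEEK_SET)
        (length,) = HEADER.unpack(os.read(fd, HEADER.size))
        return length


def backpatch_on_pipe(payload):
    # On a pipe the seek fails; return the errno that was raised, or
    # None if no error occurred.
    r, w = os.pipe()
    try:
        write_with_backpatch(w, payload)
    except OSError as e:
        return e.errno
    finally:
        os.close(r)
        os.close(w)
    return None


if __name__ == "__main__":
    print("file: header length =", backpatch_on_file(b"x" * 10))
    print("pipe: seek failed with ESPIPE =",
          backpatch_on_pipe(b"x" * 10) == errno.ESPIPE)
```

A stream-friendly format would need to avoid the backward seek, for instance by emitting lengths ahead of the data they describe; whether perf would take that route is not stated in the comment.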
Copyright © 2010, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds