
Scripting support for perf

By Jake Edge
February 10, 2010

The perf tool for performance analysis is adding functionality quickly. Since being added to the mainline in 2.6.31, primarily as a means to access various CPU performance counters, it has expanded its scope. Support for treating kernel tracepoint events like performance counter events came into the kernel at around the same time. More recently, though, Tom Zanussi has added support for using perl and python scripts with the perf tool, making it even easier to do sophisticated processing of perf events.

The perl support is already in the mainline, but Zanussi added a python scripting engine more recently. Interpreters for both perl and python can be embedded into the perf executable, which allows processing the raw perf trace data stream in either of those languages.

The perl scripting can be used from the 2.6.33-rc series, but the python support is only available by applying Zanussi's patches to the tip tree. Building perf in the tools/perf directory, which requires development versions of various libraries and tools (glibc, elfutils, libdwarf, perl, python, etc.), then gives access to the new functionality.

Several example scripts are provided with perf, which can be listed from perf itself:

    # perf trace -l
    List of available trace scripts:
      syscall-counts [comm]                system-wide syscall counts
      syscall-counts-by-pid [comm]         system-wide syscall counts, by pid
      failed-syscalls-by-pid [comm]        system-wide failed syscalls, by pid
      workqueue-stats                      workqueue stats (ins/exe/create/destroy)
      check-perf-trace                     useless but exhaustive test script
      failed-syscalls [comm]               system-wide failed syscalls
      wakeup-latency                       system-wide min/max/avg wakeup latency
      rw-by-file <comm>                    r/w activity for a program, by file
      rw-by-pid                            system-wide r/w activity

This list is a mix of perl and python scripts that live in the tools/perf/scripts/{perl,python} directories and get installed in the proper location (/root/libexec by default) after a make install.

The scripts themselves are largely generated by the perf trace command. Zanussi's documentation for perf-trace-perl and perf-trace-python explain the process of using perf trace to create the skeleton scripts, which can then be edited to add the required functionality. Adding two helper shell scripts (for recording and reporting) to the appropriate directory will add new scripts to the list produced by perf trace described above.
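As a rough sketch of what such a script looks like (this is illustrative, not one of the generated skeletons; the handler signature is an assumption based on the raw_syscalls:sys_exit tracepoint and may vary between perf versions), a python handler script is just a set of functions named after the events, plus optional trace_begin() and trace_end() hooks:

```python
# Sketch of a perf trace Python handler script, modeled on the
# skeletons that perf generates.  The handler signature below is an
# assumption based on the raw_syscalls:sys_exit tracepoint.

from collections import defaultdict

failed = defaultdict(int)   # (comm, syscall id) -> error count

def trace_begin():
    print("collecting failed syscalls...")

def raw_syscalls__sys_exit(event_name, context, common_cpu,
                           common_secs, common_nsecs, common_pid,
                           common_comm, id, ret):
    # A negative return value indicates a failed system call.
    if ret < 0:
        failed[(common_comm, id)] += 1

def trace_end():
    for (comm, syscall), count in sorted(failed.items()):
        print("%-20s syscall %3d: %d errors" % (comm, syscall, count))
```

perf calls trace_begin() once, invokes the matching handler for each event recorded in perf.data, and calls trace_end() at the end, so all of the aggregation logic lives in ordinary python data structures.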

The installed scripts can then be used as follows:

    # perf trace record failed-syscalls
    ^C[ perf record: Woken up 11 times to write data ]                         
    [ perf record: Captured and wrote 1.939 MB perf.data (~84709 samples) ]   

This captures the perf data into the appropriately named perf.data file, which can then be processed by:

    # perf trace report failed-syscalls
    perf trace started with Perl script \
	/root/libexec/perf-core/scripts/perl/failed-syscalls.pl


    failed syscalls, by comm:

    comm                    # errors
    --------------------  ----------
    firefox                     1721
    claws-mail                   149
    konsole                       99
    X                             77
    emacs                         56
    [...]

    failed syscalls, by syscall:

    syscall                           # errors
    ------------------------------  ----------
    sys_read                              2042
    sys_futex                              130
    sys_mmap_pgoff                          71
    sys_access                              33
    sys_stat64                               5
    sys_inotify_add_watch                    4
    [...]

    # perf trace report failed-syscalls-by-pid
    perf trace started with Python script \
	/root/libexec/perf-core/scripts/python/failed-syscalls-by-pid


    syscall errors:

    comm [pid]                           count
    ------------------------------  ----------

    firefox [10144]
      syscall: sys_read
	err = -11                         1589
      syscall: sys_inotify_add_watch
	err = -2                             4

    firefox [10147]
      syscall: sys_futex       
	err = -110                           7
    [...]

This simple example uses the failed-syscalls script to gather the data, then processes it with the corresponding perl script, as well as with a compatible python script (failed-syscalls-by-pid) that slices the same data somewhat differently. The first report counts each system call that failed during the few seconds while the trace was active, showing the number of errors by process as well as by system call.

The second report combines the two, showing, for each process, which system calls failed and how many times. There are also corresponding scripts that count all system calls, not just those that failed, and report on them similarly. Wakeup latency, file read/write activity, and workqueue statistics are the focus of some of the other provided scripts.
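The aggregation behind those reports is straightforward. As an illustration (not the actual script code, and using a hypothetical sample of records), the same stream of failed-syscall events can be sliced any of these ways:

```python
from collections import Counter, defaultdict

# Hypothetical sample of failed-syscall records: (comm, pid, syscall, errno)
records = [
    ("firefox", 10144, "sys_read", -11),
    ("firefox", 10144, "sys_read", -11),
    ("firefox", 10147, "sys_futex", -110),
    ("emacs",   9001,  "sys_access", -2),
]

# First report: totals by command and by system call.
by_comm = Counter(comm for comm, _, _, _ in records)
by_syscall = Counter(syscall for _, _, syscall, _ in records)

# Second report: per (comm, pid), which syscalls failed with which errno.
by_pid = defaultdict(Counter)
for comm, pid, syscall, err in records:
    by_pid[(comm, pid)][(syscall, err)] += 1

print(by_comm["firefox"])                             # 3
print(by_syscall["sys_read"])                         # 2
print(by_pid[("firefox", 10144)][("sys_read", -11)])  # 2
```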

These scripting features will make it that much easier for kernel hackers (and possibly those who aren't) to access the perf functionality. Tracing and instrumentation in the kernel have developed quickly over the last few development cycles, and that doesn't look to be slowing down anytime soon.



Scripting support for perf

Posted Feb 11, 2010 6:34 UTC (Thu) by prasadkr (subscriber, #44457) [Link]

While the scripting support for perf-events is a very useful feature...was wondering if anybody has looked at what it takes to have 'live' reporting of data (through perf, and hence the script), as opposed to the two-step process of perf data collection followed by analysis using a script?

live reporting of data

Posted Feb 11, 2010 9:42 UTC (Thu) by mjw (subscriber, #16740) [Link]

You would need pre-filtering/aggregating for that, not post-process scripting. You can do something like that with, for example, systemtap, which filters and can aggregate values at probe-point hit time, so the only data recorded is what is needed for the live reporting. e.g. live (top-like) reporting of failed syscalls with argument strings can be done by errsnoop.stp

$ stap errsnoop.stp
          SYSCALL         PROCESS   PID HITS ERRSTR       ARGSTR
inotify_add_watch gdm-simple-gree  2569    2  13 (EACCES) 18, "/home/mark", 16789454
             open hald-addon-stor  2178    1 123 (ENOMEDIUM) "/dev/sdh", O_RDONLY
             open hald-addon-stor  2175    1 123 (ENOMEDIUM) "/dev/sde", O_RDONLY
             open hald-addon-stor  2177    1 123 (ENOMEDIUM) "/dev/sdg", O_RDONLY
             open hald-addon-stor  2174    1 123 (ENOMEDIUM) "/dev/sdd", O_RDONLY
             open hald-addon-stor  2176    1 123 (ENOMEDIUM) "/dev/sdf", O_RDONLY
             open        sendmail  2291    1   6 (ENXIO)  "/proc/loadavg", O_RDONLY

And you can then let it simply run to see, live, what silly things user space programs are doing. Some other systemtap process examples.

Scripting support for perf

Posted Feb 11, 2010 16:23 UTC (Thu) by trz (subscriber, #7752) [Link]

'Live' reporting should be just a short step away - to do that we could stick a pipe in between the 'record' and 'report' steps e.g.

$ perf trace record myscript | perf trace report myscript

or more simply just get rid of the two steps and combine them into one:

$ perf trace myscript

Currently, perf isn't pipe-friendly mainly because of the header-read/write code, which pre-allocates space in the file and does a lot of seeking to fill in length and offset fields later. That works nicely for a file, but presents problems if you want to feed it into a pipe.

I think if some changes were made to that part of the code, the rest would follow naturally. I plan to look into it soon and hopefully post some patches to enable it.
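The seek problem described above is easy to demonstrate in a few lines of python (a generic illustration of file versus pipe semantics, not perf's actual header code): a regular file lets a writer reserve header space and rewind to patch it later, while a pipe refuses any seek.

```python
import io
import os
import tempfile

# A regular file supports the reserve-then-patch pattern: write a
# placeholder header, write the data, then seek back to fix it up.
with tempfile.TemporaryFile() as f:
    f.write(b"\0" * 8)        # placeholder for length/offset fields
    f.write(b"payload")
    f.seek(0)                 # rewind and fill in the real header
    f.write(b"HDR")
    print(f.seekable())       # True

# A pipe does not: once bytes are written, there is no going back.
r, w = os.pipe()
pipe_out = os.fdopen(w, "wb")
print(pipe_out.seekable())    # False
try:
    pipe_out.seek(0)
except (io.UnsupportedOperation, OSError):
    print("cannot seek a pipe")
pipe_out.close()
os.close(r)
```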

Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds