
The Linux trace toolkit's next generation

By Jake Edge
January 9, 2008

Instrumenting a running kernel for debugging or profiling is on the wish list of many administrators and developers. Advocates of OpenSolaris like to point to DTrace as a feature that Linux lacks, though SystemTap has started to close that gap. The Linux Trace Toolkit next generation (LTTng) takes a different approach and was recently submitted for inclusion in the kernel (in two patches: architecture-independent and architecture-dependent).

LTTng relies upon kernel markers to provide static probe points for its kernel tracing activities. It also provides the ability to trace userspace programs and combine that data with kernel tracing data to give a detailed view of the internals of the system. Unlike SystemTap and DTrace, which have their own mini-languages that specify what to do as each trace point is reached, LTTng takes a post-processing approach, storing the data away as efficiently as possible for later analysis.

One of the major design goals of LTTng is to have as little impact on the system as possible, not only when it is actually tracing events, but also when it is disabled. Kernel hackers are quite resistant to debugging solutions that add any significant performance penalty when not in use. In addition, any significant delays while enabled may change the system timing such that the bug or condition being studied does not occur. For this reason, LTTng does not follow the path taken by various dynamic tracing solutions; by using static markers, it avoids the expense of a breakpoint interrupt.
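
To make the cost argument concrete, here is a minimal userspace sketch of a static probe point in C. It is not the kernel markers API: the names `struct marker`, `TRACE_MARK`, `count_probe`, `sched_marker`, and `do_work` are all invented for illustration. It only shows the general shape: a disabled marker costs a flag test, and an enabled one is an ordinary function call rather than a breakpoint trap.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical marker descriptor: a name, an enable flag, and a probe. */
struct marker {
    const char *name;
    bool enabled;
    void (*probe)(const char *name, long arg);
};

/* The probe site is compiled into the code path it instruments.
 * When the marker is disabled, only the flag test is executed. */
#define TRACE_MARK(m, arg)                          \
    do {                                            \
        if ((m)->enabled && (m)->probe != NULL)     \
            (m)->probe((m)->name, (arg));           \
    } while (0)

/* Illustrative probe: accumulates the argument of every event it sees. */
static long hits;

static void count_probe(const char *name, long arg)
{
    (void)name;
    hits += arg;
}

/* Illustrative marker, disabled by default. */
static struct marker sched_marker = { "sched_switch", false, count_probe };

/* Stand-in for an instrumented code path. */
static void do_work(long n)
{
    TRACE_MARK(&sched_marker, n);
}
```

Calling `do_work(5)` leaves `hits` untouched until `sched_marker.enabled` is set to `true`; after that, each call reaches `count_probe`. A real implementation would also need a safe way to flip the flag while other CPUs are running, which this sketch ignores.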

Another major design goal is to provide monotonically increasing timestamp values for events. The original LTT uses timestamps derived from the kernel Network Time Protocol (NTP) time, which can fluctuate somewhat as adjustments are made – sometimes going backward. LTTng uses a timestamp derived from the hardware clocks that will work on various processor architectures and clock speeds. In addition, the timestamps can be correlated between different processors in a multi-processor system.
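
The difference between an adjustable time source and a monotonic one can be seen from userspace with the POSIX clock interface. The sketch below is an analogy, not LTTng code (which, as described above, derives its timestamps from hardware clocks); `trace_clock_ns` is a hypothetical helper name. It reads `CLOCK_MONOTONIC`, which is never stepped backward, where `CLOCK_REALTIME` follows the wall clock and can be.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: nanoseconds from a clock that cannot be set
 * backward. Returns 0 on failure, a convention chosen for this sketch. */
static uint64_t trace_clock_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return 0;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
```

Two successive calls to `trace_clock_ns()` never yield a smaller second value, so events stamped this way can be ordered by timestamp alone.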

As LTTng gathers its data, it uses relayfs to get the data to a userspace daemon (lttd) that writes the data to disk. The daemon is started from the lttctl command-line tool, which controls the tracing settings in the kernel via a netlink socket. A user wishing to investigate tracing could use lttctl to start and stop a trace; once the trace is complete, the data could be viewed and analyzed.

The LTT viewer (LTTV) is the program that is used to analyze the data gathered. It provides both GUI and text-based viewers to interpret the binary data generated by LTTng and present it to the user. Multi-gigabyte files of tracing data are not uncommon when using LTTng, so a tool like LTTV is indispensable for visualization and filtering to allow the user to focus on the events of interest. LTTV has a plugin mechanism that allows users to develop their own display and analysis tools, while using the LTTV framework and filtering capabilities.

An advantage of using static probe points – though some may see it as a disadvantage – is that they can be maintained with the kernel code they are targeting. If the kernel markers patch is merged, subsystems can add probe points at places they find interesting or useful, and those markers will be carried along in the kernel source and updated as the kernel changes. Other solutions rely on matching an external list of probes with the version of the running kernel, which can result in mismatches and incorrect traces. Also, SystemTap will be able to use any markers that get added to the kernel as is, so users who want the abilities that it provides will also benefit.

LTTng is being developed at the École Polytechnique de Montréal with support from quite a few Linux companies. It looks like a very well-thought-out framework that builds upon the tracing work that has been done before. It certainly won't make it into 2.6.24, but it would seem to have a good chance of making it into a future mainline kernel.




Posted Jan 10, 2008 4:32 UTC (Thu) by jd (guest, #26381) [Link]

There are several projects that provide architecture-independent access to the hardware clocks; PAPI is one. Although this is a relatively minor part of LTTng, it would presumably make sense to have an agreed way to do this or, if one project is already installed, to make any subsequent project configurable to use that as the access point. The problem with multiple methods of accessing the same data is that if you use multiple profilers, they may give inconsistent and incompatible results.


Posted Jan 10, 2008 18:15 UTC (Thu) by compudj (subscriber, #43335) [Link]

Yes, I would be happy to do that. However, reading timestamps in a tracer is a bit trickier
than the average scenario, especially because of NMI context tracing. This is why I developed
algorithms that keep track of 64-bit counters that can be read atomically, even if the
underlying hardware only provides a 32-bit counter.

So I guess it would make sense to use the LTTng timestamping infrastructure for other uses,
but the opposite is not necessarily true because of the reentrancy constraints.

Mathieu
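
For readers wondering what extending a 32-bit counter to 64 bits involves, the following is a simplified sketch using C11 atomics. It is not the algorithm described in the comment above and makes assumptions the comment does not: that 64-bit compare-and-swap is lock-free on the target, and that the counter is read at least once per 32-bit wrap period. The names `fake_hw_counter`, `read_hw_counter`, and `read_extended_counter` are hypothetical, and the hardware is simulated by a plain variable.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Simulated 32-bit hardware counter (stand-in for a real register). */
static uint32_t fake_hw_counter;

static uint32_t read_hw_counter(void)
{
    return fake_hw_counter;
}

/* Last 64-bit value handed out; the low 32 bits mirror the hardware. */
static _Atomic uint64_t last_extended;

/* Return a 64-bit count built from the 32-bit hardware counter.
 * The hardware is sampled after loading the previous value, so a
 * sample smaller than the previous low word means the counter wrapped. */
static uint64_t read_extended_counter(void)
{
    uint64_t prev = atomic_load(&last_extended);

    for (;;) {
        uint32_t hw = read_hw_counter();
        uint64_t next = (prev & ~(uint64_t)UINT32_MAX) | hw;

        if (hw < (uint32_t)prev)
            next += (uint64_t)1 << 32;      /* carry into the high word */

        /* On failure, prev is refreshed and the hardware is re-sampled. */
        if (atomic_compare_exchange_weak(&last_extended, &prev, next))
            return next;
    }
}
```

Because a nested reader (an interrupt, say) can only make the compare-and-swap fail and retry, no lock is held that could deadlock a reentrant caller. Whether that is good enough for NMI context depends on the architecture's 64-bit atomic support, which this sketch simply assumes.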

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds