More DTrace envy
Nearly a year ago, we looked at the status of SystemTap in the context of Sun's much-hyped DTrace tool. Since that time there has been progress, but the basic problem still remains: Linux does not have a good, ready-to-run answer to those wanting the equivalent functionality of DTrace. Due to an apparent disconnect between the developers of SystemTap and the kernel hackers, tracing for the Linux kernel—never mind user space programs—is not up to the competition.
Both SystemTap and DTrace are tools meant to help administrators track down performance and other problems on production systems by instrumenting the kernel. Because SystemTap has not matured to the point of easy usability, DTrace is often seen as a prime differentiator between Linux and Solaris. In a posting to the ksummit-2008-discuss mailing list—where Kernel Summit topics are considered—Matthew Wilcox brought up the subject based on his experience at a recent PostgreSQL conference:
So there was a lot of talk about switching away from Linux. This can't possibly be a good thing for us. I don't personally know what the state of our competing projects are, but clearly they haven't got their hooks into postgres ... at least not upstream.
Typically Linux has been in the forefront of interesting new technologies for free operating systems. When Sun opened up Solaris, though, a few features jumped ahead of their Linux counterparts, in particular the ZFS filesystem and DTrace. SystemTap is supposed to provide the tracing functionality while Btrfs is the leading candidate for a "next generation" filesystem. But, so far, SystemTap has not lived up to its potential.
There are a few reasons for disappointment with SystemTap, some of which were pointed out by James Bottomley:
Those are all valid concerns, but the biggest problem for users is that, unless they are knowledgeable about kernel internals, it is difficult to know how to use SystemTap. A more simplified interface, one that is less reliant on kernel internals, needs to be created; the way to do that is through the placement of static trace points in the kernel and the creation of "tapsets" to make them easily usable. The SystemTap developers think the kernel hackers are in the best position to do that work. Ted Ts'o agrees but sees some barriers:
[ ... ] the real problem isn't as much kernel developers, it's that (a) it's too hard for many kernel developers to use (and so many kernel developers are [not] using it), and (b) there aren't enough tapsets. The latter is something that kernel developers can help solve, but unfortunately I'm not sure discussing it at the Kernel Summit will necessarily lead to making forward progress.
If the kernel developers have trouble using SystemTap, they are unlikely to add the tapsets that would make it more usable for system administrators and others who have some general kernel knowledge but not enough to sensibly instrument it. For people using distribution kernels—at least for the enterprise distributions and Fedora—it is only somewhat painful to get SystemTap up and running. But kernel hackers tend to run their own kernels, often many different versions in a short period of time, so they need to be able to be easily build one that works with SystemTap and includes all of the debugging information that it requires.
SystemTap developer Frank Ch. Eigler has a long reply to many of the complaints in the thread. It seems clear that the SystemTap folks and the kernel hackers have not been communicating—there are solutions to many of the problems that were cited. They are in various states of readiness, but are mostly working. So SystemTap is most of the way there for kernel tracing as long as you are well-versed in kernel internals, but that has been true for some time.
In order to get SystemTap to where it needs to be, the kernel hackers need to be involved. Building the infrastructure and waiting for tapsets to magically appear is not a recipe for success. The SystemTap hackers need to be engaging the kernel community, as well as distributions, to make the tool into something that gets used.
SystemTap can use static probe points, kernel markers—merged into 2.6.24—but it is notable that no one has, as yet, made use of them. A concerted effort needs to be made to make the tool more usable for the kernel developers who can, in turn, help make it more usable for others. There is a clear problem when folks like Ts'o regularly try, but find it too difficult to be useful:
It is a commonly heard complaint that while SystemTap is difficult to use, DTrace "just works" for Solaris; Eigler responds:
A bunch of third-party scripts are often conflated with "dtrace", which is just a matter of growing the user community enough, and giving them a good tool to build on top of. A growing set of runnable end-user scripts is already packaged with systemtap, intended for use by nonexperts, more help (e.g. concise problem statements about what you'd like to measure/see) would be welcome.
Many administrators and other users of tracing facilities are not
necessarily interested in kernel-level tracing, but would really like to
be able to use the instrumented versions of things like PostgreSQL.
That is in the plan according to Eigler: "We aim to piggyback on these efforts by reusing the dtrace
instrumentation calls embedded into postgres etc., if at all
possible.
"
Until the rough edges can be smoothed on the kernel side, Bottomley wonders if it even makes sense to start considering user space:
DTrace sounds like a nice working solution that has many uses and many happy users. If one can ignore the self-congratulatory postings from its lead developer, it might be worth having in Linux, but that simply is not going to happen. Paul Fox is working on a port of DTrace to Linux, but that ignores the licensing realities that would never allow it to become part of Linux. It also ignores the difficult path a DTrace port would face getting merged into the mainline. (We hope to have an article from Mr. Fox on his DTrace porting work soon, stay tuned).
For all of the talk out of Sun about how they would love to make DTrace a part of Linux, they clearly made a choice to ensure that could not happen. Even if any technical barriers were lifted, the CDDL is not compatible with the GPL. It is perfectly fine as a free software license, but if you wish to get things into Linux, they must be licensed in a GPL-compatible way. This was well understood at the time Sun freed Solaris, so this must have been a conscious decision. Given how much their marketing organization likes to tout DTrace, it would seem to be a choice that Sun is quite happy with.
Linux will eventually get the tracing support it needs, in a way that is easily accessible to users, but it may take some time. Conversations like the recent one on ksummit-2008-discuss are an important part of getting there. It would appear that better support for the use cases of kernel developers will be forthcoming. It is mostly a matter of documentation along with simplifying some of the building and installation issues. Once the kernel hackers actually start using it, progress is likely to be fairly swift.
This is the way free software development works; it generally does not track a straight path to a solution, but often wanders about in the solution space for a while. It is highly unlikely that a development like DTrace could have come about in the way that it did in a true community-developed operating system. For that you need everyone pulling in the same exact direction, which may be why Sun is reluctant to turn over much of the governance of Solaris to the community. That may help them develop things more quickly, because there will be fewer barriers, but it won't help them to foster the kind of development community that characterizes Linux.
| Index entries for this article | |
|---|---|
| Kernel | SystemTap |
| Kernel | Tracing |
