KS2008: Tracing
James Bottomley started with a description of his experiences with SystemTap, the utility which is most often cited as our answer to DTrace. He had a lot of trouble getting it to work with his system. In his mind, the root cause for all this trouble is the simple fact that nobody from the development community is actually using SystemTap. A quick query of the room suggested that about half of the developers present had tried using SystemTap at one point or other; maybe 20% actually succeeded. So there is a roadblock of sorts here; SystemTap needs attention from kernel developers to progress, but those developers find it unsuited to their needs and difficult to use, so they tend to ignore it.
But kernel developers are not the targeted user base for a tool like SystemTap; it is aimed at end users and deployed systems. To help clarify what those users need, Vinod Kutty from the Chicago Mercantile Exchange took some time to talk about his needs for tracing tools. In general, these users need a higher level of visibility into running, production systems. They need to be able to track down slowdowns, look at the environment in which processes are running, and, in general, to be able to look in corners of the system which nobody will have anticipated in advance. All of this has to happen while the system is running in production; it is, he says, somewhat like needing to look under the hood of a car while driving at 100mph.
Also useful is the ability to run tracing tools in a "flight recorder" mode, where an administrator can look at historical data after something goes wrong. And it is necessary to be able to look at user-space events as well as those from the kernel. Events generated in user space are often more meaningful to the people running the system. All of this is needed to be able to communicate with distributors about where the problems come up, so that the distributor can work toward a fix. Current tracing tools for Linux are insufficient.
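One way to picture the "flight recorder" idea is a fixed-size ring buffer that always holds the most recent events, so that the history is there to examine once something has gone wrong. What follows is a minimal sketch under that assumption, not a description of how any of the tools discussed here implement it; the `FlightRecorder` class and its methods are hypothetical names made up for illustration.

```python
from collections import deque


class FlightRecorder:
    """Hypothetical recorder keeping only the newest `capacity` events."""

    def __init__(self, capacity):
        # A deque with maxlen discards the oldest entry once it is full.
        self._events = deque(maxlen=capacity)

    def record(self, timestamp, message):
        """Store one event; older events fall off the far end."""
        self._events.append((timestamp, message))

    def dump(self):
        """Return the retained history, oldest event first."""
        return list(self._events)


recorder = FlightRecorder(capacity=3)
for ts in range(5):
    recorder.record(ts, f"event {ts}")

# Only the last three events survive for post-mortem inspection.
history = recorder.dump()
```

The memory cost is bounded by the capacity regardless of how long tracing runs, which is the property that makes this mode plausible on a production system.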
Linus asked: is tracing needed primarily to track down bugs or to find performance issues? It turns out that performance problems are the big issue. James asked what parameters were the most important; the answer mentioned individual process I/O events, user-space events, and the ability to map user events to kernel-level events.
Moving on, Vinod also noted that low impact is important; the tracing tool cannot place a heavy load on the system. These tools need to start quickly. Current tools are far too big. There is also a need for good filtering; tracing tools can generate a lot of data. Administrators and developers need a way to boil down all that data to an amount they can deal with. Even better are tools which can spot problems in the trace stream and raise red flags when they happen.
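The filtering requirement can be illustrated with a small sketch: a predicate is applied lazily to a stream of events, and only the matching ones are kept. The event layout (`type`, `pid`, `latency_us`) and the helper names are assumptions invented for this example; no real tracing tool's format is implied.

```python
def filter_events(events, predicate):
    """Lazily yield only the events for which predicate(event) is true."""
    return (ev for ev in events if predicate(ev))


def slow_io(threshold_us):
    """Build a predicate matching I/O events slower than threshold_us."""
    return lambda ev: ev["type"] == "io" and ev["latency_us"] > threshold_us


# Hypothetical trace stream; a real one would be far larger.
trace = [
    {"type": "io", "pid": 101, "latency_us": 250},
    {"type": "sched", "pid": 101, "latency_us": 12},
    {"type": "io", "pid": 202, "latency_us": 18000},
]

# Raise a "red flag" for any I/O event taking longer than 10ms.
flagged = list(filter_events(trace, slow_io(10_000)))
```

Because the filter is a generator, the full trace never has to be held in memory at once; the same predicate could be applied to a live stream.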
And, of course, tracing tools really cannot crash the system while they are running. SystemTap still falls a little short in this area; it's not hard to bring down a system while trying to trace it. Adding a DTrace-style virtual machine was discussed; in theory, a VM can make the tracing tool demonstrably safer. Vinod responded that it could be useful, but the proof of maturity is in watching the software run for a while.
This is where Linus came in to proclaim that he hates every tracing tool he has seen. SystemTap is far too complicated; these tools need to be simpler. Adding a virtual machine to SystemTap would just make things more complicated; that's not the way to fix its problems. According to Linus, most of the problems solved by tracing come down to figuring out scheduling issues, and we have the tools to do that now. We should be making better use of the simple tools which are currently in the kernel before trying to put more complicated stuff in. We should, for example, make latencytop work better and push to get it into the enterprise distributions. This "use the tools we already have" suggestion came back many times during the session.
Christoph Hellwig brought up another recurring theme: while dynamic tracing is nice, there is a lot of value to be had from well-placed static trace points which are managed by the maintainer of the code. Matthew Wilcox added that the user-space trace points (for DTrace) added to PostgreSQL have proved to be highly useful for database administrators; people running PostgreSQL now have a strong motivation to do so on Solaris systems. We would do well to match that functionality on Linux.
A key component of SystemTap is the collection of "tapsets," scripts which allow a user to look into the kernel for specific information. These tapsets are a problem, though; they are tightly tied to a specific kernel, but the kernel is constantly being changed. So tapsets go stale quickly. Moving these tapsets into the kernel might help, but they will still be a separate body of code which is prone to breaking. Static trace points, which can be maintained directly with the code they monitor, are much more likely to continue to work in the long term.
Martin Bligh noted that Google maintains a set of 20-30 static trace points for use with the LTTng trace tool. This very small set of trace points is sufficient to solve most problems that Google encounters. Martin will, hopefully, be posting those trace points for inclusion into the mainline, though Google's associated tools might not be available.
Vinod finished this portion of the session by stating that he likes the tapset concept. It allows him (or somebody in his group) to write a script aimed at a specific situation, and others can make use of it immediately. There's no need to wait for the release of a more specialized tool.
Trace toolkits
Mathieu Desnoyers spent a few moments introducing the LTTng tracing package. LTTng is a static tracing tool, depending on markers placed in the kernel itself. It has been designed for high performance and simplicity; that, in turn, should help to make it safe to use on production systems. All LTTng trace points have to be in the kernel code itself and be maintained by the appropriate subsystem maintainers.
The core kernel code includes a module for precise time stamping (needed to preserve the ordering of events which go through different per-CPU relay buffers), the relaying code, and a netlink-driven module to control tracing. There is a user-space library and, of course, a set of analysis tools. LTTng can support "flight recorder" mode which can initiate tracing when a specific trigger situation comes about. There is also a mechanism for putting markers into user space.
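To see why precise time stamps matter when events arrive through separate per-CPU buffers, consider the merge step an analysis tool has to perform. The sketch below assumes each buffer is already in time order and that time stamps are comparable across CPUs; the buffer contents and the `merge_buffers` helper are hypothetical, and this is not LTTng's actual code.

```python
import heapq

# Hypothetical per-CPU buffers of (timestamp, cpu, event) tuples,
# each already sorted by timestamp.
cpu0 = [(100, "cpu0", "sched_switch"), (140, "cpu0", "irq_entry")]
cpu1 = [(90, "cpu1", "syscall_entry"), (120, "cpu1", "syscall_exit")]


def merge_buffers(*buffers):
    """Merge sorted per-CPU buffers into one globally ordered list."""
    # heapq.merge does a streaming k-way merge keyed on the timestamp.
    return list(heapq.merge(*buffers, key=lambda ev: ev[0]))


merged = merge_buffers(cpu0, cpu1)
```

If the clocks on different CPUs drifted relative to each other, the merged stream would show events in the wrong order, which is why the time-stamping module is part of the core design.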
Frank Eigler spent some time talking about SystemTap; he used much of that time to defend the design decisions which had been made. When the SystemTap project started, the kernel had almost no tracing features at all, so they had to pick a path that worked. Since there is a lot of hostility to putting a virtual machine into the kernel, they had to go with code generation instead. They used kprobes because that was the mechanism that was available. And so on. In general, SystemTap has a lot of the same objectives as LTTng, plus, of course, the dynamic tracing feature. There are "some demos" showing working user-space tracing.
James stated that there was a real need for the users of these tools - kernel developers, in this case - to provide input into how they work. Frank responded that the SystemTap team has been crying out for people to help. It's clear, though, that this particular user base is not sufficiently engaged in the development process. It was said that the real users of SystemTap are Red Hat consultants, who find that it works well with the standard RHEL kernel. But people trying to use SystemTap with a current mainline kernel have to download "a shaky weekly tarball" to try to make it work. Until SystemTap is easier to use with the mainline kernel, it will be a hard sell in the development community.
The problem there, of course, is that keeping SystemTap current while it is out of the mainline tree is always going to be a struggle. Resolving that problem will require getting more of that code merged. It seems that the core SystemTap code is about 15,000 lines - small, according to Frank. This could maybe go in, but Linus is resistant, saying that we need to get the current, simple, in-kernel tracing tools into a usable state before we try to add more of them.
Ted Ts'o remarked that there is a real difference with SystemTap: it is the only Linux-based tracing package which, like DTrace, allows users to run code at the trace points. Thus it is able to do more complicated triggering, filtering, and analysis. Thomas Gleixner responded that this is all good, but what is really needed is a simple trace package which does not require the installation of a whole set of new tools. He does tracing (using ftrace) on a number of platforms, including embedded systems, and he isn't willing to deal with the hassle involved in adding another complicated set of software.
After that the conversation wandered into various, relatively obscure technical topics like the details of how buffering mechanisms should work, who should really be managing trace points, who manages instrumentation as a whole, and so on. But there was a general sense that the summit wasn't the venue for that kind of low-level detail, which isn't where the real problems are anyway. The tracing topic will be revisited at the Linux Plumbers Conference, so it was decided to defer much of the discussion of the details until then.
Index entries for this article | |
---|---|
Kernel | Tracing |
Posted Sep 18, 2008 4:27 UTC (Thu)
by bronson (subscriber, #4806)
[Link] (20 responses)
Posted Sep 18, 2008 4:35 UTC (Thu)
by fuhchee (guest, #40059)
[Link]
Posted Sep 18, 2008 5:19 UTC (Thu)
by corbet (editor, #1)
[Link] (17 responses)
Posted Sep 18, 2008 10:15 UTC (Thu)
by zdzichu (subscriber, #17118)
[Link]
DTrace is wonderfully usable by Joe Random Admin and is generic enough to stop the blooming of specialised tracing frameworks. Sure, latencytop is cool, powertop is cool, usbmon is cool, blktrace is cool, but each of them needs its own hooks in the kernel. On Solaris, powertop is implemented on top of DTrace, without modifying the kernel.
Posted Sep 18, 2008 16:56 UTC (Thu)
by zooko (guest, #2589)
[Link] (15 responses)
Also, I find it hard to imagine that some Linux copyright holder would sue you if you redistributed it to a third party. That would be interesting. I would almost want to be sued (for what -- monetary damages? An injunction forbidding me to keep redistributing it?) just to see it happen.
Unfortunately, I don't have time today to download the dtrace port, combine it with the Linux kernel, and put it up on my web server for redistribution. Maybe tomorrow.
Posted Sep 19, 2008 1:48 UTC (Fri)
by njs (subscriber, #40338)
[Link] (14 responses)
And no major distro's legal department is going to let them blatantly violate the copyrights of everyone on lkml and open themselves up to legal action. (But at least SCO would finally have a valid case against Novell!)
So don't let me stop your fun, but you're kind of tilting at windmills, to prove some sort of point that I don't quite see...
Posted Sep 19, 2008 12:22 UTC (Fri)
by zooko (guest, #2589)
[Link] (13 responses)
For example:
http://fedoraproject.org/wiki/ForbiddenItems#MP3_Support
https://help.ubuntu.com/community/RestrictedFormats
In both cases, there is some source code which is legal for end-users to use, but (perhaps?) not legal for Linux distributions to distribute.
If I'm right that this is the same legal situation, then this implies that it is legally possible for dtrace-on-linux to be as widely used as MP3-on-linux is. :-)
Regards,
Don Quizooko
P.S. I love tilting at windmills. Every now and then you get a solid hit on those titanic monsters.
Posted Sep 20, 2008 1:37 UTC (Sat)
by rahulsundaram (subscriber, #21946)
[Link] (12 responses)
Posted Sep 20, 2008 14:43 UTC (Sat)
by zooko (guest, #2589)
[Link] (11 responses)
How's that for a comparison? The legal constraint on using dtrace or ZFS on Linux is exactly the same as the legal constraint on using a proprietary hardware driver on Linux.
Posted Sep 21, 2008 4:43 UTC (Sun)
by njs (subscriber, #40338)
[Link] (2 responses)
I note that proprietary hardware drivers *suck* in many, many ways. They make your system impossible to provide support for, which is kind of a problem for a product whose core market is high-end enterprise production servers. They break things horribly and regularly -- and dtrace is obviously *much* more intrusive than your average driver. They're a huge pain to use and maintain -- is there any reason to think that the guy porting dtrace will be more successful at tracking linux mainline than RH's engineers are with systemtap? And will it even be possible to load as a module, or will you have to patch your core kernel/rebuild/reboot?
And, perhaps worst, they create all kinds of obnoxious arguments that divide the community -- look at any discussion of nvidia drivers, or the dtrace discussions here. Part of the magic of FOSS is that normally the enthusiasts, the law-skirters, the enterprise distros, the big-iron vendors, the small consultants, etc. etc. can all work towards common goals, and all provide necessary value at different parts of a system's lifecycle. When people spend their time yelling at each other over whether nVidia's drivers are legal or whether systemtap/btrfs/whatever is worth the effort, we all lose.
Which maybe shouldn't be surprising, since all the evidence suggests that one of Sun's strategic goals in choosing to license dtrace and ZFS in a Linux-incompatible way was to create exactly this sort of division within the FOSS community, and thus give Solaris a competitive chance. What I'm still not clear on is which of these titanic monsters you're going after, exactly...
Posted Sep 21, 2008 12:39 UTC (Sun)
by zooko (guest, #2589)
[Link] (1 responses)
But this original argument is not true. At least, it is true only inasmuch as it is also true that the licensing issues of nvidia drivers make them undistributable by anyone, and therefore not of any practical interest.
Now you're making two other arguments, if I understand correctly. Argument 2: if people do use kernel modules or patches with this kind of legal restriction, the result is hard to support and tends to break. Argument 3: it distracts the community with inflammatory bickering about legal issues.
I don't necessarily disagree with either of those. I personally have gone to great effort to get Free Software drivers for my graphics cards. On the other hand, I have occasionally fallen back to using the proprietary graphics card drivers when needed. I think it could be useful to people to understand that even if the latter two arguments are valid, the first one wasn't: dtrace and ZFS are exactly as legally-portable-to-linux as are proprietary graphics card drivers.
Regards,
Zooko
P.S. Oh wait, I'm not sure if I agree with Argument 2. Argument 2 is a strong argument when we're talking about proprietary, closed-source software, but dtrace and ZFS are Free and Open source, so I'm not sure that it would be as fragile and problematic as proprietary drivers.
Posted Sep 21, 2008 18:38 UTC (Sun)
by njs (subscriber, #40338)
[Link]
It's still of no practical interest to kernel developers, distributors, enterprise users, and many others. Jon's (not bronson's) comment was totally reasonable in context -- a kernel developers' discussion about what enterprise-funded developers should work on!
But fair enough -- there may be (probably are) others who find linux-dtrace of practical interest.
But arguments 2 and 3 are linked to this -- to the extent that such people use linux-dtrace, the harms described in those arguments swing into effect. To the extent they avoid linux-dtrace, the harms are abated -- with a trade-off: then they lose dtrace's benefits. There's a tragedy of the commons danger here; the costs of supporting inscrutably broken systems and inflammatory bickering are borne by the community, while the benefits of dtrace are received only by individuals. Plus, people using/supporting linux-dtrace are not doing themselves any favors in the long run, because it's clearly a dead-end; it's better than nothing, but getting a great solution will require abandoning it and switching to one of the other systems that linux-dtrace sucks the oxygen away from.
So I guess 2 & 3 are arguments for why if we encounter someone for whom linux-dtrace is of practical interest, we should attempt to stop them ;-).
>dtrace and ZFS are exactly as legally-portable-to-linux as are proprietary graphics card drivers.
As a side-point, I'm actually not convinced, at least for dtrace. The nvidia driver uses two tricks: it makes a serious attempt not to be a derived work of the kernel, by wrapping a Free shim layer around what is basically a Windows driver. And it's always distributed separately from the kernel -- otherwise you're distributing a combined, thus derived, thus un-distributable, work. Even the little distros that tried to play fast and loose seem to have given in on this point.
Dtrace is far more intrusive than a graphics driver, and in copyright relevant ways -- it needs to muck around with other people's code to put hooks in. That makes both of the above tricks hard to achieve. Maybe not impossible, I don't know. From his rhetoric about licenses, I haven't gotten the impression that Paul Fox is being that careful. I would not tell people that linux-dtrace as it exists is legal to distribute at all without knowing many more details.
>Argument 2 is a strong argument when we're talking about proprietary, closed-source software, but dtrace and ZFS are Free and Open source, so I'm not sure that it would be as fragile and problematic as proprietary drivers.
It's true the problems are worse for proprietary drivers, but they're quite bad even for plain old out-of-tree drivers. (Note part of the discussion above is about RH's systemtap team's trouble keeping sync with mainline!) And worries about license contamination are an extra burden on top of that.
Sorry for nattering on so!
Posted Sep 21, 2008 9:29 UTC (Sun)
by paulj (subscriber, #341)
[Link] (7 responses)
Uh, not really. DTrace is free software.
Posted Sep 21, 2008 11:22 UTC (Sun)
by Jonno (subscriber, #49613)
[Link] (5 responses)
Morally and technically it's another matter however, as the source *is*
Posted Sep 22, 2008 10:58 UTC (Mon)
by paulj (subscriber, #341)
[Link] (4 responses)
With DTrace being free-software, there isn't even much of a moral hazard...
Posted Sep 22, 2008 13:57 UTC (Mon)
by corbet (editor, #1)
[Link] (3 responses)
DTrace presents a different hazard; imagine a Sun-turns-SCO scenario, for example. The fact that Sun does not appear to be interested in taking that path now is irrelevant; neither was Caldera, once upon a time.
Posted Sep 22, 2008 14:54 UTC (Mon)
by paulj (subscriber, #341)
[Link]
Posted Sep 22, 2008 15:42 UTC (Mon)
by zooko (guest, #2589)
[Link] (1 responses)
I have a feeling that you already addressed this in your "What if Sun turns Evil?" article, but I don't recall any actual problems that could result from use of DTrace or ZFS source code.
Posted Sep 22, 2008 17:33 UTC (Mon)
by njs (subscriber, #40338)
[Link]
Unless they have changed their mind and answered this at some point. I hope so -- haven't followed closely.
But paulj is right re: copyrights; Sun wrote the CDDL in such a way that combining CDDL and GPLv2 violates the GPL but not the CDDL. This has the same effect in practice as not granting permission, but it means that for copyright issues we don't need to worry about Sun-turns-SCO -- we need to worry about SCO-turns-SCO.
Posted Sep 21, 2008 12:43 UTC (Sun)
by zooko (guest, #2589)
[Link]
However, my point here was that resulting restrictions on use and redistribution are similar:
The copyright holder on DTrace has granted permission to use it in Linux systems, but the GPL which governs redistribution of Linux source code forbids redistributing derived works which include non-GPL'ed code such as DTrace.
This winds up imposing the same sort of restriction on users and distributors that a proprietary driver does: the copyright holder of the proprietary driver grants you the right to use it in your Linux kernel, but the Linux licence forbids you to redistribute it (assuming the current standard interpretation of GPL applied to this issue).
Posted Sep 18, 2008 20:10 UTC (Thu)
by SEJeff (guest, #51588)
[Link]
Posted Sep 19, 2008 10:37 UTC (Fri)
by mjw (subscriber, #16740)
[Link]
Also useful is the ability to run tracing tools in a "flight recorder" mode, where an administrator can look at historical data after something goes wrong.
Posted Sep 19, 2008 17:19 UTC (Fri)
by bcantrill (guest, #31087)
[Link] (2 responses)
There is, however, one issue I would like to lay to rest. DTrace is not, contrary to ongoing assertions here and elsewhere, a marketing endeavor. DTrace has succeeded because we (Team DTrace) worked damned hard for a damned long time solving a damned hard problem -- and we succeeded at a time when the world didn't want to hear anything that wasn't "Linux" (trust me on this). Far from putting "a lot of marketing energy" into telling customers about DTrace, Sun has put virtually no marketing dollars into DTrace, and it (ironically) took a herculean effort on our part to get even the most basic marketing support. (As an example: when we held a DTrace un-conference this past March, it was a Sun partner who stepped up to pay for it.) It's especially grating when we are accused of "hyping" aspects of the technology -- safety in particular -- that are foundational in nature. We are not and have not "hyped" our safety -- but we (and in particular, I) have also not hesitated to point out the architectural differences between DTrace and SystemTap in this regard, e.g.:
http://blogs.sun.com/bmc/entry/dtrace_safety
Providing motivation for architectural decisions is not "marketing", it's technology -- and you mar your otherwise strong work when you dismiss it as being something less...
Posted Sep 19, 2008 22:02 UTC (Fri)
by nix (subscriber, #2304)
[Link] (1 responses)
i.e. this is as opposed to the sort of marketing endeavour where marketing
Posted Sep 19, 2008 22:21 UTC (Fri)
by bcantrill (guest, #31087)
[Link]
dtrace on Linux
this is a sentence stating "no".
No mention of the DTrace port. The licensing of Linux makes a DTrace port undistributable by anybody, so it is not of any practical interest.
-- Nathaniel
The legal constraint on using dtrace or ZFS on Linux is exactly the same as the legal constraint on using a proprietary hardware driver on Linux.
available, and *can* be fixed if necessary...
The number of Linux distributions shipping binary-only drivers has gotten quite small; there are good reasons for that.
Nice to see immediate results from this in the systemtap code base.
KS2008: Tracing
commit 2fa2a091a0b248855d7f77aa20677ef4c7a7cc61
Author: Nobuhiro Tachino <tachino@jp.fujitsu.com>
Date: Tue Sep 16 22:04:02 2008 -0400

    add new stap -F (flight recorder) option that just passes through to staprun -L
DTrace != Marketing endeavor
raving about dtrace than every other feature of Solaris put together, even
than things like ZFS and zones. dtrace *should* be a marketing endeavour
in the sense that marketing loves it and pushes it hard, because it rocks
and customers love it. (The license incompatibility is really annoying.
I'd like to see dtrace everywhere but this is impossible until someone
shoots the current copyright system or reimplements all of dtrace, what a
waste of time.)
lies to customers with ridiculously overblown promises ("our program will
make your coffee for you and can replace all your staff with AIs which
don't need paying"), makes up features that don't exist, or outright makes
up features that are only implementable in a parallel universe with
different maths ("[system] analyses your programs and warns about all
those that will halt" as one project I declined to work on was described).