Fun with tracepoints

Posted Aug 13, 2009 2:45 UTC (Thu) by karim (subscriber, #114)
Parent article: Fun with tracepoints

Maybe I'm just missing something but ... I can't help but feel somewhat cynical about this. I released the Linux Trace Toolkit on July 22nd 1999, over 10 years ago - and yet still, this was quite some time before DTrace came to be FWIW. The value of static tracepoints seemed obvious at the time (for me at least). I can't believe this debate is still going on. In fact, I can't help but think that Linux in this case could have been much further ahead of DTrace. Of course the worst part is that early on all this static tracing was turned down because it would result in unmaintainable bloat. The irony is that the vast majority of the initial tracepoints suggested are still valid today.

As to the argument that nobody wants to "use" this stuff, I've never bought this. You can't expect users to come asking for tools they've never seen before -- that's rare. That doesn't mean they won't find those tools very useful if they were made available to them. It just so happens that those feeling the need for these tools have no way to show mass user adoption of these tools because they can never get those tools to the users in the 1st place (if it's not mainlined, it's likely not in your latest distro ...) So one can only point at other OSes delivering the same functionality ... The fact of the matter is that in this specific case users don't get enough credit for acting smart when given the right information. Even Windows allows me to get more information about what's going on than Linux does. There has to be a point where users are given the tools to find out what *they* want to know about what's going on, not what some maintainer somewhere decides they should see through /proc/foo.

I seriously hope this issue can be settled to Linux's benefit at some point in the future. Though I stopped maintaining LTT quite a few years ago, I still hope one day to have tools in Ubuntu that give me the power to get full control over the information *I* want to see.

Karim Yaghmour


Fun with tracepoints

Posted Aug 13, 2009 7:52 UTC (Thu) by michaeljt (subscriber, #39183)

One of the arguments I've heard against adding static trace points is that they become part of the kernel ABI and can no longer be removed if they prove not to be useful. Perhaps a way around this argument would be to initially add tracepoints with a mangled name that depends on the current kernel release? That way, until a given set of tracepoints has proved itself, any scripts using them would have to be updated with every kernel release in order to keep working. The mangling needn't be very complex (e.g. all tracepoints in 2.6.32 which might be removed at a later time could have "dot32" prepended to their names), it just needs to be unpredictable in future kernel releases.
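
A minimal sketch of the mangling idea above, written in Python rather than kernel C for brevity. The helper name `mangle_tracepoint_name`, the `stable` flag, and the underscore separator are hypothetical; only the "dot32" prefix for 2.6.32 comes from the comment. It assumes a plain dotted release string such as "2.6.32".

```python
def mangle_tracepoint_name(name: str, kernel_release: str, stable: bool = False) -> str:
    """Return the exported name for a tracepoint (hypothetical helper).

    Tracepoints that have not yet proved themselves get a prefix derived
    from the last component of the kernel release, so scripts that use
    them stop matching after every release. Stable ones keep their name.
    """
    if stable:
        return name
    last_component = kernel_release.split(".")[-1]
    return f"dot{last_component}_{name}"


# The same provisional tracepoint gets a different name in each release.
print(mangle_tracepoint_name("sched_switch", "2.6.32"))
print(mangle_tracepoint_name("sched_switch", "2.6.33"))
# Once promoted to stable, the name no longer changes.
print(mangle_tracepoint_name("sched_switch", "2.6.33", stable=True))
```

A script written against the 2.6.32 name would fail to find the tracepoint on 2.6.33, which is the point of the proposal: nobody can come to depend on a provisional tracepoint by accident.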

Fun with tracepoints

Posted Aug 13, 2009 9:59 UTC (Thu) by addw (guest, #1771)

I don't see a problem with making it clear that static trace points are NOT part of the ABI, i.e. that they may come & go. If you are getting that close to the kernel you have got to expect things to change. But how do people use traces? Probably to look at particular problems, not as general monitoring.

In practice most people use Distro X, version Y. The release people for this will ensure that trace points don't get removed during the 5/... year lifetime of Y, thus you can install new distro-provided kernels without worrying. When you rebuild your machine in 5 years' time you redo your traces.

LTTng

Posted Aug 13, 2009 8:33 UTC (Thu) by alex (subscriber, #1355)

FWIW I've used your LTT code on embedded systems and found it very useful in understanding the pseudo-realtime behaviour of my system. The optimist in me likes to think when we finally have a full tracing solution in the kernel it will be a much more powerful and refined experience for the 10 years of experience and experimenting done leading up to the final solution.

Fun with tracepoints

Posted Aug 13, 2009 12:20 UTC (Thu) by fuhchee (guest, #40059)

> As to the argument that nobody wants to "use" this stuff, I've never bought this. You can't expect users to come asking for tools they've never seen before -- that's rare. That doesn't mean they won't find those tools very useful if they were made available to them.

I think it goes even beyond that. The very fact that certain subsystem maintainers have found certain tracepoint suites already useful to themselves does not seem to carry any weight. A satisfactory burden/level of proof is not stated, just a counterfactual caricature "I don't think anyone needs this".

Fun with tracepoints

Posted Aug 13, 2009 16:24 UTC (Thu) by SEJeff (subscriber, #51588)

Why don't you (as a systemtap developer) get a list of random joes, developers, and actual subsystem maintainers to write a small blurb mentioning how static tracepoints helped them out?

Then you can say: look, these are not random uses. Remember all of the push against the memleak detector stuff in-kernel? It seems to have pulled its weight already by helping find plenty of bugs.

Fun with tracepoints

Posted Aug 13, 2009 18:18 UTC (Thu) by karim (subscriber, #114)

Sorry, this has been tried and has failed. Check out the list of companies who have contributed to LTTng:
Google, IBM, Ericsson, Autodesk, Wind River, Fujitsu, Monta Vista, STMicroelectronics, C2 Microsystems, Sony, Siemens, Nokia.

But, hey, who are they to know, the kernel developers know better. And the hell with Sun, Apple, IBM, Microsoft, etc. who spent large gobs of money on implementing tracing infrastructure in their OSes (Apple by the way ported DTrace to MacOS ... :/ ) and maintaining it through the years. They're wrong too. The Linux kernel developers surely are better than the collective intelligence of the engineers and product managers of the aforementioned.

I forgot to mention that apart from pushing and maintaining LTT for a number of years, I also worked on/defended a number of ideas which were dear to my heart. Take for example real-time. Very early on I came to the LKML pointing out that the tacit laissez-faire towards the RTLinux patent was not good for Linux. This was dismissed off-hand: the uses, I was told, were so narrow and the applications so specific that this is a non-issue ("real time apps are a niche market and they're not mainstream" ... i.e. those users don't matter). Skip a few years and there were two approaches being discussed: Ingo's and the iPipe (my idea); at the subsequent OLS I asked a prominent developer what he thought the chances of success of Ingo's very invasive approach were, and his reply was clear: Ingo has got the clout to make it happen. Just about then I knew iPipe wasn't likely to "win". And have his lunch he did.

That along with other things I witnessed (such as Con Kolivas quitting kernel development because he saw little interest in helping desktop interactivity) made me increasingly feel there's a NIH-syndrome. If nothing else, it distills from this that Linux' development has become highly politicized. You're either part of the in-crowd or you're not. And if you're not part of the in-crowd you're going to have a hell of a time trying to push something in if it's the least bit unconventional. Don't get me wrong, being part of the in-crowd doesn't guarantee a radical change's acceptance. But being an outsider clearly ensures that you've got zero chances of success. It might have changed since I've stopped keeping track of it all, but juxtapose the previous with the fact that most kernel developers work for/on big iron and you've got a huge disconnect with the realities of real-life mainstream users. It's not that user preoccupations aren't eventually taken care of or fixed (ex.: udev/sysfs/devfs), it's just that they're an absolute non-priority. And *that* is a serious issue. Last I checked, Linux has been flatlining in the end-user market for a very long time. If the diagnosis I'm making out of the symptoms I've witnessed is the least bit right (and I really hope I'm wrong), this isn't about to change any time soon.

I sincerely apologize if I've offended anyone with the above, but this is a case where *everything* ***EVERYTHING*** has been tried to convince the kernel development community. The ball is in their court.

Fun with tracepoints

Posted Aug 18, 2009 18:33 UTC (Tue) by karim (subscriber, #114)

Just so there's no misunderstanding, please note that I don't speak on behalf of LTTng in any way, shape or form; the above opinions are mine and mine alone. LTTng and the now defunct LTT, which I used to maintain, have nothing but part of the name in common. I could have used any other of the tracing projects as an example, it just so happens that this is the one I'm most familiar with :)

Karim

Fun with tracepoints

Posted Aug 18, 2009 21:40 UTC (Tue) by oak (guest, #2786)

> If nothing else, it distills from this that Linux' development has
> become highly politicized. You're either part of the in-crowd or
> you're not.

I think the problem is more that individual kernel developers don't really
(need to) look at the whole system or be responsible for it, just one
corner of it. On commercial operating systems, there are dedicated people
who look after the whole thing and need to make sure that the whole thing
works fine (and this responsibility gives them influence over the
operating system implementation to make sure that these tools get done &
available).

If you're looking just at one or some parts of the whole system, things
like LTT (or to some extent Systemtap[1]) that try to get an overview of
what happens in the whole system may seem too large / complex /
intrusive / bloated. "I just need this specific info from the block
layer" (or memory subsystem, or ...). And then they write their own NIH
tracing for that single thing that doesn't much benefit others, or
somebody who wants to make sense out of the whole system.

Note: I have gotten useful info both from LTT and LTTng (lttng.org) + it's
finally getting easier to apply to the kernel... LTTV plays a large part in
this too as one can easily zoom into details etc.

[1] Systemtap seems nice, but it doesn't have the post-processing /
visualization for the whole system like LTT does. I see it more like a
tool to do more specific analysis tools. However, for this kind of stuff
it's a bit too complicated (e.g. in embedded environments where you don't
want to run stap / compile the scripts on the device itself etc), so no
wonder devs write their own tracing...

Fun with tracepoints

Posted Aug 19, 2009 16:18 UTC (Wed) by fuhchee (guest, #40059)

> [1] Systemtap seems nice, but it doesn't have the post-processing / visualization for the whole system like LTT does. I see it more like a tool to do more specific analysis tools.

I see what you mean. Systemtap people are working on some GUI data graphing tools, but are just starting. (I got the impression though that LTTV was being deprecated in favour of eclipse-based widgets, which systemtap and other tools could also feed data into.)

> However, for this kind of stuff it's a bit too complicated (e.g. in embedded environments where you don't want to run stap / compile the scripts on the device itself etc)

We hope to ease that pain by more automated cross-compilation/execution.

Fun with tracepoints

Posted Aug 20, 2009 19:29 UTC (Thu) by oak (guest, #2786)

> I got the impression though that LTTV was being deprecated in favour of
> eclipse-based widgets

Do you have any pointers to more information about this?

Fun with tracepoints

Posted Aug 20, 2009 19:32 UTC (Thu) by fuhchee (guest, #40059)

I don't want to misrepresent LTTng, so please do take all this
with a grain of salt, but this is what I gathered from the presentations
given at http://ltt.polymtl.ca/tracingwiki/index.php/TracingMiniSu...

Fun with tracepoints

Posted Aug 20, 2009 20:58 UTC (Thu) by oak (guest, #2786)

Thanks! According to this:
http://eclipse.org/linuxtools/projectPages/lttng/

"The first release, scheduled for September 2009 (code name: Vanilla),
will provide feature parity with the LTTng Viewer (LTTV) v0.12.12."

And this seemed to have a screenshot of the LTTng plugin:
http://ltt.polymtl.ca/tracingwiki/images/0/00/TMF_-_Traci...

This was a pretty good overview of past & present tracing:
http://ltt.polymtl.ca/tracingwiki/images/5/57/Ts2009-hell...

The architecture bit here was annoying:
http://ltt.polymtl.ca/tracingwiki/images/4/46/Ts2009-Syst...

It assumes that one has a working user-space in the problematic cases one
wants to analyze. The kernel should optionally be able to get the data out
through some high-speed HW interface without going through user-space,
and filling the flight recorder buffer had better not rely on
user-space.

Fun with tracepoints

Posted Aug 22, 2009 16:23 UTC (Sat) by compudj (subscriber, #43335)

About your comment on the architecture, I just want to clarify a few points. First, the architecture diagram you see at http://ltt.polymtl.ca/tracingwiki/images/4/46/Ts2009-Syst... focuses on tracing of userland. In this diagram, kernel tracing is contained within the "kernel trace facilities" box. For user-space tracing, where the goal is to get data out of the applications, it makes sense to assume that user-space is working. As you point out, this assumption makes less sense when we talk about tracing the kernel.

Second, more specifically about the LTTng kernel tracer, you are right in that the current mechanism used to extract data is a splice() system call controlled by a user-space daemon. However, alternate implementations of ltt-relay-alloc.c and ltt-relay-lockless.c could easily permit the use of a high-speed debug interface. This has already been done with earlier LTTng versions for ARM.

The core of the LTTng kernel tracer therefore does not depend on userland. It's only the peripheral data extraction and trace control modules which depend on working userland. But they could be replaced easily by built-in kernel objects interacting directly with the LTTng kernel API. I made sure all operations we allow from interfaces presented to user-space are also doable from within the kernel.

Mathieu
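
The separation described above (a tracer core that only fills a buffer, with replaceable extraction modules) might be sketched as follows. This is an illustration in Python, not LTTng code: `TraceCore`, `record`, and `drain` are hypothetical names, and the sink callable stands in for either a user-space daemon or a hardware debug port.

```python
from collections import deque
from typing import Callable, Deque


class TraceCore:
    """Hypothetical tracer core: it only records events into a bounded buffer.

    It knows nothing about how the data leaves the system. In flight-recorder
    fashion, the oldest event is dropped when the buffer is full.
    """

    def __init__(self, capacity: int) -> None:
        self._events: Deque[bytes] = deque(maxlen=capacity)

    def record(self, event: bytes) -> None:
        # deque(maxlen=...) discards the oldest entry automatically when full.
        self._events.append(event)

    def drain(self, sink: Callable[[bytes], None]) -> int:
        """Hand every buffered event to `sink`, oldest first; return the count."""
        count = 0
        while self._events:
            sink(self._events.popleft())
            count += 1
        return count


core = TraceCore(capacity=2)
for payload in (b"irq_entry", b"sched_switch", b"irq_exit"):
    core.record(payload)

# Any callable can act as the extraction path; here a list plays that role.
extracted = []
core.drain(extracted.append)
```

Because `drain` accepts any callable, swapping the extraction path does not touch the recording code, which mirrors the claim that only the peripheral modules depend on userland.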


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds