A SystemTap update
Posted Jan 29, 2009 4:04 UTC (Thu) by akpm (guest, #4826)Parent article: A SystemTap update
kernel patches (utrace and uprobes) which have dim-to-zero prospects
of ever being included in Linux?
Posted Jan 29, 2009 8:23 UTC (Thu)
by eugeniy (guest, #24280)
[Link] (7 responses)
Posted Jan 29, 2009 10:08 UTC (Thu)
by ctg (guest, #3459)
[Link] (5 responses)
.. so reading this article was timely. Looks like systemtap would enable us to quickly home in on the big disk users..
.. the article quite clearly states that to get the best out of systemtap you need these patches, so when Mr Morton himself makes this sort of criticism, then it's a bit of a concern.
Despite all that, I'm off to look at systemtap in a bit more detail (its lack of ubiquity has put me off before), but the lack of decent tools for working out what is really going on in a complex system is pretty frustrating. (I'm still suffering from the lack of the "W" flag in the output of ps(1) to show which processes are swapped out. I understand why it doesn't show that any more, but when your system goes into swap, it's useful to see which processes are being paged out. I suspect systemtap might be able to help with this too.)
Posted Jan 29, 2009 10:42 UTC (Thu)
by mjw (subscriber, #16740)
[Link]
Also take a look at some of the examples that come with Systemtap. disktop.stp probably does what you want:
Posted Jan 29, 2009 17:05 UTC (Thu)
by knobunc (guest, #4678)
[Link]
Posted Jan 29, 2009 20:07 UTC (Thu)
by epb205 (guest, #50182)
[Link] (1 responses)
Posted Feb 3, 2009 22:33 UTC (Tue)
by oak (guest, #2786)
[Link]
If the process has stuff that's marked as swapped, but not anymore as
Posted Jan 30, 2009 3:15 UTC (Fri)
by SEJeff (guest, #51588)
[Link]
http://www.digitalprognosis.com/opensource/scripts/top-di...
The output looks like this:
Posted Jan 29, 2009 12:46 UTC (Thu)
by eugeniy (guest, #24280)
[Link]
Posted Jan 29, 2009 10:38 UTC (Thu)
by mjw (subscriber, #16740)
[Link]
The last part of the article gives some idea of the ways people are working on getting this functionality upstream faster, so that it is included in more distributions by default: splitting it up, providing other users, and so on. One recent example is the utrace->ftrace engine proof of concept: http://lkml.org/lkml/2009/1/27/294
If you have any hints and tips for getting these things, or similar user space hooks that Systemtap can use, upstream faster, that would be appreciated.
Posted Jan 29, 2009 13:28 UTC (Thu)
by fuhchee (guest, #40059)
[Link]
For probing user-space, there is approximately no alternative: one needs a
> which have dim-to-zero prospects of ever being included in Linux?
While skepticism may be warranted, we are making efforts to make this
Posted Jan 29, 2009 16:15 UTC (Thu)
by jejb (subscriber, #6654)
[Link] (5 responses)
We've spent quite a lot of effort explaining the problems with the utrace/uprobes dependency (especially the issues of having to pull the process symbol table into the kernel and of having the kernel actually execute the compiled code to do the traps). There is hope that we might be able to go with a lighter weight infrastructure that simply vectors traps to the user space stap runtime and does all the interpreting in user space. It's just that we still haven't quite got SystemTap buy-in yet.
Posted Jan 29, 2009 16:36 UTC (Thu)
by fuhchee (guest, #40059)
[Link] (3 responses)
Can you provide some links to discussion about these specifics?
> (especially the issues of having to pull the
User-space symbol tables are made available to the systemtap module
> and of having the kernel actually
Like in dtrace, instrumentation is run within the kernel because
Posted Jan 29, 2009 16:55 UTC (Thu)
by jejb (subscriber, #6654)
[Link] (2 responses)
Um, just use a search ... if you search lkml for utrace you get the less polite version .. if you search the systemtap lists on the same thing, you get the more polite one.
>> (especially the issues of having to pull the
Only if you buy the premise that the kernel has to be intimately involved in the trace instead of being a simple conduit for mediating it.
>> and of having the kernel actually
Well, this would be the classic illustration of the problems systemtap faces. Nothing on the above laundry list is impossible even if the kernel merely controls the traced process and lets userspace poke at it ... that, after all, is how gdb works. The brick wall is that kernel developers don't think this is at all a compelling argument and apparently systemtap people think it is.
Posted Jan 29, 2009 17:26 UTC (Thu)
by fuhchee (guest, #40059)
[Link]
I asked because I recall no serious debate about the two specific items ("process symbol tables in the kernel" and "having kernel ... execute code ... to do the traps") you listed. Please humor fellow readers and give some links.
> > User-space symbol tables are made available to the systemtap module
> Only if you buy the premise that the kernel has to be intimately involved
There are many possible details behind such a summary. If one wants dtrace-level introspection and manipulation, never mind going beyond it, some "intimate involvement" (kernel-side processing?) is necessary. Merely "mediating" (data copying?) is not sufficient, since the choice of data and the nature of the programmed reaction is itself variable.
> [...] that, after all, is how gdb works. [...]
The work involved in how gdb does its thing is several orders of magnitude heavier.
> The brick wall is that kernel developers don't think this is at all a
Individual kernel people don't need to buy into every argument for systemtap to bloom. We have promoted numerous "dual-use" kernel-side technologies that can stand on their own feet. For example, with utrace, if you believe that user-space instrumentation is plausible, you should support utrace and forthcoming ("froggy" or "ubs"-like) layers on top, for dispatching those events to a hypothetical user-space handler.
The details deserve more in-depth discussion.
Posted Feb 3, 2009 22:41 UTC (Tue)
by oak (guest, #2786)
[Link]
As to why to do it in kernel... Doing it from user space is just too slow.
Posted Jan 30, 2009 10:31 UTC (Fri)
by mjw (subscriber, #16740)
[Link]
Correct.
> We've spent quite a lot of effort explaining the problems with the utrace/uprobes dependency (especially the issues of having to pull the process symbol table into the kernel and of having the kernel actually execute the compiled code to do the traps).
Could you post the problems you see?
How a tracing tool like systemtap processes and uses the symbol table is largely orthogonal to utrace and uprobes. utrace and uprobes might make it easier to access symbol tables at runtime, but that isn't what Systemtap currently does. If you want a tracer to do these things dynamically at trace event time, or even to push the whole thing towards user space in reaction to trace events and hand it off to a user space helper, then that is certainly a design choice you can make (unlike tracers, debuggers do this, for example, since they don't mind suspending the tracee for a longer period). The article does hint at why "offloading" this to a user space helper might not be practical (see the vfs example and the explanation of what might happen if you try to offload something like that to a perl script). But those are tradeoffs you can make independently of the infrastructure you use in the kernel to handle events and trace point insertion.
> There is hope that we might be able to go with a lighter weight infrastructure that simply vectors traps to the user space stap runtime and does all the interpreting in user space.
Yes, there is nothing inherent in utrace or uprobes about how you handle trace events or how you use and insert vector traps into user space. That is the basic idea behind pushing them upstream, because they are useful apart from systemtap. They should also be useful for other tracers like connecting them to ftrace or lttng. You could even use them for a new debugger interface if you aren't interested in a no-overhead tracer. That is what the froggy project is exploring. It seems time to provide something better than the ptrace interface for debuggers.
http://sourceware.org/systemtap/examples/keyword-index.ht...
It's separate for each of the memory mappings the process has, i.e. you may
need to write a small script to process the data.
dirty, it's completely swapped out. For some reason the kernel's SMAPS
output no longer considers swapped pages dirty, which loses the distinction
between shared dirty and private dirty that SMAPS shows for pages still in
RAM.
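A small script of the kind suggested above might look like the following sketch. It assumes the kernel reports a `Swap:` value in kB for each mapping in `/proc/<pid>/smaps`; the function names are made up for illustration and are not part of any existing tool.

```python
import re

# Matches lines such as "Swap:                 12 kB" in smaps output.
# The colon directly after "Swap" keeps other fields (e.g. "SwapPss:") out.
_SWAP_LINE = re.compile(r"^Swap:\s+(\d+)\s+kB", re.MULTILINE)


def swap_kb_from_smaps(text):
    """Sum the per-mapping Swap: values (in kB) found in smaps text."""
    return sum(int(kb) for kb in _SWAP_LINE.findall(text))


def swap_kb_for_pid(pid):
    """Hypothetical helper: total swapped kB for one process.

    Reading another user's smaps file typically needs root.
    """
    with open("/proc/%d/smaps" % pid) as f:
        return swap_kb_from_smaps(f.read())
```

Calling `swap_kb_for_pid()` for each PID of interest and sorting by the result gives a rough replacement for the old "W" flag view, with the caveat noted above that swapped pages lose the shared/private dirty distinction.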
root@desktopmonster:~# ./top-disk-users
COMMAND      PID    NUM  ACTION  DEVICE
banshee-1    23999    8  READ    sda9
kjournald     2494  131  WRITE   sda5
kjournald     5182    5  WRITE   sda8
pdflush        228   15  WRITE   sda5
pdflush        228    1  WRITE   sda8
pdflush        228   32  WRITE   sda9
> kernel patches (utrace and uprobes)
kprobes-like infrastructure.
code more palatable to the gatekeepers.
> utrace/uprobes dependency
> process symbol table into the kernel
only if it is required by the script - if it performs symbolic
address or backtrace type lookups.
> execute the compiled code to do the traps
having user-space processes instrument each other is too disruptive.
We're looking for microsecond-level probe effect, not something
involving multiple context switches, indirect address space accesses,
and so on.
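To make the cost argument concrete, here is a back-of-the-envelope sketch. Every figure in it is an illustrative assumption, not a measurement: it simply multiplies an assumed probe hit rate by an assumed per-hit cost to estimate how much of one CPU goes to instrumentation.

```python
def probe_overhead_fraction(hits_per_second, cost_per_hit_us):
    """Fraction of one CPU-second consumed by probe handling."""
    return hits_per_second * cost_per_hit_us / 1000000.0


# Hypothetical inputs, chosen only to show the order-of-magnitude gap.
HITS_PER_SECOND = 50000     # assumed: a hot probe point
IN_KERNEL_COST_US = 1.0     # assumed: microsecond-level in-kernel handler
USER_SPACE_COST_US = 50.0   # assumed: context switches plus indirect memory access

in_kernel = probe_overhead_fraction(HITS_PER_SECOND, IN_KERNEL_COST_US)
user_space = probe_overhead_fraction(HITS_PER_SECOND, USER_SPACE_COST_US)

print("in-kernel:  %.0f%% of one CPU" % (in_kernel * 100))
print("user-space: %.0f%% of one CPU" % (user_space * 100))
```

Under these assumed inputs the in-kernel handler costs a few percent of a CPU, while the user-space handler would need more than a full CPU just to keep up. That is the kind of probe effect the argument above is concerned with.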
>> utrace/uprobes dependency
>
> Can you provide some links to discussion about these specifics: ?
>> process symbol table into the kernel
>
> User-space symbol tables are made available to the systemtap module
> only if it is required by the script - if it performs symbolic
> address or backtrace type lookups.
>> execute the compiled code to do the traps
>
> Like in dtrace, instrumentation is run within the kernel because
> having user-space processes instrument each other is too disruptive.
> We're looking for microsecond-level probe effect, not something
> involving multiple context switches, indirect address space accesses,
> and so on.
> > only if it is required by the script
> in the trace instead of being a simple conduit for mediating it.
> compelling argument and apparently systemtap people think it is.
faces. Nothing on the above laundry list is impossible even if the kernel
merely controls the traced process and lets userspace poke at it ... that,
after all, is how gdb works.
Try, e.g., getting backtraces for mallocs through ptrace and you will notice
how infeasible this is from user-space (at least through the interface ptrace
offers). With modern desktop apps that use malloc pretty heavily, the
programs become unusably slow (in addition to their usability, their
functionality may also suffer if they use timeouts for responses etc).