Finding a profiler that works, damnit
Posted Mar 24, 2010 15:18 UTC (Wed) by foom (subscriber, #14868)
You can't convert a single address to a stacktrace later on! You'd need a copy of the whole stack to
do it offline...which I doubt is any faster than just running the unwinder to save out the PC of every
Anyways, thanks for mentioning sysprof: I hadn't heard of that one before. But looking at the
source, it doesn't seem like it's likely to work either, given the function heuristic_trace in:
Posted Mar 24, 2010 17:48 UTC (Wed) by nix (subscriber, #2304)
So it's call the unwinder or nothing, really. Unfortunately, the job of figuring out what the call stack looks like really *is* quite expensive :/ Any effort should presumably go into optimizing libunwind...
Posted Mar 26, 2010 17:55 UTC (Fri) by sandmann (subscriber, #473)
You are right that it doesn't generate good callgraphs on x86-64 unless you compile the application with -fno-omit-frame-pointer. I very much would like to fix this somehow, but I just don't see any really good way to do it.
Fundamentally, it needs to take a stack trace in the kernel in NMI context. You cannot read the .eh_frame information at that time because that would cause page faults and you are not allowed to sleep in NMI context.
Even if there were a way around that problem, you would still have to *interpret* the information, which is a pretty hairy algorithm to put in the kernel (though I believe Andi Kleen did exactly that, resulting in flame wars later on).
You could try copying the entire user stack, but there is considerable overhead associated with that because user stacks can be very large (emacs, for example, allocates a 100K buffer on the stack). You could also try a heuristic stack walk (which is what an old version of sysprof did; new versions use the same kernel interface as perf). That sort of works, but there are a lot of false positives from function pointers and left-over return addresses. The function pointers can be filtered out, but the return addresses can't. These false positives tend to make sysprof's UI display somewhat confusing, though not completely unusable.
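To make the false-positive problem concrete, here is a minimal Python simulation of a heuristic stack walk. It is only a sketch, not sysprof's actual code: the text range, the function start addresses, and the stack contents are all made up for illustration, and dropping words that point exactly at a function's first instruction is just one plausible way to filter function pointers.

```python
import struct

# Hypothetical address range of the program's executable code: [start, end)
TEXT_RANGES = [(0x400000, 0x410000)]
# Hypothetical function entry points, as a symbol table would list them
FUNCTION_STARTS = {0x401000, 0x402000, 0x403000, 0x404000}


def heuristic_walk(stack_bytes, text_ranges=TEXT_RANGES):
    """Return (offset, word) for every 8-byte stack word that points into code."""
    candidates = []
    for offset in range(0, len(stack_bytes) - 7, 8):
        (word,) = struct.unpack_from("<Q", stack_bytes, offset)
        if any(lo <= word < hi for lo, hi in text_ranges):
            candidates.append((offset, word))
    return candidates


def drop_function_pointers(candidates, function_starts=FUNCTION_STARTS):
    """Assumed filter: a return address never points at a function's first byte."""
    return [(off, addr) for off, addr in candidates if addr not in function_starts]


# Simulated stack contents, lowest address first
words = [
    0x0000000000000010,  # plain local variable
    0x0000000000404DEF,  # real return address
    0x0000000000402000,  # function pointer kept in a local (a callback)
    0x00007FFD00001000,  # pointer into the stack itself
    0x0000000000403ABC,  # stale return address left behind by an earlier call
    0x0000000000401234,  # real return address
]
stack = struct.pack("<6Q", *words)

candidates = heuristic_walk(stack)
filtered = drop_function_pointers(candidates)
for offset, addr in filtered:
    print(f"sp+{offset:#04x}: {addr:#x}")
```

In this made-up stack the function pointer is dropped, but the stale return address survives the filter because nothing distinguishes it from a live one without unwind information.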
You could also try some hybrid scheme where the kernel does a heuristic stack walk and userspace then uses the .eh_frame information to filter out the false positives. This is what I think is the most promising approach at this point. Some day I'll try it.
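One way such a hybrid might work, sketched in Python under heavy assumptions: the kernel reports the sampled PC plus every code-like stack word with its offset from the sampled stack pointer, and userspace keeps only the words that sit where the unwind data says a return address must be. The unwind table here is hypothetical and drastically simplified (one fixed size per function, where real .eh_frame data gives a rule per instruction), and all addresses are invented.

```python
import bisect

# Hypothetical unwind table: (function start, name, bytes of locals between
# the stack pointer and that function's own return-address slot).
UNWIND_TABLE = [
    (0x401000, "main", 16),
    (0x402000, "callback", 0),
    (0x403000, "helper", 8),
    (0x404000, "work", 8),
    (0x405000, "leaf", 8),
]
STARTS = [start for start, _, _ in UNWIND_TABLE]


def lookup(addr):
    """Find the table entry for the function containing addr."""
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        raise LookupError(f"no function contains {addr:#x}")
    return UNWIND_TABLE[i]


def filter_candidates(pc, candidates):
    """Keep only candidates located where the unwind table expects a return address.

    candidates maps stack offset (from the sampled SP) to the code address
    found there by the heuristic walk.
    """
    trace = [pc]
    _, _, local_bytes = lookup(pc)
    slot = local_bytes  # where the sampled function's return address should be
    while slot in candidates:
        ret = candidates[slot]
        trace.append(ret)
        _, _, local_bytes = lookup(ret)
        slot += 8 + local_bytes  # skip this slot and the caller's locals
    return trace


# Output of an imagined heuristic walk: two real return addresses,
# one stale return address (offset 16) and one function pointer (offset 32).
kernel_candidates = {8: 0x404DEF, 16: 0x403ABC, 24: 0x401234, 32: 0x402000}
trace = filter_candidates(0x405010, kernel_candidates)
print([lookup(addr)[1] for addr in trace])
```

The appeal of the split is that the kernel side stays a cheap scan with no page faults on unwind data, while the expensive table lookups happen later in userspace.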
Finally, the distributions could just compile with -fno-omit-frame-pointer by default. x86-64 is not really register-starved, so it wouldn't be a significant performance problem. The Solaris compiler does precisely this because they need to take stack traces for dtrace.
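For comparison, this is roughly why frame pointers make the walk so cheap. The sketch below simulates memory as a Python dict with invented addresses; it assumes the usual x86-64 layout where the saved caller %rbp sits at [%rbp] and the return address at [%rbp+8], and that the outermost frame is marked with a zero frame pointer.

```python
def walk_frame_pointers(memory, rbp, max_depth=64):
    """Follow the saved-%rbp chain and collect return addresses.

    memory maps an address to the 8-byte value stored there; a missing key
    stands in for memory that cannot be read.
    """
    trace = []
    while rbp != 0 and len(trace) < max_depth:
        saved_rbp = memory.get(rbp)
        ret = memory.get(rbp + 8)
        if saved_rbp is None or ret is None:
            break  # unreadable memory: give up instead of faulting
        trace.append(ret)
        if saved_rbp != 0 and saved_rbp <= rbp:
            break  # callers live at higher addresses; anything else is corrupt
        rbp = saved_rbp
    return trace


# Simulated stack with three frames (all addresses are made up)
memory = {
    0x7FFD1000: 0x7FFD1020, 0x7FFD1008: 0x404DEF,  # innermost frame
    0x7FFD1020: 0x7FFD1050, 0x7FFD1028: 0x401234,
    0x7FFD1050: 0x0,        0x7FFD1058: 0x400800,  # outermost frame
}
frames = walk_frame_pointers(memory, 0x7FFD1000)
print([hex(addr) for addr in frames])
```

Each frame costs two reads and no table lookups, which is what makes this kind of walk feasible in a context where sleeping is not allowed.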
But, I fully expect to be told that for performance reasons we can't have working profilers.
Posted Mar 26, 2010 18:06 UTC (Fri) by rahulsundaram (subscriber, #21946)
Posted Mar 26, 2010 21:44 UTC (Fri) by foom (subscriber, #14868)
Why does it need to happen at NMI time? Why can't you just do it in the user process's context,
before resuming execution of their code?
The setitimer(ITIMER_PROF) solution that userspace profilers use clearly works out fine for
userspace profiling. Can't you do something similar for userspace profiling from within the kernel?
The stack trace of the userspace half clearly can't change between when you received the NMI and
when you resume execution of the process...
That just leaves the complication of implementing the DWARF unwinder in the kernel, but there's
already much more complex code in the kernel...that really seems like it should be a non-issue.
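For reference, the setitimer(ITIMER_PROF) approach mentioned above looks roughly like this from userspace. This is a minimal Python sketch (Unix only, main thread only), not any particular profiler's implementation; the names `on_sigprof`, `profile`, and `busy` are made up for the example.

```python
import collections
import signal

samples = collections.Counter()


def on_sigprof(signum, frame):
    """Record the call stack of whatever code the timer interrupted."""
    stack = []
    while frame is not None:
        stack.append(frame.f_code.co_name)
        frame = frame.f_back
    samples[tuple(reversed(stack))] += 1


def busy(n):
    """CPU-bound work to be profiled."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile(func, *args, interval=0.005):
    """Run func while sampling its stack every `interval` seconds of CPU time."""
    old_handler = signal.signal(signal.SIGPROF, on_sigprof)
    signal.setitimer(signal.ITIMER_PROF, interval, interval)
    try:
        return func(*args)
    finally:
        signal.setitimer(signal.ITIMER_PROF, 0, 0)  # disarm the timer
        if old_handler is None:  # previous handler was not installed from Python
            old_handler = signal.SIG_DFL
        signal.signal(signal.SIGPROF, old_handler)


result = profile(busy, 3_000_000)
for stack, count in samples.most_common(3):
    print(count, " > ".join(stack))
```

Incidentally, CPython itself defers the work in the way suggested above: the low-level signal handler only sets a flag, and the Python-level handler runs later, when the interpreter is back in a safe state.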
Posted Mar 26, 2010 23:28 UTC (Fri) by nix (subscriber, #2304)
Posted Mar 27, 2010 14:10 UTC (Sat) by garloff (subscriber, #319)
Posted Mar 27, 2010 15:12 UTC (Sat) by foom (subscriber, #14868)
Posted Mar 28, 2010 0:26 UTC (Sun) by garloff (subscriber, #319)
Posted Apr 4, 2010 12:35 UTC (Sun) by chantecode (subscriber, #54535)
Posted Apr 5, 2010 1:37 UTC (Mon) by foom (subscriber, #14868)
Posted Mar 30, 2010 18:51 UTC (Tue) by fuhchee (subscriber, #40059)
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds