
Reliable user-space stack traces with SFrame

Posted May 23, 2023 13:29 UTC (Tue) by nix (subscriber, #2304)
In reply to: Reliable user-space stack traces with SFrame by quotemstr
Parent article: Reliable user-space stack traces with SFrame

> Now, if we were to do the unwinding in a signal handler instead of hard coding SFrame

I'm not sure what this means. The mechanism for unwinding (in-kernel, copies to userspace, whatever) is orthogonal to the format being used (DWARF, SFrame, ORC): they can presumably all be unwound using code running in many contexts. They're just formats after all.

But in general, a signal handler can't do anything useful involving the process it is running inside. In particular, you can't use stdio or allocate memory, and more or less arbitrary locks might be held. And that's when nothing has gone wrong: if you're backtracing, quite often it's because all hell has broken loose and the program might be in any state at all. glibc removed the machinery that gave (fp-based) backtraces on stack-protector failure for a reason.

One attractive-sounding alternative suggested at a past LPC is to use a coredump handler. It is given an image of as much or as little of the process as you wish to configure (this stuff is customizable in /proc) and can do whatever it wants, because it's a completely separate process that nothing has gone wrong with: it isn't in a signal handler and has no unexpected locks or half-completed mallocs fouling things up. But a signal handler? The more you do with signals, the more pain you'll eventually be in, and that goes double if the process is halfway through crashing!
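To make the coredump-handler idea a bit more concrete, here is a minimal sketch in C. It assumes the handler is a helper program registered through a pipe entry in /proc/sys/kernel/core_pattern, so that the kernel feeds it the core image on standard input. It also assumes a 64-bit core with the same byte order as the handler. The names is_elf_core and handle_core are made up for illustration, and the actual unwinding step is left as a comment.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if buf starts with a 64-bit ELF header whose type is ET_CORE.
 * Assumption: the core has the same endianness as this handler. */
int is_elf_core(const unsigned char *buf, size_t len)
{
    Elf64_Ehdr eh;

    if (len < sizeof eh || memcmp(buf, ELFMAG, SELFMAG) != 0)
        return 0;
    memcpy(&eh, buf, sizeof eh);   /* copy to avoid unaligned access */
    return eh.e_ident[EI_CLASS] == ELFCLASS64 && eh.e_type == ET_CORE;
}

/* Hypothetical entry point: read the start of the core image from `in`
 * (stdin when invoked via a core_pattern pipe) and sanity-check it. */
int handle_core(FILE *in)
{
    unsigned char hdr[sizeof(Elf64_Ehdr)];
    size_t got = fread(hdr, 1, sizeof hdr, in);

    if (!is_elf_core(hdr, got))
        return -1;
    /* From here the handler is an ordinary process: it may malloc, use
     * stdio, parse PT_NOTE and PT_LOAD segments, and unwind at leisure. */
    return 0;
}
```

A program built around this would call handle_core(stdin) from its main(). The point is only that none of the signal-handler restrictions apply here, since the crashed process is just data.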



Reliable user-space stack traces with SFrame

Posted May 23, 2023 23:32 UTC (Tue) by eklitzke (subscriber, #36426)

Obviously care needs to be taken with the code you write in a signal handler, but that doesn't mean signal handlers aren't useful, and they're definitely not only useful for a crashing process. At the company I work for we use setitimer with ITIMER_PROF, and in the SIGPROF signal handler we unwind the stack by following frame pointers up to 48 frames deep. The frames are written into a fixed-size ring buffer, so we have the last ~10s of profile data in memory at all times. None of this requires using stdio or memory allocation or anything else unsafe. There is some slightly tricky locking logic for reading/writing the ring buffer (when we dump profiles from the buffer we need to make sure it doesn't race with the signal handler), but it isn't rocket science.
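A rough sketch of the scheme described above, in C, might look like the following. It is not the actual code referred to in the comment. It assumes a single profiled thread, a stack that grows downward, frame pointers kept in every function, and the usual x86-64/AArch64 layout where each frame starts with {saved frame pointer, return address}. RING_SLOTS, walk_fp, on_sigprof and start_profiling are names invented here, and the reader-side locking is left out entirely.

```c
#define _GNU_SOURCE   /* sigaction/setitimer declarations under strict -std modes */
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/time.h>

#define MAX_FRAMES 48     /* depth limit from the comment above */
#define RING_SLOTS 1024   /* hypothetical capacity */

struct sample {
    uint32_t  depth;
    uintptr_t pc[MAX_FRAMES];
};

static struct sample ring[RING_SLOTS];   /* preallocated: no malloc in the handler */
static volatile sig_atomic_t ring_head;  /* next slot to overwrite */
static uintptr_t stack_lo, stack_hi;     /* bounds of the profiled thread's stack */

/* Follow a chain of {saved fp, return address} pairs starting at fp.
 * Every pointer is checked against [lo, hi) before it is dereferenced. */
uint32_t walk_fp(uintptr_t fp, uintptr_t lo, uintptr_t hi,
                 uintptr_t *out, uint32_t max)
{
    uint32_t n = 0;

    while (n < max && fp >= lo && fp % sizeof(uintptr_t) == 0 &&
           fp + 2 * sizeof(uintptr_t) <= hi) {
        const uintptr_t *frame = (const uintptr_t *)fp;
        uintptr_t next = frame[0];

        if (frame[1] == 0)
            break;
        out[n++] = frame[1];
        if (next <= fp)          /* callers must live at higher addresses */
            break;
        fp = next;
    }
    return n;
}

/* SIGPROF handler: only touches preallocated memory, no stdio, no locks. */
static void on_sigprof(int sig)
{
    struct sample *s = &ring[ring_head];

    (void)sig;
    s->depth = walk_fp((uintptr_t)__builtin_frame_address(0),
                       stack_lo, stack_hi, s->pc, MAX_FRAMES);
    ring_head = (ring_head + 1) % RING_SLOTS;
}

/* Install the handler and arm ITIMER_PROF to fire every interval_usec. */
int start_profiling(uintptr_t lo, uintptr_t hi, long interval_usec)
{
    struct sigaction sa;
    struct itimerval it;

    stack_lo = lo;
    stack_hi = hi;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigprof;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGPROF, &sa, NULL) != 0)
        return -1;
    it.it_interval.tv_sec = interval_usec / 1000000;
    it.it_interval.tv_usec = interval_usec % 1000000;
    it.it_value = it.it_interval;
    return setitimer(ITIMER_PROF, &it, NULL);
}
```

This would need to be built with -fno-omit-frame-pointer for the walk to produce meaningful frames. On glibc the stack bounds could be looked up once at startup, outside the handler, with pthread_getattr_np and pthread_attr_getstack. A real implementation would also need the reader/handler synchronization mentioned above, which this sketch does not attempt.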

