
Constraining concurrent core dumps

By Jonathan Corbet
June 23, 2010
Systems running PHP are naturally beset with more than the usual number of challenges from the outset. In some cases, though, it can get even worse; consider this story from Edward Allcutt:

For example, a common configuration for PHP web-servers includes apache's prefork MPM, mod_php and a PHP opcode cache utilizing shared memory. In certain failure modes, all requests serviced by PHP result in a segfault. Enabling coredumps might lead to 10-20 coredumps per second, all attempting to write a 150-200MB core file. This leads to the whole system becoming entirely unresponsive for many minutes.

Edward's response to this non-fun situation was a patch limiting the number of core dumps which can be underway simultaneously; any dumps which would exceed the limit would simply be skipped.

It was generally agreed that a better approach would be to limit the I/O bandwidth of offending processes when contention gets too high. But that approach is not entirely straightforward to implement, especially since core dumps are considered to be special and not subject to normal bandwidth control. So what's likely to happen instead is a variant of Edward's patch where processes trying to dump core simply wait if too many others are already doing the same.
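The difference between the two policies can be illustrated with a toy user-space model. This is only a sketch of the semantics, not the kernel patch; the `DumpLimiter` class and its parameters are hypothetical names invented for illustration.

```python
import threading


class DumpLimiter:
    """Toy model of a cap on simultaneous core dumps (not kernel code)."""

    def __init__(self, max_concurrent, wait=False):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._wait = wait

    def dump(self, write_core):
        # wait=False models the original patch: a dump over the limit is skipped.
        # wait=True models the likely variant: a dump over the limit blocks
        # until one of the running dumps finishes.
        if not self._slots.acquire(blocking=self._wait):
            return False
        try:
            write_core()
            return True
        finally:
            self._slots.release()
```

With `wait=False`, a call to `dump()` made while the limit is already reached returns `False` immediately; with `wait=True` the same call blocks until a slot is free, so no core is lost but the crashing process lingers longer.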



Constraining concurrent core dumps

Posted Jun 24, 2010 8:23 UTC (Thu) by saffroy (subscriber, #43999)

The solution could also be 100% in user space; the kernel has the right interface for that:
http://lwn.net/Articles/280959/

I used it in a small tool of mine:
http://jeanmarc.saffroy.free.fr/corefilter/

Constraining concurrent core dumps

Posted Jun 25, 2010 13:34 UTC (Fri) by NAR (subscriber, #1313)

Hm, I had never heard of the /proc/sys/kernel/core_pattern file before. I have used the Solaris coreadm tool a number of times; now I know how to set the names of the core files on Linux too.

Constraining concurrent core dumps

Posted Jun 27, 2010 16:55 UTC (Sun) by nix (subscriber, #2304)

To me, more useful than setting names is the ability to pipe the output to an arbitrary program (which runs as root), e.g.

|/usr/libexec/dump_core %u %g %p %t

and then have that program do whatever it wants (mail backtraces somewhere, preserve the coredump in a centralized location, raise alarms, you name it).
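As a rough sketch of what such a helper might look like, the following Python reads the core image from standard input and stores it under a name built from the `%u %g %p %t` arguments (UID, GID, PID, and dump time). The destination directory `/var/crash` and the function names are assumptions made for illustration, not anything /usr/libexec/dump_core is known to do.

```python
import os
import shutil
import sys


def save_core(stream, dest_dir, uid, gid, pid, timestamp):
    """Copy a core image from `stream` into dest_dir and return its path."""
    os.makedirs(dest_dir, exist_ok=True)
    path = os.path.join(dest_dir, f"core.{uid}.{gid}.{pid}.{timestamp}")
    # Create the file with mode 0600: the helper runs as root and a core
    # image may contain sensitive data from the crashed process.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as out:
        shutil.copyfileobj(stream, out)
    return path


def main(argv, stdin=None):
    # argv[1:5] mirrors the "%u %g %p %t" arguments from the pattern above.
    uid, gid, pid, timestamp = argv[1:5]
    stream = stdin if stdin is not None else sys.stdin.buffer
    # "/var/crash" is a hypothetical destination chosen for this sketch.
    return save_core(stream, "/var/crash", uid, gid, pid, timestamp)
```

A real helper would finish by calling `main(sys.argv)`; from there it could just as well mail a backtrace or raise an alarm instead of saving the file.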

Constraining concurrent core dumps

Posted Jun 28, 2010 0:58 UTC (Mon) by madscientist (subscriber, #16861)

I agree with nix; this problem is already easily solvable in userspace: I see no point in adding kernel support for it.
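One way a piped helper could enforce such a limit in user space is with a fixed set of lock files, one per allowed dump. The sketch below is an assumption about how this might be done, not a description of any existing tool; `acquire_slot`, `release_slot`, `handle_core`, and the lock directory layout are all hypothetical.

```python
import fcntl
import os


def acquire_slot(lock_dir, max_dumps):
    """Return an fd holding one of max_dumps slots, or None if all are busy."""
    os.makedirs(lock_dir, exist_ok=True)
    for i in range(max_dumps):
        path = os.path.join(lock_dir, f"slot{i}.lock")
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            # Non-blocking exclusive lock: fails at once if the slot is taken.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return fd
        except BlockingIOError:
            os.close(fd)
    return None


def release_slot(fd):
    # Closing the descriptor drops the flock held on it.
    os.close(fd)


def handle_core(stream, lock_dir, max_dumps, save):
    """Save the core via save(stream) unless max_dumps are already running."""
    fd = acquire_slot(lock_dir, max_dumps)
    if fd is None:
        return None  # over the limit: skip this dump
    try:
        return save(stream)
    finally:
        release_slot(fd)
```

Dropping `LOCK_NB` on the final attempt would turn the skip policy into a wait policy, at the cost of keeping the crashed process around until a slot frees up.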

I would like to see the "receiving a signal aborts a core" problem fixed, though; that is not something that can be resolved in userspace. I published a very simple patch last summer (too simple, according to Alan Cox, who apparently really wants this capability). Roland McGrath and Oleg Nesterov banged on it for a while, but it fell through the cracks, and I was lame and dropped it as well. I'll bring it up again...

Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds