From: Paul Smith <paul-AT-mad-scientist.net>
To: Alan Cox <alan-AT-lxorguk.ukuu.org.uk>
Subject: Re: [PATCH] coredump: Retry writes where appropriate
Date: Mon, 01 Jun 2009 14:39:04 -0400
Cc: Oleg Nesterov <oleg-AT-redhat.com>, linux-kernel-AT-vger.kernel.org,
    stable-AT-kernel.org, Andrew Morton <akpm-AT-linux-foundation.org>,
    Andi Kleen <andi-AT-firstfloor.org>,
    Roland McGrath <roland-AT-redhat.com>

On Mon, 2009-06-01 at 18:49 +0100, Alan Cox wrote:
> > On the other hand, IMO all other signals, including SIGINT and SIGQUIT,
> > should be ignored during core dumping. Allowing SIGKILL gives a method
> > for getting rid of core dumps in the relatively rare situation where
> > people want/need to do so, and I don't see any real benefit to adding
> > more signals to the list of things you can't do if you want robust
> > cores. Isn't one enough?
>
> I also want usability. SIGINT/SIGQUIT are never sent except by user
> requests to terminate a process so they can safely be allowed. If the
> alternatives are the status quo or SIGKILL only then I'd favour the
> status quo particularly having experienced the alternatives on some old
> Unix systems.

SIGINT/SIGQUIT are sent all the time in situations where the user might
not want the core dump to be canceled. This is what I meant by "wanted
to actually interrupt the core"; it implies the user knows that a core
is being dumped and explicitly decides they do not want to have that
happen in this situation and takes some affirmative action to stop it.

If a program seems to be unresponsive the user could ^C, without
realizing that it was really dumping core. Now when they are asked to
produce the core so the problem can be debugged, they can't do it. Or,
a worker process might appear unresponsive due to a core being dumped
and the parent would decide to shoot it with SIGINT based on various
timeouts etc. Again we have no core available.

If the user has problems with coredumps there are all sorts of ways to
manage that. You can disable core dumps altogether via ulimit. You can
set core_pattern to dump to a fully-qualified pathname on faster media
instead of whatever working directory you're using.

Or, with this change, you can kill -9 the PID that's dumping core.
These things seem to me to provide a lot of usability features.

On the other hand there's no way to ensure full, reliable core dumps
with today's behavior.
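
For illustration only, here is a rough userspace sketch of two of the
management options mentioned above: turning core dumps off for a process
(what "ulimit -c 0" does in a shell) and checking where core_pattern
currently sends them. It uses Python's standard resource module; the
helper names disable_core_dumps and read_core_pattern are made up for
this example, and the /proc path assumes a Linux system.

```python
import resource


def disable_core_dumps():
    """Set the soft core-size limit to 0 for this process and its children.

    This is the programmatic analogue of `ulimit -c 0`. The hard limit is
    left untouched so the soft limit can be raised again later.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (0, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


def read_core_pattern(path="/proc/sys/kernel/core_pattern"):
    """Return the current core_pattern string, or None if it is unreadable.

    The default path is the usual Linux location; this is an assumption
    and will not exist on other systems.
    """
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    soft, hard = disable_core_dumps()
    print("core soft limit:", soft, "hard limit:", hard)
    print("core_pattern:", read_core_pattern())
```

Changing core_pattern itself (for example, to point at an absolute path
on faster media) needs root and is normally done through sysctl, so the
sketch only reads it.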