Re: [PATCH 6/6] kernel: Avoid softlockups in stop_machine() during heavy printing
[Posted March 19, 2014 by corbet]
From: Andrew Morton <akpm-AT-linux-foundation.org>
To: Jan Kara <jack-AT-suse.cz>
Subject: Re: [PATCH 6/6] kernel: Avoid softlockups in stop_machine() during heavy printing
Date: Thu, 13 Mar 2014 16:09:15 -0700
Message-ID: <20140313160915.17f0a285ae1cde36dbc76399@linux-foundation.org>
Cc: LKML <linux-kernel-AT-vger.kernel.org>, pmladek-AT-suse.cz, Steven Rostedt <rostedt-AT-goodmis.org>, Frederic Weisbecker <fweisbec-AT-gmail.com>

On Thu, 13 Mar 2014 16:58:38 +0100 Jan Kara <jack@suse.cz> wrote:
> When there are lots of messages accumulated in printk buffer, printing
> them (especially over serial console) can take a long time (tens of
> seconds). stop_machine() will effectively make all cpus spin in
> multi_cpu_stop() waiting for the CPU doing printing to print all the
> messages which triggers NMI softlockup watchdog and RCU stall detector
> which add even more to the messages to print. Since machine doesn't do
> anything (except serving interrupts) during this time, also network
> connections are dropped and other disturbances may happen.
>
> Paper over the problem by waiting for printk buffer to be empty before
> starting to stop CPUs. In theory a burst of new messages can be appended
> to the printk buffer before CPUs enter multi_cpu_stop() so this isn't a 100%
> solution but it works OK in practice and I'm not aware of a reasonably
> simple better solution.
>
Yes it's rather hacky, but it's simple and direct and explicit and
obvious. It's the stealth hackiness which causes harm.
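
For readers who want the idea in concrete form, the "drain first, then stop" ordering from the quoted changelog can be sketched as a small userspace model. This is not the patch itself and not kernel code: every name below (log_buf, log_drain, stop_machine_model, burst_writer) is hypothetical and chosen only for illustration. The model also shows the remaining race the changelog admits to: messages appended after the drain but before the stop are still pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define LOG_CAPACITY 64

struct log_buf {
    const char *msgs[LOG_CAPACITY];
    size_t head; /* index of the next message to print */
    size_t tail; /* index of the next free slot */
};

/* Queue a message; returns false if the model buffer is full. */
static bool log_append(struct log_buf *lb, const char *msg)
{
    assert(lb != NULL);
    if (lb->tail == LOG_CAPACITY)
        return false;
    lb->msgs[lb->tail++] = msg;
    return true;
}

/* Number of messages queued but not yet printed. */
static size_t log_pending(const struct log_buf *lb)
{
    assert(lb != NULL);
    return lb->tail - lb->head;
}

/* Print everything currently queued; returns how many were printed. */
static size_t log_drain(struct log_buf *lb)
{
    size_t printed = 0;

    assert(lb != NULL);
    while (lb->head < lb->tail) {
        puts(lb->msgs[lb->head++]);
        printed++;
    }
    return printed;
}

/* Example late writer: appends one message after the drain has finished. */
static void burst_writer(struct log_buf *lb)
{
    log_append(lb, "late message");
}

/*
 * Model of the ordering described in the changelog: empty the buffer
 * first, then begin the "stop". late_writer (may be NULL) stands in for
 * a burst of messages arriving between the drain and the stop.
 * Returns the number of messages still pending when the stop begins.
 */
static size_t stop_machine_model(struct log_buf *lb,
                                 void (*late_writer)(struct log_buf *))
{
    log_drain(lb);
    if (late_writer)
        late_writer(lb);
    return log_pending(lb);
}
```

Calling stop_machine_model() with a NULL late_writer leaves nothing pending, which is the common case the changelog relies on. Passing burst_writer leaves one message pending, which is the window that makes this "not a 100% solution".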