In defense of per-BDI writeback
Chris Mason has tried to provide that justification with a combination of benchmark results and explanations. The benchmarks show a clear, and large, performance improvement from the use of per-BDI writeback. That is good, but it does not, by itself, justify the switch to per-BDI writeback; Andrew had suggested that the older code was slower as a result of performance regressions introduced over time by other changes. If the 2.6.31 code could be fixed, the performance improvement could be (re)gained without replacing the entire subsystem.
What Chris is saying is that the old pdflush method could not be fixed. The fundamental problem with pdflush is that it would back off when the backing device appeared to be congested. But congestion is easy to cause, and no other part of the system backs off in the same way. So pdflush could end up not doing writeback for significant periods of time. Forcing all other writers to back off in the face of congestion could improve things, but that would be a big change that still would not address the other problem: congestion-based backoff can defeat attempts by filesystem code and the block layer to write large, contiguous segments to disk.
As it happens, there is a more general throttling mechanism already built
into the block layer: the finite number of outstanding requests allowed for
any specific device. Once requests are exhausted, threads generating block
I/O operations are forced to wait until request slots become free again.
Pdflush cannot use this mechanism, though, because it must perform
writeback to multiple devices at once; blocking on request allocation
for one device would stall writeback to all the others. A per-device
writeback thread can block there, since doing so will not affect I/O
to any other device. The per-BDI patch
creates these per-device threads and, as a result, it is able to keep
devices busier. That, it seems, is why the old writeback code needed to be
replaced instead of patched.
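
The difference between backing off on congestion and blocking on request allocation can be illustrated with a toy model. What follows is only a sketch, not kernel code: the `Device` class, the `simulate()` function, the slot count, the drain rate, and the congestion threshold are all invented for illustration. A single device has a finite pool of request slots, a competing writer that never backs off refills the queue on every tick, and the flusher either skips the device while it looks congested ("backoff", pdflush-like) or waits for slots and takes a share of whatever frees up ("block", like a per-device thread).

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Toy block device with a finite pool of request slots (hypothetical)."""
    slots: int = 8            # maximum outstanding requests
    drain_per_tick: int = 2   # requests the hardware completes per tick
    in_flight: int = 0        # request slots currently in use

    def complete(self) -> None:
        # The device finishes some requests, freeing their slots.
        self.in_flight -= min(self.in_flight, self.drain_per_tick)

    def free_slots(self) -> int:
        return self.slots - self.in_flight

    def congested(self, threshold: int) -> bool:
        return self.in_flight >= threshold


def simulate(policy: str, ticks: int = 20, threshold: int = 6) -> int:
    """Return how many pages the flusher submitted under the given policy."""
    dev = Device()
    flushed = 0
    for _ in range(ticks):
        dev.complete()
        free = dev.free_slots()
        if policy == "backoff":
            # pdflush-like: skip the device while it appears congested.
            share = 0 if dev.congested(threshold) else (free + 1) // 2
        elif policy == "block":
            # Per-device thread: it sleeps on slot allocation, so it gets
            # a share of the slots that free up (here, half, rounded up).
            share = (free + 1) // 2
        else:
            raise ValueError(f"unknown policy: {policy}")
        dev.in_flight += share
        flushed += share
        # The competing writer never backs off: it takes every remaining slot.
        dev.in_flight += dev.free_slots()
    return flushed


if __name__ == "__main__":
    for policy in ("backoff", "block"):
        print(policy, simulate(policy))
```

With these made-up parameters, the backoff flusher submits pages only on the first tick, before the queue fills; after that the device always looks congested and writeback stops. The blocking flusher keeps making steady progress for as long as the device completes requests.
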
| Index entries for this article | |
|---|---|
| Kernel | Block layer/Writeback |
| Kernel | Memory management/Writeback |
