
Ext4 data corruption in stable kernels: it's not a larger problem

Posted Dec 10, 2023 22:06 UTC (Sun) by geofft (subscriber, #59789)
In reply to: Ext4 data corruption in stable kernels by rolexhamster
Parent article: Ext4 data corruption in stable kernels

Do problems like this happen frequently? I can't remember the last time that something like this happened at all in Linux. There certainly have been bugs introduced by incorrect backports, and complaints about them, e.g. https://lore.kernel.org/stable/Y1DTFiP12ws04eOM@sol.local... (which I thought LWN covered, but I can't find the article about it), but I can't remember an incorrect backport causing data corruption or some other problem so widespread that people had to be told retroactively not to upgrade.

Actually, that particular thread is interesting in its own way, because it's complaining about an AI called "autosel" that picks patches to backport even if they're not tagged as Cc stable. But the problematic patch in this case (https://lore.kernel.org/r/20231013121350.26872-1-jack@sus...) wasn't picked up by autosel; it was explicitly tagged Cc stable and was explicitly claimed to be fixing a data loss bug of its own, which also means this isn't an example of "it's good to have it just in case." It was believed to be broke. That's why they fixed it.

On the other hand, I think there are a huge number of real fixes that have come from stable kernels actually being vigorously maintained.

A system as complex as Linux kernel development is never going to have zero errors of commission and zero errors of omission at the same time. One error of commission in a really long time seems like a fine tradeoff to me. We don't really have a sense of how many problems in stable kernels are left unfixed because nobody thinks to backport the change. (Anecdotally, I think 1-2 times a year at my day job where we run upstream stable, we find something broken, spend the effort to track it down, and discover that it has indeed been fixed in a newer kernel and never backported.)

In fact, LWN had an article recently about how not enough patches are being backported to stable kernels for ext4 in particular, and how someone (ideally an enterprise customer) needs to step up and do so: https://lwn.net/Articles/934941/ . That article also compared individual kernel branches, some of which had enterprise-backed backports and some of which didn't, and the general sense was that the one with fewer backports was the less desirable one. I think it will require way more than a single mistake to argue convincingly that the current policy is wrong.




Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds