The newest development model and 2.6.14
Posted Nov 3, 2005 15:26 UTC (Thu) by mingo
In reply to: The newest development model and 2.6.14
Parent article: The newest development model and 2.6.14
(disclaimer: i am a Linux kernel developer)
i have read your comments with interest, up to this one:
Instead, the devs seem to think that the distros will have to do the bugfixing. But there's no guarantee they'll do it in compatible ways. If, say, shared memory has a problem, Red Hat may solve it one way, and Mandrake may solve it another, and it's entirely possible they won't be compatible. So if you, the customer, has a problem with a particular program running (like, say, Oracle), then all you end up with is finger pointing. Oracle blames Red Hat, Mandrake blames Oracle.
while i mostly agree with the points you raised before this, you are really - i stress - _really_ wrong regarding compatibility, and about the mechanics of Linux kernel bugfixing.
firstly, distros do not fix bugs 'for themselves', they fix them in the _upstream_ kernel, primarily because they don't want the additional maintenance cost of having to carry a special fix with them. So they try really, really hard to have an upstream solution for whatever bug. If upstream has changed in that area too much they _still_ fix the bug in the upstream kernel and backport that fix to their own kernel. That way they'll be able to monitor the upstream kernel for possible side-effects and reap other QA benefits.
the result: distros fix kernel bugs pretty much the same way: in 90% of the cases they take an existing upstream fix, and in 9% of the cases they fix upstream themselves and backport that fix. Only in a small portion (1%) of cases do they write their 'own' special fix, and even those fixes they try to get 'rid of' by getting them upstream, because per-distro maintenance is expensive.
secondly, the overwhelming majority of stability fixes are corner-case fixes, and only a small minority of bugfixes can introduce something user-visible like an incompatibility. Distros are very much aware of such issues and they avoid incompatible changes like fire: incompatibility almost always means 'apps break on our distro only', which causes follow-up regressions, so it's a big no-no.
out of tens of thousands of distro bugfixes i can recall one or two at most that caused some sort of (incidental and harmless) incompatibility, which was quickly fixed later on. So i believe that what you fear is really not happening in practice. Fact is that distributions almost always 'fork' the upstream kernel - and it's a natural thing. SuSE for example has more than 1000 patches on top of the upstream kernel. So if there was any incentive for an incompatible fork, it would have happened years ago. But it didn't happen, for the reasons i outlined, and for a variety of other reasons.
my personal observation is that the new 2.6 kernel development method has actually improved the dynamics of bugfixing. E.g. Fedora Rawhide (which is as bleeding-edge as you can get - it picks up Linus' kernel tree 2-3 days after Linus _commits_ it to his tree) has quite okay-ish stability to run on most of my boxes, and it even has a daily yum enabled to automatically install all devel RPMs. There's a new kernel RPM almost every day, often with kernel stuff that i wrote perhaps a week ago and which got into Linus' tree a couple of days ago - such a short 'latency of deployment' was unheard of before. In the 2.3/2.4 days we literally had to wait years to get stuff into distro kernels, and having a small latency here helps reliability immensely.
and it is clearly the new 2.6 development method that enabled this: it is stable enough to run bleeding-edge distro code, giving much closer interaction between latest kernel and latest user-space developments. Previously there was a big lag between kernel and userspace - and often the devel kernel didn't even build in a distro setup - let alone boot. So there was a lot of effort wasted on trying to fix bugs that could have been found easily by a large, dedicated team of early adopters. With Ubuntu, OpenSuSE and Fedora these early adopters are there now en masse, and are making a difference big time. They are finding kernel bugs much faster, leading to the seemingly paradoxical situation of "more changes result in more stability".
regarding reliability, there will always be deployments where not even the 2.0 kernel is proven enough. It is the market that decides how fast various distros should go - you can certainly pick something like RHEL to have much more conservative updates. You can vote with your money.