LWN: Comments on "Kernel bugs: out of control?" https://lwn.net/Articles/183053/ This is a special feed containing comments posted to the individual LWN article titled "Kernel bugs: out of control?". en-us Mon, 22 Sep 2025 09:16:51 +0000 Mon, 22 Sep 2025 09:16:51 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Kernel bugs: out of control? https://lwn.net/Articles/184526/ https://lwn.net/Articles/184526/ nix The problem is that a large number of the stability problems are in specific drivers and in the interaction of drivers with hardware... and you can't test that without having the hardware.<br> <p> Most things are amenable to automated testing, but kernels are one of those things that aren't entirely so. (The non-driver parts, sure: I can imagine a UML-based kernel core testsuite, for instance. But the driver parts are where the nastiest bugs often lie.)<br> Sun, 21 May 2006 17:12:35 +0000 Kernel bugs: out of control? https://lwn.net/Articles/184525/ https://lwn.net/Articles/184525/ nix If you're rebooting whenever *any* security fix to 2.6 -stable comes out, you're wasting your own time. Read the changelogs, or preferably the patches: if you're not even compiling in the code which was fixed, there's no point upgrading.<br> <p> The patches are *short*. Exploit that. :)<br> <p> <p> Personally, my firewall is a UML-based virtual machine, and the bridge to the external world has no IP address on the host, so that most attacks don't affect the host at all, but are passed straight through to the UML instance. Immediate security fixes are a matter of bouncing that instance: perhaps a minute and a half of network downtime, and most of *that* is ADSL negotiation delay. The only annoyance is the dropping of persistent connections.<br> <p> If you have vast amounts of state on your firewall, such that rebooting it is hard, you're doing something *very* wrong.<br> Sun, 21 May 2006 17:09:46 +0000 Kernel bugs: out of control? 
https://lwn.net/Articles/184524/ https://lwn.net/Articles/184524/ nix Er, how often do you *run* traceroute? I don't run it so often myself that I'd notice immediately if it broke. It could easily be a week or so between runs...<br> Sun, 21 May 2006 17:05:37 +0000 Kernel bugs: out of control? https://lwn.net/Articles/184522/ https://lwn.net/Articles/184522/ nix Actually, stable means `we think it will work'. Length-of-support has nothing to do with it.<br> Sun, 21 May 2006 17:00:31 +0000 Kernel bugs: out of control? https://lwn.net/Articles/184521/ https://lwn.net/Articles/184521/ nix Most of the security fixes weren't `immediate must-install'. A goodly number related to single drivers or obscure new protocols: SMB/CIFS, SCTP...<br> <p> ... I mean, the SCTP code is, what, a kernel release old? Thus, it has bugs; some of which may be remotely exploitable (by the nature of network protocol code). How terribly shocking.<br> Sun, 21 May 2006 16:58:59 +0000 Kernel bugs: out of control? https://lwn.net/Articles/184520/ https://lwn.net/Articles/184520/ nix Actually, 2.4 was classifiable as `stale' as soon as most devs weren't running it. 2.5 was unusual because the abortive IDE rework made it so unstable that the devs stuck with 2.4; but after that was reverted, 2.4 staled out (to coin a horrible neologism) really rather fast.<br> Sun, 21 May 2006 16:56:09 +0000 Kernel bugs: out of control? https://lwn.net/Articles/184322/ https://lwn.net/Articles/184322/ quintesse So maybe Tanenbaum was right after all and we should switch to a microkernel? *grin*<br> <p> <a rel="nofollow" href="http://www.cs.vu.nl/~ast/reliable-os/">http://www.cs.vu.nl/~ast/reliable-os/</a><br> Thu, 18 May 2006 21:04:40 +0000 Kernel bugs: out of control? https://lwn.net/Articles/184237/ https://lwn.net/Articles/184237/ malor It feels like you've read about three sentences of what I've written here, and you're reacting just to that. Most other replies I've put in this thread address these issues. 
I'd suggest reading them... I'm not going to repeat all of them here. <br> <p> The strongest objection I have to the current model is that we are forced to take new features with our bugfixes, because they will not support kernels for more than two months. New features = new bugs. New bugs = new patches. New patches = new features. New features = new bugs. And so on. <br> <p> 'Stability', as defined from the point of view of the Linux kernel, should mean:<br> <p> 1) It's maintained with security patches;<br> 2) No fundamental new features are added;<br> 3) Drivers are added, if possible, without violating #2. <br> <p> In other words.... do it like 2.4 did it, after Marcelo took over. If a new network card comes out, of course you can add the driver to the source tree... it's not going to affect anyone else. If that new driver requires an update to the memory management model of the kernel, then you don't include it in the stable branch, but rather in the dev tree. <br> <p> I think they might have retrofit the USB system in 2.4... it's been awhile, and I wasn't following it closely, because I didn't need to. I do know that their backports from 2.5 were done without large-scale overhauls of kernel subsystems; they kept the changes focused and very limited. And, by and large, the 2.4 kernel was very stable. It wasn't as solid as 2.2, but it was quite acceptable.<br> <p> Basically, the kernel devs had the model NAILED during 2.4. This high-speed 2.6 development, on the other hand, is an absolute disaster. These guys are some of the smartest in the business, but they are still human, and they are running into the limitations of their own intelligence. The code has become too complex for them to maintain... 
it's hard and nasty and difficult work now, and instead of slowing down development, they're ignoring the bugs and SPEEDING UP instead, apparently because that's more fun.<br> <p> It's significantly less fun for people trying to keep production machines running.<br> <p> Andrew Morton is most unhappy about the quality of the kernel. That should tell you something. <br> <p> Thu, 18 May 2006 10:05:56 +0000 Kernel bugs: out of control? https://lwn.net/Articles/184234/ https://lwn.net/Articles/184234/ arcticwolf You're confusing (unintentionally, I assume) two distinct meanings of the word "stable" here. Stable can mean either:<br> <p> 1) Bug-free enough to not crash on most systems encountered in the wild (i.e., "stable" in the sense of "production-ready");<br> 2) Not undergoing changes.<br> <p> It's important to keep in mind that these are not related to each other. When you say "it lasted THREE DAYS", you apparently mean that it was replaced with a newer patch (.16) three days later - that's the second definition of stable. So, yes, in that sense, 2.6.16.x isn't stable, but that's just because the developers are actually fixing security issues that are found and releasing patches immediately.<br> <p> Would you rather have them sit on those patches for weeks or months? Well, if you do, you can still have that; nobody's forcing you to apply those new patches. <br> <p> But in any case, what Andrew Morton talked about was stability in the first sense, and that's a different beast. How long would it have taken for 2.6.16.15 to crash on your boxen? It's hard to say, but I'd guess that unless you'd have been rather unlucky, it would've been more than three days.<br> <p> So, the answer to your question is: you choose the latest one that's available. Whether you continue to apply newer patches as they come out is your choice, not ours, and complaining that you have downtime when patching security issues in the *kernel* is pretty silly. That's how things are in the real world. 
(And it's still true that nobody's forcing you to apply anything, so if you'd rather avoid downtime than patch newly-found issues, just don't apply them.)<br> Thu, 18 May 2006 09:46:21 +0000 Kernel bugs: out of control? https://lwn.net/Articles/184229/ https://lwn.net/Articles/184229/ malor And which 'stable' 2.6 kernel do I choose? And define 'large value of Y'. By your definition, 2.6.16.15 should be 'stable', but it lasted THREE DAYS. <br> <p> Thu, 18 May 2006 09:17:16 +0000 Kernel bugs: out of control? https://lwn.net/Articles/184228/ https://lwn.net/Articles/184228/ malor You're not reading what I'm saying. I'm using THE KERNEL from Debian unstable, because the kernel from testing doesn't work at all (2.6.15 crashes within an hour in my 865 machines), and the kernel from stable doesn't support all my hardware. I use nothing else from unstable on production servers. I have exactly one machine running the actual unstable distribution in its entirety, because that one clues me in when there's (yet another) kernel patch. <br> <p> Debian's kernel is pretty much vanilla 2.6.16. Linus et al call Linux 2.6.16 'stable'. <br> <p> The kernel devs' expectation that 'the distros' will magically fix all their bugs amounts to simple handwaving, shirking of their fundamental responsibility: when they call it stable, it should BE STABLE. <br> <p> Software that's supported for only two months is not, pretty much by definition, 'stable'. <br> <p> Thu, 18 May 2006 09:16:03 +0000 Kernel bugs: out of control? https://lwn.net/Articles/184210/ https://lwn.net/Articles/184210/ gowen He's running Debian <b>unstable</b>. And he's complaining that it's unstable. Linus is not the only one struggling with nomenclature.<p> Kernel 2.6 has a stable branch. The stable branch of 2.6.x is called 2.6.x.y, for large values of y. Thu, 18 May 2006 06:48:54 +0000 Kernel bugs: out of control? 
https://lwn.net/Articles/184154/ https://lwn.net/Articles/184154/ k8to I think your comments on versioning are not far from the mark. The fact <br> of these "minor" stable releases, eg. 2.6.X.Y, is that they are _smaller_ <br> changes than have ever occurred in the stable series before. It is true <br> that these smaller changes do not receive widespread real-world <br> production evaluation, but no non-stable release kernel (rc versions <br> included) ever receives enough attention to catch even some showstopper <br> bugs.<br> <p> So I think you are right to question this change, but the balancing facts <br> are that the release candidate process for the Linux kernel doesn't seem <br> very effective, and the changes made in the revision series are <br> _strongly_ conservative. <br> <p> It is important to remember that in this particular (highly visible, <br> highly open) development process, there is very little pressure to <br> deviate from the conservative perspective in these updates.<br> Wed, 17 May 2006 23:17:29 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183895/ https://lwn.net/Articles/183895/ oak <font class="QuotedText">&gt; And 2.6.0 through 2.6.8 or so worked great on that same board.... </font><br> <font class="QuotedText">&gt; so I really shouldn't have HAD to file bug reports. I was, after all, </font><br> <font class="QuotedText">&gt; tracking a 'stable' kernel. Stuff that worked in 2.6.0 should work </font><br> <font class="QuotedText">&gt; in 2.6.16. </font><br> <br> You cannot really expect that unless you know that there's <br> a regularly executed test setup: <br> - with the same HW as yours <br> - with the similar software and same kind of load as yours <br> <br> For example fixing a bug (for a setup developer has) might make <br> (e.g. an already existing) bug somewhere else in the code happen <br> more likely in your setup. <br> <br> Only testing and error detection inside &amp; outside kernel can <br> help in catching those. 
The testing has to be automated, <br> it should not produce (too many) false positives, and it has <br> to pinpoint fairly well where the problem happens so that <br> the bugs can be fixed. Otherwise only alternative developer <br> has is to resolve the bugs as WORKSFORME. <br> <br> It would be nice if kernel developers would provide an automated <br> test-set for people who "live on the bleeding edge" which they could <br> run on their test setups before deploying the kernel on production <br> machine. If the test-set outputs an error, you could just forward <br> it to kernel.org and the automatically produced bug report would <br> have all the relevant info; your kernel config, HW info, OOPS etc... <br> <br> If the automated test-set would go through, then you could do your <br> own tests on the kernel before putting it into real use. And if <br> those fail, you could propose tests to be added to the automated <br> test-set so that those kind of problems are caught earlier. <br> <br> Tue, 16 May 2006 17:49:43 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183875/ https://lwn.net/Articles/183875/ hazelsct No. It's more like, don't call it "stable" unless/until it is.<br> Tue, 16 May 2006 16:23:29 +0000 Measure first, fix later https://lwn.net/Articles/183721/ https://lwn.net/Articles/183721/ Richard_J_Neill This would be a very good use for those (almost) redundant floppy drives. After a kernel panic, you shouldn't touch the HDD in case you make it worse. 
But writing to an FDD would be a great solution.<br> <p> *Warning* you'd have to make this activated manually, because it would trash anything that was already on a floppy belonging to an unsuspecting user.<br> Mon, 15 May 2006 21:18:33 +0000 Measure first, fix later https://lwn.net/Articles/183613/ https://lwn.net/Articles/183613/ walles As said in the article, since there's no good measurement of how buggy the kernel is (or what BUG() macros that trigger) the kernel's bugginess can't be neither quantified nor measurably improved.<br> <p> Currently, when I've seen Linux systems panic, they have printed a bunch of information to screen and then given me the option to re-boot.<br> <p> What *should* happen IMO, to make these things measurable, is to store that information somewhere it can survive the re-boot. At a convenient point in time, the up-and-running re-booted system should ask the admin if (s)he wants the bug to be registered in some central repository.<br> <p> This way we'd have *a lot* more statistics on what usually goes wrong inside the kernel. And what parts need fixing the most.<br> <p> That said, for me, Linux kernels usually work very well. But just because I don't have any problems doesn't mean nobody has them...<br> <p> Mon, 15 May 2006 14:00:16 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183586/ https://lwn.net/Articles/183586/ malor 2.6.14 broke *traceroute*. Give me a break.<br> <p> Mon, 15 May 2006 05:48:31 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183582/ https://lwn.net/Articles/183582/ ChristopheC I think it is unfair to say the kernel developer do not test their patches. However, they can only test them on the few combinations of hardware they have access to.<br> <p> To discover the bugs, the kernel needs wide-spread testing. But few people are willing to test the development releases (-rc) - the problem has been mentioned countless times on lkml and here on lwn. 
So they have to release often to get the needed coverage. (This is a somewhat simplified explanation, of course)<br> Mon, 15 May 2006 04:27:34 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183538/ https://lwn.net/Articles/183538/ Baylink This sub-thread speaks to a topic near and dear to my heart: what does a version number *mean*? <P> Let me quote here my contribution to the Wikipedia page on the topic, based on my 20 years of observation of various software packages: <blockquote> <P> A different approach is to use the major and minor numbers, along with an alphanumeric string denoting the release type, i.e. 'alpha', 'beta' or 'release candidate'. A release train using this approach might look like 0.5, 0.6, 0.7, 0.8, 0.9 == 1.0b1, 1.0b2 (with some fixes), 1.0b3 (with more fixes) == 1.0rc1 (which, if it's stable enough) == 1.0. If 1.0rc1 turns out to have bugs which must be fixed, it turns into 1.0rc2, and so on. The important characteristic of this approach is that the first version of a given level (beta, RC, production) must be identical to the last version of the release below it: you cannot make any changes at all from the last beta to the first RC, or from the last RC to production. If you do, you must roll out another release at that lower level. <P> The purpose of this is to permit users (or potential adopters) to evaluate how much real-world testing a given build of code has actually undergone. If changes are made between, say, 1.3rc4 and the production release of 1.3, then that release, which asserts that it has had a production-grade level of testing in the real world, in fact contains changes which have not necessarily been tested in the real world at all. </blockquote> <p> The assertion here seems to be that an even higher level of overloading on version numbering ("even revision kernels are stable") and its associated 'social contract' are no longer being upheld by the kernel development team. 
<P> If that's, in fact, a reasonable interpretation of what's going on, then indeed, it's probably not the best thing. I'm not close enough to kernel development to know the facts, but I do feel equipped to comment on the 'law'. Sat, 13 May 2006 20:19:37 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183460/ https://lwn.net/Articles/183460/ dps Quite a few bugs affect only kernels with a specific feature, for example the recent smbfs bug requires you to use smbfs and have a cracked server. If there is an alsa or module exploit my (linux) firewall is not affected because it supports neither those features nor any devices not part of the box.<br> <p> At least my firewall would not be content without the stateful inspection features of iptables. Without this I suspect the firewall would be more complex and provide less protection.<br> <p> A new version of the mm integer overflow bugs or ping of doom would be much more exciting.<br> <p> Sat, 13 May 2006 15:32:31 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183364/ https://lwn.net/Articles/183364/ error27 I think that the 2.6.x releases aren't viewed as "major". The 2.6.0 release was major and people spent tons of time in tracking down bugs.<br> <p> Back then the bug tracking wasn't in place so people had to hand compile lists of bugs and it was pretty hit or miss. These days people have bugzillas. The kernel.org is pretty worthless, but Fedora's is pretty decent. Suse's bugzilla is still revving up but that will be useful too.<br> <p> I've got 3 main issues with 2.6.16. For aacraid, there is new debug code in Fedora and -mm that calls BUG(). With mptscsi there was a massive rewrite and now it doesn't work with my nStor. Sky2 is another complete rewrite that doesn't work on my hardware. The sk98lin driver that sky2 is replacing wasn't that great. The code was rough, the Makefiles were astonishingly bad and it had issues with bonding. 
The sky2 developer is active so I'm happy about the progress being made.<br> <p> <p> Sat, 13 May 2006 06:05:53 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183453/ https://lwn.net/Articles/183453/ chromatic <blockquote><em>Quality can't be retrofit; if it wasn't there to begin with, it can't be added later, especially not by other people.</em></blockquote> <p>How can this possibly be true? Consider OpenBSD's auditing process, for example.</p> Fri, 12 May 2006 19:53:43 +0000 A positive spin on the kernel bugs issue https://lwn.net/Articles/183442/ https://lwn.net/Articles/183442/ pr1268 <p>I'm not too sure anyone else sees the current rate of kernel bug fixes in the positive light I see them:</p> <p>The sheer fact that bug fixes are coming faster is indication that people are adopting Linux in increasing numbers. The (relatively) few users of Linux of several years past found (relatively) few reasons to complain about a kernel bug. Fewer bugs were noticed therefore fewer bugs needed fixing.</p> <p>Fast forward to the present. As pervasive as Linux usage is these days, doesn't it stand to reason that more bugs will get noticed? That's GOOD! Think about how the process works. As long as the bug reports keep flowing in, and the kernel developers keep troubleshooting and fixing the bugs, then the <b>rate</b> at which the bugs are being noticed really shouldn't matter. The kernel has grown in size consistently since before 1.0, and the number of bug reports has grown correspondingly.</p> <p>I'd almost be more concerned if bug reports slowed to a trickle (or stopped completely). With a piece of software as large, powerful, and complex as the Linux kernel, this would surely indicate lack of usage or apathy. Seeing the bug reports tells me that people are using Linux more frequently, and doing the responsible thing of reporting a bug when they encounter one.</p> Fri, 12 May 2006 18:01:42 +0000 Kernel bugs: out of control? 
https://lwn.net/Articles/183422/ https://lwn.net/Articles/183422/ snitm You seem like a high maintenance individual ^H^H^H^H^H^H^H^H^H^H environment deserving of RHEL or some other enterprise distro. If cost is an issue you should look at embracing a RHEL clone. CentOS tracks RHEL closely; run CentOS and upgrade to the RHEL4 update kernels as they are released.<br> Fri, 12 May 2006 15:45:27 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183409/ https://lwn.net/Articles/183409/ vmole <ol> <li>I want a kernel that supports the latest hardware <li>I don't want the kernel to change, or have bugs </ol> <p>Pick one. Fri, 12 May 2006 13:51:55 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183372/ https://lwn.net/Articles/183372/ scarabaeus The kernel folks should invest some work into an automated testing framework, one which allows you to run the tests without booting into the kernel. While automated tests, e.g. unit tests, won't make the problem go away, they are incredibly useful for realizing that your own changes to the code break some other part of it.<br> <p> Actually, I'm a bit surprised that things do work so well without unit tests even though hardly anyone understands the entire kernel. With code of that complexity, usually when you pull at one end, something breaks at the other end. I guess the reason why the dev process still works so well is the intensive peer review of patches.<br> <p> With the complex interdependencies within the kernel, writing test cases is certainly a challenge. For example, writing a case which tests the scheduler's behaviour in a certain situation won't be easy. But once a "scheduler test framework" is in place, it can be used for future work on the scheduler, so the work will pay off IMHO.<br> Fri, 12 May 2006 10:50:17 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183343/ https://lwn.net/Articles/183343/ malor Oops, I inserted a paragraph in the wrong place. 
If you swap the last two paragraphs, it'll be more readable, although the concluding note will be in the wrong place. :)<br> <p> Fri, 12 May 2006 00:02:26 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183342/ https://lwn.net/Articles/183342/ malor I'm running testing on production servers because testing is current enough to be useful and essentially never breaks things. I'm just using the kernel from unstable, because that's the only one that works in all cases. I actually had to use a Ubuntu kernel for awhile when things were really bad with 2.6.15. <br> <p> Linux 2.4 always pushed security fixes out right away... I don't remember Marcelo sitting on security patches. He'd accumulate a bunch of non-security stuff and roll it out all at once, but security patches were immediate release. And in the 5.5 years of 2.4's existence, it's had 32 total releases... and 10 of those were when Linus was still tinkering with it. So 22 is more accurate. <br> <p> 22 patches in 5 years, I can handle, particularly since many of them were optional... just new drivers, not security fixes, which meant they could be deployed whenever there was time.<br> <p> 16 patches in five weeks, nearly all of them immediate must-install security fixes... that's not so good. <br> <p> Thu, 11 May 2006 23:59:41 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183340/ https://lwn.net/Articles/183340/ malor Yes, I think exactly that. Development speed is not the same as code quality. The new process is tuned to let them do more of the 'fun' stuff (writing new code), and force them to do less of the 'unfun' stuff, like making sure things actually work. It's also to force more testers; they've explicitly said one of the reasons they're doing it this way is to force people to test new code. <br> <p> I don't mind testing code when there's a call for testers (and if I can slot in some time). 
I do mind being forced to test beta-quality code by them calling it 'stable' and refusing to support code that's more than two months old.<br> <p> As far as exponentiation goes, you're exactly right... I'm not sure if I hit this idea yet in this thread. What that means is that as the kernel grows, development needs to slow down, to cover all the various interactions. Instead, they're _speeding up_, not testing, and expecting the Rest of World to fix their problems. <br> <p> I tried to report the APIC bug on that VIA board. I first emailed the ACPI author (got my acronyms confused :) ), who very promptly replied, and politely told me I was talking to the wrong person. Then I tried mailing the APIC maintainers twice, but didn't get a reply. I dropped it after that... probably should have sent it to the catchall address, but forgot it. And now I don't have the board anymore, so a bug report won't be very useful. <br> <p> The 865 bugs I can't diagnose, because it's all remote, so I haven't even tried to report it. Those machines are production, and I can't afford to take them down for testing. So my bug report wouldn't be very useful. And 2.6.16 has worked well so far, although the unending reboots are painful.<br> <p> And 2.6.0 through 2.6.8 or so worked great on that same board.... so I really shouldn't have HAD to file bug reports. I was, after all, tracking a 'stable' kernel. Stuff that worked in 2.6.0 should work in 2.6.16.<br> <p> Thu, 11 May 2006 23:52:09 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183339/ https://lwn.net/Articles/183339/ malor I can remember only three or four 'emergency' patches to 2.4, after Linus branched to 2.5... ones that needed to be installed immediately because they were DOS or security fixes. The size of most of those patches is adding new drivers. When an emergency fix was needed, 2.4 had it out in a day or two, but for the most part, it rolled up a bunch of stuff at once so that people could test the upcoming kernel. 
I tested a few of those, but never saw any problems with them. Marcelo was wonderfully conservative in his philosophy about patching. (ie, don't add features, fix the old ones, and just add drivers.)<br> <p> The vast majority of the 16 updates to 2.6.16 have been security-related, and they've required immediate reboots, at least if you're security-conscious. Whether they're 10KiB or 10MiB, they still are primarily to fix security problems. Running a Linux 2.6.16 free of known security holes this month, in other words, has necessitated a reboot every two or three days. <br> <p> I don't remember that ever happening on ANY earlier kernel, from 0.8 through 2.4, though you're certainly welcome to correct me if I'm misremembering. All security updates to 2.4, as far as I know, came through _really_ fast, so the patch *releases* should be very comparable in terms of total numbers. 16 releases in five or six weeks versus 32 in 5.5 years looks like total shit, IMO. And Linus didn't even go to 2.5 until about Linux 2.4.10, so you could argue that it's really 16 versus 22.<br> <p> Thu, 11 May 2006 23:34:10 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183337/ https://lwn.net/Articles/183337/ malor I agree with you about forcing quality... that's a great idea. If I thought the new development process would actually DO that, I'd be enthusiastically behind it. Instead, it's just about speed, speed, speed... and avoiding the stuff that's no fun to do, like bugfixing and testing. <br> <p> Waving your hands in the air and expecting other people to fix your programs is not, in my long experience supporting developers, the way to get it fixed, particularly not properly. <br> <p> As far as switching OSes goes, I've already stopped using Linux on my firewalls because of the unending stream of security reboots. Netfilter is faster and more featureful than OpenBSD's pf, and its language is more amenable to shell scripting, but the first mission of a firewall is to stay up. 
I can throw OpenBSD on a firewall and not have to update it again for a couple of years. This means no downtime, which means happy users. I've never seen any Linux kernel that lasted that long without security holes.<br> <p> FreeBSD is looking better all the time... I've been talking about switching over, but haven't yet. If matters continue as they have, maybe I will. And you'll have one less complaining user, which, from your tone, you may prefer. <br> <p> Thu, 11 May 2006 23:24:01 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183334/ https://lwn.net/Articles/183334/ malor As the other poster said... I have put a great deal of money (thousands of dollars) into Linux and ancillary products over the years. Far, far more than I've spent on Microsoft products. Part of that money has gone to pay kernel devs at places like Red Hat, and likely has indirectly resulted in the creation of Linux-related jobs. Some of that money goes to LWN. <br> <p> What I'm asking for here benefits all of us... you, me, AND the kernel devs. Stability and security are what got Linux to this point, to where Linux experience is a good thing to have on a resume, to where you can get good jobs knowing only Linux. <br> <p> That will not remain true if the fundamental strengths of Linux are lost in a chase for 'development speed', which benefits primarily the developers, and not so much the Rest of World.<br> <p> Thu, 11 May 2006 23:13:18 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183333/ https://lwn.net/Articles/183333/ vonbrand <p> Please, don't compare the 16 surgical patches to 2.6.16 in the stable series with the huge patches to 2.4.x (at a short glance, they seem to average around 1MiB bzipped; compared to 10KiB tops for 2.6.16.y) Thu, 11 May 2006 22:56:02 +0000 Kernel bugs: out of control? 
https://lwn.net/Articles/183312/ https://lwn.net/Articles/183312/ edstoner Well, this is a little late in coming with a lot of comments before, but...<br> <p> I have no problems with the way things are. I have a large network to maintain (over 2600 clients) and I don't have any problems. All of our servers run Gentoo, and get HEAVILY used. I've yet to upgrade a kernel from bugs. When I build the boxes (which I do fairly frequently) I put the latest kernel (that is in Gentoo's portage system as vanilla-sources, but it seems to pretty closely track what the real latest kernel is) on them.<br> <p> About 100 of our client systems run linux, and fairly shortly I'm expecting to switch almost all over to linux. If money is needed to get a full-time "kernel bug fixer/tracker", I hereby right now pledge $5,000.00 a year (assuming that they promise to fix any bugs that I run into with the kernel). It is certainly worth that much to my organization. How can there not be 20 other organizations in the US alone where it is worth at least that much? How much does a "kernel bug fixer/tracker" cost?<br> Thu, 11 May 2006 20:50:09 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183297/ https://lwn.net/Articles/183297/ oak <font class="QuotedText">&gt; The problem with the backporting is that it's boring work, and </font><br> <font class="QuotedText">&gt; the kernel devs don't like to do it. Stability is also boring work, </font><br> <font class="QuotedText">&gt; with a similar outcome. The new kernel development model is for THEM </font><br> <br> Are you arguing that if the kernel development tools and processes <br> were more cumbersome for the kernel developers, the code quality would <br> improve? Let me doubt that... <br> <br> <br> <font class="QuotedText">&gt; Linux 2.2 was incredibly stable; it NEVER fell over. </font><br> <br> It also supported a lot less hardware. 
Note that while the number of <br> components grows linearly, the possible interactions between them grow <br> exponentially. <br> <br> <br> <font class="QuotedText">&gt; even though I'd been struggling with bugs for months </font><br> <br> How good bug reports you made of them? Bugs cannot be fixed <br> if they are not known... <br> <br> Thu, 11 May 2006 19:16:49 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183293/ https://lwn.net/Articles/183293/ oak <font class="QuotedText">&gt; If the approach of continuing to re-engineer interfaces and </font><br> <font class="QuotedText">&gt; systems to eliminate categories of problems offends you, </font><br> <font class="QuotedText">&gt; then the linux kernel in general should offend you, since </font><br> <font class="QuotedText">&gt; this has been the mode of operation since day one. </font><br> <br> This reminds me of the recent change in Glibc, they now <br> abort programs which do double frees. <br> <br> Yes, more programs may now be "appear unstable", but I personally <br> prefer application rather being terminated than silently corrupting <br> my data when they hobble forward with inconsistent state. <br> Broken apps should be shot down as soon as possible so that <br> people know to fix them, this is the Unix way. <br> <br> If you don't force quality, you don't get it. <br> You end up with an unmaintainable mess instead. <br> <br> <br> <font class="QuotedText">&gt; There _are_ other free unixes which have a much more </font><br> <font class="QuotedText">&gt; conservative approach. They are not horrible. </font><br> <br> I'm sure the person complaining here would then <br> complain about the lack of features and HW support... <br> <br> Thu, 11 May 2006 19:04:09 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183290/ https://lwn.net/Articles/183290/ richardr Well, curiously enough, that is pretty much what I do expect. 
I'm an end user when it comes to linux <br> kernels, and what I want to hear is that "it just works". However, I have happily spent money on <br> distributions I could download for free in the expectation that the money would go to developers to <br> improve performance and *squash bugs*, and to that extent I have bought (a very small part of) a <br> kernel developer. <br> Thu, 11 May 2006 18:55:25 +0000 New bugs? Old Bugs? Many eyes? https://lwn.net/Articles/183262/ https://lwn.net/Articles/183262/ southey The problem is that it is not always clear when the bug was introduced. I am not knowledgable in the kernel but these are latent bugs that either never got exposed by the kernel or the kernel could previously recover from. In some cases (one very recently) adding a new feature actually resulted in finding a bug. Alternatively these new features may also rewrite code that removes old bugs but may also introduce new ones.<br> <p> I think the real problem is being able to replicate these bugs. As evident from the comments to this article and the one for X.org. <br> Thu, 11 May 2006 16:43:13 +0000 Kernel bugs: out of control? https://lwn.net/Articles/183257/ https://lwn.net/Articles/183257/ smoogen Well, I have an idea.. buy yourself a kernel developer to do this work for you. The tenor and tone of your posts seem to be expecting them to do this work for free for you. <br> Thu, 11 May 2006 16:22:54 +0000