
Survey responses

This survey is now closed

This survey has 483 responses

What is your occupation?

11 2% Not telling
60 12% Kernel developer
186 38% User-space developer
124 25% System administrator
19 3% Manager
83 17% Other
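
The percentages in these tables appear to be truncated to whole numbers rather than rounded (186 of 483 is about 38.5%, shown as 38%). A minimal sketch that recomputes the occupation table under that assumption; the names `occupation_counts` and `truncated_percent` are illustrative and not part of the survey software:

```python
# Recompute the "occupation" percentages from the raw counts.
# Assumption: the survey page truncates (floors) percentages to integers.
TOTAL_RESPONSES = 483

occupation_counts = {
    "Not telling": 11,
    "Kernel developer": 60,
    "User-space developer": 186,
    "System administrator": 124,
    "Manager": 19,
    "Other": 83,
}


def truncated_percent(count, total=TOTAL_RESPONSES):
    """Return the integer percentage of count out of total, rounded down."""
    return (100 * count) // total


table = {label: truncated_percent(n) for label, n in occupation_counts.items()}

for label, n in occupation_counts.items():
    print(f"{n:4d} {table[label]:3d}% {label}")
```

Because every row is rounded down, the displayed percentages in a table will generally add up to slightly less than 100.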

How many machines (of different types) do you run 2.6 kernels on?

2 0% --
1 0% None
32 6% Just one
342 70% Up to 10
64 13% 11 to 25
20 4% 26 to 50
19 3% 50 to 500
3 0% Over 500

Have you encountered a 2.6 kernel bug in the last two years?

7 1% --
137 28% No
338 70% Yes

Do you feel that 2.6 kernels are becoming less reliable?

12 2% --
404 83% No
67 13% Yes

Do you think that bugs reported to kernel mailing lists get enough attention?

185 38% --
81 16% No
217 44% Yes

Do you think that bugs reported in the kernel Bugzilla get enough attention?

251 51% --
143 29% No
89 18% Yes

Kernel bugs that affect my systems tend to be resolved...

103 21% --
70 14% Too slowly
262 54% As quickly as one would expect
48 9% Surprisingly quickly

Do you think that the kernel developers are emphasizing new features at the expense of stabilization?

66 13% --
237 49% No
180 37% Yes

[The following questions relate to your experience with specific kernel bugs. There are questions for up to five distinct bugs you have encountered; if you do not have that many to talk about, please just skip to the end at the appropriate point to submit your answers. For the first 2.6 kernel bug you encountered: ]

Kernel version where bug was discovered:

203 42% --
103 21% 2.6.9 or earlier
13 2% 2.6.10
20 4% 2.6.11
19 3% 2.6.12
13 2% 2.6.13
18 3% 2.6.14
43 8% 2.6.15
34 7% 2.6.16
17 3% 2.6.17

By which version was this bug fixed?

220 45% --
28 5% 2.6.10
19 3% 2.6.11 (or 2.6.11.x)
13 2% 2.6.12 (or 2.6.12.x)
12 2% 2.6.13 (or 2.6.13.x)
11 2% 2.6.14 (or 2.6.14.x)
30 6% 2.6.15 (or 2.6.15.x)
32 6% 2.6.16 (or 2.6.16.x)
27 5% 2.6.17 (or 2.6.17.x)
10 2% 2.6.18-rc
81 16% Still unfixed

What was the source of the affected kernel?

199 41% --
143 29% kernel.org
40 8% Debian
35 7% Fedora
17 3% Gentoo
18 3% Red Hat
10 2% SUSE
16 3% Ubuntu
5 1% Other distributor

Which kernel subsystem was affected by the bug?

200 41% --
55 11% Core kernel
43 8% Disk driver
42 8% Network driver
34 7% USB driver
36 7% Other device driver
13 2% Sound subsystem
19 3% Suspend/resume
41 8% Somewhere else

In which ways did you respond to this bug?

137 28% Suffered in silence
71 14% Reported to a kernel mailing list
24 4% Reported on the kernel Bugzilla
72 14% Reported to distributor
37 7% Submitted a patch to fix it
3 0% Posted a disparaging blog entry

Were any binary-only kernel modules loaded?

200 41% --
242 50% No
41 8% Yes

[If you have encountered a second kernel bug, please describe your experience by answering the questions below. Otherwise you may skip to the bottom to submit your answers. ]

Kernel version where bug was discovered:

380 78% --
25 5% 2.6.9 or earlier
3 0% 2.6.10
7 1% 2.6.11
8 1% 2.6.12
5 1% 2.6.13
4 0% 2.6.14
29 6% 2.6.15
11 2% 2.6.16
11 2% 2.6.17

By which version was this bug fixed?

384 79% --
9 1% 2.6.10
1 0% 2.6.11 (or 2.6.11.x)
4 0% 2.6.12 (or 2.6.12.x)
7 1% 2.6.13 (or 2.6.13.x)
1 0% 2.6.14 (or 2.6.14.x)
13 2% 2.6.15 (or 2.6.15.x)
9 1% 2.6.16 (or 2.6.16.x)
7 1% 2.6.17 (or 2.6.17.x)
5 1% 2.6.18-rc
43 8% Still unfixed

What was the source of the affected kernel?

378 78% --
51 10% kernel.org
15 3% Debian
10 2% Fedora
5 1% Gentoo
6 1% Red Hat
3 0% SUSE
12 2% Ubuntu
3 0% Other distributor

Which kernel subsystem was affected by the bug?

379 78% --
18 3% Core kernel
14 2% Disk driver
21 4% Network driver
4 0% USB driver
10 2% Other device driver
9 1% Sound subsystem
17 3% Suspend/resume
11 2% Somewhere else

In which ways did you respond to this bug?

48 9% Suffered in silence
23 4% Reported to a kernel mailing list
8 1% Reported on the kernel Bugzilla
24 4% Reported to distributor
13 2% Submitted a patch to fix it
3 0% Posted a disparaging blog entry

Were any binary-only kernel modules loaded?

381 78% --
89 18% No
13 2% Yes

[Please describe a third kernel bug by answering the questions below, or skip to the end to submit your answers.]

Kernel version where bug was discovered:

451 93% --
8 1% 2.6.9 or earlier
0 0% 2.6.10
3 0% 2.6.11
2 0% 2.6.12
0 0% 2.6.13
2 0% 2.6.14
12 2% 2.6.15
2 0% 2.6.16
3 0% 2.6.17

By which version was this bug fixed?

452 93% --
2 0% 2.6.10
1 0% 2.6.11 (or 2.6.11.x)
2 0% 2.6.12 (or 2.6.12.x)
0 0% 2.6.13 (or 2.6.13.x)
1 0% 2.6.14 (or 2.6.14.x)
3 0% 2.6.15 (or 2.6.15.x)
3 0% 2.6.16 (or 2.6.16.x)
4 0% 2.6.17 (or 2.6.17.x)
0 0% 2.6.18-rc
15 3% Still unfixed

What was the source of the affected kernel?

451 93% --
15 3% kernel.org
6 1% Debian
3 0% Fedora
2 0% Gentoo
4 0% Red Hat
0 0% SUSE
2 0% Ubuntu
0 0% Other distributor

Which kernel subsystem was affected by the bug?

451 93% --
6 1% Core kernel
1 0% Disk driver
3 0% Network driver
0 0% USB driver
5 1% Other device driver
5 1% Sound subsystem
6 1% Suspend/resume
6 1% Somewhere else

In which ways did you respond to this bug?

7 1% Suffered in silence
7 1% Reported to a kernel mailing list
6 1% Reported on the kernel Bugzilla
12 2% Reported to distributor
3 0% Submitted a patch to fix it
0 0% Posted a disparaging blog entry

Were any binary-only kernel modules loaded?

453 93% --
28 5% No
2 0% Yes

[Please describe a fourth kernel bug below, or skip to the end if you have no such bug to report.]

Kernel version where bug was discovered:

467 96% --
3 0% 2.6.9 or earlier
0 0% 2.6.10
1 0% 2.6.11
4 0% 2.6.12
0 0% 2.6.13
0 0% 2.6.14
4 0% 2.6.15
3 0% 2.6.16
1 0% 2.6.17

By which version was this bug fixed?

469 97% --
2 0% 2.6.10
1 0% 2.6.11 (or 2.6.11.x)
0 0% 2.6.12 (or 2.6.12.x)
0 0% 2.6.13 (or 2.6.13.x)
0 0% 2.6.14 (or 2.6.14.x)
1 0% 2.6.15 (or 2.6.15.x)
2 0% 2.6.16 (or 2.6.16.x)
3 0% 2.6.17 (or 2.6.17.x)
0 0% 2.6.18-rc
5 1% Still unfixed

What was the source of the affected kernel?

467 96% --
10 2% kernel.org
2 0% Debian
1 0% Fedora
1 0% Gentoo
2 0% Red Hat
0 0% SUSE
0 0% Ubuntu
0 0% Other distributor

Which kernel subsystem was affected by the bug?

467 96% --
3 0% Core kernel
2 0% Disk driver
2 0% Network driver
1 0% USB driver
2 0% Other device driver
0 0% Sound subsystem
2 0% Suspend/resume
4 0% Somewhere else

In which ways did you respond to this bug?

2 0% Suffered in silence
6 1% Reported to a kernel mailing list
3 0% Reported on the kernel Bugzilla
4 0% Reported to distributor
5 1% Submitted a patch to fix it
1 0% Posted a disparaging blog entry

Were any binary-only kernel modules loaded?

467 96% --
15 3% No
1 0% Yes

[Should you have been unlucky enough to encounter a fifth bug, please describe it below. You're almost done!]

Kernel version where bug was discovered:

473 97% --
4 0% 2.6.9 or earlier
0 0% 2.6.10
1 0% 2.6.11
0 0% 2.6.12
1 0% 2.6.13
1 0% 2.6.14
1 0% 2.6.15
1 0% 2.6.16
1 0% 2.6.17

By which version was this bug fixed?

473 97% --
2 0% 2.6.10
0 0% 2.6.11 (or 2.6.11.x)
0 0% 2.6.12 (or 2.6.12.x)
0 0% 2.6.13 (or 2.6.13.x)
1 0% 2.6.14 (or 2.6.14.x)
0 0% 2.6.15 (or 2.6.15.x)
1 0% 2.6.16 (or 2.6.16.x)
1 0% 2.6.17 (or 2.6.17.x)
0 0% 2.6.18-rc
5 1% Still unfixed

What was the source of the affected kernel?

473 97% --
6 1% kernel.org
1 0% Debian
1 0% Fedora
1 0% Gentoo
1 0% Red Hat
0 0% SUSE
0 0% Ubuntu
0 0% Other distributor

Which kernel subsystem was affected by the bug?

473 97% --
5 1% Core kernel
1 0% Disk driver
1 0% Network driver
0 0% USB driver
1 0% Other device driver
0 0% Sound subsystem
1 0% Suspend/resume
1 0% Somewhere else

In which ways did you respond to this bug?

4 0% Suffered in silence
3 0% Reported to a kernel mailing list
1 0% Reported on the kernel Bugzilla
1 0% Reported to distributor
1 0% Submitted a patch to fix it
1 0% Posted a disparaging blog entry

Were any binary-only kernel modules loaded?

472 97% --
10 2% No
1 0% Yes

[ Thank you for answering all of these questions. Please click on the button below to submit your answers and view the survey's results so far. ]



Survey: Linux kernel quality

Posted Jul 10, 2006 3:05 UTC (Mon) by vonbrand (subscriber, #4458) [Link]

Interesting survey. Just that the memory gets fuzzy...

Survey: Linux kernel quality

Posted Jul 10, 2006 3:16 UTC (Mon) by kirkengaard (subscriber, #15022) [Link]

Option for "Searched internet to see if anyone else had this problem; found patch that fixed it" or "Reverted to previous version" missing or assumed to be covered under "suffered in silence"; also no option for bugs in non-stable. Most of the bugs I encounter are in non-stable trees, but I guess we assume that that will be the case when we test non-stable. That's why we test them. :)

Survey: Linux kernel quality

Posted Jul 10, 2006 4:04 UTC (Mon) by lutchann (subscriber, #8872) [Link]

Also needed options for "Hacked it in my own tree but didn't have time to write a bug report or make a proper patch" and "Still suffering although surely somebody else has fixed this bug by now but I haven't had time to upgrade and find out", so both of those were filed under "Suffered in silence".

Survey: Linux kernel quality

Posted Jul 10, 2006 4:50 UTC (Mon) by tetromino (subscriber, #33846) [Link]

What about "blamed all on old dodgy hardware & lack of funds to replace same, only to find that the newest rc magically fixes the problem"?

Survey: Linux kernel quality

Posted Jul 10, 2006 6:05 UTC (Mon) by krp (subscriber, #4866) [Link]

LOL, and the more "evil" corollary:

"Blamed all the random hangs/crashes on ancient dodgy hardware, so decided to install Windoze on it" ... (and run Fedora Core 5 somewhere else.)

I might note: W2K3R2 runs perfectly on that ancient Supermicro P3TDE6-G dually, and older FC4 kernels did also... but something at about 2.6.16 or later chokes on that ancient SMP server mobo though, (newer FC4 or FC5 kernels all die horribly on that box.) It's too old to struggle with though... easier to just build a newer box for Linux use.

Of course I was looking for an excuse to build a new system anyway... 8-)

Survey: Linux kernel quality

Posted Jul 10, 2006 6:32 UTC (Mon) by beejaybee (subscriber, #1581) [Link]

Hmm, in my experience it usually works the other way round - linux runs just fine on old boxes but has problems detecting hardware on new systems - especially laptops - presumably because peripherals are getting tied to windoze in some way, or at least because linux is being crippled by binary-only drivers.

Anyhow it scarcely matters any more - 5 year old hardware is more than powerful enough for anyone except hardcore gamers (who are going to be tied to windoze anyway) and, in my experience, new systems are much less reliable than they once were - increasing churn rate and price competition seem to be reducing new PC systems build quality, hence reliability, to the point where the retailers and consumers could easily be removed from the loop altogether - just ship direct to the landfill!

Survey: Linux kernel quality

Posted Jul 10, 2006 15:42 UTC (Mon) by tjc (subscriber, #137) [Link]

> 5 year old hardware is more than powerful enough for anyone except hardcore gamers (who are going to be tied to windoze anyway) and, in my experience, new systems are much less reliable than they once were...

As a fully recovered Quake II addict I can understand that people like their games, but I and several million other people could get along really well with a 2D graphics card and all the 3D stuff ripped out of the X server.

> ...increasing churn rate and price competition seem to be reducing new PC systems build quality, hence reliability, to the point where the retailers and consumers could easily be removed from the loop altogether - just ship direct to the landfill!

Yes, it's bad. I gave up on consumer-level PCs some years ago and build from parts. It's the only way you can get a good power supply.

However, as a very casual TV viewer (my "home theater" features a 19" tube) I have adopted the landfill approach to consumer electronics. When something breaks I go to Best Buy and buy the cheapest replacement I can find. When that breaks, I dumpster it and buy another. I can't imagine where they're going to find room to pile all this stuff 100 years from now.

The instant fix is the enemy of the correct fix...

Posted Jul 11, 2006 1:48 UTC (Tue) by xoddam (subscriber, #2322) [Link]

> As a fully recovered Quake II addict I can understand that people like
> their games, but I and several million other people could get along
> really well with a 2D graphics card and all the 3D stuff ripped out of
> the X server.

Indeed. After installing a 'new' distro one of the first config changes
I make is to change to vesafb, so I don't need to worry or care about 3D
problems.

My most recent one was that the (free) X.org ATI driver installed by the
Ubuntu Breezy CD (just a few weeks before the release of Dapper) crashed
and burned because it tried to use instructions that weren't implemented
on my CeleronM CPU. Instant fix: switch to vesa X server.

I googled for other people with the same problem, found none, didn't
bother reporting it because Badger obviously wasn't tested with this
now-common combination of CPU and video chip and the release was
effectively already obsolete. I presume Dapper fixed the problem (google
still doesn't find any bug reports), but I haven't actually checked.

(BTW I did look for a laptop with 'supported' Intel graphics hardware,
but the price was right on this one and I liked the styling. What can I
say?)

The instant fix is the enemy of the correct fix...

Posted Jul 11, 2006 9:26 UTC (Tue) by nix (subscriber, #2304) [Link]

I used to do that sort of thing too, but the increased likelihood that desktops will *require* 3D on high-end video cards for efficient drawing has made me stop doing that. (OK, call it a certainty; when video cards drop 2D support, we'll *have* to use the 3D stuff for everything).

(The existence of the rather nice OpenGL Elite variant 'oolite' is entirely unconnected from my decision to get DRI working on my machines. Entirely.)

The instant fix is the enemy of the correct fix...

Posted Jul 11, 2006 14:16 UTC (Tue) by tjc (subscriber, #137) [Link]

Death, taxes, and 3D desktops. Somehow it just doesn't ring.

The instant fix is the enemy of the correct fix...

Posted Jul 11, 2006 17:09 UTC (Tue) by unaiur (subscriber, #3563) [Link]

vesafb has a terrible refresh rate on my mobo/monitor. I only get
1024x768 at 60Hz, just an eye killer. The via driver isn't a jewel, but
I get 85Hz.

Survey: Linux kernel quality

Posted Jul 10, 2006 3:28 UTC (Mon) by cventers (subscriber, #31465) [Link]

The only real bug I've been grappling with is one in sky2 that is causing
TX hangs. I hope to find some time to work on it personally as I think the
maintainer can't reproduce the issue (though I know of at least one friend
who has the same problem).

The biggest difficulty I'm aware of here is that it is still sometimes
hard to get your hands on hardware documentation. If I could personally
get some of the docs necessary, I might be able to take a better stab at
the problem.

So I would say that interaction with hardware vendors remains our biggest
problem.

Survey: Linux kernel quality

Posted Jul 10, 2006 9:46 UTC (Mon) by gvegidy (subscriber, #5063) [Link]

drivers for sky2 seem to be hard to get correct.

the driver written by syskonnect/marvell themselves (sk98lin) works for
only about a day on one of my systems, I gave up contacting their support
after a bit of useless back and forth. A colleague told me that even their
windows driver sometimes hangs.

The moment Stephen Hemminger took up developing a clean linux driver I
thought the problems will be gone soon. But it looks like it's still a way
to go.

So stay away from sky2 nics if you can...

sky2

Posted Jul 10, 2006 11:18 UTC (Mon) by dwmw2 (subscriber, #2063) [Link]

I spent a large part of the last couple of weeks watching the tennis by streaming MPEG from DVB-S out over my local network with a sky2 card. I didn't have a single lockup from the sky2. In fact, I'm listening to the radio the same way now, because I haven't bothered moving the satellite cable back to the standalone receiver box.

This is with the current Fedora (2.6.17) kernel, which I believe has no patches to the sky2 driver.

sky2

Posted Jul 11, 2006 17:29 UTC (Tue) by unaiur (subscriber, #3563) [Link]

To watch a dvb channel over a gigabit ethernet card isn't any serious
test.

I've been using a traffic generator over these cards and they hang before
processing 1 billion packets (about 10 minutes). These cards are buggy
(or at least, the dual port PCI Express x4 card).
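
For scale, the figures in this comment (1 billion packets in about 10 minutes) imply a rate of roughly 1.67 million packets per second. The sketch below works through that arithmetic and compares it with the theoretical maximum for minimum-size frames on one gigabit port. The 64-byte frame size is an assumption, since the comment does not say what the traffic generator sent; the function names are illustrative.

```python
# Back-of-the-envelope packet rates for the traffic-generator test.
# Assumption: minimum-size (64-byte) Ethernet frames; the comment does not
# state the actual frame size used.


def implied_rate_pps(packets, seconds):
    """Average packets per second over the whole run."""
    return packets / seconds


def line_rate_pps(link_bps=1_000_000_000, frame_bytes=64):
    """Theoretical maximum frames per second on one Ethernet port.

    Each frame also occupies 8 bytes of preamble/start delimiter and a
    12-byte inter-frame gap on the wire.
    """
    wire_bits = (frame_bytes + 8 + 12) * 8
    return link_bps / wire_bits


observed = implied_rate_pps(1_000_000_000, 10 * 60)
per_port = line_rate_pps()

print(f"implied rate:       {observed:,.0f} packets/s")
print(f"one-port line rate: {per_port:,.0f} packets/s")
print(f"ratio:              {observed / per_port:.2f}")
```

The implied rate is somewhat above what a single gigabit port can carry with 64-byte frames, which would be consistent with both ports of the dual-port card being loaded, or with "about 10 minutes" being a rough estimate.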

sky2

Posted Jul 11, 2006 20:07 UTC (Tue) by dwmw2 (subscriber, #2063) [Link]

> To watch a dvb channel over a gigabit ethernet card isn't any serious test.

True. I mention it mainly because it used to fail.

Survey: Linux kernel quality

Posted Jul 10, 2006 9:19 UTC (Mon) by job (subscriber, #670) [Link]

That over half of the end users taking this survey have even encountered a kernel bug is a bit alarming.

I've also had stability issues with some of the releases, which aren't really any identified bugs so I can't put them in here.

On my desktop machine my USB flash reader hasn't been working reliably since I switched from devfs to udev, only sometimes does the device node even exist. It may be a problem with Gentoo rather than udev, I haven't had time to look into it yet.

Survey: Linux kernel quality

Posted Jul 10, 2006 12:07 UTC (Mon) by arjan (subscriber, #36785) [Link]

well there is also the element of selective audience.. people who did encounter a kernel bug are more likely to fill in this questionnaire since they have a more vested interest in the topic...

Survey: Linux kernel quality

Posted Jul 10, 2006 14:50 UTC (Mon) by nix (subscriber, #2304) [Link]

Also, I haven't encountered *any* bugs in the heavily-tested core kernel: everything I've seen has been either
- on relatively obscure arches (ARM, MIPS, SPARC64), or
- in relatively-new features (e.g. initramfs), or
- in drivers for specific hardware

The heavily-tested does-it-basically-work-on-common-hardware stuff seems to be cracked: it's just that everyone has *some* piece of obscure hardware or other, and they might get bitten by bugs in it...

Survey: Linux kernel quality

Posted Jul 10, 2006 15:49 UTC (Mon) by tomsi (subscriber, #2306) [Link]

I filled out one bug in the survey, because of a missing/error in IDE driver.

This was fixed in a later version of the kernel (rc-? at the time).

To be honest, that isn't really a bug, just new hardware not supported yet. I think that for many, that is the issue - new hardware not supported yet.

Survey: Linux kernel quality

Posted Jul 10, 2006 16:33 UTC (Mon) by thoffman (subscriber, #3063) [Link]

The only kernel related problems that have really bothered me in the last couple of years are:

1. Firewire is flaky. Sometimes works, sometimes not. It has been this way since 2.4 or earlier. After reporting bugs a few times I've chosen the "suffer in silence" option and am replacing all my firewire hardware with USB 2.0.

2. On some oddball hardware (Tyan Dual-Slot-1 Pentium-III mb) the PCI IRQ routing is broken for kernels > 2.6.1x or so... didn't get as far as tracking down the details, just enough to know that Fedora Core 5's kernel will boot but can't find any PCI devices, while Fedora Core 4 works perfectly. The machine is a server so I don't care enough to investigate further :-(

3. Incompatibilities between changing APIs for wireless drivers vs. the userspace stuff... the binary "madwifi" driver works great with NetworkManager in FC4 but stopped working in FC5, this is apparently a known bug which the Fedora team doesn't care about, luckily it works perfectly in Ubuntu, which was one of many reasons I switched distros :-)

Survey: Linux kernel quality

Posted Jul 10, 2006 10:47 UTC (Mon) by mgh (subscriber, #5696) [Link]

An option to indicate something about how the fix was resolved would be good. I ticked "Suffered in Silence" although in fact I searched widely, used the release notes to ident the issue and where it had been fixed and then built the 2.6.17 kernel which fixed the issue (sata_nv lost ticks with x86_64 x2). Decided not to file a report because all the hard work had already been done.

Surrounding software

Posted Jul 10, 2006 14:11 UTC (Mon) by kleptog (subscriber, #1183) [Link]

I myself haven't come across bugs in the kernel, but I've had more trouble with the growing constellation of software that surrounds it which is slowly becoming essential for proper functioning of the system.

For example, I recently found several bugs in the r128 DRI driver. Not the kernel portion mind you (though there are still some flaky bits) but in the user space library using it. Still kind of irritating when that breaks on you.

I wonder how many issues are caused by people combining the kernel with versions of userspace libraries that the developers didn't expect?

Surrounding software

Posted Jul 10, 2006 14:52 UTC (Mon) by nix (subscriber, #2304) [Link]

<chorus>
udev, alsa-lib
</chorus>

(OK, so both of these are a *lot* better at back-compatibility these days: but back in the 2.6.10--15-odd days they made upgrades... somewhat scary.)

Survey: Linux kernel quality

Posted Jul 10, 2006 15:56 UTC (Mon) by jamesm (subscriber, #2273) [Link]

Should the survey itself be subscriber-only? It seems like it'd be better to leave it open to everyone.

Survey: Linux kernel quality

Posted Jul 10, 2006 16:54 UTC (Mon) by Lovechild (subscriber, #3592) [Link]

For me, running Fedora Development I find that most bugs get solved quickly, however a "security" update to the SCSI drivers disallowed users to burn DVD/CDs and despite reporting this many months ago and offering up a reward for a viable fix this has still not been resolved.

I take it the developers feel that I couldn't be trusted to not fuck up my DVD burner so it got disabled altogether.

Aside from that one issue and the fact that Reiser4 has yet to be merged (which would cause considerably less frustration amongst the users I talk to), I find the new kernel development methodology much more pleasing as a user.

The fact that I no longer have to wait 2 years for an unstable kernel tree to stabilise to the point where it can be considered "stable" but really isn't for several releases to come, due to lack of testing is brilliant.

The kernel has definitely improved in both stability and functionality. Take the .17 release, coupled with updates to mesa and x.org I finally have working stable hardware acceleration using the r300 driver.

Survey: Linux kernel quality

Posted Jul 10, 2006 18:26 UTC (Mon) by nix (subscriber, #2304) [Link]

No, it's that they don't trust hostile local non-root attackers not to send commands that vape other disks on the same bus (possibly *permanently* depending on what commands they send).

And neither should you.

Survey: Linux kernel quality

Posted Jul 11, 2006 3:58 UTC (Tue) by Lovechild (subscriber, #3592) [Link]

I think you misunderstood me, it's not that I'm annoyed they are keeping me safe. In fact one of the primary reasons I use Fedora, because they work to keep me safe.

Anyways, what I meant was that FC4 burned CDs for me (and about 20 other people who CC'ed on the bug), FC5 shipped without the ability and a solution was discussed, as I cannot burn CDs nor code the solution up, I offered up 100USD (I'm hoping others will pledge as well but even a little has to count) for an acceptable solution. This being the only longterm bug I've experienced in about 2 years, I'm very pleased with, and grateful for, the Linux kernel.

However I think functionality regressions are bad things in all but a few cases.

I hope it gets fixed shortly since it's preventing a personal deployment.

Survey: Linux kernel quality

Posted Jul 11, 2006 9:27 UTC (Tue) by nix (subscriber, #2304) [Link]

Hang on, you can't even burn CDs as root?

That's odd. I guess it's a bug :)

CD burning

Posted Jul 11, 2006 22:27 UTC (Tue) by dlang (subscriber, #313) [Link]

as noted above, the vulnerability was that anyone on the system could send commands to any drive to do anything to it (completely defeating security, along with potentially destroying the hardware)

the change was to eliminate this capability for non-root users (root is allowed to destroy your hardware :-)

the fix for non-root users is to set your burning software suid root. then it runs as root and is allowed to do whatever it wants.

what more are you looking for?

there has been talk about creating filters in the kernel that would allow burning specific commands but not allow other 'dangerous' commands, but nobody who is willing to talk knows what commands are necessary (this is made even more difficult by the fact that some of the necessary commands and the dangerous commands are, in fact, the same command with different parameters)

David Lang
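
To see whether a given burning program is already installed "suid root" as described above, one can inspect its mode bits and owner. This is a minimal sketch, not a recommendation: the path `/usr/bin/cdrecord` is only an assumed example location, and `is_setuid_root` is an illustrative helper name.

```python
import os
import stat


def is_setuid_root(mode, owner_uid):
    """True if a file with this mode and owner is setuid and owned by root.

    Such a program runs with root's effective UID no matter who starts it.
    """
    return bool(mode & stat.S_ISUID) and owner_uid == 0


def check_path(path):
    """Apply is_setuid_root() to a file on disk."""
    st = os.stat(path)
    return is_setuid_root(st.st_mode, st.st_uid)


# Hypothetical path; adjust to wherever the burning tool actually lives.
candidate = "/usr/bin/cdrecord"
if os.path.exists(candidate):
    print(candidate, "is setuid root:", check_path(candidate))
else:
    print(candidate, "not found on this system")
```

Note that making a program setuid root means every bug in it runs with full privileges, so this trades one risk for another.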

rats...

Posted Jul 10, 2006 17:27 UTC (Mon) by joey (subscriber, #328) [Link]

Had an opportunity to be one of the few reporting on all 5 bugs, but I forgot about my laptop's SATA CD issues. Found in 2.6.16, suffered in silence since I knew it was somewhat experimental anyway, partially fixed in 2.6.17.

To be fair, I probably also missed a dozen or more kernel bugs that didn't affect me very much, and were fixed quickly. It seems easier to remember bugs like the kernel no longer recognising a machine's PCI bus at all, especially when they're still not fixed after 9 months. (http://bugs.debian.org/332962) I also have the joy of running 2.6 on a variety of old non-x86 hardware and so encountering lots of other such breakage.

bug history; stability vs features

Posted Jul 10, 2006 18:33 UTC (Mon) by sanjoy (subscriber, #5026) [Link]

I just filled in the bug report history section by searching bugzilla.kernel.org to see what I'd reported: 21 bugs! A few are fixed or are duplicates; a couple have patches that fix them waiting to go into mainline (5000, 5989); a few are perhaps unfixable with the current design (4928, 5069); one I still need to report test results, whoops (6293); and several are still open. 5989 (s3 suspend hanging on the second suspend) involved tons of debugging and testing, so I'm especially glad that it has a candidate patch.

Suspend/resume have been the most buggy areas for me, probably because they are hard to debug and ACPI is incredibly complex. There's a saying that every Unix program grows in complexity until it can order pizza or maybe read email. The ACPI specification (an interpreter running in the kernel, who would have thought) seems well beyond that point!

I'd like to see a bug-squashing-only release and perhaps have it recur every odd minor number (e.g. 2.6.19 then 2.6.21 ...). Stability and correctness are more important than features, not least because correctness implies making the internal designs robust and that gives a good platform for adding features.

I try to teach my physics students to "consider extreme cases" as a way of reasoning. On this issue, the two extremes are Windows, which is full of features and cruft; and TeX, which is rock solid. TeX admittedly solves a different problem and has different requirements than an OS does. But users appreciate that the base is rock solid. I think Linux needs to move towards the TeX side of the extreme.

bug history; stability vs features

Posted Jul 11, 2006 4:29 UTC (Tue) by xoddam (subscriber, #2322) [Link]

Thank you Sanjoy for reporting your bugs and seeing through the process of
getting them analysed and fixed. It is users like you who make Linux
stable and usable for the rest of us (lazy freeloaders that we are :-).

> TeX, which is rock solid.

> I think Linux needs to move towards the TeX side of the extreme.

TeX is rock solid because it takes a well-defined input and produces a
well-defined output, and those basic requirements have not changed in
decades. It differs in sophistication but not in kind from filters like
cat, awk and grep, which are similarly solid.

Software which serves such ill-specified requirements as 'support every
piece of junk hardware in the world', 'run the buggy binary ACPI scripts
on laptop X without turning it into a brick' and 'placate the pundits
who claim Free Software is not ready for the desktop' is necessarily
rapidly evolving and therefore more error-prone.

It is unfortunate that the Linux kernel core falls into this category and
not that of a stable filter, but unavoidable. Users who require a
rock-solid base to their systems always have the alternative of sticking
to their old hardware and older kernels, or even switching to another OS
which is developed with solidity, and not features, as first priority.

Bug-squashing only releases

Posted Jul 13, 2006 2:46 UTC (Thu) by Max.Hyre (subscriber, #1054) [Link]

> I'd like to see a bug-squashing-only release and perhaps have it recur every odd minor number (e.g. 2.6.19 then 2.6.21 ...)

Well, considering that a 2.7 kernel has been roundly rejected, it appears that 2.6.x.y will be the value for the rest of time. Therefore, `2.6' no longer contains any useful information, and should be dropped, leaving kernel version numbers of `x.y', so the suggestion reduces to

> I'd like to see a bug-squashing-only release and perhaps have it recur every odd [...] number (e.g. 19 then 21 ...)

Hmmm, sound familiar?

Bug-squashing only releases

Posted Jul 14, 2006 5:21 UTC (Fri) by dlang (subscriber, #313) [Link]

if the cycle was fast enough it wouldn't be a problem, however when the cycle got to multiple years between stable kernels the result was that kernels shipped by distros were significantly incompatible with each other.

right now they are having trouble keeping the cycle to a couple of months. give them a little time to stabilise that (and hopefully speed it up a bit) and then they can try a stabilization-only kernel without the pressure for other improvements being too big

even with the current cycle there are things being posted today that are being debated as 2.6.19 or 2.6.20; pushing those out to .22 or so is hard on the people developing the idea.

it's also frequently hard to draw a line between a fix and a new thing, in many cases the best fixes (long term anyway) are drastic departures from what was there before.

Not necessarily *getting* less reliable; rather, consistently not very reliable

Posted Jul 10, 2006 18:29 UTC (Mon) by Richard_J_Neill (subscriber, #23093) [Link]

I wouldn't necessarily say that newer kernels (2.6.16+) are worse than older ones (2.6.12 ish). BUT, none of the kernels are sufficiently stable.

I run, or administrate about 10 machines, mainly desktops. None of the desktops ever have uptime exceeding about 2 months. I always unplug USB devices with some trepidation (even USB mass storage), and quite often experience X server crashes which take down the whole system. This inevitably happens when I have left the machine unattended for a week, and cannot physically reset it!
[A software watchdog, and panic=60 doesn't help much]

The servers are pretty solid in normal use, although not absolutely.

In my view, Linux is no longer sufficiently stable. [I had 500 days uptime on a 2.4.19 kernel in a server]. There are (at least) the following general problems:

1)When the kernel crashes, i.e. locks completely, requiring a reset, there is no way to get diagnostic info. Why can't we use, say, the floppy drive for debug after a panic? What happened to the idea of running a second copy of the kernel designed to take control after a panic, and dump diagnostics to file?

2)There still exist unkillable processes, or unmountable filesystems.
kill -9 should be able to terminate a process, even if it is in the "D" state. Otherwise, a reboot is required to solve the problem; worse, you can't reboot remotely, since the kernel hangs just before the end of the shutdown.

3)Unplugging a USB device which is still in use (eg a USB NIC, or a USB sound card) is a nearly guaranteed way to get a crash. It shouldn't happen!
[In this example, I specifically exclude mounted filesystems on USB.]

I am led to believe that this is due to poor driver design; we have just commissioned a USB driver for a USB I/O device, and this can be repeatedly hot-(un)plugged even while it is active, without causing any trouble at all. So, why not sound or networking?

On a related note, when you try to unmount a filesystem, and get a "filesystem busy" error, or when you try to rmmod a module and get "module in use", there needs to be a way to find out what is using it, and, if desired, to kill the process.

4)X server crashes should never take down the entire system. But they often do, especially when using 3D accel. This applies with both non-free (nvidia) drivers and free drivers (eg xorg's ati driver for the r128)

5)Most importantly, the class of people such as myself (technical users, who are not kernel developers) make up the majority of the Linux community. We own most of the hardware, and experience most of the more subtle bugs. Yet, as a resource, we go untapped, since there is very little we can do to debug a problem with our hardware. This is a dreadful waste of most of the community! Is there any way to automate debugging/diagnostics so that we can be of more help?

Regards,

Richard

Not necessarily *getting* less reliable; rather, consistently not very reliable

Posted Jul 10, 2006 18:50 UTC (Mon) by shirgall (guest, #24745) [Link]

http://sourceware.org/systemtap/

Not necessarily *getting* less reliable; rather, consistently not very reliable

Posted Jul 10, 2006 19:13 UTC (Mon) by arjan (subscriber, #36785) [Link]

3D acceleration runs partially in the kernel, and in the case of the binary crud, almost entirely. In addition, X is effectively a ring 3 kernel component; in just about all aspects it is part of the kernel: it does DMA, it programs PCI devices etc etc. That means it exposes the same risk as the kernel to the stability of the system...

and with 3D one of the most common failure scenarios is that the 3D card locks up the PCI bus. Not a lot the kernel can do after that to get anything useful out ;)

Not necessarily *getting* less reliable; rather, consistently not very reliable

Posted Jul 11, 2006 0:10 UTC (Tue) by dlang (subscriber, #313) [Link]

you can find what is using a filesystem with lsof (just do an lsof | grep path and you should be able to find the processes)
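As a rough illustration of what that lookup involves, here is a minimal Python sketch that walks a Linux-style /proc tree and reports which processes have their working directory or an open file under a given path. The function name find_users and its proc_root parameter are made up for this example; it is not how lsof itself is implemented, it does not look at memory-mapped files, and it can only see processes the caller has permission to inspect.

```python
import os


def find_users(path, proc_root="/proc"):
    """Return {pid: [paths]} for processes with a cwd or open file under `path`.

    Hypothetical helper: assumes the Linux /proc layout, where
    /proc/<pid>/cwd and /proc/<pid>/fd/<n> are symbolic links.
    """
    path = os.path.realpath(path)
    prefix = path.rstrip(os.sep) + os.sep
    users = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        pid_dir = os.path.join(proc_root, entry)
        links = [os.path.join(pid_dir, "cwd")]
        fd_dir = os.path.join(pid_dir, "fd")
        try:
            links += [os.path.join(fd_dir, fd) for fd in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited, or we lack permission to list its fds
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue  # link vanished or is unreadable
            if target == path or target.startswith(prefix):
                users.setdefault(int(entry), []).append(target)
    return users
```

Calling find_users("/mnt/usb") on a live system would return a dictionary keyed by PID, which could then be used to decide which processes to stop before unmounting.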

since X is accessing the memory and PCI bus directly there are all sorts of ways that it can crash the system that the kernel cannot do anything about. blame X, not the kernel for those crashes.

Survey: Linux kernel quality

Posted Jul 10, 2006 20:46 UTC (Mon) by landley (subscriber, #6789) [Link]

Ok, the IPW2200 wireless driver used to go off into never-never land back
in 2.6.10 or so, and since 2.6.12 it just does a lot of firmware resets.
Does that count as "binary-only driver loaded"? It's binary-only firmware
loaded into the card, but it's not a binary-only kernel module...

Lots of the questions about bugs are that way, where I'm thinking "the
info provided by answering these questions can't possibly be useful" as
I'm filling them in...

Survey: Linux kernel quality

Posted Jul 11, 2006 2:01 UTC (Tue) by jsedwards (guest, #38952) [Link]

While I have only found a couple of specific bugs myself, my overall feeling is that 2.6 is not as stable as 2.4 was. When we upgraded from 2.6.7 to 2.6.9, although it fixed some problems, we also had new problems we didn't have before. I can't help but think that part of the problem is doing development in the stable build. For one of the applications I have been working on it would have been much better to not have to deal with new features and major changes to subsystems every time we upgrade.

Survey: Linux kernel quality

Posted Jul 13, 2006 10:50 UTC (Thu) by toni (subscriber, #17559) [Link]

I also forgot to mention one or the other kernel problem, but I agree with the assessment that the 2.6 series has no consistent quality. One of these was unable to mount my parallel IDE drive, which is no problem with an older version that has known security holes, and on another machine, going from 2.6.16 to 2.6.17 sort of breaks disk access (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=377959). But at least it brought back sound which was lost from 2.6.12 to 2.6.16 ;-}

On the downside, I must note that I don't test every kernel variant, but only change the kernel when I suspect that a new kernel might have a fix for (my) known breakage, so error reporting isn't what it theoretically could be.

Survey: Linux kernel quality

Posted Jul 11, 2006 5:55 UTC (Tue) by dilinger (subscriber, #2867) [Link]

Oh man, the number of ACPI bugs that I've encountered in the 2.6 series.. I can't keep count. Other than that, however, my main problems have been related to old, known bugs (ie, pdflush being terribly bitrotten) and flakey support for newer hardware (marvell sata controllers, 8 port promise sata controllers, various wireless cards..).

Survey: Linux kernel quality

Posted Jul 13, 2006 3:15 UTC (Thu) by richardfish (subscriber, #20657) [Link]

For me, the best indicator of my perception of the kernel quality is the
fact that I am willing to try out -rc kernels on my main (desktop)
systems.

A good question to capture this might be:

"Assuming you do not have dedicated testing hardware, how nervous are you
about trying out -rc1 kernels?"
1. "No way in h*** am I doing that!"
2. "Only after making a fresh backup, and a few hours to kill"
3. "I cross my fingers while rebooting"
4. "No worries"

-Richard

Survey: Linux kernel quality

Posted Jul 13, 2006 4:36 UTC (Thu) by dlang (subscriber, #313) [Link]

I actually just upgraded my desktop at work from 2.6.17-rc2 to 2.6.18-rc1 (I never bothered to install a later 2.6.17 kernel on there, I would have had to reboot the box :-)

actually, what triggered the upgrade is that I did have the box lock up on me, so I upgraded to the latest. if it crashes on me I'll try something else.

on my server at home I went from the gentoo 2.6.15 to 2.6.17-rc1, to 2.6.17-rc2 and finally identified my problem as a driver issue (so I stopped using that scsi controller until the fix was announced). it's now running 2.6.18-rc1 as well, nothing critical there yet (the old server is still around, just in case). I'll probably start waiting for -rc2 on the home server and stick with my laptop or desktop for the -rc1 or -git versions.

production systems are very different. for those I pick what looks like a good one and start testing it when it's released; I don't start actually running critical systems on it for several weeks (I like to wait until 2.6.x+1 is out to see what fixes it includes, but the -stable support greatly reduces the need for that)

an interesting question would be to list a bunch of kernel versions and ask which one you would run on a test machine, which on a personal desktop/laptop, which you would let run on a friend's machine, and which you would put into production

David Lang

Bad Question: By which version was this bug fixed?

Posted Jul 13, 2006 7:40 UTC (Thu) by PaulDickson (subscriber, #478) [Link]

The results from this question aren't really meaningful unless they are directly linked to the previous question (when was it discovered?). Perhaps the result should be published as a duration rather than an endpoint.
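One way to express that is sketched below in Python: pair each respondent's "discovered" and "fixed" answers and report the number of 2.6.N releases between them. The function names are invented for this example, and the answer strings are assumed to match the survey's option labels. An answer of "2.6.9 or earlier" is treated as 2.6.9, so the result for those bugs is only a lower bound.

```python
import re


def minor_release(version):
    """Extract N from a '2.6.N...' label, e.g. '2.6.16 (or 2.6.16.x)' gives 16."""
    match = re.match(r"2\.6\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"not a 2.6.N version: {version!r}")
    return int(match.group(1))


def releases_to_fix(discovered, fixed):
    """Number of 2.6.N releases between discovery and fix.

    Returns None when either answer is missing ('--') or the bug is
    still unfixed, since no duration can be computed in those cases.
    """
    if discovered == "--" or fixed in ("--", "Still unfixed"):
        return None
    return minor_release(fixed) - minor_release(discovered)
```

For instance, a bug discovered in "2.6.15" and fixed by "2.6.17 (or 2.6.17.x)" would be reported as taking two releases to fix, which is easier to interpret than the two endpoints tabulated separately.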

Survey: Linux kernel quality

Posted Jul 13, 2006 15:25 UTC (Thu) by MisterIO (subscriber, #36192) [Link]

I think that the new development cycle (even though it is not so new anymore) is too fast, and it implies 3 different problems: 1) an insufficiently tested kernel, 2) increasingly out-of-date kernel documentation, and 3) new kernel developers have many more problems when trying to follow and be part of the development cycle.

Survey: Linux kernel quality

Posted Jul 14, 2006 14:51 UTC (Fri) by ranmachan (subscriber, #21283) [Link]

An option to add comments would be nice.
I've encountered two kernel bugs recently, but both on friends' systems.
One was Fedora Core, where after upgrading from 2.6.15 to 2.6.17 (IIRC) suspend to disk no longer worked from within X (would resume to text mode).
The other was a freshly installed OpenSuSE 10.1, where the KDE TV application would just hang in D-State.

Now that I think of it I remember a third bug, where wpa_supplicant just fails if I add the wireless interface (ipw2200) into a bridge. I guess I should report that last one (or at least verify if this is still the case with a bleeding-edge kernel).

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds