Noise is better than bias

Posted Jul 11, 2006 18:31 UTC (Tue) by jimmybgood (guest, #26142)
In reply to: A survey on kernel quality by vblum
Parent article: A survey on kernel quality

I'm going to try really hard once more to explain my point, which seems to have been missed.

If you want to find out something with a survey, a large noisy sample is far better than a small biased sample. If Andrew Morton thinks he's going to find out something by limiting his survey to a small sample of "knowledgeable" respondents, he's making a mistake. He works in an environment where expert knowledge is highly desirable and, out of habit, he imagines that experts might be able to tell him whether the kernel is getting buggier.
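As a rough illustration of that claim, here is a minimal simulation sketch. Every number in it is an assumption chosen for illustration (a "true" score of 5.0, a bias of 1.0, the sample sizes and spreads), and it assumes the noise in the large sample averages to zero, which a real survey may not satisfy.

```python
import random
import statistics

def simulate(true_value=5.0, seed=0):
    """Compare the estimation error of two hypothetical survey samples."""
    rng = random.Random(seed)
    # Large, noisy, unbiased sample: zero-mean noise with a big spread.
    noisy = [true_value + rng.gauss(0.0, 3.0) for _ in range(10_000)]
    # Small, precise, biased sample: tight spread, shifted by a fixed offset.
    biased = [true_value + 1.0 + rng.gauss(0.0, 0.5) for _ in range(50)]
    return (abs(statistics.fmean(noisy) - true_value),
            abs(statistics.fmean(biased) - true_value))

noisy_err, biased_err = simulate()
print(f"large noisy sample error:  {noisy_err:.3f}")
print(f"small biased sample error: {biased_err:.3f}")
```

Under these assumptions the noise averages out as the sample grows, while the fixed offset in the small sample does not shrink no matter how precise each response is.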

Even studying the kernel itself with objective code analysis tools is not a valid way to answer his questions, because those tools have recently been applied to the kernel. Many of the bugs those tools can detect have already been fixed, so the bias will tend to make the kernel appear to be less buggy than it really is. A survey is a good approach to finding out if the Linux kernel is getting buggier.

I have no proof that LWN subscribers are biased, but it is generally accepted that professional societies _are_ biased. I think LWN functions more as a professional society than as a social group.

If you want a good survey, procure as many unique responses from as wide a sample as you can get.



Noise is better than bias

Posted Jul 11, 2006 19:36 UTC (Tue) by nix (subscriber, #2304)

Fine, so figure out a way of preventing a single malicious attacker from poisoning an open survey by means of an auto-submission robot. Avoid penalizing multiple people behind a single proxy, and detect a single malicious attacker routing false requests via a network such as Tor or a bunch of compromised hosts (he needn't be *running* said botnet: there are vast numbers of known botted hosts with open proxies running on them; he can use some of those).

Until you've done that, a scheme whereby survey responses are tied to single entities (like the registered-subscriber scheme) is needed.
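A minimal sketch of what tying responses to single entities could look like, assuming each response arrives with an account ID. The function name, the IDs, and the answers are all hypothetical, and this is not a description of how LWN actually runs its surveys.

```python
def one_vote_per_account(responses):
    """Keep only the first response seen for each account ID."""
    kept = {}
    for account_id, answer in responses:
        # setdefault stores the answer only if this account has none yet.
        kept.setdefault(account_id, answer)
    return kept

# Two ordinary respondents plus one account submitting 1000 times.
responses = [(101, "buggier"), (102, "same")] + [(666, "better")] * 1000
tally = one_vote_per_account(responses)
print(len(tally))
```

The robot's thousand submissions collapse into a single response, which is the property an open, anonymous survey cannot guarantee.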

Bias BS

Posted Jul 11, 2006 21:48 UTC (Tue) by s_cargo (guest, #10473)

I think your objections would be valid if Andrew Morton asked to survey kernel developers. He did not. This is where your earlier analogy regarding automotive engineers falls flat. I consider it a completely reasonable assumption that LWN subscribers are "serious" users with no motivation for creating any false sense of kernel quality.

And to be blunt, you and I have paid nothing to keep LWN going. If everyone were "freeloading" as we are, there wouldn't be any LWN to conduct a survey in the first place. You should be grateful you only have to wait one week to have access to what subscribers have paid to access. So stop your whining.

Noise is better than bias

Posted Jul 13, 2006 14:53 UTC (Thu) by lysse (guest, #3190)

> If you want to find out something with a survey, a large noisy sample is far better than a small biased sample.

Not if the bias actually consists of a useful property you want to capture, not if the large noisy sample results in such a poor signal-to-noise ratio that anything meaningful descends to the level of statistical insignificance, and not if the noise itself is biased.
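The last of those objections can be sketched with a small simulation. The values are assumptions picked for illustration (a "true" score of 5.0, a noise spread of 3.0, a noise offset of 0.8); the point is only that noise with a nonzero mean does not average away as the sample grows.

```python
import random
import statistics

def mean_error(n, noise_mean, noise_sd=3.0, true_value=5.0, seed=1):
    """Absolute error of the sample mean for n noisy responses."""
    rng = random.Random(seed)
    sample = [true_value + rng.gauss(noise_mean, noise_sd) for _ in range(n)]
    return abs(statistics.fmean(sample) - true_value)

# Columns: sample size, error with zero-mean noise, error with biased noise.
for n in (100, 10_000, 100_000):
    print(n, round(mean_error(n, 0.0), 3), round(mean_error(n, 0.8), 3))
```

With zero-mean noise the error shrinks as n grows; with biased noise it settles near the offset, so a larger sample only gives a more precise estimate of the wrong value.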

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds