The problem is that the vast majority of regressions are not ones that will be found by tests like the ones you describe.
They are regressions under some specific workload that a user has, or with specific hardware (or combinations of hardware) that a user has.
As a result, it's impossible for any single test project to have comprehensive coverage.
The kernel regression tracker (and assistants) are not people running the tests. They are people working with the users who run into problems: helping those users identify the relevant factors of their environment/workload, identifying where the problem started, helping get the report in front of the appropriate maintainer, and then tracking it to keep it from getting lost (and hopefully the user doesn't disappear in the middle of all this). An additional task is trying to combine duplicate reports, which makes a report more reliable because it no longer depends on a single user.
If it's a workload-related problem, once a problem workload can be simulated, the maintainer/developer may be able to go off and work on it without needing the user to test it all the time. The user is still needed to test the resulting fix, though, because the simulated workload may not match the real workload as closely as everyone thinks.
If it's a hardware-related problem, it requires someone with the appropriate hardware to test the result (and if it's a combination of hardware it's even worse). The maintainers and developers cannot have every variant of hardware, so it's impossible for them to test, and when they think they have the fix, they will again need to go back to the user to validate it.
Posted Aug 29, 2012 23:22 UTC (Wed) by raven667 (subscriber, #5198)
Maybe this is just a different usage of the term "regression". In kernel-land, a regression as you describe it is one where a change negatively affects performance in some unique scenario, and Linux kernel developers take this seriously. For obvious functionality-type regressions, it would seem a test suite could, over time, test every code path.
KS2012: The future of kernel regression tracking
Posted Aug 29, 2012 23:27 UTC (Wed) by dlang (✭ supporter ✭, #313)
The problem is that many code paths will only be used with specific hardware, or in other specific conditions.
Plus, you need to remember that the kernel is multi-threaded, so timing of different things happening matters as well.
There are very few "obvious functionality" type regressions.
More testing is better than less testing
Posted Aug 30, 2012 0:17 UTC (Thu) by sdalley (subscriber, #18550)
More testing is better than less testing.
Of course, 99999 out of 100000 tests are going to pass every time. But processing power is so cheap, why not let it work for you? You never know what might have broken if there's no test.
And, once you have found an unexpected regression, you can then write a test for it if there wasn't one before. Having that test available, as part of a loadable "test" module, say, in a generic kernel release then means it can be triggered in the field on request by anyone at all with the latest update and whatever oddball hardware they have. This would greatly increase the result data and illuminate the circumstances and manner in which the test passes or fails.
And there's nothing like having to write unit tests to thrash out the idiocies and dark corner cases of a new interface. You don't even have to run them initially; the mere mental questioning debugs the design before its stupidities get coagulated into something you're going to have to maintain for years afterward.
I have myself grumbled about having to write tests. But I have never regretted the payoff in quality. And it's very satisfying to see the new release of one's library run its test suite in the blink of an eye and know that you didn't break anything important with your last changes. Or maybe you did, and you get to save yourself a wasted release and maybe a brown paper bag too.
And of course, the test results are gold dust to anyone who wants to document the interfaces. I hadn't heard that the Linux kernel's documentation has been a howling success story. A more formal requirement to write unit tests as part of the kernel development process would go far toward improving things. And, dammit, it's just satisfying to know that what you wrote definitely works.
I fully accept that there is no getting around the need for skill and interaction in tracking down the more devious regressions. It's just that we should work towards an environment where that is made as easy as possible.
I hope that new kernel regression trackers are soon appointed and get the support and remuneration their important job deserves. If not, one is saying in the loudest possible language that, words aside, quality is actually only for wimps and doesn't really matter.
More testing is better than less testing
Posted Aug 30, 2012 0:49 UTC (Thu) by dlang (✭ supporter ✭, #313)
The choice is not "do test driven development" vs "you obviously don't care about quality in the slightest". Taking this attitude is just insulting the people you are asking to do more work.
Maintaining the tests has overhead as well; it's not free. If they fail, is it because the system is broken, or because the test didn't get changed to match the new way the kernel works? (And is the new way the kernel works actually going to work in the real world?)
More testing is better than less testing
Posted Oct 1, 2012 18:20 UTC (Mon) by oak (subscriber, #2786)
> More testing is better than less testing.
No, whether that's true depends a lot on the tests and also what you're testing.
I have been in a situation where analyzing results from tests took more time than actually manually finding the bugs *and* fixing them. Eventually we got rid of them. They were quality tests, and they were at the wrong level in the stack.
Tests are mostly useful only if:
* they're (mostly) automated
* they produce statistically reliable and unambiguous results
* writing them, maintaining them and analyzing their results saves time in the long run
Preferably they should also be mostly auto-generated, so that they get updated automatically with the code, there's less code to maintain, and issues with it are more apparent.
Does test code need tests?
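As a rough sketch of the "statistically reliable" criterion in the list above: one way to find out whether a test gives the same answer every time is simply to rerun it and count. Everything here (`classify`, the three example tests, the default of 50 runs) is made up for illustration and is not taken from any real test framework.

```python
import itertools

def classify(test, runs=50):
    """Run a zero-argument test (returns True on pass) `runs` times.

    Returns (label, number_of_passes). A test that neither always
    passes nor always fails is labelled "flaky".
    """
    passes = sum(1 for _ in range(runs) if test())
    if passes == runs:
        return ("stable-pass", passes)
    if passes == 0:
        return ("stable-fail", passes)
    return ("flaky", passes)

# Hypothetical example tests.
def always_passes():
    return True

def always_fails():
    return False

def make_intermittent():
    """Build a test that fails on every third call (calls 0, 3, 6, ...)."""
    counter = itertools.count()
    def test():
        return next(counter) % 3 != 0
    return test

if __name__ == "__main__":
    for name, test in [("always_passes", always_passes),
                       ("always_fails", always_fails),
                       ("intermittent", make_intermittent())]:
        print(name, classify(test))
```

A test that comes back "flaky" costs analysis time on every failure, which is exactly the overhead described above.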
More testing is better than less testing
Posted Oct 1, 2012 22:49 UTC (Mon) by sdalley (subscriber, #18550)
I agree with what you say.
Given my impression that there is a relative dearth of formal testing, more testing will obviously be better. I was assuming testing at the appropriate levels in the stack.
It'd already be great to pick lower-hanging fruit like automatic tests for library/programmer/kernel/userspace/filesystem/device interfaces, which must never change without good reason; when they do break, you jolly well want to know as soon as possible. Regressions Are Bad. If these tests were put into an installable package, then anyone who wanted to help in the testing effort could run them in their own peculiar environment and optionally have the failures forwarded to a central clearinghouse, as Microsoft does with its system crashes.
Harder tests like response latency under varying loads and configurations, memory management fragmentation problems, etc., necessarily have a symbiotic relationship with the code they are testing, and have to be maintained together with it.
More testing is better than less testing
Posted Oct 4, 2012 0:02 UTC (Thu) by nix (subscriber, #2304)
> Does test code need tests?
That depends on its complexity. Testsuite engines are often complex enough to merit it, but it's hard to figure out a way to test most tests except to test the same thing again in a different way and make sure the results of both tests agree. If you know of a less tiresome way that doesn't require doing the same work more than twice (because thinking of a second way to test something is often harder than thinking of the first), I'm all ears.
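To make the "test the same thing again in a different way" idea concrete, here is a minimal sketch. It assumes the thing being checked is the output of a sort; the three checkers and the case generator are hypothetical names invented for this example. Two independently written checkers are run over the same candidate outputs (some correct, some deliberately corrupted), and any case where they disagree means at least one of the checkers is wrong.

```python
import random
from collections import Counter

def check_by_reference(data, candidate):
    """First way: compare against a trusted reference result."""
    return candidate == sorted(data)

def check_by_property(data, candidate):
    """Second way: check ordering and that no items were lost or added."""
    ordered = all(a <= b for a, b in zip(candidate, candidate[1:]))
    return ordered and Counter(candidate) == Counter(data)

def check_order_only(data, candidate):
    """Deliberately incomplete checker: forgets to compare the contents."""
    return all(a <= b for a, b in zip(candidate, candidate[1:]))

def make_cases(trials=200, seed=0):
    """Build (input, candidate output) pairs, including corrupted outputs."""
    rng = random.Random(seed)
    cases = []
    for _ in range(trials):
        data = [rng.randint(-50, 50) for _ in range(rng.randint(1, 20))]
        good = sorted(data)
        cases.append((data, good))         # correct output
        cases.append((data, good[1:]))     # one element dropped, still ordered
        cases.append((data, good[::-1]))   # reversed
    return cases

def disagreements(check_a, check_b, cases):
    """Return the cases on which the two checkers give different verdicts."""
    return [(d, c) for d, c in cases if check_a(d, c) != check_b(d, c)]

if __name__ == "__main__":
    cases = make_cases()
    print(len(disagreements(check_by_reference, check_by_property, cases)))
    print(len(disagreements(check_by_reference, check_order_only, cases)))
```

The first two checkers agree on every case; the incomplete one is exposed by the outputs with a dropped element. Note that this only works because the cases include wrong outputs: on correct outputs alone, a checker that always says "pass" would agree with everything.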
More testing is better than less testing
Posted Oct 4, 2012 0:10 UTC (Thu) by Cyberax (✭ supporter ✭, #52523)
Yes, test code needs tests too. There are projects that do things like checking whether tests are effective. For example, if you comment out (or otherwise disable) part of a unit test's setup, the test should fail.
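A toy version of that check, in the spirit of mutation testing, might look like the following. This is only a sketch under stated assumptions: the test is modelled as a list of setup functions plus a check function operating on a shared dict, and all names (`run_test`, `redundant_setup_steps`, the queue example) are hypothetical rather than taken from any existing tool.

```python
def run_test(setup_steps, check):
    """Run each setup step on a fresh state, then the check. True means pass."""
    state = {}
    try:
        for step in setup_steps:
            step(state)
        check(state)
    except Exception:
        return False
    return True

def redundant_setup_steps(setup_steps, check):
    """Disable one setup step at a time and rerun the test.

    Returns the names of steps whose removal does NOT make the test
    fail, i.e. setup that the test never actually verifies.
    """
    if not run_test(setup_steps, check):
        raise ValueError("the unmodified test must pass first")
    survivors = []
    for i, step in enumerate(setup_steps):
        mutated = setup_steps[:i] + setup_steps[i + 1:]
        if run_test(mutated, check):
            survivors.append(step.__name__)
    return survivors

# Hypothetical test: two setup steps matter, one does not.
def create_queue(state):
    state["queue"] = []

def enqueue_item(state):
    state["queue"].append("job-1")

def set_unused_flag(state):
    state["verbose"] = True

def check_queue_has_job(state):
    if state["queue"] != ["job-1"]:
        raise AssertionError("queue contents are wrong")

if __name__ == "__main__":
    steps = [create_queue, enqueue_item, set_unused_flag]
    print(redundant_setup_steps(steps, check_queue_has_job))
```

Here only `set_unused_flag` survives, which suggests either that the step is dead setup or that the check is missing an assertion about it.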