
Kernel quality control, or the lack thereof

Posted Dec 8, 2018 1:20 UTC (Sat) by vomlehn (guest, #45588)
Parent article: Kernel quality control, or the lack thereof

One hopes that these tests will be added to a well maintained test project, though such projects are even less common than well maintained development projects. LTP comes to mind, but there may be other possibilities. And tests that aren't run don't really exist, for all practical purposes.



Kernel quality control, or the lack thereof

Posted Dec 8, 2018 16:45 UTC (Sat) by marcH (subscriber, #57642) [Link] (9 responses)

> And tests that aren't run don't really exist, for all practical purposes.

Agreed 200%, this is the core issue:

> > We ended up here because we *trusted* that ...

Either tests already exist and it's just a matter of going the extra mile to automate them and share their results.

Or there's no decent, repeatable, re-usable test coverage, and new features should simply not be added until there is. "Thanks, your patches look great; now where are your test results, please?" Not exactly ground-breaking software engineering.

Exceptions could be tolerated for hardware-specific or pre-silicon drivers which require very specific test environments and for which vendors can only hurt themselves anyway. That clearly doesn't seem to be the case for XFS or the VFS.

Validation and automation have a lesser reputation than development and tend to attract less talent. One possible and extremely simple way to address this is to hold the *development* of tests and automation to the same open-source and code-review standards.

Kernel quality control, or the lack thereof

Posted Dec 9, 2018 11:17 UTC (Sun) by iabervon (subscriber, #722) [Link] (7 responses)

I think, for this case, fuzzing is probably more useful than developer-written tests. If a developer misses the code for some checks necessary to maintain security constraints, what are the chances they'll write tests that verify that using the API in a way they didn't intend doesn't violate security constraints they didn't think about? I'd be more convinced if they taught a fuzzing framework how to call their API and set it loose on a filesystem with a lot of interesting cases. I care somewhat less that it does what it's supposed to do than that whatever it actually does is something the caller is allowed to do.
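
To make that property concrete, here is a minimal sketch (not tied to any real fuzzing framework, and certainly not how syzkaller does it): hammer a scratch filesystem with random operations as an unprivileged user, and assert only that anything which *succeeds* was actually permitted. The mount point, file layout and the crude permission check are all assumptions made up for illustration.

    import os
    import random
    import stat

    MNT = "/mnt/fuzz-test"   # hypothetical scratch mount populated with files of mixed ownership
    random.seed(0)           # fixed seed so a failure can be replayed

    def may_write(path):
        # Crude approximation of "is this uid allowed to write here";
        # ignores ACLs, capabilities and the like, which is enough for a sketch.
        st = os.stat(path)
        if st.st_uid == os.getuid():
            return bool(st.st_mode & stat.S_IWUSR)
        if st.st_gid in os.getgroups():
            return bool(st.st_mode & stat.S_IWGRP)
        return bool(st.st_mode & stat.S_IWOTH)

    for _ in range(100_000):
        path = os.path.join(MNT, "f%d" % random.randrange(32))
        flags = random.choice([os.O_RDONLY, os.O_WRONLY, os.O_RDWR]) \
                | random.choice([0, os.O_APPEND, os.O_TRUNC])
        try:
            fd = os.open(path, flags)
        except OSError:
            continue         # errors are fine; unauthorized successes are not
        try:
            if flags & (os.O_WRONLY | os.O_RDWR) and not may_write(path):
                raise AssertionError("opened %s for writing without permission" % path)
        finally:
            os.close(fd)

A real run would of course randomize far more than open() flags (renames, links, truncations, xattrs), but the oracle stays the same: "allowed or not", rather than "matches what the developer intended".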

Kernel quality control, or the lack thereof

Posted Dec 9, 2018 14:20 UTC (Sun) by saffroy (guest, #43999) [Link] (5 responses)

Fuzzing is extremely useful, but it still needs a *thinking* developer to help it generate interesting cases in reasonable time.

Besides tests themselves, it helps a LOT to have some kind of test coverage report, just to remind you of which parts of the code are never touched by any of your current tests.

Do people publish such coverage reports for the kernel?

Kernel quality control, or the lack thereof

Posted Dec 10, 2018 9:49 UTC (Mon) by metan (subscriber, #74107) [Link] (4 responses)

The tool to generate coverage for the kernel is maintained at https://github.com/linux-test-project/lcov; it should work, but I haven't tried it.
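
For anyone who wants to try it, the rough shape of a run looks something like the sketch below, assuming a kernel built with CONFIG_GCOV_KERNEL=y that exports its gcov data under /sys/kernel/debug/gcov. The snapshot step, helper name and all paths are illustrative only; the kernel's gcov documentation describes the authoritative procedure.

    import shutil
    import subprocess
    import tempfile

    GCOV_DEBUGFS = "/sys/kernel/debug/gcov"   # where the gcov-enabled kernel exports its data

    def kernel_coverage_report(output_dir="kernel-cov"):
        # Snapshot the debugfs tree first so the report reflects one test run,
        # then let lcov/genhtml turn it into a browsable HTML report.
        snapshot = tempfile.mkdtemp(prefix="kgcov-")
        shutil.copytree(GCOV_DEBUGFS, snapshot, dirs_exist_ok=True)  # follows the .gcno symlinks
        info = snapshot + "/coverage.info"
        subprocess.run(["lcov", "--capture", "--directory", snapshot,
                        "--output-file", info], check=True)
        subprocess.run(["genhtml", info, "--output-directory", output_dir],
                       check=True)
        return output_dir

    if __name__ == "__main__":
        print("report written to", kernel_coverage_report())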

However, I can pretty much say that the main problems I see are various corner cases that are rarely hit (i.e. mostly failures and error propagation) and drivers. My take on this is that there is no point in doing coverage analysis when the gaps we have are enormous and easy to spot. Just have a look at our backlog of missing coverage in LTP at the moment https://github.com/linux-test-project/ltp/labels/missing%..., and these are just scratching the surface with the most obviously missing syscalls. We may try to proceed with the coverage analysis once we are out of work there, which will hopefully happen at some point.

The problems with corner cases can likely be caught by a combination of unit testing and fuzzing. Driver testing is more problematic, though; there is only so much you can do with QEMU and emulated hardware. Proper driver testing needs a reasonably sized lab stacked with hardware, which is much harder to set up and maintain, and that is not going to happen unless somebody invests a reasonable amount of resources into it. But there is light at the end of the tunnel as well: as far as I know, Linaro has a big automated lab stacked with embedded hardware to run tests on, we are trying to tackle an automated server-grade hardware lab here at SUSE, and I'm pretty sure there is a lot more out there that just isn't visible to the general public.

Kernel quality control, or the lack thereof

Posted Dec 10, 2018 12:57 UTC (Mon) by nix (subscriber, #2304) [Link] (3 responses)

Yeah -- and lcov won't help with the sorts of things this LWN post is talking about anyway. 100%-coverage filesystems could easily still have all these bugs, because they relate to specific states of the filesystem, and *no* coverage system could *possibly* track whether we got complete coverage of all possible corrupted filesystems! (Or, indeed, all possible states of the program: for all but the most trivial programs there are far too many.)

There is no alternative to thinking about these problems, I'm afraid. There is no magic automatable road to well-tested software of this complexity.

Kernel quality control, or the lack thereof

Posted Dec 10, 2018 13:14 UTC (Mon) by metan (subscriber, #74107) [Link] (2 responses)

Exactly, there is nothing that would replace well-thought-out tests written by senior developers; we only need to throw more manpower at the problem, which seems to be happening, albeit slowly.

Kernel quality control, or the lack thereof

Posted Dec 11, 2018 17:37 UTC (Tue) by nix (subscriber, #2304) [Link] (1 responses)

Indeed, if there is any part of the kernel this has really happened for, filesystems, and in particular XFS, must be it, and they probably have the best test coverage of all. I mean, xfstests is called that for a *reason*. :) (I tell a lie: RCU has gone to the next step beyond this, formal model verification. Coming up with a formal model of XFS would be... a big job!)

Kernel quality control, or the lack thereof

Posted Dec 11, 2018 20:59 UTC (Tue) by marcH (subscriber, #57642) [Link]

Interesting, now the question is: how much did/do xfstests offer for the two specific features reported above?

Kernel quality control, or the lack thereof

Posted Dec 9, 2018 17:28 UTC (Sun) by marcH (subscriber, #57642) [Link]

Features or security? Sad, but the priority has to be the former to do business. Have fewer, more secure features and you lose in the marketplace almost every time.

Thinking about it, computer security is a bit like... healthcare: extremely opaque and nearly impossible for customers to make educated choices about. From a legal perspective I suspect it's even worse: breach after breach and absolutely zero liability. To top it off, class actions are no more, killed by arbitration clauses in all terms and conditions. Brands might be more useful in security, though.

https://www.google.com/search?q=site%3Aschneier.com+liabi...

Kernel quality control, or the lack thereof

Posted Dec 9, 2018 13:32 UTC (Sun) by mupuf (subscriber, #86890) [Link]

> Validation and automation have a lesser reputation than development and tend to attract less talent. One possible and extremely simple way to address this is to hold the *development* of tests and automation to the same open-source and code-review standards.

This is what we do in the i915 community. No feature lands in DRM without a test in IGT, and CI developers are part of the same team.

My view on this is that good quality comes from:
1) Well-written driver code, peer-reviewed to catch architectural issues
2) Good tests exercising the main use cases and corner cases; tests are considered at the same level as driver code
3) Good understanding of the CI system that will execute these tests
4) Good follow-up on the bugs filed when these tests fail

Point 1) is already done pretty well in the Linux community.

Point 2) is hard to justify when tests are not executed, but comes more naturally once there is a good CI system.

Point 3) is probably the biggest issue for the Linux CI systems: the driver usually covers a wide variety of hardware and configurations, which cannot all be tested in CI at all times. This leads to complexity in the CI system that needs to be understood by developers in order to prevent regressions. This is why our CI is maintained and developed in the same team that develops the driver.

Point 4) comes pretty naturally when introducing a filtering system for CI failures. Some failures are known and pending a fix, and we do not want these to be considered blocking for a patch series. We have been using bugs as a forum for developers to discuss how to fix these issues; the bugs are associated with CI failures by a tool doing pattern matching (https://intel-gfx-ci.01.org/cibuglog/). The catch is that these bugs are now every developer's responsibility to fix, and that requires a change in the development culture: holding up some new features until the more important bugs are fixed.
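
In case the pattern-matching idea sounds abstract, the core of such a filter can be sketched in a few lines; the bug IDs, regexes, test names and log snippets below are invented for illustration and are not taken from cibuglog.

    import re

    # Known, already-filed issues: bug id -> pattern that identifies the failure.
    # Both the IDs and the patterns here are made up.
    KNOWN_BUGS = {
        "bug-107820": re.compile(r"WARN.*hypothetical_plane_check"),
        "bug-108888": re.compile(r"timeout waiting for vblank", re.IGNORECASE),
    }

    def triage(log_text):
        # Return the matching known bug id, or None if the failure is new.
        for bug, pattern in KNOWN_BUGS.items():
            if pattern.search(log_text):
                return bug
        return None

    def blocking_failures(results):
        # Only failures that do not match a known bug should block the series.
        return {test: log for test, log in results.items() if triage(log) is None}

    if __name__ == "__main__":
        runs = {    # invented test names and log snippets
            "kms_flip@basic": "timeout waiting for vblank on pipe A",
            "gem_exec@smoke": "BUG: unable to handle kernel NULL pointer dereference",
        }
        for test, log in runs.items():
            print(test, "->", triage(log) or "NEW (blocking)")

The hard part, as noted above, is not the matching itself but the culture of actually fixing what the filter keeps surfacing.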

I guess we are getting quite good at CI, and I am really looking forward to the CI team having more time to share our knowledge and tools for others to replicate! We have already started working on an open source toolbox for CI (https://gitlab.freedesktop.org/gfx-ci), as discussed at XDC 2018 (https://xdc2018.x.org/slides/GFX_Testing_Workshop.pdf).

Kernel quality control, or the lack thereof

Posted Dec 10, 2018 20:35 UTC (Mon) by sandeen (guest, #42852) [Link]

"One hopes that these test will be added to a well maintained test project"

You may wish to subscribe to fstests@vger.kernel.org or peruse git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git if this sort of thing is of interest to you.

