How many kernel test frameworks?
The kernel self-test framework (kselftest) has been a part of the kernel for some time now; a relatively recent proposal for a kernel unit-testing framework, called KUnit, has left some wondering why both exist. In a lengthy discussion thread about KUnit, the justification for adding another testing framework to the kernel was debated. While there are different use cases for kselftest and KUnit, there was concern about fragmenting the kernel-testing landscape.
In early May, Brendan Higgins posted v2 of the KUnit patch set with an eye toward getting it into Linux 5.2. That was deemed a bit of an overaggressive schedule by Greg Kroah-Hartman and Shuah Khan, given that the merge window would be opening a week or so later. But Khan did agree that the patches could come in via her kselftest tree. There were some technical objections to some of the patches, which is no surprise, but overall the patches were met with approval and gathered some Reviewed-by tags.
There were some sticking points, however. Several people, including Kroah-Hartman and Logan Gunthorpe, complained about the reliance on user-mode Linux (UML) to run the tests. Higgins said that he had "mostly fixed that". The KUnit tests will now run on any architecture, though the Python wrapper scripts still expect to run the tests in UML. He said that he should probably document that, which he has subsequently done.
A more overarching concern was raised by Frank Rowand. From his understanding, using UML is meant to "avoid booting a kernel on real hardware or in a virtual machine", he said, but he does not really see that as anything other than "a matter of semantics"; running Linux via UML is simply a different form of virtualization. Furthermore:
I would guess that some developers will focus on just one of the two test environments (and some will focus on both), splitting the development resources instead of pooling them on a common infrastructure.
Khan replied that she sees kselftest and KUnit as complementary. Kselftest is "a collection of user-space tests with a few kernel test modules back-ending the tests in some cases", while KUnit provides a framework for in-kernel testing. Rowand was not particularly swayed by that argument, however. He sees that there is (or could be) an almost complete overlap between the two.
Unlike some other developers, Ted Ts'o actually finds the use of UML to be beneficial. He described some unit tests that are under development for ext4; they will test certain features of ext4 in isolation from any other part of the kernel, which is where he sees the value in KUnit. The framework provided with kselftest targets running tests from user space, which requires booting a real kernel, while KUnit is simpler and faster to use.
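To make the distinction concrete, here is a minimal sketch of what a KUnit test looks like, using the interface as it exists in current kernels (the patch set under discussion differed in some details); ext4_double() is a hypothetical stand-in for whatever internal ext4 helper a real unit test would exercise:

    /* A minimal KUnit test; it runs in the kernel, with no user space needed. */
    #include <kunit/test.h>

    /* Hypothetical function under test. */
    static int ext4_double(int x)
    {
        return x * 2;
    }

    static void ext4_double_test(struct kunit *test)
    {
        KUNIT_EXPECT_EQ(test, 4, ext4_double(2));
        KUNIT_EXPECT_EQ(test, 0, ext4_double(0));
    }

    static struct kunit_case ext4_example_cases[] = {
        KUNIT_CASE(ext4_double_test),
        {}
    };

    static struct kunit_suite ext4_example_suite = {
        .name = "ext4-example",
        .test_cases = ext4_example_cases,
    };
    kunit_test_suite(ext4_example_suite);

Because nothing here touches user space, the whole suite can run at boot in a UML kernel (or any other kernel) in a fraction of a second.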
Frameworks
Part of the difference of opinion may hinge, to a certain extent, on the definition of "framework". Ts'o stridently argued that kselftest is not providing an in-kernel testing framework, but Rowand just as vehemently disagreed with that. Rowand pointed to the use of kernel modules in kselftest and noted that those modules can be built into a UML kernel. Ts'o did not think that added up to a framework since "each of the in-kernel code has to create their own in-kernel test infrastructure". Rowand sees that differently: "The kselftest in-kernel tests follow a common pattern. As such, there is a framework." To Ts'o, that doesn't really equate to a framework, though perhaps the situation could change down the road.
In addition, Ts'o said that kselftest expects to have a working user-space environment.
Rowand disagreed:
No userspace environment needed. So exactly the same overhead as KUnit when invoked in that manner.
Ts'o is not convinced by that. He noted that the kselftest documentation is missing any mention of this kind of test. There are tests that run before init is started, but they aren't part of the kselftest framework.
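For reference, the in-kernel tests that both are referring to follow the pattern of the test modules in lib/ (lib/test_printf.c and friends): a module whose init function runs its checks and reports results to the kernel log. A minimal sketch of that pattern, with an invented module name and invented checks:

    /* Sketch of a kselftest-style in-kernel test module (invented example). */
    #include <linux/errno.h>
    #include <linux/init.h>
    #include <linux/module.h>
    #include <linux/printk.h>
    #include <linux/types.h>

    static unsigned int failures;

    static void __init check(bool ok, const char *name)
    {
        if (ok) {
            pr_info("test_example: %s: ok\n", name);
        } else {
            pr_err("test_example: %s: FAILED\n", name);
            failures++;
        }
    }

    static int __init test_example_init(void)
    {
        check(1 + 2 == 3, "addition");
        check(sizeof(long) >= 4, "long-size");
        pr_info("test_example: %u failure(s)\n", failures);
        return failures ? -EINVAL : 0;
    }
    module_init(test_example_init);

    MODULE_LICENSE("GPL");

Built into a UML kernel, a module like this runs at boot with no user space at all, which is Rowand's point; what it does not get from kselftest is any shared infrastructure for declaring or reporting its checks, which is Ts'o's.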
Overlaps
There may be overlaps in the functionality of KUnit and kselftest, however. Knut Omang, who is part of the Kernel Test Framework project—another unit-testing project for the kernel that is not upstream—pointed out that there are two types of tests that are being conflated a bit in the discussion. One is an isolated test of a particular subsystem that is meant to be run rapidly and repeatedly by developers of that subsystem. The other is meant to test interactions between more than one subsystem and might be run as part of a regression test suite or in a continuous-integration effort, though it would be used by developers as well. The unit tests being developed for ext4 would fall into the first category, while xfstests would fall into the latter.
Omang said that the two could potentially be combined into a single tool, with common configuration files, test reporting, and so on. That is what KTF is trying to do, he said. But Ts'o is skeptical that a single test framework is the way forward; there are already multiple frameworks out there, he said, including xfstests, blktests, kselftest, and so on. Omang also suggested that UML was still muddying the waters in terms of single-subsystem unit tests, but Ts'o sees things differently.
Gunthorpe saw some potential overlap as well. He made a distinction in test styles that was somewhat similar to Omang's. He noted that there are not many users of the kselftest_harness.h interface at this point, so it might make sense to look at unifying the areas that overlap sooner rather than later.
Looking at the selftests tree in the repo, we already have similar items to what Kunit is adding as I described in point (2) above. kselftest_harness.h contains macros like EXPECT_* and ASSERT_* with very similar intentions to the new KUNIT_EXPECT_* and KUNIT_ASSERT_* macros.
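For comparison, a test written against that header looks roughly like this; a minimal sketch, assuming it is built as one of the user-space selftests with kselftest_harness.h on the include path (both test cases are invented for illustration):

    /*
     * A user-space selftest using kselftest_harness.h, which supplies the
     * TEST() macro, the EXPECT_ and ASSERT_ checks, and TEST_HARNESS_MAIN.
     */
    #include <string.h>

    #include "kselftest_harness.h"

    TEST(addition)
    {
        EXPECT_EQ(4, 2 + 2);    /* a failed EXPECT is recorded; the test continues */
    }

    TEST(strings)
    {
        const char s[] = "kselftest";

        ASSERT_EQ('k', s[0]);   /* a failed ASSERT aborts this test case */
        EXPECT_EQ(9UL, strlen(s));
    }

    TEST_HARNESS_MAIN

The harness forks a child process for each TEST() case, which is part of what ties it to user space.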
Ts'o is not opposed to unifying the tests in whatever way makes sense, but said that kselftest_harness.h needs to be reworked before in-kernel tests can use it. Gunthorpe seemed to change his mind somewhat when he replied that perhaps the amount of work needed to unify the two use cases was not worth it.
Ultimately, what Rowand seems to be after is a better justification, in the patch series itself, for KUnit and why it is, and needs to be, different from kselftest. "I was looking for a fuller, better explanation than was given in patch 0 of how KUnit provides something that is different than what kselftest provides for creating unit tests for kernel code." Higgins asked for specific suggestions on where the documentation of KUnit was lacking; Rowand replied that in-patch justification is what he, as a code reviewer, was looking for.
But Gunthorpe did not agree; "in my opinion, Brendan has provided over and above the information required to justify Kunit's inclusion". The difference of opinion about whether kselftest provides any kind of in-kernel framework appears to be the crux of the standoff. Gunthorpe believes that the in-kernel kselftest code should probably be changed to use KUnit, once it gets merged, which he was strongly in favor of.
As the discussion was trailing off, Higgins posted v3 of the patch set on May 13, followed a day later by v4. Both addressed the technical comments on the v2 code and added the documentation about running on architectures other than UML. There have been relatively few comments and no major complaints about those postings. One might guess that KUnit is on its way into the mainline, probably for 5.3.
Posted Jun 5, 2019 20:41 UTC (Wed) by logang (subscriber, #127618)
I am in favour of using UML; however, when I tried to use KUnit I ran into a bunch of problems just getting my tests to compile, seeing as the tree I wrote a test for wouldn't compile without PCI being selected, and that could not be done in UML. I managed to work around it, but I suspect there are going to be a lot of these problems in the future [1].
I think the consensus at the time was roughly that we'd need to add more mocking to UML to allow these subsystems to use it, not to stop using UML entirely.
Furthermore, my position regarding kselftests changed during the course of the discussion because it wasn't clear what kselftests actually provides or where the in-kernel tests were (they are in lib/test*). There's very little documentation for kselftests and they seem to cover a bunch of different cases. In contrast, documentation is one of the things KUnit has done very well.
Logan
[1] https://lore.kernel.org/lkml/6d9b3b21-1179-3a45-7545-30aa...
Posted Jun 6, 2019 7:56 UTC (Thu) by diconico07 (guest, #117416)
kselftest is meant to detect any API break and make sure not to break userspace; KUnit, on the other hand, is meant for testing specific parts of the kernel, possibly parts that are not exposed to userspace, or at least not directly.

Roughly speaking, for me it is the difference between unit tests and functional tests, and in most userspace-centric projects these two use different frameworks, as they don't have the same needs; the only common thing is usually the output format.

And here again, in a project as big as the kernel the limit can be blurry, as you might want to functionally test an entire subsystem that is not directly exposed to userspace. And for this point there might be need for a third framework to keep things clear and avoid getting a bloated framework or unreadable/unmaintainable tests. Something like:

- kselftest for userspace interface functional testing
- KUnit for kernel features unit testing
- ???? for in-kernel features functional testing

With the three sharing the same output format, and the functional tests sharing the same way of writing scenarios, that seems like the most sane way to go. With a well-defined structure you can make unit tests mandatory for every patch set and functional tests mandatory for inclusion in the "main" tree. A set of tests like this is needed to build more trust in stable kernels.

Posted Jun 8, 2019 5:31 UTC (Sat) by marcH (subscriber, #57642)

I rarely ever saw such a clear limit - in any project. Even with the best and clearest definitions there are always grey areas and overlaps somewhere in the middle. Not an exact science.

Posted Jun 9, 2019 19:19 UTC (Sun) by k3ninho (subscriber, #50375)

> And for this point there might be need for a third framework to keep things clear and avoid getting a bloated framework or unreadable/unmaintainable tests.

Conventionally, unit tests *are* functional tests. Harnessing program logic in its own scope is unit testing; the tests themselves measure* the functionality. You're also mistaking the API conformity suite for being unit tests: they're integration tests simply because they ask "will these components play nicely together?"

Convention holds that the phrases you want to use are 'separation of concerns' for having tests appropriate to the layer of production functionality you want to measure*, and 'single responsibility principle' for having production and test code do only one thing (hopefully well) -- and that single responsibility for the test code is to measure* the outcome of a single change in one layer of the system.

*: I've starred 'measure' each time I used it because I talk about testing in terms of taking measurements aimed to accept or reject a falsifiable hypothesis about the system. We talk about preparing the system, then making a single change, and measuring the impact. We also talk about the layers of these tests. The 'testing pyramid' I prefer has a base of super-quick and super-numerous tests, whose output you trust when assessing whether the components will integrate properly, as their interfaces work to explicit interface contracts; then external interfaces (user and programmatic), which become more expensive because they require more setup and more levels of the stack to be representative of real-world use (balanced by the harness being lightweight, because you're building on the trust of the lower levels of your testing); and finally, the smoke tests of "did we deploy it right?"

K3n.