Snowpatch: continuous-integration testing for the kernel
There are a number of advantages to CI, Currey said. It provides immediate feedback to developers; with luck, they can fix their problems before other people have to spend any time reporting them. It can save a lot of time for reviewers. As a result, the whole code submission process speeds up, and the project is able to move more quickly as a whole.
The core idea behind a kernel CI implementation is not complicated: one just needs to merge patches from the mailing lists, then run a set of tests on the result. These tests can be as simple as checkpatch.pl, but can also include building and booting, running the kernel's self-testing code, and more. Once the tests are done, the results can be reported back to the developer.
Doing this in the kernel context proves to be harder than in projects that are hosted on sites like GitHub, though. A pull request contains all of the information needed to merge a group of changes; an email containing, say, patch 7/10 lacks that context. It is nearly impossible to tell from an email message whether a patch series has been merged, rejected, or superseded. In general, mailing lists simply do not carry the same level of metadata as contemporary project-hosting sites, and that makes the CI problem harder.
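To make the missing-context problem concrete, here is a minimal sketch that extracts what little series information a subject line such as "[PATCH 7/10]" carries. The regular expression, the `PatchSubject` type, and both helper functions are illustrative assumptions rather than code from Snowpatch or Patchwork, and real subject prefixes vary more than this pattern allows.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Matches prefixes such as "[PATCH 7/10]" or "[PATCH v2 07/10]".
# Hypothetical pattern: real-world prefixes (RFC, subsystem tags) vary more.
SUBJECT_RE = re.compile(
    r"\[PATCH(?:\s+v(?P<version>\d+))?"
    r"(?:\s+(?P<index>\d+)/(?P<total>\d+))?\]\s*(?P<title>.*)"
)


class PatchSubject(NamedTuple):
    version: int
    index: Optional[int]   # position within the series, if stated
    total: Optional[int]   # length of the series, if stated
    title: str


def parse_subject(subject: str) -> Optional[PatchSubject]:
    """Extract series metadata from one subject line, or None if absent."""
    m = SUBJECT_RE.search(subject)
    if m is None:
        return None
    version = int(m.group("version") or 1)
    index = int(m.group("index")) if m.group("index") else None
    total = int(m.group("total")) if m.group("total") else None
    return PatchSubject(version, index, total, m.group("title").strip())


def missing_patches(subjects: Iterable[str]) -> List[int]:
    """Return the 1-based series positions that have not been seen yet."""
    parsed = [p for p in map(parse_subject, subjects) if p and p.total]
    if not parsed:
        return []
    seen = {p.index for p in parsed}
    return [i for i in range(1, parsed[0].total + 1) if i not in seen]
```

A single message tells the tester only its own position; everything else (the other nine patches, the base tree, whether a newer version exists) has to be reconstructed from other messages, which is the bookkeeping a tracker like Patchwork takes on.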
Even so, there are groups doing CI testing on the kernel now. The "big boy" of kernel CI is the 0day robot, which picks up patches from the mailing lists and runs a number of tests. It does some static-analysis testing on the x86 architecture, build testing with over 100 kernel configurations, and runs a set of tests looking for performance regressions. When tests fail, email is sent to the developer. 0day is useful, but it is proprietary to Intel, so nobody else has the ability to change it to do what they want. In the absence of failures, there is also no way for developers to tell whether the tests have been run on a given patch posting or not.
Providing better CI for the kernel requires obtaining better metadata for patches, but any proposal that requires kernel developers to change their workflow is clearly not going to get far, he said. The solution is to use Patchwork, which is already in use by a number of kernel subsystems and is designed to supplement mailing lists rather than replacing them. Patchwork is able to track the state of patches, keep a patch series together, and host test results. And, perhaps best of all for those who would like to extend its functionality, it has a JSON API that can be used to build scripts around it.
Patchwork fills the bill nicely because it is already in use and accepted by many developers; adopting it will not require any workflow changes. Patchwork can host test results without having to run the tests itself; they can come from anywhere. There is also value in having the results posted on a web site; developers can learn when tests have been run (and their outcome) without the need to send out email for every patch set.
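As an illustration of why hosted results help, the sketch below folds several independently posted results for one patch into a single status. The state names and the `combined_state` helper are assumptions made for this example, not something described in the talk. The point is the explicit "pending" answer when nothing has reported yet, which is exactly the information an email-on-failure system cannot convey.

```python
from typing import Iterable

# Severity order, least to most severe. These state names are assumed
# for illustration; an unknown state raises KeyError.
SEVERITY = {"success": 0, "warning": 1, "fail": 2}


def combined_state(states: Iterable[str]) -> str:
    """Summarize all results posted for one patch as a single state."""
    states = list(states)
    if not states:
        # No test system has reported yet: distinguishable from "passed".
        return "pending"
    # The most severe individual result determines the summary.
    return max(states, key=SEVERITY.__getitem__)
```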
Snowpatch, thus, is built on top of Patchwork. It is written in Rust in an attempt, Currey said, to be cool. The effort began at linux.conf.au 2016 in Geelong, and is maintained in collaboration with Andrew Donnellan. The code is GPL-licensed. There is an instance running now for the linuxppc-dev mailing list.
At its core, Snowpatch grabs a patch from Patchwork, applies it to one or more repository branches, then sends the result to a remote system for testing. When the results come back, they are added to the Patchwork entry. Actually running the tests requires Jenkins for now — a limitation that Currey apologized for. But, he said, Jenkins does everything that the project needs it to do.
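The flow described above can be sketched in a few lines. This is not Snowpatch's code (which is written in Rust) but a hypothetical Python outline; `BranchResult`, `process_patch`, and the three injected callables standing in for Git, the remote test system, and Patchwork are all names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class BranchResult:
    branch: str
    state: str         # "success" or "fail" (assumed state names)
    description: str


def process_patch(
    patch: str,
    branches: Iterable[str],
    apply_patch: Callable[[str, str], bool],   # stand-in for applying with Git
    run_tests: Callable[[str], bool],          # stand-in for the remote test job
    publish: Callable[[BranchResult], None],   # stand-in for posting to Patchwork
) -> List[BranchResult]:
    """Apply one patch to each branch, test it, and publish every outcome."""
    results = []
    for branch in branches:
        if not apply_patch(patch, branch):
            result = BranchResult(branch, "fail", "patch does not apply")
        elif run_tests(branch):
            result = BranchResult(branch, "success", "all tests passed")
        else:
            result = BranchResult(branch, "fail", "tests failed")
        # Publish successes too, so developers can see that tests ran.
        publish(result)
        results.append(result)
    return results
```

Passing the three steps in as callables keeps the outline independent of any particular test runner, which is one way the Jenkins dependency mentioned above could be isolated; whether Snowpatch is structured this way is not stated in the talk.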
Should anybody else want to set up a Snowpatch instance, he said, there are a few basic requirements. First of all, it needs a local repository to which patches can be applied. Access to a Patchwork instance is needed to be able to publish the results. A Jenkins server is needed to run the tests, and there needs to be a remote Git repository that is visible to the Jenkins system. Currey ended his talk with an expression of hope that more kernel subsystems will set up Snowpatch and start making use of it to improve their CI testing.
A member of the audience asked about the risk of malicious patches taking over the test machines. Currey answered that "something" needs to be in place to deal with that problem, but it hasn't been addressed yet. That something might involve having a maintainer approve test runs. That said, bad patches haven't been a problem so far. The final question had to do with dependencies between patches; Snowpatch has no real solution for that problem at this time.
A video of this talk is available on YouTube.
[Thanks to linux.conf.au and the Linux Foundation for supporting my travel
to the event.]
| Index entries for this article | |
|---|---|
| Kernel | Development tools/Testing |
| Kernel | Patchwork |
| Conference | linux.conf.au/2019 |
Posted Jan 26, 2019 7:57 UTC (Sat)
by ruscur (guest, #104891)
[Link] (9 responses)
Posted Jan 26, 2019 10:23 UTC (Sat)
by jani (subscriber, #74547)
[Link] (6 responses)
There was an LWN article on this in 2017 https://lwn.net/Articles/735468/. We've improved and expanded since.
Posted Jan 27, 2019 5:50 UTC (Sun)
by ajdlinux (subscriber, #82125)
[Link] (5 responses)
Is any of your infrastructure tooling publicly available at the moment?
Posted Jan 27, 2019 10:07 UTC (Sun)
by jani (subscriber, #74547)
[Link] (4 responses)
- https://gitlab.freedesktop.org/patchwork-fdo
But you caught me there, I'm not sure if the actual nuts and bolts of building the kernel, deploying to the farm of test machines, and gathering the test suite results to https://intel-gfx-ci.01.org/ are available. Mostly Jenkins last I checked.
There was another presentation at FOSDEM last year https://archive.fosdem.org/2018/schedule/event/intel_ci/
Posted Jan 27, 2019 16:59 UTC (Sun)
by rahulsundaram (subscriber, #21946)
[Link] (1 responses)
Posted Jan 28, 2019 12:24 UTC (Mon)
by mupuf (subscriber, #86890)
[Link]
I disagree that it should be a turnkey solution, as each project has different needs. Instead, I believe we should be aiming towards developing components that work well together: an open source toolbox for CI.
This is what https://gitlab.freedesktop.org/gfx-ci/documentation is about. Here is the lightning talk that kicked off the effort: https://xdc2018.x.org/slides/GFX_Testing_Workshop.pdf
Also, we should be aiming towards having common infrastructure for CIs, allowing CI farms to plug themselves in and provide additional testing.
--------------------------------
One thing that is missing from all of the kernel CI projects that I have seen is a competent tool that keeps track of failures automatically and maps them to bugs. This tool also needs to be able to do a filtered A/B comparison, indicating what changed when going from A -> B (critical for pre-merge testing).
I have been working on such a tool for years, but you are in luck, it finally got open sourced on Friday: https://gitlab.freedesktop.org/gfx-ci/cibuglog . A read-only instance can be seen here ( https://intel-gfx-ci.01.org/cibuglog/ ) and it produces A/B comparisons for patchwork ( https://patchwork.freedesktop.org/series/55750/ ):
CI Bug Log - changes from CI_DRM_5488 -> Patchwork_12046
Summary
**SUCCESS**
No regressions found.
External URL: https://patchwork.freedesktop.org/api/1.0/series/55750/re...
Known issues
Here are the changes found in Patchwork_12046 that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@gem_exec_suspend@basic-s4-devices:
* igt@kms_chamelium@hdmi-hpd-fast:
* igt@kms_chamelium@dp-edid-read:
* igt@kms_pipe_crc_basic@read-crc-pipe-a:
* igt@kms_pipe_crc_basic@read-crc-pipe-b-frame-sequence:
Participating hosts (44 -> 40)
Missing (4): fi-kbl-soraka fi-ilk-m540 fi-byt-squawks fi-bsw-cyan
Build changes
* Linux: CI_DRM_5488 -> Patchwork_12046
CI_DRM_5488: f13eede6ea3e780d900c5220bf09d764a80a3a8f @ git://anongit.freedesktop.org/gfx-ci/linux
== Linux commits ==
6f40b811103e drm/i915/execlists: Suppress redundant preemption
I really believe this system is the reason for the success of our CI, and I will explain these reasons at FOSDEM next week: https://fosdem.org/2019/schedule/event/igt_ci/
Posted Jan 29, 2019 5:28 UTC (Tue)
by ajdlinux (subscriber, #82125)
[Link] (1 responses)
Our primary interest right now is in filling the gap you identify - the nuts and bolts of building kernels, deploying and publishing results. That's what's valuable across multiple different subsystems - we don't have any particular use for other subsystems' actual tests other than as inspiration for implementing our own for the subsystems we're personally interested in.
As we develop our own infrastructure our plan is to open source as much of the pipeline stuff as we can, and try to make our test environments as replicable as possible for the benefit of other kernel developers who want to spin up their own stuff.
Posted Jan 29, 2019 17:10 UTC (Tue)
by mupuf (subscriber, #86890)
[Link]
I agree, but you also would not want to paint yourself in a corner by neglecting to think about the userspace ;)
So far, we are really considering moving all the userspace to a container, for reproducibility and getting rid of dependency issues between the builders and runners.
If you have any questions about our setup, join us on Freenode, #intel-gfx-ci.
Posted Jan 27, 2019 1:03 UTC (Sun)
by ndesaulniers (subscriber, #110768)
[Link] (1 responses)
Posted Jan 27, 2019 5:46 UTC (Sun)
by ajdlinux (subscriber, #82125)
[Link]
Posted Jan 26, 2019 12:48 UTC (Sat)
by unixbhaskar (guest, #44758)
[Link] (6 responses)
Posted Jan 26, 2019 17:25 UTC (Sat)
by olof (subscriber, #11729)
[Link] (5 responses)
Over time they might expand it to do just that, but it's not there today.
Posted Jan 27, 2019 5:58 UTC (Sun)
by ajdlinux (subscriber, #82125)
[Link]
What we're trying to do is tighten the feedback loop for developers so that CI results come within minutes of patch submission, not days or weeks after maintainers have merged patches into their trees.
On the other hand, we're also NOT trying to test as many different combinations of kernel configurations as services like KernelCI or KISSKB. KernelCI concentrates on making sure we detect breakage across many different types of hardware, whereas currently we are quite happy to concentrate on a few high value tests and also static analysis.
Posted Jan 27, 2019 7:47 UTC (Sun)
by marcH (subscriber, #57642)
[Link] (3 responses)
Patchwork applies patches to git trees, so it's not clear why Patchwork and kernelci couldn't be chained.
I think I know what you actually mean; the reason I'm being pedantic is that this regular confusion between the content and its format keeps weakening the companion argument that "email is better for code reviews". The next step in confusion would be not being interested in CI because interfaces are typically web-based... Don't laugh, this is actually not too far from some of the worst LWN comments seen on these topics. Unusually low.
Posted Jan 29, 2019 2:50 UTC (Tue)
by ajdlinux (subscriber, #82125)
[Link] (2 responses)
It doesn't - Patchwork itself has minimal awareness of git, it just maintains a database of patches from the mailing list.
Posted Jan 29, 2019 7:15 UTC (Tue)
by marcH (subscriber, #57642)
[Link] (1 responses)
Proves the terminology is even more confusing than I thought it was.
Posted Jan 29, 2019 7:23 UTC (Tue)
by ajdlinux (subscriber, #82125)
[Link]
Posted Jan 28, 2019 7:32 UTC (Mon)
by mjthayer (guest, #39183)
[Link] (2 responses)
Although it is probably a different direction to what Russell was looking for, I could imagine some number of kernel developers, enough to be interesting, being ready to change their workflow somewhat - like adding additional tags to patches for instance - to take advantage of a CI system which was there and working. Especially if they were asked beforehand what sort of changes they could live with. And I could imagine more following suit if it caught on.
Posted Jan 28, 2019 12:01 UTC (Mon)
by jani (subscriber, #74547)
[Link] (1 responses)
Posted Jan 29, 2019 3:47 UTC (Tue)
by ajdlinux (subscriber, #82125)
[Link]
Posted Jan 28, 2019 11:16 UTC (Mon)
by pm215 (subscriber, #98099)
[Link] (1 responses)
Posted Jan 29, 2019 5:22 UTC (Tue)
by ajdlinux (subscriber, #82125)
[Link]
From what I can tell, Patchew does have a bit of a different design philosophy to Patchwork and snowpatch, and perhaps that's because Patchew is much newer than Patchwork and they were thinking about CI from the very beginning.
In our case, we started with Patchwork primarily because that's what the development workflow in many parts of the kernel is already using.
Posted Jan 29, 2019 12:53 UTC (Tue)
by NAR (subscriber, #1313)
[Link] (2 responses)
Posted Jan 29, 2019 14:10 UTC (Tue)
by MattJD (subscriber, #91390)
[Link]
Posted Jan 30, 2019 2:33 UTC (Wed)
by ajdlinux (subscriber, #82125)
[Link]
The only part of the process that's common to all developers and all kernel subsystems is the submission of patches for review on the mailing list. That's why we chose to have snowpatch run tests at that point - it will catch everything, and the CI results will be visible (in patchwork) right alongside the code review comments.
It would probably be better if everyone did make use of git in a more sensible fashion, and indeed some subsystems like drm are looking at using more features of tools like GitLab. But we don't live in a perfect world...
The obvious disadvantage of the snowpatch approach is that a developer can't take advantage of the infrastructure to run the tests before they're ready to submit their patches for review on the mailing list. We're thinking about what we can do there, though that's a low priority for us at the moment.
- https://gitlab.freedesktop.org/gfx-ci
- https://gitlab.freedesktop.org/drm/igt-gpu-tools
- fi-blb-e6850: PASS -> INCOMPLETE [fdo#107718]
- fi-kbl-7500u: PASS -> FAIL [fdo#108767]
#### Possible fixes ####
- fi-kbl-7500u: WARN -> PASS
- fi-byt-clapper: FAIL [fdo#107362] -> PASS
- fi-byt-clapper: FAIL [fdo#103191] / [fdo#107362] -> PASS +1
[fdo#103191]: https://bugs.freedesktop.org/show_bug.cgi?id=103191
[fdo#107362]: https://bugs.freedesktop.org/show_bug.cgi?id=107362
[fdo#107718]: https://bugs.freedesktop.org/show_bug.cgi?id=107718
[fdo#108767]: https://bugs.freedesktop.org/show_bug.cgi?id=108767
IGT_4790: dcdf4b04e16312f8f52ad389388d834f9d74b8f0 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_12046: 6f40b811103eee129743c6465e987be7a51e7596 @ git://anongit.freedesktop.org/gfx-ci/linux
2ee9b7413598 drm/i915/execlists: Suppress preempting self
0cf0a44086c4 drm/i915: Rename execlists->queue_priority to preempt_priority_hint
Search/replace accordingly.
Aren't developers already using git?
