Snowpatch: continuous-integration testing for the kernel

By Jonathan Corbet
January 26, 2019

Many projects use continuous-integration (CI) testing to improve the quality of the software they produce. By running a set of tests after every commit, CI systems can identify problems quickly, before they find their way into a release and bite unsuspecting users. The Linux kernel project lags many others in its use of CI testing for a number of reasons, including a fundamental mismatch with how kernel developers tend to manage their workflows. At linux.conf.au 2019, Russell Currey described a CI system called Snowpatch that, he hopes, will bridge the gap and bring better testing to the kernel development process.

There are a number of advantages to CI, Currey said. It provides immediate feedback to developers; with luck, they can fix their problems before other people have to spend any time reporting them. It can save a lot of time for reviewers. As a result, the whole code submission process speeds up, and the project is able to move more quickly as a whole.

The core idea behind a kernel CI implementation is not complicated: one just needs to merge patches from the mailing lists, then run a set of tests on the result. These tests can be as simple as checkpatch.pl, but can also include building and booting, running the kernel's self-testing code, and more. Once the tests are done, the results can be reported back to the developer.
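
As a rough illustration of the simplest test mentioned above, the following Python sketch wraps checkpatch.pl. The helper names (`checkpatch_command`, `patch_is_clean`) are hypothetical, and the sketch assumes that the script lives at scripts/checkpatch.pl in the kernel tree and exits with a non-zero status when it finds problems; it is not taken from any existing CI system.

```python
import subprocess
from pathlib import Path
from typing import List


def checkpatch_command(kernel_tree: Path, patch_file: Path) -> List[str]:
    """Build the command line for checking one patch file."""
    script = kernel_tree / "scripts" / "checkpatch.pl"
    # --terse asks for one line of output per reported problem.
    return [str(script), "--terse", str(patch_file)]


def patch_is_clean(kernel_tree: Path, patch_file: Path) -> bool:
    """Run checkpatch.pl and treat a zero exit status as a pass."""
    completed = subprocess.run(
        checkpatch_command(kernel_tree, patch_file),
        cwd=kernel_tree,
        capture_output=True,
        text=True,
        check=False,  # a failing check is a result, not an exception
    )
    return completed.returncode == 0
```

Build and boot tests would follow the same pattern: run a command against the patched tree and turn its outcome into a pass or fail that can be reported back.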

Doing this in the kernel context proves to be harder than in projects that are hosted on sites like GitHub, though. A pull request contains all of the information needed to merge a group of changes; an email containing, say, patch 7/10 lacks that context. It is nearly impossible to tell from an email message whether a patch series has been merged, rejected, or superseded. In general, mailing lists simply do not carry the same level of metadata as contemporary project-hosting sites, and that makes the CI problem harder.
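
To make the metadata gap concrete, here is a small Python sketch of what can be recovered from a patch email's subject line alone. The `parse_subject` function and `PatchSubject` type are hypothetical names, and the sample subjects are invented for illustration.

```python
import re
from typing import NamedTuple, Optional

# Matches the bracketed prefix that git format-patch puts on a subject,
# e.g. "[PATCH v2 7/10]" or "[RFC PATCH 1/3]".
PREFIX_RE = re.compile(r"^\[(?P<tags>[^\]]*PATCH[^\]]*)\]\s*(?P<title>.*)$")
COUNT_RE = re.compile(r"^(\d+)/(\d+)$")
VERSION_RE = re.compile(r"^v(\d+)$", re.IGNORECASE)


class PatchSubject(NamedTuple):
    title: str
    version: int           # 1 when no "vN" tag is present
    index: Optional[int]   # position in the series, None for a lone patch
    total: Optional[int]   # series length, None for a lone patch


def parse_subject(subject: str) -> Optional[PatchSubject]:
    """Return the metadata carried by a patch subject, or None if not a patch."""
    match = PREFIX_RE.match(subject.strip())
    if match is None:
        return None
    version, index, total = 1, None, None
    for tag in match.group("tags").split():
        if (m := COUNT_RE.match(tag)):
            index, total = int(m.group(1)), int(m.group(2))
        elif (m := VERSION_RE.match(tag)):
            version = int(m.group(1))
    return PatchSubject(match.group("title"), version, index, total)


print(parse_subject("[PATCH v2 7/10] example: fix a thing"))
```

The subject yields a version and a position in the series, but nothing about which tree the patch applies to, where the other nine patches are, or whether the series was later merged or superseded.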

Even so, there are groups doing CI testing on the kernel now. The "big boy" of kernel CI is the 0day robot, which picks up patches from the mailing lists and runs a number of tests. It does some static-analysis testing on the x86 architecture, build testing with over 100 kernel configurations, and runs a set of tests looking for performance regressions. When tests fail, email is sent to the developer. 0day is useful, but it is proprietary to Intel, so nobody else has the ability to change it to do what they want. In the absence of failures, there is also no way for developers to tell whether the tests have been run on a given patch posting or not.

Providing better CI for the kernel requires obtaining better metadata for patches, but any proposal that requires kernel developers to change their workflow is clearly not going to get far, he said. The solution is to use Patchwork, which is already in use by a number of kernel subsystems and is designed to supplement mailing lists rather than replacing them. Patchwork is able to track the state of patches, keep a patch series together, and host test results. And, perhaps best of all for those who would like to extend its functionality, it has a JSON API that can be used to build scripts around it.

Patchwork fills the bill nicely because it is already in use and accepted by many developers; adopting it will not require any workflow changes. Patchwork can host test results without having to run the tests itself; they can come from anywhere. There is also value in having the results posted on a web site; developers can learn when tests have been run (and their outcome) without the need to send out email for every patch set.
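
As a hedged sketch of how an external tester might attach a result to a patch, the Python below builds (but does not send) an HTTP request. The endpoint path, the field names, the set of states, and the token-based authorization header reflect one reading of upstream Patchwork's REST API and should be checked against the documentation of the instance in use; `build_check_request` and the example host are hypothetical.

```python
import json
import urllib.request

# Assumed set of check states; verify against the Patchwork instance in use.
VALID_STATES = {"pending", "success", "warning", "fail"}


def build_check_request(base_url, patch_id, token, state, context,
                        target_url=None, description=None):
    """Build (but do not send) a request that attaches a test result to a patch."""
    if state not in VALID_STATES:
        raise ValueError(f"unknown check state: {state!r}")
    payload = {"state": state, "context": context}
    if target_url:
        payload["target_url"] = target_url    # link to the full test log
    if description:
        payload["description"] = description  # one-line summary
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/patches/{patch_id}/checks/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Token {token}",
        },
        method="POST",
    )


request = build_check_request(
    "https://patchwork.example.org", 42, "not-a-real-token",
    state="success", context="build-test",
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen()` would submit the result; keeping construction separate from sending makes the payload easy to inspect without touching a live server.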

Snowpatch, thus, is built on top of Patchwork. It is written in Rust in what Currey described as an attempt to be cool. The effort began at linux.conf.au 2016 in Geelong, and is maintained in collaboration with Andrew Donnellan. The code is GPL-licensed. There is an instance running now for the linuxppc-dev mailing list.

At its core, Snowpatch grabs a patch from Patchwork, applies it to one or more repository branches, then sends the result to a remote system for testing. When the results come back, they are added to the Patchwork entry. Actually running the tests requires Jenkins for now — a limitation that Currey apologized for. But, he said, Jenkins does everything that the project needs it to do.
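
The flow described above can be sketched in a few lines of Python. This is not Snowpatch's code (which is written in Rust); the `process_patch` function, the `Result` type, and the four injected callables are hypothetical stand-ins for fetching from Patchwork, applying with git, triggering a remote job, and posting results.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Result:
    patch_id: int
    branch: str
    state: str  # "success" or "fail"; a patch that does not apply is a "fail"


def process_patch(
    patch_id: int,
    branches: Iterable[str],
    fetch_mbox: Callable[[int], str],
    apply_patch: Callable[[str, str], bool],
    run_tests: Callable[[str], bool],
    report: Callable[[Result], None],
) -> List[Result]:
    """Fetch one patch, try it on each branch, and report every outcome."""
    mbox = fetch_mbox(patch_id)
    results = []
    for branch in branches:
        if not apply_patch(branch, mbox):
            # Nothing to test if the patch does not apply to this branch.
            result = Result(patch_id, branch, "fail")
        else:
            passed = run_tests(branch)
            result = Result(patch_id, branch, "success" if passed else "fail")
        report(result)
        results.append(result)
    return results
```

Passing the steps in as callables keeps the loop independent of any particular test runner, so the Jenkins dependency would live entirely inside whatever is supplied as `run_tests`.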

Should anybody else want to set up a Snowpatch instance, he said, there are a few basic requirements. First of all, it needs a local repository to which patches can be applied. Access to a Patchwork instance is needed to be able to publish the results. A Jenkins server is needed to run the tests, and there needs to be a remote Git repository that is visible to the Jenkins system. Currey ended his talk with an expression of hope that more kernel subsystems will set up Snowpatch and start making use of it to improve their CI testing.

A member of the audience asked about the risk of malicious patches taking over the test machines. Currey answered that "something" needs to be in place to deal with that problem, but it hasn't been addressed yet. That something might involve having a maintainer approve test runs. That said, bad patches haven't been a problem so far. The final question had to do with dependencies between patches; Snowpatch has no real solution for that problem at this time.

A video of this talk is available on YouTube.

[Thanks to linux.conf.au and the Linux Foundation for supporting my travel to the event.]

Snowpatch: continuous-integration testing for the kernel

Posted Jan 26, 2019 7:57 UTC (Sat) by ruscur (guest, #104891) [Link] (9 responses)

Thanks for the article! Happy to answer any questions.

Snowpatch: continuous-integration testing for the kernel

Posted Jan 26, 2019 10:23 UTC (Sat) by jani (subscriber, #74547) [Link] (6 responses)

Just FYI, we're doing fairly serious kernel CI on Intel Graphics i.e. the drm/i915 driver: https://intel-gfx-ci.01.org

There was an LWN article on this in 2017 https://lwn.net/Articles/735468/. We've improved and expanded since.

Snowpatch: continuous-integration testing for the kernel

Posted Jan 27, 2019 5:50 UTC (Sun) by ajdlinux (subscriber, #82125) [Link] (5 responses)

That looks super cool.

Is any of your infrastructure tooling publicly available at the moment?

Snowpatch: continuous-integration testing for the kernel

Posted Jan 27, 2019 10:07 UTC (Sun) by jani (subscriber, #74547) [Link] (4 responses)

Large parts of it are free software:

- https://gitlab.freedesktop.org/patchwork-fdo
- https://gitlab.freedesktop.org/gfx-ci
- https://gitlab.freedesktop.org/drm/igt-gpu-tools

But you caught me there, I'm not sure if the actual nuts and bolts of building the kernel, deploying to the farm of test machines, and gathering the test suite results to https://intel-gfx-ci.01.org/ are available. Mostly Jenkins last I checked.

There was another presentation at FOSDEM last year https://archive.fosdem.org/2018/schedule/event/intel_ci/

Snowpatch: continuous-integration testing for the kernel

Posted Jan 27, 2019 16:59 UTC (Sun) by rahulsundaram (subscriber, #21946) [Link] (1 responses)

It would really help if all of this could be more of a turnkey type solution. Tall ask I know but it will likely lead to most of the free software projects adopting this type of testing.

Snowpatch: continuous-integration testing for the kernel

Posted Jan 28, 2019 12:24 UTC (Mon) by mupuf (subscriber, #86890) [Link]

> It would really help if all of this could be more of a turnkey type solution.

I disagree that it should be a turnkey solution, as each project has different needs. Instead, I believe we should be aiming towards developing components that work well together: an open source toolbox for CI.

This is what https://gitlab.freedesktop.org/gfx-ci/documentation is about. Here is the lightning talk that kicked off the effort: https://xdc2018.x.org/slides/GFX_Testing_Workshop.pdf

Also, we should be aiming towards having common infrastructure for CIs, allowing CI farms to plug themselves in and provide additional testing.

--------------------------------

One thing that is missing from all of the kernel CI projects that I have seen is a competent tool keeping track of failures automatically and mapping them to bugs. This tool also needs to be able to do a filtered A/B comparison, indicating what the changes were when going from A -> B (critical for pre-merge testing).

I have been working on such a tool for years, but you are in luck, it finally got open sourced on Friday: https://gitlab.freedesktop.org/gfx-ci/cibuglog . A read-only instance can be seen here ( https://intel-gfx-ci.01.org/cibuglog/ ) and it produces A/B comparisons for patchwork ( https://patchwork.freedesktop.org/series/55750/ ):

CI Bug Log - changes from CI_DRM_5488 -> Patchwork_12046
====================================================

Summary
-------

**SUCCESS**

No regressions found.

External URL: https://patchwork.freedesktop.org/api/1.0/series/55750/re...

Known issues
------------

Here are the changes found in Patchwork_12046 that come from known issues:

### IGT changes ###

#### Issues hit ####

* igt@gem_exec_suspend@basic-s4-devices:
- fi-blb-e6850: PASS -> INCOMPLETE [fdo#107718]

* igt@kms_chamelium@hdmi-hpd-fast:
- fi-kbl-7500u: PASS -> FAIL [fdo#108767]

#### Possible fixes ####

* igt@kms_chamelium@dp-edid-read:
- fi-kbl-7500u: WARN -> PASS

* igt@kms_pipe_crc_basic@read-crc-pipe-a:
- fi-byt-clapper: FAIL [fdo#107362] -> PASS

* igt@kms_pipe_crc_basic@read-crc-pipe-b-frame-sequence:
- fi-byt-clapper: FAIL [fdo#103191] / [fdo#107362] -> PASS +1

[fdo#103191]: https://bugs.freedesktop.org/show_bug.cgi?id=103191
[fdo#107362]: https://bugs.freedesktop.org/show_bug.cgi?id=107362
[fdo#107718]: https://bugs.freedesktop.org/show_bug.cgi?id=107718
[fdo#108767]: https://bugs.freedesktop.org/show_bug.cgi?id=108767

Participating hosts (44 -> 40)
------------------------------

Missing (4): fi-kbl-soraka fi-ilk-m540 fi-byt-squawks fi-bsw-cyan

Build changes
-------------

* Linux: CI_DRM_5488 -> Patchwork_12046

CI_DRM_5488: f13eede6ea3e780d900c5220bf09d764a80a3a8f @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_4790: dcdf4b04e16312f8f52ad389388d834f9d74b8f0 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_12046: 6f40b811103eee129743c6465e987be7a51e7596 @ git://anongit.freedesktop.org/gfx-ci/linux

== Linux commits ==

6f40b811103e drm/i915/execlists: Suppress redundant preemption
2ee9b7413598 drm/i915/execlists: Suppress preempting self
0cf0a44086c4 drm/i915: Rename execlists->queue_priority to preempt_priority_hint
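
For illustration only, the kind of filtered A/B comparison shown in the report above could be sketched in Python as follows. This is not how cibuglog is implemented; the `compare_runs` function, the data layout, and the rule that any change to a non-passing status not matched by a known issue counts as a regression are all assumptions.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (test name, machine)


def compare_runs(
    before: Dict[Key, str],
    after: Dict[Key, str],
    known_issues: Dict[Key, str],
) -> Dict[str, List[str]]:
    """Classify status changes between run A (before) and run B (after)."""
    report: Dict[str, List[str]] = {
        "regressions": [], "known_issues": [], "possible_fixes": [],
    }
    # Only results present in both runs can be compared.
    for key in sorted(before.keys() & after.keys()):
        old, new = before[key], after[key]
        if old == new:
            continue  # unchanged results are filtered out of the report
        line = f"{key[0]} on {key[1]}: {old} -> {new}"
        if new == "PASS":
            report["possible_fixes"].append(line)
        elif key in known_issues:
            report["known_issues"].append(f"{line} [{known_issues[key]}]")
        else:
            report["regressions"].append(line)
    return report
```

Fed the `igt@gem_exec_suspend@basic-s4-devices` result on `fi-blb-e6850` with `fdo#107718` registered as a known issue, this sketch would file the PASS -> INCOMPLETE change under known issues rather than regressions, which is the filtering that keeps pre-merge reports readable.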

I really believe this system is the reason for the success of our CI, and I will explain these reasons at FOSDEM next week: https://fosdem.org/2019/schedule/event/igt_ci/

Snowpatch: continuous-integration testing for the kernel

Posted Jan 29, 2019 5:28 UTC (Tue) by ajdlinux (subscriber, #82125) [Link] (1 responses)

Thanks for the links, there's some interesting stuff in there that we should take a look at. We're familiar with FDO Patchwork, indeed initially that's what snowpatch targeted, because upstream Patchwork didn't yet have a REST API. The divergence between mainline and FDO patchwork is a bit of a sad story really.

Our primary interest right now is in filling the gap you identify - the nuts and bolts of building kernels, deploying and publishing results. That's what's valuable across multiple different subsystems - we don't have any particular use for other subsystems' actual tests other than as inspiration for implementing our own for the subsystems we're personally interested in.

As we develop our own infrastructure our plan is to open source as much of the pipeline stuff as we can, and try to make our test environments as replicable as possible for the benefit of other kernel developers who want to spin up their own stuff.

Snowpatch: continuous-integration testing for the kernel

Posted Jan 29, 2019 17:10 UTC (Tue) by mupuf (subscriber, #86890) [Link]

> we don't have any particular use for other subsystems' actual tests other than as inspiration for implementing our own for the subsystems we're personally interested in.

I agree, but you also would not want to paint yourself into a corner by neglecting to think about the userspace ;)

So far, we are really considering moving all the userspace to a container, for reproducibility and getting rid of dependency issues between the builders and runners.

If you have any questions about our setup, join us on Freenode, #intel-gfx-ci.

Snowpatch: continuous-integration testing for the kernel

Posted Jan 27, 2019 1:03 UTC (Sun) by ndesaulniers (subscriber, #110768) [Link] (1 responses)

Neat stuff, I hope you got a chance to talk with Joel Stanley (also of IBM) at Linux Conf AU '19; we're looking to do additional CI of kernel builds with Clang.

Snowpatch: continuous-integration testing for the kernel

Posted Jan 27, 2019 5:46 UTC (Sun) by ajdlinux (subscriber, #82125) [Link]

We absolutely can throw clang on top of our existing builds we're doing on linuxppc-dev, shouldn't be too much trouble at all.

Snowpatch: continuous-integration testing for the kernel

Posted Jan 26, 2019 12:48 UTC (Sat) by unixbhaskar (guest, #44758) [Link] (6 responses)

I am not sure what's wrong with https://kernelci.org/ ... or is it not relevant?

Snowpatch: continuous-integration testing for the kernel

Posted Jan 26, 2019 17:25 UTC (Sat) by olof (subscriber, #11729) [Link] (5 responses)

kernelci is mostly focused on building and booting contents in existing git trees, not patches posted on the lists and in patchwork.

Over time they might expand it to do just that, but it's not there today.

Snowpatch: continuous-integration testing for the kernel

Posted Jan 27, 2019 5:58 UTC (Sun) by ajdlinux (subscriber, #82125) [Link]

Exactly this.

What we're trying to do is tighten the feedback loop for developers so that CI results come within minutes of patch submission, not days or weeks after maintainers have merged patches into their trees.

On the other hand, we're also NOT trying to test as many different combinations of kernel configurations as services like KernelCI or KISSKB. KernelCI concentrates on making sure we detect breakage across many different types of hardware, whereas currently we are quite happy to concentrate on a few high value tests and also static analysis.

Snowpatch: continuous-integration testing for the kernel

Posted Jan 27, 2019 7:47 UTC (Sun) by marcH (subscriber, #57642) [Link] (3 responses)

> kernelci is mostly focused on building and booting contents in existing git trees, not patches posted on the lists and in patchwork.

Patchwork applies patches to git trees so not clear why patchwork and kernelci couldn't be chained.

I think I know what you actually mean; the reason I'm being pedantic is that this regular confusion between the content and its format keeps weakening the companion argument that "email is better for code reviews". The next step in confusion would be not being interested in CI because interfaces are typically web-based... Don't laugh, this is actually not too far from some of the worst LWN comments seen on these topics. Unusually low.

Snowpatch: continuous-integration testing for the kernel

Posted Jan 29, 2019 2:50 UTC (Tue) by ajdlinux (subscriber, #82125) [Link] (2 responses)

> Patchwork applies patches to git trees so not clear why patchwork and kernelci couldn't be chained.

It doesn't - Patchwork itself has minimal awareness of git, it just maintains a database of patches from the mailing list.

Snowpatch: continuous-integration testing for the kernel

Posted Jan 29, 2019 7:15 UTC (Tue) by marcH (subscriber, #57642) [Link] (1 responses)

My bad, looks like I confused the "original" patchwork with their various CI extensions like this one: http://damien.lespiau.name/2016/02/augmenting-mailing-lis...
Search/replace accordingly.

Proves the terminology is even more confusing than I thought it was.

Snowpatch: continuous-integration testing for the kernel

Posted Jan 29, 2019 7:23 UTC (Tue) by ajdlinux (subscriber, #82125) [Link]

We're doing essentially the same thing with upstream Patchwork, using the API to fetch patches, then applying them to a git tree and triggering builds.

Snowpatch: continuous-integration testing for the kernel

Posted Jan 28, 2019 7:32 UTC (Mon) by mjthayer (guest, #39183) [Link] (2 responses)

"...any proposal that requires kernel developers to change their workflow is clearly not going to get far, he said"

Although it is probably a different direction to what Russell was looking for, I could imagine some number of kernel developers, enough to be interesting, being ready to change their workflow somewhat - like adding additional tags to patches for instance - to take advantage of a CI system which was there and working. Especially if they were asked beforehand what sort of changes they could live with. And I could imagine more following suit if it caught on.

Snowpatch: continuous-integration testing for the kernel

Posted Jan 28, 2019 12:01 UTC (Mon) by jani (subscriber, #74547) [Link] (1 responses)

Email is a lossy format for distributing changes. That said, I'm not prepared to give up on email based patch review just yet. I'd like to see a system where you contribute by git push/pull flow, and the other end emails the patches out for review. This would get rid of a class of lack of meta information issues with patchwork style systems parsing emails. Where and to whom should patch emails be sent? Is this email really a patch? Is this series of emails a patch series? Against which git tree? What's the baseline? What are the changes to the earlier version? Has it been merged? Does this reply contain a patch review or an updated version of the patch? Etc.

Snowpatch: continuous-integration testing for the kernel

Posted Jan 29, 2019 3:47 UTC (Tue) by ajdlinux (subscriber, #82125) [Link]

This would be a very interesting approach to try - perhaps something that could be experimented with by bolting something on to the Patchwork API...

Snowpatch: continuous-integration testing for the kernel

Posted Jan 28, 2019 11:16 UTC (Mon) by pm215 (subscriber, #98099) [Link] (1 responses)

It would be interesting to see a feature comparison with Patchew (https://patchew.org/) which also grabs patches from a mailing list, merges them, runs tests and reports the results on a website (and as email followups to the patches).

Snowpatch: continuous-integration testing for the kernel

Posted Jan 29, 2019 5:22 UTC (Tue) by ajdlinux (subscriber, #82125) [Link]

I haven't looked too deeply into Patchew, but on the surface it does appear to do a lot of the things we would want.

From what I can tell, Patchew does have a bit of a different design philosophy to Patchwork and snowpatch, and perhaps that's because Patchew is much newer than Patchwork and they were thinking about CI from the very beginning.

In our case, we started with Patchwork primarily because that's what the development workflow in many parts of the kernel is already using.

Aren't developers already using git?

Posted Jan 29, 2019 12:53 UTC (Tue) by NAR (subscriber, #1313) [Link] (2 responses)

I might be missing something, but I don't quite get why this system has to get patches out of e-mails. I presume most (or all?) kernel developers already work on a git tree (they need to base their code on something) and their patches are on a branch. So why not tell the CI system to pull this branch and run the tests there? Might not work with repos only living on the developer's laptop, but I'd guess at least for the sake of redundancy these repos are public (or at least accessible from a remote CI system).

Aren't developers already using git?

Posted Jan 29, 2019 14:10 UTC (Tue) by MattJD (subscriber, #91390) [Link]

Speaking as someone who only has some ~4 patches added to the tree, my kernel source is not currently publicly available (mostly because I don't do enough work on the kernel to bother). For large developers (and especially maintainers), I'd bet it's true they have their trees shared somewhere. But for people doing smaller contributions, it's less likely if we just have to send an email. And realistically I'd feel a lot better if my patches had an automatic CI run over them, as I'd be more likely to have a small issue easily identified by a CI.

Aren't developers already using git?

Posted Jan 30, 2019 2:33 UTC (Wed) by ajdlinux (subscriber, #82125) [Link]

Pretty much all kernel developers use git (though I think that may still not be 100%!), but developers very much have their own individual ways of working which often (mostly?) do *not* involve pushing all their branches to a publicly accessible location. (Speaking for myself, I push any major work I'm doing to either my GitHub account or to one of my company's internal Git services (which *aren't* public) but for minor patches I often just go straight from my laptop's working tree to sending an email patch.)

The only part of the process that's common to all developers and all kernel subsystems is the submission of patches for review on the mailing list. That's why we chose to have snowpatch run tests at that point - it will catch everything, and the CI results will be visible (in patchwork) right alongside the code review comments.

It would probably be better if everyone did make use of git in a more sensible fashion, and indeed some subsystems like drm are looking at using more features of tools like GitLab. But we don't live in a perfect world...

The obvious disadvantage of the snowpatch approach is that a developer can't take advantage of the infrastructure to run the tests before they're ready to submit their patches for review on the mailing list. We're thinking about what we can do there, though that's a low priority for us at the moment.