
Some 5.19 development statistics

By Jonathan Corbet
August 1, 2022
The 5.19 kernel was released, after a one-week delay to deal with the fallout from the Retbleed mitigations, on July 31. By that time, 16,399 commits (15,134 non-merge and 1,265 merges) had found their way into the mainline repository, making this development cycle the busiest since 5.13 (16,030 non-merge changesets and 1,157 merges). Tradition dictates that now is the time for a look at where the changes in 5.19 came from, and we've learned not to go against tradition.

Individual contributors

Work on 5.19 was contributed by 2,086 developers; that is a new record, beating the 2,062 who contributed to 5.13. Of those developers, 278 made their first kernel contribution during this development cycle. The removal of a number of old drivers and an unloved architecture took 301,000 lines of code out of the kernel repository, but that effort was overwhelmed by the 1,105,000 lines of code that were added, for a net growth of 804,000 lines of code.

The top contributors to 5.19 were:

Most active 5.19 developers

By changesets
Developer                 Commits Percent
Krzysztof Kozlowski            211    1.4%
Christoph Hellwig              193    1.3%
Ville Syrjälä                  175    1.2%
Matthew Wilcox                 151    1.0%
Jakub Kicinski                 130    0.9%
Geert Uytterhoeven             123    0.8%
Mark Brown                     118    0.8%
Masahiro Yamada                105    0.7%
Arnd Bergmann                  104    0.7%
Martin Kaiser                  102    0.7%
Kuniyuki Iwashima              101    0.7%
Christophe Leroy                96    0.6%
Minghao Chi                     96    0.6%
Biju Das                        94    0.6%
Andy Shevchenko                 90    0.6%
Marek Vasut                     89    0.6%
Miaohe Lin                      87    0.6%
Dmitry Baryshkov                87    0.6%
Ping-Ke Shih                    81    0.5%
Pavel Begunkov                  79    0.5%
Jason A. Donenfeld              79    0.5%
Jack Xiao                       79    0.5%

By changed lines
Developer                    Lines Percent
Hawking Zhang               222682   18.1%
Huang Rui                   185566   15.1%
Martin Habets                44361    3.6%
Jakub Kicinski               34636    2.8%
Ping-Ke Shih                 29871    2.4%
Huacai Chen                  21159    1.7%
Bjorn Andersson              15738    1.3%
Christoph Hellwig            14024    1.1%
Leo Liu                      11632    0.9%
Haijun Liu                   11006    0.9%
Fabio M. De Francesco         9561    0.8%
Ian Rogers                    8691    0.7%
Imre Deak                     7937    0.6%
Zhengjun Xing                 7508    0.6%
Arnd Bergmann                 7424    0.6%
Leon Romanovsky               6573    0.5%
Mark Brown                    6502    0.5%
Cezary Rojewski               6492    0.5%
Peter Ujfalusi                6463    0.5%
Veerasenareddy Burru          5652    0.5%
Manivannan Sadhasivam         5614    0.5%
Jack Xiao                     5215    0.4%

The top contributor of changesets in 5.19 was Krzysztof Kozlowski, who focused mostly on devicetree fixes. Christoph Hellwig continues to rework code all over the kernel, and found the time to remove the h8300 architecture as well. Ville Syrjälä contributed a large number of changes to the Intel i915 graphics driver, Matthew Wilcox continues the folio work, and Jakub Kicinski worked extensively in the networking subsystem.

In the lines-changed column, as has become traditional, Hawking Zhang and Huang Rui outdid everybody else with the addition of hundreds of thousands of lines of machine-generated amdgpu header files. Martin Habets added the "siena" network driver, Kicinski removed a number of old network drivers while taking a break from his other work, and Ping-Ke Shih added support for Realtek 8852ce network adapters.
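Per-author changeset counts like those in the tables above can be approximated from plain git output. The following is a minimal sketch in Python, not the tooling used for the article (which the text does not describe); the function name and the sample data are made up for illustration, and it ignores the problem of one developer committing under several names or addresses.

```python
from collections import Counter


def count_changesets(author_lines):
    """Count changesets per author.

    `author_lines` is an iterable of author names, one per commit, for
    example the lines printed by:
        git log --no-merges --format=%an v5.18..v5.19
    Returns (author, commits, percent of all commits), busiest first.
    """
    counts = Counter(line.strip() for line in author_lines if line.strip())
    total = sum(counts.values())
    return [
        (name, n, round(100.0 * n / total, 1))
        for name, n in counts.most_common()
    ]


# Tiny made-up sample, not real 5.19 data.
sample = ["Alice", "Bob", "Alice", "Carol", "Alice", "Bob", "Alice", "Alice"]
table = count_changesets(sample)
for name, n, pct in table:
    print(f"{name:<10} {n:>4} {pct:>5}%")
```

Feeding the function real `git log` output should give numbers close to the "By changesets" column, though any identity mapping applied by the article's own scripts would shift individual entries.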

The lists of top testers and reviewers will look familiar to those who have been following these articles:

Test and review credits in 5.19

Tested-by
Developer                     Tags Percent
Daniel Wheeler                  94    8.4%
Bean Huo                        29    2.6%
Nathan Chancellor               29    2.6%
Geert Uytterhoeven              27    2.4%
Heiko Stuebner                  26    2.3%
Nícolas F. R. A. Prado          23    2.1%
Michael Riesch                  21    1.9%
Marek Szyprowski                20    1.8%
Arnaldo Carvalho de Melo        19    1.7%
Gurucharan                      18    1.6%
Sedat Dilek                     18    1.6%
Giuseppe Scrivano               18    1.6%

Reviewed-by
Developer                     Tags Percent
Christoph Hellwig              246    2.9%
Hawking Zhang                  220    2.6%
Rob Herring                    164    2.0%
AngeloGioacchino Del Regno     149    1.8%
Krzysztof Kozlowski            144    1.7%
David Sterba                   123    1.5%
Darrick J. Wong                103    1.2%
Bard Liao                      102    1.2%
Andy Shevchenko                102    1.2%
Stephen Boyd                   101    1.2%
Jani Nikula                     98    1.2%
Ranjani Sridharan               88    1.1%

Many of the test credits continue to accrue to people who are seemingly working as part of their employer's internal quality-assurance process, though there appear to be fewer of those than in previous cycles. On the review side, this was a 70-day development cycle; both Christoph Hellwig and Hawking Zhang thus reviewed at least three patches for each of those days. Hellwig's reviews are widespread, while Zhang's are focused on amdgpu patches by AMD developers. It is good to see that there are developers who are evidently reviewing patches as part of their job.

A look at the report credits — along with who is including the Reported-by: tags in their fixes — also shows the evolution of an ongoing story:

Report credits in 5.19

Reporter
Reporter                   Credits Percent
kernel test robot              207   17.0%
Zeal Robot                     134   11.0%
Abaci Robot                     53    4.4%
Syzbot                          49    4.0%
Dan Carpenter                   44    3.6%
Hulk Robot                      37    3.0%
Stephen Rothwell                27    2.2%
Rob Herring                     19    1.6%
Guenter Roeck                   12    1.0%
Geert Uytterhoeven              11    0.9%
Marek Szyprowski                11    0.9%
Nathan Chancellor                8    0.7%
Sudip Mukherjee                  8    0.7%

Credited by
Developer                  Credits Percent
Minghao Chi                     93    7.6%
Jiapeng Chong                   31    2.5%
Lv Ruyi                         24    2.0%
Yang Li                         22    1.8%
Krzysztof Kozlowski             20    1.6%
Eric Dumazet                    19    1.6%
Yang Yingliang                  16    1.3%
Paul E. McKenney                14    1.1%
Masahiro Yamada                 14    1.1%
Hans de Goede                   14    1.1%
Linus Torvalds                  13    1.1%
Randy Dunlap                    12    1.0%
Mario Limonciello               12    1.0%

We are evidently in the midst of the robot wars and most of us never even noticed; a full 40% of the report credits are going to robots at this point. If one looks at which developers are adding Reported-by tags to their patches, the picture becomes clearer: the top four developers crediting reports work for the companies that run the Zeal and Abaci robots (ZTE and Alibaba, respectively). It is reasonably clear that these developers are developing and running their own robots to find bugs, then crediting those robots with the reports.
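The 40% figure can be checked against the Reporter table. This is a minimal sketch in Python; treating exactly these five entries as the automated reporters is an assumption made here, since the article does not list which entries it counted as robots.

```python
# Report credits copied from the "Reporter" table: name -> (credits, percent).
reporters = {
    "kernel test robot": (207, 17.0),
    "Zeal Robot": (134, 11.0),
    "Abaci Robot": (53, 4.4),
    "Syzbot": (49, 4.0),
    "Dan Carpenter": (44, 3.6),
    "Hulk Robot": (37, 3.0),
}

# Assumption: these are the automated reporters among the entries above.
robots = ["kernel test robot", "Zeal Robot", "Abaci Robot", "Syzbot", "Hulk Robot"]

robot_credits = sum(reporters[name][0] for name in robots)
robot_share = round(sum(reporters[name][1] for name in robots), 1)

print(f"{robot_credits} credits, {robot_share}% of all report credits")
```

Under that assumption the five robots account for 480 credits, or 39.4% of the total, which rounds to the 40% quoted in the text.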

Companies

The employer numbers are relatively steady and boring. A total of 245 employers supported work on 5.19, with the most active being:

Most active 5.19 employers

By changesets
Employer                   Commits Percent
Intel                         1645   10.9%
(Unknown)                     1135    7.5%
Linaro                         862    5.7%
AMD                            837    5.5%
Red Hat                        792    5.2%
(None)                         653    4.3%
Google                         624    4.1%
Meta                           528    3.5%
SUSE                           462    3.1%
Huawei Technologies            446    2.9%
NVIDIA                         421    2.8%
Oracle                         414    2.7%
(Consultant)                   385    2.5%
Renesas Electronics            348    2.3%
Arm                            281    1.9%
MediaTek                       235    1.6%
Qualcomm                       232    1.5%
IBM                            230    1.5%
Pengutronix                    208    1.4%
NXP Semiconductors             195    1.3%

By lines changed
Employer                     Lines Percent
AMD                         465548   37.9%
Intel                        80061    6.5%
Linaro                       59759    4.9%
Meta                         53080    4.3%
Xilinx                       45774    3.7%
(Unknown)                    37529    3.1%
Realtek                      36049    2.9%
Google                       30767    2.5%
NVIDIA                       30524    2.5%
MediaTek                     29215    2.4%
Red Hat                      27048    2.2%
Loongson                     23819    1.9%
(None)                       22890    1.9%
(Consultant)                 22322    1.8%
SUSE                         16983    1.4%
Qualcomm                     14455    1.2%
Oracle                       13815    1.1%
Arm                          12806    1.0%
IBM                          12339    1.0%
Renesas Electronics          10812    0.9%

Perhaps noteworthy here is the slow but steady decline of Red Hat, which was the top employer for many years. The picture looks a little different if one considers non-author signoffs, though:

Non-author signoffs in 5.19

Individual
Developer                 Signoffs Percent
Greg Kroah-Hartman             932    6.5%
David S. Miller                785    5.5%
Alex Deucher                   704    4.9%
Mark Brown                     656    4.6%
Andrew Morton                  525    3.7%
Jakub Kicinski                 422    2.9%
Jens Axboe                     296    2.1%
Mauro Carvalho Chehab          282    2.0%
Bjorn Andersson                273    1.9%
Kalle Valo                     272    1.9%
Borislav Petkov                230    1.6%
Martin K. Petersen             225    1.6%
Michael Ellerman               207    1.4%
Arnaldo Carvalho de Melo       200    1.4%
Shawn Guo                      195    1.4%
David Sterba                   176    1.2%
Rafael J. Wysocki              166    1.2%
Geert Uytterhoeven             152    1.1%
Vinod Koul                     148    1.0%
Catalin Marinas                145    1.0%

By employer
Employer                  Signoffs Percent
Linaro                        1959   13.6%
Red Hat                       1854   12.9%
Intel                         1445   10.1%
Meta                          1056    7.4%
Linux Foundation              1037    7.2%
Google                         930    6.5%
AMD                            786    5.5%
SUSE                           748    5.2%
Qualcomm                       416    2.9%
NVIDIA                         352    2.5%
Arm                            339    2.4%
IBM                            313    2.2%
(Consultant)                   307    2.1%
(None)                         304    2.1%
Oracle                         260    1.8%
Huawei Technologies            202    1.4%
(Unknown)                      160    1.1%
Renesas Electronics            156    1.1%
Cisco                          140    1.0%
Broadcom                       112    0.8%

A developer who signs off on a patch that they did not write is (usually) the maintainer who accepts the patch and sends it upstream. The above tables, thus, offer an approximate picture of who our most active maintainers are. About half of the patches merged into the mainline kernel are going through the hands of maintainers working for just five companies. On one hand, that shows a potentially concerning concentration of power in a relatively small number of employers. On the other, this is the list of companies that are most willing to pay for maintainers to do their jobs — a good thing, given that the kernel project is short of maintainers overall.
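The "about half ... just five companies" observation follows from the percentages in the by-employer signoff table. As a rough check, this Python sketch accumulates the shares from the top of that table until they pass 50%; the helper name is made up for illustration and only the first eight rows are copied in.

```python
# Non-author signoff shares (percent) from the "By employer" table, top rows only.
signoff_pct = [
    ("Linaro", 13.6),
    ("Red Hat", 12.9),
    ("Intel", 10.1),
    ("Meta", 7.4),
    ("Linux Foundation", 7.2),
    ("Google", 6.5),
    ("AMD", 5.5),
    ("SUSE", 5.2),
]


def employers_to_reach(rows, threshold):
    """Return (employers, cumulative percent) once the running total of
    the largest shares first reaches `threshold` percent."""
    running = 0.0
    chosen = []
    for name, pct in sorted(rows, key=lambda row: row[1], reverse=True):
        chosen.append(name)
        running += pct
        if running >= threshold:
            break
    return chosen, round(running, 1)


chosen, covered = employers_to_reach(signoff_pct, 50.0)
print(len(chosen), "employers cover", covered, "percent of non-author signoffs")
```

With the table's numbers the threshold is crossed at the fifth employer, the Linux Foundation, with a cumulative share of 51.2%.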

When bugs were introduced

When a commit fixes a bug, it will often contain a Fixes: tag indicating the commit that first introduced that bug. This information is useful for a number of reasons, including deciding how far back a fix needs to be backported in the stable kernels. But it can also give an indication of how long bugs have been in the kernel. The 5.19 cycle saw the addition of 2,541 commits with Fixes: tags; 712 of those (28%) referred to other 5.19 commits. Those bugs never made it into a mainline release, but the rest did. Looking at tags referring to previous releases gives this result:
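To make the mechanics concrete, here is a minimal Python sketch of pulling the referenced commit IDs out of a commit message. It is an illustration only, not the script behind the article's numbers. The commit message is hypothetical; the only real detail in it is commit 1da177e4c3f4, the initial commit of the kernel's Git history, whose subject line is "Linux-2.6.12-rc2". The regular expression accepts abbreviated IDs of 7 to 40 hex digits, which is a deliberately loose assumption.

```python
import re

# A Fixes: tag names an (abbreviated) commit ID, usually followed by the
# subject line of the offending commit in parentheses and quotes.
FIXES_RE = re.compile(r"^Fixes:\s*([0-9a-f]{7,40})\b", re.IGNORECASE | re.MULTILINE)


def fixed_commits(message):
    """Return the commit IDs named by Fixes: tags in one commit message."""
    return [m.group(1).lower() for m in FIXES_RE.finditer(message)]


# Hypothetical commit message used only to exercise the parser.
message = """\
example: avoid a data race on a shared counter

Annotate the lockless read so that the compiler cannot tear it.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: A. Developer <dev@example.org>
"""

print(fixed_commits(message))
```

Running each referenced ID through `git describe --contains` would then say which release first contained the buggy commit, which is the information the statistics in this section are built on.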

[Figure: bar chart of commits fixed in 5.19, grouped by the release that introduced them]

As one might expect, many of the bugs fixed in 5.19 were introduced in recent releases; 268 of them came from 5.18. What is perhaps more surprising is the long tail of references back to earlier releases; only 2.6.21, 2.6.28, and 2.6.32 are missing from the plot because they had no commits that were fixed in 5.19. It can be surprising to see that there is any code left from those early development cycles at all; that code exists, though, and it still contains some bugs.

The spike at 2.6.12 may seem strange, but remember that the Git history begins then; all of the Fixes: tags pointing to 2.6.12 name commit 1da177e4c3f4, which was the initial commit that started the whole thing off. They are, thus, referring to bugs that were introduced sometime before early 2005. Almost all of those fixes are dealing with data-race issues that were seen as less problematic on the hardware of that era.

The curious can look at the full list of 5.19 fixes, which contains pointers to the fixed commits.

One can also use Fixes: tags to get a sense for when bugs are introduced during the development cycle. In this case, the results are:

-rc              5.19           All time
-rc1       656   4.7%      66,154   7.3%
-rc2         7   2.1%       1,512   3.5%
-rc3         6   2.0%       1,179   3.3%
-rc4        13   2.8%         987   3.1%
-rc5         6   2.7%         924   3.6%
-rc6         4   0.9%         863   3.5%
-rc7        15   3.3%         755   3.8%
-rc8         5   1.9%         275   3.9%
-rc9                           32   2.2%
final                         472   3.8%

The 5.19 numbers should be taken with at least one grain of salt; as we have seen above, the fixes for 5.19 commits will be wandering into the kernel over the next decade or so. That makes 5.19 appear, probably falsely, to be better than the kernel history as a whole; getting a complete picture for this cycle will require some patience. Beyond that, the Retbleed fixes were merged for 5.19-rc7; there were numerous fixes needed for those, which explains the elevated rate at -rc7.

In general we see, as we might expect, that most bugs enter the kernel during the merge window, whether one looks at absolute numbers or as a percentage of total commits. After that, the bug rate drops, but remains roughly the same through the development cycle. In theory, as the final release gets closer, developers should be more careful and only push the most important and well-tested commits. In the real world, late-cycle patches are just as likely to be buggy as those that came earlier, and patches that enter the mainline after the last -rc release seem to be especially risky.
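One plausible way to assign a fixed commit to a point in the development cycle is to look at the first tag that contains it. The Python sketch below parses strings in the form printed by `git describe --contains` and buckets them by -rc; it is an assumption about how such a table could be built, not a description of the script actually used, and the sample strings are invented. It also assumes that only mainline tags are present in the repository, so stable-release tags never show up.

```python
import re
from collections import Counter

# Matches names such as "v5.19-rc1~159^2~3", "v5.19~2", or "v2.6.12-rc2".
DESCRIBE_RE = re.compile(r"^v(\d+(?:\.\d+)+)(?:-(rc\d+))?(?:[~^]|$)")


def rc_bucket(describe):
    """Map `git describe --contains` output to (release, bucket).

    A commit first contained in v5.19-rc1 entered during the merge window;
    one first contained in the v5.19 tag itself landed after the last -rc
    and is reported as "final".
    """
    m = DESCRIBE_RE.match(describe)
    if m is None:
        raise ValueError(f"unrecognized describe output: {describe!r}")
    release, rc = m.groups()
    bucket = rc if rc else "final"
    return release, bucket


# Invented describe strings, standing in for commits named by Fixes: tags.
samples = ["v5.19-rc1~159^2~3", "v5.19-rc1~20", "v5.19-rc7~4^2", "v5.19~2"]
per_rc = Counter(rc_bucket(s)[1] for s in samples)
print(dict(per_rc))
```

Dividing each bucket's count by the number of commits merged for that -rc gives the percentage columns in the table.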

On to 6.0

In the 5.19 announcement, Linus Torvalds let it be known that the next kernel would probably be named 6.0. As usual, the major-number bump has no special meaning for the kernel; it's just another release with a lot more changes in it. As of this writing, 12,325 non-merge changesets are waiting in linux-next, suggesting that 6.0 will not be as busy a cycle as its predecessor. Come back in early October for the details on how it played out.


Some 5.19 development statistics

Posted Aug 1, 2022 18:09 UTC (Mon) by zx2c4 (subscriber, #82519) [Link] (1 responses)

I'm vaguely curious about the calculation here:

By changesets
Krzysztof Kozlowski	211	1.4%
[...]
Pavel Begunkov	79	0.5%

Because when I run my own stats and compare it to the last one there, I should technically be tied for last place, right?

~/Projects/linux $ git log v5.18..v5.19 --oneline --author=Jason@zx2c4.com | wc -l
79
~/Projects/linux $ git log v5.18..v5.19 --oneline --author=asml.silence@gmail.com | wc -l
79

Running git describe --contains on those shows they all have v5.19-era tags too.

What's the missing detail? Does the LWN script automatically identify reverts and removes those from the count?

The missing detail

Posted Aug 1, 2022 18:23 UTC (Mon) by corbet (editor, #1) [Link]

The missing detail is that I just grabbed the first 20 out of the list, and your name happened to be #21. I sometimes look out for that and try to draw the dividing line in a way that doesn't have this effect, but clearly failed this time. Both you and Jack Xiao came in at 79 commits.

I've just extended the table by two lines; apologies for the oversight.

Some 5.19 development statistics

Posted Aug 1, 2022 21:54 UTC (Mon) by moorray (subscriber, #54145) [Link]

Thanks a lot for the analysis of the fixes! Surprising that fixes later in the -rc process are more rather than less likely to require a follow up. Also looking at the volume of fixes per -rc makes me feel a lot better about the fact that the count of networking fixes per -rc does not decrease much for later -rcs - we're not an outlier it seems :)

Some 5.19 development statistics

Posted Aug 2, 2022 9:09 UTC (Tue) by epa (subscriber, #39769) [Link] (2 responses)

Is it fair to count Red Hat and IBM separately? I know they are separate business units and all that, but these distinctions tend to dissolve over time.

Some 5.19 development statistics

Posted Aug 2, 2022 9:32 UTC (Tue) by airlied (subscriber, #9104) [Link]

They are still separate companies. Like Red Hat has a CEO.

Some 5.19 development statistics

Posted Aug 3, 2022 5:58 UTC (Wed) by ajdlinux (subscriber, #82125) [Link]

IBM kernel dev here: I can very confidently say those distinctions have not, at this point, dissolved. Still two very separate organisations with completely separate processes.

Some 5.19 development statistics

Posted Aug 2, 2022 18:12 UTC (Tue) by andy_shev (subscriber, #75870) [Link]

> The spike at 2.6.12 may seem strange, but remember that the Git history begins then; all of the Fixes: tags pointing to 2.6.12 name commit 1da177e4c3f4, which was the initial commit that started the whole thing off. They are, thus, referring to bugs that were introduced sometime before early 2005.

People can add history.git to their remote list and have fun with the pre-Git era of the commits. I have done a few, but not as a Fixes tag (just a pointer inside the commit messages).

Renesas H8/300

Posted Aug 4, 2022 9:47 UTC (Thu) by rwmj (subscriber, #5474) [Link]

Interestingly not the first time that h8300 has been removed (and later readded)! It was first dropped in 2013 and then added back in 2015.

Some 5.19 development statistics

Posted Aug 5, 2022 11:40 UTC (Fri) by anton (subscriber, #25547) [Link] (2 responses)

The last table is unclear to me. What is the basic value that the percentages are computed from? The connection with the accompanying text is also unclear.

Final table

Posted Aug 5, 2022 12:43 UTC (Fri) by corbet (editor, #1) [Link] (1 responses)

For the final table, I found all of the commits referred to by Fixes: tags, then looked at when during the development cycle those commits were introduced. During 5.19, 656 commits that hit the mainline for 5.19-rc1 were fixed by later commits - that's 4.7% of the 13,124 commits pulled for -rc1.

Apologies if that wasn't clear.

Final table

Posted Aug 5, 2022 13:59 UTC (Fri) by anton (subscriber, #25547) [Link]

Thanks. For the "All time" numbers, I guess this means that 66,154 (7.4%) of all rc0 patches were fixed later. If 5.19 has a similar proportion of patches to be fixed, about 2/3 of that are fixed before the 5.19 release, and 1/3 will be fixed later; that sounds worrying, but I expect that the bugs that survive into the release cause fewer problems than the ones that have been caught earlier, at least for functionality bugs (security is a different issue).


Copyright © 2022, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds