
Development statistics for the 5.0 kernel

By Jonathan Corbet
February 21, 2019
The announcement of the 5.0-rc7 kernel prepatch on February 17 signaled the imminent release of the final 5.0 kernel and the end of this development cycle. 5.0, as it turns out, brought in fewer changesets than its immediate predecessors, but it was still a busy cycle with a lot of developers participating. Read on for an overview of where the work came from in this release cycle.

As of this writing, 12,517 non-merge changesets have been pulled into the mainline repository for the 5.0 release. This is low compared to the kernels that came before:

Cycle  Changesets
4.15   14,866
4.16   13,630
4.17   13,541
4.18   13,283
4.19   14,043
4.20   13,884
5.0    12,517 (so far)

One has to go back to 4.7, released in July 2016, to find a development cycle that brought in fewer changesets than 5.0. The number of developers contributing to 5.0 was 1,712, roughly equivalent to previous cycles; 276 of those developers made their first kernel contribution in this development cycle.

The most active developers were:

Most active 5.0 developers

By changesets
Christoph Hellwig          213  1.7%
Masahiro Yamada            135  1.1%
Colin Ian King             135  1.1%
Jens Axboe                 112  0.9%
Arnaldo Carvalho de Melo   112  0.9%
Yangtao Li                 106  0.8%
Yue Haibing                100  0.8%
Kuninori Morimoto           95  0.8%
Andy Shevchenko             94  0.8%
Rob Herring                 92  0.7%
Maxime Ripard               91  0.7%
Boris Brezillon             89  0.7%
Jakub Kicinski              83  0.7%
Michael Straube             83  0.7%
Thierry Reding              82  0.7%
Ville Syrjälä               82  0.7%
Geert Uytterhoeven          80  0.6%
Linus Walleij               80  0.6%
Paul E. McKenney            78  0.6%
Gustavo A. R. Silva         77  0.6%

By changed lines
Olof Johansson           41834  6.0%
Kan Liang                31458  4.5%
Yong Zhi                 22799  3.3%
Aaro Koskinen            20462  3.0%
Firoz Khan               15981  2.3%
Jens Axboe               13009  1.9%
Tony Lindgren            12237  1.8%
Boris Brezillon          11422  1.7%
Sean Christopherson      10614  1.5%
Dong Aisheng              7998  1.2%
Eric Biggers              7476  1.1%
Manivannan Sadhasivam     6724  1.0%
Christoph Hellwig         6199  0.9%
Federico Vaga             5877  0.8%
Jordan Crouse             5772  0.8%
Kuninori Morimoto         5255  0.8%
Florian Westphal          5120  0.7%
Mauro Carvalho Chehab     5097  0.7%
Lorenzo Bianconi          4941  0.7%
Sagi Grimberg             4827  0.7%

Christoph Hellwig was the most prolific contributor of changesets this time around; he did a lot of work in the block subsystem and the DMA API. Masahiro Yamada's work was mostly focused on improvements to the kernel's build system, Colin Ian King continues to make spelling and coding-style fixes throughout the tree, Jens Axboe converted a lot of block drivers to the multiqueue API (along with many other block-layer changes), and Arnaldo Carvalho de Melo worked extensively on the perf utility. Of the top twenty developers with regard to changesets, only one got there through work on the staging tree — a significant change from years past.

Switching to the "lines changed" column: Olof Johansson only contributed seven changesets to 5.0, but one of them was removing the old and unmaintained eicon ISDN driver. Other top contributors in the "lines changed" column include Kan Liang for adding some JSON metrics to the perf utility, Yong Zhi for the Intel IPU3 driver, Aaro Koskinen for work on MIPS OCTEON support, and Firoz Khan, who reworked how the system-call tables are generated for most architectures.

A total of 226 employers supported work on 5.0, which is a typical number. The most active of those were:

Most active 5.0 employers

By changesets
Intel                   1360  10.9%
(None)                   937   7.5%
(Unknown)                867   6.9%
Red Hat                  859   6.9%
Linaro                   552   4.4%
Google                   487   3.9%
Mellanox                 481   3.8%
SUSE                     418   3.3%
AMD                      415   3.3%
Renesas Electronics      394   3.1%
IBM                      347   2.8%
Huawei Technologies      320   2.6%
(Consultant)             311   2.5%
Facebook                 268   2.1%
Bootlin                  261   2.1%
NXP Semiconductors       247   2.0%
ARM                      226   1.8%
Oracle                   204   1.6%
Canonical                171   1.4%
Code Aurora Forum        148   1.2%

By lines changed
Intel                 116158  16.8%
Facebook               66816   9.7%
Linaro                 40368   5.8%
Red Hat                33041   4.8%
(None)                 32191   4.7%
(Unknown)              26858   3.9%
Mellanox               26487   3.8%
Google                 24099   3.5%
Nokia                  20600   3.0%
Bootlin                19019   2.7%
AMD                    17137   2.5%
NXP Semiconductors     16716   2.4%
SUSE                   14546   2.1%
Renesas Electronics    14103   2.0%
IBM                    13068   1.9%
Atomide                12237   1.8%
Huawei Technologies    10952   1.6%
Code Aurora Forum      10566   1.5%
(Consultant)            9353   1.4%
ARM                     7034   1.0%

The kernel development community relies heavily on its testers and reviewers. The testing and review picture for 5.0 looks like this:

Test and review credits in 5.0

Tested-by
Andrew Bowers                    50  6.4%
Ming Lei                         37  4.7%
Jarkko Sakkinen                  20  2.6%
Arnaldo Carvalho de Melo         18  2.3%
Janusz Krzysztofik               17  2.2%
Alan Tull                        17  2.2%
Tony Luck                        16  2.1%
Aaron Brown                      15  1.9%
Jesper Dangaard Brouer           15  1.9%
Heiko Stuebner                   14  1.8%
Marek Szyprowski                 13  1.7%
Corentin Labbe                   13  1.7%
Adam Ford                        12  1.5%
Wolfram Sang                     11  1.4%
Tom Zanussi                      11  1.4%
Steve Longerbeam                 11  1.4%
Ravulapati Vishnu vardhan Rao    11  1.4%
David Ahern                      10  1.3%
Jarkko Nikula                    10  1.3%
Ondrej Jirman                    10  1.3%

Reviewed-by
Rob Herring                     186  3.8%
Ville Syrjälä                   125  2.5%
Simon Horman                    108  2.2%
Geert Uytterhoeven               92  1.9%
Hannes Reinecke                  90  1.8%
Christoph Hellwig                83  1.7%
Alex Deucher                     72  1.5%
David Sterba                     69  1.4%
Andrew Morton                    60  1.2%
Omar Sandoval                    60  1.2%
John Hurley                      58  1.2%
Rodrigo Vivi                     57  1.2%
Chris Wilson                     57  1.2%
Sagi Grimberg                    56  1.1%
Petr Machata                     56  1.1%
Daniel Vetter                    52  1.1%
Christian König                  51  1.0%
Chao Yu                          48  1.0%
Andy Shevchenko                  44  0.9%
Nikolay Borisov                  41  0.8%

The kernel's repository can tell us who the patches came from, but it is silent on the question of where they came from. Some insights, though, can be had by looking at the time zone stored in the commit time for each patch. For 5.0, the result looks like this:

Originating time zone for 5.0 patches
Offset Changesets Notes
-8:00 1,676 US west coast
-7:00 622 US mountain
-6:00 361 US central
-5:00 939 US east coast
-4:00 295
-3:00 158 Brazil
-2:00 105
0:00 1,611 UK
+1:00 2,812 Western Europe
+2:00 1,457 Eastern Europe
+3:00 447 Finland, Russia
+5:30 513 India
+8:00 952 China
+9:00 302 Japan, Korea
+10:00 99 Australia
+11:00 140 Australia

A few time zones with fewer than ten changesets have been omitted from the above table. The association of time zones with countries is, of course, approximate. Daylight savings time can throw things off, as can developers whose systems are not set to their local time. If nothing else, the number of patches with times in UTC is probably higher than the number that actually came from countries in that time zone. There are still a few conclusions that can be drawn, though: it seems clear that an awful lot of kernel work still happens at or just east of the Prime Meridian, for example.
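
A per-offset tally of this kind can be approximated with a short script. What follows is a minimal sketch, not the tool used to produce the table above. It assumes the input is one date per line as printed by `git log --no-merges --format=%ai v4.20..v5.0-rc7`, which gives author dates; `%ci` would give committer dates instead, and the article does not say which was used. The name `tally_offsets` and the sample dates are made up for illustration.

```python
import re
from collections import Counter

# Matches the trailing UTC offset of a line such as
# "2019-02-17 18:46:40 -0800" (the shape printed by `git log --format=%ai`).
OFFSET_RE = re.compile(r"([+-])(\d{2})(\d{2})\s*$")


def tally_offsets(lines):
    """Count how many date lines carry each UTC offset."""
    counts = Counter()
    for line in lines:
        match = OFFSET_RE.search(line)
        if match is None:
            continue  # skip blank or malformed lines
        sign, hours, minutes = match.groups()
        # Normalize "-0800" to "-8:00" so keys resemble the table's offsets.
        counts[f"{sign}{int(hours)}:{minutes}"] += 1
    return counts


# Hypothetical sample input; real input would come from git.
sample_dates = [
    "2019-02-17 18:46:40 -0800",
    "2019-02-10 09:00:00 +0100",
    "2019-01-06 12:00:00 +0100",
    "2019-01-05 12:00:00 +0530",
]

for offset, count in tally_offsets(sample_dates).most_common():
    print(f"{offset:>6} {count:>6}")
```

The script only reads the offset recorded in each commit, so it inherits every caveat listed above: a developer on summer time, or one whose machine is set to UTC, is counted under the offset the machine reported.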

More than anything else, though, this table highlights something we already knew: the Linux kernel community is truly global in scope. Patches come in at a high rate from all over the world and are integrated in a (usually) smooth manner. In this sense, the 5.0 kernel is just like the many that came before it; it's business as usual in the kernel community.


Development statistics for the 5.0 kernel

Posted Feb 21, 2019 19:09 UTC (Thu) by jani (subscriber, #74547) [Link] (1 responses)

> Daylight savings time can throw things off

Indeed. Judging by a handful of European developers whose time zones I know, about half their commits to 5.0 are in DST, and if they're in any way representative, a significant portion of the results are tilted one time zone to the east.

Development statistics for the 5.0 kernel

Posted Feb 21, 2019 23:06 UTC (Thu) by mchehab (subscriber, #41156) [Link]

> > Daylight savings time can throw things off

> Indeed. Judging by a handful of European developers whose time zones I know, about half their commits to 5.0 are in DST, and if they're in any way representative, a significant portion of the results are tilted one time zone to the east

Brazil was also at GMT-2 during part of the 5.0 development cycle.

Development statistics for the 5.0 kernel

Posted Feb 21, 2019 22:11 UTC (Thu) by arekm (guest, #4846) [Link] (9 responses)

I wonder if there are "who/which company produces bugs often" stats based on "Fixes" info? Not something too useful though.

Development statistics for the 5.0 kernel

Posted Feb 22, 2019 0:24 UTC (Fri) by neilbrown (subscriber, #359) [Link] (5 responses)

> I wonder if there are "who/which company produces bugs often" stats

I really don't think that pointing the finger at who produced a bug is ever helpful (except to quietly let them know so they might learn from the experience). All the bugs belong to all of us.

Conversely, highlighting people who fixed lots of bugs would do no harm and could be beneficial. Even better is highlighting people who fixed a bug and made it clear when the bug was introduced so that an informed backport to -stable is easier. This is a valuable contribution worth celebrating.

Development statistics for the 5.0 kernel

Posted Feb 22, 2019 21:39 UTC (Fri) by GustavoARSilva (subscriber, #112293) [Link] (1 responses)

I agree.

It'd be interesting to include such statistics in coming reports. In the meantime I took the time to get such data for the most active 5.0 developers.

$ git log --shortstat --author=<name> v4.20..v5.0-rc7 | grep 'Fixes:\s\+[0-9a-f]\{6,\}\s\+(".*")' | wc -l

Geert Uytterhoeven 44
Colin Ian King 30
Jens Axboe 18
Christoph Hellwig 16
Gustavo A. R. Silva 15
Ville Syrjälä 15
Linus Walleij 14
Arnaldo Carvalho de Melo 13
Boris Brezillon 10
Masahiro Yamada 10
Yue Haibing 7
Kuninori Morimoto 5
Andy Shevchenko 3
Rob Herring 3
Jakub Kicinski 2
Thierry Reding 2
Michael Straube 1
Maxime Ripard 1
Yangtao Li 1
Paul E. McKenney 0

I wanted to use grep 'Fixes:\s\+[0-9a-f]\{12\}\s\+(".*")' but some people don't use the canonical format. Also, I noticed that some people use this format: Fixes: commit 4d230d1271064. That may be why this kind of info is not included in the report: it is prone to error. So, due to this and other formatting issues, I had to manually edit the final number in more than three cases.

--Gustavo
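
The formatting problem described in this comment can be reduced with a more tolerant pattern. This is a minimal sketch under stated assumptions, not the method used to produce the numbers above: it accepts both the canonical `Fixes: <hash> ("subject")` form and the `Fixes: commit <hash>` variant, takes commit messages as plain strings, and uses made-up names (`FIXES_RE`, `fixed_commits`, `count_fixing_commits`) and made-up sample messages.

```python
import re

# Accepts 'Fixes: <hash> ("subject")' as well as the non-canonical
# 'Fixes: commit <hash>' form, with 6 to 40 hex digits in the hash.
# Leading whitespace is allowed because `git log` indents message bodies.
FIXES_RE = re.compile(
    r"^\s*Fixes:\s+(?:commit\s+)?([0-9a-f]{6,40})\b",
    re.IGNORECASE | re.MULTILINE,
)


def fixed_commits(message):
    """Return the hashes named by Fixes: tags in one commit message."""
    return [sha.lower() for sha in FIXES_RE.findall(message)]


def count_fixing_commits(messages):
    """Count messages that carry at least one recognizable Fixes: tag."""
    return sum(1 for message in messages if fixed_commits(message))


# Hypothetical commit messages, for illustration only.
sample_messages = [
    'foo: fix leak\n\nFixes: 4d230d127106 ("foo: add driver")\n',
    "bar: fix crash\n\n    Fixes: commit 4d230d1271064.\n",
    "doc: typo\n\nThis fixes: nothing in particular\n",
]

print(count_fixing_commits(sample_messages))
```

A pattern like this still misses tags with typos in the keyword or hashes shorter than six digits, so some manual checking would remain necessary.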

Development statistics for the 5.0 kernel

Posted Feb 28, 2019 8:22 UTC (Thu) by arnd (subscriber, #8866) [Link]

Everyone with 11 fixes or more
$ git rev-list --grep="Fixes:\s\+[0-9a-f]\{6,\}" v4.20..v5.0-rc8 | xargs git show --format=%an -s | sort | uniq -c | sort -rn | head -n 23
46 Dan Carpenter
38 Geert Uytterhoeven
30 Colin Ian King
25 Arnd Bergmann
19 Chris Wilson
18 Florian Westphal
17 Jens Axboe
15 Ville Syrjälä
15 Martin Blumenstingl
15 Linus Walleij
15 Gustavo A. R. Silva
15 Christoph Hellwig
14 Wei Yongjun
14 Sinan Kaya
14 Paolo Abeni
14 Eric Biggers
13 Arnaldo Carvalho de Melo
12 Willem de Bruijn
11 Yonghong Song
11 Nicholas Mc Guire
11 Lorenzo Bianconi
11 Ido Schimmel
11 Andrew Lunn

Development statistics for the 5.0 kernel

Posted Feb 22, 2019 23:09 UTC (Fri) by martinfick (subscriber, #4455) [Link] (1 responses)

I believe it could be helpful if new code contributions were throttled by ensuring that a contributor's known bugs were fixed before accepting new contributions from them. This is a policy I try to enforce on my team. That would probably be hard to do privately.

Development statistics for the 5.0 kernel

Posted Feb 23, 2019 8:24 UTC (Sat) by error27 (subscriber, #8346) [Link]

My impression is that this isn't really an issue in the kernel. Once you know the code at fault (and thus the author) then the fix is normally straightforward. The difficulty lies in figuring out which code is at fault.

The other issue in the kernel is that after two years the original author isn't around or doesn't want to fix bugs. I seldom bother reporting static analysis issues over two years old. If it's less than two years we are good at addressing those.

People generally take a lot of pride in their work.

Development statistics for the 5.0 kernel

Posted Feb 23, 2019 1:18 UTC (Sat) by johannbg (guest, #65743) [Link]

Time from report to fix should be measured as well, and responses between reporters and developers could be meaningful to have (though impossible to implement).

Also, there should be a comparison between the kernel communities on all the *nix platforms.

How healthy are they, how do they compare to each other, and are developers contributing across Linux, BSD and Solaris?

Is there a rise or decline in one but not the others, etc.?

I don't think Jon or the other writers here have researched that, gathered the stats and written about it.

Development statistics for the 5.0 kernel

Posted Feb 22, 2019 7:40 UTC (Fri) by error27 (subscriber, #8346) [Link]

I haven't looked at this rigorously, but I suspect that most fixes tags are for an initial driver merge. So they're not regressions, they're support for new hardware but it has bugs.

Development statistics for the 5.0 kernel

Posted Feb 22, 2019 22:40 UTC (Fri) by kees (subscriber, #27264) [Link] (1 responses)

I did this analysis a couple years ago but for CVEs. "Who introduced the most security bugs?" The result was the same as "who wrote the most code?" Which is really to be expected. :P

Development statistics for the 5.0 kernel

Posted Feb 24, 2019 4:51 UTC (Sun) by spaetz (guest, #32870) [Link]

> The result was the same as "who wrote the most code?" Which is really to be expected. :P
Germany has a saying (translates only clumsily): "Only those who create, create mistakes."

Development statistics for the 5.0 kernel

Posted Feb 22, 2019 18:02 UTC (Fri) by Unknown118081 (guest, #118081) [Link]

How are "(Consultant)" contributors counted?

Development statistics for the 5.0 kernel

Posted Mar 1, 2019 0:03 UTC (Fri) by flussence (guest, #85566) [Link]

It'll be interesting to see the timezone results for 5.2, when the UK probably won't be in UTC. (subject to release schedules… and contemporary insanity)

Development statistics for the 5.0 kernel

Posted Mar 7, 2019 10:34 UTC (Thu) by Wol (subscriber, #4433) [Link] (1 responses)

> There are still a few conclusions that can be drawn, though: it seems clear that an awful lot of kernel work still happens at or just east of the Prime Meridian, for example.

East of the Prime Meridian? I don't think so! On the Prime Meridian we have the UK, MOST of which is to the WEST of the Prime Meridian (and the Meridian being on the east side of London, even if most UK work is done in London that is also to the west ...). Then we have the Iberian Peninsula, which despite lying on the Meridian is on West European time. I don't know how much work comes from Africa, nor which countries lie on the Meridian, although it does run straight down the centre of the continent.

(East of the Meridian, in the UK, we have Kent, Essex, and East Anglia. That's about it. Maybe a bit of Lincolnshire.)

Cheers,
Wol

Development statistics for the 5.0 kernel

Posted Mar 7, 2019 15:22 UTC (Thu) by rschroev (subscriber, #4164) [Link]

> > There are still a few conclusions that can be drawn, though: it seems clear that an awful lot of kernel work still happens at or just east of the Prime Meridian, for example.

> East of the Prime Meridian? I don't think so!

I'm not sure I'm following your reasoning here.

The graph shows a lot of changesets from timezones 0, +1 and +2. When the author says "it seems clear that an awful lot of kernel work still happens at or just east of the Prime Meridian", it therefore appears to me that he means, very roughly speaking, that 0 is at the Prime Meridian and +1 and +2 are just east of it.

But even if we look closer at the geography of these time zones, the statement still holds.

Timezones +1 and +2 are mostly east of Greenwich in winter, and even in summer a large part of +2 is east of Greenwich (in Europe, +2 in summer is CEST). That excludes Portugal, the UK, Ireland, Iceland and Greece but includes the rest of Southern and Western Europe and most of Northern and Central Europe. A lot of that is east of Greenwich.

Quite some of the changesets from zones 0 and +1 probably come from west of Greenwich, but even so I imagine part of that is covered by "at the Prime Meridian" (not *exactly at* of course: the line is infinitesimally thin).

It appears that in Africa there is also a lot of land around those longitudes. South Africa, for example, is at UTC +2 and can be regarded as "just east of the Prime Meridian".

So yes, I'd say a lot of kernel work happens at or just east of Greenwich (i.e. the Prime Meridian).

(For some pedantic nitpicking: Portugal is on Western European Time, but the rest of the Iberian Peninsula (i.e. Spain) is on Central European Time.)


Copyright © 2019, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds