
Where 2.6.25 came from

By Jonathan Corbet
April 2, 2008
The Linux Foundation has just published a white paper, written by Greg Kroah-Hartman, Amanda McPherson, and your editor, reviewing the origins of the code merged into the kernel from 2.6.11 through 2.6.24. As LWN readers know, the 2.6.25 kernel is getting close to release. So this seems like as good a time as any to look at what happened with the process in this release cycle.

As of this writing, 12,269 individual changesets have been merged for 2.6.25 - a new record. That beats the previous record (2.6.24, with a mere 10,353 changesets) by almost 2,000. There were 1,174 individual developers involved with 2.6.25, 419 of whom contributed one single patch. All told, those developers worked for 159 employers (that your editor could identify). The changes added 766,979 lines of code and removed 399,791, for a total growth of 367,188 lines.
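As a quick sanity check, the derived figures in the paragraph above can be recomputed from the raw counts it quotes. The variable names below are illustrative only; the numbers are taken directly from the text.

```python
# Figures quoted in the article for the 2.6.25 development cycle.
changesets_2_6_25 = 12_269
changesets_2_6_24 = 10_353
developers = 1_174
single_patch_developers = 419
lines_added = 766_979
lines_removed = 399_791

# Margin by which 2.6.25 beats the previous changeset record.
record_margin = changesets_2_6_25 - changesets_2_6_24

# Net growth of the tree is lines added minus lines removed.
net_growth = lines_added - lines_removed

# Share of developers who contributed exactly one patch.
single_patch_share = single_patch_developers / developers

print(f"record beaten by {record_margin:,} changesets")
print(f"net growth: {net_growth:,} lines")
print(f"single-patch developers: {single_patch_share:.1%}")
```

The margin works out to 1,916 changesets ("almost 2,000"), the net growth matches the 367,188 lines quoted, and a bit more than a third of all contributors submitted a single patch.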

Here is an updated version of a plot that your editor has been fond of showing during talks in recent years:

[Kernel lines-changed plot]

This plot shows a cumulative count of lines changed over time, with kernel release dates added in. The effects of the merge window policy can be seen in the stair-step appearance of the plot. The steps appear to be getting bigger, but the time between releases has also increased slightly, so the overall rate of change remains roughly constant. It is a high rate, with over five million lines changed - well over half the total - in the last two years.
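For readers who want to build a similar curve, here is a minimal sketch of how a cumulative lines-changed series might be derived from `git log --numstat` output. It is not the script behind the plot: the log format, the definition of "lines changed" as added plus removed, and the sample data (with made-up file names) are all assumptions made for illustration.

```python
from itertools import accumulate


def lines_changed_per_commit(numstat_log):
    """Parse text shaped like the output of

        git log --reverse --numstat --date=short --pretty=format:"commit %ad"

    into a list of (date, lines_changed) pairs, one per commit.
    "Lines changed" is taken here to mean added + removed (an assumption).
    """
    commits = []
    for line in numstat_log.splitlines():
        if line.startswith("commit "):
            commits.append([line.split()[1], 0])
            continue
        parts = line.split("\t")
        if len(parts) != 3 or not commits:
            continue  # blank separator line or unrelated text
        added, removed, _path = parts
        # Binary files are reported as "-" in both columns; skip them.
        if added != "-" and removed != "-":
            commits[-1][1] += int(added) + int(removed)
    return [(date, count) for date, count in commits]


def cumulative(commits):
    """Turn per-commit counts into a running total suitable for plotting."""
    totals = accumulate(count for _date, count in commits)
    return [(date, total) for (date, _count), total in zip(commits, totals)]


# Hypothetical sample input; the paths and numbers are invented.
SAMPLE = (
    "commit 2008-01-25\n"
    "10\t2\ta.c\n"
    "\n"
    "commit 2008-01-26\n"
    "-\t-\timg.png\n"
    "300\t0\tb.c\n"
)

per_commit = lines_changed_per_commit(SAMPLE)
print(cumulative(per_commit))
```

Plotting the running total against the commit date would produce the stair-step shape described above, since most changes land during the merge window.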

So who did this work? Here is the traditional table of the most active developers in the 2.6.25 series:

Most active 2.6.25 developers

By changesets
Developer                  Changesets  Percent
Bartlomiej Zolnierkiewicz         304     2.5%
Patrick McHardy                   219     1.8%
Adrian Bunk                       212     1.7%
Ingo Molnar                       207     1.7%
Paul Mundt                        204     1.7%
Greg Kroah-Hartman                171     1.4%
Jesper Nilsson                    166     1.4%
Thomas Gleixner                   164     1.3%
Pavel Emelyanov                   155     1.3%
Harvey Harrison                   148     1.2%
Herbert Xu                        136     1.1%
Mauro Carvalho Chehab             136     1.1%
Roland McGrath                    134     1.1%
David Woodhouse                   134     1.1%
Al Viro                           132     1.1%
Michael Krufky                    128     1.0%
Glauber Costa                     127     1.0%
David S. Miller                   112     0.9%
Andrew Morton                     109     0.9%
Takashi Iwai                      104     0.8%

By changed lines
Developer                       Lines  Percent
Jesper Nilsson                  34407     3.7%
David Howells                   29733     3.2%
Eliezer Tamir                   26153     2.9%
Adrian Bunk                     21998     2.4%
Kumar Gala                      19753     2.2%
Paul Mundt                      18918     2.1%
Jiri Slaby                      18002     2.0%
Glenn Streiff                   16597     1.8%
Auke Kok                        13939     1.5%
David Gibson                    11255     1.2%
Michael Chan                    11254     1.2%
Ingo Molnar                     10679     1.2%
James Bottomley                  9907     1.1%
Christoph Hellwig                9784     1.1%
Mauro Carvalho Chehab            9332     1.0%
Bartlomiej Zolnierkiewicz        9108     1.0%
Thomas Gleixner                  9104     1.0%
Patrick McHardy                  8563     0.9%
Michael Krufky                   8195     0.9%
Takashi Iwai                     7825     0.9%

There are some familiar names on this list, but also some new ones. Bartlomiej Zolnierkiewicz contributed more changesets than any other developer; his work is contained entirely within the IDE subsystem. Patrick McHardy works in the networking area, mostly (but not exclusively) with the netfilter subsystem. Adrian Bunk continues to make small fixes all over the tree and to relentlessly hunt down unused code for removal. Ingo Molnar remains busy in his new role as one of the x86 maintainers; scheduler work also accounts for a number of his changes. Paul Mundt maintains the SuperH architecture.

The picture is a little different when one considers how many lines of code were changed. Jesper Nilsson's work was done within the CRIS architecture. David Howells works all over the tree; his largest contribution was the addition of the MN10300 architecture code. Eliezer Tamir contributed the bnx2x (Broadcom Everest) network driver, and Kumar Gala works with the PowerPC architecture.

There is relatively little change in the lists of employers associated with all of this work (please remember that the numbers associated with employers are necessarily approximate):

Most active 2.6.25 employers

By changesets
Employer                   Changesets  Percent
(None)                           1918    15.6%
Red Hat                          1562    12.7%
(Unknown)                        1232    10.0%
Novell                            826     6.7%
IBM                               758     6.2%
Intel                             566     4.6%
SWsoft                            266     2.2%
Oracle                            250     2.0%
Astaro                            219     1.8%
(Academia)                        218     1.8%
Renesas Technology                217     1.8%
Movial                            213     1.7%
Axis Communications               166     1.3%
linutronix                        166     1.3%
Freescale                         132     1.1%
Qumranet                          127     1.0%
Google                            124     1.0%
Analog Devices                    121     1.0%
SGI                               118     1.0%
(Consultant)                      111     0.9%

By lines changed
Employer                        Lines  Percent
(None)                         132117    14.4%
(Unknown)                      117993    12.8%
Red Hat                        103188    11.2%
IBM                             59249     6.4%
Freescale                       52336     5.7%
Intel                           46466     5.1%
Novell                          41790     4.5%
Axis Communications             39382     4.3%
Broadcom                        37789     4.1%
Renesas Technology              23704     2.6%
Movial                          22327     2.4%
Hansen Partnership              12076     1.3%
Marvell                         11661     1.3%
Oracle                          11214     1.2%
linutronix                      10649     1.2%
Astaro                          10167     1.1%
(Consultant)                     9342     1.0%
SWsoft                           7849     0.9%
MontaVista                       7517     0.8%
(Academia)                       7353     0.8%

As usual, one can also look at who applies a Signed-off-by header to code for which they are not the author. These headers illustrate the chain of trust which gets code into the kernel. For 2.6.25, the top approvers of patches are:

Sign-offs in the 2.6.25 kernel

By developer
Developer                   Sign-offs  Percent
Andrew Morton                    1513    12.2%
David S. Miller                  1444    11.7%
Ingo Molnar                      1153     9.3%
Thomas Gleixner                   991     8.0%
John W. Linville                  614     5.0%
Jeff Garzik                       468     3.8%
Mauro Carvalho Chehab             447     3.6%
Greg Kroah-Hartman                345     2.8%
Paul Mackerras                    307     2.5%
James Bottomley                   306     2.5%
Jaroslav Kysela                   292     2.4%
Linus Torvalds                    249     2.0%
Len Brown                         220     1.8%
Russell King                      197     1.6%
Takashi Iwai                      170     1.4%
Avi Kivity                        167     1.4%
Bryan Wu                          132     1.1%
Herbert Xu                        123     1.0%
Roland Dreier                     121     1.0%
Kumar Gala                        107     0.9%

By employer
Employer                    Sign-offs  Percent
Red Hat                          4185    33.8%
Google                           1516    12.2%
linutronix                        994     8.0%
(None)                            883     7.1%
IBM                               689     5.6%
Novell                            611     4.9%
(Unknown)                         534     4.3%
Intel                             468     3.8%
Hansen Partnership                306     2.5%
Linux Foundation                  254     2.1%
(Consultant)                      242     2.0%
Qumranet                          170     1.4%
Oracle                            126     1.0%
SGI                               126     1.0%
Freescale                         121     1.0%
Cisco                             121     1.0%
Analog Devices                    115     0.9%
Astaro                            107     0.9%
Renesas Technology                 82     0.7%
Movial                             78     0.6%
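Counting non-author sign-offs of the kind tabulated above can be sketched as follows. This is an illustration rather than the tool used for the article: matching people by name instead of email address is a simplifying assumption, and the sample commits and names are hypothetical.

```python
import re
from collections import Counter

# Matches trailer lines such as "Signed-off-by: Some Name <addr@example.org>".
SIGNOFF = re.compile(
    r"^[ \t]*Signed-off-by:[ \t]*(.+?)[ \t]*<[^>\n]*>[ \t]*$",
    re.MULTILINE | re.IGNORECASE,
)


def non_author_signoffs(commits):
    """Count sign-offs by people other than each commit's author.

    `commits` is an iterable of (author_name, commit_message) pairs.
    People are compared by name only, which is a simplification.
    """
    counts = Counter()
    for author, message in commits:
        # Use a set so a name repeated within one message counts once.
        for name in set(SIGNOFF.findall(message)):
            if name != author:
                counts[name] += 1
    return counts


# Hypothetical commits; the names and addresses are invented.
SAMPLE_COMMITS = [
    (
        "Alice Author",
        "Fix a thing\n\n"
        "Signed-off-by: Alice Author <alice@example.org>\n"
        "Signed-off-by: Bob Maintainer <bob@example.org>\n",
    ),
    (
        "Carol Coder",
        "Add a driver\n\n"
        "Signed-off-by: Carol Coder <carol@example.org>\n"
        "Signed-off-by: Dave Reviewer <dave@example.org>\n"
        "Signed-off-by: Bob Maintainer <bob@example.org>\n",
    ),
]

counts = non_author_signoffs(SAMPLE_COMMITS)
for name, total in counts.most_common():
    print(name, total)
```

The author's own sign-off is skipped, so only the people who passed the patch along the chain are counted.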

Some of these developers are quite busy; Andrew Morton is signing off more than twenty patches every day - weekends included. The gatekeepers to the kernel continue to work for a relatively small number of companies, with the top ten employers accounting for over 75% of all non-author signoffs.
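Both claims in the paragraph above can be checked against the sign-off table. The sketch below assumes the cycle ran from the 2.6.24 release (taken here as January 24, 2008) to this article's date; that window is an assumption chosen for illustration, not a figure given in the text.

```python
from datetime import date

# Percentages of the top ten employers in the non-author sign-off table.
top_ten_shares = [33.8, 12.2, 8.0, 7.1, 5.6, 4.9, 4.3, 3.8, 2.5, 2.1]
top_ten_total = round(sum(top_ten_shares), 1)

# Assumed cycle length: 2.6.24 release date to the article date.
cycle_days = (date(2008, 4, 2) - date(2008, 1, 24)).days

# Andrew Morton's sign-off count from the table, spread over that window.
morton_per_day = 1513 / cycle_days

print(f"top ten employers: {top_ten_total}% of non-author sign-offs")
print(f"cycle length: {cycle_days} days")
print(f"Andrew Morton: {morton_per_day:.1f} sign-offs per day")
```

Under those assumptions the top ten employers account for 84.3% of the sign-offs, comfortably over 75%, and the daily rate comes out just under 22.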

All told, these numbers paint a picture of a development process which is healthy and continues to set a fast pace. It incorporates work from an increasingly large community of developers who are able to work in a highly cooperative manner despite the fact that their employers are fierce competitors. There are very few projects like it.

(Thanks to Greg Kroah-Hartman for his help in the creation of these statistics).




Where 2.6.25 came from

Posted Apr 3, 2008 12:34 UTC (Thu) by lacostej (guest, #2760)

What would be interesting is to compare information found in the first graph, e.g.
* size of the 'stair-step'
* comparison of the stair size with the release period length
* stabilization (e.g. inclination angle of the slope after the stair)
* average amount of reviews per change
* ...
with external information (perceived or computed stability, number of security issues introduced in a release, holidays).

From my reading of the graph:
* the last 2 kernels have very steep merges
* 2.6.19 started merging late
* 2.6.22 has had more late merges than its successors
* 2.6.20 seems to have stabilized early
* 2.6.16 seems to have had a bad stabilization

Where 2.6.25 came from

Posted Apr 4, 2008 20:41 UTC (Fri) by roelofs (guest, #2599)

"2.6.16 seems to have had a bad stabilization"

2.6.16 is mostly off the graph. Are you off by one?

Greg

Where 2.6.25 came from

Posted Apr 5, 2008 5:27 UTC (Sat) by lacostej (guest, #2760)

"2.6.16 is mostly off the graph. Are you off by one?"

yep. I think I got the others right.

Where 2.6.25 came from

Posted Apr 6, 2008 10:28 UTC (Sun) by bunk (subscriber, #44933)

You said:
From my readings of the graph
the last 2 kernels have very steep merge. 2.6.19 started merging late
2.6.22 has had more late merges than its successors
2.6.20 seems to have stabilized early

Don't believe any statistics you haven't faked yourself.

There is no strong relation between the number of changes and the number of lines changed.

And none of them has any strong relation to when a kernel stabilizes.

An increase in the "count of lines changed over time" graph can e.g. be one or more of the following:
* many bugfixes
* defconfig updates
* addition or removal of drivers

It's a nice graph, but when you ignore the fact that it contains zero information about *why* lines changed, all conclusions you draw are invalid.

Where 2.6.25 came from

Posted Apr 6, 2008 10:59 UTC (Sun) by bunk (subscriber, #44933)

I just noticed I forgot to add a smiley after the "Don't believe any statistics you haven't faked yourself."

I don't claim this graph was faked (I haven't checked, but it's most likely correct).

But it's important to realize that the scale of the graph means that, for example, the step you see one month before the release of 2.6.23 can easily equal 100,000 lines of code (the re-addition of the sk98lin driver alone was over 40,000 lines of code).

Stabilization, also known as bugfixing, tends to consist of small patches (usually < 100 lines changed, often even < 10 lines changed).

And these patches are simply too small to have any visible effect on a "count of lines changed" graph for the Linux kernel.

Check e.g. http://lwn.net/Articles/274992/ for an explanation by Linus himself regarding what causes most of the line changes later in the release cycle.

Comments and questions about graph

Posted Apr 7, 2008 21:23 UTC (Mon) by pr1268 (guest, #24648)

First of all, many thanks to Linus (for git and its repository enabling the compilation of these statistics), Greg KH, Amanda, and our editor (for the article and gitdm), René Descartes (for the coordinate system bearing his name on which a function like ΔLOC/Δt can be plotted), and to Sir Isaac Newton and Gottfried Leibniz for discovering (inventing?) differential calculus, all of which allows me to comment on, and ask about, our editor's informative graph:

Overall, the entire plot has a relatively constant slope (with a barely perceptible increase in more recent times). This means that kernel development appears to be proceeding at a steady rate, with a hint of an increased patch submission rate in the newer releases.

However, there's one thing I find mildly unusual: the steep vertical rises of new code submissions in the merge windows of 2.6.23 and 2.6.24 are taller than previous ones, with their matching stabilization periods correspondingly longer.

Yet, with the overall slope nearly constant, this would seem to imply that the length of the stabilization period (shallow slope) is proportional to the number of patches submitted during the merge window (steep slope).

Are the stabilization periods becoming longer because of the additional time to test all the new code? Or is more code being submitted during the merge window due to the longer time between releases? Thanks again!

P.S. I calculate ΔLOC/Δt ≅ (5.5M - .5M) LOC / 2 yr ≅ 1 new LOC every 12.6144 seconds.
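
The P.S. arithmetic can be reproduced in a few lines. The sketch assumes 365-day years and uses the commenter's own readings from the plot (roughly 0.5M to 5.5M lines over two years); note that the plot counts lines changed, so "new LOC" is the commenter's shorthand.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes 365-day years

# Values read off the plot by the commenter, not measured data.
lines_changed = 5_500_000 - 500_000
elapsed_years = 2

seconds_per_line = elapsed_years * SECONDS_PER_YEAR / lines_changed
lines_per_day = lines_changed / (elapsed_years * 365)

print(f"one changed line every {seconds_per_line:.4f} seconds")
print(f"about {lines_per_day:,.0f} changed lines per day")
```

This reproduces the 12.6144-second figure, which is equivalent to a little under 7,000 changed lines per day.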


Copyright © 2008, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds