Where 2.6.25 came from
As of this writing, 12,269 individual changesets have been merged for 2.6.25 - a new record. That beats the previous record (2.6.24, with a mere 10,353 changesets) by almost 2,000. There were 1,174 individual developers involved with 2.6.25, 419 of whom contributed one single patch. All told, those developers worked for 159 employers (that your editor could identify). The changes added 766,979 lines of code and removed 399,791, for a total growth of 367,188 lines.
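The headline figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the paragraph; the variable names are illustrative and are not taken from the tooling that produced the statistics.

```python
# Figures quoted in the article for 2.6.25 (and 2.6.24 for comparison).
changesets_2_6_25 = 12269
changesets_2_6_24 = 10353
lines_added = 766979
lines_removed = 399791
developers = 1174
single_patch_developers = 419

# Margin by which 2.6.25 beats the previous record ("almost 2,000").
record_margin = changesets_2_6_25 - changesets_2_6_24

# Net growth of the tree in lines.
net_growth = lines_added - lines_removed

# Share of developers who contributed exactly one patch, in percent.
single_patch_share = 100.0 * single_patch_developers / developers

print(record_margin, net_growth, round(single_patch_share, 1))
```

Roughly a third of the developers involved contributed exactly one patch, which fits the article's picture of a large and growing contributor base.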
Here is an updated version of a plot that your editor has been fond of showing during talks in recent years:
This plot shows a cumulative count of lines changed over time, with kernel release dates added in. The effects of the merge window policy can be seen in the stair-step appearance of the plot. The steps appear to be getting bigger, but the time between releases has also increased slightly, so the overall rate of change remains roughly constant. It is a high rate, with over five million lines changed - well over half the total - in the last two years.
So who did this work? Here is the traditional table of the most active developers in the 2.6.25 series:
Most active 2.6.25 developers
By changesets
Developer | Changesets | Percent |
---|---|---|
Bartlomiej Zolnierkiewicz | 304 | 2.5% |
Patrick McHardy | 219 | 1.8% |
Adrian Bunk | 212 | 1.7% |
Ingo Molnar | 207 | 1.7% |
Paul Mundt | 204 | 1.7% |
Greg Kroah-Hartman | 171 | 1.4% |
Jesper Nilsson | 166 | 1.4% |
Thomas Gleixner | 164 | 1.3% |
Pavel Emelyanov | 155 | 1.3% |
Harvey Harrison | 148 | 1.2% |
Herbert Xu | 136 | 1.1% |
Mauro Carvalho Chehab | 136 | 1.1% |
Roland McGrath | 134 | 1.1% |
David Woodhouse | 134 | 1.1% |
Al Viro | 132 | 1.1% |
Michael Krufky | 128 | 1.0% |
Glauber Costa | 127 | 1.0% |
David S. Miller | 112 | 0.9% |
Andrew Morton | 109 | 0.9% |
Takashi Iwai | 104 | 0.8% |
By changed lines
Developer | Lines changed | Percent |
---|---|---|
Jesper Nilsson | 34407 | 3.7% |
David Howells | 29733 | 3.2% |
Eliezer Tamir | 26153 | 2.9% |
Adrian Bunk | 21998 | 2.4% |
Kumar Gala | 19753 | 2.2% |
Paul Mundt | 18918 | 2.1% |
Jiri Slaby | 18002 | 2.0% |
Glenn Streiff | 16597 | 1.8% |
Auke Kok | 13939 | 1.5% |
David Gibson | 11255 | 1.2% |
Michael Chan | 11254 | 1.2% |
Ingo Molnar | 10679 | 1.2% |
James Bottomley | 9907 | 1.1% |
Christoph Hellwig | 9784 | 1.1% |
Mauro Carvalho Chehab | 9332 | 1.0% |
Bartlomiej Zolnierkiewicz | 9108 | 1.0% |
Thomas Gleixner | 9104 | 1.0% |
Patrick McHardy | 8563 | 0.9% |
Michael Krufky | 8195 | 0.9% |
Takashi Iwai | 7825 | 0.9% |
There are some familiar names on this list, but also some new ones. Bartlomiej Zolnierkiewicz contributed more changesets than any other developer; his work is contained entirely within the IDE subsystem. Patrick McHardy works in the networking area, mostly (but not exclusively) with the netfilter subsystem. Adrian Bunk continues to make small fixes all over the tree and to relentlessly hunt down unused code for removal. Ingo Molnar remains busy in his new role as one of the x86 maintainers; scheduler work also accounts for a number of his changes. Paul Mundt maintains the SuperH architecture.
The picture is a little different when one considers how many lines of code were changed. Jesper Nilsson's work was done within the CRIS architecture. David Howells works all over the tree; his largest contribution was the addition of the MN10300 architecture code. Eliezer Tamir contributed the bnx2x (Broadcom Everest) network driver, and Kumar Gala works with the PowerPC architecture.
There is relatively little change in the lists of employers associated with all of this work (please remember that the numbers associated with employers are necessarily approximate):
Most active 2.6.25 employers
By changesets
Employer | Changesets | Percent |
---|---|---|
(None) | 1918 | 15.6% |
Red Hat | 1562 | 12.7% |
(Unknown) | 1232 | 10.0% |
Novell | 826 | 6.7% |
IBM | 758 | 6.2% |
Intel | 566 | 4.6% |
SWsoft | 266 | 2.2% |
Oracle | 250 | 2.0% |
Astaro | 219 | 1.8% |
(Academia) | 218 | 1.8% |
Renesas Technology | 217 | 1.8% |
Movial | 213 | 1.7% |
Axis Communications | 166 | 1.3% |
linutronix | 166 | 1.3% |
Freescale | 132 | 1.1% |
Qumranet | 127 | 1.0% |
 | 124 | 1.0% |
Analog Devices | 121 | 1.0% |
SGI | 118 | 1.0% |
(Consultant) | 111 | 0.9% |
By lines changed
Employer | Lines changed | Percent |
---|---|---|
(None) | 132117 | 14.4% |
(Unknown) | 117993 | 12.8% |
Red Hat | 103188 | 11.2% |
IBM | 59249 | 6.4% |
Freescale | 52336 | 5.7% |
Intel | 46466 | 5.1% |
Novell | 41790 | 4.5% |
Axis Communications | 39382 | 4.3% |
Broadcom | 37789 | 4.1% |
Renesas Technology | 23704 | 2.6% |
Movial | 22327 | 2.4% |
Hansen Partnership | 12076 | 1.3% |
Marvell | 11661 | 1.3% |
Oracle | 11214 | 1.2% |
linutronix | 10649 | 1.2% |
Astaro | 10167 | 1.1% |
(Consultant) | 9342 | 1.0% |
SWsoft | 7849 | 0.9% |
MontaVista | 7517 | 0.8% |
(Academia) | 7353 | 0.8% |
As usual, one can also look at who applies a Signed-off-by header to code for which they are not the author. These headers illustrate the chain of trust which gets code into the kernel. For 2.6.25, the top approvers of patches are:
Sign-offs in the 2.6.25 kernel
By developer
Developer | Sign-offs | Percent |
---|---|---|
Andrew Morton | 1513 | 12.2% |
David S. Miller | 1444 | 11.7% |
Ingo Molnar | 1153 | 9.3% |
Thomas Gleixner | 991 | 8.0% |
John W. Linville | 614 | 5.0% |
Jeff Garzik | 468 | 3.8% |
Mauro Carvalho Chehab | 447 | 3.6% |
Greg Kroah-Hartman | 345 | 2.8% |
Paul Mackerras | 307 | 2.5% |
James Bottomley | 306 | 2.5% |
Jaroslav Kysela | 292 | 2.4% |
Linus Torvalds | 249 | 2.0% |
Len Brown | 220 | 1.8% |
Russell King | 197 | 1.6% |
Takashi Iwai | 170 | 1.4% |
Avi Kivity | 167 | 1.4% |
Bryan Wu | 132 | 1.1% |
Herbert Xu | 123 | 1.0% |
Roland Dreier | 121 | 1.0% |
Kumar Gala | 107 | 0.9% |
By employer
Employer | Sign-offs | Percent |
---|---|---|
Red Hat | 4185 | 33.8% |
 | 1516 | 12.2% |
linutronix | 994 | 8.0% |
(None) | 883 | 7.1% |
IBM | 689 | 5.6% |
Novell | 611 | 4.9% |
(Unknown) | 534 | 4.3% |
Intel | 468 | 3.8% |
Hansen Partnership | 306 | 2.5% |
Linux Foundation | 254 | 2.1% |
(Consultant) | 242 | 2.0% |
Qumranet | 170 | 1.4% |
Oracle | 126 | 1.0% |
SGI | 126 | 1.0% |
Freescale | 121 | 1.0% |
Cisco | 121 | 1.0% |
Analog Devices | 115 | 0.9% |
Astaro | 107 | 0.9% |
Renesas Technology | 82 | 0.7% |
Movial | 78 | 0.6% |
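For readers wondering how a "non-author sign-off" count might be derived, here is a minimal sketch. It is not the code used to produce the tables above; it assumes each commit is available as a dictionary with an `author` name and the full `message` text, and the function name, the sample commits and the example.org addresses are all hypothetical.

```python
import re
from collections import Counter

# Matches lines such as "Signed-off-by: Some Name <address>" and
# captures the name part. Only spaces and tabs are allowed as padding
# so that a match never runs across a line boundary.
SIGNOFF_RE = re.compile(
    r"^[ \t]*Signed-off-by:[ \t]*(.+?)[ \t]*<[^>]*>[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)

def count_non_author_signoffs(commits):
    """Count sign-offs per person, ignoring each commit's own author."""
    counts = Counter()
    for commit in commits:
        author = commit["author"].strip().lower()
        seen = set()  # count each person at most once per commit
        for name in SIGNOFF_RE.findall(commit["message"]):
            key = name.strip().lower()
            if key == author or key in seen:
                continue
            seen.add(key)
            counts[name.strip()] += 1
    return counts

# Hypothetical sample data, for illustration only.
sample_commits = [
    {
        "author": "Alice Author",
        "message": "fix foo\n\n"
                   "Signed-off-by: Alice Author <alice@example.org>\n"
                   "Signed-off-by: Mary Maintainer <mary@example.org>\n",
    },
    {
        "author": "Bob Builder",
        "message": "add bar\n\n"
                   "Signed-off-by: Bob Builder <bob@example.org>\n"
                   "Signed-off-by: Mary Maintainer <mary@example.org>\n"
                   "Signed-off-by: Tom Tree <tom@example.org>\n",
    },
]

signoff_counts = count_non_author_signoffs(sample_commits)
for name, count in signoff_counts.most_common():
    print(name, count)
```

Matching people by name alone is a simplification; a real tool would also have to cope with developers who use several names or email addresses.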
Some of these developers are quite busy; Andrew Morton is signing off more than twenty patches every day - weekends included. The gatekeepers to the kernel continue to work for a relatively small number of companies, with the top ten employers accounting for over 75% of all non-author signoffs.
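The two claims in this paragraph can be checked against the sign-off tables. The sketch below is only illustrative arithmetic: the helper name is made up, and the one employer row whose name is missing from this copy of the table is given the placeholder label "unnamed row". Whether the parenthesized categories such as (None) and (Unknown) should count as employers is a judgment call, so both readings are computed.

```python
# First rows of the "By employer" sign-off table, in table order:
# (label, sign-offs, percent). "unnamed row" is a placeholder for the
# entry whose employer name is missing in this copy of the table.
employer_rows = [
    ("Red Hat", 4185, 33.8),
    ("unnamed row", 1516, 12.2),
    ("linutronix", 994, 8.0),
    ("(None)", 883, 7.1),
    ("IBM", 689, 5.6),
    ("Novell", 611, 4.9),
    ("(Unknown)", 534, 4.3),
    ("Intel", 468, 3.8),
    ("Hansen Partnership", 306, 2.5),
    ("Linux Foundation", 254, 2.1),
    ("(Consultant)", 242, 2.0),
    ("Qumranet", 170, 1.4),
    ("Oracle", 126, 1.0),
]

def top_share(rows, n=10, skip_categories=False):
    """Sum the percentages of the first n rows.

    With skip_categories=True, parenthesized pseudo-employers such as
    "(None)" are dropped before the first n rows are taken.
    """
    if skip_categories:
        rows = [row for row in rows if not row[0].startswith("(")]
    return round(sum(percent for _, _, percent in rows[:n]), 1)

share_first_ten_rows = top_share(employer_rows)
share_ten_named = top_share(employer_rows, skip_categories=True)

# At exactly twenty sign-offs per day, Andrew Morton's 1513 sign-offs
# would take this many days; a shorter period means a higher daily rate.
days_at_twenty_per_day = 1513 / 20

print(share_first_ten_rows, share_ten_named, days_at_twenty_per_day)
```

Either way of counting puts the top ten above 75%, and the second reading lands just over that mark.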
All told, these numbers paint a picture of a development process which is healthy and continues to set a fast pace. It incorporates work from an increasingly large community of developers who are able to work in a highly cooperative manner despite the fact that their employers are fierce competitors. There are very few projects like it.
(Thanks to Greg Kroah-Hartman for his help in the creation of these statistics.)
Index entries for this article | |
---|---|
Kernel | Releases/2.6.25 |
Posted Apr 3, 2008 12:34 UTC (Thu)
by lacostej (guest, #2760)
Posted Apr 4, 2008 20:41 UTC (Fri)
by roelofs (guest, #2599)
2.6.16 is mostly off the graph. Are you off by one?
Greg
Posted Apr 5, 2008 5:27 UTC (Sat)
by lacostej (guest, #2760)
yep. I think I got the others right.
Posted Apr 6, 2008 10:28 UTC (Sun)
by bunk (subscriber, #44933)
Posted Apr 6, 2008 10:59 UTC (Sun)
by bunk (subscriber, #44933)
Posted Apr 7, 2008 21:23 UTC (Mon)
by pr1268 (guest, #24648)
First of all, many thanks to Linus (for git and its repository enabling the compilation of these statistics), Greg KH, Amanda, and our editor (for the article and gitdm), René Descartes (for the coordinate system bearing his name on which a function like ΔLOC/Δt can be plotted), and to Sir Isaac Newton and Gottfried Leibniz for discovering (inventing?) differential calculus, all of which allows me to comment on, and ask about, our editor's informative graph:
Overall, the entire plot is relatively constant slope (with a barely-perceivable increase in more recent times)--this means that the rate of kernel development appears to be in a state of steady development, with a hint of increased patch submission rate in the newer releases.
However, there's one thing I notice mildly unusual: the steep vertical rises of new code submissions in the patch windows of 2.6.23 and .24 are taller than previous ones, with their matching stabilization periods correspondingly longer. Yet, with the overall slope nearly constant, this would seem to imply that the stabilization period (shallow slope) is proportional to the number of patches submitted at the release window (steep slope).
Are the stabilization periods becoming longer because of the additional time to test all the new code? Or, is more code being submitted at the release window due to the longer time between releases?
Thanks again!
P.S. I calculate ΔLOC/Δt ≅ (5.5M - .5M) LOC / 2 yr ≅ 1 new LOC every 12.6144 seconds.
Where 2.6.25 came from
What would be interesting is to compare information found in the first graph, e.g.
* size of the 'stair-step'
* comparison of the stair size with the release period length
* stabilization (e.g. inclination angle of the slope after the stair)
* average amount of reviews per change
* ...
with external information (perceived or computed stability, number of security issues introduced in a release, holidays)
From my readings of the graph:
* the last 2 kernels have very steep merge
* 2.6.19 started merging late
* 2.6.22 has had more late merges than its successors
* 2.6.20 seems to have stabilized early
* 2.6.16 seems to have had a bad stabilization
Where 2.6.25 came from
"2.6.16 is mostly off the graph. Are you off by one?"
Where 2.6.25 came from
You said:
From my readings of the graph
the last 2 kernels have very steep merge. 2.6.19 started merging late
2.6.22 has had more late merges than its successors
2.6.20 seems to have stabilized early
Don't believe any statistics you haven't faked yourself.
There is no strong relation between the number of changes and the number of lines changed.
And neither of them has any strong relation to when a kernel stabilizes.
An increase in the "count of lines changed over time" graph can e.g. be one or more of the following:
* many bugfixes
* defconfig updates
* addition or removal of drivers
It's a nice graph, but when you ignore the fact that it contains zero information about *why* lines changed, all conclusions you draw are invalid.
Where 2.6.25 came from
I just noticed I forgot to add a smiley after the "Don't believe any statistics you haven't faked yourself."
I don't claim this graph was faked (I haven't checked, but it's most likely correct).
But it's important to realize that the scale of the graph means that, for example, the step you see one month before the release of 2.6.23 can easily equal 100,000 lines of code (the re-addition of the sk98lin driver alone was over 40,000 lines of code).
Stabilization, also known as bugfixing, tends to consist of small patches (usually < 100 lines changed, often even < 10 lines changed).
And these patches are simply too small to have any visible effect on a "count of lines changed" graph for the Linux kernel.
Check e.g. http://lwn.net/Articles/274992/ for an explanation by Linus himself regarding what causes most of the line changes later in the release cycle.
Comments and questions about graph