
Who's writing 2.6.21 and related issues

Our article Who wrote 2.6.20?, which appeared two weeks ago, generated a strong response. There is, it seems, a lot of interest in where this code is coming from, but nobody had gotten around to doing the crunching to figure it out. That article calls for a followup in a few ways.

First, those who saw the article early on may want to take another look, as some of the tables have been changed. There was only one serious mistake to fix - one developer's affiliation was incorrectly guessed by the code - but further information has also helped to shrink the "unknown" column somewhat. The original tables can be found from the article (for whatever historical reasons may exist), but the tables in the article itself are the current ones.

The 2.6.21 cycle has moved far enough along as of this writing (the 2.6.21-rc3 prepatch is due any time) that it's worth taking a look at the statistics for the just over 4,000 changesets which have been merged. There are some familiar names here, but some new ones as well. They reflect the different nature of this development cycle: 2.6.21 will have fewer changes in the virtualization area, for example, but it has some significant core changes (like the clockevents and dynamic tick work). A somewhat different set of developers had work ready to merge this time around, and the results show that.

Anyway, the developers with the most work merged this time around are:

Most active 2.6.21 developers

By changesets                       By lines changed
Eric W. Biederman     104   2.5%    Adrian Bunk          24097   6.1%
Ralf Baechle           77   1.9%    Divy Le Ray          18255   4.6%
Adrian Bunk            71   1.7%    Ben Dooks            17510   4.4%
Bob Moore              66   1.6%    Andrew Victor        13877   3.5%
Andrew Morton          54   1.3%    Ralf Baechle          9905   2.5%
Takashi Iwai           54   1.3%    YOSHIFUJI Hideaki     9505   2.4%
Robert P. J. Day       53   1.3%    Steve Wise            9418   2.4%
Jeff Dike              52   1.3%    Jeff Garzik           7014   1.8%
Jiri Slaby             51   1.2%    Vitaly Bordug         6387   1.6%
Ben Dooks              50   1.2%    Thomas Gleixner       6078   1.5%
Tejun Heo              48   1.2%    Bob Moore             6055   1.5%
Al Viro                48   1.2%    Ishizaki Kou          5912   1.5%
David Brownell         47   1.1%    Richard Purdie        5909   1.5%
YOSHIFUJI Hideaki      44   1.1%    Liam Girdwood         5773   1.5%
Mike Isely             43   1.1%    Frank Mandarino       5284   1.3%
Thomas Gleixner        38   0.9%    Jay Cliburn           5182   1.3%
Randy Dunlap           38   0.9%    Tejun Heo             5120   1.3%
Stephen Hemminger      36   0.9%    Kumar Gala            5044   1.3%
Alan Cox               35   0.9%    Martin Schwidefsky    4729   1.2%
Michael Krufky         32   0.8%    Olof Johansson        4659   1.2%
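
For readers wondering how per-developer numbers of this kind can be produced, here is a minimal sketch. It is not the script used to generate the tables above. It assumes input shaped like the output of `git log --no-merges --numstat --format='@%an'`, and it assumes that "lines changed" for a file is the larger of the added and removed counts; the article does not say which definition was actually used. The function names `tally` and `top` are made up for this illustration.

```python
from collections import Counter


def tally(log_text):
    """Count changesets and lines changed per author.

    Assumed input: one "@Author Name" line per commit, followed by
    numstat lines of the form "added<TAB>removed<TAB>path".
    """
    changesets = Counter()
    lines = Counter()
    author = None
    for line in log_text.splitlines():
        if line.startswith("@"):
            # A new commit begins; remember who wrote it.
            author = line[1:].strip()
            changesets[author] += 1
            continue
        parts = line.split("\t")
        if author is None or len(parts) != 3:
            continue  # blank line or something that is not numstat output
        added, removed, _path = parts
        if added == "-" or removed == "-":
            continue  # git prints "-" for binary files
        # Assumption: a file's "lines changed" is max(added, removed).
        lines[author] += max(int(added), int(removed))
    return changesets, lines


def top(counter, n=20):
    """Return the n largest entries as (name, count, percent of total)."""
    total = sum(counter.values())
    return [(name, count, round(100.0 * count / total, 1))
            for name, count in counter.most_common(n)]
```

In a kernel tree, the input text could come from something like `git log --no-merges --numstat --format='@%an' v2.6.20..v2.6.21-rc3`, with `top(changesets)` and `top(lines)` then giving the two columns of a table like the one above.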

On the side of removing code, the list of names remains about the same:

Developers with the most lines removed

Adrian Bunk          23720  12.8%
Jeff Garzik           6808   3.7%
Paul Mundt            2442   1.3%
Bob Moore             1526   0.8%
Len Brown             1244   0.7%
Alexey Starikovskiy    987   0.5%
Jiri Slaby             954   0.5%
Kenji Kaneshige        661   0.4%
Eric Sandeen           609   0.3%
Tim Schmielau          547   0.3%

Adrian Bunk continues to remove code from the kernel at an amazing rate. Also about the same is the table of signoffs:

Developers with the most signoffs (total 8614)

Andrew Morton           1000  11.6%
Linus Torvalds           865  10.0%
Jeff Garzik              346   4.0%
Jaroslav Kysela          224   2.6%
Greg Kroah-Hartman       224   2.6%
David Miller             208   2.4%
Mauro Carvalho Chehab    206   2.4%
Len Brown                202   2.3%
Takashi Iwai             187   2.2%
Ralf Baechle             156   1.8%
Russell King             153   1.8%
Paul Mackerras           151   1.8%
James Bottomley          114   1.3%
Eric W. Biederman        105   1.2%
Adrian Bunk               99   1.1%
Andi Kleen                94   1.1%
Alexey Starikovskiy       82   1.0%
Kyle McMartin             79   0.9%
David Brownell            78   0.9%
Ingo Molnar               68   0.8%

The list of developers contributing code to a given kernel release can change over time, but the people through whom those patches pass - the subsystem maintainers - remain about the same. These developers form the infrastructure which does the work of getting reviewed code into the mainline kernel.
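
Signoff counts like those above can be approximated by scanning commit messages for Signed-off-by lines. The sketch below is an illustration under stated assumptions, not the code behind the table: it takes a list of commit message strings, counts each name at most once per commit, and identifies people by the name before the email address (so one person using two spellings would be counted twice). The name `count_signoffs` is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as "Signed-off-by: Some Name <some@example.org>",
# allowing leading indentation as shown by "git log".
SIGNOFF = re.compile(
    r"^[ \t]*Signed-off-by:[ \t]*(.+?)[ \t]*<[^>\n]*>[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def count_signoffs(messages):
    """Return a Counter mapping each signer's name to a signoff count."""
    counts = Counter()
    for message in messages:
        # Assumption: a name repeated within one commit counts only once.
        for name in set(SIGNOFF.findall(message)):
            counts[name] += 1
    return counts
```

Because a patch collects a signoff from each person it passes through on the way to the mainline, subsystem maintainers naturally accumulate far more signoffs than they author changesets.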

Here are the by-employer tables for 2.6.21-rc:

Top contributors by employer

By changesets                       By lines changed
(Unknown)            1108  27.1%    (Unknown)                  85436  21.5%
(None)                380   9.3%    (None)                     52312  13.2%
Red Hat               304   7.4%    IBM                        28186   7.1%
Intel                 280   6.8%    Intel                      20778   5.2%
IBM                   259   6.3%    Red Hat                    19007   4.8%
Novell                258   6.3%    Novell                     18702   4.7%
Linux Foundation      159   3.9%    Chelsio                    18361   4.6%
Linux Networx         104   2.5%    Simtec                     17545   4.4%
(Consultant)          100   2.4%    SANPeople                  13949   3.5%
Oracle                 89   2.2%    MIPS Technologies          12646   3.2%
MIPS Technologies      77   1.9%    Open Grid Computing         9442   2.4%
Google                 61   1.5%    MontaVista                  8861   2.2%
MontaVista             55   1.3%    Toshiba                     7462   1.9%
SGI                    54   1.3%    Wolfson Microelectronics    7379   1.9%
Simtec                 50   1.2%    Sony                        7061   1.8%
Nokia                  41   1.0%    Freescale                   6993   1.8%
TimeSys                38   0.9%    TimeSys                     6184   1.6%
Sony                   36   0.9%    Endrelia                    5421   1.4%
HP                     35   0.9%    Nokia                       4790   1.2%
Toshiba                34   0.8%    Renesas Technology          4740   1.2%

Many of the names are the same, but Red Hat does not dominate to quite the same extent as in 2.6.20. The percentage of patches contributed by developers known to be working on their own time has increased slightly.

Finally, some commenters on the original article requested the release of the code used to generate the numbers. Your editor has some qualms about doing so. The biggest among them is not that the code is an embarrassing hack with, presumably, at least one bug still in it. Neither is it the fact that the code could be seen as a competitive tool for LWN; frankly, there's nothing that complicated there.

The biggest worry is related to the attention these numbers drew, and the fact that a couple of developers have mailed in to note that they have received job offers as a result of appearing in the LWN lists. In addition, a few employers have contacted us to be sure that their "account" is credited with the work of all of their employees. The numbers your editor has generated are approximations, but some people clearly see them as being important.

The editors at LWN have an interest in covering the free software community while minimizing the changes that such coverage might cause - most of the time, at least. It seems plausible that, if the "top 20 contributors list" is seen as a desirable place to appear - with positive career benefits - developers might change their behavior as a result. It would be a shame to start seeing kernel patches aimed mainly at increasing a developer's count of lines changed. Such patches, one assumes, would not fare well in the review process, but it would be better if the situation did not come up at all.

The issue of the mapping between developers and their employers is also worth some consideration. Some of that information was obtained directly from the developers with a promise not to disclose it further; that promise must be kept. Beyond that, developers tend to change employers over time, and the code is not currently smart enough to deal with that. This shortcoming is not a problem when looking at a single release cycle, but it clearly would be an issue for multi-year analysis. The code could be improved, but it's not at all clear that the maintenance and distribution of a database of kernel developers' work histories is something LWN wants to get into. There are serious privacy issues to consider.
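
To make the employer-change problem concrete, here is one way a date-aware lookup could be structured. This is a sketch of a possible approach, not a description of the released code: the `HISTORY` table, its entries, and the `employer_at` function are invented for illustration, and the addresses and company names are placeholders.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical data: for each address, a list of (start date, employer)
# pairs sorted by start date. Real data would raise the privacy
# questions discussed above.
HISTORY = {
    "dev@example.org": [
        (date(2005, 1, 1), "Example Corp"),
        (date(2006, 9, 1), "Another Inc"),
    ],
}


def employer_at(email, when, history=HISTORY):
    """Return the employer on record for `email` on the date `when`."""
    stints = history.get(email)
    if not stints:
        return "(Unknown)"
    starts = [start for start, _employer in stints]
    # Find the last stint that began on or before the commit date.
    index = bisect_right(starts, when)
    if index == 0:
        return "(Unknown)"  # commit predates the first known employer
    return stints[index - 1][1]
```

With a table like this, a changeset would be credited to whichever employer was on record at its commit date rather than to a single fixed affiliation.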

Despite these worries, the code is being released. In the end, it's not as if somebody else would have all that much trouble reproducing it. Some of the employer information has been taken out in response to the concerns outlined above, though. A tarball of the initial release can be found here; your editor is looking forward to the flood of patches which will improve the system.

Index entries for this article
Kernel: Development model
Kernel: Development model/Contributor statistics
Kernel: Releases/2.6.21



Who's writing 2.6.21 and related issues

Posted Mar 8, 2007 3:32 UTC (Thu) by pj (subscriber, #4506) [Link]

Good data points... the obvious next step is to run it on kernels farther back in history and build some graphs from the resulting dataset to show how, say, Red Hat's kernel input has changed over time, etc.

Measuring the impact

Posted Mar 8, 2007 5:14 UTC (Thu) by proski (subscriber, #104) [Link] (1 responses)

If we consider the impact of the changes, that whole metric is meaningless. For me personally, 2.6.21 will be the first kernel to support the Broadcom wireless card in my laptop reliably enough for day-to-day use with real-life access points, without any patches and tricks. Many thousands of users will be able to dump ndiswrapper and stop running Windows software in kernel mode on their laptops (and maybe on some desktops too).

Yet people who made it possible are not in any top-list. That's probably because the Broadcom driver has been in the kernel for some time. What's being done now is perhaps the hardest part, namely fixing the bugs that were preventing the driver from working. Some bugs affect only specific revisions of the hardware, so the developers must rely on users' reports or actually buy that hardware.

Painstaking as it is, debugging the hard problems doesn't generate many commits, nor does it change many lines of code. Yet that's what makes the most difference.

Those two developers need to be mentioned in the comment since they are not in the article. Many thanks to you, Larry Finger and Michael Buesch!

bcm43xx

Posted Mar 8, 2007 6:04 UTC (Thu) by ncm (guest, #165) [Link]

Hear, hear!

That's not to say that the bcm-4311 in my Dell actually works right yet. My latest experience is that the bcm43xx driver works only after bcm34xx-d80211 has been loaded and then unloaded. Evidently the latter code knows more about how to turn the hardware on. In particular, the indicator LED only operates after that driver has been loaded. However, it's the bcm43xx driver that succeeds in communicating with the hub.

Who's writing 2.6.21 and related issues

Posted Mar 8, 2007 8:50 UTC (Thu) by ortalo (guest, #4654) [Link]

Argh! Too late, you released the code...
It would have been a good opportunity to raise developers' funding and, more generally, to help them find a good new job: list them in some "edited list with manual weighting" generated by your tool (generally only after a friendly request). You would certainly need to take care not to associate such friendly "ranking" with too much free beer - we do not want to lose you too early.
Hmm, maybe that would be cheating however... or some form of clever lobbying? Well, it's just statistics after all. By definition, they are worse than damn lies, so, someone will certainly abuse them, why not just ensure that they are abused the right way...;-).

I am joking of course!

Who's writing 2.6.21 and related issues

Posted Mar 8, 2007 10:11 UTC (Thu) by nix (subscriber, #2304) [Link] (3 responses)

The employers who consider appearing far up the affiliation list to be important seem to me to be missing the point. It's not the employers who do the work: it's the *people*.

(This seems to be a mistake made by nontechnical managers really quite a lot, presumably because otherwise it would be plain that they weren't actually doing any productive work. ;} ;} )

Do employers deserve any credit?

Posted Mar 8, 2007 14:12 UTC (Thu) by kevinbsmith (guest, #4778) [Link]

Sure, the people do the work. But many of them could not (or would not) do that work if the employers weren't paying them to do so. Those employers are contributing significant value to the system, and deserve some kind of credit. (I realize that they have "selfish" reasons for contributing, but then that's true for most if not all individuals as well.)

By appearing higher on the list, employers may be able to attract better talent, and make some extra sales, not to mention boosting their stock value a bit. I don't think they are missing the point at all.

Who's writing 2.6.21 and related issues

Posted Mar 8, 2007 14:24 UTC (Thu) by massimiliano (subscriber, #3048) [Link] (1 responses)

Well, managers and employers don't do anything useful, except paying the bills...

...which, for developers who don't like messing with business issues (because only writing code is fun, and "productive"), can be a real godsend!

In other words, the article contains two separate contribution tables: by developer, and by employer.
They are distinct for a good reason: they measure two different things.
But neither of them would exist without the other.

And this comes from a developer who strongly feels that writing code is the real thing, and that management layers are typically mostly useless overhead.
But if I didn't have Novell paying my bills, I would hardly have the time to contribute anything...

Who's writing 2.6.21 and related issues

Posted Mar 8, 2007 22:52 UTC (Thu) by nix (subscriber, #2304) [Link]

(In case anyone wonders, my tongue was so far in my cheek when I wrote
that that I'm surprised it didn't pop through. :) )

but some people clearly seem them as being

Posted Mar 8, 2007 11:19 UTC (Thu) by xav (guest, #18536) [Link]

s/seem/see/

You can remove this comment after use.

Who's writing 2.6.21 and related issues

Posted Mar 8, 2007 16:55 UTC (Thu) by PhilHannent (guest, #1241) [Link] (3 responses)

With the recent complaints about changesets breaking the kernel, it raises the question of why there isn't a tinderbox [1] for the kernel. Is there a reason why it's not possible?

[1] http://www.mozilla.org/tinderbox.html

Who's writing 2.6.21 and related issues

Posted Mar 9, 2007 8:44 UTC (Fri) by dlang (guest, #313) [Link] (1 responses)

well, we have releases, with release candidates between them, daily snapshots of Linus' git tree, and the raw git tree itself (updated frequently by Linus, but not on a fixed schedule)

we have make targets of allyesconfig, allnoconfig, allmodconfig, randconfig (among others) to make the job of automating tests easier

we have sets of systems that work to compile lots of sets of options from the release candidates (and if they have time, daily snapshots)

what more do you think needs to be done?

remember, the problem isn't that the change breaks things for everyone, it can break something for someone with some combination of compile options. the number of combinations to compile to cover every possibility is far too high to take place in the real world (how many years do you want to wait between when development stops and testing begins before a release is made?)

the thing that helps is people with different requirements testing at least the release candidates. while it doesn't guarantee that there won't be any problems, the more combinations that get tested the more bugs get found early (and the earlier a group/organization runs tests with their requirements, the earlier any bugs that would affect them get found and fixed)

Who's writing 2.6.21 and related issues

Posted Mar 9, 2007 9:28 UTC (Fri) by PhilHannent (guest, #1241) [Link]

Thank you very much for taking the time to explain it, I understand now.

Regards
Phil Hannent

Who's writing 2.6.21 and related issues

Posted Mar 10, 2007 15:39 UTC (Sat) by dankohn (guest, #6006) [Link]

Take a look at Martin Bligh's http://test.kernel.org/.

Who's writing 2.6.21 and related issues

Posted Mar 9, 2007 7:45 UTC (Fri) by jzbiciak (guest, #5246) [Link]

Hmmm... For serious longitudinal studies of the kernel, it might make sense to assign more anonymous monikers to the tracked individuals and entities. Those monikers could be qualified by various details, but still left sufficiently vague that random people would find it hard to game the system.

Obviously, certain individuals will always stand out regardless of what you might do in this analysis. But, these same people stand out anyway. So, you prevent the otherwise-would-be-unknowns from getting unfair "play," as it were, and avoid seriously perturbing the development process.

Not a single woman in the first 20 contributors!

Posted Mar 15, 2007 8:39 UTC (Thu) by irios (guest, #19838) [Link] (1 responses)

In what place does the first girl appear in the contributor list?
There were about 20% girls in my class when I studied Electrical Engineering, and there were more of them in Computer Science. Where are they?

Not a single woman in the first 20 contributors!

Posted Mar 16, 2007 9:08 UTC (Fri) by slamb (guest, #1070) [Link]

Where did you go to school, and why didn't I go there?


Copyright © 2007, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds