Who's writing 2.6.21 and related issues
First, those who saw the article early on may want to take another look, as some of the tables have been changed. There was only one serious mistake to fix - one developer's affiliation was incorrectly guessed by the code - but further information has also helped to shrink the "unknown" column somewhat. The original tables can be found from the article (for whatever historical reasons may exist), but the tables in the article itself are the current ones.
The 2.6.21 cycle has moved far enough along as of this writing (the 2.6.21-rc3 prepatch is due any time) that it's worth taking a look at the statistics for the just over 4,000 changesets which have been merged. There are some familiar names here, but some new ones as well. They reflect the different nature of this development cycle: 2.6.21 will have fewer changes in the virtualization area, for example, but it has some significant core changes (like the clockevents and dynamic tick work). A somewhat different set of developers had work ready to merge this time around, and the results show that.
Anyway, the developers with the most work merged this time around are:
Most active 2.6.21 developers
By changesets | Changesets | % | By lines changed | Lines | % |
---|---|---|---|---|---|
Eric W. Biederman | 104 | 2.5% | Adrian Bunk | 24097 | 6.1% |
Ralf Baechle | 77 | 1.9% | Divy Le Ray | 18255 | 4.6% |
Adrian Bunk | 71 | 1.7% | Ben Dooks | 17510 | 4.4% |
Bob Moore | 66 | 1.6% | Andrew Victor | 13877 | 3.5% |
Andrew Morton | 54 | 1.3% | Ralf Baechle | 9905 | 2.5% |
Takashi Iwai | 54 | 1.3% | YOSHIFUJI Hideaki | 9505 | 2.4% |
Robert P. J. Day | 53 | 1.3% | Steve Wise | 9418 | 2.4% |
Jeff Dike | 52 | 1.3% | Jeff Garzik | 7014 | 1.8% |
Jiri Slaby | 51 | 1.2% | Vitaly Bordug | 6387 | 1.6% |
Ben Dooks | 50 | 1.2% | Thomas Gleixner | 6078 | 1.5% |
Tejun Heo | 48 | 1.2% | Bob Moore | 6055 | 1.5% |
Al Viro | 48 | 1.2% | Ishizaki Kou | 5912 | 1.5% |
David Brownell | 47 | 1.1% | Richard Purdie | 5909 | 1.5% |
YOSHIFUJI Hideaki | 44 | 1.1% | Liam Girdwood | 5773 | 1.5% |
Mike Isely | 43 | 1.1% | Frank Mandarino | 5284 | 1.3% |
Thomas Gleixner | 38 | 0.9% | Jay Cliburn | 5182 | 1.3% |
Randy Dunlap | 38 | 0.9% | Tejun Heo | 5120 | 1.3% |
Stephen Hemminger | 36 | 0.9% | Kumar Gala | 5044 | 1.3% |
Alan Cox | 35 | 0.9% | Martin Schwidefsky | 4729 | 1.2% |
Michael Krufky | 32 | 0.8% | Olof Johansson | 4659 | 1.2% |
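As a rough illustration of how per-developer numbers like those above can be derived (this is not the code used for the article), the sketch below tallies changesets and changed lines from text shaped like the output of `git log --numstat --format='@%an'`. The function names `tally` and `table`, the `@` marker, and the sample data are all hypothetical, and "lines changed" is assumed here to mean added plus removed lines; the article does not say how that figure was defined.

```python
from collections import Counter


def tally(log_text):
    """Count changesets and changed lines per author.

    Expects text shaped like the output of
        git log --numstat --format='@%an'
    that is, one '@Author' line per changeset followed by
    'added<TAB>removed<TAB>path' lines.
    """
    changesets = Counter()
    lines = Counter()
    author = None
    for line in log_text.splitlines():
        if line.startswith("@"):
            author = line[1:].strip()
            changesets[author] += 1
        elif line.strip() and author is not None:
            added, removed, _path = line.split("\t", 2)
            if added == "-":
                # git prints "-" for both counts on binary files; skip them.
                continue
            # Assumption: "lines changed" is added + removed.
            lines[author] += int(added) + int(removed)
    return changesets, lines


def table(counter):
    """Return (name, count, percent-of-total) rows, largest first."""
    total = sum(counter.values())
    return [(name, n, round(100.0 * n / total, 1))
            for name, n in counter.most_common()]


# Hypothetical sample: three changesets, one touching only a binary file.
SAMPLE = (
    "@Alice\n\n3\t1\ta.c\n10\t0\tb.c\n"
    "@Bob\n\n0\t6\tc.c\n"
    "@Alice\n\n-\t-\tlogo.png\n"
)

changesets, lines = tally(SAMPLE)
print(table(changesets))  # [('Alice', 2, 66.7), ('Bob', 1, 33.3)]
print(table(lines))       # [('Alice', 14, 70.0), ('Bob', 6, 30.0)]
```

The two rankings can diverge sharply, as they do in the table: a developer with many small fixes leads the changeset column, while a single large driver submission or removal dominates the line count.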
On the side of removing code, the list of names remains about the same:
Developers with the most lines removed
Developer | Lines removed | % |
---|---|---|
Adrian Bunk | 23720 | 12.8% |
Jeff Garzik | 6808 | 3.7% |
Paul Mundt | 2442 | 1.3% |
Bob Moore | 1526 | 0.8% |
Len Brown | 1244 | 0.7% |
Alexey Starikovskiy | 987 | 0.5% |
Jiri Slaby | 954 | 0.5% |
Kenji Kaneshige | 661 | 0.4% |
Eric Sandeen | 609 | 0.3% |
Tim Schmielau | 547 | 0.3% |
Adrian Bunk continues to remove code from the kernel at an amazing rate. Also about the same is the table of signoffs:
Developers with the most signoffs (total 8614)
Developer | Signoffs | % |
---|---|---|
Andrew Morton | 1000 | 11.6% |
Linus Torvalds | 865 | 10.0% |
Jeff Garzik | 346 | 4.0% |
Jaroslav Kysela | 224 | 2.6% |
Greg Kroah-Hartman | 224 | 2.6% |
David Miller | 208 | 2.4% |
Mauro Carvalho Chehab | 206 | 2.4% |
Len Brown | 202 | 2.3% |
Takashi Iwai | 187 | 2.2% |
Ralf Baechle | 156 | 1.8% |
Russell King | 153 | 1.8% |
Paul Mackerras | 151 | 1.8% |
James Bottomley | 114 | 1.3% |
Eric W. Biederman | 105 | 1.2% |
Adrian Bunk | 99 | 1.1% |
Andi Kleen | 94 | 1.1% |
Alexey Starikovskiy | 82 | 1.0% |
Kyle McMartin | 79 | 0.9% |
David Brownell | 78 | 0.9% |
Ingo Molnar | 68 | 0.8% |
The list of developers contributing code to a given kernel release can change over time, but the people through whom those patches pass - the subsystem maintainers - remain about the same. These developers form the infrastructure which does the work of getting reviewed code into the mainline kernel.
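Signoff counts of this kind can be approximated by scanning commit messages for `Signed-off-by:` lines. The following is a minimal sketch under that assumption, not the method used for the table; the `count_signoffs` helper, the regular expression, and the sample messages are all hypothetical.

```python
import re
from collections import Counter

# Matches lines such as "Signed-off-by: Some Name <name@example.com>"
# and captures the name part only.
SIGNOFF = re.compile(
    r"^\s*Signed-off-by:\s*(.+?)\s*<[^>]*>\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def count_signoffs(messages):
    """Count Signed-off-by lines per name across commit messages."""
    counts = Counter()
    for message in messages:
        for name in SIGNOFF.findall(message):
            counts[name] += 1
    return counts


# Hypothetical sample: two patches that both passed through one maintainer.
MESSAGES = [
    "Fix foo\n\n"
    "Signed-off-by: A Developer <a@example.com>\n"
    "Signed-off-by: M Maintainer <m@example.com>\n",
    "Tidy bar\n\n"
    "Signed-off-by: M Maintainer <m@example.com>",
]

for name, n in count_signoffs(MESSAGES).most_common():
    print(name, n)
```

Because every maintainer on a patch's path adds a signoff, the total number of signoffs exceeds the number of changesets, which is consistent with 8614 signoffs on just over 4,000 changesets.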
Here are the by-employer tables for 2.6.21-rc:
Top contributors by employer
By changesets | Changesets | % | By lines changed | Lines | % |
---|---|---|---|---|---|
(Unknown) | 1108 | 27.1% | (Unknown) | 85436 | 21.5% |
(None) | 380 | 9.3% | (None) | 52312 | 13.2% |
Red Hat | 304 | 7.4% | IBM | 28186 | 7.1% |
Intel | 280 | 6.8% | Intel | 20778 | 5.2% |
IBM | 259 | 6.3% | Red Hat | 19007 | 4.8% |
Novell | 258 | 6.3% | Novell | 18702 | 4.7% |
Linux Foundation | 159 | 3.9% | Chelsio | 18361 | 4.6% |
Linux Networx | 104 | 2.5% | Simtec | 17545 | 4.4% |
(Consultant) | 100 | 2.4% | SANPeople | 13949 | 3.5% |
Oracle | 89 | 2.2% | MIPS Technologies | 12646 | 3.2% |
MIPS Technologies | 77 | 1.9% | Open Grid Computing | 9442 | 2.4% |
 | 61 | 1.5% | MontaVista | 8861 | 2.2% |
MontaVista | 55 | 1.3% | Toshiba | 7462 | 1.9% |
SGI | 54 | 1.3% | Wolfson Microelectronics | 7379 | 1.9% |
Simtec | 50 | 1.2% | Sony | 7061 | 1.8% |
Nokia | 41 | 1.0% | Freescale | 6993 | 1.8% |
TimeSys | 38 | 0.9% | TimeSys | 6184 | 1.6% |
Sony | 36 | 0.9% | Endrelia | 5421 | 1.4% |
HP | 35 | 0.9% | Nokia | 4790 | 1.2% |
Toshiba | 34 | 0.8% | Renesas Technology | 4740 | 1.2% |
Many of the names are the same, but Red Hat does not dominate to quite the same extent as in 2.6.20. The percentage of patches contributed by developers known to be working on their own time has increased slightly.
Finally, some commenters on the original article requested the release of the code used to generate the numbers. Your editor has some qualms about doing so. The biggest among them is not that the code is an embarrassing hack with, presumably, at least one bug still in it. Neither is it the fact that the code could be seen as a competitive tool for LWN; frankly, there's nothing that complicated there.
The biggest worry is related to the attention these numbers drew, and the fact that a couple of developers have mailed in to note that they have received job offers as a result of appearing in the LWN lists. In addition, a few employers have contacted us to be sure that their "account" is credited with the work of all of their employees. The numbers your editor has generated are approximations, but some people clearly see them as being important.
The editors at LWN have an interest in covering the free software community while minimizing the changes that such coverage might cause - most of the time, at least. It seems plausible that, if the "top 20 contributors list" is seen as a desirable place to appear - with positive career benefits - developers might change their behavior as a result. It would be a shame to start seeing kernel patches aimed mainly at increasing a developer's count of lines changed. Such patches, one assumes, would not fare well in the review process, but it would be better if the situation did not come up at all.
The issue of the mapping between developers and their employers is also worth some consideration. Some of that information was obtained directly from the developers with a promise not to disclose it further; that promise must be kept. Beyond that, developers tend to change employers over time, and the code is not currently smart enough to deal with that. This shortcoming is not a problem when looking at a single release cycle, but it clearly would be an issue for multi-year analysis. The code could be improved, but it's not at all clear that the maintenance and distribution of a database of kernel developers' work histories is something LWN wants to get into. There are serious privacy issues to consider.
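To make the changing-employer shortcoming concrete, here is one way a date-aware lookup could be sketched: each developer gets a list of dated affiliations, and a changeset is credited to whichever affiliation was in effect on its commit date. This is an illustration only and not how the released code works; the `HISTORY` table, the `employer_at` function, the addresses, and the company names are hypothetical.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical data: per developer, (start_date, employer) pairs
# sorted by start date. Each entry holds until the next one begins.
HISTORY = {
    "dev@example.org": [
        (date(2005, 1, 1), "Example Corp"),
        (date(2006, 9, 1), "Other Inc"),
    ],
}


def employer_at(email, when, history=HISTORY):
    """Return the employer in effect for `email` on the date `when`."""
    entries = history.get(email)
    if not entries:
        return "(Unknown)"
    starts = [start for start, _employer in entries]
    # Index of the first entry starting strictly after `when`.
    i = bisect_right(starts, when)
    if i == 0:
        # The commit predates the first known affiliation.
        return "(Unknown)"
    return entries[i - 1][1]


print(employer_at("dev@example.org", date(2006, 1, 15)))
print(employer_at("dev@example.org", date(2007, 3, 1)))
print(employer_at("someone@example.net", date(2007, 3, 1)))
```

A table like `HISTORY` is exactly the kind of work-history database the paragraph above is wary of maintaining, so the sketch shows the mechanism without suggesting that such data should be collected.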
Despite these worries, the code is being released. In the end, it's not as
if somebody else would have all that much trouble reproducing it. Some of
the employer information has been taken out in response to the concerns outlined
above, though. A tarball of the initial release can be found here;
your editor is looking forward to the flood of patches which will improve
the system.
Index entries for this article | |
---|---|
Kernel | Development model |
Kernel | Development model/Contributor statistics |
Kernel | Releases/2.6.21 |
Who's writing 2.6.21 and related issues
Posted Mar 8, 2007 3:32 UTC (Thu) by pj (subscriber, #4506)
Good point-data... the obvious next step is to run it on kernels farther back in history and build some graphs from the resulting dataset to show how, say, Red Hat's kernel input has changed over time, and so on.

Measuring the impact
Posted Mar 8, 2007 5:14 UTC (Thu) by proski (subscriber, #104)
If we consider the impact of the changes, all of those metrics are meaningless. For me personally, 2.6.21 will be the first kernel to support the Broadcom wireless card in my laptop reliably enough for day-to-day use with real-life access points, without any patches and tricks. Many thousands of users will be able to dump ndiswrapper and stop running Windows software in kernel mode on their laptops (and maybe on some desktops too).
Yet the people who made it possible are not in any top list. That's probably because the Broadcom driver has been in the kernel for some time. What's being done now is perhaps the hardest part, namely fixing the bugs that were preventing the driver from working. Some bugs affect only specific revisions of the hardware, so the developers must rely on users' reports or actually buy that hardware.
Painstaking as it is, debugging the hard problems doesn't generate many commits, nor does it change many lines of code. Yet that's what makes the most difference.
Those two developers need to be mentioned in the comment since they are not in the article. Many thanks to you, Larry Finger and Michael Buesch!

bcm43xx
Posted Mar 8, 2007 6:04 UTC (Thu) by ncm (guest, #165)
Hear, hear!
That's not to say that the bcm-4311 in my Dell actually works right yet. My latest experience is that the bcm43xx driver works only after bcm43xx-d80211 has been loaded and then unloaded. Evidently the latter code knows more about how to turn the hardware on. In particular, the indicator LED only operates after that driver has been loaded. However, it's the bcm43xx driver that succeeds in communicating with the hub.

Who's writing 2.6.21 and related issues
Posted Mar 8, 2007 8:50 UTC (Thu) by ortalo (guest, #4654)
Argh! Too late, you released the code...
It would have been a good opportunity to raise developers' funding and, more generally, to help them find a good new job: list them in some "edited list with manual ponderation" generated by your tool (generally after a friendly request). You would certainly need to take care not to associate such friendly "ranking" with too much free beer - we do not want to lose you too early.
Hmm, maybe that would be cheating however... or some form of clever lobbying? Well, it's just statistics after all. By definition, they are worse than damn lies, so someone will certainly abuse them; why not just ensure that they are abused the right way... ;-)
I am joking of course!

Who's writing 2.6.21 and related issues
Posted Mar 8, 2007 10:11 UTC (Thu) by nix (subscriber, #2304)
The employers who consider appearing far up the affiliation list to be important seem to me to be missing the point. It's not the employers who do the work: it's the *people*.
(This seems to be a mistake made by nontechnical managers really quite a lot, presumably because otherwise it would be plain that they weren't actually doing any productive work. ;} ;} )

Do employers deserve any credit?
Posted Mar 8, 2007 14:12 UTC (Thu) by kevinbsmith (guest, #4778)
Sure, the people do the work. But many of them could not (or would not) do that work if the employers weren't paying them to do so. Those employers are contributing significant value to the system, and deserve some kind of credit. (I realize that they have "selfish" reasons for contributing, but then that's true for most if not all individuals as well.)
By appearing higher on the list, employers may be able to attract better talent, and make some extra sales, not to mention boosting their stock value a bit. I don't think they are missing the point at all.

Who's writing 2.6.21 and related issues
Posted Mar 8, 2007 14:24 UTC (Thu) by massimiliano (subscriber, #3048)
Well, managers and employers don't do anything useful, except paying the bills...
...which for developers that don't like messing with business issues (because only writing code is fun, and "productive"), can be a real godsend!
In other words, the article contains two separate contribution tables: by developer, and by employer.
They are distinct for a good reason: they measure two different things.
But neither of them would exist without the other.
And this comes from a developer, who strongly feels that writing code is the real thing, and that management layers are typically mostly useless overhead.
But, if I didn't have Novell paying my bills, I would hardly have the time to contribute anything...

Who's writing 2.6.21 and related issues
Posted Mar 8, 2007 22:52 UTC (Thu) by nix (subscriber, #2304)
(In case anyone wonders, my tongue was so far in my cheek when I wrote that that I'm surprised it didn't pop through. :) )

but some people clearly seem them as being
Posted Mar 8, 2007 11:19 UTC (Thu) by xav (guest, #18536)
s/seem/see/
You can remove this comment after use.

Who's writing 2.6.21 and related issues
Posted Mar 8, 2007 16:55 UTC (Thu) by PhilHannent (guest, #1241)
With the recent complaints about changesets breaking the kernel, the question arises: why isn't there a tinderbox [1] for the kernel? Is there a reason why it's not possible?
[1] http://www.mozilla.org/tinderbox.html

Who's writing 2.6.21 and related issues
Posted Mar 9, 2007 8:44 UTC (Fri) by dlang (guest, #313)
well, we have releases, with release candidates between them, daily snapshots of Linus' git tree, and the raw git tree itself (updated frequently by Linus, but not on a fixed schedule)
we have make targets of allyesconfig, allnoconfig, allmodconfig, randconfig (among others) to make the job of automating tests easier
we have sets of systems that work to compile lots of sets of options from the release candidates (and if they have time, daily snapshots)
what more do you think needs to be done?
remember, the problem isn't that the change breaks things for everyone, it can break something for someone with some combination of compile options. the number of combinations to compile to cover every possibility is far too high to take place in the real world (how many years do you want to wait between when development stops and testing begins before a release is made?)
the thing that helps is people with different requirements testing at least the release candidates. while it doesn't guarantee that there won't be any problems, the more combinations that get tested the more bugs get found early (and the earlier a group/organization runs tests with their requirements, the earlier any bugs that would affect them get found and fixed)

Who's writing 2.6.21 and related issues
Posted Mar 9, 2007 9:28 UTC (Fri) by PhilHannent (guest, #1241)
Thank you very much for taking the time to explain it, I understand now.
Regards
Phil Hannent

Who's writing 2.6.21 and related issues
Posted Mar 10, 2007 15:39 UTC (Sat) by dankohn (guest, #6006)
Take a look at Martin Bligh's http://test.kernel.org/.

Who's writing 2.6.21 and related issues
Posted Mar 9, 2007 7:45 UTC (Fri) by jzbiciak (guest, #5246)
Hmmm... For serious longitudinal studies of the kernel, it might make sense to assign more anonymous monikers to the tracked individuals and entities. Those monikers could be qualified by various details, but still left sufficiently vague that random people would find it hard to game the system.
Obviously, certain individuals will always stand out regardless of what you might do in this analysis. But, these same people stand out anyway. So, you prevent the otherwise-would-be-unknowns from getting unfair "play," as it were, and avoid seriously perturbing the development process.

Not a single woman in the first 20 contributors!
Posted Mar 15, 2007 8:39 UTC (Thu) by irios (guest, #19838)
In what place does the first girl appear in the contributor list?
There were about 20% girls in my class when I studied Electrical Engineering, and there were more of them in Computer Science. Where are they?

Not a single woman in the first 20 contributors!
Posted Mar 16, 2007 9:08 UTC (Fri) by slamb (guest, #1070)
Where did you go to school, and why didn't I go there?