One longstanding attempt to resolve this conflict is RTLinux. At its core, RTLinux is a small, real-time kernel without a great deal of functionality. One of the things RTLinux can do, however, is run a normal Linux kernel as a low-priority task. The RTLinux kernel responds to interrupts, passing them through to the real-time code when appropriate; Linux only gets a chance to run when the real-time code has finished. In an RTLinux system, a small amount of real-time code can perform data acquisition or other real-time tasks while leaving much of the more time-flexible processing to Linux-based code.
One interesting thing to know about RTLinux is that the basic technique is patented. This patent - first covered in LWN in February, 2000 - was a relatively early indication of just how software patent claims can affect free software users. The core RTLinux code was licensed under the GPL, but it was not truly free; anybody wanting to use it was subject to the terms imposed by the patent owner. Those terms were eventually spelled out in the RTLinux patent license, which allowed royalty-free use provided that either (1) the "Open RTLinux" distribution was used without modifications, or (2) the entire application was licensed under the GPL. Not everybody was happy with this license, but most of the world found ways of living with it or avoiding the patent, and things got quiet on the RTLinux front for some years.
On February 20, however, Wind River Systems announced the acquisition of RTLinux - including the patent. Interestingly, nothing to be found in Wind River's press release or acquisition FAQ mentions the patent license in any way. The text of that license, meanwhile, has disappeared from the FSMLabs site and has yet to reappear on the Wind River site. LinuxWorld ran an article on the acquisition with a verbal statement from Wind River that the license would be maintained, which is a step in the right direction, but it hardly adds up to a commitment on Wind River's part.
It is entirely possible that Wind River will continue with the current policy. Perhaps Wind River will even make new "Open RTLinux" releases allowing licensees to run reasonably contemporary software. At the moment, however, this code does not appear to be downloadable from anywhere, and there is no indication of when that situation might change. Along these lines, it's worth looking at some text from the acquisition FAQ [PDF]:
Given that Wind River sees an advantage to having a newer RTLinux than the "open source" versions, updated free releases of RTLinux from Wind River seem unlikely.
For anybody who is concerned, there are alternative approaches to real time and Linux which are worthy of consideration. At the lowest level, there is Adeos, a "nanokernel" which makes RTLinux-like functionality available while avoiding the claims of the RTLinux patent. Rather than run the general-purpose kernel as a task of the real-time kernel, Adeos runs both as tasks underneath itself. Adeos, in turn, is used at the base of RTAI, a longstanding RTLinux competitor. Things have been relatively quiet on the RTAI front in recent times, but a look at the RTAI-Lab project suggests that interesting things are happening there still.
Beyond that, work on the real-time preemption project, which aims to make Linux, itself, a real-time capable kernel, continues, and much of that work has found its way into the mainline. It will always be harder to prove that a full Linux kernel can provide deterministic response times, but, for many applications, the real-time performance of this kernel will be more than good enough. Some real-time vendors are already shipping products based on this work.
There may well be an ongoing market for the RTLinux technology that Wind River has just bought. It would be nice if Wind River could find a way to exploit that market while, simultaneously, using RTLinux to increase its contributions back to the community. There are few indications that Wind River sees RTLinux as anything more than a product, though, so those hoping for a more community-oriented stance may well be disappointed. The good news is that the alternatives are plentiful and quickly getting better.
The Fedora project has been trying to open itself up to contributions from the community, with slow (but real) success. The community is not just made up of developers and packagers, however; it turns out there is a group of motivated people who would like to help out with the Fedora artwork. Good design can be as hard as good code, and one would think that this sort of contribution would be welcome. And, to an extent, it is - to an extent.
There has been a conversation happening on the fedora-art list recently; some of the themes can be seen in this posting. It seems, frankly, that the Red Hat-based Fedora folks are concerned about the quality of artwork contributions and (though they don't say so in so many words) loss of control over the default look of the distribution. The end result is that the Fedora board has decided that contributed artwork will not be part of the default Fedora theme; instead, that work will be done within Red Hat. The project is trying not to close the door completely:
Nonetheless, there is a fair amount of disappointment in the artwork community at the moment.
On a related issue, the recent revelation that Dell's customers are asking for preinstalled Linux systems has created some interest in the Fedora community. Having a vendor as large as Dell preinstall Fedora would have clear benefits in helping the project to expand its user base. The Fedora folks would like to help make that happen, but it seems that there are some potential roadblocks on the way:
Some members of the advisory-board list have pointed out that worrying about the trademark policy is getting ahead of the game; making the distribution work seamlessly on, say, Dell laptops should maybe come first. Still, this issue points out the hazards of mixing trademark licensing and free software. Sometimes the results are not even in the trademark holder's interest.
Dell laptops were mentioned because the project knows that a surprisingly large number of its users are installing Fedora on those systems. How does Fedora know this? The answer is a tool called "smolt," which gathers information on the underlying hardware and phones home with it. The project is quite careful about how this communication is done - no connection is made until the user explicitly agrees to it happening. Even so, there have been some complaints on the lists, along with suggestions that it could be illegal under the privacy laws of some countries, especially in Europe.
During a recent Fedora board meeting, there was discussion of the Fedora 7 release delay, and, in particular, whether support for Fedora Core 5 and 6 would be extended to compensate. It came out that, while a number of people assume that the new 13-month support policy came into effect when it was adopted, that is not how the project understands it. The Fedora Core releases are currently expected to be supported under the old way of doing things: support for Fedora Core 5 will end when the second Fedora 7 test release (which just went into freeze mode) comes out. Support for Fedora Core 6 will end during the Fedora 8 development cycle. The full 13-month (or "2n+1") support mode is only expected to begin with Fedora 7. There has been some talk of trying to extend security support for FC5 and FC6, but it is not at all clear that it will happen.
Finally, it has been noted that a number of Fedora tasks seem to be going more slowly than many people would like. The word that your editor has heard is that much of this has to do with the impending release of RHEL 5. Getting that release into final form has been causing some heavy demands on Red Hat's developers, with the result that less time is available for working on Fedora. Once the RHEL release is out, things can be expected to pick up a bit on the Fedora side.

Getting rich off those who work for free which, among other things, talked about free software this way:
It is not uncommon to see Linux referred to as a volunteer-created system, as opposed to the corporate-sponsored, proprietary alternatives. There has been little research, however, into how much work on Linux is truly "volunteer" - done on a hacker's spare, unpaid time. In general, the assumption that Linux is created by volunteers is simply accepted.
Determining the real provenance of free software can be a daunting task. There is a wealth of information available for those who look, however. In an attempt to shine some light in this area, your editor hacked up some scripts to do a lot of digging around in the kernel git repository. The idea was that, by looking at who is putting changes into the kernel, we can get a sense for where our source is coming from.
This study looked at the stream of patches that changed the 2.6.19 kernel into the current 2.6.20 release. There were, as it turns out, 4,983 non-merge changesets in this release, contributed by 741 different developers. (Merge changesets mark where the contents of other repositories were pulled into the mainline, but they do not carry any code changes, so the analysis skipped them.) These patches added 286,439 lines of code and removed 159,812 others, for a total growth of 126,627 lines over the 2.6.20 development cycle.
Your editor's scripts looked over every non-merge commit in 2.6.20. For each, the developer listed as the "author" was given credit for the patch. This approach is not entirely fair, since one developer will, in some cases, be submitting code written by a group of people. In general, though, there is no easy way of getting around this problem - the true breakdown of authorship of a joint work simply is not available in the mainline repository. Your editor believes that this inaccuracy affects the accounting of a relatively small portion of the patches merged into the mainline.
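The per-author accounting described above can be sketched in a few lines of Python. This is only a minimal illustration, not the scripts actually used for this study: it assumes the author names have already been extracted, one per non-merge commit (something like `git log --no-merges --pretty=format:%an v2.6.19..v2.6.20` would be one way to get them), and the names in the sample are hypothetical.

```python
from collections import Counter


def count_changesets(author_lines):
    """Credit each changeset to its listed author.

    author_lines: iterable of strings, one author name per non-merge commit.
    Returns a list of (author, count, percent) tuples, most prolific first.
    """
    counts = Counter(line.strip() for line in author_lines if line.strip())
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(name, n, round(100.0 * n / total, 1)) for name, n in ranked]


# Hypothetical sample input; real input would come from the git history.
sample = """Alice Example
Bob Example
Alice Example
Carol Example
"""

for name, n, pct in count_changesets(sample.splitlines()):
    print(f"{name:<20} {n:>5} {pct:>5.1f}%")
```

As the text notes, this credits the whole patch to whoever is listed as the author, so jointly written work is attributed to a single person.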
Beyond that, how one generates statistics from a patch stream is an interesting question. How does one measure the productivity of programmers? One possibility is to look at the number of changesets merged. By that metric, this is the list of the most prolific contributors to 2.6.20:
Developers with the most changesets

Developer               Changesets  Percent
Al Viro                        241     4.8%
Andrew Morton                   92     1.8%
Jiri Slaby                      92     1.8%
Adrian Bunk                     87     1.7%
Gerrit Renker                   79     1.6%
Josef Sipek                     79     1.6%
Avi Kivity                      68     1.4%
Tejun Heo                       67     1.3%
Patrick McHardy                 63     1.3%
Ralf Baechle                    61     1.2%
Randy Dunlap                    59     1.2%
Alan Cox                        58     1.2%
Mariusz Kozlowski               57     1.1%
Andrew Victor                   53     1.1%
Paul Mundt                      52     1.0%
Stefan Richter                  49     1.0%
David S. Miller                 48     1.0%
Russell King                    44     0.9%
Benjamin Herrenschmidt          44     0.9%
Akinobu Mita                    43     0.9%
Looking at patch counts rewards developers who put in large numbers of small patches. Al Viro's patches include a vast number of code annotations (to enable better checking with sparse), include file fixups, etc. Many of the changes are small - many do not affect the resulting kernel executable at all - but there are a lot of them. Even so, as the biggest contributor, Al generated less than 5% of the total changesets added to the kernel. The top 20 contributors, all together, generated 28% of the total changesets in 2.6.20.
One could make the argument that a better way to look at the problem is by the number of lines affected by a patch. In this way, a contributor's portion of the whole will not depend on whether it has been split into a long series of small patches or not. On the other hand, simply renaming a file can make it look like a developer has touched a large amount of code. Be that as it may, by looking at lines changed (defined as the greater of the number of lines added or removed by each individual changeset), one gets a table like this:
Developers with the most changed lines

Developer                    Lines  Percent
Jeff Garzik                  20712     6.0%
Patrick McHardy              15024     4.3%
Jiri Slaby                   13917     4.0%
Avi Kivity                   11726     3.4%
Andrew Victor                 9710     2.8%
Amit S. Kale                  9537     2.7%
Stephen Hemminger             9120     2.6%
Geoff Levand                  8396     2.4%
Michael Chan                  8307     2.4%
Chris Zankel                  8099     2.3%
Mauro Carvalho Chehab         7390     2.1%
Adrian Bunk                   6138     1.8%
Yoshinori Sato                5232     1.5%
Al Viro                       4981     1.4%
Benjamin Herrenschmidt        4588     1.3%
Thierry MERLE                 4549     1.3%
Dan Williams                  4516     1.3%
Jonathan Corbet               3924     1.1%
Gerrit Renker                 3857     1.1%
Jiri Kosina                   3805     1.1%
Jeff Garzik comes out on top of this particular measurement by virtue of having deleted the long-unmaintained floppy tape subsystem. Patrick McHardy's work includes a number of additions to the netfilter subsystem, Jiri Slaby did a great deal of driver cleanup work, Avi Kivity was the contributor of the KVM virtualization code, and Andrew Victor contributed a number of ARM-related patches and the Atmel AT91 i2c driver. (The contributions made by other authors can be found by searching for their names in the 2.6.20 short-form changelog.)
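The "lines changed" metric used for this table (the greater of the number of lines added or removed by each individual changeset) is simple enough to sketch. The following is a minimal illustration under stated assumptions, not the script used for the study: it assumes each changeset has already been reduced to an author plus a list of per-file (added, removed) pairs, as could be derived from `git log --numstat`, and the sample data is hypothetical.

```python
from collections import defaultdict


def lines_changed_by_author(changesets):
    """Sum, per author, max(lines added, lines removed) for each changeset.

    changesets: iterable of (author, [(added, removed), ...]) tuples,
    one tuple per changeset, one (added, removed) pair per file touched.
    """
    totals = defaultdict(int)
    for author, file_stats in changesets:
        added = sum(a for a, _ in file_stats)
        removed = sum(r for _, r in file_stats)
        # Take the larger of the two so that a rewritten region is not
        # counted once as a deletion and again as an addition.
        totals[author] += max(added, removed)
    return dict(totals)


# Hypothetical sample: Alice has two changesets, Bob has one.
sample = [
    ("Alice Example", [(10, 2), (0, 30)]),   # 10 added, 32 removed -> 32
    ("Bob Example", [(100, 100)]),           # e.g. a file rename   -> 100
    ("Alice Example", [(7, 0)]),             # 7 added, 0 removed   -> 7
]

print(lines_changed_by_author(sample))
```

Bob's entry shows the weakness mentioned in the text: a simple rename, which removes and re-adds every line of a file, still counts as a full file's worth of changed lines.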
Most of the developers in the above list got there by adding code to the kernel. It can be said, however, that the true heroes in the development community are those who remove code and make the kernel smaller. The developers who were best at removing more code than they added were:
Developers with the most lines removed

Developer                    Lines  Percent
Jeff Garzik                  19862    12.4%
Chris Zankel                  5608     3.5%
Adrian Bunk                   5528     3.5%
Arnd Bergmann                 2224     1.4%
Linus Torvalds                1739     1.1%
Atsushi Nemoto                1425     0.9%
Thierry MERLE                  911     0.6%
David Gibson                   878     0.5%
Dominik Brodowski              528     0.3%
Stefan Richter                 509     0.3%
Once again, Jeff Garzik's removal of ftape comes out on top, by far. Chris Zankel cleaned up the Xtensa architecture, removing a number of files in the process. Adrian Bunk worked on the ftape removal, got rid of the frame diverter code, removed an old, broken block driver, and generally performed cleanups all over the tree. Mr. Bunk is, in fact, the bane of old code; over the last year (since 2.6.16) he has removed a full 127,000 lines from the kernel source tree. Arnd Bergmann got rid of a bunch of syscall*() macros. Linus Torvalds removed the broken x86 stack unwinder code.
Finally, one could look at a different measure entirely: the number of patches signed off by each developer. A Signed-off-by: line is an indication that the person involved believes that the code is suitable for merging into the kernel; it implies that some degree of attention has been paid to the patch. Authors sign off their code, as do the subsystem maintainers who pass it up the chain. The top signers-off in 2.6.20 were:
Developers with the most signoffs

Developer                 Signoffs  Percent
Andrew Morton                 1422    13.7%
Linus Torvalds                1366    13.2%
David S. Miller                483     4.7%
Jeff Garzik                    331     3.2%
Greg Kroah-Hartman             269     2.6%
Al Viro                        241     2.3%
Paul Mackerras                 232     2.2%
Andi Kleen                     177     1.7%
Mauro Carvalho Chehab          170     1.6%
Russell King                   166     1.6%
Adrian Bunk                    120     1.2%
Arnaldo Carvalho de Melo       119     1.1%
Ralf Baechle                   117     1.1%
James Bottomley                109     1.1%
Patrick McHardy                 96     0.9%
Jiri Slaby                      94     0.9%
Avi Kivity                      87     0.8%
Josef Sipek                     79     0.8%
Paul Mundt                      78     0.8%
Gerrit Renker                   78     0.8%
There were a total of 10,354 signoff lines in the 2.6.20 patch stream, so each changeset, on average, was signed off just over two times. It is interesting that Linus, who ultimately merges every patch, accounts for only 13% of those signoffs, having signed off a bit over a quarter of the changesets. It seems that most patches, these days, go directly into the mainline from subsystem repositories without a signoff from Linus or Andrew. Most of the other names on that list, with just a few exceptions, are the maintainers of subsystem or architecture trees.
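Counting signoffs amounts to scanning each commit message for Signed-off-by: lines. The sketch below is one possible way to do that, not the script used here; the regular expression assumes the usual "Name <address>" form, and the commit messages in the sample are hypothetical. The final calculation uses the totals quoted in this article.

```python
import re
from collections import Counter

# Matches lines such as "Signed-off-by: Some Name <addr@example.com>".
SIGNOFF_RE = re.compile(
    r"^[ \t]*Signed-off-by:[ \t]*(.+?)[ \t]*<[^>\n]*>[ \t]*$",
    re.MULTILINE | re.IGNORECASE,
)


def count_signoffs(messages):
    """Return a Counter mapping each signer's name to a signoff count."""
    counts = Counter()
    for message in messages:
        counts.update(SIGNOFF_RE.findall(message))
    return counts


def average_signoffs(total_signoffs, total_changesets):
    """Average number of signoff lines per changeset."""
    return total_signoffs / total_changesets


# Hypothetical commit messages.
sample = [
    "Fix a bug\n\nSigned-off-by: Alice Example <alice@example.com>\n"
    "Signed-off-by: Bob Example <bob@example.org>\n",
    "Add a driver\n\nSigned-off-by: Bob Example <bob@example.org>\n",
]

print(count_signoffs(sample))
# Totals from the article: 10,354 signoff lines over 4,983 changesets.
print(round(average_signoffs(10354, 4983), 2))
```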
So now we have a sense for who got their fingers on the code which went into 2.6.20. But one interesting question still has not been answered: to what extent was that code contributed by volunteers (or "hobbyists")? Finding an answer to that question is somewhat trickier than looking at who wrote the patches, mostly because very few developers say "I wrote this on behalf of my employer."
The approach taken by your editor was relatively simplistic, but, perhaps, the best that is practical. Any patch whose author's given email address indicates a corporate affiliation is assumed to have been developed by an employee of that corporation. So any patch posted by somebody with an ibm.com email address is accounted as having been done by an IBM employee. Things are complicated by the fact that many people who work for companies do not use corporate addresses; it is not unheard-of for companies to have policies explicitly prohibiting code contributions associated with their domains. Your editor has coped with this problem by filling in the relevant developer's affiliation whenever it is known to him; in some cases, the developer was asked for this information.
This method has the effect of crediting all of an employee's work to his or her employer. In many cases, the situation is probably more complicated than that; one assumes, for example, that a certain kernel hacker's employer has not directed him to hack on Battle for Wesnoth. When looking only at kernel code, however, crediting all work to the employer is probably relatively safe.
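The affiliation heuristic described above (corporate email domain first, with hand-maintained overrides for developers whose employer is otherwise known) might look something like the following. This is a hedged sketch rather than the actual script: apart from the ibm.com example taken from the text, every address, domain, and table name here is hypothetical.

```python
# Domain-to-employer table. Only ibm.com comes from the article;
# the second entry is a made-up placeholder.
DOMAIN_TO_EMPLOYER = {
    "ibm.com": "IBM",
    "bigcorp.example": "Big Corp",
}

# Hand-entered overrides for developers who do not use a corporate
# address. "(None)" marks somebody known to work on their own time.
KNOWN_AFFILIATIONS = {
    "hacker@home.example": "Big Corp",
    "hobbyist@home.example": "(None)",
}


def employer_for(email):
    """Guess the employer behind a patch author's email address."""
    email = email.strip().lower()
    if email in KNOWN_AFFILIATIONS:
        return KNOWN_AFFILIATIONS[email]
    domain = email.rpartition("@")[2]
    parts = domain.split(".")
    # Try the full domain, then each parent domain, so that an address
    # at us.ibm.com is still credited to ibm.com.
    for i in range(len(parts) - 1):
        candidate = ".".join(parts[i:])
        if candidate in DOMAIN_TO_EMPLOYER:
            return DOMAIN_TO_EMPLOYER[candidate]
    return "(Unknown)"


for address in ("someone@us.ibm.com", "hobbyist@home.example",
                "stranger@mail.example"):
    print(address, "->", employer_for(address))
```

Checking the override table before the domain table means that a manually recorded affiliation always wins over whatever the address itself suggests.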
Using this approach, the top sources of changesets were:
Top changeset contributors by employer

Employer                Changesets  Percent
(Unknown)                     1244    25.0%
Red Hat                        636    12.8%
(None)                         383     7.7%
IBM                            368     7.4%
Novell                         295     5.9%
Linux Foundation               261     5.2%
Intel                          178     3.6%
Oracle                         126     2.5%
[name missing]                  97     1.9%
University of Aberdeen          79     1.6%
HP                              78     1.6%
Qumranet                        71     1.4%
Nokia                           67     1.3%
SGI                             64     1.3%
Astaro                          63     1.3%
MIPS Technologies               61     1.2%
SANPeople                       53     1.1%
Miracle Linux                   43     0.9%
MontaVista                      41     0.8%
Broadcom                        39     0.8%
Looking instead at the number of lines of code changed, the results become:
Top lines changed by employer

Employer                     Lines  Percent
(Unknown)                    66154    19.0%
Red Hat                      44527    12.8%
(None)                       38099    11.0%
IBM                          25244     7.3%
Astaro                       15306     4.4%
Linux Foundation             13638     3.9%
Qumranet                     12108     3.5%
Novell                       11930     3.4%
Intel                        11652     3.4%
SANPeople                     9888     2.8%
NetXen                        9607     2.8%
Sony                          8497     2.4%
Broadcom                      8349     2.4%
Tensilica                     8195     2.4%
Nokia                         5581     1.6%
MontaVista                    4394     1.3%
University of Aberdeen        4324     1.2%
LWN.net                       3975     1.1%
Secretlab                     3370     1.0%
HP                            3211     0.9%
[Note that these tables have been updated once since the article was originally published; the curious can see what the original versions looked like.]
In these tables, the line marked "(Unknown)" is exactly that: patches for which existence of a supporting employer could not be determined. The line marked "(None)", instead, indicates the patches from developers known to be working on their own time.
Either way, the results come out about the same: at least 65% of the code which went into 2.6.20 was created by people working for companies. If the entire "unknown" group turns out to be developers working on a volunteer basis - an unlikely result - then just over 1/3 of the 2.6.20 patch stream was written by volunteers. The real number will be lower, but it still shows that a significant portion of the code we run is written by developers who are donating their time.
Looking at a single kernel release is instructive, but it can also be deceptive. The relatively short release cycle used by the kernel project makes it fairly easy for prolific developers to see few of their patches go into a specific release. In an attempt to gain a longer-term perspective, your editor forced his suffering system to crank through the entire history from 2.6.16 (released almost exactly one year ago) to the present. Some 28,000 non-merge changesets have been added to the mainline (by 1,961 developers) over that time, replacing 1.26 million lines of old code with 2.01 million lines of new code - the kernel grew by 754,000 lines.
The developers who touched the most lines over that time were:
Developers with the most changed lines

Developer                    Lines  Percent
Adrian Bunk                 134021     5.3%
Jeff Garzik                  87847     3.5%
Andrew Vasquez               75195     3.0%
Mauro Carvalho Chehab        68568     2.7%
David Teigland               46607     1.9%
Ralf Baechle                 38559     1.5%
David S. Miller              35958     1.4%
Andrew Victor                35594     1.4%
Bryan O'Sullivan             33901     1.4%
Paul Mundt                   27041     1.1%
Dave Kleikamp                26615     1.1%
Lennert Buytenhek            25192     1.0%
Haavard Skinnemoen           24372     1.0%
Ben Dooks                    23207     0.9%
Patrick McHardy              23175     0.9%
Ingo Molnar                  22456     0.9%
James Bottomley              22205     0.9%
David Howells                19168     0.8%
Jiri Slaby                   18335     0.7%
Divy Le Ray                  17909     0.7%
The results for employers were:
Top lines changed by employer

Employer                     Lines  Percent
(Unknown)                   740990    29.5%
Red Hat                     361539    14.4%
(None)                      239888     9.6%
IBM                         200473     8.0%
QLogic                       91834     3.7%
Novell                       91594     3.6%
Intel                        78041     3.1%
MIPS Technologies            58857     2.3%
Nokia                        39676     1.6%
SANPeople                    36038     1.4%
SteelEye                     36021     1.4%
Freescale                    35034     1.4%
Linux Foundation             34163     1.4%
MontaVista                   30211     1.2%
Simtec                       26166     1.0%
Atmel                        25975     1.0%
HP                           23714     0.9%
SGI                          22057     0.9%
Oracle                       21251     0.8%
Open Grid Computing          20505     0.8%
The end result of all this is that a number of the widely-expressed opinions about kernel development turn out to be true. There really are thousands of developers - at least, almost 2,000 who put in at least one patch over the course of the last year. Linus Torvalds is directly responsible for a very small portion of the code which makes it into the kernel. Contemporary kernel development is spread out among a broad group of people, most of whom are paid for the work they do. Overall, the picture is of a broad-based and well-supported development community.
There are many other interesting things to be learned by looking at the kernel's development history. Expect more articles along these lines as your editor finds the time to improve his scripts.
Page editor: Jonathan Corbet
Copyright © 2007, Eklektix, Inc.