Kernel development
Brief items
Kernel release status
The current 2.6 development kernel is 2.6.26-rc6, released by Linus on June 12. "I'd like to say that the diffs are shrinking and things are calming down, but I'd be lying. Another week, another -rc, and another 350 commits." See the long-format changelog for all the details.
As of this writing, some 140 commits have gone into the mainline git repository since the 2.6.26-rc6 release. They include a number of fixes and a new driver for FM3130 realtime clocks.
The current -mm tree is 2.6.26-rc5-mm3. Says Andrew: "The aim here is to get all the stupid bugs out of the way so that some serious MM testing can be performed." Among other things, this release contains the latest version of the pageout scalability patches (see below).
The current stable 2.6 kernel is 2.6.25.7, released on June 16. It contains a rather long list of important fixes.
Kernel development news
Why some drivers are not merged early
Arjan van de Ven's kernel oops report always makes for interesting reading; it is a quick summary of what is making the most kernels crash over the past week. It thus points to where some of the most urgent bugs are to be found. Sometimes, though, this report can raise larger issues as well. Consider the June 16 report, which notes that quite a few kernel crashes were the result of a not-quite-ready wireless update shipped by Fedora. Ingo Molnar was quick to jump on this report with a process-related complaint:
Same for Nouveau: Fedora carries it and i dont understand why such a major piece of work is not done in mainline and not _helped by_ mainline.
He then took the discussion further with this observation:
This comment drew some unhappy responses from the networking developers, who feel that they have been unfairly targeted for criticism. Wireless drivers have been merged at the first real opportunity, they say, and trying to put them in earlier would have only made things worse. In fact, your editor will submit that mistakes were made with wireless drivers, but those mistakes have little to do with delaying their inclusion into the mainline. What went wrong with wireless is this:
- Early wireless developers did not really try to solve the wireless networking problem; they just wanted to get their adaptor to work. Wireless maintainer John Linville once told your editor that, for years, these adaptors were treated as if they were Ethernet adaptors, which they certainly are not. When these developers did get around to dealing with issues specific to wireless networking, they created their own wireless stacks contained within their drivers. So no general wireless framework was created.
- It was only in 2004 that Jeff Garzik started a project to create a generic wireless stack for Linux - and he started with a stack (HostAP) which, some time later, was seen as not being the best choice. So the work on HostAP - late to begin in the first place - was eventually abandoned.
- The networking stack which was eventually developed - mac80211 - began its life as a proprietary code base created with no community review or oversight at all. Predictably, it had all kinds of problems which required well over a year of work to resolve. Until mac80211 was in reasonable shape, there was no real way to get drivers ready for inclusion.
The result of all this (and the occasional legal hassle as well) is that wireless networking on Linux lagged for years, and is only now reaching something close to a stable state. So it is not surprising that there has been a lot of code churn in this area, or that things occasionally break. But it is hard to see how trying to merge wireless drivers sooner would have helped the situation significantly.
The non-merging of the Nouveau driver - the reverse-engineered driver for NVIDIA adapters - also has a simple explanation: the developers have not yet asked for this merge to happen. Nouveau is not considered to be at a point where it works yet, and, importantly, there are still user-space API issues which must be worked out. Breaking user-space code is severely frowned upon, so merging of code is nearly impossible if its user-space interfaces are still in flux.
James Bottomley put forward another reason why a driver may stay out of the mainline even though the author would like to see it merged:
In other words, their control over access to the mainline tree is the one club subsystem maintainers have at hand when they feel the need to push a developer to make changes to a driver. It may well be that simply merging drivers regardless of technical objections (something which a number of developers are pushing for) will reduce the incentive for developers to get their code into top shape - and it's not always clear that others will step in and do the work for them.
On the other hand, the idea that in-tree code tends to be less buggy than out-of-tree code is relatively uncontroversial. So, for many drivers at least, a "merge first and fix it up later" policy may well lead to the best results in the shortest period of time. One thing that is clear is that this discussion will not be going away anytime soon; chances are good that this year's kernel summit (happening in September) will end up revisiting the issue.
Peter Zijlstra: From DOS to kernel hacking
In a linux-kernel thread about fixing the Kernel Janitors project, Peter Zijlstra spoke up with a bit of his perspective on attracting better kernel contributors. As he is a relatively recent addition to the kernel community, his path from Linux user to kernel hacker may serve as a template of sorts for others who are starting out now. We asked Peter to answer a few questions by email to help fill in some more of the details.
LWN: How did you get started with Linux? What attracted you?
A friend of mine introduced me to Unix/Linux at the time, and I started learning all about programming in a real environment. Basically all programming up to that point was in a freestanding environment where you had to poke the hardware to get anything done.
So initially it was the charm of a proper multitasking OS (with memory protection) that got me to use it – not having to reboot your machine every time, and the luxury of being able to run a debugger.
LWN: How quickly did you start poking around in the kernel? What did you first start to look at and why?
In those 10 years I learnt a lot about programming. I learnt about Unix system programming, I learnt about C++, multi-threading, database engines, and a whole range of interesting things.
Somewhere along I got a real internet connection and started lurking on mailing lists, including LKML – I must have been reading that on and off for about 5 years by the time I really sat down and wrote some patches.
During that time I might have sent in some trivial build fixes, and I remember finding a priority leak in one of the realtime patches. But I wasn't actively coding on the kernel – I just liked running real exotic stuff, you know Gentoo and building just about everything from CVS.
So what got me started on the kernel ... I can't quite remember how it happened, but I ran into some of Rik's [van Riel] Advanced Page Replacement stuff. I had worked on that problem space earlier while doing database engines, and had recently run into it again at work. So I started reading those papers and some of the proposed kernel patches, and I started to itch.
I dropped basically everything I was working on in my spare time (hacking WindowMaker, writing a C++ ASN.1-DER serialization class, writing a new LDAP server and I'm sure some other projects that are rotting away on a harddrive somewhere :-) and started hacking.
Why ... I'm not sure – it sure got me back to where I started out – crashing machines (and boot times haven't improved over those past 10 years at all).
I think because of the challenge – I knew I could write whatever it was I was coding and this page replacement stuff was a whole new challenge, and TBH [to be honest] the kernel code didn't look too hard at the time (phew how ignorant I was..)
LWN: How well were your contributions received by kernel hackers? Did you make any missteps along the way?
Mis-steps, feh, still do ;-) Unlike most people seem to think, kernel hackers are human too.
LWN: What suggestions do you have for folks that are looking at getting involved in kernel hacking today?
- take it and act upon it
- convince the other he's wrong
OK it can get personal, but that is only if you repeatedly fail the above two points.
LWN: There has been a lot of talk about the Kernel Janitors project recently, do you think that is a good way to get started with kernel development? What do you think should be done differently in that (or other) project(s) to attract more and better contributors?
- we don't have enough simple but interesting things lined up (not saying there are none, but we don't have a ready list). I think a proper challenging project would be much better than moronic code cleanups.
- the kernel really isn't a place for newbies; now let me explain this before it gets all mis-interpreted :-)
- Things really get a lot easier if you're fairly competent at (Unix) system programming before starting at the kernel.
- Kernel hacking is a solitary business in that you need to do things; nobody is going to do them for you. That is not saying nobody can help you if you have a question. Also, nobody is going to force you to do something – you need to want to do it.
So I guess what I'm saying is that you need to really want to do it. There is no other way to become a kernel hacker than by simply doing it.
LWN: Do you work on Linux for your job, as a hobby, or both?
So I applied for a kernel position at a few of the larger vendors, and Red Hat won the race.
Already having had a year's worth of exposure to kernel code and LKML certainly helped in getting this amazing opportunity. Have I already mentioned I absolutely love working on the kernel?
So now I get to poke at the kernel all day, every day...
LWN: What are your current kernel projects? What kinds of things do you see yourself doing in the kernel in the future?
The future ... well we'll see what happens, loads of interesting stuff to do.
We would like to thank Peter for taking the time to answer our questions.
The state of the pageout scalability patches
The virtual memory scalability improvement patch set overseen by Rik van Riel has been under construction for well over a year; LWN last looked at it in November, 2007. Since then, a number of new features have been added and the patch set, as a whole, has gotten closer to the point where it can be considered for mainline inclusion. So another look would appear to be in order.

One of the core changes in this patch set remains the same: it still separates the least-recently-used (LRU) lists for pages backed up by files and those backed up by swap. When memory gets tight, it is generally preferable to evict page cache pages (those backed up by files) rather than anonymous memory. File-backed pages are less likely to need to be written back to disk and they are more likely to be well laid out on disk, making it quicker to read them back in if necessary. Current Linux kernels keep both types of pages on the same LRU list, though, forcing the pageout code to scan over (potentially large numbers of) pages which it is not interested in evicting. Rik's patch improves this situation by splitting the LRU list in two, allowing the pageout code to look only at pages which might actually be candidates for eviction.
There comes a point, though, where anonymous pages need to be reclaimed as well. The kernel will make an effort to pick the best pages to evict by going for those which have not been recently referenced. Doing that, however, requires going through the entire list of anonymous pages, clearing the "referenced" bit on each. A large system can have many millions of anonymous pages; iterating over the entire set can take a long time. And, as it turns out, it's not really necessary.
The VM scalability patch set now changes that behavior by simply keeping a certain percentage of the system's anonymous pages on the inactive list - the first place the system looks for pages to evict. Those pages will drift toward the front of the list over time, but will be returned to the active list if they are used. Essentially, this patch is applying a form of the "referenced" test to a portion of anonymous memory - whether or not anonymous pages are being evicted at the time - rather than trying to check the referenced state of all anonymous pages when the kernel decides it needs to reclaim some of them.
Another set of patches addresses a different situation: pages which cannot be evicted at all. These pages might have been locked into memory with a system call like mlock(), be part of a locked SYSV shared memory region, or be part of a RAM disk, for example. They can be either page cache or anonymous pages. Either way, there is little point in having the reclaim code scan them, since it will not be possible to evict them. But, of course, the current reclaim code does have to scan over these pages.
This unneeded scanning, as it turns out, can be a problem. The extensive unevictable LRU document included with the patch claims:
Most of us are not currently working with systems of this size; one must spend a fair amount of money to gain the benefits of this sort of pathological behavior. Still, it seems like something which is worth fixing.
The solution, of course, is yet another list. When a page is determined to be unevictable, that page will go onto the special, per-zone unevictable list, after which the pageout code will simply not see it anymore. As a result of the variety of ways in which a page can become unevictable, the kernel will not always know at mapping time whether a specific page can go onto the unevictable list or not. So the pageout code must keep an eye out for those pages as it scans for reclaim candidates and shunt them over to the unevictable list as they are found. In relatively short order, the locked-down pages will accumulate in this list, freeing the pageout code to concentrate on pages it can actually do something about.
Many of the concerns which have been raised about this patch set over the last year have been addressed. A few remain, though. Some of the new features require new page flags; these flags are in extremely short supply, so there is always pressure to find ways of implementing things which do not allocate more of them. There are a few too many configuration options and associated #ifdef blocks. And so on. Addressing these may take a while, but convincing everybody that these (rather fundamental) memory management changes are beneficial under all circumstances may take rather longer. So, while this patch set is making progress, a 2.6.27 merge is probably not in the cards.
Patches and updates
Kernel trees
Architecture-specific
Core kernel code
Development tools
Device drivers
Documentation
Filesystems and block I/O
Memory management
Networking
Security-related
Benchmarks and bugs
Miscellaneous
Page editor: Jonathan Corbet