Kernel development
Brief items
Kernel release status
The current stable 2.6 kernel is 2.6.14.5, released on December 26. It contains the usual set of fixes, mostly in the networking and SCSI subsystems.
The current 2.6 kernel is 2.6.15, announced by Linus on
January 2. The changelog entry for the release says "Hey, it's
fifteen years today since I bought the machine that got Linux started.
January 2nd is a good date." This release contains a fair number of
fixes since -rc7, but no big changes. The 2.6.15 series as a whole has
added a big set of 802.11 improvements, hotplug memory support,
much-improved NTFS support, much-improved CIFS support, the open-iSCSI
initiator, shared subtrees, a
new, IPv6-capable netfilter connection tracking implementation, and much
more. The
long-format changelog has the details. See also LWN's Kernel Page
coverage of features as they were added and the KernelNewbies Linux
Changes Wiki.
The floodgates have not yet opened for the 2.6.16 development cycle, so there is no pile of pending patches in the mainline git repository as of this writing. There have also been no -mm kernel releases since December 14.
The current 2.4 prepatch is 2.4.33-pre1; Marcelo launched the 2.4.33 cycle on December 29. This prepatch includes some security fixes, some networking work, and, it is said, the last ever big SATA update for 2.4.
Kernel development news
Quote of the week
We get in the situation where lots of people are sitting there with arms folded, complaining about lack of a new kernel release while nobody is actually working on the bugs. Nobody knows why this happens.
A summary of 2.6.15 API changes
The 2.6.15 kernel is out. The following is a summary of changes to the internal kernel API found in this release, with an emphasis on changes visible to driver writers. This information will be folded into the LWN 2.6 API changes page shortly.
- The nested class device
patch was merged, allowing class_device structures to
have other class_devices as parents. This patch is a hack to
make the input subsystem work with sysfs. This code will change again
in the future; see Greg
Kroah-Hartman's article for more information on what is planned.
- The prototypes for the driver model class "interface" methods
add() and remove() have changed; there is now a new
parameter pointing to the relevant interface structure.
- A new platform_driver structure has been added to describe
drivers for devices built into the core "platform."
- The prototypes for the suspend() and resume()
methods in struct device_driver have changed. They are also
only called once per event, rather than three times as in previous
kernels.
- Two new fields have been added to the device_pm_info structure;
they control how drivers should act on hardware-created wakeup events.
See this article for
details.
- There is a notification mechanism which lets interested modules know
when a USB device is added to (or removed from) the system. This
system is used by some core code; drivers do not normally need to hook
in to it.
- The gfp_t type
is now used throughout the kernel. If you have a function which takes
memory allocation flags, it should probably be using this type.
- Code using reader/writer semaphores can now use
rwsem_is_locked() to test the (read) state of the semaphore
without blocking.
- The new vmalloc_node() function allocates memory on a
specific NUMA node.
- The "reserved" bit for memory pages has, for all practical purposes,
been removed.
- vm_insert_page()
has been added to make it easier for drivers to remap RAM into user
space VMAs.
- There is a new kthread_stop_sem() function which can be used
to stop a kernel thread which might be currently blocked on a specific
semaphore.
- RapidIO bus support has
been merged into the mainline.
- The netlink connector
mechanism makes netlink code easier to write. Independently, a
type-safe netlink interface has been added and is used in parts of the
networking subsystem.
- These kernel symbols have been unexported and are no longer available
to modules: clear_page_dirty_for_io,
console_unblank, cpu_core_id,
hugetlb_total_pages, idle_cpu,
nr_swap_pages, phys_proc_id,
reprogram_timer, swapper_space,
sysctl_overcommit_memory, sysctl_overcommit_ratio,
sysctl_max_map_count, total_swap_pages,
user_get_super, uts_sem, vm_acct_memory,
and vm_committed_space.
- Version 1 of the Video4Linux API is now officially scheduled for
removal in July, 2006.
- The owner field has been removed from the pci_driver
structure.
- A number of SCSI subsystem typedefs (Scsi_Device,
Scsi_Pointer, and Scsi_Host_Template) have been
removed.
- The DMA32 memory zone has been added to the x86-64
architecture; its purpose is to make it easy to allocate memory below
the 4GB barrier (with the new GFP_DMA32 flag).
- A call to rcu_barrier() will block the calling process until all current RCU callbacks have completed.
As can be seen from this list, the kernel API continues to evolve. The claims of certain well-known maintainers notwithstanding, it doesn't look like things will slow down much anytime soon.
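To make the gfp_t item above a little more concrete, here is a minimal user-space sketch of a function which takes its allocation flags as a dedicated type rather than a bare integer. Everything in it is an assumption made for illustration: the typedef is a stand-in for the kernel's own definition, and sketch_alloc(), SKETCH_GFP_PLAIN, and SKETCH_GFP_ZERO are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* User-space stand-in for the kernel's gfp_t; the real type is defined
 * in the kernel headers and is not reproduced here. */
typedef unsigned int gfp_t;

/* Hypothetical flag values for this sketch only; not the kernel's GFP_* flags. */
#define SKETCH_GFP_PLAIN ((gfp_t)0x0u)
#define SKETCH_GFP_ZERO  ((gfp_t)0x1u)

/* The flags parameter is declared as gfp_t, so the prototype itself
 * documents that allocation flags are expected here. */
static void *sketch_alloc(size_t len, gfp_t flags)
{
    void *buf = malloc(len);

    if (buf && (flags & SKETCH_GFP_ZERO))
        memset(buf, 0, len);
    return buf;
}

/* Returns 1 if a buffer requested with SKETCH_GFP_ZERO comes back zeroed. */
static int sketch_zeroed_ok(void)
{
    unsigned char *p = sketch_alloc(16, SKETCH_GFP_ZERO);
    size_t i;
    int ok = 1;

    if (!p)
        return 0;
    for (i = 0; i < 16; i++)
        if (p[i] != 0)
            ok = 0;
    free(p);
    return ok;
}
```

Calling sketch_zeroed_ok() from a test program exercises the flag path. In real kernel code the type and the flags come from the kernel headers, and the allocation would go through the kernel's own allocators rather than malloc().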
Drawing the line on inline
Kernel programmers tend to like inline functions. They resemble C macros, in that they result in code inserted directly into the calling function, with no added function call overhead. But, unlike macros, they offer type checking and the ability to include multiple lines of code without adding a pile of backslashes. In cases where a function is optimized out entirely, an inline function turns into no code at all - a level of efficiency which is hard to beat. And, in some cases, inlining is required; consider, for example, functions which embody special assembly instructions needed by the kernel.
Inline functions also have their costs, however. Their code is duplicated for every call, so inline functions which are called from more than one place make the kernel larger. Increasingly, developers are becoming aware that this size increase carries a performance penalty. As the gap between CPU and memory speeds grows, cache behavior increasingly determines how fast a program runs. So the performance benefits of inline functions are often, at best, illusory, and sometimes negative; a larger kernel will be a slower kernel.
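The difference between a macro and an inline function can be seen in a small user-space sketch. The names used here (SQUARE_MACRO, square_inline(), next_value()) are invented for illustration and do not come from the kernel. The macro pastes its argument in twice, so an argument with side effects is evaluated twice; the inline function evaluates it once and has a typed parameter.

```c
#include <assert.h>

/* Macro version: the argument text is substituted twice, so any side
 * effect in the argument happens twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Inline version: the argument is evaluated once and converted to int. */
static inline int square_inline(int x)
{
    return x * x;
}

/* Hypothetical helper which counts how many times it has been called. */
static int calls;

static int next_value(void)
{
    return ++calls;
}

/* Returns how many times next_value() ran when passed to the macro. */
static int macro_evaluations(void)
{
    calls = 0;
    (void)SQUARE_MACRO(next_value());
    return calls;
}

/* Returns how many times next_value() ran when passed to the inline function. */
static int inline_evaluations(void)
{
    calls = 0;
    (void)square_inline(next_value());
    return calls;
}
```

In this sketch macro_evaluations() returns 2 while inline_evaluations() returns 1, which is the sort of surprise that makes inline functions attractive in the first place.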
Ingo Molnar recently raised this issue with a set of patches changing how the kernel is built. By turning on unit-at-a-time compilation (which causes gcc to consider an entire file in its optimization decisions) and by turning off forced inlining, he was able to achieve a 5.3% size reduction. Taking things to an extreme, and applying these patches to an "allyesconfig" kernel (one with all configuration options turned on) results in a nearly 25% smaller kernel. That is, to say the least, a significant size reduction to be achieved by such a small patch. Anybody interested in de-bloating the kernel should be paying attention.
These patches have not been accepted by everybody, however. In particular, the turning off of forced inlining is controversial. When gcc is not forced to honor the inline keyword, it makes its own decisions, based on the size of the function and how many times it is called. When told to optimize for size, in particular, gcc will have a strong bias against inline functions. This approach yields a significant size reduction, but there is a problem: Linus doesn't trust the gcc maintainers to code consistent and correct inline heuristics, and Andrew Morton doesn't either. Rather than turning off forced inlining and letting gcc figure things out, they would rather go through the code and remove unnecessary inline declarations one by one.
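The distinction at issue can be sketched in ordinary C, on the assumption that "forced" inlining corresponds to gcc's always_inline function attribute. The macro and function names below are hypothetical; only the attribute itself is a real gcc feature.

```c
#include <assert.h>

/* Assumption: forced inlining is modeled with gcc's always_inline attribute. */
#define sketch_force_inline inline __attribute__((always_inline))

/* gcc is required to inline this function at every call site, whatever
 * its own size heuristics would have chosen. */
static sketch_force_inline int add_forced(int a, int b)
{
    return a + b;
}

/* Here "inline" is only a hint; gcc may emit an out-of-line copy instead,
 * especially when asked to optimize for size. */
static inline int add_hinted(int a, int b)
{
    return a + b;
}

/* Both variants compute the same result; only the generated code differs. */
static int sum_both(int a, int b)
{
    return add_forced(a, b) + add_hinted(a, b);
}
```

The two functions behave identically at the C level; the argument is entirely about who decides where the code ends up, the programmer or the compiler.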
It is true that the kernel has been burned by changes to how gcc handles inline in the past. Since then, gcc seems to have gotten smarter, and one can argue that its maintainers have become more aware of the issues. There is also the little fact that cleaning up the existing inline declarations is not a small job; Ingo says:
Arjan van de Ven adds:
How all of this will turn out is unclear. Certainly one can expect a higher level of resistance to patches adding inline functions in the future. There is likely to be a long flurry of de-inlining patches as well. The ability to turn off forced inlining might be added to the build system as an experimental option; some distributors may even decide to use this option for the kernels they ship. But enough developers seem uncomfortable with the idea of turning off forced inlining wholesale that this option may not get beyond the "experimental" stage for some time.
Goodbye semaphores?
In the previous episode, Ingo Molnar had posted his own version of the mutex patch, adding a new synchronization primitive to the kernel. Ingo has continued to refine this patch set, with frequent releases.
Perhaps the most significant development since then has been a private conversation between Andrew and Ingo. There is, it seems, a plan in place which would replace the current semaphore implementation entirely. Almost all current semaphore users are implementing simple mutual exclusion areas, so they would be converted over to the new mutex type directly. An estimated 90% of current semaphore users fall into this category. Of the remaining users, about 90% employ semaphores to indicate event completion. The task of converting those users to the completion type has been ongoing for some time; replacing semaphores would require finishing this job. Finally, an estimated 1% of the semaphores in the kernel are used for their counting feature; they can be converted over to a (not yet posted) architecture-independent counter type.
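The three usage patterns described above differ mainly in the semaphore's initial count. The following is a toy, single-threaded model written for this purpose; it is not the kernel's semaphore code, nothing in it blocks, and all of the toy_* names are hypothetical.

```c
#include <assert.h>

/* Toy model of a counting semaphore: just a count, with no sleeping. */
struct toy_sem {
    int count;
};

/* Non-blocking "down": take one unit if available; returns 1 on success. */
static int toy_try_down(struct toy_sem *s)
{
    if (s->count <= 0)
        return 0;
    s->count--;
    return 1;
}

/* "up": release one unit. */
static void toy_up(struct toy_sem *s)
{
    s->count++;
}

/* Mutual exclusion: the count starts at 1, so there is one holder at a time. */
static int toy_mutex_pattern(void)
{
    struct toy_sem s = { 1 };
    int first = toy_try_down(&s);   /* succeeds */
    int second = toy_try_down(&s);  /* fails: already held */

    toy_up(&s);
    return first == 1 && second == 0;
}

/* Completion: the count starts at 0; the waiter succeeds only after up(). */
static int toy_completion_pattern(void)
{
    struct toy_sem s = { 0 };
    int before = toy_try_down(&s);  /* event has not happened yet */
    int after;

    toy_up(&s);                     /* signal that the event is complete */
    after = toy_try_down(&s);
    return before == 0 && after == 1;
}

/* Counting: the count starts at N, allowing up to N simultaneous holders. */
static int toy_counting_pattern(void)
{
    struct toy_sem s = { 3 };
    int taken = 0;

    while (toy_try_down(&s))
        taken++;
    return taken;
}
```

Only the last pattern actually needs a count larger than one, which is why the first two can move to simpler, more specialized primitives.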
Once all that work is done, semaphores could be removed from the kernel
altogether. Says Andrew: "It's a lot of churn, but we'll end up with
a better end result and a somewhat-net-simpler kernel, so I'm
happy." Linus, meanwhile, has offered some suggestions for
improvements (already incorporated by Ingo) and stated: "At that point I'd like to
switch to mutexes just because the code is cleaner!"
Since then, most of the discussion has been concerned with the details of the mutex implementation rather than whether it is fundamentally a good idea or not. The main objections would appear to have been overcome. So, unless something new comes up, it looks like this change is going to happen; the only question is "when." The next couple of weeks will determine whether the mutex code will be part of 2.6.16 or not. Then all that's left is the long task of converting all semaphore users over and, finally, removing the old semaphore code.
Page editor: Jonathan Corbet