LWN Weekly Edition Front pageSecurity Kernel development Distributions Development Linux in the news Announcements Letters to the editor ->One big page
This page Previous weekFollowing week |
Kernel developmentRelease status Kernel release status The current stable 2.6 kernel is 2.6.17.3, released on June 30. It was a single-fix release for a denial of service vulnerability in the netfilter SCTP connection tracking code. One day earlier, 2.6.17.2 had been released with a relatively large set of important fixes. The SCTP fix can also be found in 2.6.16.23.The current 2.6 prepatch is 2.6.18-rc1, released by Linus on July 5. A summary of changes can be found in a separate article below. Also available are the short-form changelog (too bulky to be included with Linus's announcement) and the long-form changelog. The current -mm tree is 2.6.17-mm6. Recent changes to -mm include some extensions to the read-copy-update API, some "massive" CPU scheduler cleanup work, the removal of a number of old (OSS) sound drivers, and a set of patches shrinking the inode structure. A great many patches have been removed from -mm as they have found their way into 2.6.18-rc1.
Kernel development news Quote of the week
I like colorized diffs, but let's face it, those particular color
choices will make most people decide to pick out their eyes with a
fondue fork. And that's not good. Digging in your eye-sockets with
a fondue fork is strictly considered to be bad for your health, and
seven out of nine optometrists are dead set against the practice.
So in order to avoid a lot of blind git users, please apply this patch.
Looking forward to 2.6.18 Your editor, having returned from an all-too-short vacation, was faced with the prospect of looking over the 4500 (and counting) patches merged for the 2.6.18-rc1 release. Much of what has been merged is the usual set of fixes and updates, but some more user and developer-visible patches have gone in as well. The user-visible patches include:
Internal API changes visible to kernel developers include:
As of this writing, the 2.6.18 merge window has closed, so there probably will not be a whole lot of additions to the above list.
Time for ext4 A few weeks ago, this page looked at possible additions to the ext3 filesystem and the question of whether the time had come to freeze ext3 and put new features into a new ext4 filesystem again. The ext2/3 filesystem developers have now responded to that discussion with a clear answer: they will be moving on to ext4.More specifically, a new filesystem will be created under fs/ext4 in the kernel source. Said filesystem will register itself as "ext3dev," in an attempt to make it crystal clear that it is a development filesystem, not suitable for the storage of data which one actually wishes to keep. New feature work - especially changes which change on-disk formats and prevent interoperation with current ext3 implementations - will go into this new filesystem, while ext3 will continue to receive bug fixes and some safe improvements. Throughout this process, the new filesystem will retain its ability to work with the current ext3 format. Sometime in the future, ext3dev will be declared stable and renamed "ext4." Once the last bugs have been shaken out, this filesystem will lose its "experimental" designation and users will be encouraged to upgrade. Since support for ext3 formats will be there, this upgrade should be an easy process, with no backup-and-restore step or downtime required. Further in the future, the ext3 code may be removed and ext4 would transparently handle ext3 filesystems as well. There seems to be little opposition to this approach, so it would appear that things will happen this way. Since the addition of a new, experimental filesystem carries little regression risk, the creation of ext4 and the addition of some new features (extents, for example) could yet happen for 2.6.18.
The 2006 Linux File Systems Workshop The Linux file systems community met in Portland in June 2006 to discuss the next 5 years of file system development in Linux. Organized by Val Henson, Zach Brown, and Arjan van de Ven, and sponsored by Intel, Google, Oracle, the Linux File Systems Workshop brought together thirteen Linux file systems developers and experts to share data and brainstorm for three days. Our goal was to discuss the direction of Linux file systems development during the next 5 years, with a focus on disruptive technologies rather than incremental improvements. Our goal was not to design one new file system to rule them all, but to come up with several useful new file system architecture ideas (which may or may not reuse existing file system code). To stay focused, we explicitly ruled out discussion of the design of distributed or clustered file systems, with the exception of how they impact local file system design. We came out of the workshop with broad agreement on the problems facing Linux file systems, several exciting file system architecture ideas, and a commitment to working together on the next generation of Linux file systems.
The ProblemWhy do we need a Linux file systems workshop, when all seems well in Linux file systems land? Disks purr gently along, larger and fatter than ever before, but still essentially the same. I/O errors are an endangered species, more rumor than fact, and easily corrected with a simple fsck. The "df" command returns a comforting 50% free on most of your file systems. You chuckle gently as you read old file system man pages with directions for tuning inode/block ratios. Sure, that 32-bit file system size limit is looming somewhere over the horizon, but a quick patch to change the size of your block pointers is all you need and you'll be back in business again. After all, file systems are a solved problem, right? Right?If computer hardware never changed, we kernel developers would have nothing better to do than argue about the optimal scheduling algorithm and flame each others' coding style. Unfortunately, hardware has this terrible habit of changing frequently, drastically, and worst of all, exponentially. File systems are especially vulnerable to changes in hardware because of their long-lived nature. Much of operating systems software can be changed at will given a simple system reboot. But file systems - and their on-disk data layouts - live on and on. What has changed in hardware that affects file systems? Let's start with some simple, unavoidable facts about the way disks are evolving. Everyone knows that disk capacity is growing exponentially, doubling every 9-18 months. But what about disk bandwidth and seek time? At the last Storage Networking World conference, Seagate presented some details of their hard disk road map for the next 7 years (see page 16 of the slides [PDF]). Their predictions for 3.5 inch hard disks are summarized in the following table.
In summary, over the next 7 years, disk capacity will increase by 16 times, while disk bandwidth will increase only 5 times, and seek time will barely budge! Today it takes a theoretical minimum 4,000 seconds, or about 1 hour to read an entire disk sequentially (in reality, it's longer due to a variety of factors). In 2013, it will take a minimum of 12,800 seconds, or about 3.5 hours, to read an entire disk - an increase of 3 times. Random I/O workloads are even worse, since seek times are nearly flat. A workload that reads, e.g., 10% of the disk non-sequentially will take much longer on our 8TB 2013-era disk than it did on our 500GB 2006-era disk. Another interesting change in hardware is the rate of increase in capacity versus the rate of reduction in I/O errors per bit. In order for a disk to have the same overall number of I/O errors, every time capacity doubles, the per-bit I/O error rate must halve. Needless to say, this isn't happening, so I/O errors are actually more common even though the per-bit error rate has dropped. These are only a few of the changes in disk hardware that will occur over the next decade. What do these changes mean for file systems? First, fsck will take a lot longer in absolute terms, because disk capacity is larger, but disk bandwidth is relatively smaller, and seek time is relatively much larger. Fsck on multi-terabyte file systems today can easily take 2 days, and in the future it will take even longer! Second, the increasing number of I/O errors means that fsck is going to happen a lot more often - and journaling won't help. Existing file systems simply weren't designed with this kind of I/O error frequency in mind. These problems aren't theoretical - they are already affecting systems that you care about. Recently, the main server for Linux kernel source, kernel.org, suffered file system corruption from a failure at the RAID level. It took over a week for fsck to repair the (ext3) file system, when it would have taken far less time to restore from backup.
The workshopNow that the stage is set, we'll move on to what happened at the 2006 Workshop. The coverage has been split into the following pages:
Patches and updates Kernel trees
Core kernel code
Development tools
Device drivers
Filesystems and block I/O
Janitorial
Memory management
Networking
Architecture-specific
Page editor: Jonathan Corbet |
Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds
Powered by Rackspace Managed Hosting.