LWN.net Logo

2.6.30 merge window, part 2

By Jonathan Corbet
April 8, 2009
There have been some 3400 non-merge changesets incorporated into the mainline since last week's update, for a total of some 9600 changes merged for 2.6.30 overall. At this point, the 2.6.30 merge window is complete. User-visible changes merged since last week include:

  • The preadv() and pwritev() system calls have been added. They have been long in coming; LWN first covered these system calls in 2005. The expected user-space interface will be:

          ssize_t preadv(int d, const struct iovec *iov, int iovcnt, off_t offset);
          ssize_t pwritev(int d, const struct iovec *iov, int iovcnt, off_t offset);
    

    Due to the portability challenges involved, though, the actual kernel interface (seen only by the C library) is somewhat different.

  • The loop block driver supports a new ioctl() (LOOP_SET_CAPACITY) which can be used to change the size of the device on the fly.

  • The eventfd() system call takes a new flag (EFD_SEMAPHORE) which causes it to implement simple counting-semaphore behavior. See the changelog entry for a description of how this works.

  • The ext4 system is now more careful about forcing data out to disk in situations where small files have been truncated or renamed. This behavior increases robustness in the face of crashes, but it can also have a performance cost. There is a new mount option (auto_da_alloc) which can be used to disable this behavior. Also new for ext4 is a set of control knobs found under /sys/fs/ext4.

  • The ext3 filesystem, too, is more careful to flush data to disk when running in the data=writeback mode.

  • The default mode for ext3 has been changed from data=ordered to data=writeback. The latter performs quite a bit better in 2.6.30, but also carries an information disclosure risk if the system crashes. Distributors can change the default mode when they configure their kernels; some may well choose to retain the older data=ordered default.

  • The btrfs filesystem has also been changed to be careful about flushing data to disk after truncate or rename operations.

  • The Nilfs log-structured filesystem has been merged.

  • The MD RAID layer now has support for block-layer integrity checking. MD can also change chunk_size and layout in a reshape operation - a capability which makes it possible to turn a RAID5 array into RAID6 while it is running.

  • The exofs (formerly osdfs) filesystem, providing support for object storage devices, has been merged.

  • FS-Cache (formerly cachefs) has been merged. This subsystem (first covered here in 2004), provides a local caching layer for network filesystems; it has finally overcome the concerns expressed by some developers and made it into the mainline.

  • The distributed storage subsystem and pohmelfs network filesystem have been merged. Interestingly, this code went in via the -staging tree.

  • The ATA subsystem has gained support for the TRIM command.

  • There are two new tuning knobs under /proc/sys/vm (nr_pdflush_threads_min and nr_pdflush_threads_max); they place limits on the number of running pdflush threads in the system.

  • Multiple message queue namespaces are now supported.

  • The PA-RISC architecture has gained support for ftrace and latencytop.

  • The ARM architecture now has high memory support, for all of you out there with 2GB ARM-based systems.

  • The Xtensa architecture now supports systems without a memory management unit.

  • New device drivers:

    • Block: Marvell MMC/SD/SDIO host drivers.

    • Graphics: Samsung S3C framebuffers.

    • Miscellaneous: National Semiconductor LM95241 sensor chips, Linear Technology LTC4215 Hot Swap controller I2C monitoring interfaces, PPC4xx IBM DDR2 memory controllers, AMD8111 HyperTransport I/O hubs, AMD8131 HyperTransport PCI-X Tunnel chips, TI TWL4030/TWL5030/TPS695x0 PMIC voltage regulators, DragonRise game controllers, National Semiconductor DAC124S085 SPI DAC devices, Rohm BD2802 RGB LED controllers, TXx9 SoC NAND flash memory controllers, and ASUS ATK0110 ACPI hardware monitoring interfaces.

    • Networking: Neterion X3100 Series 10GbE PCIe server adapters.

    • Processors and systems: Tensilica S6000 processors and S6105 IP camera reference design kits, and Merisc AVR32-based boards.

    • Sound: HTC Magician audio devices.

    • Video: i.MX1/i.MXL CMOS sensor interfaces, Conexant cx231xx USB video capture devices, and Legend Silicon LGS8913/LGS8GL5/LGS8GXX DMB-TH demodulators.

    • Staging drivers (those not considered ready for regular mainline inclusion): stlc4550 and stlc4560 wireless chipsets, Brontes PCI frame grabbers, ATEN 2011 USB to serial adapters, Phison PS5000 IDE adapters, Plan 9 style capability pseudo-devices, Intel Management Engine Interfaces, Line6 PODxt Pro audio devices, USB Quatech ESU-100 8 port serial devices, Ralink RT3070 wireless network adapters, and a vast array of COMEDI data acquisition drivers.

Changes visible to kernel developers include:

  • There is a new memory debug tool controlled by the PAGE_POISONING configuration variable. Turning this feature on causes a pattern to be written to all freed pages and checked at allocation time. The result is "a large slowdown," but also the potential to catch a number of use-after-free errors.

  • The new function:

        int pci_enable_msi_block(struct pci_dev *dev, int count);
    

    allows a driver to enable a block of MSI interrupts.

  • As part of the FS-Cache work, the "slow work" thread pool mechanism has been merged. Some have expressed the hope that it would become the One True Kernel Thread Pool, but there seems to be little progress in that direction. See Documentation/slow-work.txt for more information.

  • There is a pair of new printing functions:

         int vbin_printf(u32 *bin_buf, size_t size, const char *fmt, ...);
         int bstr_printf(char *buf, size_t size, const char *fmt, 
                         const u32 *bin_buf);
    

    The difference here is that vbin_printf() places the binary value of its arguments into bin_buf. The process can be reversed with bstr_printf(), which formats a string from the given binary buffer. The main use for these functions would appear to be with Ftrace; they allow the encoding of values to be deferred until a given trace string is read by user space.

  • Also added is printk_once(), which only prints its message the first time it is executed.

  • The "kmemtrace" tracing facility has been merged. Kmemtrace provides data on how the core slab allocations function. See Documentation/vm/kmemtrace.txt for details.

  • A number of ftrace changes have been merged. There is a workqueue tracer which tracks the operations of workqueue threads. The blktrace block subsystem tracer can now be used via ftrace. The new "event" tracer allows a user to turn on specific tracepoints within the kernel; tracepoints have been added for various scheduler and interrupt events. "Raw" events (with binary-formatted data) are available now. The new "syscall" tracer is for tracing system calls.

The merge window is now closed, and the stabilization process can begin. Past experience suggests that something close to 3000 more changes will find their way into the mainline before the 2.6.30 release, which can be expected to happen sometime in June.


(Log in to post comments)

2.6.30 merge window, part 2

Posted Apr 9, 2009 6:37 UTC (Thu) by Felix.Braun (subscriber, #3032) [Link]

The ext3 filesystem, too, is more careful to flush data to disk when running in the (non-default) data=writeback mode.

Wasn't data=writeback made the default value too?

2.6.30 merge window, part 2

Posted Apr 9, 2009 9:06 UTC (Thu) by johill (subscriber, #25196) [Link]

It even says so in the next sentence.

data=writeback

Posted Apr 9, 2009 13:32 UTC (Thu) by corbet (editor, #1) [Link]

Yes, data=writeback was made the "default default." Obviously different sentences were written at different times...

Device drivers for AMD8111 HyperTransport I/O hubs, AMD8131 HyperTransport PCI-X Tunnel chips

Posted Apr 10, 2009 13:25 UTC (Fri) by TRS-80 (subscriber, #1804) [Link]

These were part of the original Opteron platform in 2002 - what's actually been added is Error Detection and Correction (EDAC) support for these chipsets. Similarly the PPC4xx IBM DDR2 memory controller driver added is for EDAC.

Device drivers for AMD8111 HyperTransport I/O hubs, AMD8131 HyperTransport PCI-X Tunnel chips

Posted Apr 17, 2009 2:32 UTC (Fri) by Duncan (guest, #6647) [Link]

Thanks. My personal main system is a dual Opteron 290 running these, and
I'm not sure I'd have understood the significance had you not mentioned
it.

Duncan

Copyright © 2009, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds