What's going into 2.6.17, part 2

The flood of patches heading into the mainline continues at full rate - though the merge window should be closing soon. The following are the highlights of the code merged since last week's summary, starting with the user-visible changes:

  • The lightweight robust futexes patch.

  • The software RAID (MD) layer can now handle on-the-fly resizing of RAID5 arrays.

  • Support for devfs has been removed from the SCSI subsystem, though it remains in many other parts of the kernel.

  • The user-space software suspend patch.

  • A big XFS update.

  • An 802.11 software MAC implementation for wireless networking stacks. Version 20 of the wireless extensions API was also merged.

  • The reverse-engineered Broadcom 43xx driver has been merged. As a result, the list of wireless network cards supported by Linux has just grown considerably.

  • A "memory spreading" mechanism which can be used to spread page cache and filesystem buffer allocations across all nodes of a NUMA system.

  • Two new fadvise() operations for controlling asynchronous file writeout behavior.

  • Support for reordering functions in the linked kernel image. The idea here is to put the highly-used bits of kernel code together so that the highly-trafficked part of the kernel fits within a single TLB entry. Currently, only x86-64 has the infrastructure for reordering.

  • Multiple-block allocation and mapping has been added to the ext3 filesystem, improving performance for sequential file access patterns.

  • A new scheduling domain has been added to represent multi-core systems.

  • A new RTC subsystem has been added, providing support for a variety of real-time hardware clocks.

Internal kernel API changes merged include:

  • A new utility function has been added:

         int execute_in_process_context(void (*fn)(void *data),
                                        void *data,
                                        struct execute_work *work);
    

    This function will arrange for fn() to be called in process context (where it can sleep). Depending on the context from which execute_in_process_context() is called, fn() may be invoked immediately or deferred by way of a work queue.

  • The SMP alternatives patch.

  • A rework of the relayfs API - but the sysfs interface has been left out for now.

  • A tracing mechanism for developers debugging block subsystem code.

  • There is a new internal flag (FMODE_EXEC) used to indicate that a file has been opened for execution.

  • The obsolete MODULE_PARM() macro is gone forevermore.

  • A new function, flush_anon_page(), can be used in conjunction with get_user_pages() to safely perform DMA to anonymous pages in user space.

  • Zero-filled memory can now be allocated from slab caches with kmem_cache_zalloc(). There is also a new slab debugging option to produce a /proc/slab_allocators file with detailed allocation information.

  • There are four new ways of creating mempools:

         mempool_t *mempool_create_page_pool(int min_nr, int order);
         mempool_t *mempool_create_kmalloc_pool(int min_nr, size_t size);
         mempool_t *mempool_create_kzalloc_pool(int min_nr, size_t size);
         mempool_t *mempool_create_slab_pool(int min_nr, 
                                             struct kmem_cache *cache);
    

    The first creates a pool which allocates whole pages (the number of which is determined by order), while the second and third create a pool backed by kmalloc() and kzalloc(), respectively. The fourth is a shorthand form of creating slab-backed pools.

  • The prototype for hrtimer_forward() has changed:

         unsigned long hrtimer_forward(struct hrtimer *timer,
                                       ktime_t now, ktime_t interval);
    

    The new now argument is expected to be the current time. This change allows some calls to be optimized. The data field has also been removed from the hrtimer structure.

  • A whole set of generic bit operations (find first set, count set bits, etc.) has been added, helping to unify this code across architectures and subsystems.

  • The inode f_ops pointer - which refers to the file_operations structure for the open file - has been marked const. Quite a bit of code that used to modify that structure has been reworked to compensate. Similar changes have been made in many filesystems. "The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean)."
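
The immediate-or-deferred behavior of execute_in_process_context() described in the list above can be modeled in user space. What follows is a minimal sketch, not kernel code: every sketch_ name, the sketch_in_atomic flag (standing in for the kernel's own context check), the singly-linked queue, and the return convention (0 when fn() ran immediately, 1 when it was queued) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct execute_work. */
struct sketch_work {
    void (*fn)(void *data);
    void *data;
    struct sketch_work *next;
};

/* Simulated context: true means "not allowed to sleep right now". */
bool sketch_in_atomic;

/* A trivial FIFO playing the role of the work queue. */
static struct sketch_work *sketch_head, *sketch_tail;

/*
 * Run fn(data) now if the caller may sleep; otherwise record the request
 * in the caller-provided work item and defer it.
 * Returns 0 if fn() ran immediately, 1 if it was queued (sketch convention).
 */
int sketch_execute_in_process_context(void (*fn)(void *data), void *data,
                                      struct sketch_work *work)
{
    assert(fn != NULL && work != NULL);

    if (!sketch_in_atomic) {
        fn(data);
        return 0;
    }

    work->fn = fn;
    work->data = data;
    work->next = NULL;
    if (sketch_tail)
        sketch_tail->next = work;
    else
        sketch_head = work;
    sketch_tail = work;
    return 1;
}

/* Plays the role of the worker thread: drain the queue in FIFO order. */
void sketch_run_queued_work(void)
{
    while (sketch_head) {
        struct sketch_work *w = sketch_head;

        sketch_head = w->next;
        if (!sketch_head)
            sketch_tail = NULL;
        w->fn(w->data);
    }
}

/* Example callback: increment the int that data points to. */
void sketch_bump(void *data)
{
    ++*(int *)data;
}
```

In this sketch the work item is owned by the caller and must stay valid until the queue has been drained, which is why it is passed in as a parameter instead of being allocated inside the function.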

If the usual pattern holds, the merging of new features will stop sometime around the end of the month, with 2.6.17-rc1 being released shortly thereafter.



What's going into 2.6.17, part 2

Posted Mar 30, 2006 3:39 UTC (Thu) by ehovland (subscriber, #2284) [Link]

An 801.11 software MAC

Maybe that should be 802.11

unsigned lnog hrtimer_forward

Likewise, maybe that should be long.

Man that is a lot of change.

What's going into 2.6.17, part 2

Posted Apr 2, 2006 17:34 UTC (Sun) by Zenith (subscriber, #24899) [Link]

While I'm sure your corrections are most appreciated, I do think LWN has a policy of keeping them out of the discussions and having them mailed in instead.

What's going into 2.6.17, part 2

Posted Mar 30, 2006 10:24 UTC (Thu) by nix (subscriber, #2304) [Link]

blktrace isn't just useful for debugging. It's also useful for laptop users, people with machines in their bedrooms ;) or other such areas; if the disk's being hit a lot they can figure out what's hitting it. There's no other way to go back from block I/O to process (or not-a-process-this-is-metadata-or-journal) like that.

(I assume the 'tracing mechanism' mentioned is blktrace.)

splice()

Posted Mar 30, 2006 13:30 UTC (Thu) by brugolsky (subscriber, #28) [Link]

A notable development this week is the imminent merging of splice(). Hooray! :-)

What's going into 2.6.17, part 2

Posted Mar 30, 2006 18:26 UTC (Thu) by pr1268 (subscriber, #24648) [Link]

As an owner of a laptop and a Broadcom 43xx chipset-based 802.11 card, I couldn't be more pleased its driver will be integrated. I've been hoping a native Linux driver would appear integrated with the mainline kernel soon.

Many thanks to the Kernel hackers and testers who spent the time to develop the 43xx drivers. And equal thanks to the Soft MAC developers. I look forward to running this on my laptop (and abandoning the dependency on NDISWrapper and the Windows driver).

fadvise() operations --> sync_file_range()

Posted Mar 31, 2006 22:22 UTC (Fri) by mkerrisk (subscriber, #1978) [Link]

Two new fadvise() operations for controlling asynchronous file writeout behavior.

In fact these new operations are now likely to be implemented as a new system call:

sync_file_range(int fd, loff_t offset, loff_t nbytes, int flags);

with the following flags:

SYNC_FILE_RANGE_WAIT_BEFORE: wait upon writeout of all pages in the range before performing the write.

SYNC_FILE_RANGE_WRITE: initiate writeout of all those dirty pages in the range which are not presently under writeback.

SYNC_FILE_RANGE_WAIT_AFTER: wait upon writeout of all pages in the range after performing the write.

See http://marc.theaimsgroup.com/?l=linux-kernel&m=114370458614279&w=2
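
A minimal user-space sketch of how the three flags listed above might be combined, assuming the call is exposed through the glibc wrapper sync_file_range() declared in <fcntl.h> under _GNU_SOURCE. The helper names sketch_flush_range() and sketch_demo(), the temporary file path, and the fallback to fdatasync() when the call is unavailable are illustrative assumptions and not part of the proposal.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait for writeout already in flight, start writeout
 * of the remaining dirty pages in the range, then wait for that to finish.
 * Returns 0 on success, -1 with errno set on failure.
 */
int sketch_flush_range(int fd, off_t offset, off_t nbytes)
{
    unsigned int flags = SYNC_FILE_RANGE_WAIT_BEFORE |
                         SYNC_FILE_RANGE_WRITE |
                         SYNC_FILE_RANGE_WAIT_AFTER;

    if (sync_file_range(fd, offset, nbytes, flags) == 0)
        return 0;

    /* Assumption: where the call is not implemented, flush the whole file. */
    if (errno == ENOSYS)
        return fdatasync(fd);

    return -1;
}

/* Hypothetical demo: write a short message to a temp file and flush it. */
int sketch_demo(void)
{
    char path[] = "/tmp/sfr-sketch-XXXXXX";
    static const char msg[] = "hello, writeback\n";
    const size_t len = sizeof(msg) - 1;
    int fd = mkstemp(path);
    int rc;

    if (fd < 0)
        return -1;
    unlink(path);               /* file disappears once fd is closed */

    if (write(fd, msg, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    rc = sketch_flush_range(fd, 0, (off_t)len);
    close(fd);
    return rc;
}
```

Passing only SYNC_FILE_RANGE_WRITE would start writeout without blocking on it, which is the asynchronous case the original fadvise() operations were aimed at; the combination used here is the fully synchronous variant for the given range.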

Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds