LWN.net Logo

2.6.35 merge window part 2

By Jonathan Corbet
May 26, 2010
Much code - some 6300 non-merge changesets - has gone into the mainline kernel since last week's article. Listing all of the changes would be an impossible task, even for the KernelNewbies folks who seem to get close, but an overview can be given. The most interesting user-visible changes are:

  • The receive packet steering and receive flow steering mechanisms have been added to the networking subsystem.

  • The memory compaction patch set has been merged. This should lead to less memory fragmentation and higher success rates for large allocations.

  • A loophole which would, in some circumstances, allow a security module to be registered at runtime has been closed; security modules must be present at boot time.

  • The network traffic control subsystem is now namespace-aware.

  • The "Communication CPU to Application CPU Interface" (CAIF) protocol, used to speak with ST-Ericsson modems, is now supported in the network stack. Also supported in version 3 of the Layer Two Tunneling Protocol (L2TP - RFC 3931).

  • "FunctionFS" is a mechanism by which user-space USB drivers can create USB gadgets with composite functionality. The "f_uvc" driver builds on this feature to create a video capture device driven by data from user space.

  • There is now support for the "restricted access regions" mechanism built into Intel "Moorestown" processors. RAR can be used to block devices (including processors) out of specific regions of memory.

  • The cpuidle "menu" governor now features idle pattern detection which tries to be smarter about sleep-state selection based on recent system history.

  • The slab allocator has gained memory hotplug support.

  • Extended attributes are now supported in the squashfs filesystem.

  • The size of the in-kernel buffer used to hold data for a pipe can be queried with the new F_GETPIPE_SZ fcntl() command, and changed with F_SETPIPE_SZ. As of this writing, the units used by this command are page-sized buffers, but that will almost certainly change to bytes in the near future.

  • Lots of new drivers:

    • Input: Analog Devices AD714x capacitance touch sensors, Hampshire serial touchscreens, TI TCA6416 keypads, Cando dual touch panels, Minibox PicoLCD devices, Prodikeys PC-MIDI Keyboard devices, and Roccat Kone mice.

    • Network: Atheros HTC based wireless cards, USB-connected Agere Orinoco wireless cards, and SBE wanPMC-C[421]E1T1 WAN adapters. The ath5k driver has also gained adaptive noise immunity support, a feature which is said to nearly double throughput in noisy situations.

    • Sound: USB Audio Class v2.0 compliant devices, Wolfson Micro WM1133-EV1 on i.MX31ADS systems, Wolfson Micro WM9090 amplifiers, TI TWL6040 codecs, Texas Instruments SDP4430 audio devices, Zipit Z2 WM8750 audio devices, and AudioScience ASI sound cards.

    • Storage: devices with the SmartMedia/xd flash translation layer, Ricoh R5C852 xD card readers, MPC5121 built-in NAND flash controllers, Denali NAND controller on Intel Moorestown systems, Samsung OneNAND controllers, and QLogic PCIe QLE InfiniBand host channel adapters.

    • Systems and processors: much of the support code for the ARM "MSM" architecture, as found in the G1/ADP1 handset, has been merged. Additionally: OMAP3 SBC STALKER boards and PowerPC 476 processors.

    • Video4Linux: Trident TV Master tm5600/tm6000 chips, SuperH VOU video output devices, AK8813/AK8814 video encoders and DataTranslation DT3155 frame grabbers.

    • Miscellaneous: RDMA and iWARP on Chelsio T4 1GbE and 10GbE adapters, viafb-based i2c and GPIO devices, SuperH IrDA devices, Altera UARTs, GSM MUX line discipline support (in the TTY layer), Niagara2 stream processing units, ACX565AKM (Nokia N900) panels, Analog Devices ADIS16255 low power gyroscopes, Analog Devices ADIS16300 and ADIS16400/5 inertial sensors, Analog Devices ADIS16240 programmable impact sensors, Analog Devices ADIS16260/5 digital gyroscope sensors, Analog Devices ADIS16209 dual-axis digital inclinometer and accelerometer devices, Timberdale FPGA DMA engines, ST-Ericsson DMA40 DMA controllers, Texas Instruments ADS7871 A/D converters, Freescale IMX2 hardware watchdogs, Freescale MPC512x PSC SPI controllers, and Cirrus EP93xx SPI controllers.

      Additionally, "g_hid" is a USB gadget driver implementing the human interface device class specification, g_webcam is a gadget-side USB video device, and g_ffs allows the creation of USB composite functions.

Changes visible to kernel developers include:

  • There is a new variant of request_irq():

        request_any_context_irq(unsigned int irq, irq_handler_t handler,
    			    unsigned long flags, const char *name, 
    			    void *dev_id);
    

    This function connects the interrupt handler in the usual way. The difference is that it looks at how the interrupt line itself was set up (by architecture-specific code) and decides whether to establish a traditional hard interrupt handler or a threaded handler. The return value on success is either IRQC_IS_HARDIRQ or IRQC_IS_NESTED.

  • Also related to request_irq(): the IRQF_DISABLED flag now does nothing; it will be removed entirely in 2.6.36. All interrupt handlers are now called with interrupts disabled; see this article for details on the change.

  • The timer slack mechanism has been merged. Timer slack (which applies to old-style timers, not hrtimers) allows the system to defer timer expiration by a bounded amount in order to get timers to expire at the same time, thus minimizing system wakeups.

  • The new function ktime_to_ms() converts a kernel time value to milliseconds.

  • A number of unused security module hooks (task_setuid, sb_post_remount, sb_post_pivotroot, sb_umount_close, acct, inode_delete, sb_umount_busy, task_setgid, task_setgroups, sb_post_addmount, cred_commit, and key_session_to_parent) have been removed.

  • The power management quality of service (pm_qos) system API has been significantly changed; see this article for details.

  • "Tagged directory support" has been added to sysfs. This feature allows namespace-specific tags to be added to sysfs directories; these tags then control which version of a directory (if any) is visible within any given namespace.

  • kref_set() has been removed after a determination that none of its three users were correct.

  • The read(), write() and mmap() methods in struct bin_attribute have gained a new struct file pointer argument.

  • The "kdb" low-level kernel debugger has been joined with kgdb and merged into the mainline.

  • The checkpatch script will now complain if kernel configuration options are added with fewer than four lines of help text.

  • There is a bunch of new infrastructure in the Video4Linux2 subsystem, including a framework for supporting memory-to-memory devices (video processing engines), a mechanism for reporting asynchronous events to user space, and a new core subsystem for infrared controllers.

In a normal merge window, changes would continue to flow into the mainline until the end of the month. Linus has made it clear that he no longer guarantees "normal" merge windows, though, so the window could close sooner than that. Regardless, tune in next week for a summary of the final changes to be merged for 2.6.35.


(Log in to post comments)

Memory compaction and the effect on fragmentation

Posted May 27, 2010 11:45 UTC (Thu) by mel (subscriber, #5484) [Link]

Our esteemed editor said that compaction should lead to less fragmentation but this is not necessarily the case. Anti-fragmentation (preferably with min_free_kbytes adjusted as recommended by hugeadm or tuned automatically within the transparent hugepage tree) still is the key to handling fragmentation (i.e. not necessarily reducing but but giving the system options to handle it). Anti-frag aims to keep pages of different mobility within the same huge page boundaries. Compaction doesn't affect this directly although there would be some minor benfits when the linear scanner would take movable pages from an unmovable range of superpages and put them somewhere else.

They key benefits it really brings are

a) one can avoid lumpy reclaim so a contiguous page can be created faster by copying pages instead of dumping them with IO

b) it can move pages that are mlocked so if locked pages are a major factor of the workload, they will have less impact on large allocations

c) On swapless systems, the pages can be moved where as without compaction there is very little page reclaim can do about anonymous pages. This is probably the "big" use-case I am expecting in 2.6.35 because it's something that has been whinged about in clusters.

d) if/when transparent hugepage support gets merged, it'll be cheaper to promote base pages to huge pages than if lumpy reclaim was used.

A side-benefit is that it avoids some bad trashing from lumpy reclaim. It wasn't such a big deal before but it can cause trashing in the system. Waiting to writeback dirty pages is *slow* and its waiting logic can lead to large stalls as well. It's been on my TODO list to investigate this closely for some time and see if there is a better option which I'm hoping to get around to some time next month. It's been put on the long finger because there were other issues that were a lot more urgent over the last 6-12 months.

Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds