
The second half of the 2.6.37 merge window

By Jonathan Corbet
November 1, 2010
The 2.6.37-rc1 prepatch has been released, so the merge window is now closed. Nearly 3100 changesets were merged between last week's summary and the closing of the window; there were 9518 non-merge changesets merged in total for 2.6.37. The most significant user-visible changes include:

  • The last significant big kernel lock holdout - the file locking code - has been fixed. It is now possible to build a generally useful kernel without the BKL, though quite a few older drivers still require it.

  • Support for the CAIF shared memory protocol has been added.

  • The perf probe command has a new --vars option which will cause it to list the local variables which are accessible from a given probe point. With --externs, global variables are listed as well. It is now possible to place probes in loadable modules.

  • The ext4 filesystem now supports "lazy inode table initialization," an option which makes the creation of filesystems faster. Ext4 now features a reworked I/O submission path which should improve performance and scalability.

  • "Batched discard" support has been added in the form of the new FITRIM ioctl() command. This feature allows the filesystem to tell the underlying storage device about all of the unused blocks at once. So far, this feature is only implemented by the ext4 filesystem.

  • Much of the long-delayed Xen Dom0 (hypervisor) support has finally been merged. 2.6.37 will still not be Dom0-ready; there will be at least one more development cycle required for that; see this summary from Jeremy Fitzhardinge for the full plan.

  • The fanotify subsystem has been re-enabled, and should be available in 2.6.37.

  • The 9p filesystem has gained POSIX access control list support.

  • The Speakup kernel-based screen reader has been merged into the staging tree.

  • New drivers:

    • Systems and processors: aESOP Samsung S5PV210-based Torbreck boards.

    • Audio: Intel MID SST DSP devices.

    • Block: Cypress Astoria USB SD host controllers, Marvell PXA168/PXA910/MMP2 SD host controllers, and ST Microelectronics Flexible Static Memory Controllers.

    • Miscellaneous: basic, memory-mapped GPIO controllers, Intel Topcliff GPIO controllers, Intel Moorestown/Medfield I2C controllers, IDT CPS Gen.2 SRIO RapidIO switches, Freescale i.MX DMA engines, ARM PrimeCell PL080 or PL081 DMA engines, Cypress West Bridge Astoria controllers, USB ENE card readers, Asahi Kasei AK8975 3-axis magnetometers, OLPC XO display controller devices, Analog Devices AD799x analog/digital converters, Winbond/Nuvoton W83795G/ADG hardware monitoring chips, Flarion OFDM USB and PCMCIA modems, Maxim MAX8952 and MAX8998 Power Management ICs, National Semiconductors LP3972 PMIC regulators, and Broadcom BCM63xx hardware watchdogs.

    • Network: Intel Topcliff platform controller hub CAN interfaces, Technologic Systems TS-CAN1 PC104 peripheral boards, SBE wanPMC-2T3E3 interfaces, RealTek RTL8712U (RTL8192SU) Wireless LAN NICs (replaces older rtl8712 driver), Atheros AR6003 wireless interface controllers, Beceem USB WiMAX adapters, and Broadcom bcm43xx wireless chipsets.

    • Video4Linux2: remotes using the RC-5 (streamzap) protocol, Konica chipset-based cameras, Sharp IX2505V silicon tuners, LME2510 DM04/QQBOX USB DVB-S boxes, Samsung s5h1432 demodulators, several new Conexant cx23417-based boards, Nuvoton w836x7hg consumer infrared transceivers, OmniVision OV6650 sensors, OMAP1 camera interfaces, Siliconfile SR030PC30 VGA cameras, Sony imx074 sensors, and VIA integrated chipset camera controllers.

Changes visible to kernel developers include:

  • There have been, once again, significant changes to the Video4Linux2 driver API. The new "mediabus" layer adds flexibility for dealing with complex devices, but also complicates simpler drivers somewhat. The videotext/teletext API, long unused, has been removed.

  • The file_system_type structure has a new mount() function which is meant to replace get_sb().

Now the stabilization period begins; the final 2.6.37 release will almost certainly happen in January.




The second half of the 2.6.37 merge window

Posted Nov 1, 2010 21:22 UTC (Mon) by slothrop (guest, #69834) [Link] (15 responses)

BTW a nice userspace program for the (ext4) FITRIM ioctl() command
can be found here:
http://www.spinics.net/lists/xfs/msg01837.html
It's called fstrim and works nicely on my Vertex SSD.
Compile and run as root (e.g.): fstrim -v /
(There is no visible feedback, you have to wait a few minutes
(depending on the size and state of the SSD) before it finishes.)
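
For readers curious about what a tool like fstrim does underneath, here is a minimal Python sketch of issuing FITRIM from user space. It is not the code of the fstrim program linked above. The structure layout follows struct fstrim_range in <linux/fs.h>; the hard-coded FITRIM value assumes the ioctl number encoding used on x86 and ARM, and the helper names pack_range() and trim_filesystem() are made up for illustration.

```python
import fcntl
import os
import struct

# _IOWR('X', 121, struct fstrim_range) as encoded on x86 and ARM.
# Assumption: architectures with a different ioctl encoding need another value.
FITRIM = 0xC0185879

# struct fstrim_range { __u64 start; __u64 len; __u64 minlen; };
FSTRIM_RANGE = struct.Struct("=QQQ")
U64_MAX = 2**64 - 1


def pack_range(start=0, length=U64_MAX, minlen=0):
    """Build a writable buffer holding one fstrim_range (hypothetical helper)."""
    return bytearray(FSTRIM_RANGE.pack(start, length, minlen))


def trim_filesystem(mountpoint, start=0, length=U64_MAX, minlen=0):
    """Ask the filesystem mounted at `mountpoint` to discard its unused blocks.

    Returns the byte count the kernel writes back into the `len` field.
    Needs root privileges and a filesystem that implements FITRIM.
    """
    buf = pack_range(start, length, minlen)
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        # The bytearray is updated in place by the ioctl.
        fcntl.ioctl(fd, FITRIM, buf)
    finally:
        os.close(fd)
    _start, trimmed, _minlen = FSTRIM_RANGE.unpack(buf)
    return trimmed
```

Calling trim_filesystem("/") as root would correspond roughly to "fstrim /"; on a filesystem without FITRIM support the ioctl is expected to fail with an OSError.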

The second half of the 2.6.37 merge window

Posted Nov 1, 2010 21:31 UTC (Mon) by slothrop (guest, #69834) [Link]

It now can also be found on sourceforge:
http://sourceforge.net/projects/fstrim/

The second half of the 2.6.37 merge window

Posted Nov 1, 2010 23:16 UTC (Mon) by tardyp (guest, #58715) [Link] (7 responses)

Do you think this could be used to reduce the size of VirtualBox's variable-size virtual disks?

The second half of the 2.6.37 merge window

Posted Nov 2, 2010 4:51 UTC (Tue) by slothrop (guest, #69834) [Link] (5 responses)

No. This is not about reducing sizes; it's about telling
your SSD which blocks are not used by the filesystem currently,
so that they could be added to the internal list of available
blocks. This hopefully makes the SSD snappier and faster.

The second half of the 2.6.37 merge window

Posted Nov 2, 2010 6:39 UTC (Tue) by bronson (subscriber, #4806) [Link] (4 responses)

Right. And if you tell virtualbox which blocks on the virtual disk are no longer used, the backing file can be made more sparse.

In theory, trim could be just as useful to loopback-mounted filesystems as it is to SSDs, no?

The second half of the 2.6.37 merge window

Posted Nov 2, 2010 7:24 UTC (Tue) by slothrop (guest, #69834) [Link] (3 responses)

That's an interesting idea, but it needs to be implemented
in virtualbox or qemu. Then you could issue the FITRIM
ioctl from the running guest and it would be intercepted
by these programs and acted upon accordingly.

The second half of the 2.6.37 merge window

Posted Nov 2, 2010 14:05 UTC (Tue) by tardyp (guest, #58715) [Link] (1 responses)

It seems one of the VirtualBox developers answered a question on this topic a while ago.
http://forums.virtualbox.org/viewtopic.php?f=9&t=1822...
"""
VD images files are typically mapped onto physical rotating media. These have high burst bandwidth but poor seek times (compared to SSD). The VDI format uses 2Mb pages for performance reasons. Dropping this to 4K to align it to the SDD driver technology would have a disastrous impact on real I/O performance (up to a factor of 10 slowdown say). Sorry, but this is a dumb idea.
"""
This was not wrong a year ago, but now I'm doing desktop virtualization on an SSD laptop. I don't want my virtual disk to grow indefinitely, and I don't want spinning-disk optimizations.

The second half of the 2.6.37 merge window

Posted Nov 2, 2010 16:51 UTC (Tue) by butlerm (subscriber, #13312) [Link]

Nothing stops clients from trimming free 2MB blocks using 4K block trim operations...
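
As an illustration of that point, here is a small Python sketch of how a disk-image layer could collect 4K trims and release a 2MB page once every block in it has been trimmed. This is purely hypothetical bookkeeping, not VirtualBox or qemu code; the class name TrimTracker and its methods are invented for the example, and the sizes are the ones quoted above.

```python
PAGE_SIZE = 2 * 1024 * 1024                # 2MB image page, as in the quote above
BLOCK_SIZE = 4 * 1024                      # 4K trim granularity
BLOCKS_PER_PAGE = PAGE_SIZE // BLOCK_SIZE  # 512


class TrimTracker:
    """Hypothetical bookkeeping: remember which 4K blocks of each page are trimmed."""

    def __init__(self):
        # page index -> set of trimmed block offsets within that page
        self._trimmed = {}

    def trim(self, block):
        """Record a trim of one 4K block.

        Returns the page index if that page is now entirely unused
        (and could be dropped from the image), otherwise None.
        """
        page, offset = divmod(block, BLOCKS_PER_PAGE)
        blocks = self._trimmed.setdefault(page, set())
        blocks.add(offset)
        if len(blocks) == BLOCKS_PER_PAGE:
            del self._trimmed[page]
            return page
        return None

    def write(self, block):
        """A write makes the block live again, so forget any earlier trim of it."""
        page, offset = divmod(block, BLOCKS_PER_PAGE)
        blocks = self._trimmed.get(page)
        if blocks is not None:
            blocks.discard(offset)
            if not blocks:
                del self._trimmed[page]
```

With this scheme the image keeps its 2MB pages; a page is only given up when all 512 of its 4K blocks have been reported as free.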

The second half of the 2.6.37 merge window

Posted Nov 3, 2010 16:49 UTC (Wed) by nix (subscriber, #2304) [Link]

You'd also need a holepunch operation in the overlying filesystem, to be able to insert holes in the middle of an existing file.

The second half of the 2.6.37 merge window

Posted Nov 2, 2010 7:37 UTC (Tue) by mokki (subscriber, #33200) [Link]

I think so. When the virtual block device run by virtualbox/kvm detects the trim command it could punch a hole in the disk image, freeing disk space.

XFS has an ioctl for punching holes in files. Unfortunately, a fallocate flag for deallocating space, proposed in 2007, has not been included in mainline.

The second half of the 2.6.37 merge window

Posted Nov 3, 2010 15:18 UTC (Wed) by nye (subscriber, #51576) [Link] (5 responses)

>BTW a nice userspace program for the (ext4) FITRIM ioctl() command can be found here:

In what circumstances would you want to use this rather than just mounting with the 'discard' option?

I'm curious because I've just bought my first SSD and I'm wondering how best to maintain performance. At first I was thinking of reserving some unpartitioned space, but then it occurred to me that it would probably be better just to use an ext4 partition. That way I can use the space if I really need it at some point, but it can be 'reclaimed' by the SSD for its own purposes when there's nothing stored there. Is that not correct?

The second half of the 2.6.37 merge window

Posted Nov 3, 2010 15:31 UTC (Wed) by slothrop (guest, #69834) [Link] (4 responses)

I have performance problems with my first generation
Vertex SSD if I use the ext4 discard option. So in
this case it is better to run fstrim as a cron job every
night. But if you have a good SSD with a fast ATA trim
implementation, the discard mount option is just fine.

The second half of the 2.6.37 merge window

Posted Nov 3, 2010 16:45 UTC (Wed) by nye (subscriber, #51576) [Link]

Ah, that makes sense - thanks.

The second half of the 2.6.37 merge window

Posted Nov 6, 2010 4:13 UTC (Sat) by Lope (guest, #65656) [Link] (1 responses)

The question of how often you want to run fstrim on the whole filesystem is a little tricky, because it depends on what type of workload you are running. If you are able to fill up your disk within one day you probably should reclaim the space (discard) at least once a day, but on a regular desktop I very much doubt that.

So, what you probably want to do (if you want to be exact) is to watch the amount of data written to the filesystem (you can do that through /sys/fs/ext4/<device>/lifetime_write_kbytes, assuming that <device> holds an ext4 filesystem) and, when it is about to reach some threshold (like 80% of the device size), start doing the discard (note that FITRIM on ext4 will return the amount of reclaimed space). But all of this may be awful overkill for a simple desktop :). Aside from that, there are some very bad devices out there which show a significant performance regression even at 50% filesystem saturation. And of course if you have more partitions on the same device ... it gets even more complicated :)

All that said, if you are OK with doing it once a day (and you are not even noticing it), it is a good thing to do. But if it disturbs you, you probably would not want to do it so often, or would at least do it in parts (which you can do with a little scripting) over a longer period of time.
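
The threshold idea described above could be sketched in Python roughly as follows. The sysfs file name comes from the comment; everything else (the function names, the integer percentage, and the notion of remembering the counter value from the last trim) is an assumption made for illustration.

```python
from pathlib import Path


def read_lifetime_write_kbytes(device, sysfs_root="/sys/fs/ext4"):
    """Read the ext4 write counter (in kilobytes) for a device name such as 'sda1'."""
    path = Path(sysfs_root) / device / "lifetime_write_kbytes"
    return int(path.read_text())


def should_trim(written_now_kb, written_at_last_trim_kb, device_size_kb,
                threshold_percent=80):
    """Return True once the data written since the last trim reaches the
    given percentage of the device size (80% is the example figure above)."""
    written_since_trim_kb = written_now_kb - written_at_last_trim_kb
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return written_since_trim_kb * 100 >= threshold_percent * device_size_kb
```

A nightly job could call read_lifetime_write_kbytes(), compare against the value it stored after the previous trim, and only run fstrim when should_trim() says so. A single device with several partitions would need the counters of all its filesystems added up, which this sketch does not attempt.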

The second half of the 2.6.37 merge window

Posted Nov 6, 2010 18:08 UTC (Sat) by slothrop (guest, #69834) [Link]

Good points,
I write ~30GB per week to my 30GB SSD (just one partition).
So I changed cron from running fstrim daily to twice a week.

The second half of the 2.6.37 merge window

Posted Nov 6, 2010 4:27 UTC (Sat) by Lope (guest, #65656) [Link]

If you do want to benchmark the discard performance of your SSD, you can use this tool:

http://sourceforge.net/projects/test-discard/

But there are some very bad devices which might be corrupted by sending lots of small TRIMs, so be careful (or blame your vendor for doing a bad job!).

The second half of the 2.6.37 merge window

Posted Nov 2, 2010 0:06 UTC (Tue) by nteon (subscriber, #53899) [Link] (2 responses)

Is there a reason Linus's tree at http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-... doesn't have anything after 2.6.36-rc6? Did it move, or get rolled back for some reason?

The second half of the 2.6.37 merge window

Posted Nov 2, 2010 0:11 UTC (Tue) by corbet (editor, #1) [Link]

Linus forgot to push out the last stuff he committed, but it's there now.

The second half of the 2.6.37 merge window

Posted Nov 2, 2010 0:33 UTC (Tue) by BenHutchings (subscriber, #37955) [Link]

This is due to a botched disk upgrade; see mail on LKML.

'Support for the CAIF shared memory protocol has been added. '

Posted Nov 3, 2010 11:48 UTC (Wed) by trancecode (guest, #38493) [Link] (1 responses)

Can anyone here point me to more info about the new CAIF protocol?

I read the docs and googled a bit but it is not clear to me how to use/test it.

Are there user space utilities available?
Is hardware essential? - I see a loopback implementation mentioned.
Do I need to buy a SonyEricsson Android phone for this to be useful?
(I might just do that if I can be sure to get one that will work)

'Support for the CAIF shared memory protocol has been added. '

Posted Nov 11, 2010 13:06 UTC (Thu) by jch (guest, #51929) [Link]

I was under the impression that CAIF is used internally by Android phones. It's for communication between the two CPUs in the phone, not for communicating with an external device.

--jch


Copyright © 2010, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds