API changes in the 2.6 kernel series

The 2.6 kernel development series differs from its predecessors in that much larger and potentially destabilizing changes are being incorporated into each release. Among these changes are modifications to the internal programming interfaces for the kernel, with the result that kernel developers must work harder to stay on top of a continually-shifting API. There has never been a guarantee of internal API stability within the kernel - even in a stable development series - but the rate of change is higher now.

This article will be updated to keep track of the internal changes for each 2.6 kernel release. Its permanent location is:

http://lwn.net/Articles/2.6-kernel-api/

This page will, doubtless, remain incomplete for a while. If you see an omission, please let us know by sending a note to kernel@lwn.net rather than by posting comments here. The chances of a prompt update are higher, the article will not become cluttered with redundant comments, and we'll be more than happy to credit you here.

If you are a Linux Device Drivers, Third Edition reader looking for information on changes since the book was published: LDD3 covers version 2.6.10 of the kernel, so only the changes starting with 2.6.11 are relevant.

Last update: January 5, 2006

2.6.15 (January 2, 2006)

  • The nested class device patch was merged, allowing class_device structures to have other class_devices as parents. This patch is a hack to make the input subsystem work with sysfs. This code will change again in the future; see Greg Kroah-Hartman's article for more information on what is planned.

  • The prototypes for the driver model class "interface" methods add() and remove() have changed; there is now a new parameter pointing to the relevant interface structure.

  • A new platform_driver structure has been added to describe drivers for devices built into the core "platform."

  • The prototypes for the suspend() and resume() methods in struct device_driver have changed. They are also only called once per event, rather than three times as in previous kernels.

  • Two new fields have been added to the dev_pm_info structure which control how drivers should act on hardware-created wakeup events; see this article for details.

  • There is a notification mechanism which lets interested modules know when a USB device is added to (or removed from) the system. This system is used by some core code; drivers do not normally need to hook in to it.

  • The gfp_t type is now used throughout the kernel. If you have a function which takes memory allocation flags, it should probably be using this type.

  • Code using reader/writer semaphores can now use rwsem_is_locked() to test the (read) state of the semaphore without blocking.

  • The new vmalloc_node() function allocates memory on a specific NUMA node.

  • The "reserved" bit for memory pages has, for all practical purposes, been removed.

  • vm_insert_page() has been added to make it easier for drivers to remap RAM into user space VMAs.

  • There is a new kthread_stop_sem() function which can be used to stop a kernel thread which might be currently blocked on a specific semaphore.

  • RapidIO bus support has been merged into the mainline.

  • The netlink connector mechanism makes netlink code easier to write. Independently, a type-safe netlink interface has been added and is used in parts of the networking subsystem.

  • These kernel symbols have been unexported and are no longer available to modules: clear_page_dirty_for_io, console_unblank, cpu_core_id, hugetlb_total_pages, idle_cpu, nr_swap_pages, phys_proc_id, reprogram_timer, swapper_space, sysctl_overcommit_memory, sysctl_overcommit_ratio, sysctl_max_map_count, total_swap_pages, user_get_super, uts_sem, vm_acct_memory, and vm_committed_space.

  • Version 1 of the Video4Linux API is now officially scheduled for removal in July, 2006.

  • The owner field has been removed from the pci_driver structure.

  • A number of SCSI subsystem typedefs (Scsi_Device, Scsi_Pointer, and Scsi_Host_Template) have been removed.

  • The DMA32 memory zone has been added to the x86-64 architecture; its purpose is to make it easy to allocate memory below the 4GB barrier (with the new GFP_DMA32 flag).

  • A call to rcu_barrier() will block the calling process until all current RCU callbacks have completed.
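
To make the "below the 4GB barrier" requirement behind the DMA32 zone concrete, here is a minimal user-space sketch (not kernel code) of the check that a device limited to 32-bit DMA addresses implicitly depends on. The names sketch_below_4gb() and SKETCH_4GB are invented for this illustration and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant: the first address a 32-bit DMA engine cannot reach. */
#define SKETCH_4GB (UINT64_C(1) << 32)

/*
 * Return true when the physical range [phys, phys + len) lies entirely
 * below 4GB. The comparison is arranged so that phys + len is never
 * computed, which avoids 64-bit overflow for very large arguments.
 */
static bool sketch_below_4gb(uint64_t phys, uint64_t len)
{
    assert(len > 0);
    return phys < SKETCH_4GB && len <= SKETCH_4GB - phys;
}
```

A buffer that ends exactly at the 4GB boundary passes the check; one that extends a single byte past it does not. Memory obtained with GFP_DMA32 is meant to satisfy this condition without the driver having to test for it.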

2.6.14 (October 27, 2005)

  • A new PHY abstraction layer has been added for network drivers.

  • The sk_buff structure has changed again; the changes will force a recompile but shouldn't otherwise be a problem.

  • Version 19 of the wireless extensions has been merged. Among other things, this version deprecates the get_wireless_stats() method in the net_device structure.

  • The klist API has changed. The order of the parameters has been reversed for klist_add_head() and klist_add_tail(). It is now necessary to provide a pair of reference counting functions when setting up a list with klist_init().

  • The relayfs virtual filesystem, which enables high-rate data transfers between the kernel and user space, has been merged.

  • kzalloc() has been added as a way of obtaining pre-zeroed memory.

  • Two new versions of schedule_timeout() have been added.

  • The new TASK_NONINTERACTIVE state flag tells the scheduler not to perform the usual accounting on sleeping processes.

  • SKBs which are expected to be cloned can be efficiently allocated with alloc_skb_fclone().

  • A few new helper functions for mapping block I/O requests have been added; see this article for details.

  • Securityfs, a virtual filesystem intended for use with security modules, has been merged.
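
As a rough illustration of what "pre-zeroed memory" from kzalloc() means, the following user-space sketch models it as an allocation followed by a memset(). This is an assumption-laden stand-in: the real kzalloc() also takes memory allocation flags, and sketch_kzalloc() is a name made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * User-space model of a zeroing allocator: allocate, then clear.
 * malloc() stands in for the kernel allocator here, and the
 * allocation-flags argument of the real function is omitted.
 */
static void *sketch_kzalloc(size_t size)
{
    void *p;

    assert(size > 0);
    p = malloc(size);
    if (p)
        memset(p, 0, size);
    return p;
}
```

The practical benefit is that callers no longer need to pair every allocation with its own memset() call, which removes a common source of partially-initialized structures.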

2.6.13 (August 28, 2005)

  • The HZ constant is now configurable at kernel build time.

  • The timer API now includes try_to_del_timer_sync(), which makes a best effort to delete the timer; it is safe to call in atomic context.

  • The block_device_operations structure now has an unlocked_ioctl() member.

  • The return value from netif_rx() has changed; it now will return one of only two values: NET_RX_SUCCESS or NET_RX_DROP.

  • pci_dma_burst_advice() can be used by PCI drivers to learn the optimal way of bursting DMA transfers.

  • The text searching API has been added.

2.6.12 (June 17, 2005)

  • cancel_rearming_delayed_work() was added to the workqueue API.

  • The timeout value passed to usb_bulk_msg() and usb_control_msg() is now expressed in milliseconds instead of jiffies.

  • An interrupt-disabling spinlock is used in the rwsem implementation. It was never correct to call one of the variants of down_read() or down_write() with interrupts disabled, but it is even less correct now.

  • The fields in the net_device structure have been rearranged, which will break binary-only drivers.

  • kref_put() now returns an int value: nonzero if the kref was actually released.

  • kobject_add() and kobject_del() no longer generate hotplug events. If you need these events, you must call kobject_hotplug() explicitly. The wrapper functions kobject_register() and kobject_unregister() do still generate hotplug events.

  • kobj_map() no longer takes a subsystem argument; instead, it needs a pointer to a semaphore which it can use for mutual exclusion.

  • A new function, sysfs_chmod_file(), allows permissions to be changed on existing sysfs attributes.

  • There is a new generic sort() function which should be used in preference to creating yet another implementation.

  • A new attribute (__nocast) is being used with sparse to disable a number of implicit casts and find probable bugs.

  • io_remap_page_range() is now deprecated; use io_remap_pfn_range() instead.

  • A set of functions has been added to work with big-endian I/O memory.

  • synchronize_kernel() is deprecated. Callers should instead use either synchronize_sched() (to verify that all processors have quiesced) or synchronize_rcu() (to verify that all processors have exited RCU critical sections).

  • The flag argument to blk_queue_ordered() has changed to indicate how ordered writes are handled by the device. Possible values are QUEUE_ORDERED_NONE (ordering is not possible), QUEUE_ORDERED_TAG (ordering is forced with request tags), and QUEUE_ORDERED_FLUSH (ordering is done with explicit flush commands). For the last case, the request queue has two new methods, prepare_flush_fn() and end_flush_fn(), which are called before and after a barrier request.

  • A new function, valid_signal(), can (and should) be used to test whether signal numbers from user space are valid.

  • The Developers Certificate of Origin, the document acknowledged by all those "Signed-off-by:" headers, has changed. The new version adds a clause noting that contributions - and the information that goes with them - are public information which can be redistributed.
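
For the USB timeout change above, a caller which used to pass a jiffies-based value (such as 5 * HZ) to usb_bulk_msg() or usb_control_msg() would now pass the equivalent number of milliseconds (5000). The user-space sketch below shows the arithmetic only; sketch_jiffies_to_msecs() is a hypothetical helper written for this example, and the HZ value is passed in as a parameter rather than taken from the kernel configuration.

```c
#include <assert.h>

/*
 * Convert a timeout expressed in clock ticks (jiffies) to milliseconds
 * for a given tick rate. The multiplication is done in 64 bits so that
 * large tick counts do not overflow on 32-bit systems.
 */
static unsigned int sketch_jiffies_to_msecs(unsigned long ticks, unsigned int hz)
{
    assert(hz > 0);
    return (unsigned int)(((unsigned long long)ticks * 1000ULL) / hz);
}
```

Note that the millisecond value is independent of the tick rate: five seconds is 5000 whether HZ is 100, 250, or 1000, which is one reason to prefer it in a driver-facing interface.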

2.6.11 (March 2, 2005)

  • The kernel now performs access checking for read() and write() calls before invoking the driver- or filesystem-specific file_operations method.

  • The bcopy() function, unused in the mainline kernel, has been removed.

  • The prototype of the suspend() method in struct pci_driver has changed; the state parameter is now of type pm_message_t.

  • The rwlock_is_locked() macro has been removed; instead, use either read_can_lock() or write_can_lock(). There is also a new spin_can_lock() for regular spinlocks.

  • Three new ways of waiting for completions have been added: wait_for_completion_interruptible(), wait_for_completion_timeout(), and wait_for_completion_interruptible_timeout().

  • For USB drivers: the usb_device_descriptor and usb_config_descriptor structures now keep all fields in the wire (little-endian) form. [GKH]

  • pci_set_power_state() and pci_enable_wake() have new prototypes: power states are represented with the pci_power_t type rather than an int. [GKH]

  • The Big Kernel Semaphore patch was merged. As a result, code which is protected by lock_kernel() is now preemptible. This change should not affect most code developed in this century, but there are always exceptions.

  • The file_operations structure now contains an unlocked_ioctl() member. If that member is non-NULL, it will be called in preference to the regular ioctl() method - and the big kernel lock will not be held. New code should use unlocked_ioctl() and the programmer should ensure that the proper locking has been performed.

    There is also a new compat_ioctl() method which is called, if present, when a 32-bit process calls ioctl() on a 64-bit system.

  • Run-time initialization of spinlocks is being converted away from the assignment form (using SPIN_LOCK_UNLOCKED) to explicit spin_lock_init() calls. No noises have yet been made about removing SPIN_LOCK_UNLOCKED, but the writing should be considered to be on the wall. If and when the real-time preemption patches are merged, the assignment form may no longer be possible.

  • debugfs has been merged; it is a virtual filesystem intended for use by kernel hackers who want to export debugging information from their code.

  • Binary attributes in sysfs can now offer mmap() support; see this patch for the details.

  • Four-level page tables have been merged. This change affects surprisingly little code, but, if you are manually walking through the page table tree, you will have to take the new level into account.

  • Socket buffers can be obtained from alloc_skb_from_cache(), which uses a slab cache.

  • A new memory allocation flag (__GFP_ZERO) was added; it allows kernel code to request that the allocated memory be zeroed. It is part of the larger prezeroing patch, which has not yet been merged.

  • Linus has reimplemented pipes with a circular buffer construct which will, eventually, be mutated into a more generic form.

  • Work is being done toward the goal of removing the semaphore from struct subsystem. If your code depends on this semaphore, which it shouldn't, expect to have to change it soon.
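
The rule for choosing between unlocked_ioctl() and ioctl() described above can be modeled in a few lines. What follows is a simplified user-space sketch, not the kernel's implementation: the method signatures are reduced to (cmd, arg), the big kernel lock is represented by a simple counter, and every name beginning with sketch_ is invented for this example. Returning -ENOTTY when neither method exists is likewise a choice made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for the big kernel lock: nonzero while "held". */
static int sketch_bkl_held;

/* Reduced model of the two ioctl-related file_operations members. */
struct sketch_file_operations {
    int  (*ioctl)(unsigned int cmd, unsigned long arg);
    long (*unlocked_ioctl)(unsigned int cmd, unsigned long arg);
};

/* Prefer unlocked_ioctl(); fall back to ioctl() under the lock. */
static long sketch_do_ioctl(const struct sketch_file_operations *fops,
                            unsigned int cmd, unsigned long arg)
{
    long ret;

    assert(fops != NULL);
    if (fops->unlocked_ioctl)
        return fops->unlocked_ioctl(cmd, arg);  /* lock not taken */
    if (fops->ioctl) {
        sketch_bkl_held++;                      /* models lock_kernel() */
        ret = fops->ioctl(cmd, arg);
        sketch_bkl_held--;                      /* models unlock_kernel() */
        return ret;
    }
    return -ENOTTY;
}

/* Example handlers: each reports whether the lock was held when called. */
static int sketch_old_ioctl(unsigned int cmd, unsigned long arg)
{
    (void)cmd; (void)arg;
    return sketch_bkl_held;
}

static long sketch_new_ioctl(unsigned int cmd, unsigned long arg)
{
    (void)cmd; (void)arg;
    return sketch_bkl_held;
}
```

When both methods are present, only sketch_new_ioctl() runs, and it sees the lock as not held; that is why a driver providing unlocked_ioctl() must do its own locking.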

2.6.10 (December 24, 2004)

  • Calling pci_enable_device() is required to get interrupt routing to work. [GKH]

  • A new function, pci_dev_present(), can be used to determine whether a specific device is present or not. [GKH]

  • The prototypes to pci_save_state() and pci_restore_state() have changed: the buffer argument is no longer needed (the space has been allocated in struct pci_dev instead). [GKH]

  • The kernel build system was tweaked; the preferred name for kernel makefiles is now Kbuild. The change is meant to highlight the fact that kernel makefiles are rather different than the user-space variety, but very few, if any, makefiles have been renamed.

  • add_timer_on(), sys_lseek(), and a number of other kernel functions are no longer exported to modules. Most of the driver core functions have been changed to GPL-only exports.

  • I/O space write barriers are now supported.

  • The prototype of kunmap_atomic() has changed. This change should not affect properly-written code, but should generate warnings when a struct page pointer is (erroneously) passed to that function.

  • atomic_inc_return() was added as a way to increment the value of an atomic_t variable and get the new value.

  • The little-used "BIO walking" helper functions (process_that_request_first()) have been removed.

  • The venerable remap_page_range() function has been changed to remap_pfn_range(); the new function uses a page frame number for the physical address, rather than the actual address. remap_page_range() is still supported - for now.

  • wake_up_all_sync(), unused in the mainline tree, was removed.

  • A simple, stream-oriented circular buffer implementation was added.

  • The kernel event mechanism was merged, making it possible to notify user space of relevant kernel events.

  • vfs_permission() was replaced by generic_permission(), which has an optional callback for ACL checking. [MS]
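
For the remap_page_range() to remap_pfn_range() transition, the main adjustment in calling code is to pass a page frame number where a physical address used to go. The user-space sketch below shows that conversion under the assumption of 4KB pages (a page shift of 12); SKETCH_PAGE_SHIFT, SKETCH_PAGE_SIZE, and sketch_phys_to_pfn() are illustrative names, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page geometry for this sketch: 4KB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)

/*
 * Convert a page-aligned physical address to a page frame number by
 * discarding the offset-within-page bits.
 */
static uint64_t sketch_phys_to_pfn(uint64_t phys)
{
    assert((phys & (SKETCH_PAGE_SIZE - 1)) == 0);
    return phys >> SKETCH_PAGE_SHIFT;
}
```

Because the frame number drops the low-order bits, a value of a given width can describe physical memory well beyond what the same width could hold as a byte address.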

2.6.9 (October 18, 2004)

  • Kprobes was merged, making another debugging technique available.

  • Spinlocks are implemented completely out of line now. This change should not affect any code.

  • wait_event_timeout() was added.

  • Kobjects now use the kref type to handle reference counting. Most code should be unaffected by this change.

  • A new set of functions for accessing I/O memory was introduced. The new functions are cleaner and type-safe, and should be used in preference to readb() and friends. The new ioport_map() function makes it possible to treat I/O ports as if they were I/O memory.

  • The NETIF_F_LLTX feature for net_devices tells the networking subsystem that the driver code performs its own locking and does not require that the xmit_lock be taken before hard_start_xmit() can be called.

  • dma_declare_coherent_memory() was added to allow the DMA functions to hand out memory located on a specific device.

  • msleep_interruptible() was added.

  • The prototype of kref_put() changed; a pointer to the release() function is now required.
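
The shape of the kref_put() interface after this change (together with the integer return value noted under 2.6.12) can be modeled as follows. This is a user-space sketch with a plain, non-atomic counter; the real kref uses atomic operations, and all of the sketch_ names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified reference counter; the real kref is atomic. */
struct sketch_kref {
    int refcount;
};

static void sketch_kref_init(struct sketch_kref *k)
{
    k->refcount = 1;
}

static void sketch_kref_get(struct sketch_kref *k)
{
    assert(k->refcount > 0);
    k->refcount++;
}

/*
 * Drop one reference. The release function is supplied at put time
 * rather than stored in the structure; the return value is nonzero
 * only when this call dropped the last reference.
 */
static int sketch_kref_put(struct sketch_kref *k,
                           void (*release)(struct sketch_kref *))
{
    assert(release != NULL);
    assert(k->refcount > 0);
    if (--k->refcount == 0) {
        release(k);
        return 1;
    }
    return 0;
}

/* Example release function which just counts its invocations. */
static int sketch_release_calls;

static void sketch_release(struct sketch_kref *k)
{
    (void)k;
    sketch_release_calls++;
}
```

Passing the release function with each put keeps the structure itself down to a bare counter, at the cost of requiring every caller to name the right function.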

2.6.8 (August 13, 2004)

  • The fcntl() method in the file_operations structure, just added in 2.6.6, was removed. It has been replaced by two new methods: check_flags() and dir_notify().

  • nonseekable_open() was added as a way of indicating that a given file is not seekable.

  • wait_event_interruptible_exclusive() was added.

  • dma_get_required_mask() was added as a way for drivers to determine the optimal DMA mask.

  • Module section information was added under /sys/module, making it easier to use symbolic debuggers with modules.

  • The VFS follow_link() method saw some (compatible) changes. Filesystems should use the new symlink lookup method so that the kernel can, eventually, support a greater link depth. [MS]
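
One way to think about the "optimal DMA mask" mentioned for dma_get_required_mask() is as the smallest all-ones mask that still covers the highest address a device might be asked to reach. The user-space sketch below computes such a mask; it is a guess at the idea rather than a description of how the kernel function works, and sketch_required_mask() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the smallest mask of the form 2^n - 1 which is greater than
 * or equal to highest_addr. Assumption for this sketch: the caller
 * already knows the highest address of interest.
 */
static uint64_t sketch_required_mask(uint64_t highest_addr)
{
    uint64_t mask = 0;

    while (mask < highest_addr)
        mask = (mask << 1) | 1;

    /* Postcondition: the mask is a contiguous run of low-order ones. */
    assert((mask & (mask + 1)) == 0);
    return mask;
}
```

A device capable of 64-bit addressing could use a result like this to decide that a narrower, cheaper descriptor format is sufficient on a machine with little memory.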

(We are still in the process of filling in the earlier API changes - stay tuned).

Acknowledgements

Thanks to the following people who have helped keep this page current:

[GKH] Greg Kroah-Hartman
Michael Hayes
[MS] Miklos Szeredi

API changes in the 2.6 kernel series

Posted Jan 20, 2005 3:35 UTC (Thu) by Eliot (guest, #19701) [Link]

Did we miss the announcement of the publication of "Linux Device Drivers, Third Edition"? Where can I get a copy?

LDD3

Posted Jan 20, 2005 3:36 UTC (Thu) by corbet (editor, #1) [Link]

Sorry, it's not out quite yet. The reference is in there because this page is referenced in LDD3 as the place to go for the most recent information.

The plan is to have it available at LinuxWorld next month.

LDD3

Posted Jan 20, 2005 4:50 UTC (Thu) by bronson (subscriber, #4806) [Link]

How will LDD3 compare with Robert Love's book? Are they direct competitors? Given the caliber of the authors, I am sure that both of these books will be well worth purchasing.

LDD3

Posted Jan 20, 2005 13:36 UTC (Thu) by corbet (editor, #1) [Link]

"How will LDD3 compare with Robert Love's book? Are they direct competitors?"

If they are competitors, the competition is friendly - Robert was one of our technical reviewers. His (excellent) book is much wider in scope than LDD, but not as detailed. You clearly need to buy both :)

LDD3

Posted Jan 20, 2005 17:43 UTC (Thu) by parimi (subscriber, #5773) [Link]

I read Robert Love's book a while ago. My observation is that it explains many of the concepts used in the kernel (e.g., the kernel API, spinlocks, etc.). LDD, on the other hand, is for people who want to get started with writing device drivers. Both books are indispensable :)

LDD3

Posted Mar 15, 2005 16:42 UTC (Tue) by LogicG8 (guest, #11076) [Link]

> You clearly need to buy both :)

I did. I was near a Borders and stopped in to pick up your book. They had the
second edition but not the third. However, RML's new book was there, and it was on
sale for 40% off. Remembering your favorable review, I purchased a copy to
tide me over until I received my copy of LDD3 from O'Reilly.

API changes in the 2.6 kernel series

Posted Jan 20, 2005 9:49 UTC (Thu) by eskild (subscriber, #1556) [Link]

Ah, this is good stuff. Tracking what's happening on the LKML is virtually a full-time job, and one for which I don't have the time. Thanks for the effort.

This is why I'll be happy to renew my subscription when it comes up in April.

API changes in the 2.6 kernel series

Posted Jan 20, 2005 23:45 UTC (Thu) by dhess (subscriber, #7827) [Link]

Agreed, thank god for LWN.

API changes in the 2.6 kernel series

Posted Jan 21, 2005 9:55 UTC (Fri) by KotH (subscriber, #4660) [Link]

<aol>me too</aol>

Especially for people like me, who are messing with out-of-tree
drivers, this page is very valuable. Thanks a lot.

USB drives

Posted Jan 21, 2005 17:10 UTC (Fri) by hazelsct (subscriber, #3659) [Link]

One change between 2.6.8 and 2.6.9 (and from what I hear, 2.6.10) is the new devices /dev/ud* for USB drives. This "interface change" makes such devices unusable without a MAKEDEV that can handle the new devices, and a major number change from 2.6.9 to 2.6.10 makes this awkward. (See Debian bug 278237 for details.)

USB drives

Posted Jan 21, 2005 21:11 UTC (Fri) by gregkh (subscriber, #8) [Link]

That is /dev/ub* not ud.

Anyway, that is because you selected to use the UB driver, which will grab your usb storage device instead of the usb-storage driver if you build it. If you do not want this change to happen, do not use the driver.

Or use something like udev, which keeps name changes like this from happening altogether if you use the correct type of rule for your device.

API changes in the 2.6 kernel series

Posted Jan 27, 2005 7:20 UTC (Thu) by nhoxanh (guest, #17931) [Link]

Corbet, I want to pre-order the LDD3. How could I do it?

Thanks.

API changes in the 2.6 kernel series

Posted Jan 30, 2006 15:27 UTC (Mon) by raulsiles (subscriber, #21347) [Link]

The book referenced along the text is available at:
http://lwn.net/Kernel/LDD3/
or
http://www.oreilly.com/catalog/linuxdrive3/book/index.csp

Thanks a lot for offering it for free! Knowledge is power!

Copyright © 2005, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds