
API changes in the 2.6 kernel series

The 2.6 kernel development series differs from its predecessors in that much larger and potentially destabilizing changes are being incorporated into each release. Among these changes are modifications to the internal programming interfaces for the kernel, with the result that kernel developers must work harder to stay on top of a continually-shifting API. There has never been a guarantee of internal API stability within the kernel - even in a stable development series - but the rate of change is higher now.

This article will be updated to keep track of the internal changes for each 2.6 kernel release. Its permanent location is:

http://lwn.net/Articles/2.6-kernel-api/

This page will, doubtless, remain incomplete for a while. If you see an omission, please let us know by sending a note to kernel@lwn.net rather than by posting comments here. The chances of a prompt update are higher, the article will not become cluttered with redundant comments, and we'll be more than happy to credit you here.

If you are a Linux Device Drivers, Third Edition reader looking for information on changes since the book was published: LDD3 covers version 2.6.10 of the kernel, so only the changes starting with 2.6.11 are relevant.

Last update: April 29, 2008

2.6.25 (April 16, 2008)

  • There have been a great many changes to the low-level device model APIs dealing with kobjects and ksets. These changes have, in turn, forced a large number of adjustments throughout the tree. See Documentation/kobject.txt for an overview of the new API.

  • There is a new set of security module functions for dealing with filesystem mount and unmount operations.

  • The chained scatterlist API has been augmented with the sg_table patches.

  • There have been some changes to the block request completion API. See this article for a description of the new way of doing things.

  • A large number of SUNRPC symbols (rpc_* and rpcauth_*) have been changed to GPL-only exports.

  • The "flatmem" and "discontigmem" memory models have been removed on the 64-bit x86 architecture; "sparsemem" is now used for all builds.

  • The fastcall function attribute didn't do anything on the x86 architecture, so it has been removed.

  • x86 has a new set of functions for easily manipulating page attributes. They are:

        set_memory_uc(unsigned long addr, int numpages); /* Uncached */
        set_memory_wb(unsigned long addr, int numpages); /* Cached */
        set_memory_x(unsigned long addr, int numpages);  /* Executable */
        set_memory_nx(unsigned long addr, int numpages); /* Non-executable */
        set_memory_ro(unsigned long addr, int numpages); /* Read-only */
        set_memory_rw(unsigned long addr, int numpages); /* Read-write */
    

    There is also a set of set_pages_* functions which take a struct page pointer rather than a beginning address.

  • Early-boot debugging of x86 systems via the FireWire port is now supported.

  • Bidirectional command support has been added to the SCSI layer.

  • There is a new process state called TASK_KILLABLE. It is a blocked state similar to TASK_UNINTERRUPTIBLE, with the difference that a wakeup will happen upon delivery of a fatal signal. The idea is to allow (almost) uninterruptible sleeps, but to still allow the process to be killed outright - thus ending the problem of unkillable processes stuck in the "D" state. There is a new set of functions for using this state: wait_event_killable(), schedule_timeout_killable(), mutex_lock_killable(), etc.

  • add_disk_randomness() has been unexported as there are no more in-tree users.

  • pci_enable_device_bars() has been replaced by two more-specific functions: pci_enable_device_io() and pci_enable_device_mem().

  • The high-resolution timer API has been augmented with:

        unsigned long hrtimer_forward_now(struct hrtimer *timer,
                                          ktime_t interval);
    

    It will move the given timer's expiration forward past the current time as determined by the associated clock.

  • The device structure now holds a pointer to a device_dma_parameters structure:

        struct device_dma_parameters {
            unsigned int max_segment_size;
            unsigned long segment_boundary_mask;
        };
    

    These parameters are used by the DMA mapping layer (and the IOMMU mapping code in particular) to ensure that I/O operations are set up within the device's constraints. The PCI layer supports this feature with two new functions:

        int pci_set_dma_max_seg_size(struct pci_dev *dev, unsigned int size);
        int pci_set_dma_seg_boundary(struct pci_dev *dev, unsigned long mask);
    

    Drivers for devices with unusually strict DMA limitations should probably use these functions to ensure that those restrictions are respected.

  • Many nopage() methods have been replaced by the newer fault() API; the near-term plan is to remove nopage() altogether. See this article for a description of the new way of "page not present" handling.

  • A generic resource counter mechanism was merged as part of the memory controller patch set; see <linux/res_counter.h> for the details.

  • reserve_bootmem() has a new flags parameter. Most callers will set it to BOOTMEM_DEFAULT; the kdump code, though, uses BOOTMEM_EXCLUSIVE to ensure that it is the only one to touch the memory.

  • Most architectures now have support for cmpxchg64() and cmpxchg_local().

  • There is a new set of string functions:

        extern int strict_strtoul(const char *string, unsigned int base,
                                  unsigned long *result);
        extern int strict_strtol(const char *string, unsigned int base,
                                 long *result);
        extern int strict_strtoull(const char *string, unsigned int base,
                                   unsigned long long *result);
        extern int strict_strtoll(const char *string, unsigned int base,
                                  long long *result);
    

    These functions convert the given strings to various forms of long values, but they will return an error status if the given string value, as a whole, does not represent a proper integer value. These functions are now used in the parsing of kernel parameters.
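
The "strict" behavior described above can be imitated in user space on top of strtoul(). What follows is only a sketch of the idea, not the kernel's implementation: strict_parse_ulong() is a hypothetical name, and the decision to reject leading white space and signs is an assumption made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical user-space helper: succeed only if the string, as a
 * whole, is an unsigned integer in the given base.  Returns 0 on
 * success or a negative errno value on failure.
 */
static int strict_parse_ulong(const char *s, unsigned int base,
                              unsigned long *result)
{
    char *end;
    unsigned long val;

    assert(s != NULL && result != NULL);

    /* strtoul() would skip white space and accept a sign; we do not. */
    if (!isalnum((unsigned char)*s))
        return -EINVAL;

    errno = 0;
    val = strtoul(s, &end, (int)base);
    if (errno == ERANGE)
        return -ERANGE;          /* value does not fit */
    if (end == s || *end != '\0')
        return -EINVAL;          /* no digits, or trailing junk */

    *result = val;
    return 0;
}
```

With this helper, "42" parses to 42 while "42abc" is rejected outright; plain strtoul() would quietly return 42 in both cases.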

2.6.24 (January 24, 2008)

  • The i386/x86_64 architecture merger has taken place during this kernel series. The result is a single architecture, called "x86," which can be built for 32-bit and 64-bit processors.

  • The Video4Linux layer has some new internal support for composite devices involving more than one driver (many V4L2 devices involve, at a minimum, separate drivers for the controller and the sensor).

  • Also in Video4Linux: the video-buf layer has been replaced with a more generic implementation which works with a wider range of devices (including USB devices and those which do not support scatter/gather DMA).

  • The NAPI interface used in network drivers has been reworked to better support devices with multiple transmit queues.

  • The networking layer has a new function for printing MAC addresses:

        char *print_mac(char *buf, const u8 *addr);
    

    The buf buffer should be declared with DECLARE_MAC_BUF(); the output is suitable for formatting in printk() with "%s".

  • The NETIF_F_LLTX (lockless transmit) flag for network devices has been deprecated and should not be used in new code.

  • The functions ktime_sub_us() and ktime_sub_ns() have been added; they subtract the given number of microseconds or nanoseconds from a ktime_t value.

  • The hard_header() method has been removed from struct net_device; it has been replaced by a per-protocol header_ops structure pointer.

  • The debugfs filesystem has some new functions (debugfs_create_x8(), debugfs_create_x16(), debugfs_create_x32()) which make it easy to export files containing hexadecimal numbers.

  • Various small sysfs-related API changes have been made. The name field has been removed from the kobject structure. The prototypes of the user-event callbacks have been changed. Many of the subsystem-related calls have been removed - subsystems never really did much of anything anyway; get_bus() and put_bus() are also gone.

  • A new value DMA_MASK_NONE can be stored in the device structure dma_mask field to indicate that the device is incapable of performing DMA.

  • The VFS has a couple of new address space operations (write_begin() and write_end()) aimed at fixing some deadlock scenarios; see this article for more information.

  • The scatterlist chaining patches have been merged and many parts of the kernel have been updated to use this feature.

  • The CFLAGS= and CPPFLAGS= options now work with the kernel build system in the expected way: they add flags to be passed to the C compiler and preprocessor, respectively.

  • The prototype for slab constructor callbacks has changed to:

        void (*ctor)(struct kmem_cache *cache, void *object);
    

    The unused flags argument has been removed and the order of the other two arguments has been reversed to match other slab functions.

  • The DECLARE_MUTEX_LOCKED() macro has been removed.

  • The long-deprecated SA_* interrupt flags have been removed in favor of the IRQF_* equivalents.

  • A number of block layer utilities have seen prototype changes. The most evident change, perhaps, is bio_endio() and the associated bio_end_io_t callback:

        void bio_endio(struct bio *bio, int error);
        typedef void (bio_end_io_t) (struct bio *, int);
    

    These functions now always complete the entire BIO, so the size argument has been removed.

  • The paravirt_ops structure has been split into several smaller, more specialized operations vectors. These include pv_init_ops (boot-time operations), pv_time_ops (for time-related operations), pv_cpu_ops (privileged instructions), pv_irq_ops (interrupt handling), pv_mmu_ops (page table management), and a few others.

  • There are some new bit operations which have been added:

        int test_and_set_bit_lock(unsigned long nr, unsigned long *addr);
        void clear_bit_unlock(unsigned long nr, unsigned long *addr);
        void __clear_bit_unlock(unsigned long nr, unsigned long *addr);
    

    These operations are intended to be used in the creation of single-bit locks; they work without the need for any additional memory barriers.

  • There is a new KERN_CONT priority level for printk(). It is, in fact, empty; it is meant to serve as a marker for printk() calls which continue a previous (not terminated with a newline) printed line.

  • The filesystem export operations, used to make filesystems available over protocols like NFS, have been reworked. Two new methods (fh_to_dentry() and fh_to_parent()) replace the old get_dentry() interface. There is a new structure (struct fid) used to describe file handles. This work is aimed at making the export interface easier to use and (eventually) supporting 64-bit inode numbers.

  • The virtio patches - providing an infrastructure for I/O into and out of virtualized guests - have been merged.
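
The single-bit lock operations listed above for 2.6.24 can be modeled in user space with C11 atomics. This is a rough sketch rather than the kernel code: the function names are hypothetical, and it assumes that the lock operation carries acquire ordering while the unlock operation carries release ordering, which is what makes additional barriers unnecessary.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical analog of test_and_set_bit_lock(): atomically set bit
 * nr and return its previous value.  A false return means the caller
 * now holds the lock; the fetch-or uses acquire ordering.
 */
static bool bit_test_and_set_lock(unsigned long nr, atomic_ulong *addr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_WORD);
    atomic_ulong *word = addr + nr / BITS_PER_WORD;
    unsigned long old;

    old = atomic_fetch_or_explicit(word, mask, memory_order_acquire);
    return (old & mask) != 0;
}

/*
 * Hypothetical analog of clear_bit_unlock(): clear bit nr with
 * release ordering, making the critical section's writes visible
 * to the next lock holder.
 */
static void bit_clear_unlock(unsigned long nr, atomic_ulong *addr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_WORD);
    atomic_ulong *word = addr + nr / BITS_PER_WORD;

    atomic_fetch_and_explicit(word, ~mask, memory_order_release);
}
```

A caller would treat a false return from bit_test_and_set_lock() as "lock acquired" and pair it with bit_clear_unlock() on the same bit.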

2.6.23 (October 9, 2007)

  • The UIO interface for the creation of user-space drivers has been merged. While UIO is aimed at user space, there is a kernel-space component for driver registration and interrupt handling.

  • unregister_chrdev() now returns void.

  • There is a new notifier chain which can be used (by calling register_pm_notifier()) to obtain notification before and after suspend and hibernate operations.

  • The new "lockstat" infrastructure provides statistics on the amount of time threads spend waiting for and holding locks.

  • The new fault() VMA operation replaces nopage() and populate(). See this article for a description of the current fault() API.

  • The generic netlink API now has the ability to register (and unregister) multicast groups on the fly.

  • The destructor argument has been removed from kmem_cache_create(), as destructors are no longer supported. All in-kernel callers have been updated.

  • There is a new clone() flag - CLONE_NEWUSER - which creates a new user namespace for the process; it is intended for use with container systems.

  • There is a new rtnetlink API for managing software network devices.

  • The networking core can now work with devices which have more than one transmit queue. This is a feature which was needed to properly support some wireless devices.

  • The sysfs core has been significantly rewritten to weaken the connection between sysfs entries and internal kobjects. The new code should make life easier for driver writers who will have fewer object lifecycle issues to worry about.

  • The never-used enable_wake() PCI driver method has been removed.

  • Drivers wanting to get the revision ID from the PCI config space should now just use the value found in the new revision member of the pci_dev structure. All in-tree drivers have been changed to use this new approach.

  • The SCSI layer has picked up a couple of scatter/gather accessor functions - scsi_dma_map() and scsi_dma_unmap() - in preparation for chained scatter/gather lists and bidirectional requests. Most drivers in the kernel have been updated to use these functions.

  • The idr code has a couple of new helper functions: idr_for_each() and idr_remove_all().

  • sys_ioctl() is no longer exported to modules.

  • The page table helper functions ptep_establish(), ptep_test_and_clear_dirty() and ptep_clear_flush_dirty() have been removed - they had no in-kernel users.

  • Kernel threads are non-freezable by default; any kernel thread which should be frozen for a suspend-to-disk operation must now call set_freezable() to arrange for that to happen.

  • The SLUB allocator is now the default.

  • The new function is_owner_or_cap(inode) tests for access permission based on the current fsuid and capabilities; it replaces the open-coded test previously found in several filesystems.

  • There is a new utility function:
        char *kstrndup(const char *s, size_t max, gfp_t gfp);
    
    This function duplicates a string along the lines of the user-space strndup().
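
For those who have not met strndup(), the semantics can be sketched in user space as follows. This is an illustration only: dup_at_most() is a hypothetical name, malloc() stands in for the kernel allocator, and there is no gfp argument.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for kstrndup(): copy at most max
 * bytes of s into freshly allocated memory and always NUL-terminate
 * the result.  Returns NULL if the allocation fails.
 */
static char *dup_at_most(const char *s, size_t max)
{
    size_t len = 0;
    char *copy;

    assert(s != NULL);

    /* Measure the string without reading past max bytes. */
    while (len < max && s[len] != '\0')
        len++;

    copy = malloc(len + 1);
    if (!copy)
        return NULL;

    memcpy(copy, s, len);
    copy[len] = '\0';
    return copy;
}
```

The caller owns the returned buffer and must free it; in the kernel the corresponding call would be kfree().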

2.6.22 (July 8, 2007)

  • The mac80211 (formerly "Devicescape") wireless stack has been merged, creating a whole new API for the creation of wireless drivers, especially those requiring software MAC support.

  • The eth_type_trans() function now sets the skb->dev field, consistent with how similar functions for other link types operate. As a result, many Ethernet drivers have been changed to remove the (now) redundant assignment.

  • The header fields in the sk_buff structure have been renamed and are no longer unions. Networking code and drivers can now just use skb->transport_header, skb->network_header, and skb->mac_header. There are new functions for finding specific headers within packets: tcp_hdr(), udp_hdr(), ipip_hdr(), and ipipv6_hdr().

  • Also in the networking area: the packet scheduler has been reworked to use ktime values rather than jiffies.

  • The i2c layer has seen significant new changes meant to make i2c drivers look more like drivers for other buses. There are, for example, new probe() and remove() methods for notifying devices when i2c peripherals come and go. Since i2c is not a self-describing bus, the support code still needs help to know where i2c devices might be; for many classes of device, this information can be had from the system BIOS.

  • The crypto API has a new set of functions for use with asynchronous block ciphers. There is also a new cryptd kernel thread which can run any synchronous cipher in an asynchronous mode.

  • The subsystem structure has been removed from the Linux device model; there never really was any need for it. Most code which was expecting a struct subsystem argument has been changed to use the relevant kset instead.

  • There is a new version of the in-kernel rpcbind (portmapper) client which supports versions 2-4 of the rpcbind protocol. The portmapper API has changed as a result.

  • Numerous changes to the paravirt_ops methods have been made. Additionally, paravirt_ops is no longer a GPL-only export.

  • There is a new memory function:

        void *krealloc(const void *p, size_t new_size, gfp_t flags);
    

    As one would expect, it changes the size of the allocated memory, moving it if need be.

  • The SLUB allocator has been merged as an experimental (for now) alternative to the slab code. The SLUB API generally matches slab, but the handling of zero-length allocations has changed somewhat.

  • A new macro has been added to make the creation of slab caches easier:

        struct kmem_cache *KMEM_CACHE(struct-type, flags);
    
    The result is the creation of a cache holding objects of the given struct-type, named after that type, and with the additional slab flags (if any).

  • The SLAB_DEBUG_INITIAL flag has been removed, along with the associated SLAB_CTOR_VERIFY flag passed to constructors. The result is a set of changes which ripples through quite a few source files. The unused SLAB_CTOR_ATOMIC flag is also gone.

  • The SuperH architecture has working kgdb support again.

  • The ia64 architecture has a new tool which will inject machine check errors into a running system. Not recommended for production machines.

  • The deferrable timers patch has been merged. There is also a new macro for initializing workqueue entries (INIT_DELAYED_WORK_DEFERRABLE()) which causes the job to be queued in a deferrable manner.

  • The old SA_* interrupt flags have not been removed as originally scheduled, but their use will now generate warnings at compile time.

  • There is a new list_first_entry() macro which, surprisingly, gets the first entry from a list.

  • The atomic64_t and local_t types are now fully supported on a wider set of architectures.

  • Workqueues have been reworked again. There is a new function:

        void cancel_work_sync(struct work_struct *work);
    

    This function tries to cancel a single workqueue entry, be it on the shared (keventd) or a private workqueue. Meanwhile run_scheduled_work() has been removed.
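
The list_first_entry() macro mentioned above is a thin layer over container_of(). The sketch below rebuilds the idea in user space with simplified copies of the list primitives; struct job and first_job_id() are hypothetical names invented for the example, and the real definitions live in <linux/list.h>.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Simplified user-space versions of the kernel macros. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

#define list_first_entry(head, type, member) \
    container_of((head)->next, type, member)

static void list_init(struct list_head *head)
{
    head->next = head->prev = head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Hypothetical structure with an embedded list link. */
struct job {
    int id;
    struct list_head link;
};

static int first_job_id(struct list_head *head)
{
    /* The macro is only meaningful on a non-empty list. */
    assert(head->next != head);
    return list_first_entry(head, struct job, link)->id;
}
```

Note that the macro does not check for an empty list; on an empty list it would hand back a pointer computed from the list head itself, so callers need to test for emptiness first.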

2.6.21 (April 25, 2007)

  • Sysfs now supports the concept of "shadow directories" - multiple versions of a directory with the same name. This feature is to be used with container applications, allowing each namespace to have resources (network interfaces, for example) with the same name. To that end, two new functions have been added:

         int sysfs_make_shadowed_dir(struct kobject *kobj,
                                     void *(*follow_link)(struct dentry *,
                                                          struct nameidata *));
         struct dentry *sysfs_create_shadow_dir(struct kobject *kobj);
    

    sysfs_make_shadowed_dir() takes the existing directory for a kobject and makes it shadowed - capable of having multiple instantiations. The follow_link() method must be able to pick out the right version for any given situation. A call to sysfs_create_shadow_dir() will create a new instantiation for a directory which has been made shadowed.

    Note that this feature is likely to change somewhat in 2.6.22.

  • Quite a few kobject functions - kobject_init(), kobject_del(), kobject_unregister(), kset_register(), kset_unregister(), subsystem_register(), subsystem_unregister(), and subsys_create_file() - now return harmlessly if passed a NULL pointer.

  • Many kernel subsystems which once used class_device structures have been changed to use struct device instead; this work is toward a long-term goal of getting rid of the class tree and having a single device tree in sysfs.

  • There is a new function:

         int device_schedule_callback(struct device *dev,
                                      void (*func)(struct device *));
    

    This function will arrange for func() to be called at some future time in process context. It's meant to enable device attributes to unregister themselves, but one can imagine other applications as well.

  • The ALSA system on chip ("ASoC") layer provides extensive support for the implementation of sound drivers on embedded systems; see the documentation files packaged with the kernel for details.

  • Significant changes have been made to the crypto support interface.

  • The device resource management patches, making a lot of driver code easier to write, have been merged.

  • The DMA memory zone (ZONE_DMA) is now optional and may not be present in all kernels.

  • The local_t type has been made consistent across architectures and has gained some documentation.

  • The nopfn() address space operation can now return NOPFN_REFAULT to indicate that the faulting instruction should be re-executed.

  • A new function, vm_insert_pfn(), enables the insertion of a new page into a process's address space by page-frame number.

  • A new driver API for general-purpose I/O signals has been added.

  • The sysctl code has been heavily reworked, leading to a number of internal API changes.

  • The clockevents and dynamic tick patches have been merged. Most code will not require changes, but kernel developers should be aware of code which depends on jiffies.

2.6.20 (February 4, 2007)

  • The workqueue API has seen a major rework which requires changes in almost any code using workqueues. In short: there are now two different types of workqueues, depending on whether the delay feature is to be used or not. The work function no longer gets an arbitrary data pointer; its argument, instead, is a pointer to the work_struct structure describing the job. If you have code which is broken by these changes, this set of instructions by David Howells is likely to be helpful.

  • Some additional workqueue changes have been merged as well. There is a new "freezable" workqueue type, indicating a workqueue which can be safely frozen during the software suspend process. The new function create_freezeable_workqueue() will create one. Another new function, run_scheduled_work(), will cause a previously-scheduled workqueue entry to be run synchronously. Note that run_scheduled_work() cannot be used with delayed workqueues.

  • Much of the sysfs-related code has been changed to use struct device in place of struct class_device. The latter structure will eventually go away as the class and device mechanisms are merged.

  • There is a new function:

        int device_move(struct device *dev, struct device *new_parent);
    

    This function will reparent the given device to new_parent, making the requisite sysfs changes and generating a special KOBJ_MOVE event for user space.

  • A number of kernel header files which included other headers no longer do so. For example, <linux/fs.h> no longer includes <linux/sched.h>. These changes should speed kernel build times by getting rid of a large number of unneeded includes, but might break some out-of-tree modules which do not explicitly include all the headers they need.

  • The internal __alloc_skb() function has a new parameter, being the number of the NUMA node on which the structure should be allocated.

  • The slab allocator API has been cleaned up somewhat. The old kmem_cache_t typedef is gone; struct kmem_cache should be used instead. The various slab flags (SLAB_ATOMIC, SLAB_KERNEL, ...) were all just aliases for the equivalent GFP_ flags, so they have been removed.

  • A new boot-time parameter (prof=sleep) causes the kernel to profile the amount of time spent in uninterruptible sleeps.

  • dma_cache_sync() has a new argument: the device structure for the device doing DMA.

  • The paravirt_ops code has gone in, making it easier for the kernel to support multiple hypervisors. Anybody wanting to port a hypervisor to this code should note that it is somewhat volatile and likely to remain that way for some time.

  • The struct path changes have been merged, with changes rippling through the filesystem and device driver subsystems. In short, code accessing the dentry pointer from a struct file pointer, which used to read file->f_dentry, should now read file->f_path.dentry. There are defines making the older style of code work - for now.

  • There is now a generic layer for human input devices; the USB HID code has been switched over to this new layer.

  • A new function, round_jiffies(), rounds a jiffies value up to the next full second (plus a per-CPU offset). Its purpose is to encourage timeouts to occur together, with the result that the CPU wakes up less frequently.

  • The block "activity function," a callback intended for the implementation of disk activity lights in software, has been removed; nobody was actually using it.
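
The workqueue change at the top of the 2.6.20 list means that a work function must find its own data, usually by embedding the work_struct in a larger structure and applying container_of(). The user-space mock below illustrates the pattern only: struct work_struct here is a stripped-down stand-in, and struct my_device, my_device_work(), and run_work() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Mock job descriptor; the kernel's structure holds more than this. */
struct work_struct {
    void (*func)(struct work_struct *work);
};

/* Hypothetical driver state with the work item embedded in it. */
struct my_device {
    int pending_events;
    struct work_struct work;
};

/*
 * New-style work function: the only argument is the work_struct, so
 * the enclosing structure is recovered with container_of().
 */
static void my_device_work(struct work_struct *work)
{
    struct my_device *dev = container_of(work, struct my_device, work);

    dev->pending_events = 0;
}

/* Stand-in for the workqueue thread picking up a queued job. */
static void run_work(struct work_struct *work)
{
    assert(work != NULL && work->func != NULL);
    work->func(work);
}
```

In real kernel code the work item would be set up with INIT_WORK() and queued with schedule_work() or queue_work() rather than being called directly.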

2.6.19 (November 29, 2006)

  • The prototype for interrupt handler functions has changed. In short, the regs argument has been removed, since almost nobody used it. Any interrupt handler which needs the pre-interrupt register state can use get_irq_regs() to obtain it.

  • The latency tracking infrastructure patch has been merged.

  • The readv() and writev() methods in the file_operations structure have been removed in favor of aio_readv() and aio_writev() (whose prototypes have been changed). See this article for more information.

  • The nopfn() address space operation has been added.

  • SRCU - a version of read-copy-update which allows read-side blocking - has been merged. See this article by Paul McKenney for lots of details.

  • The CHECKSUM_HW value has long been used in the networking subsystem to support hardware checksumming. That value has been replaced with CHECKSUM_PARTIAL (intended for outgoing packets where the job must be completed by the hardware) and CHECKSUM_COMPLETE (for incoming packets which have been completely checksummed by the hardware).

  • A number of memory management changes have been merged, including tracking of dirty pages in shared memory mappings, making the DMA32 and HIGHMEM zones optional, and an architecture-independent mechanism for tracking memory ranges (and the holes between them).

  • The pud_page() and pgd_page() macros now return a struct page pointer, rather than a kernel virtual address. Code needing the latter should use pud_page_vaddr() or pgd_page_vaddr() instead.

  • A number of driver core changes including experimental parallel device probing and some improvements to the suspend/resume process.

  • There is now a notifier chain for out-of-memory situations; the idea here is to set up functions which might be able to free some memory when things get very tight.

  • The semantics of the kmap() API have been changed a bit: on architectures with complicated memory coherence issues, kmap() and kunmap() are expected to manage coherency for the mapped pages, thus eliminating the need to explicitly flush pages from cache.

  • PCI Express Advanced Error Reporting is now supported in the PCI layer.

  • A number of changes have been made to the inode structure in an effort to make it smaller.

  • Much improved suspend and resume support for the USB layer.

  • A new set of functions has been added to allow USB drivers to quickly check the direction and transfer mode of an endpoint.

  • A somewhat reduced version of Wireless Extensions version 21. Most of the original functionality has been removed with the idea that the wireless extensions will soon be superseded by something else.

  • Vast numbers of annotations enabling the sparse utility to detect big/little endian errors.

  • The flags field of struct request has been split into two new fields: cmd_type and cmd_flags. The former contains a value describing the type of request (filesystem request, sense, power management, etc.) while the latter has the flags which modify the way the command works (read/write, barriers, etc.).

  • The block layer can be disabled entirely at kernel configuration time; this option can be useful in some embedded situations.

  • The kernel now has a generic boolean type, called bool; it replaces a number of homebrewed boolean types found in various parts of the kernel.

  • There is a new function for allocating a copy of a block of memory:

        void *kmemdup(const void *src, size_t len, gfp_t gfp);
    
    A number of allocate-then-copy code sequences have been updated to use kmemdup() instead.
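
The allocate-then-copy sequence that kmemdup() replaces looks much the same in user space. Here is a minimal sketch, assuming malloc() in place of the kernel allocator and no gfp argument; dup_block() and struct config are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for kmemdup(): allocate len bytes
 * and fill them from src.  Returns NULL if the allocation fails.
 */
static void *dup_block(const void *src, size_t len)
{
    void *copy;

    assert(src != NULL);

    copy = malloc(len);
    if (copy)
        memcpy(copy, src, len);
    return copy;
}

/* Example structure to be duplicated. */
struct config {
    int irq;
    char name[8];
};
```

A call like dup_block(&cfg, sizeof(cfg)) replaces the separate allocation, NULL check, and memcpy() that would otherwise be written out by hand.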

2.6.18 (September 19, 2006)

  • The Video4Linux 2 API has changed: the huge ioctl() method has been moved into the V4L2 code. Video drivers provide a very long list of methods specific to the individual ioctl() commands. See <media/v4l2-dev.h>.

  • The generic IRQ layer has been merged. The SA_* flags to request_irq() have been renamed; the new prefix is IRQF_. A long series of patches has converted in-tree drivers over to the new names; the old names are scheduled for removal in January, 2007.

  • 64-bit resources are now supported. This change affects a number of users of the resource management API.

  • The kernel lock validator has gone in, along with a number of fixes for potential deadlocks found by the validator.

  • At long last, the devfs subsystem has been removed.

  • An API and support for the Intel I/OAT DMA engine.

  • The skb_linearize() function has been reworked, and no longer has a GFP flags argument. There is also a new skb_linearize_cow() function which ensures that the resulting SKB is writable.

  • Network drivers should no longer manipulate the xmit_lock spinlock in the net_device structure; instead, the following new functions should be used:

         void netif_tx_lock(struct net_device *dev);
         void netif_tx_lock_bh(struct net_device *dev);
         void netif_tx_unlock(struct net_device *dev);
         void netif_tx_unlock_bh(struct net_device *dev);
         int netif_tx_trylock(struct net_device *dev);
    

  • The long-deprecated inter_module API has finally been removed altogether.

  • A new kernel API providing access to the "inotify" functionality has been added.

  • The old scsi_request infrastructure has been removed, since there are no longer any in-tree drivers which use it.

  • The include file <linux/usb_input.h> is now <linux/usb/input.h>.

  • The VFS get_sb() filesystem method has a new prototype:

         int (*get_sb)(struct file_system_type *fstype, int flags,
                       const char *dev_name, void *data,
                       struct vfsmount *mnt);
    

    The mnt parameter is new; it allows the filesystem to receive a pointer to the target mount point structure. The mount point should be associated with the superblock in the get_sb() method with a call to:

         int simple_set_mnt(struct vfsmount *mnt, struct super_block *sb);
    

    The return value of get_sb() has also been changed to an int error status. The various get_sb_*() convenience functions have had the same changes applied. The purpose of all this work is to allow NFS to share superblocks across mount points.

  • The statfs() superblock operation has a new prototype:

         int (*statfs)(struct dentry *dentry, struct kstatfs *stats);
    

    The old struct super_block pointer is now a dentry pointer instead.

  • Some functions have been added to make it easy for kernel code to allocate a buffer with vmalloc() and map it into user space. They are:

         void *vmalloc_user(unsigned long size);
         void *vmalloc_32_user(unsigned long size);
         int remap_vmalloc_range(struct vm_area_struct *vma, void *addr,
                                 unsigned long pgoff);
    

    The first two functions are a form of vmalloc() which obtain memory intended to be mapped into user space; among other things, they zero the entire range to avoid leaking data. vmalloc_32_user() allocates low memory only. A call to remap_vmalloc_range() will complete the job; it will refuse, however, to remap memory which has not been allocated with one of the two functions above.

  • The read-copy-update API is now accessible only to GPL-licensed modules. The deprecated function synchronize_kernel() has also been removed.

  • There is a new strstrip() library function which removes leading and trailing white space from a string.

  • A new WARN_ON_ONCE macro will test a condition and complain if that condition evaluates true - but only once per boot.

  • A number of crypto API changes have been merged, the biggest being a change to most algorithm-specific functions to take a pointer to the crypto_tfm structure, rather than the old "context" pointer. This change was necessary to support parameterized algorithms.

  • There is a new make target "headers_install". Its purpose is to install a set of kernel headers useful for libraries and user-space tools. A limited set of headers is installed, and those headers are sanitized on their way to the destination directory. It is hoped that distributors will use this mechanism to set up kernel headers for inclusion from user space in the future.
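
As a rough illustration of the strstrip() entry above, here is a user-space sketch of the behavior described: trailing white space is cut off in place, and the returned pointer skips any leading white space. The name sketch_strstrip() is made up for this example; it models the semantics and is not the kernel's implementation.

```c
#include <ctype.h>
#include <string.h>

/*
 * User-space model of strstrip(). sketch_strstrip() is a hypothetical
 * name; the buffer is modified in place and never shifted.
 */
char *sketch_strstrip(char *s)
{
    size_t len = strlen(s);

    /* Cut trailing white space by moving the terminator back. */
    while (len > 0 && isspace((unsigned char)s[len - 1]))
        s[--len] = '\0';

    /* Skip leading white space by advancing the returned pointer. */
    while (*s && isspace((unsigned char)*s))
        s++;

    return s;
}
```

Since the returned pointer may not be the start of the buffer, a caller which allocated that buffer should hold on to the original pointer for freeing.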

2.6.17 (June 17, 2006)

  • Support for the SPARC "Niagara" architecture.

  • EXPORT_SYMBOL_GPL_FUTURE() has been merged.

  • The safe notifier patch has been merged, creating a new API for all notifier users.

  • The SLAB_NO_REAP slab cache option, which ostensibly caused the slab not to be cleaned up when the system is under memory pressure, has been removed. The kmem_cache_t typedef is also being phased out in favor of struct kmem_cache.

  • The "softmac" 802.11 subsystem has been merged. This code may eventually be phased out, however, in favor of the Devicescape code.

  • There is a new real-time clock subsystem, providing generalized RTC support and a well-defined driver interface.

  • A new utility function has been added:

         int execute_in_process_context(void (*fn)(void *data),
                                        void *data,
                                        struct execute_work *work);
    

    This function will arrange for fn() to be called in process context (where it can sleep). Depending on when execute_in_process_context() is called, fn() could be invoked immediately or delayed by way of a work queue.

  • The SMP alternatives patch has been merged.

  • The relayfs API has been reworked, but the sysfs interface has been left out for now.

  • There is a new tracing mechanism for developers debugging block subsystem code.

  • There is a new internal flag (FMODE_EXEC) used to indicate that a file has been opened for execution.

  • The obsolete MODULE_PARM() macro is gone forevermore.

  • A new function, flush_anon_page(), can be used in conjunction with get_user_pages() to safely perform DMA to anonymous pages in user space.

  • Zero-filled memory can now be allocated from slab caches with kmem_cache_zalloc(). There is also a new slab debugging option to produce a /proc/slab_allocators file with detailed allocation information.

  • There are four new ways of creating mempools:

         mempool_t *mempool_create_page_pool(int min_nr, int order);
         mempool_t *mempool_create_kmalloc_pool(int min_nr, size_t size);
         mempool_t *mempool_create_kzalloc_pool(int min_nr, size_t size);
         mempool_t *mempool_create_slab_pool(int min_nr, 
                                             struct kmem_cache *cache);
    

    The first creates a pool which allocates whole pages (the number of which is determined by order), while the second and third create a pool backed by kmalloc() and kzalloc(), respectively. The fourth is a shorthand form of creating slab-backed pools.

  • The prototype for hrtimer_forward() has changed:

         unsigned long hrtimer_forward(struct hrtimer *timer,
                                       ktime_t now, ktime_t interval);
    

    The new now argument is expected to be the current time. This change allows some calls to be optimized. The data field has also been removed from the hrtimer structure.

  • A whole set of generic bit operations (find first set, count set bits, etc.) has been added, helping to unify this code across architectures and subsystems.

  • The inode f_ops pointer - which refers to the file_operations structure for the open file - has been marked const. Quite a bit of code, which used to change that structure, has been changed to compensate. Similar changes have been made in many filesystems. "The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean)."

  • local_t is now a signed type.

  • Attributes in sysfs can be pollable.

  • A class_device can now have attribute groups created at registration time; to take advantage of this capability, store the desired groups in the new groups field.

  • The splice(), vmsplice(), and tee() system calls have been merged. Supporting those calls requires implementing two new file_operations methods. See this article for the final form of the splice_read() and splice_write() functions.
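
The role of the now argument to hrtimer_forward() may be easier to see with a small model. The sketch below is user-space C, with a plain 64-bit nanosecond count standing in for ktime_t and a made-up sketch_timer structure standing in for struct hrtimer. It assumes that the function pushes the expiration time past now in whole multiples of interval and returns the number of periods added; interval is assumed to be positive.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct hrtimer: just the expiry time, in ns. */
struct sketch_timer {
    int64_t expires;
};

/*
 * Model of the forwarding arithmetic: move the expiry past "now" in
 * steps of "interval" and report how many steps were needed.
 */
unsigned long sketch_forward(struct sketch_timer *timer,
                             int64_t now, int64_t interval)
{
    int64_t delta = now - timer->expires;
    unsigned long overruns;

    /* The timer has not expired yet; nothing to do. */
    if (delta < 0)
        return 0;

    /* Whole periods already missed, plus one to land in the future. */
    overruns = (unsigned long)(delta / interval) + 1;
    timer->expires += (int64_t)overruns * interval;
    return overruns;
}
```

Passing the current time in means that a caller which already has that value in hand need not cause the clock to be read a second time.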

2.6.16 (March 19, 2006)

  • The mutex code has been merged. The use of semaphores for mutual exclusion is now deprecated, and the current semaphore API may go away altogether.

  • The high-resolution kernel timer code has been merged. The new API allows for greater precision in timer values, though the underlying implementation is still limited by the timer interrupt resolution.

  • A new list function, list_for_each_entry_safe_reverse(), does just what one would expect.

  • A new atomic type, atomic_long_t, has been added; it is the size of a long, and thus 64 bits wide on 64-bit architectures. Supported functions are:
    • long atomic_long_read(atomic_long_t *l);
    • void atomic_long_set(atomic_long_t *l, long i);
    • void atomic_long_inc(atomic_long_t *l);
    • void atomic_long_dec(atomic_long_t *l);
    • void atomic_long_add(long i, atomic_long_t *l);
    • void atomic_long_sub(long i, atomic_long_t *l);

  • The "SLOB" memory allocator has been merged. SLOB is a drop-in replacement for the slab allocator, intended for very low-memory systems.

  • The dentry structure has been changed: the d_child and d_rcu fields are now overlaid in a union. This change shrinks this heavily-used structure and improves its cache behavior.

  • The usb_driver structure has a new field (no_dynamic_id) which lets a driver disable the addition of dynamic device IDs. The owner field has also been removed from this structure.

  • The device probe() and remove() methods have been moved from struct device_driver to struct bus_type. The bus-level methods will override any remaining driver methods.

  • Some significant changes to the SCSI subsystem aimed at eliminating the use of the old scsi_request structure. The SCSI software IRQ is no longer used; postprocessing happens via the generic block software IRQ instead.

  • Much of the core device model code has been reeducated to use the term "uevent" instead of "hotplug." Some changes which are visible outside of the core code include:
    • kobject_hotplug() becomes kobject_uevent()
    • struct kset_hotplug_ops becomes struct kset_uevent_ops, and its hotplug() member is now uevent()
    • add_hotplug_env_var() becomes add_uevent_var()

  • The block I/O barrier code has been rewritten. This patch changes the barrier API and also adds a new parameter to end_that_request_last().

  • The block_device_operations structure has a new method getgeo(); its job is to fill in an hd_geometry structure with information about the drive. With this operation in place, many block drivers will not need an ioctl() function at all.

  • Linas Vepstas's PCI error recovery patch has been merged.

  • Compilers prior to gcc 3.2 can no longer be used to build kernels.

  • When the kernel is configured to be optimized for size, gcc (if it's version 4.x) is given the freedom to decide whether inline functions should really be inlined. The __always_inline attribute now truly forces inlining in all cases. This is an outcome from the discussion on inline functions held at the beginning of the year.
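
For those who want to see the atomic_long_t calling conventions listed above in action, here is a user-space model built on C11 atomics. The sketch_ names are invented for this example, and the kernel's implementation is architecture-specific and looks nothing like this. The point is the argument order: as with the atomic_t functions, the value comes first and the pointer second in the add and subtract operations.

```c
#include <stdatomic.h>

/* Hypothetical user-space stand-in for the kernel's atomic_long_t. */
typedef struct {
    atomic_long counter;
} sketch_atomic_long_t;

static inline long sketch_atomic_long_read(sketch_atomic_long_t *l)
{
    return atomic_load(&l->counter);
}

static inline void sketch_atomic_long_set(sketch_atomic_long_t *l, long i)
{
    atomic_store(&l->counter, i);
}

static inline void sketch_atomic_long_inc(sketch_atomic_long_t *l)
{
    atomic_fetch_add(&l->counter, 1);
}

static inline void sketch_atomic_long_dec(sketch_atomic_long_t *l)
{
    atomic_fetch_sub(&l->counter, 1);
}

/* Note the order: the amount first, then the variable to change. */
static inline void sketch_atomic_long_add(long i, sketch_atomic_long_t *l)
{
    atomic_fetch_add(&l->counter, i);
}

static inline void sketch_atomic_long_sub(long i, sketch_atomic_long_t *l)
{
    atomic_fetch_sub(&l->counter, i);
}
```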

2.6.15 (January 2, 2006)

  • The nested class device patch was merged, allowing class_device structures to have other class_devices as parents. This patch is a hack to make the input subsystem work with sysfs. This code will change again in the future; see Greg Kroah-Hartman's article for more information on what is planned.

  • The prototypes for the driver model class "interface" methods add() and remove() have changed; there is now a new parameter pointing to the relevant interface structure.

  • A new platform_driver structure has been added to describe drivers for devices built into the core "platform."

  • The prototypes for the suspend() and resume() methods in struct device_driver have changed. They are also only called once per event, rather than three times as in previous kernels.

  • Two new fields have been added to the dev_pm_info structure; they control how drivers should act on hardware-created wakeup events. See this article for details.

  • There is a notification mechanism which lets interested modules know when a USB device is added to (or removed from) the system. This system is used by some core code; drivers do not normally need to hook in to it.

  • The gfp_t type is now used throughout the kernel. If you have a function which takes memory allocation flags, it should probably be using this type.

  • Code using reader/writer semaphores can now use rwsem_is_locked() to test the (read) state of the semaphore without blocking.

  • The new vmalloc_node() function allocates memory on a specific NUMA node.

  • The "reserved" bit for memory pages has, for all practical purposes, been removed.

  • vm_insert_page() has been added to make it easier for drivers to remap RAM into user space VMAs.

  • There is a new kthread_stop_sem() function which can be used to stop a kernel thread which might be currently blocked on a specific semaphore.

  • RapidIO bus support has been merged into the mainline.

  • The netlink connector mechanism makes netlink code easier to write. Independently, a type-safe netlink interface has been added and is used in parts of the networking subsystem.

  • These kernel symbols have been unexported and are no longer available to modules: clear_page_dirty_for_io, console_unblank, cpu_core_id, hugetlb_total_pages, idle_cpu, nr_swap_pages, phys_proc_id, reprogram_timer, swapper_space, sysctl_overcommit_memory, sysctl_overcommit_ratio, sysctl_max_map_count, total_swap_pages, user_get_super, uts_sem, vm_acct_memory, and vm_committed_space.

  • Version 1 of the Video4Linux API is now officially scheduled for removal in July, 2006.

  • The owner field has been removed from the pci_driver structure.

  • A number of SCSI subsystem typedefs (Scsi_Device, Scsi_Pointer, and Scsi_Host_Template) have been removed.

  • The DMA32 memory zone has been added to the x86-64 architecture; its purpose is to make it easy to allocate memory below the 4GB barrier (with the new GFP_DMA32 flag).

  • A call to rcu_barrier() will block the calling process until all current RCU callbacks have completed.

2.6.14 (October 27, 2005)

  • A new PHY abstraction layer has been added for network drivers.

  • The sk_buff structure has changed again; the changes will force a recompile but shouldn't otherwise be a problem.

  • Version 19 of the wireless extensions has been merged. Among other things, this version deprecates the get_wireless_stats() method in the net_device structure.

  • The klist API has changed. The order of the parameters has been reversed for klist_add_head() and klist_add_tail(). It is now necessary to provide a pair of reference counting functions when setting up a list with klist_init().

  • The relayfs virtual filesystem, which enables high-rate data transfers between the kernel and user space, has been merged.

  • kzalloc() has been added as a way of obtaining pre-zeroed memory.

  • Two new versions of schedule_timeout() have been added.

  • The new TASK_NONINTERACTIVE state flag tells the scheduler not to perform the usual accounting on sleeping processes.

  • SKB's which are expected to be cloned can be efficiently allocated with alloc_skb_fclone().

  • A few new helper functions for mapping block I/O requests have been added; see this article for details.

  • Securityfs, a virtual filesystem intended for use with security modules, has been merged.
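
The kzalloc() entry above amounts to "allocate, then clear." A minimal user-space sketch of that pattern follows; malloc() stands in for kmalloc(), the allocation-flags argument is left out, and sketch_kzalloc() is a name invented for this example.

```c
#include <stdlib.h>
#include <string.h>

/*
 * User-space model of the kzalloc() pattern: allocate a buffer and
 * zero it before handing it back. Returns NULL on failure.
 */
void *sketch_kzalloc(size_t size)
{
    void *p = malloc(size);   /* stands in for kmalloc(size, flags) */

    if (p)
        memset(p, 0, size);
    return p;
}
```

Folding the two steps into one call removes a common source of bugs: code which forgets the memset() or clears the wrong number of bytes.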

2.6.13 (August 28, 2005)

  • The HZ constant is now configurable at kernel build time.

  • The timer API now includes try_to_del_timer_sync(), which makes a best effort to delete the timer; it is safe to call in atomic context.

  • The block_device_operations structure now has an unlocked_ioctl() member.

  • The return value from netif_rx() has changed; it now will return one of only two values: NETIF_RX_SUCCESS or NETIF_RX_DROP.

  • pci_dma_burst_advice() can be used by PCI drivers to learn the optimal way of bursting DMA transfers.

  • The text searching API has been added.

  • A new memory allocation function, kzalloc(), has been added.

  • A big set of driver core changes, including the removal of the class_simple interface and new prototypes for the device structure sysfs methods. [FR]
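
With HZ now a build-time choice, code which hard-codes a tick count will wait for different amounts of real time on differently-configured kernels. The sketch below models the usual cure: express delays in milliseconds and convert. It is user-space arithmetic only; sketch_msecs_to_ticks() is a made-up name, and rounding up (so that a delay is never shortened) is an assumption of this example. In-kernel code would use msecs_to_jiffies() rather than anything like this.

```c
/*
 * Convert a delay in milliseconds to timer ticks for a given tick
 * rate (hz), rounding up to the next whole tick.
 */
unsigned long sketch_msecs_to_ticks(unsigned long ms, unsigned long hz)
{
    return (ms * hz + 999) / 1000;
}
```

A 10ms delay thus works out to one tick at HZ=100, three ticks at HZ=250, and ten ticks at HZ=1000.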

2.6.12 (June 17, 2005)

  • cancel_rearming_delayed_work() was added to the workqueue API.

  • The timeout value passed to usb_bulk_msg() and usb_control_msg() is now expressed in milliseconds instead of jiffies.

  • An interrupt-disabling spinlock is used in the rwsem implementation. It was never correct to call one of the variants of down_read() or down_write() with interrupts disabled, but it is even less correct now.

  • The fields in the net_device structure have been rearranged, which will break binary-only drivers.

  • kref_put() now returns an int value: nonzero if the kref was actually released.

  • kobject_add() and kobject_del() no longer generate hotplug events. If you need these events, you must call kobject_hotplug() explicitly. The wrapper functions kobject_register() and kobject_unregister() do still generate hotplug events.

  • kobj_map() no longer takes a subsystem argument; instead, it needs a pointer to a semaphore which it can use for mutual exclusion.

  • A new function, sysfs_chmod_file(), allows permissions to be changed on existing sysfs attributes.

  • There is a new generic sort() function which should be used in preference to creating yet another implementation.

  • A new attribute (__nocast) is being used with sparse to disable a number of implicit casts and find probable bugs.

  • io_remap_page_range() is now deprecated; use io_remap_pfn_range() instead.

  • A set of functions has been added to work with big-endian I/O memory.

  • synchronize_kernel() is deprecated. Callers should instead use either synchronize_sched() (to verify that all processors have quiesced) or synchronize_rcu() (to verify that all processors have exited RCU critical sections).

  • The flag argument to blk_queue_ordered() has changed to indicate how ordered writes are handled by the device. Possible values are QUEUE_ORDERED_NONE (ordering is not possible), QUEUE_ORDERED_TAG (ordering is forced with request tags), and QUEUE_ORDERED_FLUSH (ordering is done with explicit flush commands). For the last case, the request queue has two new methods, prepare_flush_fn() and end_flush_fn(), which are called before and after a barrier request.

  • A new function, valid_signal(), can (and should) be used to test whether signal numbers from user space are valid.

  • The Developers Certificate of Origin, the document acknowledged by all those "Signed-off-by:" headers, has changed. The new version adds a clause noting that contributions - and the information that goes with them - are public information which can be redistributed.
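
The kref_put() return value described above can be modeled in a few lines. What follows is a deliberately simplified user-space sketch: the count is a plain int rather than an atomic type, and every sketch_ name is invented for this example. It shows the release function being passed to the put operation, and the nonzero return which tells the caller that the object is gone.

```c
#include <stddef.h>

/* Simplified, non-atomic stand-in for struct kref. */
struct sketch_kref {
    int refcount;
};

/* A hypothetical object which embeds a reference count. */
struct sketch_obj {
    struct sketch_kref ref;
    int released;
};

void sketch_kref_init(struct sketch_kref *k)
{
    k->refcount = 1;
}

void sketch_kref_get(struct sketch_kref *k)
{
    k->refcount++;
}

/* Returns 1 if this call dropped the last reference, 0 otherwise. */
int sketch_kref_put(struct sketch_kref *k,
                    void (*release)(struct sketch_kref *))
{
    if (--k->refcount == 0) {
        release(k);
        return 1;
    }
    return 0;
}

/* Release callback: find the containing object and mark it released. */
void sketch_obj_release(struct sketch_kref *k)
{
    struct sketch_obj *obj = (struct sketch_obj *)
        ((char *)k - offsetof(struct sketch_obj, ref));

    obj->released = 1;
}
```

A caller seeing a nonzero return must not touch the object again; a zero return says only that this particular call did not release it.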

2.6.11 (March 2, 2005)

  • The kernel now performs access checking for read() and write() calls before invoking the driver- or filesystem-specific file_operations method.

  • The bcopy() function, unused in the mainline kernel, has been removed.

  • The prototype of the suspend() method in struct pci_driver has changed; the state parameter is now of type pm_message_t.

  • The rwlock_is_locked() macro has been removed; instead, use either read_can_lock() or write_can_lock(). There is also a new spin_can_lock() for regular spinlocks.

  • Three new ways of waiting for completions have been added: wait_for_completion_interruptible(), wait_for_completion_timeout(), and wait_for_completion_interruptible_timeout().

  • For USB drivers: the usb_device_descriptor and usb_config_descriptor structures now keep all fields in the wire (little-endian) form. [GKH]

  • pci_set_power_state() and pci_enable_wake() have new prototypes: power states are represented with the pci_power_t type rather than an int. [GKH]

  • The Big Kernel Semaphore patch was merged. As a result, code which is protected by lock_kernel() is now preemptible. This change should not affect most code developed in this century, but there are always exceptions.

  • The file_operations structure now contains an unlocked_ioctl() member. If that member is non-NULL, it will be called in preference to the regular ioctl() method - and the big kernel lock will not be held. New code should use unlocked_ioctl() and the programmer should ensure that the proper locking has been performed.

    There is also a new compat_ioctl() method which is called, if present, when a 32-bit process calls ioctl() on a 64-bit system.

  • Run-time initialization of spinlocks is being converted away from the assignment form (using SPIN_LOCK_UNLOCKED) to explicit spin_lock_init() calls. No noises have yet been made about removing SPIN_LOCK_UNLOCKED, but the writing should be considered to be on the wall. If and when the real-time preemption patches are merged, the assignment form may no longer be possible.

  • debugfs has been merged; it is a virtual filesystem intended for use by kernel hackers who want to export debugging information from their code.

  • Binary attributes in sysfs can now offer mmap() support; see this patch for the details.

  • Four-level page tables have been merged. This change affects surprisingly little code, but, if you are manually walking through the page table tree, you will have to take the new level into account.

  • Socket buffers can be obtained from alloc_skb_from_cache(), which uses a slab cache.

  • A new memory allocation flag (__GFP_ZERO) was added; it allows kernel code to request that the allocated memory be zeroed. It is part of the larger prezeroing patch which has not, yet, been merged.

  • Linus has reimplemented pipes with a circular buffer construct which will, eventually, be mutated into a more generic form.

  • Work is being done toward the goal of removing the semaphore from struct subsystem. If your code depends on this semaphore, which it shouldn't, expect to have to change it soon.
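
The dispatch rule for unlocked_ioctl() described above can be sketched as a small user-space model. Everything here is a stand-in: the structure and handler signatures are trimmed down (the real methods also take inode and file pointers), a simple flag plays the part of the big kernel lock, and all of the sketch_ names are invented for this example.

```c
#include <errno.h>
#include <stddef.h>

/* Models "the big kernel lock is currently held". */
static int sketch_bkl_held;

/* Trimmed-down stand-in for struct file_operations. */
struct sketch_file_operations {
    int  (*ioctl)(unsigned int cmd, unsigned long arg);
    long (*unlocked_ioctl)(unsigned int cmd, unsigned long arg);
};

long sketch_do_ioctl(const struct sketch_file_operations *fops,
                     unsigned int cmd, unsigned long arg)
{
    long ret;

    /* Preferred path: no lock is taken on the driver's behalf. */
    if (fops->unlocked_ioctl)
        return fops->unlocked_ioctl(cmd, arg);

    if (!fops->ioctl)
        return -ENOTTY;

    /* Legacy path: the old method runs under the big lock. */
    sketch_bkl_held = 1;
    ret = fops->ioctl(cmd, arg);
    sketch_bkl_held = 0;
    return ret;
}

/* Example handlers which report whether the modeled lock was held. */
int sketch_locked_handler(unsigned int cmd, unsigned long arg)
{
    (void)cmd;
    (void)arg;
    return 100 + sketch_bkl_held;
}

long sketch_unlocked_handler(unsigned int cmd, unsigned long arg)
{
    (void)cmd;
    (void)arg;
    return 200 + sketch_bkl_held;
}
```

A driver supplying only the new method gets no locking at all from the caller, which is why it must provide its own.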

2.6.10 (December 24, 2004)

  • Calling pci_enable_device() is required to get interrupt routing to work. [GKH]

  • A new function, pci_dev_present(), can be used to determine whether a specific device is present or not. [GKH]

  • The prototypes to pci_save_state() and pci_restore_state() have changed: the buffer argument is no longer needed (the space has been allocated in struct pci_dev instead). [GKH]

  • The kernel build system was tweaked; the preferred name for kernel makefiles is now Kbuild. The change is meant to highlight the fact that kernel makefiles are rather different than the user-space variety, but very few makefiles, if any, have been renamed.

  • add_timer_on(), sys_lseek(), and a number of other kernel functions are no longer exported to modules. Most of the driver core functions have been changed to GPL-only exports.

  • I/O space write barriers are now supported.

  • The prototype of kunmap_atomic() has changed. This change should not affect properly-written code, but should generate warnings when a struct page pointer is (erroneously) passed to that function.

  • atomic_inc_return() was added as a way to increment the value of an atomic_t variable and get the new value.

  • The little-used "BIO walking" helper functions (process_that_request_first()) have been removed.

  • The venerable remap_page_range() function has been changed to remap_pfn_range(); the new function uses a page frame number for the physical address, rather than the actual address. remap_page_range() is still supported - for now.

  • wake_up_all_sync(), unused in the mainline tree, was removed.

  • A simple, stream-oriented circular buffer implementation was added.

  • The kernel event mechanism was merged, making it possible to notify user space of relevant kernel events.

  • vfs_permission() was replaced by generic_permission(), which has an optional callback for ACL checking. [MS]
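
Converting a remap_page_range() call to remap_pfn_range(), as described above, mostly comes down to turning the physical address into a page frame number. The user-space sketch below shows that arithmetic; it assumes 4KB pages, and both SKETCH_PAGE_SHIFT and sketch_phys_to_pfn() are names invented for this example. Kernel code would shift by the architecture's PAGE_SHIFT instead.

```c
/* Assumption for this example: 4KB pages, as on i386. */
#define SKETCH_PAGE_SHIFT 12

/* A page frame number is the physical address with the offset bits dropped. */
unsigned long sketch_phys_to_pfn(unsigned long long phys)
{
    return (unsigned long)(phys >> SKETCH_PAGE_SHIFT);
}
```

Working with frame numbers lets a 32-bit argument describe physical memory above 4GB, which a byte address of the same size cannot reach.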

2.6.9 (October 18, 2004)

  • Kprobes was merged, making another debugging technique available.

  • Spinlocks are implemented completely out of line now. This change should not affect any code.

  • wait_event_timeout() was added.

  • Kobjects now use the kref type to handle reference counting. Most code should be unaffected by this change.

  • A new set of functions for accessing I/O memory was introduced. The new functions are cleaner and type-safe, and should be used in preference to readb() and friends. The new ioport_map() function makes it possible to treat I/O ports as if they were I/O memory.

  • The NETIF_F_LLTX feature for net_devices tells the networking subsystem that the driver code performs its own locking and does not require that the xmit_lock be taken before hard_start_xmit() can be called.

  • dma_declare_coherent_memory() was added to allow the DMA functions to hand out memory located on a specific device.

  • msleep_interruptible() was added.

  • The prototype of kref_put() changed; a pointer to the release() function is now required.

2.6.8 (August 13, 2004)

  • The fcntl() method in the file_operations structure, just added in 2.6.6, was removed. It has been replaced by two new methods: check_flags() and dir_notify().

  • nonseekable_open() was added as a way of indicating that a given file is not seekable.

  • wait_event_interruptible_exclusive() was added.

  • dma_get_required_mask() was added as a way for drivers to determine the optimal DMA mask.

  • Module section information was added under /sys/module, making it easier to use symbolic debuggers with modules.

  • The VFS follow_link() method saw some (compatible) changes. Filesystems should use the new symlink lookup method so that the kernel can, eventually, support a greater link depth. [MS]

(We are still in the process of filling in the earlier API changes - stay tuned).

Acknowledgements

Thanks to the following people who have helped keep this page current:

[FR] Farzad Raiyat
[GKH] Greg Kroah-Hartman
Michael Hayes
[MS] Miklos Szeredi



More API changes in 2.6.16

Posted Aug 4, 2006 19:32 UTC (Fri) by smurf (subscriber, #17840) [Link] (2 responses)

An email address to report new (or missing) entries would be nice...

  • 2.6.16: network device drivers' ioctl() methods cannot call dev_ioctl() any more; they need to return -ENOIOCTLCMD instead, and the upper layer will do it for them.

    Accordingly, dev_ioctl() is no longer exported to modules.

  • 2.6.16: tty buffering layer revamp.

    The interface is now

     
            len = tty_request_buffer_room(tty, amount_hardware_says);
            tty_insert_flip_string(tty, buffer_from_card, len);
        

    There is also a

            int tty_prepare_flip_string(tty, strptr, len);
    

    so that one can copy large blocks directly from I/O space to the flip buffer.

More API changes in 2.6.16

Posted Aug 22, 2008 9:44 UTC (Fri) by mlfowler (guest, #29063) [Link]

I've just been experimenting with ldd3's tty chapter and had got stuck when the compiler complained about tty->filp.count. Your errata pointed me in the right direction, thanks! One small correction, turns out the interface is:
    len = tty_buffer_request_room(tty, amount_hardware_says);
    tty_insert_flip_string(tty, buffer_from_card, len);

More API changes

Posted Jan 6, 2010 21:08 UTC (Wed) by phormion (guest, #62841) [Link]

We have this fairly complex proprietary kernel module which we currently build against 2.6.7 (ancient, I know) and decided to try with a more recent kernel (2.6.27 - that's what we have available at the moment). There were a few issues, and what's surprising is almost none of them could be found on this page:
  • the 'func' member in struct packet_type has changed signature (one extra net_device * argument); since 2.6.14, see here
  • INIT_WORK() takes only 2 arguments since 2.6.20 (see here)
  • some functions (dev_get_by_index(), devinet_ioctl()) take an extra argument pointing to the net namespace
  • other small stuff like SET_MODULE_OWNER() removed (was a nop in 2.6 anyway), netfilter defines like NF_IP_LOCAL_* renamed to NF_INET_LOCAL_*

An entry is needed for the _syscallN macro removal in kernel 2.6.19

Posted Jun 2, 2007 8:04 UTC (Sat) by rael (guest, #40124) [Link]

Source: http://www.kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.19-git13.log. Please switch to the user-space function syscall(2):
     #include <sys/syscall.h>
     #include <unistd.h>

     int
     syscall(int number, ...);

API changes in the 2.6.22 kernel

Posted Nov 12, 2007 13:48 UTC (Mon) by abacus (guest, #49001) [Link]

There is a small error in the 2.6.22 API changes: skb->skb_mac_header should be
skb->mac_header.

LWN seems to have forgotten to keep the /2.6-kernel-api/ page updated since 2.6.25 (April 16, 2008)

Posted Nov 16, 2008 9:04 UTC (Sun) by rael (guest, #40124) [Link]

LWN seems to have forgotten to keep the /2.6-kernel-api/ page updated since 2.6.25 (April 16, 2008):
http://lwn.net/Articles/2.6-kernel-api/

Thanks.

And I tried to send email to kernel@lwn.net, but it failed with:

Delivery to the following recipient failed permanently:

kernel@lwn.net

Technical details of permanent failure:
Google tried to deliver your message, but it was rejected by the recipient domain. We recommend contacting the other email provider for further information about the cause of this error. The error that the other server returned was: 550 550 5.1.1 <kernel@lwn.net>: Recipient address rejected: User unknown in local recipient table (state 14).

API changes in the 2.6 kernel series

Posted Dec 5, 2008 20:47 UTC (Fri) by linuxjacques (subscriber, #45768) [Link]

me too.

this was a useful feature but seems to be neglected now.

API changes in the 2.6 kernel series

Posted Jun 1, 2013 3:56 UTC (Sat) by duxing2007 (guest, #91235) [Link]

Hello everybody. The source code examples in LDD3 are based on 2.6.10 and won't compile on the kernel versions used by most current Linux distributions. I've ported them to all of the Linux longterm stable branches after 3.0; see https://github.com/duxing2007/ldd3-examples-3.x.


Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds