|
|
Log in / Subscribe / Register

4.3 Merge window, part 3

By Jonathan Corbet
September 14, 2015
Last week's merge window summary ended with a guess that the bulk of the changes for 4.3 had been seen by that point. By the time Linus released 4.3-rc1 and closed the merge window, 10,756 non-merge changesets had been pulled into the mainline repository; that's about 550 since last week. So the patch rate did indeed fall off as expected, but there were still some significant changes that slipped in before the window closed.

Significant user-visible changes include:

  • The new file /proc/kpagecgroup can be used to determine which memory control group each page of physical memory is charged to.

  • The idle-page tracking feature has been merged. This feature enables the discovery of memory pages that are not in active use; this information can be used to optimize the allocation of memory between containers or virtual machines.

  • The membarrier() system call, which has been circulating since (at least) 2010, has been merged. See this commit for the latest description and man page.

  • The control-group writeback improvement patches have been merged.

  • New hardware support includes Microchip LAN88XX PHYs, NXP LPC18xx/43xx watchdog timers, Atmel SAMA5D4 watchdog timers, Toradex Colibri VF50 touchscreens, and Freescale i.MX6UL touchscreen controllers.

get_vaddr_frames()

With regard to changes visible to kernel developers: there has been one significant addition to the memory-management interface. Certain driver subsystems (media, for example) have long had to reach deep into the memory-management subsystem to implement high-performance I/O to user-space buffers. Some of the resulting code raised eyebrows in memory-management circles; it also stood in the way of efforts to make the mmap_sem semaphore less of a bottleneck.

Jan Kara has been working on reducing mmap_sem use for a while; that effort has extended into improving the memory-management primitives used in the driver tree. In 4.3 he has added a new set of helpers for the easy mapping of I/O buffers. It creates a new type, struct frame_vector, to describe a mapped buffer. That structure lives in <linux/mm.h>, but it's probably best if most users treat it as if it were an opaque structure.

A frame vector can be allocated or destroyed with:

    struct frame_vector *frame_vector_create(unsigned int nr_frames);
    void frame_vector_destroy(struct frame_vector *vec);

Here, nr_frames is the maximum number of pages that will be mapped using this vector. The mapping itself is done with:

    int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
		         bool write, bool force, struct frame_vector *vec);

The beginning virtual address of the buffer to be mapped is passed in start, and the size of the buffer in nr_frames. If write is set, the buffer will be mapped for write access; if force is set, write access will be set up even if the buffer is mapped read-only in user space. After a successful call, vec will contain the results of the mapping, and the return value will be the number of pages actually mapped.

If the buffer lives in ordinary memory, get_vaddr_frames() will take a reference to each mapped page to keep it in RAM. That reference must be released at some point to unpin the pages; to do so, call:

    void put_vaddr_frames(struct frame_vector *vec);

Note that frame_vector_destroy() does not make this call; users must take care to do it themselves.

Once upon a time (i.e. last year), this type of interface would have returned an array of struct page pointers to refer to the mapped pages. The increasing use of not quite real memory in hardware has created pressure to use page-frame numbers (PFNs) instead. As it happens, the contents of the frame vector may be either struct page pointers or PFNs, depending on the type of memory mapped. Driver-level code need not be aware of this practice, but it does have to be explicit about what it wants. To gain access to the mapped buffer (for DMA mapping, for example), use one of:

    struct page **frame_vector_pages(struct frame_vector *vec);
    unsigned long *frame_vector_pfns(struct frame_vector *vec);

The call to frame_vector_pages() can fail if it is not possible to represent the buffer using page structures; the error code will be returned as an ERR_PTR() value, so a macro like IS_ERR() should be used to check the returned pointer before using it. Conversion to PFNs is unconditionally successful in the current implementation.

All of the above functions are exported to modules (with no explicit GPL limitation). Driver code has been fixed up in a number of places (example) to use the new interface; the result is a net reduction in lines of code and, hopefully, an improvement in robustness.

In summary

Meanwhile, the 4.3 development cycle has now entered the stabilization phase. For the curious, Stephen Rothwell's post-merge-window summary gives some statistics about the changes that were merged this time around. It seems that 94% of them were in the linux-next tree at the beginning of the merge window, with the DRM and networking trees being the biggest source of commits that weren't there. Some 587 commits in linux-next didn't make it into the mainline; nearly a quarter of those come from the kdbus tree, which was not proposed for merging this time around. Also not merged was OrangeFS, which ran into some trouble when the pull request went out; chances are good OrangeFS will make it in for 4.4.

Index entries for this article
Kernelget_vaddr_frames()
KernelReleases/4.3


to post comments


Copyright © 2015, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds