In the last discussion on day one of the 2012 ARM minisummit, Marek
Szyprowski gave a status update on changes in the ARM DMA subsystem over
the last year. There has been a lot
of work in that time, with most of it having been merged in 3.5. The most
important change is the conversion to dma_map_ops, which provides
a common DMA framework that can be implemented as needed for each
architecture. It allows for both coherent and non-coherent devices, and
supports bounce buffers and IOMMUs.
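To illustrate the idea of a per-device table of DMA operations, here is a
minimal user-space sketch in C. It is not the kernel's dma_map_ops
definition: the toy_* names, the two-callback table, and the 256-byte
bounce buffer are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified operations table; the real kernel structure
 * has many more callbacks and different signatures. */
struct toy_dma_ops {
    uint64_t (*map)(void *cpu_addr, size_t len);
    void (*unmap)(uint64_t dma_addr, size_t len);
};

/* Each device points at the implementation that suits its hardware. */
struct toy_device {
    const struct toy_dma_ops *ops;
};

/* "Direct" implementation: the device sees the CPU address unchanged. */
static uint64_t toy_direct_map(void *cpu_addr, size_t len)
{
    (void)len;
    return (uint64_t)(uintptr_t)cpu_addr;
}

static void toy_direct_unmap(uint64_t dma_addr, size_t len)
{
    (void)dma_addr;
    (void)len;
}

/* "Bounce" implementation: data is copied into a buffer the device can
 * reach, and the address of that buffer is handed out instead. */
static unsigned char toy_bounce_buf[256];

static uint64_t toy_bounce_map(void *cpu_addr, size_t len)
{
    assert(len <= sizeof(toy_bounce_buf));
    memcpy(toy_bounce_buf, cpu_addr, len);
    return (uint64_t)(uintptr_t)toy_bounce_buf;
}

static void toy_bounce_unmap(uint64_t dma_addr, size_t len)
{
    (void)dma_addr;
    (void)len;
}

const struct toy_dma_ops toy_direct_ops = { toy_direct_map, toy_direct_unmap };
const struct toy_dma_ops toy_bounce_ops = { toy_bounce_map, toy_bounce_unmap };

/* Callers use one entry point; the table decides what actually happens. */
uint64_t toy_dma_map(struct toy_device *dev, void *cpu_addr, size_t len)
{
    return dev->ops->map(cpu_addr, len);
}

void toy_dma_unmap(struct toy_device *dev, uint64_t dma_addr, size_t len)
{
    dev->ops->unmap(dma_addr, len);
}
```

In this sketch a driver calls only toy_dma_map() and toy_dma_unmap();
whether the buffer is used in place or bounced depends entirely on which
table the device was given.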
The second most important change was the addition of the Contiguous
Memory Allocator (CMA). It is in 3.5, but is still marked as
experimental. It has been tested on some systems, and Szyprowski hopes
that it will stabilize over the next kernel cycle or so.
Lastly, a bunch of new attributes for DMA operations have been added.
These are mostly for improving performance and to "avoid some hacks",
Szyprowski said. For upcoming releases, he would like to work on better
support for declaring coherent areas.
For 3.5, there was work to remove some of the limits on DMA, in
particular, the 2MB limit on mappings. The fixed-size coherent area has
been replaced with memory from vmalloc(). That can't be done in
atomic context, however, so there is a small pre-allocation for use in that
context. For some devices that buffer was too small, so the size has been
made platform dependent. The IOMMU implementation had no support for an
atomic buffer at all, but patches have been posted recently, which he hopes
to get into 3.6.
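The split between a general allocator and a small pre-allocated pool for
atomic context can be sketched in user-space C as follows. This is only a
model of the idea, not the kernel code: malloc() stands in for the
allocator that may sleep, and the toy_* names, the 4096-byte pool size, and
the bump-style pool are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pool size; the article notes the real size was made
 * platform dependent because one value did not suit every device. */
#define TOY_ATOMIC_POOL_SIZE 4096

/* Backing storage declared as 64-bit words so the pool is 8-byte aligned. */
static uint64_t toy_pool_words[TOY_ATOMIC_POOL_SIZE / sizeof(uint64_t)];
static size_t toy_pool_used;

/* Report whether a pointer lies inside the pre-allocated pool. */
bool toy_from_pool(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t base = (uintptr_t)toy_pool_words;

    return addr >= base && addr < base + TOY_ATOMIC_POOL_SIZE;
}

/* Allocate a buffer; "atomic" means the caller must not sleep, so only
 * the pre-allocated pool may be used. */
void *toy_coherent_alloc(size_t size, bool atomic)
{
    size_t aligned;
    void *p;

    assert(size > 0);

    if (!atomic)
        return malloc(size);            /* stand-in for the sleeping path */

    aligned = (size + 7) & ~(size_t)7;  /* keep 8-byte alignment */
    if (aligned < size || aligned > TOY_ATOMIC_POOL_SIZE - toy_pool_used)
        return NULL;                    /* pool exhausted or size overflow */

    p = (unsigned char *)toy_pool_words + toy_pool_used;
    toy_pool_used += aligned;
    return p;
}
```

The sketch never returns pool memory, which keeps it short; it is enough
to show why a pool that is too small makes atomic allocations fail while
non-atomic ones still succeed.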
The IOMMU code is not particularly ARM-specific, Szyprowski said; it could
be used for other architectures. There is a bit more work to isolate the
common code and make it generic, but he would need to coordinate that work
with the other architectures. Arnd Bergmann suggested just moving the code
to a generic place, but leaving it turned off for other architectures. That
would allow others interested to turn it on and try it out.
Bergmann noted that when CMA was proposed a year and a half ago, it was
envisioned that it would be unconditionally built for all v6 and v7
platforms. But that would make all recent ARM architectures depend on an
experimental feature, so he suggested that it might be time to turn off the
experimental designation.
There are still some issues that need to be resolved before that can
happen, Szyprowski said. There are cases where the allocation can fail
because of different accounting between movable and non-movable
regions. But Mel Gorman strongly recommended building CMA by default since
the problems just result in an allocation failure and do not cause a
full system failure. He suggested making CMA the default with a fall-back
to the old code if it fails. That way people will start using the feature,
potentially see fall-back warnings, and help fix the problems. If it stays
as an experimental feature, he fears that no one will actually use and test
CMA.
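The fall-back pattern Gorman described might look roughly like the
following user-space C sketch. Nothing here is kernel API: both allocators
are stand-ins built on malloc(), and the toy_* names, the failure switch,
and the fall-back counter are hypothetical and exist only to make the
control flow visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical switch that lets a caller simulate a CMA failure. */
bool toy_cma_should_fail;

/* Hypothetical counter of how often the fall-back path was taken. */
unsigned int toy_fallback_count;

/* Stand-in for the new allocator, which may fail. */
static void *toy_cma_alloc(size_t size)
{
    if (toy_cma_should_fail)
        return NULL;
    return malloc(size);
}

/* Stand-in for the older allocation path. */
static void *toy_legacy_alloc(size_t size)
{
    return malloc(size);
}

/* Try the new allocator first; on failure, warn and use the old one. */
void *toy_contig_alloc(size_t size)
{
    void *p;

    assert(size > 0);

    p = toy_cma_alloc(size);
    if (p)
        return p;

    toy_fallback_count++;
    fprintf(stderr, "toy: CMA allocation of %zu bytes failed, falling back\n",
            size);
    return toy_legacy_alloc(size);
}
```

The warning is the point of the design: the caller still gets memory, but
the failure is recorded so that it can be reported and fixed.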
Bergmann thought that any platform using a boot-time reservation of memory
(i.e. a "carve out") should be forced into using CMA. One of the problems
with that idea is that some of the carve-outs are not upstream because they
are for out-of-tree graphics hardware. In addition, the vendors are moving
on and are no longer interested in adding features or updating their
drivers to use a new feature like CMA.
Noting that there are multiple ways to do carve-outs, Gorman also suggested
creating a core carve-out API for code consolidation. It could provide
memory that is isolated or DMA-able, for example, so that all of the
carve-outs in the kernel could use it. CMA could underlie that API, and it
could implement the fall-back until CMA shakes out.
Fragmentation within CMA regions was mentioned as a concern. While Gorman
didn't think it all that likely to happen in practice, some noted that
there were already problems when using memory regions for OpenGL. User space
actions can cause significant fragmentation in that case. Szyprowski
suggested using separate CMA regions as a way to reduce the problem.
CMA still needs work to support highmem; there is no reason that it needs
to be restricted to lowmem. Szyprowski hopes to get some time to work on
that in the future. Wiring up CMA to x86 DMA is another thing that he
plans to work on.