
Is it time to remove ZONE_DMA?

By Jonathan Corbet
May 4, 2018

The DMA zone (ZONE_DMA) is a memory-management holdover from the distant past. Once upon a time, many devices (those on the ISA bus in particular) could only use 24 bits for DMA addresses, and were thus limited to the bottom 16MB of memory. Such devices are hard to find on contemporary computers. Luis Rodriguez scheduled the last memory-management-track session of the 2018 Linux Storage, Filesystem, and Memory-Management Summit to discuss whether the time has come to remove ZONE_DMA altogether.
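
The 16MB figure follows directly from the address width: a device that can drive only n address bits can reach 2^n bytes. The short Python sketch below works through that arithmetic. The names `dma_limit_bytes` and `fits_in_dma_range` are illustrative only and do not correspond to any kernel interface.

```python
MIB = 1 << 20  # bytes in one mebibyte


def dma_limit_bytes(address_bits: int) -> int:
    """Return how many bytes a device with this DMA address width can reach."""
    if address_bits <= 0:
        raise ValueError("address_bits must be positive")
    return 1 << address_bits


def fits_in_dma_range(phys_addr: int, size: int, address_bits: int) -> bool:
    """True if the buffer [phys_addr, phys_addr + size) is fully addressable."""
    if phys_addr < 0 or size <= 0:
        return False
    return phys_addr + size <= dma_limit_bytes(address_bits)


if __name__ == "__main__":
    # A 24-bit ISA-style device reaches only the bottom 16MB.
    print(dma_limit_bytes(24) // MIB, "MiB")  # prints "16 MiB"
```

A buffer that ends exactly at the 16MB boundary is still reachable by a 24-bit device; one that extends a single page beyond it is not.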

Rodriguez, however, was late to his own session, so the developers started discussing the topic without him. It's not clear that any modern devices still need the DMA zone, and removing it would free one precious page flag. Any requests with the GFP_DMA flag could be redirected to the zone for the contiguous memory allocator (CMA) which, in turn, could be given the bottom 16MB of memory to manage. Matthew Wilcox asked whether the same thing could be done with ZONE_DMA32, used for devices that can only DMA to 32-bit addresses, but it is not possible to allocate all of the lowest 4GB of memory to that zone, since it would exclude kernel allocations.

It was noted in passing that the POWER architecture uses GFP_DMA extensively. It doesn't actually need it, though; the early POWER developers had misunderstood the flag and thought that it was needed for any memory that would be used for DMA.

At this point, Rodriguez arrived and presented his case. He noted that the existence of ZONE_DMA causes an extra branch to be taken in every memory allocation call. Perhaps removing the zone could improve performance by taking out the need for those branches. It's not clear that performance would improve all that much, but the developers would be happy to be rid of this ancient zone regardless.

The problem is that quite a few drivers are still using ZONE_DMA, even if a number of them don't really need it. The SCSI subsystem was mentioned as having a number of allocations using it. Wilcox suggested that perhaps the drivers still using ZONE_DMA could be moved to the staging tree; they could then either be fixed and moved back or just removed entirely. A look at the list of affected drivers (which can be found in this summary of the session posted by Rodriguez) suggests that just deleting them is probably not an option, though.

More work will be needed to determine the real effects of changing this zone, and of possibly redirecting it into the CMA zone instead. But its removal would simplify the memory-management subsystem, so there is motivation for the developers to do the necessary research.

Index entries for this article
Kernel: Memory management
Conference: Storage, Filesystem, and Memory-Management Summit/2018



Is it time to remove ZONE_DMA?

Posted May 4, 2018 17:20 UTC (Fri) by arnd (subscriber, #8866) [Link] (1 responses)

We also have a couple of ARM platforms, including IIRC the modern shmobile parts, that use ZONE_DMA for various other platform-specific limitations that are distinct from the 24-bit ISA bus constraints.

CMA can probably work around most of those, but one would have to look not only at drivers that explicitly use GFP_DMA but also those that set a dma_mask smaller than 0xffffffff, either in the driver or inherited from the parent bus in DT.

Is it time to remove ZONE_DMA?

Posted May 7, 2018 16:04 UTC (Mon) by timur (guest, #30718) [Link]

On ARM, enabling DMA_CMA forces all DMA allocations to be made from the CMA, which means that if your CMA isn't big enough, you'll run out of memory. We've had to set the CMA size to 0 by default for our kernels, otherwise DMA allocations start to fail.

Is it time to remove ZONE_DMA?

Posted May 6, 2018 23:18 UTC (Sun) by benh (subscriber, #43720) [Link]

The PowerPC case isn't that simple ;-) We did have early PowerPC systems such as PReP machines, that did have ISA busses along with similar DMA limitations.

On other systems, ZONE_DMA for a while became equivalent to ZONE_DMA32, at a time when the latter didn't exist yet.

That said, I don't find many uses left in our code these days; what did you spot that's still "wrong" in your opinion?

Is it time to remove ZONE_DMA?

Posted May 7, 2018 11:44 UTC (Mon) by cborni (subscriber, #12949) [Link] (4 responses)

Unfortunately ZONE_DMA does not always mean the lowest 16MB.
On s390 we use GFP_DMA to allocate buffers below 2GB. This is still necessary for several hardware interfaces so we certainly need a way to allocate buffers in that region.

Is it time to remove ZONE_DMA?

Posted May 7, 2018 16:13 UTC (Mon) by willy (subscriber, #9762) [Link] (1 responses)

Can you use GFP_DMA32 for that purpose, or do you need it to refer to the full 4GB?

Is it time to remove ZONE_DMA?

Posted May 7, 2018 16:47 UTC (Mon) by cborni (subscriber, #12949) [Link]

We need it to be below 2GB (the old 31bit mode has survived in some places).
Right now ZONE_DMA is defined to be exactly that on s390x. We could of course redefine everything to be GFP_DMA32 (which does not exist yet) but this seems a pointless rename.

Is it time to remove ZONE_DMA?

Posted May 7, 2018 16:17 UTC (Mon) by willy (subscriber, #9762) [Link] (1 responses)

By the way, the intended way for drivers to allocate memory below 2GB is the DMA allocation API with a mask of 1<<31 -1. Is that doable? GFP_DMA might be used to implement that interface, but it's easier to deal with that than dozens of drivers.

Is it time to remove ZONE_DMA?

Posted May 7, 2018 17:04 UTC (Mon) by cborni (subscriber, #12949) [Link]

> By the way, the intended way for drivers to allocate memory below 2GB is the DMA allocation API with a mask of 1<<31 -1. Is that doable?

No. This is not about driver code, this is about core code that calls some classic CISC instructions that require control blocks that might have some satellite blocks via pointers. And sometimes some specific satellite blocks need to be below 2GB.

Another thing about dma mask: This does not keep this area free unless needed. GFP_DMA does that.
Imagine a system with 3 GB of memory.
2GB ZONE_DMA
1GB ZONE_NORMAL (or movable)

If the page cache now needs 2GB, it will consume the 1GB in ZONE_NORMAL and after that 1GB in ZONE_DMA. So we can still handle all DMA requests. If we have no dedicated zone, then there is a higher chance that pages <2GB are used, making allocation in there less likely.

So I think your understanding of ZONE_DMA usage (the x86 way) is not exactly how s390x uses it - I fear.
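
The 3GB scenario described in this comment can be sketched as a toy model. This is only an illustration of the fallback ordering the comment describes, not the kernel's page allocator: the `Zone` class and `allocate` function are hypothetical, sizes are whole gigabytes, and real-world details such as watermarks and reserves are left out.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    name: str
    free_gb: int


def allocate(zonelist, amount_gb):
    """Take amount_gb from the zones in list order; return what each zone gave."""
    if amount_gb > sum(zone.free_gb for zone in zonelist):
        raise MemoryError("not enough free memory in the allowed zones")
    taken = {}
    remaining = amount_gb
    for zone in zonelist:
        grab = min(zone.free_gb, remaining)
        zone.free_gb -= grab
        taken[zone.name] = grab
        remaining -= grab
    return taken


if __name__ == "__main__":
    normal = Zone("ZONE_NORMAL", 1)
    dma = Zone("ZONE_DMA", 2)

    # Ordinary allocations (the page cache here) prefer ZONE_NORMAL and
    # fall back to ZONE_DMA only once ZONE_NORMAL is exhausted.
    print(allocate([normal, dma], 2))

    # A request restricted to ZONE_DMA still finds the untouched 1GB.
    print(allocate([dma], 1))
```

In this model the page cache takes 1GB from each zone, leaving 1GB below the 2GB line for restricted requests. With a single undifferentiated zone, nothing would steer ordinary allocations away from the low pages first, which is the concern the comment raises.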


Copyright © 2018, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds