I'm curious about a potential hole with using these devices: the assumption that the destination copy area isn't represented in a processor cache. There are at least a couple of scenarios where this could bite you -
1) Copying incoming packet buffers to a user or other area. In general, multiple packets arrive into the same memory area, so an earlier read will have pulled that area into the cache; the DMA operation then occurs, and since the cache isn't invalidated, the user reads stale data.
2) Peek and copy - an area is looked at to determine a value (such as an ARP cache or packet filter rule). Since the data can age, a timestamp is compared. When the entry is old, a DMA operation is used to transfer in new data, but the user hasn't invalidated the cache and therefore still sees the old data.
In general, I think any copy operation has to manage the possible cache entries that cover the copy destination, and the general answer of flushing the caches accounts for a significant portion of the overhead of such a DMA operation - i.e. DMA is efficient only when the cost of the DMA transfer of X bytes plus the cost of flushing the affected cache lines <= the cost of a CPU copy of X bytes.
Obviously, you could restrict this functionality to non-cacheable memory regions, but then its utility is quite limited.
Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds