LWN: Comments on "Zero-copy TCP receive" https://lwn.net/Articles/752188/ This is a special feed containing comments posted to the individual LWN article titled "Zero-copy TCP receive". en-us Wed, 08 Oct 2025 09:07:12 +0000 Wed, 08 Oct 2025 09:07:12 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Zero-copy TCP receive https://lwn.net/Articles/752825/ https://lwn.net/Articles/752825/ amarao <div class="FormattedComment"> I may be naive, but I feel that hiding headers for any zero-copy application is useless and adds complexity. The kernel can instead put a whole packet 'as is' into userspace memory and provide two 'data start/data end' pointers for each packet to help the application ignore the headers.<br> </div> Thu, 26 Apr 2018 09:19:36 +0000 Zero-copy TCP receive https://lwn.net/Articles/752487/ https://lwn.net/Articles/752487/ sbates <div class="FormattedComment"> Well I’d assume these proposed patches are for a variety of NICs from multiple vendors. The SolarFlare stuff is (I assume) specific to their NICs. 
<br> </div> Mon, 23 Apr 2018 02:18:53 +0000 Zero-copy TCP receive https://lwn.net/Articles/752470/ https://lwn.net/Articles/752470/ jmichels <div class="FormattedComment"> Wouldn't this also have to outperform the kernel-bypass functionality offered by vendors such as Solar Flare?<br> </div> Sun, 22 Apr 2018 12:51:21 +0000 Zero-copy TCP receive https://lwn.net/Articles/752444/ https://lwn.net/Articles/752444/ cladisch <div class="FormattedComment"> <font class="QuotedText">&gt; Another possibility is just mapping the entire packet into the user-mode ring buffer</font><br> <p> The FireWire driver does this for isochronous packets.<br> <p> <font class="QuotedText">&gt; and letting userspace skip over the embedded protocol headers</font><br> <p> The FireWire host controller interface is standardized and must support scatter+gather, so the driver can instruct it to write the header words into another buffer so that only the actual data bytes end up in the mmap buffer. This requires that the header size is fixed and 32-bit aligned, so doing the same for a TCP/IP interface would require more flexible hardware support.<br> <p> There is also a mode that dumps everything into the buffer, where the application has to parse out the packet metadata and headers.<br> </div> Sat, 21 Apr 2018 06:35:05 +0000 Zero-copy TCP receive https://lwn.net/Articles/752430/ https://lwn.net/Articles/752430/ epa <div class="FormattedComment"> Could showing the header data to userspace conceivably introduce a security hole? 
It might be better to zero out those bytes (once the kernel has finished with them of course) before handing the page over to userspace.<br> </div> Fri, 20 Apr 2018 20:04:43 +0000 Zero-copy TCP receive https://lwn.net/Articles/752416/ https://lwn.net/Articles/752416/ ejr <div class="FormattedComment"> Yeah, I'd expect something more along the lines of vmsplice(..., SPLICE_F_GIFT).<br> </div> Fri, 20 Apr 2018 16:55:47 +0000 Zero-copy TCP receive https://lwn.net/Articles/752415/ https://lwn.net/Articles/752415/ alkbyby <div class="FormattedComment"> It is a little odd to see such an unusual and controversial change, which also introduces a new API, merged so quickly.<br> </div> Fri, 20 Apr 2018 16:49:48 +0000 Zero-copy TCP receive https://lwn.net/Articles/752393/ https://lwn.net/Articles/752393/ quotemstr <div class="FormattedComment"> <font class="QuotedText">&gt; TCP is a little different though as you are seeing the abstraction. With I think tcp segment offload or the ingress equivalent I think it is very likely you will get the kind of packets needed in this case. </font><br> <p> Another possibility is just mapping the entire packet into the user-mode ring buffer and letting userspace skip over the embedded protocol headers --- sort of as a hybrid between a conventional network stack and a user-space network stack. <br> </div> Fri, 20 Apr 2018 15:08:27 +0000 Zero-copy TCP receive https://lwn.net/Articles/752392/ https://lwn.net/Articles/752392/ quotemstr <div class="FormattedComment"> Right. An ioctl could at least return a proper error code (say, EAGAIN), and there's precedent for an ioctl consuming data. 
There's no precedent for mmap being destructive!<br> </div> Fri, 20 Apr 2018 15:06:26 +0000 Zero-copy TCP receive https://lwn.net/Articles/752382/ https://lwn.net/Articles/752382/ post-factum <div class="FormattedComment"> Would the speculative page faults patchset be able to address the contention for the mmap_sem lock?<br> </div> Fri, 20 Apr 2018 12:56:14 +0000 Zero-copy TCP receive https://lwn.net/Articles/752365/ https://lwn.net/Articles/752365/ k8to <div class="FormattedComment"> Ring buffers are pretty efficient for some patterns of data passing, but I don't see how you could do a zero-copy ring buffer.<br> </div> Fri, 20 Apr 2018 07:04:23 +0000 Zero-copy TCP receive https://lwn.net/Articles/752362/ https://lwn.net/Articles/752362/ ebiederm <div class="FormattedComment"> It might be worth comparing this to how PF_PACKET sockets work with mmap.<br> <p> Those I believe implement a shared ring buffer between kernel and user space. Something like that might be possible.<br> <p> TCP is a little different though as you are seeing the abstraction. With I think tcp segment offload or the ingress equivalent I think it is very likely you will get the kind of packets needed in this case. <br> <p> Doing anything more complicated (aka a ring buffer) I suspect would be quite a bit harder to implement and more fragile than what has been implemented here, as this sounds like it is just taking packets right out of the existing packet queue.<br> </div> Fri, 20 Apr 2018 03:46:04 +0000 Zero-copy TCP receive https://lwn.net/Articles/752361/ https://lwn.net/Articles/752361/ josh <div class="FormattedComment"> The underlying memory map would still need to change each time, to avoid copying the underlying pages. 
That said, using mmap repeatedly for this does seem quite strange.<br> </div> Fri, 20 Apr 2018 02:47:20 +0000 Zero-copy TCP receive https://lwn.net/Articles/752357/ https://lwn.net/Articles/752357/ luto <div class="FormattedComment"> Wouldn’t it be better (faster and saner) to mmap a magic region for zero copy reception and then use ioctl to materialize the data into it?<br> </div> Fri, 20 Apr 2018 02:26:15 +0000