
Batch processing of network packets

Batch processing of network packets

Posted Aug 22, 2018 14:23 UTC (Wed) by cagrazia (guest, #124754)
In reply to: Batch processing of network packets by ncm
Parent article: Batch processing of network packets

Last time I worked with Netmap (2014), it was an exciting piece of software for moving packets from the NIC to the driver and then directly into user space (bypassing IP/transport), always batching as much as possible. The problem it solved was reducing the cost of system calls and stack traversal when an application receives packets from, or sends packets to, the NIC. While Netmap was very useful for L2 tracing software (libpcap, Wireshark, etc.) and packet generators, it was insufficient for the other parts of the stack: neither the IP layer nor TCP could benefit from Netmap batching.
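As a rough illustration of why batching reduces the system call cost mentioned above, the fixed cost of one call can be spread over every packet it moves. The sketch below is a toy cost model, not a measurement and not Netmap code: the function name `per_packet_cost_ns` and both cost figures are assumptions chosen only for the example.

```python
def per_packet_cost_ns(syscall_ns: float, work_ns: float, batch: int) -> float:
    """Average cost per packet when one system call moves `batch` packets.

    Toy model: the fixed system call cost is paid once per batch, while the
    per-packet work is paid for every packet.
    """
    if batch < 1:
        raise ValueError("batch must be at least 1")
    return syscall_ns / batch + work_ns


# Hypothetical figures, for illustration only (not measured):
SYSCALL_NS = 500.0  # assumed fixed cost of one system call
WORK_NS = 50.0      # assumed cost of handling one packet

for batch in (1, 8, 64):
    cost = per_packet_cost_ns(SYSCALL_NS, WORK_NS, batch)
    print(f"batch={batch:3d}: {cost:.1f} ns per packet")
```

In this model the per-packet cost approaches the per-packet work as the batch grows, which is why the fixed overhead matters most for small batches.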

I suppose that in the years since, the problem of passing packets directly between the NIC and applications has been solved with a mix of the designs of Rizzo (the author of Netmap) and the Linux network developers. This work instead focuses on the internals of the network stack, from the driver to the upper layers (without reaching the application).



Batch processing of network packets

Posted Aug 22, 2018 16:18 UTC (Wed) by dps (guest, #5725)

At least some applications where saving 100ns is worth money sometimes bypass the Linux kernel network stack. I believe that Solarflare has an LD_PRELOAD library that moves some of TCP to user space, thereby avoiding context switching.

I suspect there is a lot of hardware offload too, but in the final analysis somebody else got that job :-(

It might be worth knowing that Solarflare et al. target customers that want to be close to stock exchanges because the speed of light is finite.

Batch processing of network packets

Posted Aug 26, 2018 0:44 UTC (Sun) by BenHutchings (subscriber, #37955)

At least some applications where saving 100ns is worth money sometimes bypass the Linux kernel network stack. I believe that Solarflare has an LD_PRELOAD library that moves some of TCP to user space, thereby avoiding context switching.

It certainly used to be that the major performance win of user-level networking was not so much the avoidance of kernel/user context switches, but improving temporal locality of access to packets.

With kernel networking, the kernel has to demux incoming packets into socket queues and account for the memory allocated to each socket, as the packets come in. So the CPU will access packet headers during demux and then again some time later when the application receives the packets. With user-level networking, the hardware does demux into (typically) per-process queues and the CPU will access packet headers only when the application receives the packet. (The packet buffers are naturally accounted to the process.)

Still, the cost of context switches has been increased substantially by mitigations for speculation leaks. So that may be a bigger part of the performance advantage now.

I suspect there is a lot of hardware offload too, but in the final analysis somebody else got that job :-(

I worked on Solarflare drivers up to the SFC9100 generation, and there was no TCP offload or anything really unusual there. The essential features are checksum offload, flow steering/filtering, and lots of queues.

Batch processing of network packets

Posted Aug 22, 2018 17:13 UTC (Wed) by shemminger (subscriber, #5739)

Lots of this work was motivated by making kernel performance reach that of DPDK. http://vger.kernel.org/netconf2018_files/StephenHemminger...

Batch processing of network packets

Posted Aug 22, 2018 22:51 UTC (Wed) by ncm (guest, #165)

There are important (i.e. $billions) applications of packet ingestion that do not need any stack processing at all -- just get the bits where user space can see them, with as little handling as possible, if not less. In many cases typical packets are ~100 bytes long, leaving 76ns to process each packet at 10Gbps. Fortunately, bursts over 10Gbps are still rare.
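The per-packet budget quoted above follows from frame size and link rate. The sketch below reproduces that arithmetic under the assumption that the budget is simply the time the frame occupies the wire; the helper name `packet_time_ns` and its `overhead_bytes` parameter are hypothetical, introduced only for this example. Under this model a 100-byte frame at 10Gbps takes 80ns, and the 76ns figure corresponds to a 95-byte frame. Counting Ethernet's 20 bytes of preamble, start-of-frame delimiter and inter-packet gap would lengthen the budget, not shorten it.

```python
def packet_time_ns(frame_bytes: int, link_gbps: float, overhead_bytes: int = 0) -> float:
    """Time in nanoseconds that one frame occupies a link of `link_gbps` Gbit/s.

    `overhead_bytes` is optional per-frame wire overhead (for Ethernet:
    8 bytes of preamble/SFD plus a 12-byte inter-packet gap).
    """
    bits = (frame_bytes + overhead_bytes) * 8
    # bits divided by Gbit/s gives nanoseconds directly.
    return bits / link_gbps


print(packet_time_ns(100, 10.0))                     # 100-byte frame, no overhead
print(packet_time_ns(95, 10.0))                      # frame size matching 76ns
print(packet_time_ns(100, 10.0, overhead_bytes=20))  # with Ethernet wire overhead
```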

Some of these uses actually need to act on the contents of the packet in real time, and those users build custom ASICs that deliver incompletely-arrived packets for speculative processing. But most uses just need to timestamp, filter, and direct packets to various places for parallel downstream processing. Those are the uses that can benefit from kernel and driver improvements, and they get everything they need from netmap, DPDK, ef_vi and the like. DPDK is a mess, and ef_vi is proprietary. Netmap offers a plausible route to break out of lock-in, and actually use commodity hardware in many of what are now thought of as high-end, niche applications that need specialized equipment.

Batch processing of network packets

Posted Aug 23, 2018 15:32 UTC (Thu) by willemb (subscriber, #73364)

Linux 4.18 landed AF_XDP for this purpose

https://www.kernel.org/doc/html/latest/networking/af_xdp....

Batch processing of network packets

Posted Aug 24, 2018 1:52 UTC (Fri) by ncm (guest, #165)

Thank you, this is good news. I wonder how I had missed it.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds