Batch processing of network packets
Posted Aug 22, 2018 14:23 UTC (Wed) by cagrazia (guest, #124754)
In reply to: Batch processing of network packets by ncm
Parent article: Batch processing of network packets
I suppose that, over the years, the problem of passing packets directly between the NIC and applications has been solved by a mix of Rizzo's design (he is the author of Netmap) and that of the Linux network developers. This work instead focuses on the internals of the network stack, from the driver up to the upper layers (without reaching the application).
Posted Aug 22, 2018 16:18 UTC (Wed)
by dps (guest, #5725)
[Link] (1 responses)
I suspect there is a lot of hardware offload too, but in the final analysis somebody else got that job :-(
It might be worth knowing that Solarflare et al. target customers who want to be close to stock exchanges, because the speed of light is finite.
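To put rough numbers on the speed-of-light point, here is a small back-of-the-envelope sketch. The 100 ns budget is the figure mentioned elsewhere in this thread; the fiber refractive index of 1.47 and the function names are my own assumptions for illustration, not anything a vendor publishes.

```python
# Back-of-the-envelope propagation arithmetic.
# Only the vacuum speed of light is a hard fact; the rest is assumed.

C_VACUUM_M_PER_S = 299_792_458   # speed of light in vacuum (exact by definition)
FIBER_REFRACTIVE_INDEX = 1.47    # ASSUMPTION: typical value for silica fiber


def one_way_delay_ns(distance_m, refractive_index=1.0):
    """One-way propagation delay, in nanoseconds, over distance_m metres."""
    speed = C_VACUUM_M_PER_S / refractive_index
    return distance_m / speed * 1e9


def distance_for_budget_m(budget_ns, refractive_index=1.0):
    """How far a signal travels, in metres, within budget_ns nanoseconds."""
    speed = C_VACUUM_M_PER_S / refractive_index
    return budget_ns * 1e-9 * speed


if __name__ == "__main__":
    budget_ns = 100  # the latency saving discussed in this thread
    print("vacuum:", round(distance_for_budget_m(budget_ns), 2), "m")
    print("fiber: ", round(distance_for_budget_m(budget_ns, FIBER_REFRACTIVE_INDEX), 2), "m")
    print("1 km of fiber:", round(one_way_delay_ns(1000, FIBER_REFRACTIVE_INDEX)), "ns one way")
```

In other words, 100 ns corresponds to about 30 m of travel in vacuum and roughly 20 m in fiber under the assumed index, so at that scale a few racks of distance are comparable to the software savings being discussed.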
Posted Aug 26, 2018 0:44 UTC (Sun)
by BenHutchings (subscriber, #37955)
[Link]
It certainly used to be that the major performance win of user-level networking was not so much the avoidance of kernel/user context switches, but improving temporal locality of access to packets. With kernel networking, the kernel has to demux incoming packets into socket queues and account for the memory allocated to each socket, as the packets come in. So the CPU will access packet headers during demux and then again some time later when the application receives the packets. With user-level networking, the hardware does demux into (typically) per-process queues and the CPU will access packet headers only when the application receives the packet. (The packet buffers are naturally accounted to the process.)

Still, the cost of context switches has been increased substantially by mitigations for speculation leaks. So that may be a bigger part of the performance advantage now.

I worked on Solarflare drivers up to the SFC9100 generation, and there was no TCP offload or anything really unusual there. The essential features are checksum offload, flow steering/filtering, and lots of queues.
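The locality argument can be illustrated with a toy model. This is only a sketch: it counts when the CPU touches each packet header on the two paths and says nothing about real cache behaviour. The `Packet` class, the tick counter and the function names are all hypothetical and do not correspond to any kernel or driver API.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    flow: int          # hypothetical flow id that demux keys on
    arrival_tick: int  # when the NIC delivered the packet
    cpu_header_touches: list = field(default_factory=list)


def kernel_receive(packets, recv_tick):
    """Kernel path: the CPU reads headers at demux time and again at receive time."""
    queues = {}
    for p in packets:
        p.cpu_header_touches.append(p.arrival_tick)  # CPU demuxes into a socket queue
        queues.setdefault(p.flow, []).append(p)
    for queue in queues.values():
        for p in queue:
            p.cpu_header_touches.append(recv_tick)   # application receives later
    return queues


def user_level_receive(packets, recv_tick):
    """User-level path: hardware filters pick the queue, so the CPU reads once."""
    queues = {}
    for p in packets:
        queues.setdefault(p.flow, []).append(p)      # done by the NIC, no CPU touch
    for queue in queues.values():
        for p in queue:
            p.cpu_header_touches.append(recv_tick)   # the only CPU access
    return queues


def total_touches(queues):
    """Total number of CPU header accesses across all queues."""
    return sum(len(p.cpu_header_touches) for q in queues.values() for p in q)


if __name__ == "__main__":
    def make():
        return [Packet(flow=i % 2, arrival_tick=i) for i in range(6)]

    print("kernel path touches:    ", total_touches(kernel_receive(make(), recv_tick=100)))
    print("user-level path touches:", total_touches(user_level_receive(make(), recv_tick=100)))
```

In this model the kernel path touches every header twice, with a gap between the two accesses during which the header may have fallen out of cache, while the user-level path touches it once.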
Posted Aug 22, 2018 17:13 UTC (Wed)
by shemminger (subscriber, #5739)
[Link]
Posted Aug 22, 2018 22:51 UTC (Wed)
by ncm (guest, #165)
[Link] (2 responses)
Some of these uses actually need to act on the contents of the packet in real time, and those users build custom ASICs that deliver incompletely-arrived packets for speculative processing. But most uses just need to timestamp, filter, and direct packets to various places for parallel downstream processing. Those are the uses that can benefit from kernel and driver improvements, and they get everything they need from netmap, DPDK, ef_vi and the like. DPDK is a mess, and ef_vi is proprietary. Netmap offers a plausible route to break out of lock-in, and actually use commodity hardware in many of what are now thought of as high-end, niche applications that need specialized equipment.
Posted Aug 23, 2018 15:32 UTC (Thu)
by willemb (subscriber, #73364)
[Link] (1 responses)
https://www.kernel.org/doc/html/latest/networking/af_xdp....
Posted Aug 24, 2018 1:52 UTC (Fri)
by ncm (guest, #165)
[Link]
At least some applications where saving 100ns is worth money bypass the Linux kernel network stack. I believe that Solarflare has an LD_PRELOAD library that moves some of TCP to user space, thereby avoiding context switching.