Credit where it's due
Posted Aug 21, 2018 15:42 UTC (Tue) by ecree (guest, #95790)
Parent article: Batch processing of network packets
He suggested it in http://lists.openwall.net/netdev/2016/01/15/51 and the following thread seems also to contain some early prefiguring of XDP.
Posted Aug 22, 2018 2:43 UTC (Wed)
by dgc (subscriber, #6611)
-Dave.
Posted Aug 22, 2018 22:02 UTC (Wed)
by ejr (subscriber, #51652)
Posted Aug 27, 2018 19:20 UTC (Mon)
by roblucid (guest, #48964)
Posted Aug 27, 2018 19:21 UTC (Mon)
by roblucid (guest, #48964)
Posted Aug 22, 2018 3:57 UTC (Wed)
by mtaht (subscriber, #11087)
However I've seen a lot of code that does stupid things to software batch gro stuff, with things that actually overwhelm the cache. I'm curious, with this new code,
Skylake is one thing. But a typical cache size on small boxes is 32k/32k for these.
Posted Aug 22, 2018 3:59 UTC (Wed)
by mtaht (subscriber, #11087)
Posted Aug 23, 2018 16:15 UTC (Thu)
by ecree (guest, #95790)
Posted Aug 24, 2018 7:26 UTC (Fri)
by mtaht (subscriber, #11087)
In my ideal world, packets inside the kernel, and from future devices, would have something like the following format:
|tx_timestamp|rx_timestamp|packet|some|different|hashes|skb control block for all kinds of other stuff|
The rx_timestamp is free if you have it in hw. Having both rx and tx timestamps would make all the codel-y work faster on tx, and also enable the timer queues VJ is talking about ( https://netdevconf.org/0x12/session.html?evolving-from-af... ); see also sch_etx and the igb network hw. And, selfishly, on top of "timestamp always", 3 hashes would make sch_cake fast enough for general use. Timestamps are at the front because they are "free", though they could live at the back; hashes are at the back because you need time to calculate them, and on a cut-through switch you can't wait for them. The existing cb has some other fields I'd also always fill in and make persistent through the stack.
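To make the layout above concrete, here is a rough C sketch of it. This is purely illustrative: `struct pkt_sketch`, its field names, and the fixed payload size are hypothetical and are not an existing kernel structure. The only number taken from Linux is the 48-byte control block, which matches the size of `skb->cb`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PKT_MAX_PAYLOAD 1514 /* assumption: one standard Ethernet frame */
#define PKT_NUM_HASHES  3    /* the "3 hashes" wished for above */
#define PKT_CB_SIZE     48   /* same size as skb->cb in Linux */

/* Hypothetical fixed layout: timestamps first, hashes after the packet. */
struct pkt_sketch {
    uint64_t tx_timestamp_ns;          /* filled in on transmit */
    uint64_t rx_timestamp_ns;          /* "free" when hw provides it */
    uint8_t  payload[PKT_MAX_PAYLOAD]; /* the packet itself */
    uint32_t hashes[PKT_NUM_HASHES];   /* computed late, so stored late */
    uint8_t  cb[PKT_CB_SIZE];          /* control block for everything else */
};

/* Check that the field order matches the |tx|rx|packet|hashes|cb| diagram. */
static_assert(offsetof(struct pkt_sketch, tx_timestamp_ns) == 0,
              "tx timestamp comes first");
static_assert(offsetof(struct pkt_sketch, rx_timestamp_ns) <
              offsetof(struct pkt_sketch, payload),
              "timestamps precede the packet");
static_assert(offsetof(struct pkt_sketch, payload) <
              offsetof(struct pkt_sketch, hashes),
              "hashes follow the packet");
static_assert(offsetof(struct pkt_sketch, hashes) <
              offsetof(struct pkt_sketch, cb),
              "control block comes last");

/* Time the packet spent in the system, from ingress to egress. */
static uint64_t pkt_sojourn_ns(const struct pkt_sketch *p)
{
    return p->tx_timestamp_ns - p->rx_timestamp_ns;
}
```

With both timestamps always present, the ingress-to-egress time is a single subtraction per packet, with no separate lookup.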
I'm so totally not going to make more of this "modest proposal", as much as I think the whole skb layout could use a major revision, as it would take a forklift, time, and taste to redo in linux, dpdk, rgmii, etc. It would make it really difficult to backport code from one format to the other. It would take years to implement in linux (years more in bsd, osx, windows), for a benefit that would mostly be for 100gige+ interfaces originally. A metric ton of people would have their favorite fixed format field they'd want somewhere, thus politics and gnashing of standards bodies' teeth would happen... specialized hw offload engines would break...
Still, if I could have 1 sysctl to immutably rx timestamp packets on all interfaces, always, it would be great. With that, codel could measure packets across the entire system from ingress to egress and *drop them*, shedding load automatically as the system as a whole (rather than just the qdisc) got more stress than it should take. Figuring out the cost of each substep through the layers would be something you could do on a per-packet basis on your own workload, rather than by blasting specialized test tools through it that *don't model real traffic* and claiming that pps really meant something...
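As a rough illustration of shedding load based on the ingress timestamp, here is a deliberately simplified C sketch. It is not the CoDel control law (it has no drop-rate ramp) and keeps only the "above target for a full interval" test. `struct shed_state` and `should_shed` are hypothetical names; the 5 ms target and 100 ms interval are CoDel's usual defaults.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TARGET_NS   (5ULL * 1000 * 1000)   /* 5 ms acceptable sojourn time */
#define INTERVAL_NS (100ULL * 1000 * 1000) /* 100 ms grace period */

static_assert(TARGET_NS < INTERVAL_NS, "target must be below the interval");

/* Hypothetical per-system state; 0 means "currently below target". */
struct shed_state {
    uint64_t first_above_ns;
};

/*
 * Decide whether to drop a packet that entered the system at rx_ts_ns,
 * given the current time now_ns. Drops only once sojourn time has stayed
 * at or above TARGET_NS for at least INTERVAL_NS.
 */
static bool should_shed(struct shed_state *s, uint64_t rx_ts_ns,
                        uint64_t now_ns)
{
    uint64_t sojourn_ns = now_ns - rx_ts_ns;

    if (sojourn_ns < TARGET_NS) {
        s->first_above_ns = 0;  /* system has recovered */
        return false;
    }
    if (s->first_above_ns == 0) {
        /* First packet over target: start the grace period. */
        s->first_above_ns = now_ns + INTERVAL_NS;
        return false;
    }
    return now_ns >= s->first_above_ns;
}
```

Because the timestamp is taken at ingress rather than at enqueue, the same check covers every layer the packet crossed, not only the time spent in the qdisc.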
and there'd be ponies! and speckled dancing unicorns! harder RTT deadlines! and systems that didn't collapse under load! and poppies, poppies, poppies everywhere to roll around in to help you sleep even better!
It's after midnight. I usually don't post anything after midnight.
...
I am happy that eBPF is making it easier to offload more stuff into smarter hardware, and I look forward to trying this new skb list idea out next quarter on some very old, slow, tiny hardware.
if folk have experimented with decreasing the (oft rather large) NAPI value in the first place, and to what extent it was tested on devices with small i/dcaches.