Missing the AF_BUS
Posted Jul 5, 2012 5:47 UTC (Thu) by daniel (guest, #3181)
In reply to: Missing the AF_BUS by alonz
Parent article: Missing the AF_BUS
Posted Jul 6, 2012 15:03 UTC (Fri) by pspinler (subscriber, #2922)
Certainly all that complexity can't be great for performance.
It's the same argument I make for Fibre Channel vs. iSCSI. It's true that iSCSI hardware (being just standard networking gear) is a lot cheaper and does the job 90-95% of the time. But in the edge cases, especially w.r.t. latency, Fibre Channel still wins, largely because it is simple by comparison.
-- Pat
Posted Jul 9, 2012 2:35 UTC (Mon) by raven667 (subscriber, #5198)
That's something worth testing, scientifically.
> It's the same argument I make for Fibre Channel vs. iSCSI. It's true that iSCSI hardware (being just standard networking gear) is a lot cheaper and does the job 90-95% of the time. But in the edge cases, especially w.r.t. latency, Fibre Channel still wins, largely because it is simple by comparison.
One thing about this example is worth pointing out: FC implements many of the features of Ethernet and TCP/IP, just differently, so in that sense the complexity is at least comparable, though probably not equal. As for implementation complexity, I think FC gets off easier because, as a practical matter, it is used in closed networks, often with all components from the same vendor. Ethernet and TCP/IP have to cope with far more varied equipment and far more varied networks, and have to be battle-tested against _anything_ happening; all that extra implementation complexity has a real reason for being there.
Posted Jul 9, 2012 6:02 UTC (Mon) by daniel (guest, #3181)
Here's a lovely bit:
http://lxr.linux.no/#linux+v3.4.4/net/ipv4/tcp_output.c#L796
This is part of a call chain that goes about 20 levels deep. There is much worse in there. See, that stuff looks plausible and if you listen to the folklore it sounds fast. But it actually isn't, which I know beyond a shadow of a doubt.
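To make that concrete, here is a minimal user-space sketch (my own illustration, not the kernel path itself; the depth of 20 and the noinline attribute are stand-ins for a chain the optimizer cannot flatten) comparing a 20-deep chain of opaque calls against the same arithmetic done flat:

    /* chain.c -- cost of a 20-deep chain of opaque calls vs. flat code.
     * Build: gcc -O2 chain.c -o chain
     * An illustration only, not a measurement of the real TCP path. */
    #include <stdio.h>
    #include <stdint.h>
    #include <time.h>

    #define NOINLINE __attribute__((noinline))

    /* Each level does a trivial amount of work, like the little
     * helpers in the output path, then calls the next level down. */
    NOINLINE uint64_t level(int depth, uint64_t x)
    {
        if (depth == 0)
            return x + 1;
        /* Adding after the call keeps it from being a tail call,
         * so the compiler preserves the real 20-deep call chain. */
        return level(depth - 1, x) + depth;
    }

    static uint64_t flat(uint64_t x)
    {
        for (int d = 1; d <= 20; d++)   /* same arithmetic, no calls */
            x += d;
        return x + 1;
    }

    static uint64_t now_ns(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
    }

    int main(void)
    {
        enum { N = 10 * 1000 * 1000 };
        volatile uint64_t sink = 0;   /* keep the loops from being elided */
        uint64_t t0, t1;

        t0 = now_ns();
        for (int i = 0; i < N; i++)
            sink += level(20, (uint64_t)i);
        t1 = now_ns();
        printf("20-deep chain: %.2f ns/iter\n", (double)(t1 - t0) / N);

        t0 = now_ns();
        for (int i = 0; i < N; i++)
            sink += flat((uint64_t)i);
        t1 = now_ns();
        printf("flat loop:     %.2f ns/iter\n", (double)(t1 - t0) / N);
        return 0;
    }

The compiler folds the flat version down to almost nothing, while the chained version pays call/return overhead at every level; the gap per iteration should be obvious on any machine you try it on.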
Posted Jul 9, 2012 6:53 UTC (Mon) by daniel (guest, #3181)
http://lxr.linux.no/#linux+v3.4.4/net/ipv4/ip_output.c#L799
This code just kills efficiency by a thousand cuts. There is no single culprit; it is just that all the twisting and turning, calling lots of little helpers and layering everything through an skb-editing API that successfully confuses the optimizer, adds up to an embarrassing amount of overhead. First rule to remember: function calls are not free. Not at the speeds networks operate at these days.
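To put a number on "not free": a minimum-size Ethernet frame occupies 84 bytes on the wire (64-byte frame plus preamble and inter-frame gap), so 10Gb/s means about 14.88 million packets per second, roughly 67ns per packet. A back-of-the-envelope sketch (the 1.5ns per-call figure is an assumed ballpark, not a measurement):

    /* budget.c -- per-packet time budget at line rate vs. call overhead.
     * Build: gcc -O2 budget.c -o budget */
    #include <stdio.h>

    int main(void)
    {
        /* Minimum Ethernet frame on the wire: 64 bytes of frame
         * + 8 bytes preamble + 12 bytes inter-frame gap = 84 bytes. */
        const double wire_bytes = 84.0;
        const double call_ns = 1.5;   /* assumed ballpark for one
                                         non-inlined call/return pair */
        const double rates_gbps[] = { 1.0, 10.0, 40.0 };

        for (int i = 0; i < 3; i++) {
            double pps = rates_gbps[i] * 1e9 / 8.0 / wire_bytes;
            double ns_per_pkt = 1e9 / pps;
            printf("%5.0f Gb/s: %8.2f ns/packet; "
                   "20 helper calls eat %4.1f%% of that\n",
                   rates_gbps[i], ns_per_pkt,
                   100.0 * 20 * call_ns / ns_per_pkt);
        }
        return 0;
    }

Even with these generous assumptions, twenty opaque helper calls per packet consume nearly half of the 10Gb/s budget before any real work gets done.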
Posted Jul 9, 2012 18:40 UTC (Mon) by butlerm (subscriber, #13312)
Much of the complexity of that function has to do with kernel support for fragmented skbs, which is required for packets larger than the page size. That is the sort of thing that would go away if the kernel adopted a kernel page size larger than the hardware page size in cases where the latter is ridiculously small.
I am not sure what the real benefit of managing everything in terms of 4K pages is on a system with modern memory sizes. Perhaps the idea of managing everything in terms of 64K pages (i.e. in groups of 16 hardware pages) could be revisited. That would dramatically simplify much of the networking code, because support for fragmented skbs could be dropped. No doubt it would have other benefits as well.
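The arithmetic is simple to sketch (the 64KB packet size here is illustrative, roughly a GSO-sized send, and the count ignores the header room and alignment the real skb geometry has to care about):

    /* frags.c -- page fragments needed for a large packet vs. page size. */
    #include <stdio.h>

    int main(void)
    {
        const unsigned pkt = 65536;               /* ~GSO-sized packet, bytes */
        const unsigned page_sizes[] = { 4096, 16384, 65536 };

        for (int i = 0; i < 3; i++) {
            unsigned ps = page_sizes[i];
            unsigned frags = (pkt + ps - 1) / ps; /* pages to hold the data */
            printf("%6u-byte pages: %2u fragment(s)%s\n",
                   ps, frags,
                   frags == 1 ? " -- no frag walking needed" : "");
        }
        return 0;
    }

With 64K pages the data lands in a single contiguous allocation, and the whole fragment-walking machinery has nothing left to do.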
Posted Jul 9, 2012 9:11 UTC (Mon) by gioele (subscriber, #61675)
> This is part of a call chain that goes about 20 levels deep. There is much worse in there. See, that stuff looks plausible and if you listen to the folklore it sounds fast. But it actually isn't, which I know beyond a shadow of a doubt.
Don't you have notes, implementation ideas, or performance tests that you want to share with the rest of the kernel community? I'm pretty sure they would love to hear how to cut the CPU overhead of UDP messages in half without regressions in functionality.
That kind of impact would surely reduce the battery consumption of mobile applications, so even if the mainline developers are not interested, the developers of mobile-oriented forks like Android surely will be.
Posted Jul 9, 2012 20:26 UTC (Mon) by butlerm (subscriber, #13312)
In Dave's defense, I will note that the Linux TCP stack does appear to be extremely efficient compared to other OSes… It's just not always the perfect hammer for the screws you may be using.