
Should this be implemented in endpoints at all?

Posted Jan 3, 2025 17:34 UTC (Fri) by john_ousterhout (guest, #175303)
In reply to: Should this be implemented in endpoints at all? by buck
Parent article: The Homa network protocol

Infiniband has just about all of the performance problems of TCP when it comes to congestion control etc. The only advantage of Infiniband is that people like Mellanox built really nice NICs for it that bypass the kernel.


Should this be implemented in endpoints at all?

Posted Jan 3, 2025 22:29 UTC (Fri) by bvanassche (subscriber, #90104)

"Infiniband has just about all of the performance problems of TCP when it comes to congestion control etc. The only advantage of Infiniband is that people like Mellanox built really nice NICs for it that bypass the kernel."

Is there any scientific paper that backs the above statement about congestion? Multiple papers have been published about how to handle congestion in datacenter RDMA networks. Two examples:

  • Zhu, Yibo, Haggai Eran, Daniel Firestone, Chuanxiong Guo, Marina Lipshteyn, Yehonatan Liron, Jitendra Padhye, Shachar Raindel, Mohamad Haj Yahia, and Ming Zhang. "Congestion control for large-scale RDMA deployments." ACM SIGCOMM Computer Communication Review 45, no. 4 (2015): 523-536.
  • Shpiner, Alexander, Eitan Zahavi, Omar Dahley, Aviv Barnea, Rotem Damsker, Gennady Yekelis, Michael Zus, Eitan Kuta, and Dean Baram. "RoCE rocks without PFC: Detailed evaluation." In Proceedings of the Workshop on Kernel-Bypass Networks, pp. 25-30. 2017.

Should this be implemented in endpoints at all?

Posted Jan 6, 2025 17:10 UTC (Mon) by paulj (subscriber, #341)

Congestion control is far from a solved problem, especially not across wider, less-controlled or uncoordinated networks. Some say ECN is the magic bullet, especially in its recently updated form, L4S / TCP Prague. Others, I think, disagree with that (including a regular commenter on networking matters here on LWN, who is also a Linux congestion/buffering contributor).

Congestion control is a bit easier on low-hop, tightly controlled networks - i.e. DCs - but even there it is not solved. Fairness across different kinds of CC in particular is a bitch, as is fairness across flows with very different RTTs and/or BDPs. E.g., a congestion controller might work great when competing with low-latency, fast connections (i.e. intra-DC), but have fairness issues when competing with flows with different properties, such as a much higher RTT (e.g., cross-region DC to DC). It's clearly not at all an easy problem.
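
To make the RTT-fairness point concrete, here is a rough toy sketch (not any real congestion controller, and certainly not Homa, DCQCN, or TCP Prague). It assumes two textbook AIMD flows sharing one bottleneck, with every flow halving its window whenever the link is oversubscribed. The function name, the parameters, and the chosen RTTs and link capacity are all made up for illustration.

```python
def simulate_aimd(rtts, capacity_pps, duration_s=10.0, dt=1e-4):
    """Toy model: average throughput (packets/s) per flow on one bottleneck.

    Assumptions (illustrative only):
      - each flow adds one packet to its window per RTT (additive increase)
      - when the summed sending rate exceeds the link capacity, all flows
        halve their windows at once (synchronized multiplicative decrease)
      - no queues, no slow start, no timeouts
    """
    cwnd = [1.0] * len(rtts)          # congestion windows, in packets
    delivered = [0.0] * len(rtts)     # packets delivered so far, per flow
    steps = int(duration_s / dt)

    for _ in range(steps):
        rates = [w / rtt for w, rtt in zip(cwnd, rtts)]
        if sum(rates) > capacity_pps:
            # Congestion: every flow backs off, but never below one packet.
            cwnd = [max(1.0, w / 2.0) for w in cwnd]
            rates = [w / rtt for w, rtt in zip(cwnd, rtts)]
        for i, rate in enumerate(rates):
            delivered[i] += rate * dt
            # One packet per RTT, spread evenly over the time steps.
            cwnd[i] += dt / rtts[i]

    return [d / (steps * dt) for d in delivered]


if __name__ == "__main__":
    # Hypothetical numbers: a 1 ms flow vs. a 10 ms flow on a
    # 100,000 packets/s link.
    short_rtt, long_rtt = simulate_aimd([0.001, 0.010], 100_000)
    print(f"short-RTT flow: {short_rtt:.0f} pkt/s")
    print(f"long-RTT flow:  {long_rtt:.0f} pkt/s")
```

In this model the short-RTT flow regrows its sending rate much faster after each back-off, so it ends up with by far the larger share of the link. Real controllers and real networks are messier, but the basic bias is the one described above.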

Should this be implemented in software at all?

Posted Jan 4, 2025 6:35 UTC (Sat) by buck (subscriber, #55985)

Sorry the comment you were responding to was so provocative.

Your reply was, by contrast, most gracious (I say as someone who has no emotional attachment to Infiniband design [grin]).

But since I can't withdraw my comment (I think), I at least fixed the Subject of this reply to reflect what my provocative question really was. (I certainly didn't mean to exclude NICs as an implementation target, which are probably considered, by anybody's definition, part of an "endpoint", or maybe even an "endpoint" in their own right, if they are smart NICs/"DPUs".)

That said, if you are being gracious enough to give your code away, it's not my business to question what use the rest of the world may make of it. Clearly it has found plenty of use for Raft, Tcl, etc.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds