
Article on TOE

From:  "Wael Noureddine" <wael-AT-chelsio.com>
To:  "Jonathan Corbet" <corbet-AT-lwn.net>
Subject:  Re: Article on TOE
Date:  Wed, 31 Aug 2005 10:10:18 -0700

Hi Jonathan,

We found your article on "Linux and TCP Offload Engines" very
interesting. The article discussed the submitted Chelsio TOE patch and
compiled a list of the objections raised by the stack maintainers. We
hope to be given the opportunity to provide some information regarding
the patch, and to clarify some of the points made.
 
As you have noted, the patch itself is really minimal. All in all, a
dozen or so lines of actual code will be needed for 2.6.14 to provide
generic, vendor-independent support for TOE. In any case, we have
resources committed to handling any future maintenance work. Therefore,
this should have very little impact on the maintenance of the stack.
 
The maintainers' apprehension regarding TOE in the Linux stack is well
known and shows up in the list of objections. Before we answer these
objections listed in last week's article, it is important to stress the
following points:
 
1) In addition to full offload, a TOE provides all the functions of a
regular NIC, including checksum offload and LSO for non-offloaded
traffic. A TOE can be operated as a NIC without any changes.
 
2) Today, you can buy a 10 Gbps TOE at virtually no price premium
compared to a 10 Gbps NIC. You're basically getting the additional
features for free.
 
3) Adding TOE support in the stack does not bypass the software stack.
It only makes it possible to enable additional functionality when
needed. TOE is a performance enhancement which should be available to
users who need it.
 
Now, to the objections:
 
* The maintenance issue has been mentioned above, and looking at the
patch itself should address any concerns in that area. Questions,
comments or suggestions regarding it are more than welcome and
appreciated. If there is anything that can be done to further improve
this aspect, please let us know.
 
* Netfilter support is really not shorted out, and connection acceptance
can still be subjected to regular checking. Also, keep in mind that a
TOE is there to speed up the connections which require it; the rest of
the traffic is still fully processed in the software stack.
 
* Traffic rate control at 10 Gbps speeds is really not practical in
software today. Without arguing if and when that would be possible,
today the Chelsio TOE provides rate control in hardware, so no
functionality is lost in that regard. Clearly, this will depend on
different vendors' implementations, but this is all about choice.
 
* The security and patching issue depends on each vendor's approach to
handling flaws. However, given that a TOE can be disabled
at any time, one can rely fully on the software stack while awaiting a
fix. There is no impact compared to regular NICs, besides the
performance loss.
 
* TOE performance has been questioned in the past, and perhaps rightly
so. However, it appears that this has changed recently. The Chelsio TOE
holds the Internet2 Land Speed Record (7.5 Gbps over 33,000 km), where
it maxed out the PCI-X bus and the distance required, with 1,500 byte
frames. This is just one indication. Other independent tests by the Los
Alamos Lab and OSU showed, for example, that TOE provides about twice the
throughput at half the CPU utilization of a regular NIC for data transfers,
and 60% to 1000% improvement in Web server capacity (see
http://www.chelsio.com/technology/HotInterconnect_2005.pdf). These
improvements were obtained without fully utilizing the TOE capability,
such as zero copy.
 
* It is clear that no one would want to design a 100 Mbps TOE today, but
it is equally questionable whether anyone still has an original 100 Mbps
adapter from 1993 in their current system. Technology advances will make
everything we're building now obsolete, and in that regard the TOE is no
different from a regular NIC. Assuming you still have the 100 Mbps TOE
you bought 10 years ago, you could just disable the offload and use it
as a NIC.
 
* It is important to stress that the TOE patent issue is being taken out
of context when it comes to full offload. The patents in question are
for the partial offload approach which has been taken by Microsoft. Full
offload is not, and cannot be, patented, as legal studies have determined.
 
* Stateless offload is an option which may work out for some
applications and users. However, the performance gap is still
considerable. Adding CPUs or waiting for CPUs to get faster are
suggestions which ignore the cost part of the equation. It is best to
leave such considerations to the users, who have to optimize their own
cost/performance trade-off.
 
* TOE opponents rely on the observation that CPU speeds tend to catch up
with network speeds, obviating the need for TOE. However, the very fact
that TOE is brought up recurrently and ever more pressingly indicates
that this gap is periodic, and it is getting more serious every time.
Today, the performance gap is being filled with exotic interconnects,
such as InfiniBand, while TCP/IP over Ethernet lags in performance.
Dismissing this market as niche and insignificant would be ignoring the
market realities. As shown in recent studies, such as
http://www.chelsio.com/technology/Cluster_2005_Techical_R...,
a TOE makes TCP/IP over Ethernet a competitive technology
again.
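
As a rough check on the record figures quoted in the performance point
above, the sketch below works out the frame rate needed to sustain 7.5 Gbps
with 1,500 byte frames, and the bandwidth-distance product of the run. It
assumes that each frame is exactly 1,500 bytes on the wire (Ethernet
preamble, inter-frame gap and header overhead are ignored), and the
function names are illustrative only.

```python
# Back-of-the-envelope figures for the record run quoted in the letter.
# Assumption: each frame counts as exactly 1,500 bytes on the wire;
# Ethernet preamble, inter-frame gap and header overhead are ignored.


def frames_per_second(rate_bps: int, frame_bytes: int) -> float:
    """Frames per second needed to sustain rate_bps with fixed-size frames."""
    return rate_bps / (frame_bytes * 8)


def bandwidth_distance(rate_bps: int, distance_m: int) -> int:
    """Bandwidth-distance product in bit-meters per second."""
    return rate_bps * distance_m


RATE_BPS = 7_500_000_000   # 7.5 Gbps, from the letter
DISTANCE_M = 33_000_000    # 33,000 km, from the letter
FRAME_BYTES = 1_500        # frame size, from the letter

if __name__ == "__main__":
    # prints "625,000 frames/s"
    print(f"{frames_per_second(RATE_BPS, FRAME_BYTES):,.0f} frames/s")
    # prints "247.5 Pb-m/s" (petabit-meters per second)
    print(f"{bandwidth_distance(RATE_BPS, DISTANCE_M) / 1e15:.1f} Pb-m/s")
```

Under these assumptions the sender has to emit roughly 625,000 frames every
second, which gives a sense of the per-frame work involved at that rate.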
 
It is important to mention that there are many unacknowledged benefits
to performing TCP processing in hardware, including microsecond
granularity retransmission and rate control, and receive data
re-assembly offload. These capabilities turn out to be very useful when
operating the latest low latency 10 Gbps Ethernet switches-on-a-chip,
which tend to have limited buffering resources and may consequently drop
packets. In addition, a TOE can handle essential TCP features, such as
timestamps, which are usually turned off due to their high processing
requirements at 10 Gbps. Finally, a TOE will most likely be required
to enable other technologies such as iSCSI, which is expected to gain
widespread use as a storage networking protocol.
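
To put "microsecond granularity" at 10 Gbps in perspective, the sketch
below estimates how long one frame occupies the wire and how many CPU
cycles that leaves per frame. The 1,500 byte frame size is reused from the
record run mentioned earlier; the 3 GHz clock is a hypothetical value
chosen purely for illustration, and framing overhead is ignored.

```python
# Per-frame time budget at a given line rate.
# Assumptions (not from the letter): a hypothetical 3 GHz CPU clock, and
# frames counted as exactly FRAME_BYTES on the wire with no framing overhead.


def frame_time_ns(rate_bps: int, frame_bytes: int) -> float:
    """Time one frame occupies the wire, in nanoseconds."""
    return frame_bytes * 8 * 1e9 / rate_bps


def cycles_per_frame(rate_bps: int, frame_bytes: int, cpu_hz: int) -> float:
    """CPU cycles that elapse while one frame is on the wire."""
    return frame_bytes * 8 * cpu_hz / rate_bps


LINE_RATE_BPS = 10_000_000_000   # 10 Gbps
FRAME_BYTES = 1_500              # same frame size as the record run
ASSUMED_CPU_HZ = 3_000_000_000   # hypothetical 3 GHz clock

if __name__ == "__main__":
    # prints "1200 ns per frame"
    print(f"{frame_time_ns(LINE_RATE_BPS, FRAME_BYTES):.0f} ns per frame")
    # prints "3600 cycles per frame"
    cycles = cycles_per_frame(LINE_RATE_BPS, FRAME_BYTES, ASSUMED_CPU_HZ)
    print(f"{cycles:.0f} cycles per frame")
```

With these numbers a full-size frame arrives about every 1.2 microseconds,
so any timer or rate-control decision made per frame has to operate at
roughly that granularity.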
 
TOE's performance has been independently demonstrated by end users, and
the technology can be integrated into Linux with relatively little
effort compared to other options being considered. There are no real
technical reasons for denying TCP offload its place as a useful option,
which users who require high performance should have today. It is our
hope that other reasons can be addressed to the satisfaction of
everyone, and to the benefit of the users of TCP/IP over Ethernet.



TSO vs. TOE?

Posted Sep 1, 2005 8:15 UTC (Thu) by mingo (guest, #31122)

curiously absent from both this letter and the paper is a discussion of TSO vs TOE.

TSO (TCP Segmentation Offload) support is included in the 2.6 kernel and works fine. I have looked at the HotInterconnect_2005.pdf paper referenced in the letter, but it is not clear to me whether you have compared TSO to TOE (on the same card).

The fundamental question is not whether TOE outperforms unassisted, CPU-driven TCP - it evidently does.

The question is, does TOE outperform TSO (which is a much simpler and more robust way of offloading TCP work to the NIC), and if yes, by how much?

TSO vs. TOE?

Posted Sep 2, 2005 0:56 UTC (Fri) by giraffedata (guest, #1954)

The second to the last paragraph of the original article does this comparison. It's more of an acknowledgement than an exhaustive comparison, of course -- that would be another article. But at least the topic is not absent.

TSO vs. TOE?

Posted Sep 2, 2005 6:46 UTC (Fri) by mingo (guest, #31122)

but why is TSO vs. TOE an afterthought in the paper (and not tested at all it seems), while the Linux networking stack maintainers stress that it is the main issue?

as i mentioned in another comment, TSO (TCP Segmentation Offload - the letter mentions "LSO", which is probably TSO) has extensive support in the 2.6 Linux kernel, and has been supported for a long time. All the network hardware that is capable of doing TSO has native Linux driver support for it: tg3, e1000, ixgb, s2io, bnx2, qeth, 8139cp - you name it.

Article on TOE

Posted Sep 11, 2005 21:10 UTC (Sun) by markhahn (guest, #32393)

offload always means more difficult upgrades, less control, more opacity. it's conceivable that it's worthwhile, but needs significant justification. further, such a justification must necessarily consider issues such as the rapid increase of host cpu/mem/io power (proving that a VIA EPIA needs Chelsio's board is not interesting!)

ultimately, it boils down to a cost-benefit tradeoff. Chelsio's problem is that their costs are still quite high - for instance, I suspect Wael is not thinking of the $US 800 10G card from Myri when he says that Chelsio's cards are comparable in price (the only Chelsio price I could find was Eu 2500!).

finally, Chelsio is betting on a certain religion of system design - the one that gives rise to very expensive, gold plated servers decked out with SANs, hot-swap ram, resident service teams, etc. I think it's become quite clear that this approach is being strongly challenged by cluster-based ones that use more cost-effective off-the-shelf components, and which simply do not reward expensive offload engines.


Copyright © 2005, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds