Alex Aizman <firstname.lastname@example.org>
VJ Channel API - driver level (README)
Tue, 02 May 2006 15:53:21 -0700
The following has been "soaking" on our side for a couple of months now... Noticed  and
thought it'd make sense to add it to the mix.
* README (inlined below)
* netdevice.h patch with the preliminary/draft driver-level API (next message).
Van Jacobson's Net Channels presentation at LCA2006 is available at  and
further discussed, for instance, at .
Within this context we defined the following goals for our implementation:
1) There are separate transport-neutral hardware supported channels for
transmit and receive traffic.
2) We are interested in a proof-of-concept, as well as an initial pass at a
The API introduces the concept of a unidirectional "flow" that can be "bound"
to a specific hardware-supported channel. Within this framework a TCP flow
is simply a special case where the transmit traffic flow is all the traffic
from a local TCP endpoint to a remote TCP endpoint and the receive
traffic flow is the exact opposite.
The API introduces - and is structured around - the following objects:
* hardware channel - hw_channelh
* kernel channel - kernel_channelh
* receive flow - netdev_rx_flow
More exactly, hw_channelh and kernel_channelh are opaque
(void*) handles to the corresponding stateful and
implementation-dependent objects. The rest of this text
says "channel" where, strictly speaking, it means the handle;
for brevity we'll assume that the difference is clear enough.
Channels & Channel Handles
Both the hardware and kernel channels (hw_channelh and kernel_channelh,
respectively) are strictly unidirectional. When a channel is opened it
is specified as a transmit or a receive channel. In order to create a
bi-directional send/receive channel there must be an additional API
which provides a higher level abstraction by using a pair
of uni-directional channels.
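
To illustrate the pairing idea, here is a minimal sketch of such a higher
level object. Everything prefixed with demo_ is hypothetical and is not part
of the proposed API; only the hw_channelh name comes from this README, and
the sketch assumes the channel direction is recorded when the channel is opened.

```c
#include <assert.h>
#include <stddef.h>

typedef void *hw_channelh;          /* opaque handle, as described above */

/* Hypothetical: direction is fixed when the channel is opened. */
enum demo_chan_dir { DEMO_CHAN_TX, DEMO_CHAN_RX };

/* Hypothetical stand-in for a driver's private channel object. */
struct demo_channel {
    enum demo_chan_dir dir;
};

/* Hypothetical higher-level object: one TX plus one RX channel. */
struct demo_bidir_channel {
    hw_channelh tx;
    hw_channelh rx;
};

/*
 * Pair two unidirectional channels.
 * Returns 0 on success, -1 if a channel is missing or has the wrong direction.
 */
static int demo_bidir_init(struct demo_bidir_channel *b,
                           struct demo_channel *tx,
                           struct demo_channel *rx)
{
    assert(b != NULL);
    if (tx == NULL || rx == NULL)
        return -1;
    if (tx->dir != DEMO_CHAN_TX || rx->dir != DEMO_CHAN_RX)
        return -1;
    b->tx = (hw_channelh)tx;
    b->rx = (hw_channelh)rx;
    return 0;
}
```

The pair object holds handles only, so the underlying channels remain
strictly unidirectional and can still be used on their own.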
Hardware channel handle and kernel channel handle are opaque handles
designated to reference the corresponding stateful objects when used in an
appropriate context, i.e., driver and kernel, respectively.
Hardware channel handle (hw_channelh) is an opaque handle used to reference
the corresponding device driver-specific channel object. It is up to the
device driver developers to define the actual channel structures which work
best for their specific hardware. The API knows nothing about these except as
Similarly, kernel channel handle (kernel_channelh) is opaque, as far as
network drivers are concerned. There is a 1-to-1 correspondence between a
kernel channel and a hardware channel. This assists in separating domain
knowledge between the device driver and the kernel proper. It is assumed
that kernel developers will be able to make use of kernel_channelh by
casting it to the appropriate structure when, for instance, processing
frames received on a corresponding hardware channel (hw_channelh).
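
The handle round trip described above might look roughly as follows. This is
a sketch only: the two typedefs follow the (void*) description in this
README, while the demo_ structures and functions are hypothetical and merely
stand in for whatever a driver and the kernel would define privately.

```c
#include <assert.h>
#include <stddef.h>

typedef void *hw_channelh;      /* opaque to the kernel */
typedef void *kernel_channelh;  /* opaque to the driver */

/* Hypothetical driver-private channel state. */
struct demo_drv_channel {
    int ring_id;                /* hypothetical hardware ring index */
    kernel_channelh kchan;      /* 1-to-1 link to the kernel channel */
};

/* Hypothetical kernel-private channel state. */
struct demo_kern_channel {
    unsigned long frames_seen;
    hw_channelh hwchan;         /* 1-to-1 link to the hardware channel */
};

/* Establish the 1-to-1 correspondence between the two objects. */
static void demo_link(struct demo_drv_channel *d, struct demo_kern_channel *k)
{
    assert(d != NULL && k != NULL);
    d->kchan = (kernel_channelh)k;
    k->hwchan = (hw_channelh)d;
}

/* Kernel side: cast the handle back to the structure the kernel owns. */
static void demo_kern_rx(kernel_channelh h)
{
    struct demo_kern_channel *k = (struct demo_kern_channel *)h;

    k->frames_seen++;
}

/* Driver side: a frame arrived; pass the kernel handle through untouched. */
static void demo_drv_frame_arrived(hw_channelh h)
{
    struct demo_drv_channel *d = (struct demo_drv_channel *)h;

    demo_kern_rx(d->kchan);
}
```

Note that the driver never dereferences kchan and the kernel never
dereferences hwchan; each side casts only the handle it issued itself.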
Receive flow (netdev_rx_flow) contains the criteria used to steer
a certain type of incoming packets (L2 frames, IP or UDP datagrams,
TCP segments, etc.) to a receive channel. For instance, a single channel
can be used to only receive traffic for a given MAC, or all traffic to TCP
port 80. Furthermore, several flows can be added (or more exactly, can be
"bound" via the corresponding bind_rx_hwchannel() API call) to the same
channel. This means one can create a channel that accepts traffic for
destination MAC A and MAC B, or a channel used to transfer TCP packets with
destination port 80 and 8080, or a channel for a number of TCP connections
defined by their respective 4-tuples.
For more discussion see Section "Neterion XFrame-II Specific Note" below.
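
As an illustration of several flows bound to one channel, here is a small
user-space model. It is an assumption-laden sketch: the real netdev_rx_flow
layout and the bind_rx_hwchannel() signature are defined by the netdevice.h
patch, not here, so all demo_ names and fields below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flow types; the real ones live in the netdevice.h patch. */
enum demo_flow_type { DEMO_FLOW_DST_MAC, DEMO_FLOW_TCP_DST_PORT };

/* Hypothetical, simplified stand-in for netdev_rx_flow. */
struct demo_rx_flow {
    enum demo_flow_type type;
    uint8_t  dst_mac[6];
    uint16_t tcp_dst_port;      /* host byte order, for simplicity */
};

/* Hypothetical pre-parsed packet summary. */
struct demo_pkt {
    uint8_t  dst_mac[6];
    int      is_tcp;
    uint16_t tcp_dst_port;
};

#define DEMO_MAX_FLOWS 8

struct demo_flow_channel {
    struct demo_rx_flow flows[DEMO_MAX_FLOWS];
    int nflows;
};

/* Bind one more flow to the channel; -1 if the channel is full. */
static int demo_bind_flow(struct demo_flow_channel *ch,
                          const struct demo_rx_flow *f)
{
    assert(ch != NULL && f != NULL);
    if (ch->nflows >= DEMO_MAX_FLOWS)
        return -1;
    ch->flows[ch->nflows++] = *f;
    return 0;
}

static int demo_flow_matches(const struct demo_rx_flow *f,
                             const struct demo_pkt *p)
{
    switch (f->type) {
    case DEMO_FLOW_DST_MAC:
        return memcmp(f->dst_mac, p->dst_mac, 6) == 0;
    case DEMO_FLOW_TCP_DST_PORT:
        return p->is_tcp && p->tcp_dst_port == f->tcp_dst_port;
    }
    return 0;
}

/* A channel accepts a packet if any of its bound flows matches. */
static int demo_channel_accepts(const struct demo_flow_channel *ch,
                                const struct demo_pkt *p)
{
    for (int i = 0; i < ch->nflows; i++)
        if (demo_flow_matches(&ch->flows[i], p))
            return 1;
    return 0;
}
```

Binding one flow for port 80 and another for port 8080 gives the "port 80
and 8080" channel from the example above. In real hardware the matching is
of course done by the adapter, not by a loop in software.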
One can also assign a custom receive function to each separate receive channel.
If a callback function is specified, this function will be used to pass up
traffic instead of the netif_* API. This allows for a direct data path for
applications should they wish to use it. This callback function is entirely
optional and must be set per channel. If the function pointer is NULL the
standard netif_* API is used.
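
The NULL-means-default rule can be sketched as below. The callback type, the
channel structure and demo_netif_rx_standin() are hypothetical; the latter
only stands in for the real netif_* call so that the sketch can run in user
space.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frame descriptor. */
struct demo_frame {
    int len;
};

/* Hypothetical per-channel receive callback type. */
typedef int (*demo_rx_cb)(void *kchan, struct demo_frame *f);

struct demo_cb_channel {
    void *kchan;        /* kernel channel handle, opaque to the driver */
    demo_rx_cb rx_cb;   /* NULL selects the standard path */
};

/* Counts frames that took the standard path (stand-in for netif_*). */
static int demo_std_count;

static int demo_netif_rx_standin(struct demo_frame *f)
{
    (void)f;
    demo_std_count++;
    return 0;
}

/* Example callback: treats kchan as a pointer to a frame counter. */
static int demo_count_cb(void *kchan, struct demo_frame *f)
{
    (void)f;
    (*(int *)kchan)++;
    return 0;
}

/* Driver receive path: custom callback if set, standard path otherwise. */
static int demo_deliver(struct demo_cb_channel *ch, struct demo_frame *f)
{
    assert(ch != NULL && f != NULL);
    if (ch->rx_cb != NULL)
        return ch->rx_cb(ch->kchan, f);
    return demo_netif_rx_standin(f);
}
```

Since the decision is made per channel, a direct data path for one
application does not change how traffic on other channels is delivered.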
With respect to receive flow binding sequence, the last channel
that is bound to a specific flow is the one that "wins", i.e., gets the
traffic. In general, sharing of channels in a consistent fashion and
tracking of receive flows is currently considered outside of the scope
of this API.
System Scalability ("The Big Picture")
The API has its place in the entire picture that, as per Van Jacobson,
includes "channelized" application, "channelized" socket, and "channelized"
driver. It is meant to provide a mechanism which, if used correctly,
will ultimately make it possible to achieve system-level per-CPU scalability.
It is outside the scope of this API to provide an interface to automatically
place a transmit channel and the user-space application using it onto a given
CPU. But if the transmitting application is in fact bound to a CPU, and if the
kernel socket is "channel-aware", and if this particular channel is always
used with the same CPU - if all of the above is true, then the ultimate goal
of scalability can be reached.
Similarly, on the inbound, the API can be used to:
(1) channelize received traffic based on a number of available
receive flow classification mechanisms
(2) process the per-channel MSI/MSI-X on a given CPU
specified at channel open time (see open_rx_hwchannel()).
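
A toy model of point (2): the CPU is chosen once, at open time, and every
subsequent per-channel interrupt is accounted to that CPU. The actual
open_rx_hwchannel() signature is defined in the netdevice.h patch; the
demo_ names, the fixed CPU count and the counters are assumptions made for
this sketch only.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4          /* hypothetical CPU count */

struct demo_irq_channel {
    int cpu;                    /* CPU chosen at open time */
    unsigned long frames;       /* frames received on this channel */
};

/* Per-CPU interrupt counters (simulation only). */
static unsigned long demo_cpu_irqs[DEMO_NR_CPUS];

/* Open a receive channel whose interrupts are handled on the given CPU.
 * Returns 0 on success, -1 if the CPU number is out of range. */
static int demo_open_rx(struct demo_irq_channel *ch, int cpu)
{
    assert(ch != NULL);
    if (cpu < 0 || cpu >= DEMO_NR_CPUS)
        return -1;
    ch->cpu = cpu;
    ch->frames = 0;
    return 0;
}

/* Simulated per-channel MSI/MSI-X: the work lands on the channel's CPU. */
static void demo_channel_irq(struct demo_irq_channel *ch, unsigned long nframes)
{
    demo_cpu_irqs[ch->cpu]++;
    ch->frames += nframes;
}
```

With one channel per CPU, receive processing for different flows never
touches another CPU's counters, which is the property the per-CPU
scalability argument relies on.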
It's outside the scope of this particular API what happens with the packets
received on a given channel after the netif_rx*() callback hands them over to the
Neterion XFrame-II Specific Note
The API is an attempt at a generalized kernel <=> driver interface.
This section talks about Xframe-specific restrictions. We assume that
other multi-channel capable network adapters might have their own
adapter-specific restrictions; the idea, however, is not to propagate those
restrictions to the API level, if possible.
The XFrame-II adapter supports multiple receive traffic flow types.
However, it is not possible to mix the receive flow types
(netdev_hwchannel_rx_flow_e) with the current Xframe hardware. In other
words, one cannot use a single hardware channel to receive, for instance,
all TCP traffic for destination port 80 and all L2 traffic for destination MAC A.
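
A driver could enforce this restriction at bind time, roughly as sketched
below. The enumerators of netdev_hwchannel_rx_flow_e are not listed in this
README, so the sketch uses its own hypothetical demo_flow_kind enum; the
channel structure and function name are hypothetical as well.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flow types, standing in for netdev_hwchannel_rx_flow_e. */
enum demo_flow_kind {
    DEMO_KIND_NONE = 0,         /* nothing bound yet */
    DEMO_KIND_L2_DST_MAC,
    DEMO_KIND_TCP_DST_PORT
};

struct demo_xchan {
    enum demo_flow_kind bound_kind; /* type of the flows bound so far */
    int nflows;
};

/* Bind a flow of the given type. Returns 0 on success, or -1 if the
 * channel already carries flows of a different type. */
static int demo_xchan_bind(struct demo_xchan *ch, enum demo_flow_kind kind)
{
    assert(ch != NULL && kind != DEMO_KIND_NONE);
    if (ch->nflows > 0 && ch->bound_kind != kind)
        return -1;              /* mixing flow types is not supported */
    ch->bound_kind = kind;
    ch->nflows++;
    return 0;
}
```

Several flows of the same type (for instance, ports 80 and 8080) still share
one channel; only the mix of TCP-port and MAC flows is rejected.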