
Reworking NAPI

NAPI ("new API," though it is not so new anymore) is an interrupt mitigation mechanism used with network devices. When network traffic is heavy, the kernel can safely predict that incoming packets will be available anytime it gets around to looking, so there is no need to have the adapter interrupting it (possibly thousands of times per second) to tell it about those packets. So a NAPI-compliant driver will turn off the packet receive interrupt and provide a poll() method to the kernel. When the kernel is ready to deal with more packets, poll() will be called with a maximum number of packets it is allowed to feed into the kernel; it should process up to that many packets and quit.
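
The cycle described above can be modeled in a few lines of user-space C. This is only a sketch of the idea, not kernel code: the rx_model structure and the packet_arrives() and poll_model() functions are invented for illustration, and the "interrupt" is just a counter.

```c
/* User-space model of NAPI-style interrupt mitigation; not kernel code.
 * All names here are hypothetical. */
struct rx_model {
    int pending;      /* packets waiting in the imaginary receive ring */
    int irq_enabled;  /* 1 while the receive interrupt is unmasked */
    int interrupts;   /* number of interrupts the CPU has taken */
    int delivered;    /* packets handed to the imaginary network stack */
};

/* A packet lands in the ring. Only the first packet after the interrupt
 * was re-enabled costs an interrupt; the handler then masks it. */
void packet_arrives(struct rx_model *m)
{
    m->pending++;
    if (m->irq_enabled) {
        m->interrupts++;
        m->irq_enabled = 0;
    }
}

/* Process at most `budget` packets, then quit. The interrupt is only
 * re-enabled once the ring has been drained. */
int poll_model(struct rx_model *m, int budget)
{
    int done = 0;

    while (done < budget && m->pending > 0) {
        m->pending--;
        m->delivered++;
        done++;
    }
    if (m->pending == 0)
        m->irq_enabled = 1;
    return done;
}
```

In this model, a burst of 100 packets arriving before the first poll costs one interrupt rather than 100; two calls to poll_model() with a budget of 64 then drain the ring and unmask the interrupt.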

With NAPI in place, the kernel can process significantly higher packet loads. The reduction in interrupt load helps, but there are a couple of other advantages as well. The way NAPI works makes it less likely that packets will be reordered in the kernel. And if traffic reaches the point where the kernel is forced to drop packets, those packets can be dumped before they are ever fed into the network stack. For more information on NAPI, see this old LWN article or this page at OSDL, which is newer and more complete.

That page may require some updating soon, however, as Stephen Hemminger has proposed a newer NAPI (NNAPI?) which changes the driver API somewhat. In the current mainline, there are two NAPI-related fields in the net_device structure: poll(), the function called to collect packets from the adapter, and weight, which is essentially the driver writer's best guess as to how important the interface is relative to any others which might be on the system. Stephen's patch moves these parameters into a separate structure (struct napi_struct), aggregating them with a few other NAPI-related structures.

The napi_struct structure is then put back into struct net_device, but drivers need not use that one. The whole purpose of this patch would appear to be to separate the NAPI-related information from specific network devices. There are some adapters which provide multiple ports, all of which have a single receive interrupt. The separated NAPI information allows all of those ports to share a single NAPI state and a single poll() function; this organization better fits the reality of the hardware.

This patch won't hit mainline before 2.6.21, so driver authors have some time to react. The changes are relatively simple to make. The first is to find a napi_struct structure for the device; in the absence of a reason to do otherwise, the best solution would be to use the new napi field in the net_device structure. Suppose the current code initializes itself with something like:

    dev->weight = MY_WEIGHT;
    dev->poll = my_poll;

The new version would look like this:

    dev->napi.weight = MY_WEIGHT;
    dev->napi.poll = my_poll;

The prototype of the poll() function has changed a bit, however; it now looks like:

    int (*poll)(struct napi_struct *napi, int budget);

The pointer to the net_device structure has been replaced with a pointer to the napi_struct structure. In most cases, the net_device pointer can be had with a call like:

    struct net_device *dev = container_of(napi, struct net_device, napi);
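
For readers unfamiliar with container_of(), the following user-space sketch shows how the lookup works. The structure definitions are cut-down stand-ins rather than the kernel's own, and struct my_adapter (a two-port device keeping one shared napi_struct in its private data) is a hypothetical example of a case where the pointer would not lead back to a net_device.

```c
#include <stddef.h>

/* Simplified user-space version of the kernel macro: step back from a
 * member's address to the start of the structure that contains it. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-ins for the kernel structures; the real ones have many more fields. */
struct napi_struct {
    int weight;
};

struct net_device {
    char name[16];
    struct napi_struct napi;   /* the embedded napi field */
};

/* Hypothetical driver-private structure for a two-port adapter whose
 * ports share one receive interrupt and one NAPI context. */
struct my_adapter {
    struct net_device *port[2];
    struct napi_struct napi;
};

/* The common case: the napi_struct lives inside the net_device. */
struct net_device *napi_to_dev(struct napi_struct *napi)
{
    return container_of(napi, struct net_device, napi);
}

/* The shared case: the napi_struct lives inside the adapter instead. */
struct my_adapter *napi_to_adapter(struct napi_struct *napi)
{
    return container_of(napi, struct my_adapter, napi);
}
```

The lookup is pure pointer arithmetic, so it is only valid when the napi_struct really is embedded in the structure type named in the call.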

The meaning of the budget parameter has changed slightly as well; it is now the only indicator of how many packets the poll() function may feed into the kernel. There is no longer any need to check the quota field separately. Finally, the return value should be the number of packets which were actually processed.

The other NAPI-related functions in the network system have been modified in fairly predictable ways. NAPI polling is started with either of:

    void napi_schedule(struct napi_struct *napi);
    /* or */
    int napi_schedule_prep(struct napi_struct *napi);
    void __napi_schedule(struct napi_struct *napi);

Polling is turned off with:

    void napi_complete(struct napi_struct *napi);
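
Putting the pieces together, a driver's receive path might be shaped roughly like the user-space sketch below. Nothing here is taken from the patch itself: napi_struct is a stand-in with an invented scheduled flag, the three napi functions are trivial mock-ups with the prototypes shown above, and struct my_priv, my_poll() and my_interrupt() are hypothetical. The choice to call napi_complete() when fewer than budget packets were found is likewise an assumption about how a driver would decide that it is done.

```c
#include <stddef.h>

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the kernel structure; `scheduled` is a hypothetical flag. */
struct napi_struct {
    int weight;
    int scheduled;    /* 1 while polling is active */
    int (*poll)(struct napi_struct *napi, int budget);
};

/* Mock-ups of the functions named in the article, for illustration only. */
int napi_schedule_prep(struct napi_struct *napi)
{
    if (napi->scheduled)
        return 0;             /* already being polled */
    napi->scheduled = 1;
    return 1;
}

void __napi_schedule(struct napi_struct *napi)
{
    (void)napi;               /* the kernel would queue napi for polling here */
}

void napi_complete(struct napi_struct *napi)
{
    napi->scheduled = 0;
}

/* Hypothetical driver-private state. */
struct my_priv {
    struct napi_struct napi;
    int rx_pending;           /* packets waiting in the imaginary ring */
    int rx_irq_enabled;       /* 1 while the receive interrupt is unmasked */
};

/* New-style poll: budget is the only limit, and the return value is the
 * number of packets actually processed. */
int my_poll(struct napi_struct *napi, int budget)
{
    struct my_priv *priv = container_of(napi, struct my_priv, napi);
    int done = 0;

    while (done < budget && priv->rx_pending > 0) {
        priv->rx_pending--;   /* a real driver would pass the packet up */
        done++;
    }
    if (done < budget) {      /* ring drained: stop polling, unmask */
        napi_complete(napi);
        priv->rx_irq_enabled = 1;
    }
    return done;
}

/* Receive interrupt: mask further interrupts and start polling. */
void my_interrupt(struct my_priv *priv, int arrived)
{
    priv->rx_pending += arrived;
    if (napi_schedule_prep(&priv->napi)) {
        priv->rx_irq_enabled = 0;
        __napi_schedule(&priv->napi);
    }
}
```

The two-step napi_schedule_prep() and __napi_schedule() form is used here because it leaves room to mask the interrupt between the check and the scheduling; a driver with nothing to do in between could presumably call napi_schedule() directly.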

The current patch is in an early state, so the interfaces could change over the next few months. Nobody has spoken out against it, though, so chances are good that it will be merged in some form.



Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds