Improving lost and spurious IRQ handling

By Jonathan Corbet
June 15, 2010

Interrupts are a device's way of telling the kernel that something interesting has happened. One of the key benefits of using interrupts is that they free the kernel from the need to poll a device to learn what its state is. Like any other part of a computer, though, interrupts can go wrong, leading to situations where the system is overwhelmed by a flood of spurious interrupts - or, instead, left waiting for an interrupt which will never arrive. The kernel has some defensive mechanisms in its generic interrupt layer for dealing with situations like these; Tejun Heo has now posted a patch series intended to improve those mechanisms. As it happens, the necessary response when interrupts go bad is returning to polling.

One problem which is familiar to driver authors is missing interrupts. A driver will typically set up an I/O operation, get it started, then wait until an interrupt indicating completion arrives. If that interrupt never shows up, the driver can end up waiting for a very long time. Missing interrupts can have a number of causes, including flaky devices or an interrupt routing problem somewhere in the system. Either way, if the driver author has not anticipated this situation and taken the appropriate measures - setting a timeout, for example - things will not end well.

Waiting for interrupt timeouts will slow a device's performance considerably, though. That problem can be mitigated by polling the device state frequently, but rapid polling has its own costs. In an attempt to obtain the best results consistently, Tejun's patch adds a new driver API:

    #include <linux/interrupt.h>

    struct irq_expect *init_irq_expect(unsigned int irq, void *dev_id);
    void expect_irq(struct irq_expect *exp);
    void unexpect_irq(struct irq_expect *exp, bool timedout);

A call to init_irq_expect() will allocate an opaque token to be used with the other two functions; it should be passed the interrupt number of interest and the same dev_id value as was used to allocate the interrupt initially. When the driver initiates an action which should result in a device interrupt, it should make a call to expect_irq(). When the operation is completed, unexpect_irq() should be called, with timedout indicating whether the operation timed out (the interrupt did not arrive). Note that it's not necessary for the driver to free the struct irq_expect structure; that will happen automatically when the interrupt is released.

A call to expect_irq() will initiate polling on the given interrupt line, where "polling" means making an occasional call to the device's interrupt handler. Initially, that polling is quite slow. If it turns out that the device is dropping interrupts (as indicated by the timedout parameter to unexpect_irq()), the polling frequency will be increased - up to once every millisecond. Working devices should interrupt before the slow poll period passes, so the result should be no real polling at all on reliable devices. If there is a problem with interrupt delivery, though, the kernel will automatically take responsibility for poking the interrupt handler when interrupts are expected.

This interface works well if the driver knows when to expect interrupts, but not all devices work that way. For hardware which can interrupt at any time, there is an "IRQ watching" API instead:

    void watch_irq(unsigned int irq, void *dev_id);

This function will begin polling of the specified interrupt line; it will also initiate tracking of interrupt delivery status. If it determines that interrupts are being lost (as determined by an IRQ_HANDLED return status from a polled call to the handler), it will continue to poll at a higher frequency. Otherwise, eventually, interrupt delivery will be deemed to be reliable and polling will be turned off.

Tejun's patch also changes the way that the kernel responds to spurious interrupts - those which no driver is interested in. Current kernels count the number of interrupts on each line for which no handler returned IRQ_HANDLED; if 99,000 out of 100,000 interrupts are spurious, the kernel loses patience, disables the interrupt line forevermore, and starts polling the line instead. There is a real cost to this action, which is why the kernel allows spurious interrupts to get to such a high proportion of the total. Once the response is triggered, there is no going back, even if the spurious interrupts were the result of a brief hardware glitch.

With the adaptive polling mechanisms put into place to support the above features, the kernel is also able to take a more flexible approach to handling of spurious interrupts. 9,900 bad interrupts out of 10,000 are now enough to cause the spurious interrupt handling mechanism to kick in; as before, it disables the interrupt and begins polling. After a period, though, the new code will reenable the interrupt line, just to see what happens. If the source of spurious interrupts has stopped, the interrupt can be used as before. If, instead, spurious interrupts are still being delivered, the line will be blocked again for a longer period of time.

There has not been a lot of discussion of this patch set so far; one comment worried that polling could cause users not to realize that there are problems in their systems. But Tejun says that this kind of response is required to get reasonably solid behavior out of flaky hardware, and nobody seems to want to challenge that claim. So it seems fairly likely that a future version of this patch will find its way into the mainline at some point.

Index entries for this article
Kernel	Interrupts

Improving lost and spurious IRQ handling

Posted Jun 17, 2010 7:41 UTC (Thu) by mjthayer (guest, #39183) [Link] (3 responses)

> one comment worried that polling could cause users not to realize that there are problems in their systems.

Strangely enough my thought was the reverse - that the system would be well suited for gathering diagnostic information that could be used to alert users to potential problems.

Improving lost and spurious IRQ handling

Posted Jun 17, 2010 20:49 UTC (Thu) by tialaramex (subscriber, #21167) [Link] (2 responses)

Alerts must be actionable. If you tell me "Foo: Bar happened. Quux!" and I can't do anything about it ("throw your Foo away" is rarely an acceptable option) then I just feel like Linux is pointlessly screaming at me.

It's worth recording and making available the information to anyone who enquires, but that's about it. I'd say it's like the occasional deprecated API feature, those get mentioned in dmesg but they aren't (in any system I've seen) pushed to desktop notifiers etc., because users who can't run dmesg are most likely powerless to fix them, and generally an upstream will already know about it and be in the process of developing a fix.

Improving lost and spurious IRQ handling

Posted Jun 18, 2010 6:46 UTC (Fri) by jzbiciak (guest, #5246) [Link]

Proactive alerts would be bad, unless a massive failure is imminent, then go ahead and alert me. So, spurious interrupts? Don't tell me proactively. Hard-drive about to croak? Give me a pie in the face if you have to!

That said, it would be nice to have a "Why is it slow?" button that can go round up all the suspicious things it's seen lately, such as:

Spurious interrupts
Dropped interrupts
Kernel oopses that didn't panic the system
HD command timeouts. (Much more common for me back in the IDE days.)
Wacky numbers on my network interfaces (ie. gobs of dropped/collided/whatever packets)
Unusual temperature readings
...etc, etc, etc.

Basically, round up anything vaguely suspicious and say "Uh, here," and maybe stop there. That is, aim it at a semi-expert or motivated tinkerer diagnosing a slow computer. Trying to give advice to less clued users based on some sort of expert system database is asking for trouble and confusion. Better to leave it somewhat opaque and leave it to the educated and motivated to interpret it.

A recent example from my Windows laptop: Video acceleration "dies" if I have VPN up and running while also running dual head. (At least, that's the only common factor I've identified.) I first noticed it because everything "got slow" to varying degrees. If I had a "Why's it slow?" button, it should put that event at the top of the list, even if it can't tell me what to do about it. On a previous laptop, it "got slow" due to HD timeouts. The list goes on. These spurious and dropped interrupts are natural candidates for such a list.

Improving lost and spurious IRQ handling 100% fix

Posted May 31, 2016 19:06 UTC (Tue) by stevedonato (guest, #109054) [Link]

Current Linux missing interrupts can cause the OS to hang because it does not have a perfected "Missing Interrupt Handler."
In my opinion a simple 100% fix is; kernel should include starting a timer prior to issuing any I/O request. If the timer interrupt pops and the I/O request has not YET completed the missing interrupt handler "MIH" can post an I/O error/timeout etc. back to the original requester of the I/O.
If the I/O completes normally the timer is canceled before returning to the task scheduler.
Starting a hardware Timer should no take any CPU time during it's wait time.
While addition code has to take into account the type of device and what is max timeout it should wait for etc. this is a simple table driven list of items.
IBM uses this approach in all operating systems

Improving lost and spurious IRQ handling

Posted Jun 18, 2010 21:20 UTC (Fri) by giraffedata (guest, #1954) [Link]

When the operation is completed, unexpect_irq() should be called, with timedout indicating whether the operation timed out (the interrupt did not arrive)

Does this mean the device driver should call unexpect_irq()? How does it know the interrupt did not arrive, given that someone calls the driver's interrupt handler (because of expect_irq()) even if it didn't?