Since you cannot run the app on the same CPU as the one receiving the interrupts you're going to get a cache bounce *anyway*, right?
But it's worse than that, it's not *an* app, there are several apps which all need to see the same data and since they are running on different CPUs you're going to get a cache bounce for each one anyway.
What you're basically saying is: round-robin IRQ handling is bad because you're sometimes going to get 6 cache-bounces per packet instead of 5. BFD. Without round-robin IRQs if the amount of traffic doubles we have to tell the client we can't do it.
The irony is, if you buy a more expensive network card you get MSI-X which gets the network card to do the IRQ distribution. You then get the same number of cache bounces as if we programmed the IO-APIC to do round-robin, but the load is more evenly distributed. So we've worked around a software limitation by modifying the hardware!
I'm mostly irked by a built-in feature of the IO-APIC being summarily disabled on machines with 8+ CPUs with the comment "but you don't really want that" while I believe I should at least be given the choice.