That (plus intelligently setting up the buffers so that packets *don't* sit there too long) would be an improvement. Is it good enough? I dunno. Say we set a latency target of 10 ms. That means that sendfile()'s going to incur 100 wakeups/second, which is probably more than we'd like, but maybe acceptable (and maybe we'd need to wake up that often to deal with ACKs anyway). It's also not clear that that's an aggressive enough latency target. For a web server, that's already 10% of Amazon's "100 ms latency = 1% lost sales" guideline. For servers chatting with each other inside a datacenter, I just measured 1/4 of a ms as the average round-trip ping between two machines in our cluster -- about 0.125 ms one-way -- so call it an 80x increase in one-way latency. That seems like a lot, maybe?
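To make the arithmetic above explicit (a quick sanity check, not anything from a real system -- the variable names are made up):

```python
# A 10 ms latency target means waking up once per 10 ms to flush buffers.
target_latency_s = 0.010
wakeups_per_sec = 1 / target_latency_s
print(wakeups_per_sec)  # 100 wakeups/second

# Ping reports round-trip time; halve it to get one-way latency.
ping_rtt_ms = 0.25
one_way_ms = ping_rtt_ms / 2          # ~0.125 ms one-way
print(10 / one_way_ms)                # 80x increase in one-way latency
```

The 80x figure only works out if you treat the measured ping as a round trip and compare one-way to one-way.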
And there are a lot of advantages to picking the *right* packets to drop -- if the packet you drop happens to be DNS, or interactive SSH, or part of a small web page, then you'll cause an immediate user-visible hiccup, and won't get any benefits in terms of reduced contention (like you would if you had dropped a packet from a long-running bulk TCP flow that then backs off). Then again, maybe that's okay, and re-ordering packets that have already been handed off to the driver does sound pretty tricky! (And might require hardware support.)
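As a rough illustration of what "picking the right packets to drop" might look like, here's a toy sketch: when the buffer is full, drop from the flow with the largest queued backlog (likely a bulk transfer whose TCP sender will back off on loss) rather than from a small interactive flow. The class, flow IDs, and packet representation are all hypothetical, purely for illustration:

```python
from collections import defaultdict

class SmartDropQueue:
    """Toy buffer that drops from the flow with the most queued bytes."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.queued = 0
        self.packets = []                 # (flow_id, size) in arrival order
        self.per_flow = defaultdict(int)  # bytes currently queued per flow

    def enqueue(self, flow_id, size):
        # Make room by trimming the biggest backlog, not the newest arrival.
        while self.queued + size > self.capacity and self.packets:
            self._drop_from_biggest_flow()
        self.packets.append((flow_id, size))
        self.per_flow[flow_id] += size
        self.queued += size

    def _drop_from_biggest_flow(self):
        victim = max(self.per_flow, key=self.per_flow.get)
        # Drop that flow's newest queued packet; the loss signals a
        # long-running TCP flow to slow down.
        for i in range(len(self.packets) - 1, -1, -1):
            if self.packets[i][0] == victim:
                _, size = self.packets.pop(i)
                self.per_flow[victim] -= size
                self.queued -= size
                if self.per_flow[victim] == 0:
                    del self.per_flow[victim]
                return
```

With this policy, a lone DNS packet survives a full buffer while the bulk flow's backlog gets trimmed -- which is the behavior the paragraph above is after. A real implementation would need to do this before packets reach the driver (or with hardware support), which is exactly the tricky part.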
But it's useful to try to find the "right" solution first, because that way, even if you give up on achieving it, at least in plan B you know what you're trying to approximate.