User: Password:
Subscribe / Log in / New account

The CHOKe packet scheduler

The CHOKe packet scheduler

Posted Feb 28, 2011 0:39 UTC (Mon) by gmaxwell (guest, #30048)
In reply to: The CHOKe packet scheduler by jnareb
Parent article: The CHOKe packet scheduler

Well, SFB isn't this— SFB uses bloom filters to discriminate flows while this uses sampled queue occupancy.

It would be interesting to see the two compared in simulation. I'd expect SFB to be more fair until its hash tables start saturating and then CHOKe to be more fair after that point.

It might be interesting to combine them, but I'm not sure what a proper update rule would look like because the CHOKe matches are stochastic. I suppose you'd have an update rule for the blue constants that clocks two different values through a filter depending on the CHOKe match status. In effect, you'd have SFBish behavior until the buckets saturate and then each saturated bucket would behave CHOKe-ish.

Unfortunately, as described, CHOKe is not hardware friendly at all. The random queue access can be pipelined (or, better, replaced with a single memory slot remembering a random flow at the time it was enqueued) but I don't see an obvious way to drop both in the case of a match without taking more memory bandwidth than RED or even SFB like schemes.

(Log in to post comments)

The CHOKe packet scheduler

Posted Feb 28, 2011 3:05 UTC (Mon) by jthill (subscriber, #56558) [Link]

not hardware friendly
Router hardware is so taut that lighting a don't-transmit-me flag in the packet header is excessive?

The CHOKe packet scheduler

Posted Feb 28, 2011 3:19 UTC (Mon) by gmaxwell (guest, #30048) [Link]

An additional read-modify-write cycle out to whatever memory is storing the en-queued descriptors, potentially for every packet ingress? And what happens if you hit a don't transmit packet on your next sample? Stall the pipeline and try again?

At best it would halve the worst case throughput of a memory bandwidth bottlenecked system (e.g. 10g+ packet forwarding engines)

Compared to RED (just needs a RNG and a fullness counter) and SFB (whos state can be made 'arbitrarily' small without compromising the queue capacity, and so can be kept on chip) this sounds pretty unfriendly.

Of course, not an issue on some small software driven router handling only a few tens of megabits. but most of those devices could also do real flow-aware queuing which was far more fine grained than CHOKe.

The CHOKe packet scheduler

Posted Feb 28, 2011 3:24 UTC (Mon) by dlang (subscriber, #313) [Link]

if you are pumping data out a 10Gb pipe, you are unlikely to need this sort of thing. this is needed when the link you are sending out of is the bottleneck of the communications path.

allow this to be configured on a per-interface basis and you can disable it on your links where the hardware can barely keep up, but still have it in place where you are going to be buffering the data that you send.

The CHOKe packet scheduler

Posted Feb 28, 2011 3:41 UTC (Mon) by gmaxwell (guest, #30048) [Link]

I wonder what bit of misinformation makes people think that simply because a link is fast that it won't be congested. This isn't the case— and fast links tend to be major traffic aggregation points where fairness is a bigger issue, and where insufficient buffering can result in very costly under-utilization.

As I mentioned, in the sort of place where you're using a purely software forwarding path and where the design of CHOKe wouldn't be a performance impediment you could also do per-flow queuing which would be considerably more fair than CHOKe (and potentially much better for loss sensitive flows), perhaps falling back to CHOKe/SFB if the flow lookups become too expensive.

AFAIK, Linux doesn't even have a true per-flow qdisc though there have been patches and SFQ approximates it with acceptable performance. Can you suggest a case where CHOKe would be needed but the SFQ qdisc in Linux would be inappropriate?

The CHOKe packet scheduler

Posted Feb 28, 2011 6:19 UTC (Mon) by jthill (subscriber, #56558) [Link]

At best it would halve the worst case throughput of a memory bandwidth bottlenecked system (e.g. 10g+ packet forwarding engines
That implies that internal bandwidth to packet metadata can be more of a bottleneck than transmission bandwidth to the actual packet data, and that one read and one write to a random packet envelope costs roughly as much as allocating and queuing it did in the first place.

So far as the envelope is concerned, that would mean each envelope is written as (some equivalent of) a single cache line. Hauling the envelope back and rewriting a bit would then actually cost the same bandwidth. That seems sensible enough.

I don't think full interlock on the update is necessary -- that's what LL/SC is for, yes? Just ignore the result from the conditional store, if it works, great, if not, so what? Just as for a duplicate tag, the inbound packet is still dropped, as it should be, and the goal is to approximate, to "differentially penalize", which this still achieves.

It's still really hard for me to imagine metadata bandwidth being more of a bottleneck than actual data bandwidth, though.

Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds