From: Paul Turner <firstname.lastname@example.org>
Subject: [CFS Bandwidth Control v4 0/7] Introduction
Date: Tue, 15 Feb 2011 19:18:31 -0800
Cc: Bharata B Rao <email@example.com>, Dhaval Giani <firstname.lastname@example.org>, Balbir Singh <email@example.com>, Vaidyanathan Srinivasan <firstname.lastname@example.org>, Gautham R Shenoy <email@example.com>, Srivatsa Vaddagiri <firstname.lastname@example.org>, Kamalesh Babulal <email@example.com>, Ingo Molnar <firstname.lastname@example.org>, Peter Zijlstra <email@example.com>, Pavel Emelyanov <firstname.lastname@example.org>, Herbert Poetzl <email@example.com>, Avi Kivity <firstname.lastname@example.org>, Chris Friesen <email@example.com>
Hi all,

Please find attached v4 of CFS bandwidth control; while this rebase against
some of the latest SCHED_NORMAL code is new, the features and methodology are
fairly mature at this point and have proved both effective and stable for
several workloads.

As always, all comments/feedback welcome.

Changes since v3:
- Rebased to current tip, updated to work with the new group scheduling
  accounting
- (Bug fix) Fixed a race with unthrottling (due to changing the global limit)
- (Bug fix) Fixed buddy interactions -- in particular, prevent buddy
  nominations from re-picking throttled entities

The skeleton of our approach is as follows:
- We maintain a global (per-tg) pool of unassigned quota.  Within it we track
  the bandwidth period, quota per period, and runtime remaining in the
  current period.  As bandwidth is used within a period it is decremented
  from runtime.  Runtime is currently synchronized using a spinlock.  In the
  current implementation there's no reason this couldn't be done using atomic
  ops instead; however, the spinlock allows for a little more flexibility in
  experimenting with other schemes.
- When a cfs_rq participating in a bandwidth-constrained task_group executes,
  it acquires time from the global pool in chunks of
  sysctl_sched_cfs_bandwidth_slice (default currently 10ms).  This
  synchronizes under rq->lock and is part of the update_curr path.
- Throttled entities are dequeued; we protect against their re-introduction
  to the scheduling hierarchy by checking a per-cfs_rq throttled bit.

Interface:
----------
Three new cgroupfs files are exported by the cpu subsystem:
  cpu.cfs_period_us : period over which bandwidth is to be regulated
  cpu.cfs_quota_us  : bandwidth available for consumption per period
  cpu.stat          : statistics (such as number of throttled periods and
                      total throttled time)

One important interface change that this introduces (versus the rate limits
proposal) is that the defined bandwidth becomes an absolute quantifier.
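For illustration only, the skeleton of the approach (global pool, slice-sized
acquisition, throttled bit) can be modelled in user space roughly as follows.
This is a minimal Python sketch and not the patch code: the names GlobalPool,
CfsRq, acquire, account, refill and unthrottle are made up for this example,
locking (the pool spinlock and rq->lock) is left out because the model is
single-threaded, and resetting runtime to the full quota at each period
boundary is an assumption.

```python
from dataclasses import dataclass, field

# Models sysctl_sched_cfs_bandwidth_slice (default currently 10ms), in usec.
SLICE_US = 10_000


@dataclass
class GlobalPool:
    """Hypothetical model of the per-task-group pool of unassigned quota."""
    period_us: int                      # cf. cpu.cfs_period_us
    quota_us: int                       # cf. cpu.cfs_quota_us
    runtime_us: int = field(init=False)  # runtime remaining this period

    def __post_init__(self):
        self.runtime_us = self.quota_us

    def refill(self):
        # Assumption: a new period simply resets runtime to the full quota.
        self.runtime_us = self.quota_us

    def acquire(self, want_us=SLICE_US):
        # Hand out at most one slice, and never more than what remains.
        granted = min(want_us, self.runtime_us)
        self.runtime_us -= granted
        return granted


@dataclass
class CfsRq:
    """Hypothetical model of one cfs_rq belonging to a constrained group."""
    pool: GlobalPool
    local_runtime_us: int = 0
    consumed_us: int = 0
    throttled: bool = False

    def account(self, delta_us):
        """Charge delta_us of execution; return False once throttled."""
        if self.throttled:
            return False
        self.consumed_us += delta_us
        self.local_runtime_us -= delta_us
        # Top up from the global pool one slice at a time.
        while self.local_runtime_us <= 0:
            granted = self.pool.acquire()
            if granted == 0:
                # Pool exhausted: set the throttled bit (dequeue in the patch).
                self.throttled = True
                return False
            self.local_runtime_us += granted
        return True

    def unthrottle(self):
        self.throttled = False
```

With a quota of 25ms per 100ms period, a single CfsRq charged in 1ms steps
draws three chunks from the pool (10ms, 10ms, then the remaining 5ms) and is
throttled once the pool is empty; calling refill() and unthrottle() lets it
run again in the next period.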
Previous postings:
------------------
v3: https://lkml.org/lkml/2010/10/12/44
v2: http://lkml.org/lkml/2010/4/28/88
Original posting: http://lkml.org/lkml/2010/2/12/393

Prior approaches: http://lkml.org/lkml/2010/1/5/44 ("CFS Hard limits v5")

Thanks,

- Paul