
Nohz cpusets v3 (adaptive tickless kernel)

From:  Frederic Weisbecker
To:  LKML
Subject:  [RFC][PATCH 00/41] Nohz cpusets v3 (adaptive tickless kernel)
Date:  Tue, 1 May 2012 01:54:34 +0200
Message-ID:  <>
Cc:  Frederic Weisbecker, Alessio Igor Bogani, Andrew Morton, Avi Kivity, Chris Metcalf, Christoph Lameter, Daniel Lezcano, Geoff Levand, Gilad Ben Yossef, Hakan Akkan, Ingo Molnar, Kevin Hilman, Max Krasnyansky, "Paul E. McKenney", Peter Zijlstra, Stephen Hemminger, Steven Rostedt, Sven-Thorsten Dietrich, Thomas Gleixner


A summary of what this is about can be found here:

Changes since v2:

	* Correctly handle update of the cpuset mask when the nohz
	flag is set (courtesy of Hakan Akkan)

	* Handle rq clock. This introduces a new update_nohz_rq_clock()
	helper that sites which make use of rq->clock can call if they want to
	ensure the rq clock doesn't have a stale value due to the targeted CPU
	being tickless. A tickless CPU doesn't maintain its rq clock, because
	it no longer calls scheduler_tick()->update_rq_clock() periodically.
	I think I've added this manual call to every call site that needed it.
	I may have missed some though, or we may forget to handle tickless
	CPUs in future code. So I think we need to add some automated debug checks
	to catch that.

	* Fix a warning reported by Gilad Ben Yossef: we flush the time on
	pre-schedule, and then the tick is restarted from an IPI before we
	restart it manually on post-schedule. From there we try to flush the
	time again, but ts->jiffies_saved_whence is already set to SAVED_NONE
	because we already flushed the time. This was triggering a spurious
	warning.
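
To make the rq clock point above concrete, here is a rough user-space
model of the idea. It is only a sketch, not the code from the patches:
struct rq is reduced to two fields, fake_now stands in for the scheduler
clock, and rq_clock_checked() is a hypothetical example of the kind of
automated debug check mentioned above. The assumption that
update_nohz_rq_clock() only refreshes the clock when the CPU is tickless
is mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the per-CPU runqueue; only the fields needed here. */
struct rq {
	uint64_t clock;    /* last timestamp recorded for this runqueue */
	bool     tickless; /* true when the periodic tick is stopped */
};

/* Hypothetical time source standing in for the scheduler clock. */
static uint64_t fake_now;

/* Models update_rq_clock(): refresh rq->clock from the time source. */
static void update_rq_clock(struct rq *rq)
{
	rq->clock = fake_now;
}

/*
 * Models the periodic scheduler_tick()->update_rq_clock() path.
 * On a tickless CPU the tick does not fire, so nothing is updated.
 */
static void scheduler_tick(struct rq *rq)
{
	if (!rq->tickless)
		update_rq_clock(rq);
}

/*
 * Models update_nohz_rq_clock(): call sites that read rq->clock call
 * this first, so that a tickless CPU does not expose a stale value.
 * (Assumed behaviour: a no-op when the tick is still running.)
 */
static void update_nohz_rq_clock(struct rq *rq)
{
	if (rq->tickless)
		update_rq_clock(rq);
}

/*
 * Hypothetical debug check: reading the clock of a tickless CPU
 * without a prior manual update trips the assertion.
 */
static uint64_t rq_clock_checked(const struct rq *rq)
{
	assert(!rq->tickless || rq->clock == fake_now);
	return rq->clock;
}
```

A call site would call update_nohz_rq_clock(rq) right before reading
rq->clock; with the debug accessor, forgetting that call on a tickless
CPU is caught at the read instead of silently using a stale value.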

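The double-flush case behind the warning fix can be modelled the same
way. Again this is only a sketch under assumptions: jiffies_saved_whence
and SAVED_NONE come from the description above, while SAVED_USER,
SAVED_SYS, saved_jiffies, save_jiffies() and flush_cputime() are
hypothetical names used for illustration. The point is that a second
flush, after the IPI path already flushed, should be a silent no-op
rather than a warning.

```c
#include <stdint.h>

/* SAVED_NONE is from the text; the other two states are hypothetical. */
enum saved_whence { SAVED_NONE, SAVED_USER, SAVED_SYS };

/* Toy stand-in for the per-CPU tick state. */
struct tick_sched {
	enum saved_whence jiffies_saved_whence; /* where time accrues */
	uint64_t          saved_jiffies;        /* hypothetical snapshot */
};

/* Accumulated times, in jiffies, for this toy model. */
static uint64_t user_time, sys_time;

/* Start accumulating time from 'now' on behalf of 'whence'. */
static void save_jiffies(struct tick_sched *ts, enum saved_whence whence,
			 uint64_t now)
{
	ts->jiffies_saved_whence = whence;
	ts->saved_jiffies = now;
}

/*
 * Account the time elapsed since the snapshot, then mark it flushed.
 * If it was already flushed (SAVED_NONE), there is nothing to account
 * and nothing to warn about.
 */
static void flush_cputime(struct tick_sched *ts, uint64_t now)
{
	uint64_t delta;

	if (ts->jiffies_saved_whence == SAVED_NONE)
		return;

	delta = now - ts->saved_jiffies;
	if (ts->jiffies_saved_whence == SAVED_USER)
		user_time += delta;
	else
		sys_time += delta;

	ts->jiffies_saved_whence = SAVED_NONE;
}
```

In this model the pre-schedule flush accounts the pending time, and the
later flush from the post-schedule path finds SAVED_NONE and returns
without touching the counters.
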
Still a lot to do. I'm now maintaining the TODO list there:

The git branch can be fetched from:

Frederic Weisbecker (40):
  nohz: Separate idle sleeping time accounting from nohz logic
  nohz: Make nohz API agnostic against idle ticks cputime accounting
  nohz: Rename ts->idle_tick to ts->last_tick
  nohz: Move nohz load balancer selection into idle logic
  nohz: Move ts->idle_calls incrementation into strict idle logic
  nohz: Move next idle expiry time record into idle logic area
  cpuset: Set up interface for nohz flag
  nohz: Try not to give the timekeeping duty to an adaptive tickless
  x86: New cpuset nohz irq vector
  nohz: Adaptive tick stop and restart on nohz cpuset
  nohz/cpuset: Don't turn off the tick if rcu needs it
  nohz/cpuset: Wake up adaptive nohz CPU when a timer gets enqueued
  nohz/cpuset: Don't stop the tick if posix cpu timers are running
  nohz/cpuset: Restart tick when nohz flag is cleared on cpuset
  nohz/cpuset: Restart the tick if printk needs it
  rcu: Restart the tick on non-responding adaptive nohz CPUs
  rcu: Restart tick if we enqueue a callback in a nohz/cpuset CPU
  nohz: Generalize tickless cpu time accounting
  nohz/cpuset: Account user and system times in adaptive nohz mode
  nohz/cpuset: New API to flush cputimes on nohz cpusets
  nohz/cpuset: Flush cputime on threads in nohz cpusets when waiting
  nohz/cpuset: Flush cputimes on procfs stat file read
  nohz/cpuset: Flush cputimes for getrusage() and times() syscalls
  x86: Syscall hooks for nohz cpusets
  x86: Exception hooks for nohz cpusets
  x86: Add adaptive tickless hooks on do_notify_resume()
  nohz: Don't restart the tick before scheduling to idle
  sched: Comment on rq->clock correctness in ttwu_do_wakeup() in nohz
  sched: Update rq clock on nohz CPU before migrating tasks
  sched: Update rq clock on nohz CPU before setting fair group shares
  sched: Update rq clock on tickless CPUs before calling
  sched: Update rq clock earlier in unthrottle_cfs_rq
  sched: Update clock of nohz busiest rq before balancing
  sched: Update rq clock before idle balancing
  sched: Update nohz rq clock before searching busiest group on load
  rcu: New rcu_user_enter() and rcu_user_exit() APIs
  rcu: New rcu_user_enter_irq() and rcu_user_exit_irq() APIs
  rcu: Switch to extended quiescent state in userspace from nohz cpuset
  nohz: Exit RCU idle mode when we schedule before resuming userspace
  nohz/cpuset: Disable under some configs

Hakan Akkan (1):
  nohz/cpuset: enable addition&removal of cpus while in adaptive nohz

 arch/Kconfig                       |    3 +
 arch/x86/Kconfig                   |    1 +
 arch/x86/include/asm/entry_arch.h  |    3 +
 arch/x86/include/asm/hw_irq.h      |    7 +
 arch/x86/include/asm/irq_vectors.h |    2 +
 arch/x86/include/asm/smp.h         |   11 +
 arch/x86/include/asm/thread_info.h |   10 +-
 arch/x86/kernel/entry_64.S         |   12 +-
 arch/x86/kernel/irqinit.c          |    4 +
 arch/x86/kernel/ptrace.c           |   10 +
 arch/x86/kernel/signal.c           |    3 +
 arch/x86/kernel/smp.c              |   26 ++
 arch/x86/kernel/traps.c            |   20 +-
 arch/x86/mm/fault.c                |   13 +-
 fs/proc/array.c                    |    2 +
 include/linux/cpuset.h             |   29 ++
 include/linux/kernel_stat.h        |    2 +
 include/linux/posix-timers.h       |    1 +
 include/linux/rcupdate.h           |    8 +
 include/linux/sched.h              |   10 +-
 include/linux/tick.h               |   75 ++++--
 init/Kconfig                       |    8 +
 kernel/cpuset.c                    |  141 +++++++++-
 kernel/exit.c                      |    8 +
 kernel/posix-cpu-timers.c          |   12 +
 kernel/printk.c                    |   15 +-
 kernel/rcutree.c                   |  150 ++++++++--
 kernel/sched/core.c                |  112 ++++++++-
 kernel/sched/fair.c                |   39 +++-
 kernel/sched/sched.h               |   29 ++
 kernel/softirq.c                   |    6 +-
 kernel/sys.c                       |    6 +
 kernel/time/tick-sched.c           |  542 +++++++++++++++++++++++++++++-------
 kernel/time/timer_list.c           |    7 +-
 kernel/timer.c                     |    2 +-
 35 files changed, 1148 insertions(+), 181 deletions(-)


