Re: [RFC patch 1/2] sched: dynamically adapt granularity with nr_running

[Posted September 14, 2010 by corbet]

From:		Mike Galbraith <efault-AT-gmx.de>
To:		Mathieu Desnoyers <mathieu.desnoyers-AT-efficios.com>
Subject:		Re: [RFC patch 1/2] sched: dynamically adapt granularity with nr_running
Date:		Mon, 13 Sep 2010 06:13:03 +0200
Message-ID:		<1284351183.7321.36.camel@marge.simson.net>
Cc:		Ingo Molnar <mingo-AT-elte.hu>, LKML <linux-kernel-AT-vger.kernel.org>, Peter Zijlstra <peterz-AT-infradead.org>, Linus Torvalds <torvalds-AT-linux-foundation.org>, Andrew Morton <akpm-AT-linux-foundation.org>, Steven Rostedt <rostedt-AT-goodmis.org>, Thomas Gleixner <tglx-AT-linutronix.de>, Tony Lindgren <tony-AT-atomide.com>

On Sun, 2010-09-12 at 14:16 -0400, Mathieu Desnoyers wrote:
> * Mike Galbraith (efault@gmx.de) wrote:
> > On Sun, 2010-09-12 at 08:14 +0200, Ingo Molnar wrote:
> > > * Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> > > 
> > > > (on a uniprocessor 2.0 GHz Pentium M)
> > > > 
> > > > * Without the patch:
> > > > 
> > > >  - wakeup-latency with SIGEV_THREAD in parallel with youtube video and
> > > >    make -j10
> > > > 
> > > > maximum latency: 50107.8 µs
> > > > average latency: 6609.2 µs
> > > > missed timer events: 0
> > > 
> > > I tried your patches on a similar UP system, using wakeup-latency.c. I 
> > > also measured the vanilla upstream kernel (cced86a) with the default 
> > > granularity settings, and also vanilla with a sched_min_granularity/3 
> > > tune (patch attached below for that).
> > > 
> > > I got the following results (make -j10 kbuild load, average of 3 runs):
> > > 
> > >  vanilla: 
> > > 
> > >   maximum latency: 38278.9 µs
> > >   average latency:  7730.1 µs
> > > 
> > >  mathieu-dyn:
> > > 
> > >   maximum latency: 28698.8 µs
> > >   average latency:  7757.1 µs
> > > 
> > >  peterz-min_gran/3:
> > > 
> > >   maximum latency: 22702.1 µs
> > >   average latency:  6684.8 µs
> > 
> > One thing that springs to mind with make is that it does vfork, so kinda
> > sorta continues running in drag, so shouldn't get credit for sleeping,
> > as that introduces bogus spread.  Post vfork parent notification time
> > adjustment may suffice, think I'll try that.
> 
> Hrm, I might be misunderstanding what you are saying here, but when a new
> process/thread is forked and woken up, we fall in the "initial" case of
> place_entity, so we increase the vruntime of a whole slice rather than getting
> credit for sleeping.
> 
> Or am I missing your point ?

Yes and no.  I'm pondering the parent, but by the same token, the vfork
child shouldn't be penalized either.

Does your latency go down drastically if you turn START_DEBIT off?
Seems like it should.  Perhaps START_DEBIT should not start a task
further right than rightmost.  I've done that before.

START_DEBIT off:

maximum latency: 19221.5 µs
average latency: 5159.0 µs
missed timer events: 0

START_DEBIT on:

maximum latency: 43901.0 µs
average latency: 8430.1 µs
missed timer events: 0

Turning it off here cut maximum latency roughly in half (I've piddled
with vfork too, though not completely).  Limiting child placement to no
further right than rightmost should help quite a bit.
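
To make the "no further right than rightmost" idea concrete, here is a
minimal userspace sketch.  It is not the kernel's place_entity(): the
function name, the flat argument list and the clamp_to_rightmost flag
are all made up for illustration, and vslice stands in for whatever
slice the scheduler would charge a new task.

```c
#include <stdint.h>

typedef uint64_t u64;

/* Wraparound-safe "a is later than b" test on virtual runtimes. */
static int vruntime_after(u64 a, u64 b)
{
	return (int64_t)(a - b) > 0;
}

/*
 * Hypothetical model of initial task placement.
 *
 * min_vruntime:       leftmost point of the timeline
 * rightmost_vruntime: vruntime of the rightmost queued task
 * vslice:             debit charged to a new task under START_DEBIT
 * start_debit:        nonzero if the START_DEBIT behaviour is enabled
 * clamp_to_rightmost: nonzero to apply the proposed limit (assumption)
 */
static u64 place_new_task(u64 min_vruntime, u64 rightmost_vruntime,
			  u64 vslice, int start_debit,
			  int clamp_to_rightmost)
{
	u64 vruntime = min_vruntime;

	if (start_debit) {
		/* Start the child one slice to the right. */
		vruntime += vslice;

		/* Proposed limit: never land beyond the rightmost task. */
		if (clamp_to_rightmost &&
		    vruntime_after(vruntime, rightmost_vruntime))
			vruntime = rightmost_vruntime;
	}

	/* Never place the task to the left of min_vruntime. */
	if (vruntime_after(min_vruntime, vruntime))
		vruntime = min_vruntime;

	return vruntime;
}
```

With min_vruntime 1000, rightmost 1500 and a slice of 2000, the
unclamped child starts at 3000 while the clamped one starts at 1500, so
it waits behind the existing tasks but no longer.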

	-Mike