Are we really chasing the right issue?

Posted Feb 24, 2012 8:16 UTC (Fri) by rvfh (subscriber, #31018)
Parent article: Short sleeps suffering from slack

> on some systems, the timer slack value is set quite high - on the order of seconds

So the problem is not just sleep(0) then! sleep(1) might sleep several seconds too... Isn't this the first issue to fix? Who decided my sleep(1) could wait several seconds and not just the 1 I coded?

To me the problem is when the sleep requested period is less than the slack value, and that's what I would fix.


Are we really chasing the right issue?

Posted Feb 24, 2012 8:45 UTC (Fri) by dlang (subscriber, #313) [Link]

The owner of the system set the slack value; why should the programmer of some random application get to override it?

Are we really chasing the right issue?

Posted Feb 24, 2012 9:29 UTC (Fri) by rvfh (subscriber, #31018) [Link]

Problem is:
* app dev says it should sleep 1 second
* sys owner says if you sleep, then you may sleep for 5 seconds

What do we do? Either
* sleep for 1 second, as requested, or
* sleep for up to 5 seconds and break the application

I think this calls for a new user-space API, such as:
unsigned int sleep_slack(unsigned int seconds, unsigned int slack);

But sleep's behaviour should not be changed.
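
Something close to this can be approximated in user space on Linux with prctl(PR_SET_TIMERSLACK), which sets the calling thread's timer slack in nanoseconds. The following is only a sketch under assumptions: the name sleep_slack comes from the proposal above, but the nanosecond unit and the int return value are choices made for this illustration, and whether a minimum slack imposed by the administrator would still win is up to the kernel, not this code.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/prctl.h>
#include <time.h>

/*
 * Hypothetical sleep_slack(): sleep for `seconds` while asking the kernel
 * to apply at most `slack_ns` nanoseconds of timer slack to this thread.
 * Note that a slack of 0 asks prctl() to reset the thread to its default
 * slack rather than to use no slack at all.
 * Returns 0 on success, -1 on error (errno is set).
 */
int sleep_slack(unsigned int seconds, unsigned long slack_ns)
{
    /* Remember the thread's current slack so it can be restored later. */
    int old_slack = prctl(PR_GET_TIMERSLACK, 0, 0, 0, 0);
    if (old_slack < 0)
        return -1;

    if (prctl(PR_SET_TIMERSLACK, slack_ns, 0, 0, 0) < 0)
        return -1;

    struct timespec req = { .tv_sec = seconds, .tv_nsec = 0 };
    struct timespec rem;
    int rc;

    /* If a signal interrupts the sleep, continue with the remaining time. */
    while ((rc = nanosleep(&req, &rem)) < 0 && errno == EINTR)
        req = rem;

    /* Restore the previous slack without clobbering nanosleep's errno. */
    int saved_errno = errno;
    prctl(PR_SET_TIMERSLACK, (unsigned long)old_slack, 0, 0, 0);
    errno = saved_errno;
    return rc;
}
```

A caller that wants one second with roughly a millisecond of slack would call sleep_slack(1, 1000000UL). The setting is per thread, so other threads in the process keep their own slack.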

Are we really chasing the right issue?

Posted Feb 24, 2012 9:55 UTC (Fri) by tglx (subscriber, #31301) [Link]

> But sleep's behaviour should not be changed.

The kernel does not change sleep() behaviour. It's the sysadmin's choice to set slack to something large. The kernel provides the mechanism, but not the policy.

Are we really chasing the right issue?

Posted Feb 24, 2012 10:20 UTC (Fri) by anselm (subscriber, #2796) [Link]

> Who decided my sleep(1) could wait several seconds and not just the 1 I coded?

The person who wrote the spec for sleep(), which says, among other things:

> The suspension time may be longer than requested due to the scheduling of other activity by the system.

So if you believe that »sleep(1)« will sleep for exactly one second, you are mistaken about how sleep() works.
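
How much longer a sleep actually takes on a given system can be measured rather than assumed. Below is a minimal sketch that times a nanosleep() call against CLOCK_MONOTONIC; the function names elapsed_ns and measure_oversleep_ns are made up for this example, and the result will vary with load and with the timer slack in effect.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Nanoseconds elapsed from `start` to `end`. */
static long long elapsed_ns(const struct timespec *start,
                            const struct timespec *end)
{
    return (long long)(end->tv_sec - start->tv_sec) * 1000000000LL
         + (end->tv_nsec - start->tv_nsec);
}

/*
 * Sleep for `request_ns` nanoseconds (assumed non-negative) and return how
 * many nanoseconds longer than requested the sleep took, or -1 on error.
 */
long long measure_oversleep_ns(long long request_ns)
{
    struct timespec req = {
        .tv_sec = request_ns / 1000000000LL,
        .tv_nsec = request_ns % 1000000000LL,
    };
    struct timespec rem, start, end;

    if (clock_gettime(CLOCK_MONOTONIC, &start) < 0)
        return -1;

    /* Keep sleeping for the remaining time if a signal interrupts us. */
    while (nanosleep(&req, &rem) < 0) {
        if (errno != EINTR)
            return -1;
        req = rem;
    }

    if (clock_gettime(CLOCK_MONOTONIC, &end) < 0)
        return -1;

    return elapsed_ns(&start, &end) - request_ns;
}
```

Calling measure_oversleep_ns(1000000000LL) reports the overshoot of a one-second sleep; the spec only promises that this number is not negative.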

oversleeping

Posted Feb 24, 2012 22:57 UTC (Fri) by giraffedata (subscriber, #1954) [Link]

I think it's more basic than the documented function of sleep(). In a non-realtime timeshared OS, the OS can take several seconds from you any time it wants, whether you did a sleep() or not. If you get to run at all, you should be grateful.

Are we really chasing the right issue?

Posted Feb 26, 2012 10:44 UTC (Sun) by IkeTo (subscriber, #2122) [Link]

> So if you believe that »sleep(1)« will sleep for exactly one second, you are mistaken about how sleep() works.

Nobody has any doubt about "sleep(1)" sleeping 1.01 seconds, or sleeping 2 whole days if the user suspended the computer. But that's a different proposition from expecting that "sleep(1)" would regularly sleep 10 seconds on a reasonably loaded system. As a developer, if I know that my program will not behave as it should when it sleeps 10 seconds instead, what other options do I have?

Are we really chasing the right issue?

Posted Feb 27, 2012 10:50 UTC (Mon) by mpr22 (subscriber, #60784) [Link]

That would depend on whether your program breaking when the delay is 10 seconds instead of 1 second is justifiable. If it is, you'll just have to document that the user needs to turn down the timer slack setting on their system. If it isn't, fix your buggy program.

Are we really chasing the right issue?

Posted Feb 27, 2012 15:51 UTC (Mon) by fuhchee (guest, #40059) [Link]

"If it is, you'll just have to document that the user needs to turn down the timer slack setting on their system."

So a single systemwide knob has to be fixed by the user's sysadmin? That doesn't seem appropriate, just to retain previous capability.

Are we really chasing the right issue?

Posted Feb 27, 2012 19:05 UTC (Mon) by dlang (subscriber, #313) [Link]

It's not a "single systemwide knob"; it's a per-cgroup knob.

Are we really chasing the right issue?

Posted Mar 1, 2012 5:24 UTC (Thu) by kevinm (guest, #69913) [Link]

If your program will work correctly when sleeping one second, but not when sleeping 10 seconds, then you either have a buggy program (which is probably already failing on heavily loaded systems) or a program that should be using a real-time scheduling class.

A program that calls sleep(n) must already expect to sleep for at least n seconds. The timer-slack is just making these bugs more visible.

Are we really chasing the right issue?

Posted Mar 3, 2012 2:05 UTC (Sat) by IkeTo (subscriber, #2122) [Link]

Say I want to create a stopwatch application: the user specifies a number of seconds to wait until an alert is shown, and meanwhile the stopwatch keeps displaying the amount of time remaining, updated at one-second intervals. No user cares if the display is updated 0.1 second too late, so the original sleep works perfectly. There is no need for a "real-time scheduling class".

With the timer slack, all of a sudden users will see the timer being updated once every fifteen seconds, with the final alert similarly late. No user will miss such a "bug".

Now what option do I have?

1. I can ask the user to make the program setuid root so that it can use real-time scheduling, hoping that they have root privileges, and making every security-sensitive user raise an eyebrow.

2. I can ask the user to change the cgroup-wide timer slack value, hoping that they have root privileges, and making the whole system waste energy until the user/admin remembers to reset the timer slack value, because its processes are now waking up more often than is optimal.

3. I can stop sleeping altogether and instead use a busy loop at a very high nice level. That seems very drastic: it wastes a processor, wastes power, and puts the system load at 1. But in a sense it is the best solution, because it only affects the system for as long as the stopwatch runs and does not need root privileges.

How's that sound?

Are we really chasing the right issue?

Posted Mar 7, 2012 17:22 UTC (Wed) by mpr22 (subscriber, #60784) [Link]

4. Write your program with a client/daemon architecture. The daemon can be activated as root by the system's daemon-managing services, then drop its privileges once it has given itself a real-time scheduling class. The client connects to the daemon via a socket, then sits in a blocking read() waiting for the once-a-second heartbeat packets from the daemon. If the daemon doesn't currently have any clients, it can just sit in a blocking accept() call until one shows up.

Admittedly this stops people on machines they don't administer from installing and using your application. However, if the user isn't trusted to have administrative access to the system, they probably shouldn't be self-installing applications that require policy violations to work as expected anyway.

Are we really chasing the right issue?

Posted Mar 9, 2012 8:41 UTC (Fri) by Thomas (subscriber, #39963) [Link]

Using a timer?
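
One reading of this suggestion is to sleep until an absolute deadline instead of sleeping for a relative interval. The sketch below uses clock_nanosleep() with TIMER_ABSTIME on CLOCK_MONOTONIC; the helper name sleep_until_next_tick is made up for this example. Timer slack can still delay each individual wakeup for a non-real-time thread, so this does not make ticks punctual; it only keeps lateness from accumulating from one tick to the next.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/*
 * Advance `*deadline` by `period_ns` nanoseconds (assumed non-negative)
 * and sleep until that absolute time on CLOCK_MONOTONIC.
 * Returns 0 on success or an error number on failure.
 */
int sleep_until_next_tick(struct timespec *deadline, long long period_ns)
{
    deadline->tv_sec += period_ns / 1000000000LL;
    deadline->tv_nsec += period_ns % 1000000000LL;
    if (deadline->tv_nsec >= 1000000000L) {
        deadline->tv_sec += 1;
        deadline->tv_nsec -= 1000000000L;
    }

    int rc;
    /*
     * clock_nanosleep() returns the error number directly instead of
     * setting errno. With an absolute deadline, an interrupted sleep can
     * simply be retried with the same deadline.
     */
    do {
        rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, deadline, NULL);
    } while (rc == EINTR);

    return rc;
}
```

A stopwatch would read CLOCK_MONOTONIC once into a struct timespec, then call sleep_until_next_tick(&deadline, 1000000000LL) in a loop, redrawing after each return. If one wakeup is late, the next deadline is unaffected.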

Are we really chasing the right issue?

Posted Feb 24, 2012 10:26 UTC (Fri) by mpr22 (subscriber, #60784) [Link]

> Who decided my sleep(1) could wait several seconds and not just the 1 I coded?

Linus, by virtue of deciding in 1991 that his new kernel would be an ordinary preemptively multitasking kernel, rather than something more exotic. sleep() has always had the property on Unix-like OSes that your process might sleep longer than you expect.

Are we really chasing the right issue?

Posted Mar 1, 2012 13:29 UTC (Thu) by slashdot (guest, #22014) [Link]

The timer slack must be at most around 1-10ms if the system is supposed to correctly run arbitrary software.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds