So the problem is not just sleep(0) then! sleep(1) might sleep several seconds too... Isn't this the first issue to fix? Who decided my sleep(1) could wait several seconds and not just the 1 I coded?
To me the problem is when the requested sleep period is less than the slack value, and that's what I would fix.
Are we really chasing the right issue?
Posted Feb 24, 2012 8:45 UTC (Fri) by dlang (✭ supporter ✭, #313)
Posted Feb 24, 2012 9:29 UTC (Fri) by rvfh (subscriber, #31018)
What do we do? Either
* sleep for 1 second, as requested, or
* sleep for up to 5 seconds and break the application
I think this calls for a new user-space API, such as:
unsigned int sleep_slack(unsigned int seconds, unsigned int slack);
But sleep's behaviour should not be changed.
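A rough sketch of how such a helper might look in user space today, built on the existing per-thread timer slack control in prctl() (PR_SET_TIMERSLACK). The name sleep_slack comes from the proposal above; the nanosecond unit for the slack argument, the int return value, and the restore-on-exit behaviour are assumptions made for illustration. Whether a larger system-wide or cgroup-wide slack set by the administrator would still take precedence is not something this sketch can control.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/prctl.h>
#include <time.h>

/* Hypothetical helper named after the proposal above.
 * Sleeps for `seconds` while asking the kernel to allow at most
 * `slack_ns` nanoseconds of timer slack for the calling thread.
 * Returns 0 on success, -1 on error with errno set. */
int sleep_slack(unsigned int seconds, unsigned long slack_ns)
{
    /* Remember the thread's current slack so it can be restored. */
    int old_slack = prctl(PR_GET_TIMERSLACK, 0, 0, 0, 0);
    if (old_slack < 0)
        return -1;

    /* A value of 0 means "reset to the default", so use 1 ns as the
     * smallest explicit request. */
    if (slack_ns == 0)
        slack_ns = 1;
    if (prctl(PR_SET_TIMERSLACK, slack_ns, 0, 0, 0) < 0)
        return -1;

    struct timespec req = { .tv_sec = seconds, .tv_nsec = 0 };
    struct timespec rem;
    int rc;

    /* Restart the sleep with the remaining time if a signal interrupts it. */
    while ((rc = nanosleep(&req, &rem)) == -1 && errno == EINTR)
        req = rem;

    /* Put the previous slack back without losing nanosleep's errno. */
    int saved_errno = errno;
    prctl(PR_SET_TIMERSLACK, (unsigned long)old_slack, 0, 0, 0);
    errno = saved_errno;
    return rc;
}
```

A caller that needs a tight one-second tick could then write sleep_slack(1, 1000), leaving plain sleep() untouched for everything else.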
Posted Feb 24, 2012 9:55 UTC (Fri) by tglx (subscriber, #31301)
The kernel does not change sleep() behaviour. It's the sysadmin's choice to set slack to something large. The kernel provides the mechanism, but not the policy.
Posted Feb 24, 2012 10:20 UTC (Fri) by anselm (subscriber, #2796)
Who decided my sleep(1) could wait several seconds and not just the 1 I coded?
The person who wrote the spec for sleep(), which says, among other things:
The suspension time may be longer than requested due to the scheduling of other activity by the system.
So if you believe that »sleep(1)« will sleep for exactly one second, you are mistaken about how sleep() works.
Posted Feb 24, 2012 22:57 UTC (Fri) by giraffedata (subscriber, #1954)
Posted Feb 26, 2012 10:44 UTC (Sun) by IkeTo (subscriber, #2122)
Nobody has any doubt about "sleep(1)" sleeping 1.01 seconds, or sleeping 2 whole days if the user suspended the computer. But that's a different proposition from expecting that "sleep(1)" would regularly sleep 10 seconds on a reasonably loaded system. As a developer, if I know that my program will not behave as it should when it sleeps 10 seconds instead, what other options do I have?
Posted Feb 27, 2012 10:50 UTC (Mon) by mpr22 (subscriber, #60784)
Posted Feb 27, 2012 15:51 UTC (Mon) by fuhchee (subscriber, #40059)
So a single systemwide knob has to be fixed by the user's sysadmin? That doesn't seem appropriate, just to retain previous capability.
Posted Feb 27, 2012 19:05 UTC (Mon) by dlang (✭ supporter ✭, #313)
Posted Mar 1, 2012 5:24 UTC (Thu) by kevinm (guest, #69913)
A program that calls sleep(n) must already expect to sleep for at least n seconds. The timer-slack is just making these bugs more visible.
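One way to write a program that tolerates oversleeping is to read the elapsed time from a clock instead of counting sleeps, and to sleep until absolute deadlines so that a late wakeup does not push every later tick back. The following is a minimal sketch under those assumptions; now_ns, sleep_until_ns and run_ticks are illustrative names, not an existing API. It keeps the reported time correct, but it cannot make a late wakeup arrive any earlier.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <time.h>

#define NS_PER_SEC 1000000000LL

/* Current time on the monotonic clock, in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * NS_PER_SEC + ts.tv_nsec;
}

/* Sleep until an absolute monotonic time, restarting after signals.
 * Returns 0 on success or an error number (clock_nanosleep does not
 * use errno). */
static int sleep_until_ns(int64_t deadline_ns)
{
    struct timespec ts = {
        .tv_sec  = deadline_ns / NS_PER_SEC,
        .tv_nsec = deadline_ns % NS_PER_SEC,
    };
    int rc;
    while ((rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                                 &ts, NULL)) == EINTR)
        ;
    return rc;
}

/* Run `ticks` periods of `period_ns` each. After every wakeup the
 * optional callback receives the elapsed time as measured by the
 * clock, so a late wakeup is reported accurately. Returns the total
 * measured elapsed time in nanoseconds. */
int64_t run_ticks(unsigned int ticks, int64_t period_ns,
                  void (*on_tick)(int64_t elapsed_ns))
{
    int64_t start = now_ns();
    for (unsigned int i = 1; i <= ticks; i++) {
        /* Deadlines are computed from the start time, not from the
         * previous wakeup, so lateness does not accumulate. */
        sleep_until_ns(start + (int64_t)i * period_ns);
        if (on_tick)
            on_tick(now_ns() - start);
    }
    return now_ns() - start;
}
```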
Posted Mar 3, 2012 2:05 UTC (Sat) by IkeTo (subscriber, #2122)
With the timer slack, all of a sudden users will see the timer being updated once every fifteen seconds, and the final alert will be similarly late. No user will miss such a "bug".
Now what options do I have?
1. I can ask the user to make the program setuid root so that it can use real-time scheduling, hoping that they have root privileges, and making every security-sensitive user raise an eyebrow.
2. I can ask the user to change the cgroup-wide timer slack value, hoping that they have root privileges, and making the whole system waste energy until the user/admin remembers to reset the timer slack value, because everything is now sleeping less than it optimally would.
3. I can stop sleeping altogether, and instead use a busy loop at a very high nice level. That seems very drastic: it wastes a processor, wastes power, and raises the system load to 1. But in a sense it is the best solution, because it only affects the system for as long as the stopwatch runs, and it does not need root privileges.
How's that sound?
Posted Mar 7, 2012 17:22 UTC (Wed) by mpr22 (subscriber, #60784)
4. Write your program with a client/daemon architecture. The daemon can be activated as root by the system's daemon-managing services, then drop its privileges once it has given itself a real-time scheduling class. The client connects to the daemon via a socket, then sits in a blocking read() waiting for the once-a-second heartbeat packets from the daemon. If the daemon doesn't currently have any clients, it can just sit in a blocking accept() call until one shows up.
Admittedly this stops people on machines they don't administer from installing and using your application. However, if the user isn't trusted to have administrative access to the system, they probably shouldn't be self-installing applications that require policy violations to work as expected anyway.
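The privileged start-up step of such a daemon could look roughly like the sketch below: switch to a real-time scheduling class while still root, then give up root permanently. The function name, the choice of SCHED_FIFO at the lowest priority, and the target uid/gid parameters are assumptions for illustration; the socket handling and heartbeat loop are left out.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <grp.h>
#include <sched.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical start-up helper for the daemon described above.
 * Must be called while the process still has the privilege to select
 * a real-time policy. Returns 0 on success, -1 on error (errno set). */
int enter_realtime_and_drop(uid_t uid, gid_t gid)
{
    struct sched_param param = {
        /* Lowest real-time priority is enough to run ahead of normal tasks. */
        .sched_priority = sched_get_priority_min(SCHED_FIFO),
    };

    /* Step 1: select the real-time class while still privileged. */
    if (sched_setscheduler(0, SCHED_FIFO, &param) < 0)
        return -1;

    /* Step 2: drop privileges. Order matters: supplementary groups
     * first, then the group ID, then the user ID last, because after
     * setuid() the process can no longer change the others. */
    if (setgroups(0, NULL) < 0)
        return -1;
    if (setgid(gid) < 0)
        return -1;
    if (setuid(uid) < 0)
        return -1;

    /* The scheduling policy chosen in step 1 stays in effect. */
    return 0;
}
```

When started without sufficient privilege the call fails with EPERM, which is the case the comment above concedes: the user has to be able to install the daemon as root in the first place.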
Posted Mar 9, 2012 8:41 UTC (Fri) by Thomas (subscriber, #39963)
Posted Feb 24, 2012 10:26 UTC (Fri) by mpr22 (subscriber, #60784)
Linus, by virtue of deciding in 1991 that his new kernel would be an ordinary preemptively multitasking kernel, rather than something more exotic. sleep() has always had the property on Unix-like OSes that your process might sleep longer than you expect.
Posted Mar 1, 2012 13:29 UTC (Thu) by slashdot (guest, #22014)
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds