A new kernel timer API
In current Linux kernels, internal time (for most purposes) is measured in "jiffies," which is really just a counter that is incremented each time a timer interrupt happens. The new time code supersedes jiffies with an absolute, monotonically increasing count of nanoseconds. A reference to jiffies thus becomes a call to:
nsec_t do_monotonic_clock(void);
Using nanoseconds allows kernel code to work with high-resolution time in real-world units. That, in turn, lets kernel developers forget about the (error-prone) conversions between jiffies and real-world time which are currently necessary.
Nishanth's add-on patch changes the timer subsystem to use nanoseconds as well. The current add_timer() and mod_timer() interfaces remain supported, but are deprecated. The new interface for setting (or modifying) a timer is:
int set_timer_nsecs(struct timer_list *timer, nsec_t expires);
void set_timer_on_nsecs(struct timer_list *timer, nsec_t expires, int cpu);
These functions cause the given timer to go off at expires, which is an absolute nanosecond count. Usually, expires will be calculated by adding the desired delay (in nanoseconds) to whatever do_monotonic_clock() returns.
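The expiry calculation described above can be sketched in userland C. This is only an illustration, not kernel code: nsec_t is assumed here to be a 64-bit unsigned integer, do_monotonic_clock() is emulated with the POSIX clock_gettime() call, and expiry_after() is a hypothetical helper that is not part of the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Assumption: nsec_t is a 64-bit unsigned count of nanoseconds. */
typedef uint64_t nsec_t;

#define NSEC_PER_SEC 1000000000ULL

/* Userland stand-in for the kernel's do_monotonic_clock(). */
nsec_t do_monotonic_clock(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc; /* silence unused warning when NDEBUG is defined */
    return (nsec_t)ts.tv_sec * NSEC_PER_SEC + (nsec_t)ts.tv_nsec;
}

/* Hypothetical helper: absolute expiry time for a relative delay. */
nsec_t expiry_after(nsec_t delay_ns)
{
    return do_monotonic_clock() + delay_ns;
}
```

In kernel code, the value returned by something like expiry_after() is what would be passed as the expires argument of set_timer_nsecs().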
It's worth noting that this patch changes the meaning of the expires field in the timer_list structure. This field is now represented in an internal "timer intervals" unit, rather than in jiffies. If the old add_timer() and mod_timer() interfaces are used, the expires field will be silently converted to the internal format. Code which performs calculations on expires (by increasing the delay and calling mod_timer(), for example) could be in for a surprise.
This patch also deprecates schedule_timeout(), in favor of these functions:
nsec_t schedule_timeout_nsecs(nsec_t timeout);
unsigned long schedule_timeout_usecs(unsigned long usecs);
unsigned int schedule_timeout_msecs(unsigned int msecs);
All three of these functions will set a timer for the given delay (which is a relative value, not absolute), then call schedule().
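The three variants differ only in the unit of the relative delay they accept. A rough sketch of how the coarser units map onto nanoseconds follows; the helper names are hypothetical, nsec_t is assumed to be a 64-bit unsigned integer, and nothing here is taken from the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: nsec_t is a 64-bit unsigned count of nanoseconds. */
typedef uint64_t nsec_t;

#define NSEC_PER_USEC 1000ULL
#define NSEC_PER_MSEC 1000000ULL

/* Hypothetical helper: relative delay in microseconds to nanoseconds. */
nsec_t usecs_to_nsecs(unsigned long usecs)
{
    /* On 64-bit platforms a huge usecs value could overflow nsec_t. */
    assert(usecs <= UINT64_MAX / NSEC_PER_USEC);
    return (nsec_t)usecs * NSEC_PER_USEC;
}

/* Hypothetical helper: relative delay in milliseconds to nanoseconds.
 * A 32-bit millisecond count always fits in 64 bits of nanoseconds. */
nsec_t msecs_to_nsecs(unsigned int msecs)
{
    return (nsec_t)msecs * NSEC_PER_MSEC;
}
```

With helpers like these, a 250 ms delay and a 250000 us delay describe the same relative timeout.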
Posted May 19, 2005 5:47 UTC (Thu)
by brouhaha (subscriber, #1698)
[Link] (10 responses)
Posted May 19, 2005 9:44 UTC (Thu)
by kleptog (subscriber, #1183)
[Link] (8 responses)
I don't really know how much smaller we can make CPUs and stuff, but I believe we're reaching the point where higher clock-speeds are useless and we have to start doing more per clock cycle: hyperthreading, multicore, grid computing, etc. If this is the case, a higher resolution than nanoseconds does not seem particularly useful.
Have a nice day,
Posted May 19, 2005 17:50 UTC (Thu)
by brouhaha (subscriber, #1698)
[Link] (7 responses)
My point wasn't so much that we need 1 ps timing precision, as that we may well need better than 1 ns. There's three orders of magnitude difference there. If we don't want to use 1 ps, we could certainly use 10 ps, 100 ps, or even 37.2 ps as the unit. But 1 ps seems somewhat more convenient.
Posted May 19, 2005 18:42 UTC (Thu)
by hamjudo (guest, #363)
[Link] (3 responses)
To the parent poster, picosecond timers will be useful, even on chips that are many picoseconds across. Note how well the Network Time Protocol can synchronize clocks to better than millisecond precision, even though the systems themselves are many light milliseconds apart (assuming appropriate network interconnect).
Posted May 20, 2005 23:51 UTC (Fri)
by giraffedata (guest, #1954)
[Link] (2 responses)
You're right that brouhaha confused the units (he said 4 seconds worth of picoseconds fit in 32 bits; it's really 4 seconds worth of nanoseconds). But you're introducing a "fits" that nobody's talking about here -- apparently you're intending for 32 bits to count one second.
Note that the interface provides multiple granularities/ranges from which to choose. You can specify your interval in milliseconds, microseconds, or nanoseconds. That in no way means you can actually get that kind of precision out of the timer. There's really no reason not to throw in picoseconds, if only to save having to answer the question.
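The range arithmetic in this exchange can be checked with a few lines of C. This is a sketch only; counter_range_secs() is a hypothetical helper that returns how many whole seconds an unsigned counter of a given width can hold for a given unit.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
#define PSEC_PER_SEC 1000000000000ULL
#define SEC_PER_DAY  86400ULL

/* Whole seconds representable by an unsigned counter of `bits` bits
 * whose unit occurs `units_per_sec` times per second. */
uint64_t counter_range_secs(unsigned bits, uint64_t units_per_sec)
{
    assert(bits >= 1 && bits <= 64 && units_per_sec > 0);
    uint64_t max = (bits == 64) ? UINT64_MAX : ((1ULL << bits) - 1);
    return max / units_per_sec;
}
```

A 32-bit count of nanoseconds covers about 4.29 seconds, while a 32-bit count of picoseconds covers only about 4.3 milliseconds (zero whole seconds). A 64-bit count of nanoseconds covers roughly 584 years; a 64-bit count of picoseconds covers roughly 213 days.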
Posted May 23, 2005 16:25 UTC (Mon)
by spitzak (guest, #4593)
[Link] (1 responses)
Is there some good reason why a power of ten should be used? Is it because of rounding errors from times specified in decimal numbers of seconds?
Posted May 24, 2005 2:40 UTC (Tue)
by giraffedata (guest, #1954)
[Link]
Remember that we're talking about an external interface here -- the question is in what units would a user of the timer facility want to specify a duration? Virtually nobody measures time in binary units; we all think of time in milliseconds, nanoseconds, etc.
The Unix time_t type (which I think is what you're referring to as the Unix clock) doesn't actually figure in anywhere here -- this is a value that specifies a duration, not a point in time; and if it ever gets added to a point in time, that time is in the kernel internal format, which is a count of clock ticks.
Posted May 19, 2005 22:24 UTC (Thu)
by kleptog (subscriber, #1183)
[Link] (2 responses)
My personal feeling is that measuring less than a nanosecond is not useful, given that the moment you're accessing something off-chip (like, say, memory) you're going to be delayed by tens of thousands of picoseconds, and memory latency is not reducing anywhere near as fast as clock speeds.
But hey, I'm willing to be proved wrong.
Posted May 22, 2005 15:17 UTC (Sun)
by haraldt (guest, #961)
[Link] (1 responses)
Processors are asynchronous even today, aren't they? Instructions may have to run through a lot of clock ticks this way, but it's all a matter of resolution. Won't promise this is going to happen, but hey, it's an idea?
Posted May 22, 2005 15:39 UTC (Sun)
by haraldt (guest, #961)
[Link]
Err.. that A and B are in multiples of a third millimeter apart. You'd probably need asynchronous buses, asynchronous memory devices etc. too.
Posted May 26, 2005 20:53 UTC (Thu)
by j1m+5n0w (guest, #20285)
[Link]
I worked on a project for a while implementing a high-precision timer mechanism in Linux. We used the APIC timer, which gave us an accuracy of about 4 microseconds at best (worst case was much, much worse due to non-preemptible kernel sections). Linux is at the point now where non-preemptible sections longer than a couple of milliseconds might happen occasionally, but they're relatively rare, whereas latencies of a couple hundred microseconds happen all the time. That would imply that a wakeup timer with a granularity much less than a few hundred microseconds won't be all that useful, since it can't make any guarantees, so there's currently not much of a need for timer APIs with a granularity finer than microseconds or nanoseconds.
One big problem is inconsistent interfaces. IIRC nanosleep uses timespecs (32 bits for seconds, 32 bits for nanoseconds), select uses timevals (32 bits for seconds, 32 bits for microseconds), poll uses a 32 bit millisecond value, itimers use timeval, gettimeofday uses timeval, and aio (I think) uses timespecs. Timespec seems to make the most sense, since it can be used for very long or very short timeouts, and doesn't waste many bits (you might as well use the maximum precision you can get for free). Timeval is almost as good, but microseconds are kind of sloppy for gettimeofday, which might be able to tell what time it is with greater accuracy (though the system call takes about a microsecond to complete, so maybe the point is moot). Poll really shouldn't have used a single 32 bit value - it's too coarse for high-precision timeouts, and can't be used for very long timeouts either.
Someone else in this thread suggested using a 64 bit value of 2^-32 second units, which appeals to me but probably not everyone else. If the system call interface could standardize on timespec for everything time-related, that would be fine with me. Unfortunately, the system call interface is more or less etched in stone, so I don't foresee anyone changing it anytime soon.
Another change I would like to see, though I don't know if anyone else does, is to have versions of nanosleep, select, poll, etc. that use absolute time for their timeouts, rather than relative time. This ensures that the time lost during system call entry is accounted for properly. It also means that the kernel has to handle the case where the timer is expired before it's even added to the queue.
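For the plain sleep case, POSIX clock_nanosleep() already accepts a TIMER_ABSTIME flag that gives this behaviour. The sketch below shows the idea under that assumption; timespec_add_ns() and sleep_until() are hypothetical helper names, and nothing comparable is implied for select or poll.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Add a non-negative relative delay (in ns) to an absolute timespec,
 * keeping tv_nsec normalized to [0, NSEC_PER_SEC). */
struct timespec timespec_add_ns(struct timespec t, long delay_ns)
{
    assert(delay_ns >= 0);
    t.tv_sec += delay_ns / NSEC_PER_SEC;
    t.tv_nsec += delay_ns % NSEC_PER_SEC;
    if (t.tv_nsec >= NSEC_PER_SEC) {
        t.tv_sec += 1;
        t.tv_nsec -= NSEC_PER_SEC;
    }
    return t;
}

/* Sleep until an absolute CLOCK_MONOTONIC deadline. Because the deadline
 * is absolute, restarting after a signal cannot lengthen the sleep.
 * Returns 0 on success or an errno value. */
int sleep_until(const struct timespec *deadline)
{
    int rc;

    do {
        rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, deadline, NULL);
    } while (rc == EINTR);
    return rc;
}
```

A deadline that has already passed makes clock_nanosleep() return immediately, which is the "expired before it's even queued" case mentioned above.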
Posted May 19, 2005 13:04 UTC (Thu)
by davecb (subscriber, #1574)
[Link] (1 responses)
I'm a performance engineer and tend to depend
--dave
Posted May 19, 2005 17:14 UTC (Thu)
by jsbarnes (guest, #4096)
[Link]
Posted May 20, 2005 21:42 UTC (Fri)
by xav (guest, #18536)
[Link]
Using nanoseconds for the unit seems like it might be slightly short-sighted. It's probably fine for now, but will it be too coarse twenty years from now? Wouldn't it be better to use picoseconds as the unit? That still allows for over four seconds to be represented in an unsigned 32-bit integer, or hundreds of years in a 64-bit integer.
I'm not sure. In one picosecond, light has travelled about one third of a millimetre (a milli-foot?). Electrons not even that far. In a nanosecond, light moves about 30cm (a foot for you non-SI users).
I believe we're reaching the point where higher clock-speeds are useless and we have to start doing more per clock cycle,
People have been saying that for at least fifteen years, and the clock rates keep going up. As Feynman said, "there's plenty of room at the bottom". Eventually we will hit the physical limits, but we aren't there yet.
But you got your units confused. The smallest power of ten unit that fits is the nanosecond. The smallest power of 2 unit is 2^-32 seconds, which is about 250 picoseconds.
But you got your units confused. The smallest power of ten unit that fits is the nanosecond. The smallest power of 2 unit is 2^-32 seconds, which is about 250 picoseconds.
That actually sounds like a useful and natural unit to use, and certainly easier for programmers to remember. It would allow the upper 32 bits to be equal to the Unix clock and allow conversion to a floating-point number of seconds without rounding errors.
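A minimal sketch of such a 32.32 fixed-point format follows. The type and function names are hypothetical, and this is only one way the layout suggested above might look.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32.32 fixed-point time value: the upper 32 bits hold whole
 * seconds, the lower 32 bits hold the fraction in units of 2^-32 s. */
typedef uint64_t fix32_time_t;

/* Build a value from whole seconds plus nanoseconds (fraction truncated). */
fix32_time_t fix32_from_parts(uint32_t secs, uint32_t nsecs)
{
    assert(nsecs < 1000000000u);
    uint64_t frac = ((uint64_t)nsecs << 32) / 1000000000ULL;
    return ((uint64_t)secs << 32) | frac;
}

/* The whole-seconds part is simply the upper 32 bits. */
uint32_t fix32_seconds(fix32_time_t t)
{
    return (uint32_t)(t >> 32);
}

/* Convert to floating-point seconds; dividing by 2^32 only changes the
 * exponent, so no precision is lost in the division itself. */
double fix32_to_double(fix32_time_t t)
{
    return (double)t / 4294967296.0;
}
```

One caveat: a double carries 53 bits of mantissa, so the conversion is exact only when the 64-bit value has at most 53 significant bits. One unit is about 233 picoseconds, so a nanosecond is a little over four units.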
Natural for whom? The computer?
Why not units of 2^-32 seconds?
LOL! If we're making devices which are smaller than a third of a millimetre across, I imagine we'd get a much better return clustering a few thousand onto a single chip than we'd get by trying to make them all run at a terahertz clock rate.
Today's standard is to move information from place A to place B within a manageable number of clock ticks. If a clock tick takes a picosecond, then the only requirement is that places A and B are never more than one third of a millimeter apart.
The distance between processor and main memory, for example, could be a load of clock ticks, often with several signals travelling on the same wire. But as long as equipment can handle asynchronous signalling (in addition to the speed of course) it's far from impossible.
Wouldn't it be better to use picoseconds as the unit?
Hmmn, does that mean we can have a portable high-resolution timer interface in userland?
on (or port) implementations of the
POSIX hrtime().
The kernel already supports some POSIX clock routines, like clock_gettime,
as well as a few different types of clocks. If your platform supports it
(i.e. if you have a good clock source available and someone's written a
kernel driver for it), high resolution timers are available via that
interface.
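As a rough illustration of that interface, the sketch below queries a clock's reported resolution and current value using the POSIX calls clock_getres() and clock_gettime(). The helper names are hypothetical, and the resolution actually reported depends on the platform's clock source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Flatten a normalized, non-negative timespec into nanoseconds. */
uint64_t timespec_to_ns(const struct timespec *ts)
{
    assert(ts->tv_sec >= 0 && ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);
    return (uint64_t)ts->tv_sec * 1000000000ULL + (uint64_t)ts->tv_nsec;
}

/* Reported resolution of a clock in ns, or 0 if the clock is unavailable. */
uint64_t clock_resolution_ns(clockid_t id)
{
    struct timespec res;

    if (clock_getres(id, &res) != 0)
        return 0;
    return timespec_to_ns(&res);
}

/* Current reading of a clock in ns, or 0 if the clock is unavailable. */
uint64_t clock_now_ns(clockid_t id)
{
    struct timespec now;

    if (clock_gettime(id, &now) != 0)
        return 0;
    return timespec_to_ns(&now);
}
```

Calling clock_resolution_ns(CLOCK_MONOTONIC) shows what granularity the system claims; as noted earlier in the thread, a claimed granularity is not a latency guarantee.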
date & duration
There's already a possible confusion between relative and absolute nsecs. Maybe he should create types for both and some sparse-magic to check that only relative nsecs can be added to absolute nsecs.
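One way to get that kind of checking is sketched below. It uses plain C struct wrappers, which any C compiler enforces, rather than sparse annotations; all type and function names are hypothetical and none come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical distinct types: a point in time and a duration, both in ns.
 * Wrapping the integers in structs stops them from being mixed silently. */
typedef struct { uint64_t ns; } abs_nsec_t;
typedef struct { uint64_t ns; } rel_nsec_t;

/* absolute + relative -> absolute (the only addition provided) */
abs_nsec_t abs_add(abs_nsec_t t, rel_nsec_t d)
{
    assert(t.ns <= UINT64_MAX - d.ns); /* no wraparound */
    return (abs_nsec_t){ t.ns + d.ns };
}

/* absolute - absolute -> relative */
rel_nsec_t abs_diff(abs_nsec_t later, abs_nsec_t earlier)
{
    assert(later.ns >= earlier.ns);
    return (rel_nsec_t){ later.ns - earlier.ns };
}
```

With these types, passing two absolute values to abs_add() is a compile-time error, since no function is provided that adds one point in time to another.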
