The trouble with the TSC
The time stamp counter (TSC) provided by x86 processors is a high-resolution counter that can be read with a single instruction (RDTSC), which makes it a tempting target for applications that need fine-grained timestamps. Unfortunately, it is also rather unreliable, so the kernel jumps through hoops to decide whether to use it and to try to detect when it goes awry. An effort to export the kernel's knowledge about the reliability of the TSC has met strong resistance for a number of reasons, but the biggest is that the kernel developers don't think that applications should be accessing the counter directly.
Dan Magenheimer and Venkatesh Pallipadi proposed adding a /sys/devices/tsc directory with several entries corresponding to the kernel's internal TSC information, including the tsc_unstable flag, which governs whether the kernel uses the counter as a stable time source. Andi Kleen questioned the idea:
That is exactly what the patch is meant to do, Magenheimer said, because applications have no reliable way to determine whether the standard system calls will be "fast" or "slow":
Note also that even vsyscall with TSC as the clocksource will still be significantly slower than rdtsc, especially in the common case where a timestamp is directly stored and the delta between two timestamps is later evaluated; in the vsyscall case, each timestamp is a function call and a convert to nsec but in the TSC case, each timestamp is a single instruction.
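The pattern Magenheimer describes, storing raw timestamps and computing the delta later, might look roughly like the sketch below. It assumes GCC or Clang on x86 and uses the compiler's `__rdtsc()` intrinsic; the helper names `read_tsc()` and `tsc_delta()` are made up for illustration.

```c
#include <stdint.h>
#include <x86intrin.h>   /* GCC/Clang on x86: provides __rdtsc() */

/* Read the raw time stamp counter; this compiles down to one RDTSC. */
static inline uint64_t read_tsc(void)
{
    return __rdtsc();
}

/*
 * Ticks elapsed between two stored timestamps. The unsigned subtraction
 * wraps modulo 2^64. The result is in TSC ticks, not nanoseconds, and is
 * only meaningful if both reads came from counters that are in sync.
 */
static inline uint64_t tsc_delta(uint64_t start, uint64_t end)
{
    return end - start;
}
```

An application would call `read_tsc()` before and after the region of interest and evaluate `tsc_delta()` later; nothing in this sketch detects the failure modes discussed in the rest of the article.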
Depending on the hardware, gettimeofday() and clock_gettime() may be implemented as vsyscalls (virtual system calls) rather than standard system calls, which eliminates the transition from user space to the kernel. Vsyscalls are code stored in a special memory region in user space (the vdso region) that may access kernel-maintained data, like clock ticks. With vsyscalls, the calls are (relatively) fast, but on hardware (or virtual machines) where kernel-space operations are required to get to a reliable counter, a vsyscall cannot be used, so the calls are slower. For applications that "need to obtain timestamp data tens or hundreds of thousands of times per second", the difference is significant.
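Since the kernel does not say which path is in use, one thing an application can do is measure. The following is a minimal sketch, assuming a POSIX system with `CLOCK_MONOTONIC`; the function names are hypothetical, and the measurement is only a rough average that includes loop overhead.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Convert a struct timespec to nanoseconds. */
static int64_t timespec_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Estimate the average cost, in nanoseconds, of one clock_gettime()
 * call by timing `iterations` back-to-back calls.
 * Returns -1 if the arguments are invalid or the clock cannot be read.
 */
static int64_t clock_call_cost_ns(int iterations)
{
    struct timespec start, end, scratch;

    if (iterations <= 0 || clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;
    for (int i = 0; i < iterations; i++)
        clock_gettime(CLOCK_MONOTONIC, &scratch);
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1;

    return (timespec_to_ns(&end) - timespec_to_ns(&start)) / iterations;
}
```

A result that is an order of magnitude larger on one machine than on another is a hint that the slower one is making real system calls; where to draw the line between "fast" and "slow" is left to the application.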
But Magenheimer believes that if the kernel finds the TSC stable enough for its own timekeeping purposes, then that guarantees that it is usable by applications. Arjan van de Ven and Thomas Gleixner are quick to correct that misunderstanding. Van de Ven notes that the stability of the TSC can change under certain circumstances and there would be no way to notify the applications. His advice: "friends don't let friends use rdtsc in application code".
Gleixner goes into some detail about how the TSC can get out of whack: system management mode interrupts (SMIs) can fiddle with the TSC to hide their presence, multiple cores can have different values because of boot offsets and/or hotplugging, and multiple sockets can introduce differences due to separate clocks or temperature-induced drift in the clock signals. There is, in short, nothing reliable about the TSC: "the stupid hardware is not reliable whether it has some 'I claim to be reliable tag' on it or not". Gleixner did offer a possible alternative, though:
What we can talk about is a vget_tsc_raw() interface along with a vconvert_tsc_delta() interface, where vget_tsc_raw() returns you an nasty error code for everything which is not usable.
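No such interface exists, and Gleixner did not spell out any signatures, so the following is purely a user-space mock of the shape such a pair of calls could take. Everything in it is an assumption made for illustration: the `sketch_` names, the `struct tsc_sketch` state, the choice of `-ENODEV` as the "nasty error code", and the mult/shift conversion.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical state the kernel would maintain and expose read-only. */
struct tsc_sketch {
    int      usable;  /* nonzero while the TSC may be used */
    uint32_t mult;    /* assumed scaling: ns = (ticks * mult) >> shift */
    uint32_t shift;
};

/*
 * Mock of the proposed vget_tsc_raw(): hand back a raw counter value,
 * or a negative error code when the counter is not usable. The raw
 * value is passed in as `hw_value` so the sketch stays portable.
 */
static int sketch_get_tsc_raw(const struct tsc_sketch *st,
                              uint64_t hw_value, uint64_t *out)
{
    if (!st->usable)
        return -ENODEV;
    *out = hw_value;
    return 0;
}

/*
 * Mock of the proposed vconvert_tsc_delta(): scale a tick delta to
 * nanoseconds. The multiplication can overflow for very large deltas.
 */
static uint64_t sketch_convert_tsc_delta(const struct tsc_sketch *st,
                                         uint64_t delta)
{
    return (delta * st->mult) >> st->shift;
}
```

The point of the split is that the cheap operation (grabbing a raw value) stays cheap, while the conversion to nanoseconds is paid only when a delta is actually evaluated, and the caller is told explicitly when the counter cannot be trusted.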
Currently, there are unnamed "enterprise applications" that try to figure out whether they can use the TSC and, because the performance of gettimeofday() and friends is uncertain, do so whenever they think it will work. Magenheimer suggests that perhaps that information could be made available:
Magenheimer also wonders if the kernel developers are suffering from "hot stove" syndrome, in that they have been burned in the past and are reluctant to even consider changes. But Gleixner and van de Ven both point out that there is no hardware that can make the guarantees that Magenheimer wants. And Gleixner has the burn marks to prove it:
While the discussion had various interesting analogies, including hanging ropes/knives and condoms versus abstention, it did not (yet) find a car analogy. It did, however, seem to find some common ground: information about whether the clock calls are implemented as vsyscalls or system calls should be exported. That is unlikely to satisfy those who have been "using vsyscalls for a while and still have a performance headache", whom Magenheimer quotes, but there is nothing stopping applications from reading the TSC directly. Those applications just have to be prepared to handle any strange TSC behavior they encounter.
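What "being prepared" means is up to the application. One partial defense is sketched below, assuming GCC or Clang on x86 with RDTSCP available, and assuming (as Linux on x86-64 is understood to do) that the kernel stores a per-CPU identifier in the IA32_TSC_AUX register that RDTSCP returns. The struct and function names are hypothetical.

```c
#include <stdint.h>
#include <x86intrin.h>   /* GCC/Clang on x86: provides __rdtscp() */

/* One TSC reading plus the IA32_TSC_AUX value returned with it. */
struct tsc_sample {
    uint64_t ticks;
    uint32_t aux;   /* assumed to identify the CPU that was read */
};

/* Take a sample with RDTSCP, which returns the counter and aux value together. */
static inline struct tsc_sample sample_tsc(void)
{
    struct tsc_sample s;
    unsigned int aux;

    s.ticks = __rdtscp(&aux);
    s.aux = aux;
    return s;
}

/*
 * Store end - start in *delta and return 0, or return -1 when the two
 * samples should not be compared: they carry different aux values
 * (probably different CPUs) or the counter appears to have gone backwards.
 */
static int checked_tsc_delta(struct tsc_sample start, struct tsc_sample end,
                             uint64_t *delta)
{
    if (start.aux != end.aux || end.ticks < start.ticks)
        return -1;
    *delta = end.ticks - start.ticks;
    return 0;
}
```

On failure the caller could fall back to clock_gettime() or simply discard the measurement. This catches only the crudest problems; an SMI that adjusts the counter, or a frequency change between the two samples, would go unnoticed.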
Ingo Molnar tries to clarify the reasons that the kernel can't export the reliability information: "The point is for the kernel to not be complicit in practices that are technically not reliable. [...] So the kernel wont 'signal' that something is safe to use if it is not safe to use." But he also sees some reason to hope:
I really mean it - and it might be possible - but we have not found it yet.
Peter Zijlstra has another solution to the problem. He would like to see the kernel eventually disable RDTSC from user space. By emulating the instruction and logging all uses of it (and of the related RDTSCP), user-space programs that use it could be identified and changed:
Of course closed source stuff will have to deal with it themselves, but who cares about that anyway ;-)
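Zijlstra's emulate-and-log scheme is a proposal, but Linux on x86 already has a cruder per-process switch: prctl() with PR_SET_TSC can make RDTSC (and RDTSCP) raise SIGSEGV in the calling process instead of executing. The sketch below only shows that existing knob, not the proposed system-wide behavior; the wrapper names are hypothetical.

```c
#include <sys/prctl.h>

/*
 * Return the calling process's TSC access mode (PR_TSC_ENABLE or
 * PR_TSC_SIGSEGV), or -1 if the kernel cannot answer the query.
 */
static int tsc_access_mode(void)
{
    int mode;

    if (prctl(PR_GET_TSC, &mode, 0, 0, 0) != 0)
        return -1;
    return mode;
}

/*
 * With trap != 0, make user-space TSC reads in this process fault with
 * SIGSEGV; with trap == 0, allow them again.
 * Returns 0 on success, -1 on error.
 */
static int set_tsc_trapping(int trap)
{
    return prctl(PR_SET_TSC, trap ? PR_TSC_SIGSEGV : PR_TSC_ENABLE, 0, 0, 0);
}
```

A developer hunting for stray RDTSC users in a program could turn trapping on early in main() and see what crashes. Note that with trapping enabled, any library code that reads the TSC on the program's behalf will fault as well.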
Exporting the information about whether gettimeofday() is "slow" or not seems like a reasonable starting point. No patches to do that have emerged yet, but it is a fairly straightforward thing to do. Eventually, something like Gleixner's vget_tsc_raw() may also come about, though it won't satisfy those who are unhappy with the current vsyscall performance. Those applications will just have to read the TSC themselves and deal with whatever the hardware throws at them.
Posted May 20, 2010 13:49 UTC (Thu)
by abatters (✭ supporter ✭, #6932)
[Link]
Posted May 20, 2010 17:36 UTC (Thu)
by sustrik (guest, #62161)
[Link] (3 responses)
Posted May 21, 2010 17:04 UTC (Fri)
by blitzkrieg3 (guest, #57873)
[Link] (2 responses)
Posted May 21, 2010 17:13 UTC (Fri)
by sustrik (guest, #62161)
[Link]
Posted May 21, 2010 21:11 UTC (Fri)
by foom (subscriber, #14868)
[Link]
(hardware/kernel version with malfunctioning timekeeping, also running NTP; NTP will step the time once in a while because it can't keep up with the clock drift. You might say that's uncommon, but not so uncommon that you won't run into it!)
Posted May 20, 2010 19:35 UTC (Thu)
by dmadsen (guest, #14859)
[Link] (11 responses)
What happened to the philosophy of letting the {user,programmer} decide if he wants to shoot his foot? If someone is accessing the TSC, then he also should know the pitfalls.
Isn't there a better use of kernel developer time than creating needless restrictions?
Posted May 21, 2010 1:32 UTC (Fri)
by dlang (guest, #313)
[Link]
given how badly it works, they expect this to be the common case rather than the exception, so they don't want to enable something that doesn't work and will cause them problems.
Posted May 21, 2010 8:30 UTC (Fri)
by khim (subscriber, #9252)
[Link] (7 responses)
The ability to shoot his foot is all well and good if you only have knowledgeable, responsive people at the keyboard. Or a massive peer-review system. When IT was young all IT people were like this (hey, if you need a week just to run the program once you'll be careful... and ask a lot of other people to verify that you've not written crap). But over time, as more and more people got access to the computer, this approach started to fail: more and more clueless people appeared. First as users, then as developers too. And as access to the CPU became less and less expensive (hey, the CPU you have in your pocket is more powerful than what you had for a million dollars fifty years ago) even clueful people started making mistakes (there is not enough time to carefully reread every written line a hundred times anymore). So today this approach is limited to the kernel - and maybe even this group is too big.
This is natural evolution. Compare with cars. Early models were primitive but allowed you to tinker freely - and it was easy to damage them by abuse. Today if you sell a car which blows up if you press the gas pedal too much... well, you'll be fired - best case. Worst case - you'll go to jail.
Unix (and Linux) always had the ability to restrict the user (file permissions and quotas). Today it can restrict the program too (seccomp, SELinux, AppArmor, etc). The next step is, of course, developers. And just like with cars: if you need a highly specialized system which will not use the same roads as regular cars (an off-road car or a rocket car) - you can ignore the warnings and remove the "superfluous" checks - but this is not an option for 99% of developers out there.
Posted May 22, 2010 6:38 UTC (Sat)
by dmadsen (guest, #14859)
[Link] (6 responses)
I should not have to pay because "more and more clueless people appeared". Shall we all then live to the lowest common denominator? My God, soon we'd all be using Windows Starter Edition! :-)
One way I got to be a "knowledgable, responsive [responsible?] user" was to hurt myself doing "stupid" things. Ctrl-Alt-Backspace *should* hurt -- once.
And you know, code review isn't so bad: when I coded at the keypunch, it required more debugging than when I used the coding pad. Writing code that I (and hopefully others) reviewed not only made better code then, but taught me (and others) to write better in the future. (I would recommend reading "The Psychology of Computer Programming" by Gerald M Weinberg for more information). And if I talked to someone who'd done rm -rf incorrectly, maybe I could avoid their pain.
If it is true that natural evolution is to produce buggy code quicker, then perhaps we should resist. Maybe sometimes celerity isn't a virtue, and the modern notion of the inevitability of bugs in code isn't true.
The assumption that kernel code is somehow special and should be specially treated is wrong -- *someone* is going to depend on code you write no matter how big or small the project, and an error in that code is gonna cause *someone* some trouble. If you don't believe that, then you should not be writing code for others to use.
File permissions, quotas, etc are there to mainly stop a system user from hurting other system users. That's one of the normal policies of an OS. On the other hand, training wheels are fine for novices, but they are meant to come off.
Again, I'm not talking about removing for everyone the equivalent of safety gear in a program. What I'm talking about is deliberately engineering something so that any safety gear is [almost] impossible to remove. In normal use, I should be able to use something safely, but, in my freedom, I must be able to remove or modify it *even if you don't think I should*.
Keeping up with the car analogy, would you buy a car deliberately built so it cannot go faster than 65 MPH and attempts to circumvent that would be illegal?
I do understand -- and appreciate -- the point that they aren't only trying to protect others, but that the kernel devs don't want to hear clueless whining.
My point -- and really my main point -- is that when you operate to remove another's freedom, it had better be at more than just a whim, and more than your convenience. In this case, the fact that you have a couple people saying "don't do it!" to me means that it shouldn't be done.
It's not as if there aren't a lot of other kernel decisions that people can't whine about, you know. This is just yet another reason for those on LKML to say "RTFM. Go Away.". :-)
Posted May 22, 2010 10:26 UTC (Sat)
by khim (subscriber, #9252)
[Link] (3 responses)
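> I should not have to pay because "more and more clueless people appeared".
You are not paying "more". If anything you pay less - old systems can be bought for cheap (unless they are very old and reach the antique category). You want new stuff for cheap? Sorry, it's not for you - it's cheap because of the economy of scale, and this economy is only possible because it's not for you.
> One way I got to be a "knowledgable, responsive [responsible?] user" was to hurt myself doing "stupid" things.
You think that being a "knowledgable, responsible user" is a worthwhile goal and worth spending time on. Most users don't think so - and since systems are created for them, the rules are adjusted for them.
> Maybe sometimes celerity isn't a virtue, and the modern notion of the inevitability of bugs in code isn't true.
Bug-free software is absolutely impossible. Deal with it. You can create a tiny kernel which is mostly bug-free (and no, Linux is way too big for that), but the rest of the system will be bug-infested no matter what the programmers do.
> The assumption that kernel code is somehow special and should be specially treated is wrong -- *someone* is going to depend on code you write no matter how big or small the project, and an error in that code is gonna cause *someone* some trouble.
Brilliant observation! That's why I have zero sympathy for the freetype developers. They brought the problems on themselves by exposing private interfaces - so now they should deal with the fallout. They should have done what all sensible people are doing: attached visibility("hidden") to all internal functions.
> File permissions, quotas, etc are there to mainly stop a system user from hurting other system users. That's one of the normal policies of an OS. On the other hand, training wheels are fine for novices, but they are meant to come off.
A helmet, on the other hand, is meant to be used by everyone. Yes, some things should only be enabled in debug libraries (like _GLIBCXX_DEBUG) and some should be enabled all the time.
> What I'm talking about is deliberately engineering something so that any safety gear is [almost] impossible to remove.
This is the right way to design things. Safety gear must be built robustly - or else it will not work. It can always be removed by direct modification of the program source (or even the binary if the need is really strong). The ability to turn it off easily is actively harmful.
> Keeping up with the car analogy, would you buy a car deliberately built so it cannot go faster than 65 MPH and attempts to circumvent that would be illegal?
Brilliant example. Of course not! 65 MPH is not enough - there are roads where you can travel faster! A real car's speed limit is set to 155mph or 112mph (depending on country). On the other hand, if you meant to say that artificial limits in cars are something unimaginable, you've utterly failed: not only is it imaginable - it's widespread in the real world!
> My point -- and really my main point -- is that when you operate to remove another's freedom, it had better be at more than just a whim, and more than your convenience.
See the freetype example above. API usage freedom is not a right; you must earn it - by showing use cases where it's needed and where nothing else really fits. See the recent discussion related to such a right: it looks like Google will get the wakelocks in the end, but it was not an easy sell. And this is a good thing. Often it's much easier to just muck with system internals rather than use proper interfaces - but this leads to the Windows-like mess, where you can't change anything without breaking something.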
Posted May 23, 2010 5:40 UTC (Sun)
by dmadsen (guest, #14859)
[Link]
This has brought home to me in a personal way the difficulties that a successful project manager must have when working with a diverse population, especially regarding cohesiveness and consistency of vision. Gentlemen, my hat's off to you!
Posted May 23, 2010 13:49 UTC (Sun)
by nix (subscriber, #2304)
[Link] (1 responses)
Posted May 23, 2010 17:54 UTC (Sun)
by khim (subscriber, #9252)
[Link]
Posted May 30, 2010 4:32 UTC (Sun)
by fredi@lwn (subscriber, #65912)
[Link] (1 responses)
Posted Jun 1, 2010 5:30 UTC (Tue)
by dmadsen (guest, #14859)
[Link]
But I'm also talking about something a bit more subtle, which says "[don't restrict me] ** only because you think it's for my own good **". In this case, the reasoning is "let's disable it because it's not reliable and someone using it might have negative results". As a philosophy, this says "protect people from harming themselves".
My point is that to go down this path *in a general way* is a slippery slope; where do we stop in making a "safe" system, one where a {user,developer} can't hurt himself? Shall we, for example, as a policy decision, eliminate the ability to remove files from users and make it for root only?
I am aware, of course, that different functionality is directed towards users of different experience levels, and how that might affect any let's-protect-the-user-from-himself decisions.
But much learning comes from making mistakes -- and the "baby-proofing" one does in a home with toddlers is only enough so that the baby doesn't get permanently damaged before he learns.
So I re-emphasize that we must be careful not to *blindly* apply the "let's remove that function for his own good" philosophy. Each case must be thought about carefully. And even then, should the decision be made to protect the user from himself, it should be easy for that user to say "I've learned I can hurt myself, and I don't want to be coddled/restricted anymore because I've also learned how to use that sharp knife you were shielding me from". That's where freedom comes in.
And I re-state that the particular function in this article is surely not likely to be used by someone who doesn't already have the level of knowledge to understand the potential pitfalls. That is, he should not need to be protected, as he knows the stove is hot already.
Posted May 21, 2010 10:20 UTC (Fri)
by farnz (subscriber, #17727)
[Link]
There's a well-supported interface that (in theory) could be almost as fast as RDTSC for the case where the TSC is reliable - it's called gettimeofday(). It has the advantage that it can be reliable even when the TSC is not, and can exploit future timers that are even better than the TSC. And, with vsyscall support, it's all userspace, so you don't pay the price of a context switch, just of converting the TSC to seconds.
Given this, why not disable userspace support for RDTSC when it's not needed as part of a fast implementation of gettimeofday()? Applications on a custom platform can use a slightly modified kernel that always permits RDTSC (and the serializing variant RDTSCP), while applications that can't control the platform should rely on gettimeofday(). The missing bit is kernel support for telling you whether gettimeofday() is fast or slow - if it's slow, you can log this, and disable things that depend on gettimeofday() being fast.
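Short of a dedicated "is gettimeofday() fast?" flag, an application can at least ask which clocksource the kernel has selected, which Linux reports in sysfs. The sketch below reads that file; the helper names are hypothetical, and treating only "tsc" as the fast case is a simplifying assumption (other clocksources, such as paravirtual clocks, may also be readable from user space).

```c
#include <stdio.h>
#include <string.h>

/* Standard sysfs location of the active clocksource on Linux. */
#define CLOCKSOURCE_PATH \
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"

/*
 * Read the clocksource name from `path` into `buf`, stripping the
 * trailing newline. Returns 0 on success, -1 on failure.
 */
static int read_clocksource(const char *path, char *buf, size_t len)
{
    FILE *f;

    if (len == 0)
        return -1;
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Heuristic only (an assumption of this sketch): "tsc" means fast. */
static int clocksource_looks_fast(const char *name)
{
    return strcmp(name, "tsc") == 0;
}
```

A program would call `read_clocksource(CLOCKSOURCE_PATH, ...)` at startup and log a warning, or disable timing-heavy features, when the answer does not look fast. The clocksource can change at run time, so a one-time check is no more of a guarantee than the TSC itself.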
Posted May 21, 2010 10:39 UTC (Fri)
by tialaramex (subscriber, #21167)
[Link]
There's a balance to be struck and I think that the difference between what people think the TSC will do (cheap, fast, reliable way to measure time) and what they get (rarely all, and often none of the above) makes the status quo acceptable. I wouldn't support actively blocking use of TSC, but it certainly makes no sense for the kernel folk to endorse it as this patch proposed.
TSC useful for embedded
The "philosophy" was dead for a long, long time already... deal with it
Look around you before answering... please...
You are right, of course. Freetype 2.1 was released in 2002 and gcc only got the visibility attribute in 2003. And other methods are not as nice. But we are in 2010 - and yet Freetype 2.3.12 (released just a few months ago) does not use visibility(hidden). Not even as an option!
I can understand a SNAFU with freetype 2.1, but why persist in folly?