How to speed up system calls
There is an obvious solution to this problem: use the sysenter instruction instead. sysenter is quite a bit faster on modern Pentium processors. There are just a couple of problems: not all x86 processors support sysenter, and sysenter steps on registers in ways that can be hard to work around.
The lack of across-the-board support for sysenter is a problem. The kernel maintains a set of flags telling it what capabilities a given processor has; other processor-specific options are set at configuration time. System calls, however, are not invoked from the kernel - that is the C library's job. The last thing glibc needs is to be trying to figure out, at run time, the right way to invoke system calls.
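To make the run-time burden concrete, here is a minimal sketch of the sort of capability test a user-space library would have to carry if it chose the entry instruction itself. It assumes a GCC- or Clang-style compiler that ships <cpuid.h>; the function name cpu_reports_sysenter() is hypothetical. The SEP feature flag (bit 11 of EDX for CPUID leaf 1) is the documented indicator for sysenter, but the flag alone may not be the whole story on every processor, which is part of why this decision is better left to the kernel.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* __get_cpuid(), provided by GCC and Clang */
#endif

/*
 * Hypothetical helper: returns 1 if CPUID reports the SEP feature
 * (sysenter/sysexit), 0 if it does not, and -1 if the question does
 * not apply (non-x86 build, or CPUID leaf 1 is unavailable).
 */
int cpu_reports_sysenter(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid() returns 0 when the requested leaf is unsupported. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return -1;

    return (int)((edx >> 11) & 1u);   /* SEP: leaf 1, EDX bit 11 */
#else
    return -1;
#endif
}
```

A caller would treat anything other than 1 as "use the old entry method".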
Linus's solution to this problem is a patch which brings back a variant of an old idea. As of 2.5.53, the kernel will map a global, read-only page at the top of every process's address space. That page contains the optimal code for executing a system call on the current processor. Whenever glibc needs to call into the system, it simply sets up the registers and, rather than doing the old int $0x80, it jumps into the new page. The C library still needs to do a runtime test (since older kernels will lack this "vsyscall" page), but it need not concern itself with the detailed capabilities of different processors.
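The run-time test that remains in the C library can be very small. The sketch below is an illustration only, not the code from 2.5.53 or from glibc: it uses getauxval(), an interface that appeared in later glibc releases, and the AT_SYSINFO_EHDR auxiliary-vector entry, through which later kernels advertise their kernel-provided user-space page. The enum and function names are hypothetical.

```c
#include <elf.h>        /* AT_SYSINFO_EHDR */
#include <sys/auxv.h>   /* getauxval() */

/* Hypothetical labels for the two ways of entering the kernel. */
enum entry_method {
    ENTRY_OLD_INTERRUPT,  /* the old software-interrupt path */
    ENTRY_KERNEL_PAGE     /* jump into the page supplied by the kernel */
};

/*
 * Pick an entry method once, at startup. getauxval() returns 0 when
 * the kernel did not supply the requested entry, which is the case
 * on kernels that predate the mapped page.
 */
enum entry_method choose_entry_method(void)
{
    if (getauxval(AT_SYSINFO_EHDR) != 0)
        return ENTRY_KERNEL_PAGE;
    return ENTRY_OLD_INTERRUPT;
}
```

Note that nothing here inspects the processor; the only question asked is whether the kernel provided the page.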
Keeping the registers straight turned out to be a trickier problem. The
way sysenter steps on registers makes it hard to invoke system
calls with more than five parameters. Various schemes were looked at,
including creating a new "extra argument block" or simply requiring that
six-argument system calls be invoked the old way. Linus finally came up
with a tricky solution that makes it all work, however; those of you who
like digging through x86 assembly may want to peek at his "absolutely wonderfully disgusting solution" to
the problem. "I'm a disgusting pig, and proud of it to boot."
The result of all this: the gettimeofday() system call runs in just over half the time on a P4 processor. The speedup on the Pentium 3 is smaller - a factor of 1.2 - but is still worthwhile.
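Numbers like these come from timing a large number of calls and dividing. The following is a rough sketch of such a measurement, not the benchmark actually used; the function name gettimeofday_ns_per_call() is hypothetical. It goes through syscall(2) so that each iteration really enters the kernel, using whichever entry path the installed C library takes, and it assumes a Linux system where SYS_gettimeofday is defined.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>   /* SYS_gettimeofday */
#include <sys/time.h>      /* struct timeval */
#include <time.h>          /* clock_gettime() */
#include <unistd.h>        /* syscall() */

/*
 * Hypothetical helper: average wall-clock cost, in nanoseconds, of one
 * gettimeofday() system call, measured over `iterations` calls.
 * The figure includes the loop and syscall() wrapper overhead.
 */
double gettimeofday_ns_per_call(long iterations)
{
    struct timespec start, end;
    struct timeval tv;

    assert(iterations > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        syscall(SYS_gettimeofday, &tv, NULL);
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed_ns = (double)(end.tv_sec - start.tv_sec) * 1e9
                      + (double)(end.tv_nsec - start.tv_nsec);
    return elapsed_ns / (double)iterations;
}
```

Running the same measurement on two kernels, or on two processors, gives the kind of ratio quoted above; a single run says little, since the result varies with system load.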
Now that the vsyscall page is in place, will it be used for other things,
such as implementing gettimeofday() entirely in user space? The
answer, for now, appears to be "no". Getting a user-space
gettimeofday() right is, seemingly, harder than it looks; there
are synchronization issues, especially on some SMP systems where the clocks
may not be synchronized by the hardware. So a user-space
gettimeofday() does not appear to be in the works, for now at least.
