
Virtual time

The developers interested in containers and virtualization have discussed interfaces to virtualize access to a number of system resources. None, however, have talked about virtualizing access to the system time. Until now, that is. With Jeff Dike's time virtualization patches, any process tree can have its own idea of what time it is.

Jeff's patch adds a new "time namespace" structure to the task structure. By default, all processes share the normal host system's idea of time. But a new option (CLONE_TIME) to the unshare() system call allows a process to disconnect from the system time. After such a call, that process - and any children it creates - will be able to keep its own time value. Setting a virtualized time value is, unlike changing the normal system time, an unprivileged operation.

Internally, a virtualized time is stored as a simple offset; whenever a process requests the current time, the offset is added to the current system time and the sum is returned. This approach has the advantages of being simple and fast; a process running with virtualized time also does not give up time adjustments made, for example, by NTP. On the other hand, this implementation does not support the ability to confuse processes by messing deeply with their idea of time - running time at a different rate, for example, or even backward. Chances are that this omission will not upset more than a small percentage of potential users of virtualized time, however.
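The offset scheme can be illustrated with a small user-space sketch. Nothing here is taken from Jeff's patch: the time_ns structure and the virt_* helpers are hypothetical names chosen for illustration, and the real code lives inside the kernel rather than on top of time().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical per-namespace state: a signed offset from host time. */
struct time_ns {
    int64_t offset_sec;
};

/* Virtual time is the host time plus the namespace's offset. */
static int64_t virt_time_from(const struct time_ns *ns, int64_t host_sec)
{
    assert(ns != NULL);
    return host_sec + ns->offset_sec;
}

/* "Setting the clock" only records how far the wanted time is from host time. */
static void virt_settime(struct time_ns *ns, int64_t host_sec, int64_t wanted_sec)
{
    assert(ns != NULL);
    ns->offset_sec = wanted_sec - host_sec;
}

/* Convenience wrapper that samples the host clock itself. */
static int64_t virt_time_now(const struct time_ns *ns)
{
    return virt_time_from(ns, (int64_t)time(NULL));
}
```

Because only the offset is stored, any later correction to the host clock shows up in the virtual time as well, which is why NTP adjustments are not lost. A different clock rate, though, cannot be expressed this way.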

Jeff's purpose is to speed up the gettimeofday() system call in User-mode Linux instances. If the kernel allows process subtrees to have their own time values, then User-mode Linux can simply use the host's gettimeofday() call, rather than intercepting that call and implementing it itself. Since gettimeofday() is one of the most frequently-used system calls, this optimization can make a significant difference.

One other change is required, however, for User-mode Linux to get the benefit from this change. UML performs much of its process control using ptrace(); in particular, it intercepts and interprets system calls with the PTRACE_SYSCALL operation. What is really needed for a fast gettimeofday() is the ability to not intercept that particular call. So Jeff's patch also extends ptrace() by adding a PTRACE_SYSCALL_MASK operation. This new operation can set a bitmask indicating which system calls should be intercepted, and which should be executed without stopping.
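The idea of a per-system-call bitmask might look roughly like the following. This is only a sketch of the data structure, not the PTRACE_SYSCALL_MASK interface itself, whose argument layout is not described here; the syscall_mask structure, the helper names, the MAX_SYSCALLS bound, and the EXAMPLE_NR value are all assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define MAX_SYSCALLS  512   /* assumed upper bound on system call numbers */
#define EXAMPLE_NR    96    /* hypothetical number standing in for gettimeofday() */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MASK_WORDS    ((MAX_SYSCALLS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per system call: 1 means "stop and notify the tracer". */
struct syscall_mask {
    unsigned long bits[MASK_WORDS];
};

/* Start from the classic PTRACE_SYSCALL behavior: intercept everything. */
static void mask_intercept_all(struct syscall_mask *m)
{
    memset(m->bits, 0xff, sizeof(m->bits));
}

/* Clear one bit so that this call runs natively without stopping. */
static void mask_pass_through(struct syscall_mask *m, unsigned int nr)
{
    assert(nr < MAX_SYSCALLS);
    m->bits[nr / BITS_PER_WORD] &= ~(1UL << (nr % BITS_PER_WORD));
}

/* The check a tracing core would make on each system call entry. */
static bool mask_should_intercept(const struct syscall_mask *m, unsigned int nr)
{
    assert(nr < MAX_SYSCALLS);
    return (m->bits[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL;
}
```

A tracer in UML's position would call mask_intercept_all() once and then mask_pass_through() for the time-related call, leaving every other system call intercepted as before.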

The result, with a suitably patched UML, is a gettimeofday() call which runs at about 99% of the native process speed. That may well be good enough to make this patch a piece of the growing set of interfaces supporting virtualization and containers.




gettimeofday() --- Move the system call into userspace

Posted Apr 21, 2006 16:37 UTC (Fri) by AnswerGuy (guest, #1256) [Link] (1 responses)

One novel approach to speeding up gettimeofday that I heard about a few
years ago (from Andrew Tridgell who implemented it on a different OS
and architecture) was to get rid of the gettimeofday() system call
entirely.

One model for doing this would be to use a read-only, globally shared page which contained the current time (and things like the uname() struct, the pathconf() and sysconf() values, and whatever else will fit in one or two pages).

These pages are mapped into every process' address space (similar to how the VDSO maps a kernel hosted userspace threading library implementation --- but read-only rather than executable). Then gettimeofday(), uname() and a few other system calls can be implemented in user space by the libraries without incurring any context switch overhead.

(Realistically you'd leave the system calls in for compatibility but offer
a faster, more lightweight method as described; perhaps adding printk() options to help identify those apps which were using the slower, heavyweight system call method).

One thing that's easy to misunderstand about UNIX is that the distinction between system call and library function can (from some perspective) be a bit arbitrary. Classically a system call is any function which interfaces to kernel space while library functions can be wrapped around system calls but are generally done entirely within a process' own address space (in user space). However, with some of the memory mapping tricks (and memory mapped I/O hardware features) it's possible for many operations that would conceptually involve system calls to be implemented as library functions with suitable memory mappings.

This is not to say that such memory mappings are "better" than system calls in the general case. However, for some things like gettimeofday() and uname() there is a pretty clear win on (almost?) any modern virtual memory architecture.

JimD

gettimeofday() --- Move the system call into userspace

Posted Apr 22, 2006 8:33 UTC (Sat) by Blaisorblade (guest, #25465) [Link]

That has indeed been implemented for some time (I guess since the 2.5 era) in arch/x86_64/kernel/vsyscall.c, together with the time syscall and two empty slots.

Virtual time

Posted Apr 22, 2006 5:22 UTC (Sat) by skybrian (guest, #365) [Link] (1 responses)

There's actually a good reason for "messing deeply with their idea of time": testing applications that use timeouts and schedulers. For example, suppose you want to see what happens after a month's worth of nightly batch processes have happened. It's useful to be able to speed up time so it doesn't actually take a month to run the test.

There are many ways to do this, but running software in fast-forward would be a useful tool in the application developer's toolkit.

Virtual time

Posted Apr 25, 2006 2:14 UTC (Tue) by pm101 (guest, #3011) [Link]

From the other side of the equation, I would like to use this for avoiding timeouts. Quite a few applications time out after some time. I've even had free software applications tell me "You're running a version of XXXX, please upgrade." I've also had restricted materials time-out (in one case, a class distributed an educational application that stopped working when the semester ended). Bypassing these seems like a good and noble endeavor.


Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds