LWN.net Logo

The last-minute unshare() discussion

One of many new system calls added in the 2.6.16 kernel is unshare(). Its purpose is to perform the opposite of the various sharing flags provided with clone(): it is used to disconnect some of a process's resources from those of its ancestor and sibling processes. With unshare(), a process can ask to have its own filesystems, namespaces, or file descriptor table. The unsharing of other resources, including semaphore undo information, virtual memory, signal handlers, and more is stubbed in for future releases.

A couple of last-second issues with unshare() surfaced just as 2.6.16 was being prepared for final release; only some of those issues were resolved in the resulting kernel.

One of those had to do with the implementation of unshare(CLONE_VM), which causes the calling process to stop sharing memory with others. It seemed that this functionality was present and complete, until Oleg Nesterov noticed that the code does not take into account the possibility that a core dump of the address space may be in process. The solution, for now, is to simply disable unsharing of memory. It seems that there is nobody who needs this feature immediately, and it was too late to be trying to fix up a core memory management function.

Eric Biederman raised a couple of other issues relating to the unshare() API which he would have liked to see fixed before that API becomes part of a released kernel. One was the use of the same set of flags used by clone() to specify sharing. Eric says:

sys_unshare can't implement half of the clone flags under any circumstances and those that it does implement have subtlely different semantics than the clone flags. Using a different set of flags sets the expectation that things will be different.

That discussion did not get very far, however; Linus prefers to use the same flags, and nobody else seems to be terribly upset about it.

Eric's other point was that unshare() does not test for unrecognized flags; they are silently ignored. So user space can ask for the unsharing of resources which are not known to - or supported by - the unshare() call and no error status will be returned. This behavior could be a problem in the future, when the set of legal flags for unshare() is expected to grow. A program written to use one of the new flags may not do the right thing if it is subsequently run on a 2.6.16 kernel; the functionality it asks for will not be present, but the kernel will not inform it of the fact.

The patch submitted by Eric addressed both issues: the names of the flags and testing for unrecognized flags. It was not merged for 2.6.16, however. The unrecognized flag test, on its own, might have gotten in (and such a patch has been merged for 2.6.17), but the combined patch didn't make it. Andrew Morton remarked: "Your single patch did two different things - there's a lesson here." The creation of tightly-focused patches truly is important, especially just prior to a final kernel release.


(Log in to post comments)

Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds