Microsoft research: A fork() in the road

Posted Apr 10, 2019 19:15 UTC (Wed) by Cyberax (✭ supporter ✭, #52523)
Parent article: Microsoft Research: A fork() in the road

One interesting method is to create a process in a quiescent state and then just poke it from the parent process until it's ready. Then just start it.

This neatly avoids all the complications of forking and memory overcommit.

That's what Fuchsia does, btw.

Microsoft research: A fork() in the road

Posted Apr 10, 2019 20:01 UTC (Wed) by roc (subscriber, #30627)

The paper mentions this under "Cross-process operations".

Microsoft research: A fork() in the road

Posted Apr 11, 2019 15:26 UTC (Thu) by sbaugh (guest, #103291)

I'm hesitant to comment here because it's not done, but I've been working on an implementation for Linux of cross-process operations, so that inchoate processes can be created and manipulated from other processes.

As some other comments speculate, the implementation is a userspace stub that receives syscalls to execute over some transport and sends their results back. I use a pair of file descriptors, but other transports could be implemented too.
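To make the stub idea concrete, here is a minimal sketch in Python. It is not the implementation described above, only an illustration under stated assumptions: the wire format (one JSON object per line), the names `stub_loop`, `spawn_stub`, and `RemoteProcess` are all hypothetical, calls are looked up by libc function name rather than raw syscall number, and only integer arguments are supported. It also uses `os.fork()` purely as a convenient way to get a child that hosts the stub.

```python
import ctypes
import json
import os

# Hypothetical wire format, one JSON object per line:
#   request:  {"name": "<libc function>", "args": [int, ...]}
#   response: {"ret": int, "errno": int}


def stub_loop(req_fd, resp_fd):
    """Runs inside the new process: execute requested calls until EOF."""
    libc = ctypes.CDLL(None, use_errno=True)
    with os.fdopen(req_fd, "r") as requests, os.fdopen(resp_fd, "w") as responses:
        for line in requests:
            msg = json.loads(line)
            ctypes.set_errno(0)
            # The call happens here, in the stub's own process context.
            ret = getattr(libc, msg["name"])(*msg["args"])
            reply = {"ret": ret, "errno": ctypes.get_errno()}
            responses.write(json.dumps(reply) + "\n")
            responses.flush()


class RemoteProcess:
    """Parent-side handle: every call names the process it acts on."""

    def __init__(self, pid, to_stub, from_stub):
        self.pid = pid
        self.to_stub = to_stub
        self.from_stub = from_stub

    def call(self, name, *args):
        request = {"name": name, "args": list(args)}
        self.to_stub.write(json.dumps(request) + "\n")
        self.to_stub.flush()
        reply = json.loads(self.from_stub.readline())
        if reply["ret"] < 0:
            raise OSError(reply["errno"], os.strerror(reply["errno"]))
        return reply["ret"]

    def shutdown(self):
        # Closing the request pipe gives the stub EOF, so it exits.
        self.to_stub.close()
        os.waitpid(self.pid, 0)
        self.from_stub.close()


def spawn_stub():
    """Create a child running the stub, connected by a pair of pipes."""
    req_r, req_w = os.pipe()
    resp_r, resp_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(req_w)
        os.close(resp_r)
        try:
            stub_loop(req_r, resp_w)
        finally:
            os._exit(0)
    os.close(req_r)
    os.close(resp_w)
    return RemoteProcess(pid, os.fdopen(req_w, "w"), os.fdopen(resp_r, "r"))
```

With this, `child = spawn_stub()` followed by `child.call("umask", 0o027)` changes the umask of the child only, and `child.call("getpid")` returns the child's pid rather than the caller's. Calls that take pointers (paths, buffers) are left out because they would need memory that is valid in the stub's address space.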

The issue with ptrace is not just that it's hard to use, not just that it's slow, but also that there can only be one ptracer at a time. A program that used ptrace in normal operation to manipulate its children would be much less compatible with strace, gdb, and other tools. That's not workable for a general purpose API.

Furthermore, ptrace puts limits on what kind of transport can be used between the stub and the main process. It would be nice to use shared memory to send syscall instructions to the stub, to improve performance when much setup must be done. As it stands, with a pipe used for transport, this API is actually network transparent; this could allow for some interesting novel APIs for starting and manipulating processes on different hosts.

The hardest part has been the need to create new abstractions that use this new way of executing syscalls. I couldn't think of an acceptable and performant way to reuse existing functions which implicitly make syscalls in the current process, in this new world where syscalls are done in the explicit context of some arbitrary process handle. So a fair bit of reinvention has been required to support explicitly specifying the process to operate on.

Another difficulty is the book-keeping of resources (file descriptors, paths, pointers) across multiple processes. File descriptors treated as plain ints are hard to keep straight when working with multiple file descriptor tables across multiple processes, where the same int might refer to different file descriptors in different processes. So I've had to develop multiple layers of abstractions for user programs which manipulate other processes: one layer which works with raw int file descriptors, and other layers on top of it which work with file descriptors as a combination of an int and the fd table it is valid within. Similar abstractions are needed for other resources as well.
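A rough sketch of what "an int plus the fd table it is valid within" could look like, again in Python. The names `FDTable`, `FileDescriptor`, and `ProcessHandle` are made up for illustration and are not taken from the actual implementation; the point is only that a descriptor carries its table with it, so using it against the wrong process is caught before any syscall is issued.

```python
from dataclasses import dataclass


class FDTable:
    """Identity of one file descriptor table (hypothetical name).

    Tables compare by identity, so two handles share a table only if
    they were given the very same FDTable object.
    """

    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return f"FDTable({self.label!r})"


@dataclass(frozen=True)
class FileDescriptor:
    """A raw fd number together with the table it is valid in."""

    table: FDTable
    number: int


class ProcessHandle:
    """Upper layer: refuses descriptors that belong to another table."""

    def __init__(self, name, table):
        self.name = name
        self.table = table

    def raw_fd(self, fd):
        # Lower layers work with the bare int; this check is what the
        # upper layer adds on top of them.
        if fd.table is not self.table:
            raise ValueError(
                f"fd {fd.number} belongs to {fd.table!r}, "
                f"not to {self.table!r} used by {self.name}"
            )
        return fd.number
```

Here `FileDescriptor(parent_table, 3)` and `FileDescriptor(child_table, 3)` compare unequal even though both wrap the int 3, and passing the parent's descriptor to the child's `raw_fd()` raises `ValueError` instead of silently operating on whatever fd 3 happens to be in the child.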

So far it's been very expressive and powerful. It's been surprisingly easy to adapt my development to this new way of spawning and manipulating processes. I definitely think that cross-process operations (more generally, explicitly specifying the thing to act on in all syscalls, instead of implicitly working on the current process or whatever) are the right design for operating systems; they're much more expressive than both the posix_spawn style and the fork style.