User-space interrupts

Posted Oct 14, 2021 17:13 UTC (Thu) by anton (subscriber, #25547)
In reply to: User-space interrupts by neilbrown
Parent article: User-space interrupts

Unix has the very good idea that everything is a file. One of the benefits is that you can write some general routine on top of the system calls, and it is useful for all kinds of things. This admittedly does not work all the time, but we should strive for it.

In the present case, requiring the general routine to know whether it is used on a file that can result in EINTR, and how to behave in that case (which is very likely application-dependent) breaks modularity.

The application and its signal handler know how to deal with the situation, and the longjmp() approach is a good one in that sense. Of course, leaving an asynchronous signal with longjmp() has its dangers, but that's still the way we chose in Gforth (where asynchronous signals are rare).

User-space interrupts

Posted Oct 14, 2021 19:51 UTC (Thu) by nybble41 (subscriber, #55106)

IMHO the unnecessary complication here is that "regular files" are treated specially. When your "regular file" could be backed by a network filesystem, FUSE, NBD, etc., it really ought to be considered more like a socket: subject to potential short reads, and returning EINTR on signals whether or not any data has been read.

Users generally expect to be able to use sockets and pipes in place of regular files, e.g. using process substitution in Bash or named FIFOs or Unix-domain sockets in the filesystem, or arbitrary paths under /proc/$PID/fd/. Unless there is a good reason to require capabilities specific to regular files, for example lseek() or mmap()—or the application creates the file itself with O_EXCL—then applications ought to expect that read() and write() may process less data than requested even if the normal case involves regular files.
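A minimal sketch of what "expecting short reads" looks like in practice, assuming a plain blocking descriptor. The helper name `read_full` is hypothetical, not something from POSIX or any particular library:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling read() until `len` bytes have arrived,
 * end-of-file is reached, or a real error occurs.
 * Returns the number of bytes read (less than `len` only at EOF), or -1 on
 * error. Note that an error after partial progress discards the byte count.
 */
ssize_t read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, p + done, len - done);

        if (n > 0) {
            done += (size_t)n;      /* short read: go around again */
        } else if (n == 0) {
            break;                  /* end of file */
        } else if (errno == EINTR) {
            continue;               /* interrupted before any data: retry */
        } else {
            return -1;              /* real error, errno is set by read() */
        }
    }
    return (ssize_t)done;
}
```

A caller that previously wrote `read(fd, buf, len)` and assumed it got everything would call `read_full(fd, buf, len)` instead, and the same code then works for pipes, sockets and FIFOs as well as regular files.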

As for the longjmp() approach, that only works because the kernel backs out of the blocking call before invoking the signal handler. (A longjmp() call from a signal handler can't perform a non-local return out of arbitrary *kernel* stack frames.) At that point it's mostly a matter of policy whether the kernel restarts the system call after the handler returns or just returns EINTR to the caller (either always restarting or always returning EINTR would not simplify the kernel significantly), and in general matters of policy are best left to application or library code rather than the kernel. Wrapping every non-interruptible read() in a loop that restarts it until you get all the data you wanted is not substantially more code, or more *complex* code, than wrapping every read() which you might want to interrupt in a call to setjmp() and communicating that fact to the signal handler so it can decide whether to call longjmp().

POSIX also has these caveats regarding longjmp() from a signal handler:

> It is recommended that applications do not call longjmp() or siglongjmp() from signal handlers. To avoid undefined behavior when calling these functions from a signal handler, the application needs to ensure one of the following two things: … After the call to longjmp() or siglongjmp() the process only calls async-signal-safe functions and does not return from the initial call to main(). … Any signal whose handler calls longjmp() or siglongjmp() is blocked during *every* call to a non-async-signal-safe function, and no such calls are made after returning from the initial call to main().

It would be difficult to guarantee that either of these restrictions is met in a complex application with many library dependencies. For example, if you return from a signal handler with longjmp() and then call printf() without masking every signal whose handler could call longjmp(), then you've already broken both of those rules and invoked undefined behavior.