Not really userspace

Posted Jul 17, 2024 21:08 UTC (Wed) by Cyberax (✭ supporter ✭, #52523)
In reply to: Not really userspace by ddevault
Parent article: Redox to implement POSIX signals in user space

Ah, OK. This makes sense. You still need tools to poke the target memory across the process boundaries, but it's really no different than ptrace().

In your case, the supervisor (signald?) will be responsible for all the signals in the system. That would be fine for POSIX software, while native software could just use the "suspend thread" call directly, like the Go runtime, which uses signals to pre-empt threads stuck in tight inner loops as part of its user-space scheduling.

Not really userspace

Posted Jul 18, 2024 9:22 UTC (Thu) by 4lDO2 (guest, #172237)

> You still need tools to poke the target memory across the process boundaries, but it's really no different than ptrace().

Assuming the current implementation proposal does not significantly change when the process manager takes over the signal sending role from the kernel, this is to some extent true, but it doesn't require any dynamic mapping of other processes' memory, or using kernel interfaces for register modification. The target threads themselves handle register save/restore, and the temporary old register values (like the instruction pointer before being overwritten) are stored in a shared memory page, so apart from the suspend/interrupt logic, the kernel only needs to be able to set the target instruction pointer. It's too early to say, but maybe this will be reduced to a userspace-controlled IPI primitive?

(The kernel does already support ptrace though.)

Not really userspace

Posted Jul 18, 2024 19:51 UTC (Thu) by Cyberax (✭ supporter ✭, #52523)

> It's too early to say, but maybe this will be reduced to a userspace-controlled IPI primitive?

Looks like it. Basically, the whole design can be:

1. A separate signald process that provides the API for the signal masking and queueing.
2. Signal functions in libc simply do RPC calls to the signald.

The kernel then needs to have this additional functionality:
1. A syscall to pause a given thread, and return the thread context (register file and whatever additional information needed). The pause functionality can work even if the thread is in the kernel space, or it can be deferred to the syscall return time.
2. A syscall to resume a given thread with the provided thread context.
3. Asynchronous exceptions (like SIGBUS/SIGSEGV) in the kernel automatically pause the offending thread, and submit the thread context to the signald via some kind of IPC.

Signald can then do all the processing and masking logic. It also neatly removes from the kernel all the corner cases with double signals, signal targeting, and so on.

It also opens the way for a better API in the future.

Not really userspace

Posted Jul 19, 2024 15:34 UTC (Fri) by 4lDO2 (guest, #172237)

> 1. A separate signald process that provides the API for the signal masking and queueing.
> 2. Signal functions in libc simply do RPC calls to the signald.

This is not how the current implementation works, and would probably be too inefficient for signals to be meaningful for non-legacy software. Currently, sigprocmask/pthread_sigmask, sigaction, sigpending, and the sigentry asm which calls the actual signal handlers are implemented without any syscalls/IPC calls; they only modify shared memory locations. Sending process signals (kill, sigqueue) requires calling the kernel (later, the process manager) for synchronization reasons. And although sending thread signals (raise, pthread_kill) currently also calls the kernel, it may eventually be possible to do that in userspace as well, only calling the kernel if the target thread was blocked at the time the signal was sent, as with futexes. That is what I meant by "userspace-controlled IPI primitive".
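As a rough illustration of the shared-memory idea described above, here is a minimal C11 sketch. It is not Redox's actual code: the `shared_sigstate` layout, the `toy_*` names, and the `asleep` flag are all hypothetical, and a real implementation would need a more careful protocol to avoid lost wakeups.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-thread signal state, living in memory shared by
 * the sender and the receiver. Not the real Redox layout. */
struct shared_sigstate {
    _Atomic uint64_t pending; /* bit n-1 set: signal n is pending */
    _Atomic uint64_t blocked; /* bit n-1 set: signal n is masked */
    atomic_bool asleep;       /* receiver is currently blocked in the kernel */
};

static uint64_t sig_bit(int signo) { return UINT64_C(1) << (signo - 1); }

void toy_init(struct shared_sigstate *s) {
    atomic_init(&s->pending, 0);
    atomic_init(&s->blocked, 0);
    atomic_init(&s->asleep, false);
}

/* Like sigprocmask(SIG_BLOCK, ...): a plain atomic update, no syscall.
 * Returns the previous mask. */
uint64_t toy_block(struct shared_sigstate *s, uint64_t set) {
    return atomic_fetch_or(&s->blocked, set);
}

/* Like sigpending(): signals that are both pending and blocked. */
uint64_t toy_pending(struct shared_sigstate *s) {
    return atomic_load(&s->pending) & atomic_load(&s->blocked);
}

/* Marks the signal pending in shared memory. Returns true only when the
 * sender would additionally have to enter the kernel to wake the target,
 * which is the futex-like part. How a running target gets interrupted is
 * deliberately left out of this sketch. */
bool toy_send_thread_signal(struct shared_sigstate *s, int signo) {
    atomic_fetch_or(&s->pending, sig_bit(signo));
    bool deliverable = (atomic_load(&s->blocked) & sig_bit(signo)) == 0;
    return deliverable && atomic_load(&s->asleep);
}

/* Small self-check; returns 0 when every step behaves as described. */
int toy_sigstate_demo(void) {
    struct shared_sigstate s;
    toy_init(&s);
    toy_block(&s, sig_bit(10));
    if (toy_send_thread_signal(&s, 10)) return 1; /* blocked: no wake needed */
    if (toy_pending(&s) != sig_bit(10)) return 2;
    atomic_store(&s.asleep, true);
    if (!toy_send_thread_signal(&s, 12)) return 3; /* deliverable and asleep */
    return 0;
}
```

The signal numbers 10 and 12 are arbitrary small integers here, not tied to any platform's `SIGUSR1`/`SIGUSR2` values.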

> The kernel then needs to have this additional functionality:
> 1. A syscall to pause a given thread, and return the thread context (register file and whatever additional information needed). The pause functionality can work even if the thread is in the kernel space, or it can be deferred to the syscall return time.
> 2. A syscall to resume a given thread with the provided thread context.
> 3. Asynchronous exceptions (like SIGBUS/SIGSEGV) in the kernel automatically pause the offending thread, and submit the thread context to the signald via some kind of IPC.

That is exactly what ptrace allows, but this signals implementation is not based on tracing the target thread and externally saving/restoring the context, it's based on *internally* saving/restoring the context on the same thread. This is very similar to how an interrupt handler works. The kernel only needs to be able to save the old instruction pointer, jump userspace to the sigentry asm, and mask signals; the target context will then *itself* obtain a stack, push registers, etc. The same applies to exceptions, which will be handled *synchronously* (using a mechanism similar to signals), analogous to CPU exceptions like page faults. It might still make sense to allow configuring exceptions to be handled asynchronously as an alternative, so that a (new) tracer is always notified when e.g. a crash occurs and the program is not explicitly prepared for such events.
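The split of responsibilities described above can be modelled with a toy simulation in C. This only mimics the control flow with plain structs; every name (`toy_thread`, `toy_sig_page`, `toy_kernel_deliver`, `toy_sigentry`) is made up for illustration, and the real sigentry is assembly running on an actual CPU context, not a C function.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_NREGS 4

/* Hypothetical page shared between the "kernel" and the thread. */
struct toy_sig_page {
    uintptr_t saved_ip; /* instruction pointer before redirection */
    bool masked;        /* signals are masked while the handler runs */
};

/* Toy model of a thread: only the state this sketch needs. */
struct toy_thread {
    uintptr_t ip;
    uintptr_t regs[TOY_NREGS];
    uintptr_t sigentry; /* address of the userspace entry trampoline */
    struct toy_sig_page *page;
};

/* "Kernel" side: save the old IP, mask signals, jump to sigentry.
 * It never touches the general-purpose registers. */
bool toy_kernel_deliver(struct toy_thread *t) {
    if (t->page->masked) return false; /* a real system would keep it pending */
    t->page->saved_ip = t->ip;
    t->page->masked = true;
    t->ip = t->sigentry;
    return true;
}

/* "Userspace" side, run by the target thread itself: save registers,
 * call the handler, restore registers, unmask, resume at the saved IP. */
void toy_sigentry(struct toy_thread *t, void (*handler)(struct toy_thread *)) {
    uintptr_t saved[TOY_NREGS];
    memcpy(saved, t->regs, sizeof saved);
    handler(t);
    memcpy(t->regs, saved, sizeof saved);
    t->page->masked = false;
    t->ip = t->page->saved_ip;
}

/* A handler that deliberately clobbers every register. */
static void toy_clobbering_handler(struct toy_thread *t) {
    memset(t->regs, 0xff, sizeof t->regs);
}

/* Small self-check; returns 0 when every step behaves as described. */
int toy_delivery_demo(void) {
    struct toy_sig_page page = { 0, false };
    struct toy_thread t = { .ip = 0x1000, .regs = { 1, 2, 3, 4 },
                            .sigentry = 0x2000, .page = &page };
    if (!toy_kernel_deliver(&t)) return 1;
    if (t.ip != 0x2000 || page.saved_ip != 0x1000) return 2;
    if (toy_kernel_deliver(&t)) return 3; /* masked during handling */
    toy_sigentry(&t, toy_clobbering_handler);
    if (t.ip != 0x1000 || t.regs[0] != 1 || t.regs[3] != 4 || page.masked)
        return 4;
    return 0;
}
```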

Not really userspace

Posted Jul 19, 2024 17:40 UTC (Fri) by Cyberax (✭ supporter ✭, #52523)

> This is not how the current implementation works, and would probably be too inefficient for signals to be meaningful for non-legacy software

Honestly, signals shouldn't be used for non-legacy software. It's a bad primitive, they're not composable, there's a limited number of them, and so on.

If instead you have a primitive specifically designed as a way to manipulate running threads, then it might be more useful. A great example is the Go runtime, which uses signals to interrupt inner loops. Once the thread is interrupted, the runtime runs a conservative pointer scan over the most recent stack frame and registers, to protect new objects from being garbage-collected.

Additionally, the handler doesn't _have_ to be in a different process. It can be in a background thread within the same process, so the amount of context switches can be the same compared to regular signal handling.

> That is exactly what ptrace allows, but this signals implementation is not based on tracing the target thread and externally saving/restoring the context, it's based on *internally* saving/restoring the context on the same thread.

Yeah, that has been a constant issue with signals. It depends on the thread's environment being sane, so sigaltstack() was an inevitability. And if you have sigaltstack(), then why not just extend it to handling via an IPC?
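For readers who have not used it, this is roughly what sigaltstack() looks like in practice. The sketch below registers an alternate stack, installs a handler with `SA_ONSTACK`, and checks that the handler's local variable really lives on that stack. The 64 KiB size and the `altstack_demo` name are arbitrary choices made for this example.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

static char *alt_base;
static size_t alt_size;
static volatile sig_atomic_t ran_on_alt_stack = 0;

static void on_signal(int signo) {
    (void)signo;
    char marker; /* lives on whichever stack the handler is running on */
    uintptr_t p = (uintptr_t)&marker;
    ran_on_alt_stack = p >= (uintptr_t)alt_base &&
                       p < (uintptr_t)alt_base + alt_size;
}

/* Returns 1 if the handler ran on the alternate stack, 0 if it did not,
 * and -1 if setup failed. */
int altstack_demo(void) {
    alt_size = 64 * 1024; /* arbitrary, comfortably above typical minimums */
    alt_base = malloc(alt_size);
    if (alt_base == NULL) return -1;

    /* The memory must stay valid for as long as it is registered,
     * so this demo intentionally never frees it. */
    stack_t ss = { .ss_sp = alt_base, .ss_size = alt_size, .ss_flags = 0 };
    if (sigaltstack(&ss, NULL) != 0) return -1;

    struct sigaction sa;
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_ONSTACK; /* ask for delivery on the alternate stack */
    if (sigaction(SIGUSR1, &sa, NULL) != 0) return -1;

    raise(SIGUSR1); /* the handler has finished by the time raise() returns */
    return ran_on_alt_stack;
}
```

The handler still runs on the interrupted thread here; only the stack changes, which is the property the comment above is discussing.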

Not really userspace

Posted Jul 19, 2024 21:14 UTC (Fri) by 4lDO2 (guest, #172237)

> Honestly, signals shouldn't be used for non-legacy software. It's a bad primitive, they're not composable, there's a limited number of them, and so on.

I agree signals are a bad primitive for high-level code, and it's a shame POSIX has reserved almost all of the standard signals, many of them for events that signals are too low-level for (SIGWINCH, for example). signalfd or sigwait are much better in those cases, as is a high-level queue-based abstraction like the `signal-hook` crate. It would probably be better if the 'misc' signals were instead queue-only, or not signals at all, and if exceptions and signals were separated. SIGKILL and SIGSTOP could possibly be made non-signals too.

> If instead you have a primitive specifically designed as a way to manipulate running threads, then it might be more useful.

This is sort of what I'm trying to reduce the kernel part of the implementation to: just a way to IPI a running thread and set its instruction pointer, and then let that thread decide what it should do. Possibly even literally using IPIs, such as Intel's SENDUIPI feature, and possibly using "switch back from timer interrupt" hooks (with the additional benefit of automagically supporting restartable sequences). If both the sender and receiver are running simultaneously, this would need no context switches at all, only a mode switch for the receiver.

This is of course useful for runtimes like Go, the JVM, and possibly even async Rust runtimes (maybe a Redox driver can be signaled directly if a hardware interrupt occurs, coordinated with the runtime), which aren't (necessarily) based on switching stacks.

> Additionally, the handler doesn't _have_ to be in a different process. It can be in a background thread within the same process, so the amount of context switches can be the same compared to regular signal handling.

> Yeah, that has been a constant issue with signals. It depends on the thread's environment being sane, so sigaltstack() was an inevitability. And if you have sigaltstack(), then why not just extend it to handling via an IPC?

Switching stacks is, apart from TLS (assuming x86 psABI TLS is required), virtually the same thing as switching between green threads, and on some OSes regular threads and green threads are even the same (pre-Windows 11 with UMS, AFAIK). That could perhaps eventually include Redox. I don't understand why one would want IPC (assuming you mean inter-process and not inter-processor) except when tracing, as that would suffer from the usual context switch overhead, which is probably too high for a language/async runtime.