Serious vulnerability fixed with OpenSSH 9.8 [LWN.net]

Automatic async-signal-unsafe checking?

Posted Jul 1, 2024 13:38 UTC (Mon) by mss (subscriber, #138799) [Link] (38 responses)

I wonder whether it would be possible to catch calls from a signal handler to most common / risky async-signal-unsafe glibc functions (for example: malloc(), free(), pthread_mutex_lock(), etc.) and call abort() in this case.

This would probably require some sort of thread-local-storage flag, set before any user signal handler starts executing and reset on the return to the original interrupted code (via a trampoline?).
Or maybe even a "signals being handled" counter for easier nested signal handling.
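
A minimal sketch of the counter idea, assuming a C11 compiler and a single wrapped handler. The names handler_depth, in_signal_handler(), install_checked_handler() and checked_malloc() are hypothetical and not part of glibc, and checked_malloc() only stands in for the check a libc could make at the top of malloc(). Thread-local access from a handler is assumed to be safe here, which holds for the main executable but is not guaranteed everywhere.

```c
#define _DEFAULT_SOURCE   /* expose POSIX declarations under strict -std modes (glibc) */
#include <assert.h>
#include <signal.h>
#include <stdlib.h>

/* Hypothetical per-thread "signals being handled" counter. */
static _Thread_local volatile sig_atomic_t handler_depth;

/* The single user handler this sketch supports. */
static void (*user_handler)(int);

/* Nonzero while a wrapped handler is running on this thread. */
int in_signal_handler(void)
{
    return handler_depth > 0;
}

/* Trampoline installed in place of the user handler. */
static void trampoline(int signo)
{
    handler_depth++;          /* a counter, so nested signals also work */
    user_handler(signo);
    handler_depth--;
}

int install_checked_handler(int signo, void (*fn)(int))
{
    struct sigaction sa;

    assert(fn != NULL);
    user_handler = fn;
    sa.sa_handler = trampoline;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Stand-in for the check a libc could perform inside malloc(). */
void *checked_malloc(size_t size)
{
    if (in_signal_handler())
        abort();
    return malloc(size);
}

/* Demo handler: records what the check would have seen. */
volatile sig_atomic_t observed;
void demo_handler(int signo)
{
    (void)signo;
    observed = in_signal_handler();
}
```

Outside a handler checked_malloc() behaves like malloc(); called from demo_handler() it would abort.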

I'm sure this would uncover a lot of broken programs at first - much like GLib's GSlice allocator removal did - but eventually they would get fixed and the whole ecosystem would benefit in the long run.

There could even be some way to disable this behavior for old non-conforming programs that can't be fixed for any reason.

Automatic async-signal-unsafe checking?

Posted Jul 1, 2024 15:05 UTC (Mon) by wsy (subscriber, #121706) [Link]

Static analysis should be able to catch this too.

Automatic async-signal-unsafe checking?

Posted Jul 1, 2024 15:15 UTC (Mon) by rweikusat2 (subscriber, #117920) [Link] (16 responses)

That's not going to work because "got called from a signal handler" is not good (or bad) enough as a criterion to determine that there's a problem. That's only the case if the signal handler interrupted a function which isn't async-signal-safe. Eg, a program can run with all handled signals blocked all the time and just unblock them as part of a synchronous I/O multiplexing call like pselect, ppoll or epoll_pwait. In this case, signal handlers defined in a suitable way¹ can call whatever functions they want to call as no unsafe function will ever have been interrupted.

¹ Via sigaction using a signal mask which ensures that signal handlers themselves cannot be interrupted by handled signals.
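
A rough sketch of that arrangement, assuming a single-threaded program and SIGUSR1 as the only handled signal. The names setup_signals(), wait_for_events(), handled_count and last_message are made up for illustration. The handler calls snprintf(), which is not async-signal-safe, on the assumption that it can only ever run while the program is parked in pselect().

```c
#define _DEFAULT_SOURCE   /* expose POSIX declarations under strict -std modes (glibc) */
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/select.h>
#include <time.h>

static sigset_t wait_mask;     /* mask in effect only inside pselect() */
int handled_count;             /* demo bookkeeping */
char last_message[64];

/* Only ever runs while the main flow sits in pselect(), by construction. */
static void on_usr1(int signo)
{
    handled_count++;
    snprintf(last_message, sizeof last_message, "got signal %d", signo);
}

int setup_signals(void)
{
    struct sigaction sa;
    sigset_t block;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    /* Block the signal for normal execution and remember the old mask. */
    if (sigprocmask(SIG_BLOCK, &block, &wait_mask) != 0)
        return -1;
    sigdelset(&wait_mask, SIGUSR1);   /* pselect() will unblock it */

    sa.sa_handler = on_usr1;
    sa.sa_flags = 0;
    sigfillset(&sa.sa_mask);          /* handlers cannot interrupt each other */
    return sigaction(SIGUSR1, &sa, NULL);
}

/* One main-loop wait: fd readiness, timeout, or a handled signal. */
int wait_for_events(int nfds, fd_set *readfds, const struct timespec *timeout)
{
    assert(nfds >= 0);
    return pselect(nfds, readfds, NULL, NULL, timeout, &wait_mask);
}
```

A signal raised while the program is running stays pending and is delivered on the next wait_for_events() call.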

Automatic async-signal-unsafe checking?

Posted Jul 1, 2024 20:17 UTC (Mon) by geofft (subscriber, #59789) [Link] (15 responses)

Sure, but there's very little reason for a signal handler to be calling async-signal-unsafe functions at all, so it's fine for the rule to reject a few technically harmless programs.

You don't need to be doing nontrivial work inside your signal handler. If you're doing a standard I/O-multiplexing main loop with pselect or its ilk, all your signal handler needs to do is to set a flag or increment a counter that is checked by your main loop. Then you can do signal processing in the normal flow of your program. For instance, if you're a terminal app that responds to either input or the window being resized, you can have both of these flow into the same redraw code path instead of having a second special path for SIGWINCH. (This is equivalent in control flow to the old self-pipe trick, except without actually requiring a self-pipe.)

The consequence is that you can't handle a signal in the middle of handling input and you have to wait until the main loop. For instance, you can't handle a window resize while halfway through processing keyboard input. But that's exactly the same restriction as in the approach you propose with blocking all the signals and unblocking them with pselect: the signal can't be delivered while you're in userspace code, and is only delivered the next time you unblock it when you call pselect.

And neither approach deals with handlers for signals like SIGSEGV, which cannot be usefully deferred and must be handled immediately. You cannot use async-signal-unsafe functions to handle such signals at all.

Automatic async-signal-unsafe checking?

Posted Jul 1, 2024 20:57 UTC (Mon) by rweikusat2 (subscriber, #117920) [Link] (14 responses)

It's not a fine rule because it rejects perfectly correct programs making sensible use of existing kernel facilities: The kernel can dispatch signals to signal handlers just fine, there's no reason to reinvent that in userspace using awkward constructions like global flag variables to add special-case processing to loops dealing with I/O events.

Signals used for synchronous error notifications, eg, SIGSEGV, SIGFPE and SIGPIPE, can arguably not usefully¹ be deferred, but that's a different conversation.

¹ There's nothing which stops a program from blocking SIGSEGV. But that's not going to accomplish anything sensible (I can think of at the moment).

Automatic async-signal-unsafe checking?

Posted Jul 1, 2024 21:10 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link] (4 responses)

There are programs that use SIGSEGV for things like garbage collector read barriers during compaction. But that's a fairly niche usage.

Kinds of Signals

Posted Jul 2, 2024 15:23 UTC (Tue) by rweikusat2 (subscriber, #117920) [Link] (3 responses)

The original Bourne shell (in-)famously used SIGSEGV for memory allocation. Reportedly, it was coded based on the assumption of a heap of infinite size and caught and handled SIGSEGV to extend the heap as necessary for the faulting access to succeed. But that's really a completely different conversation.

There are 2 kinds of signals, signals generated because of asynchronously occurring events which are unrelated to the control flow of a program and signals which are synchronously generated because of something the program did. Many signals programs may need to handle belong in the first category, eg, SIGINT, SIGIO or SIGALRM. A strategy for handling such a signal is to defer it until a time when it's known that the process can safely be interrupted, ie, when it's blocked in the kernel waiting for something to happen.

This cannot be done for the second kind of signals. Eg, the following program (written with ed ;-)

#include <stdlib.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int a, b;
    a = atoi(argv[1]);
    b = atoi(argv[2]);
    printf("a / b + 1 = %d\n", a / b + 1);
    return 0;
}

will be terminated by a SIGFPE when the second argument is 0 or by a SIGSEGV when fewer than 2 arguments are provided. There's no way it could continue executing correctly "until later" after either signal was generated.

Kinds of Signals

Posted Jul 2, 2024 17:11 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

> There are 2 kinds of signals, signals generated because of asynchronously occurring events which are unrelated to the control flow of a program and signals which are synchronously generated because of something the program did.

Yeah. Windows deals with these signals using SEH (Structured Exception Handling) which is only slightly better than signals.

In theory, a signal can be split into two primitives:
1. Pause the offending process (SIGSTOP?).
2. Notify the handler running in a separate thread (or process) to handle the condition.

Having an explicit "paused" state is actually a good primitive in itself, and might be useful for other purposes (e.g. for a better process launch API). It's not _entirely_ better, you still have to deal with functions that might be terminated at a basically arbitrary point. But at least it reduces the asynchronous suspension problem to more well-known multithreading.

Kinds of Signals

Posted Jul 2, 2024 17:25 UTC (Tue) by geofft (subscriber, #59789) [Link] (1 responses)

Right, but also that second kind of signal handler cannot run in the scheme you proposed upthread where the signal is blocked until the program has finished executing any async-signal-unsafe functions in its normal flow. So we should not be thinking of them as the same sort of thing at all, even though they happen to both be implemented through UNIX signals.

Signals like SIGWINCH, SIGINT, SIGTSTP, SIGCHLD, SIGUSR1, etc. should be processed by a signal handler that does as little as possible and defers the work back to the main part of the program. If you wish, you can also defer the execution of the signal handler itself.

Signals like SIGSEGV, SIGFPE, etc. are genuinely hard to handle. You cannot use async-signal-unsafe / non-reentrant code to handle them, in the colloquial sense of "handle" as in "respond to": the signal handler is called at an arbitrary point that you cannot guarantee is not in the middle of some problematic function (or even between two CPU instructions implementing a single logical operation, like writing to a long-enough integer), and you have no ability to defer handling it. You need to be careful and there's no real way around it. In practice, for most programs the only useful thing to do is to print a backtrace or similar error message and exit, and you should just find some library that is designed for this purpose and can print a backtrace without any async-signal-unsafe calls. (For some very carefully designed programs e.g. in a managed runtime, you might be able to unwind the managed stack via unmanaged code, or inject a stack frame to perform unwinding and return to that, but now you're talking about something very different from OpenSSH as it exists today.)

(And of course I am putting aside the possibility of raising SIGSEGV etc. manually, which is deeply silly but means this particular signal is more like the first category - you can meaningfully block such a signal and receive it later.)

There are also two signals which in my opinion should not be caught at all - SIGPIPE and SIGXFSZ. Both of these signals have the property that the default disposition terminates the process, and they otherwise cause specific errnos to be returned, EPIPE and EFBIG. The default disposition is useful for ease of programming (e.g., SIGPIPE causing the process to silently terminate means that `yes | head` works the way you want with the obvious implementations of both commands), but if you're going to handle it yourself, you can just ignore the signals and handle the errno from write() - there's no advantage in handling it in a separate part of the code. While you can defer handling the signals, the calling code will have already gotten EPIPE/EFBIG, so it doesn't seem particularly useful to handle them asynchronously.
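
As a small illustration of the "ignore the signal, handle the errno" approach for SIGPIPE (the same shape would apply to SIGXFSZ and EFBIG), here is a sketch. The helper names ignore_sigpipe() and write_all() and the return-code convention are hypothetical.

```c
#define _DEFAULT_SOURCE   /* expose POSIX declarations under strict -std modes (glibc) */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Ignore SIGPIPE process-wide so that write() reports EPIPE instead. */
int ignore_sigpipe(void)
{
    struct sigaction sa;

    sa.sa_handler = SIG_IGN;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGPIPE, &sa, NULL);
}

/*
 * Write the whole buffer.
 * Returns 0 on success, 1 if the reader went away (EPIPE),
 * -1 on any other error (errno is left set).
 */
int write_all(int fd, const char *buf, size_t len)
{
    assert(buf != NULL);
    while (len > 0) {
        ssize_t n = write(fd, buf, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted, just retry */
            if (errno == EPIPE)
                return 1;          /* handled inline, no signal handler */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}
```

The caller learns about the closed pipe at the point of the write, in ordinary control flow.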

Kinds of fault handlers

Posted Jul 2, 2024 20:41 UTC (Tue) by riking (subscriber, #95706) [Link]

The first action of every useful SIGSEGV handler (that might successfully return to retry the instruction) I've seen is to perform a range check on the faulting memory address, the faulting program counter, or both. Either we're touching a special memory region where I know how to map in the memory, or we're running a special code region that wants me to signal failure and jump the PC to the failure branch.
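
A sketch of the "special memory region" case, assuming Linux-like behavior where returning from the handler retries the faulting instruction. The region bookkeeping and the name lazy_region_create() are hypothetical. mprotect() is not on the POSIX async-signal-safe list; this sketch assumes it is a plain system call, as it is on Linux.

```c
#define _DEFAULT_SOURCE   /* MAP_ANONYMOUS and POSIX declarations under strict -std modes */
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

static char *region;        /* hypothetical lazily populated region */
static size_t region_len;
static long page_size;

static void on_segv(int signo, siginfo_t *info, void *ctx)
{
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)region;

    (void)ctx;
    /* First action: range-check the faulting address. */
    if (addr >= base && addr < base + region_len) {
        uintptr_t page = addr & ~((uintptr_t)page_size - 1);

        /* Known region: make the page accessible, then retry. */
        if (mprotect((void *)page, (size_t)page_size,
                     PROT_READ | PROT_WRITE) == 0)
            return;
    }
    /* Unknown fault: restore the default action; the retry kills us. */
    signal(signo, SIG_DFL);
}

/* Reserve an inaccessible region whose pages are enabled on first touch. */
char *lazy_region_create(size_t len)
{
    struct sigaction sa;
    void *p;

    page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    region = p;
    region_len = len;

    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;   /* needed to receive si_addr */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return NULL;
    return region;
}
```

Any fault outside the region falls through to the default disposition, so genuine bugs still terminate the process.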

Automatic async-signal-unsafe checking?

Posted Jul 1, 2024 21:25 UTC (Mon) by geofft (subscriber, #59789) [Link] (8 responses)

Yes, I guess my position is that it's fine to reject some perfectly correct programs that could be sensibly written some other way, if the goal is protecting from exploitable errors. It's possible to construct a perfectly correct program that fails any possible check/rule you might want to write about signal safety (this basically boils down to the halting problem), but that's no reason to entirely give up on trying to make things better.

SIGWINCH is a form of I/O, fundamentally. If you were rendering to an X window, a resize would be delivered as I/O. If you are internally implementing telnet or something, you would get informed of a resize on a remote terminal as I/O. The fact that it isn't I/O is, in my opinion, the awkward bit, and I expect most programs would find it less awkward to handle it in the control flow they use for all other terminal input instead of with a special case.

Automatic async-signal-unsafe checking?

Posted Jul 2, 2024 15:49 UTC (Tue) by rweikusat2 (subscriber, #117920) [Link] (7 responses)

This sensible other way - Just make sure signal handlers don't call functions which aren't async-signal safe! - has bitten the OpenSSH project in the backside twice in the last 25 years (2006 and now). IMHO, this means it isn't very sensible. If the OpenSSH people cannot reliably get it right, who else will?

Automatic async-signal-unsafe checking?

Posted Jul 2, 2024 16:49 UTC (Tue) by geofft (subscriber, #59789) [Link] (6 responses)

"Just make sure signal handlers don't call functions which aren't async-signal safe!" is not my position. I agree it is not sensible and I agree that is the root of the problem here.

My position is that signal handlers should not do any nontrivial work at all, and ideally not call any functions at all - they should just store data into a preallocated buffer of some sort (which could be a simple flag or counter, if you don't need any metadata sent with the signal) and return immediately. If you absolutely must, you can call write(2) and no other functions, either to do a self-pipe or to stderr. If you need to abort the program, you can also call _exit(2) (not exit(3)!). That's it.
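
For concreteness, a handler restricted to that list might look like the sketch below, using the self-pipe variant. The names wake_fd, on_signal() and self_pipe_install() are hypothetical, and only one signal is wired up.

```c
#define _DEFAULT_SOURCE   /* expose POSIX declarations under strict -std modes (glibc) */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

static int wake_fd = -1;   /* write end of the self-pipe */

/* The entire handler: one write(2), nothing else. */
static void on_signal(int signo)
{
    int saved_errno = errno;            /* write() may clobber errno */
    unsigned char byte = (unsigned char)signo;
    ssize_t n = write(wake_fd, &byte, 1);

    (void)n;                            /* a full pipe just drops the wakeup */
    errno = saved_errno;
}

/* Returns the read end to add to the main loop's poll set, or -1. */
int self_pipe_install(int signo)
{
    int fds[2];
    struct sigaction sa;

    assert(signo > 0);
    if (pipe(fds) != 0)
        return -1;
    /* Non-blocking, so neither the handler nor the loop can get stuck. */
    if (fcntl(fds[0], F_SETFL, O_NONBLOCK) != 0 ||
        fcntl(fds[1], F_SETFL, O_NONBLOCK) != 0)
        return -1;
    wake_fd = fds[1];

    sa.sa_handler = on_signal;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, NULL) != 0)
        return -1;
    return fds[0];
}
```

The main loop reads bytes from the returned descriptor and does the real work there, in normal context.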

If you're doing any amount of work in the signal handler, there is too much risk of calling an async-signal-unsafe function (e.g., syslog) because you start thinking of it as a normal function - or, as OpenSSH did, because you call other normal functions in other files that are async-signal-safe today but might not be tomorrow.

Your position, as I understand it (and please correct me if I'm wrong) is that it's actually fine for signal handlers to call functions that aren't async-signal-safe, provided that you make sure that the signals are blocked while any such functions might be running and only unblocked when any such functions are complete. I argue this is no less dangerous; just like how, in a complicated program, you might take a complicated function that only calls async-signal-safe functions and introduce an async-signal-unsafe call without realizing what you're doing, you might also unblock a signal (e.g. by calling pselect() or something) from inside async-signal-unsafe / non-reentrant code without realizing what you're doing. In the example here of using alarm() to interrupt a connection after a timeout, it is quite plausible that someone might find that the alarm isn't being delivered as promptly as needed and unblock SIGALRM in an unsound way to solve the bug.

I am in agreement with you that you should impose a rule on the programmer that a complicated response to a signal should be executed at a time when it is not interrupting other work in the program. My claim is there's a straightforward way for the programmer to abide by this rule: reuse the mechanisms that the program already has for making sure that responses to other external events - i.e. I/O - are executed without interrupting other work in the program, whatever that mechanism is. If it's an event loop managing non-blocking code, hook into the event loop. If it's straight-line code doing I/O, check for EINTR and see if a flag got flipped.

By not introducing a new type of control flow (essentially a COME FROM!), the question of how to get that control flow right disappears. UNIX signal handlers don't give you a way to fully avoid the change in control flow, but if your signal handler just writes to a variable (atomically/consistently) and returns, then you are as close as possible to avoiding the change in control flow.

Another totally reasonable option, instead of a minimal signal handler, is to make a dedicated signal-handling thread that sits in sigwaitinfo() and then uses thread-safe mechanisms to convey receipt of the signal back to the main thread. This actually does avoid any change in control flow - assuming your program is already multi-threaded. If a program is currently single-threaded, then you can't actually fully handle the signal in the new signal-receiving thread, because the data structures you would use to do so probably aren't thread-safe. So there's no particular advantage over having a minimal signal handler, and you'll have to resist the temptation to do anything other than, again, writing to some thread-safe data structure (e.g., via pthread_cond_signal) and going back to sigwaitinfo().

Automatic async-signal-unsafe checking?

Posted Jul 2, 2024 17:35 UTC (Tue) by rweikusat2 (subscriber, #117920) [Link] (5 responses)

I fail to see any difference between "just don't call functions which aren't async-signal-safe" and your elaboration about just that, because that's what it boils down to. What precisely is or isn't "nontrivial work" is in the eye of the beholder. OpenSSH has had two vulnerabilities because of this approach in the last 25 years. And OpenSSH is a high-profile security-related program developed by experts in this area.

An alternate option is to defer handling of asynchronous signals until a time when the process can safely be interrupted. For a program structured around a synchronous I/O multiplexing loop, this will be when the process has again called into the kernel to be notified when more I/O is possible. System call variants exist specifically for this use case, namely the p-variants of select, poll etc. This obviously also requires that all handled signals are blocked while signal handlers themselves are executing as they will have been unblocked by the I/O multiplexing call. An advantage of this is that it doesn't need userspace code for dispatching signals.

Yet another option is to block all signals an application wants to handle and receive signals as input on a signalfd file descriptor. This needs some form of userspace signal dispatching but will work in environments where sigaction isn't easily usable (eg, perl).
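
A Linux-only sketch of the signalfd variant, for a single signal. The helper names open_signal_fd() and read_signal() are hypothetical; a real program would add the descriptor to its existing poll set and dispatch on the signal number itself.

```c
#define _GNU_SOURCE       /* signalfd is Linux-specific */
#include <assert.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Block the signal and return a descriptor that becomes readable for it. */
int open_signal_fd(int signo)
{
    sigset_t mask;

    assert(signo > 0);
    sigemptyset(&mask);
    sigaddset(&mask, signo);
    /* The signal must stay blocked, or it would be delivered normally. */
    if (sigprocmask(SIG_BLOCK, &mask, NULL) != 0)
        return -1;
    return signalfd(-1, &mask, SFD_CLOEXEC);
}

/* Read one queued signal; returns its number, or -1 on error. */
int read_signal(int sfd)
{
    struct signalfd_siginfo si;
    ssize_t n = read(sfd, &si, sizeof si);

    if (n != (ssize_t)sizeof si)
        return -1;
    return (int)si.ssi_signo;   /* userspace dispatch would start here */
}
```

No handler is installed at all in this scheme, so nothing ever interrupts the program's normal flow.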

Automatic async-signal-unsafe checking?

Posted Jul 2, 2024 21:40 UTC (Tue) by NYKevin (subscriber, #129325) [Link] (4 responses)

> What precisely is or isn't "nontrivial work" is in the eye of the beholder.

The comment you are replying to just explained exactly what "nontrivial work" is. It's anything other than setting a flag, writing one byte into a self-pipe (or the like), or calling _exit(2). These are not examples. They are an exhaustive list. Literally any other code under the sun is "nontrivial."

> System call variants exist specifically for this use case, namely the p-variants of select, poll etc. This obviously also requires that all handled signals are blocked while signal handlers themselves are executing as they will have been unblocked by the I/O multiplexing call. An advantage of this is that it doesn't need userspace code for dispatching signals.

This works great, right up until one of your libraries tries to use alarm(2) (or sleep(3), usleep(3), setitimer(2), etc.) to do something interesting, and finds that signals are blocked and it does not work. And yes, that probably does go against the "no nontrivial work" rule in most practical cases, but in userspace, the sad reality is that you do not own all of the code in your process, and sometimes have to live with libraries doing ill-advised things.

(Never mind the even crazier possibility that one of your libraries will decide it is a good idea to manipulate the signal mask!)

Automatic async-signal-unsafe checking?

Posted Jul 3, 2024 14:18 UTC (Wed) by rweikusat2 (subscriber, #117920) [Link] (3 responses)

>> What precisely is or isn't "nontrivial work" is in the eye of the beholder.
> The comment you are replying to just explained exactly what "nontrivial work" is. It's anything other than setting a flag, writing one byte into a self-pipe (or the like), or calling _exit(2). These are not examples. They are an exhaustive list. Literally any other code under the sun is "nontrivial."

The comment reflects the author's opinion of what's considered "non-trivial". The set of async-signal-safe functions is significantly larger than just write and _exit, which very strongly suggests that other people do not share this opinion. For another example, OpenBSD (and also some Linux distributions) has an async-signal-safe syslog function specifically to enable it to be used safely from asynchronous signal handlers.

Automatic async-signal-unsafe checking?

Posted Jul 3, 2024 22:48 UTC (Wed) by NYKevin (subscriber, #129325) [Link]

Operating systems will happily let you carry out all manner of bad ideas. The fact that something is supported doesn't magically make it into a good idea.

Automatic async-signal-unsafe checking?

Posted Jul 4, 2024 1:52 UTC (Thu) by geofft (subscriber, #59789) [Link] (1 responses)

You are continuing to misunderstand or mischaracterize my proposal. I am not making the proposal of restricting your signal handlers to the entire set of async-signal-safe functions, so I don't know why you are commenting on how large that set is - it is not relevant to my proposal.

My proposal is to permit a narrow and well-defined set of operations, much smaller than the set of async-signal-safe functions. I am not expressing the opinion that all other functions are async-signal-unsafe. There are many other async-signal-safe functions; you should, nonetheless, not call them.

Contrary to what you say, OpenSSH has not tried my proposal. They have tried the proposal you are attacking - of restricting signal handlers to async-signal-safe calls, and paying attention to how that set changes on various OSes. That is the proposal that (as you point out) has failed in practice. We should not adopt it. I am not calling for people to adopt it.

For context, here is the patch in TFA that OpenSSH suggests to fix the vulnerability in existing releases:

diff --git a/log.c b/log.c
index 9fc1a2e2e..191ff4a5a 100644
--- a/log.c
+++ b/log.c
@@ -451,12 +451,14 @@ void
 sshsigdie(const char *file, const char *func, int line, int showfunc,
     LogLevel level, const char *suffix, const char *fmt, ...)
 {
+#ifdef SYSLOG_R_SAFE_IN_SIGHAND
 	va_list args;
 
 	va_start(args, fmt);
 	sshlogv(file, func, line, showfunc, SYSLOG_LEVEL_FATAL,
 	    suffix, fmt, args);
 	va_end(args);
+#endif
 	_exit(1);
 }

It's very clear to see that they're taking the approach of calling or not calling functions depending on whether they are async-signal-safe, and this is exactly what I am claiming one should not do. Under my proposal, there should not be any function calls to other user-written functions in a signal handler - the very existence of this sshsigdie() function is a design flaw, because it is not immediately obvious that it is called from a signal handler and subject to async-signal-safety restrictions. (In fact, it's called indirectly from a macro sigdie(), further obscuring the link between the signal handler and this code.) The idea of doing conditional compilation based on the current OS's async signal safety guarantees is an indication that you are already well down an incorrect path.

My proposal is to change the code in OpenSSH such that the SIGALRM handler is not responsible for ending the child process, it is just responsible for notifying whatever code was being called that the child process should shut down. If a read() from the remote end returns EINTR and a flag is set, that's a sign to call the regular code to exit the process. The handler should not be doing it.
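
A sketch of that shape, not OpenSSH's actual code: the handler only sets a flag, and the read path notices EINTR plus the flag and reports a timeout to its caller. The names are hypothetical. It uses setitimer() with a repeating interval rather than alarm(), as an assumption of this sketch, so that a signal landing just before read() starts is followed by another one that does interrupt it.

```c
#define _DEFAULT_SOURCE   /* expose POSIX declarations under strict -std modes (glibc) */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/time.h>
#include <unistd.h>

static volatile sig_atomic_t timed_out;

/* The whole handler: record that the timer fired. */
static void on_alarm(int signo)
{
    (void)signo;
    timed_out = 1;
}

int install_alarm_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = on_alarm;
    sa.sa_flags = 0;            /* no SA_RESTART: read() must fail with EINTR */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, NULL);
}

/* Arm a repeating timer; ms == 0 disarms it. Clears the flag either way. */
int arm_timeout(long ms)
{
    struct itimerval it;

    assert(ms >= 0);
    it.it_value.tv_sec = ms / 1000;
    it.it_value.tv_usec = (ms % 1000) * 1000;
    it.it_interval = it.it_value;
    timed_out = 0;
    return setitimer(ITIMER_REAL, &it, NULL);
}

/* Returns bytes read, 0 on EOF, -1 on error, -2 once the timeout was seen. */
ssize_t read_or_timeout(int fd, void *buf, size_t len)
{
    for (;;) {
        ssize_t n;

        if (timed_out)
            return -2;          /* caller runs the regular shutdown path */
        n = read(fd, buf, len);
        if (n >= 0 || errno != EINTR)
            return n;
        /* EINTR: loop around and look at the flag. */
    }
}
```

The decision to exit and any logging then happen in the caller, in normal context.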

My proposal is that, if you want to use alarm()/SIGALRM for a timeout, your alarm handling should be in the course of how you process all other events - much as it would be if you were using timerfd for a timeout on an OS that has it. If you want to syslog, pass a message to your main program to syslog. The fact that things like timerfd exist is, I would argue, evidence that this is in fact the consensus view. So is the fact that we have had numerous new types of kernel features in the last few decades exposed via pollable file descriptors and none via new signals (not counting the real-time signals, which are a rather different sort of thing from signals like SIGALRM and SIGWINCH).

In fact, it looks like the patch that OpenSSH applied in 9.8 to fix the bug was to do exactly this sort of change - change the signal handler to exit without logging anything, and move responsibility for logging into the parent sshd process.

And this patch points out what's actually going on in the vulnerability: it's a timeout that needs to be able to interrupt a blocking PAM call. The approach of setting a flag doesn't work here, because you're in code that you don't control in a PAM module which cannot check your flag; you have to call _exit(). But by the same token, your proposal of blocking signals outside of pselect() etc. wouldn't work here either: if you only unblocked SIGALRM while you're inside a pselect() call of your own, you would never interrupt the stuck PAM module.

Automatic async-signal-unsafe checking?

Posted Jul 4, 2024 5:33 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

> The approach of setting a flag doesn't work here, because you're in code that you don't control in a PAM module which cannot check your flag; you have to call _exit().

These days, I would move the timeout to the parent process entirely, and instead I will send heartbeat messages to it.
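
One way to read that suggestion, as a rough sketch: the child writes heartbeat bytes to a pipe, and the parent enforces the deadline with poll() and kills the child when a heartbeat is missed, so no signal handler is involved. The names supervise() and spawn_stuck_child() are hypothetical, and the child here is only a stand-in that sends one heartbeat and then hangs.

```c
#define _DEFAULT_SOURCE   /* expose POSIX declarations under strict -std modes (glibc) */
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Parent side. Returns 0 if the child closed the pipe (finished),
 * 1 if it was killed for missing a heartbeat, -1 on error.
 */
int supervise(pid_t child, int hb_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = hb_fd, .events = POLLIN };

    assert(timeout_ms > 0);
    for (;;) {
        char buf[64];
        ssize_t r;
        int n = poll(&pfd, 1, timeout_ms);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0) {                       /* no heartbeat in time */
            kill(child, SIGKILL);
            waitpid(child, NULL, 0);
            return 1;
        }
        r = read(hb_fd, buf, sizeof buf);   /* consume heartbeat bytes */
        if (r == 0) {                       /* EOF: child is done */
            waitpid(child, NULL, 0);
            return 0;
        }
        if (r < 0 && errno != EINTR)
            return -1;
    }
}

/* Stand-in for a child stuck in code it does not control. */
pid_t spawn_stuck_child(int *hb_read_fd)
{
    int fds[2];
    pid_t pid;

    if (pipe(fds) != 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        ssize_t n;

        close(fds[0]);
        n = write(fds[1], "h", 1);          /* one heartbeat */
        (void)n;
        for (;;)
            pause();                        /* then hang forever */
    }
    close(fds[1]);
    *hb_read_fd = fds[0];
    return pid;
}
```

Logging the timeout then naturally happens in the parent, after supervise() returns.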

Automatic async-signal-unsafe checking?

Posted Jul 1, 2024 21:28 UTC (Mon) by geofft (subscriber, #59789) [Link] (17 responses)

Thinking about this a bit more - if we're entertaining the idea of making changes to existing code (and we should!), is there any reason to allow an arbitrary-code signal handler in the first place?

The basic problem with async-signal-safety is that the C library might internally have structures that are not protected against reentrancy when interrupted at some arbitrary opcode. But this is a difficult property to ensure in your own structures and algorithms, too. For example, from the exploit writeup, they're attacking two functions: they're attacking libc malloc()/free() which on recent glibc versions does not take locks on the global heap when the program is single-threaded (because, apart from signals, there should be no way to have multiple simultaneous calls), and they're also attacking pam_start()/pam_end() in the PAM library, which is called with a pam_handle_t that is not expected to be shared across threads. It is possible to write a PAM implementation where pam_start() and pam_end() are async-signal-safe, but there is no particular reason to do so.

So blocking libc functions that are known to be async-signal-unsafe is not sufficient in theory, and would not have been sufficient to protect against this vulnerability in practice.

There's a fairly standard pattern for handling signals in complex programs: store any information you need from the signal handler - most commonly, just set a flag that it was caught - and return immediately, and make sure your main program processes that information promptly, as soon as it's done with its current unit of work. If you're writing a program with a ppoll/pselect/etc. main loop, then you essentially treat receiving a signal as receiving any other kind of input from a file descriptor. You finish handling the current piece of input quickly and without blocking the whole program and you pick up the next piece of input. This automatically and robustly deals with the problem of non-reentrant data structures: there is no reentrancy at all, and you're expected to leave each data structure in a usable condition when you return to the main loop. If you're writing in a programming language with async support, then there's effectively a main loop inside the runtime that does this, and the next time you yield/await, the runtime can switch to your signal handler. And you already have to ensure that your shared data structures are left usable when you yield/await, so there is no new requirement.

What if we only allowed signal handlers to store data (into an atomic/reentrant data structure) - set a flag, increment a counter, write information about the signal to a pre-allocated buffer, write(2) to a self-pipe, etc. - or, if absolutely needed, call a couple of simple functions like _exit(2) or write(2) to stderr? Are there real-world programs that need to do something more complicated (e.g. printf/sprintf) in a signal handler, or would be harder to read/maintain if they adopted this model?

You could implement this with either a programmatic static (or dynamic) analysis tool for what happens in signal handlers, with an EDSL to implement these restricted handlers, or simply with a human review rule that basically any complexity in a signal handler is suspicious.

Automatic async-signal-unsafe checking?

Posted Jul 2, 2024 0:12 UTC (Tue) by nix (subscriber, #2304) [Link] (16 responses)

> There's a fairly standard pattern for handling signals in complex programs: store any information you need from the signal handler - most commonly, just set a flag that it was caught - and return immediately, and make sure your main program processes that information promptly, as soon as it's done with its current unit of work.

That works in programs, but it makes signal handling in a library even more of a nightmare than it was before (previously, all a library needed was a function the main program had to call to tell it what signals it could use, but now it needs to hook into the caller's event loop while knowing nothing about how that loop is structured. This is liable to be quite hard.)

Automatic async-signal-unsafe checking?

Posted Jul 2, 2024 1:18 UTC (Tue) by dskoll (subscriber, #1630) [Link] (3 responses)

Couldn't a library spawn a thread and do the bulk of the signal handling there (using its own self-pipe, for example)? A lot more of POSIX is thread-safe than async-signal-safe.

Automatic async-signal-unsafe checking?

Posted Jul 2, 2024 4:24 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

You can do that, although you'll have to make sure that every non-signaled thread has all the signals blocked. Including the threads that might be started by libraries.

Interestingly, Windows NT developers didn't want to bother with full POSIX compatibility for their command-line-oriented libc. So signals were delivered in a separate thread, started for that purpose.

And honestly, this makes so much more sense for most of the signals.

Automatic async-signal-unsafe checking?

Posted Jul 2, 2024 12:22 UTC (Tue) by nix (subscriber, #2304) [Link]

Yes, though as Cyberax notes, blocking everything is important too -- and God forbid you have to communicate with other threads or share anything but the most trivial state without a locking nightmare: it's often better just to serialize everything and use a separate process. At least this *exposes* the fact that concurrency is hard rather than hiding it and making it simply impossible to deal with like asynchronous signals do, but that doesn't really make it less hard, or less full of lethal traps where every single bit of the code has to consider that the data structures it's manipulating might be used by something else running at the same time. It's still really difficult, unscoped coupling.

Another problem I've hit is that thread-directed signals remain a bit of a late-addition inconsistent-API nightmare. You can at least rely on tgkill() these days, but if you want to fire a timer and direct *that* at a thread (something I had to do specifically to get around the fact that other things, like ptrace(), are *only* thread-directed)? Suddenly you're messing about with sigev_notify = SIGEV_SIGNAL | SIGEV_THREAD_ID and _sigev_un._tid and other "no you should not touch this, there's minimal documentation, it's supposed to be for threading libs only and underscores are everywhere" horrors.

(All pain comes from truth, so here's what triggered this:

<https://github.com/oracle/dtrace-utils/commit/c883bd437cf...>
<https://github.com/oracle/dtrace-utils/commit/234b39beb0e...>

where I had to use threads because of wanting to do something, anything else at the same time as ptrace()/waitpid(), then I had to use condvars and implement a whole little RPC layer because of threads, then I had to actually rely on hitting things with timer-triggered signals and EINTR returns because of wanting the thread to do anything else at all rather than just waitpid()ding forever or sitting in a CPU-spinning polling loop or getting stuck in race conditions because of course you can't turn waitpid() into a poll()able entity without non-upstreamed patches: and that's just a small fraction of the complexity here, almost none intrinsic to the problem space and all working around the APIs. The Linux API is really an abominable tangled un-designed mess in this area.)

Automatic async-signal-unsafe checking?

Posted Jul 2, 2024 12:49 UTC (Tue) by geofft (subscriber, #59789) [Link]

Yes, but with a couple of downsides. A program cannot call unshare(CLONE_NEWUSER or CLONE_NEWPID) if it is multithreaded. In the versions of glibc affected by the bug that started this discussion, having multiple threads in the process causes it to start doing locking on malloc, so it slows down the program. Programs that call fork() and do significant work in the child (as opposed to calling async-signal-safe functions and finally _exit() or execve()) can act weird for very similar reasons to the bug that started this discussion: because only the forking thread is duplicated, if the library's helper thread was in the middle of malloc(), then in the forked child, the heap is locked and nothing will ever unlock it, so calls to malloc() will deadlock, whereas in a single-threaded program this problem could not happen.
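
To make the fork() point concrete, here is a minimal sketch of the "safe" shape: the child restricts itself to async-signal-safe calls (write() and _exit() here), so it never touches a heap lock that some other thread might have held at fork time. `run_child_safely` and the exit code 42 are invented for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, and in the child use only async-signal-safe functions.
 * Returns the child's exit status, or -1 on error. */
int run_child_safely(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: no malloc(), no stdio, nothing that takes a libc lock. */
        static const char msg[] = "child: async-signal-safe calls only\n";
        ssize_t n = write(STDERR_FILENO, msg, sizeof msg - 1);
        (void)n;                /* best effort, ignore write errors */
        _exit(42);              /* _exit(), not exit(): skips atexit handlers */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Replacing the write() with something like printf() or malloc() is what turns this into the deadlock-prone case described above once another thread exists.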

In general it's rude for a library to create a new thread without clearly documenting that fact and providing an async/sans-IO alternative. If you absolutely must do this, fork a child process and then do the usual things to daemonize (double-fork, setsid()) and use pipes or shared memory or something to communicate with synchronous function calls in the parent.

Automatic async-signal-unsafe checking?

Posted Jul 2, 2024 13:18 UTC (Tue) by geofft (subscriber, #59789) [Link] (4 responses)

I'm not sure it's that hard in practice. You ask the caller to take responsibility for hooking into their own event loop. For instance, if you're a terminal drawing library, you say "You need to register a SIGWINCH handler, and you need to call foobar_redraw() when it's called. You could signal(SIGWINCH, foobar_redraw) but that exposes you to all the async-signal-unsafety problems. Instead, write a signal handler that sets a flag, and call foobar_redraw() from your main loop." Almost certainly your library is also saying "When there is input from the terminal, call foobar_parse()."

If the implementation of the handler requires doing I/O, e.g. because it performs the redraw and that might block on a slow network connection, then you need to structure your library some other way so each call into the library is quick and nonblocking. So maybe it becomes "You need to register a SIGWINCH handler that calls foobar_need_redraw()," and your library already has a foobar_do_stuff() that needs to be called from the main loop under certain conditions. And probably your library can guarantee that foobar_need_redraw() is async-signal-safe, since it just needs to set a flag.

It is annoying that you need the library to convey info about what file descriptors it's interested in in a generic way (such that the caller's main loop can use select, poll, epoll, kqueue, WaitForMultipleObjectsEx, etc.), but that's not really a problem specific to signal handling. If the only thing the library is doing is responding to signals and not to other I/O, then this problem doesn't apply and the way to implement it is pretty straightforward I think.

Automatic async-signal-unsafe checking?

Posted Jul 2, 2024 14:38 UTC (Tue) by paulj (subscriber, #341) [Link]

^ This.

Signal handlers should do no more than set flags (in a sig-sensitive way - handler can run multiple times), and let the main programme deal with things.

Libraries should provide functions to notify them of signals they may care about, outside of the signal context - just a standard API documentation thing. It's not hard.

Automatic async-signal-unsafe checking?

Posted Jul 2, 2024 15:56 UTC (Tue) by Vorpal (guest, #136011) [Link] (2 responses)

Some other languages have done a better job of sharing and managing the global resource that is signals. In particular, for Rust there exists a de facto standard crate (Rust-speak for library). Note that it is *not* part of the actual standard library (std) bundled with the compiler, just a de facto standard crate.

The library is at https://crates.io/crates/signal-hook

For a start, it solves the sharing and chaining of signal handlers. But it also provides sensible default behaviours for setting flags (with an unsafe alternative for arbitrary code if you really need it), as well as signal-safe channels etc.

Of course for C and C++ the ecosystem is way too splintered and also full of existing mess for something like that to gain traction. And thread safety (and even more so async signal safety) in C is really hard. Rust helps a bunch here, though async signal safety can still be an issue (especially if using functions from the underlying C library), which is why the "arbitrary code" option is unsafe.

I wouldn't be surprised if higher level languages like Go, Python etc also do sensible things to abstract away the low level mess, but I don't know those languages well enough to speak about them on this topic.

Automatic async-signal-unsafe checking?

Posted Jul 2, 2024 20:45 UTC (Tue) by dskoll (subscriber, #1630) [Link] (1 responses)

Not sure about Python, but I believe Perl just remembers that a signal has occurred and then calls the appropriate $SIG{'HANDLER'} at a safe time within the evaluation loop. When you control the entire environment, it's easy to impose a policy like this.

Signals are one of the misfeatures of UNIX.

Automatic async-signal-unsafe checking?

Posted Jul 2, 2024 21:49 UTC (Tue) by NYKevin (subscriber, #129325) [Link]

This is what Python does as well.

Annoyingly, it has to acquire the GIL before it can execute the signal handler, and even more annoyingly, it tries to handle SIGINT by default, so (in extreme cases) Python can hang for several seconds between pressing ^C and the process actually terminating, even if the programmer made no attempt to handle any signals. Fortunately, it does not try to handle SIGQUIT, so you can just press ^\ instead of ^C.

Automatic async-signal-unsafe checking?

Posted Jul 4, 2024 16:07 UTC (Thu) by sionescu (subscriber, #59410) [Link] (6 responses)

Libraries shouldn't ever handle signals.

Automatic async-signal-unsafe checking?

Posted Jul 4, 2024 16:32 UTC (Thu) by pizza (subscriber, #46) [Link] (5 responses)

> Libraries shouldn't ever handle signals.

SDL wouldn't function without use of signals.

(for extra fun, it triggers them as well as consuming them...)

Automatic async-signal-unsafe checking?

Posted Jul 4, 2024 17:59 UTC (Thu) by sionescu (subscriber, #59410) [Link]

SDL could be one exception, but also "Dear lord!".

Automatic async-signal-unsafe checking?

Posted Jul 9, 2024 13:01 UTC (Tue) by paulj (subscriber, #341) [Link] (3 responses)

Why?

In particular, why couldn't it use the "generally the sanest approach for libraries" thing of defining handlers to be called (from non-sigcontext generally) by the user application if a signal was caught?

Automatic async-signal-unsafe checking?

Posted Jul 9, 2024 16:35 UTC (Tue) by sionescu (subscriber, #59410) [Link] (1 responses)

That's a completely different thing from a library defining a handler, thus potentially overwriting a handler defined elsewhere. Since signals are a global property of a process, you can't have libraries define their own signal handlers because that's not composable. It's perfectly reasonable to expose a function that the app should call in case of signal.

Automatic async-signal-unsafe checking?

Posted Jul 9, 2024 17:07 UTC (Tue) by paulj (subscriber, #341) [Link]

By handler I meant "A function to handle the signal, that the user calls from normal context" - not a handler to install directly as the signal handler. I.e., the user handles the signal itself, then later calls the library-specific handlers, from the normal user-code context.

That's really the only sane API for libraries that need to know about signals. I'm trying to understand why that wouldn't work for a library, as pizza suggests.

Automatic async-signal-unsafe checking?

Posted Jul 10, 2024 11:14 UTC (Wed) by pizza (subscriber, #46) [Link]

SDL is a platform abstraction layer. If your application is directly mucking with signals (and/or threads), then it's playing outside of (and potentially interfering with) SDL's sandbox. Given that the reason for using SDL is easy portability, this is counter-productive.

SDL hooks signals so it can clean up after itself properly. That can be disabled at init time.

It also generates signals and expects the application to supply a callback to handle them in a realtime manner -- This is notably used in SDL's audio subsystem, and is also one of the approaches supported by ALSA.

It also uses signals in its internal threading implementation.

All of this was true in the SDL1 days; I don't know if SDL2 or beyond does things differently.

But as I mentioned above, ALSA can use signals as one of three mechanisms for informing applications to refill audio buffers. (the other two being blocking I/O and non-blocking select/poll-based I/O). Which one you use depends on how your application is structured.

Automatic async-signal-unsafe checking?

Posted Jul 2, 2024 18:08 UTC (Tue) by Sesse (subscriber, #53779) [Link]

Valgrind used to have a separate tool for this. I'm not sure if it's still available/maintained, though?

Automatic async-signal-unsafe checking?

Posted Jul 12, 2024 18:35 UTC (Fri) by mrugiero (guest, #153040) [Link]

I feel something similar to how OpenBSD only allows syscalls to be made from libc, by checking the range of the memory mapping (IIRC that was the trick), could work here: maybe signal handlers could live in a specific section of the binary, and what they can call could be restricted based on that. Calling code outside the authorized sections would cause the OS to kill the process.

openbsd

Posted Jul 1, 2024 18:03 UTC (Mon) by higuita (guest, #32245) [Link] (1 responses)

OpenBSD is not affected due to an older protection layer... does anyone know what it is?

openbsd

Posted Jul 1, 2024 18:20 UTC (Mon) by lutchann (subscriber, #8872) [Link]

From the link in the article:

> OpenBSD is notably not vulnerable, because its SIGALRM handler calls syslog_r(), an async-signal-safer version of syslog() that was invented by OpenBSD in 2001.

Authenticated port knocking

Posted Jul 1, 2024 20:42 UTC (Mon) by mb (subscriber, #50428) [Link] (26 responses)

Since the xz-Disaster I started to add a second layer of authentication to my private services (including OpenSSH):

https://github.com/mbuesch/letmein

That improves my sleep quality ever since and especially today.

Authenticated port knocking

Posted Jul 1, 2024 20:59 UTC (Mon) by flussence (guest, #85566) [Link] (25 responses)

Long ago I decided to put everything I feasibly can behind wireguard. It's a bit wasteful to have 3 or 4 layers of encryption on some traffic, but never having to worry about things like this makes it worth it.

Authenticated port knocking

Posted Jul 2, 2024 9:37 UTC (Tue) by mb (subscriber, #50428) [Link] (8 responses)

I currently have my VPN behind the letmein knocker.

My personal goal is for the first barrier to be as simple as possible. No complicated crypto protocols. No complicated architecture.
Letmein is not 100% there, yet, IMO. It's still more complicated than I'd like it to be. I plan to strip it down further.

It's a first step for now and it's much better than what I had before.

Authenticated port knocking

Posted Jul 12, 2024 18:48 UTC (Fri) by mrugiero (guest, #153040) [Link] (7 responses)

I'd be interested in that fork when it's there.

Authenticated port knocking

Posted Jul 12, 2024 19:13 UTC (Fri) by mb (subscriber, #50428) [Link] (6 responses)

Uhm, what fork?

Authenticated port knocking

Posted Jul 12, 2024 23:30 UTC (Fri) by mrugiero (guest, #153040) [Link] (5 responses)

I assumed you'd work on a fork for the stripping down. Are you the original author? Or was your idea to do it for yourself and keep it local?

Authenticated port knocking

Posted Jul 13, 2024 7:15 UTC (Sat) by mb (subscriber, #50428) [Link] (4 responses)

Ah. Got it. :)
I am the original author and by now I have already stripped the things that I wanted to strip.
I don't plan to remove anything else. See the open issues on GitHub for things that I still plan to do.

Authenticated port knocking

Posted Jul 18, 2024 23:48 UTC (Thu) by mrugiero (guest, #153040) [Link] (3 responses)

Nice! Very interesting project. I'm interested in trying it in an s6 system. I think it can be made to work with the systemd code (that is, without touching the actual Rust code) to handle notification and fd passing with a little wrapping in execline. Would you be interested in some example files as docs when I do that?

Authenticated port knocking

Posted Jul 19, 2024 5:59 UTC (Fri) by mb (subscriber, #50428) [Link] (2 responses)

Sure. I'll always be happy to include examples in the documentation. I have no idea what an s6 system is, though :)
Also code changes are ok, as long as they are not too intrusive for the rest of the use cases.

letmein can run without systemd. If it doesn't see the systemd environment variables, it will create the listening socket on its own. Also see the --no-systemd option. It doesn't create a pidfile or fork into the background like traditional daemons do, though.
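
For readers wondering what "seeing the systemd environment variables" involves: the sketch below follows the documented socket-activation convention (LISTEN_PID must match our PID, LISTEN_FDS is the count, and the descriptors start at fd 3). It is written in C purely for illustration and is not letmein's actual Rust code; `inherited_listen_fds` and the sanity cap of 1024 are made up for the example.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

#define LISTEN_FDS_START 3      /* first inherited fd under the convention */

/* Parse a non-negative decimal number; returns -1 if malformed. */
static long parse_number(const char *s)
{
    char *end;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0' || v < 0)
        return -1;
    return v;
}

/* Returns how many listening sockets were passed in by the service manager,
 * or 0 if the process should create its own listening socket. */
int inherited_listen_fds(void)
{
    const char *pid_s = getenv("LISTEN_PID");
    const char *fds_s = getenv("LISTEN_FDS");
    if (pid_s == NULL || fds_s == NULL)
        return 0;

    /* The variables may be inherited from a parent; only trust them if
     * they were meant for this very process. */
    if (parse_number(pid_s) != (long)getpid())
        return 0;

    long n = parse_number(fds_s);
    if (n < 0 || n > 1024)      /* arbitrary sanity limit for the sketch */
        return 0;
    return (int)n;
}
```

If this returns n > 0, the sockets are fds LISTEN_FDS_START through LISTEN_FDS_START + n - 1; otherwise the daemon falls back to socket()/bind()/listen() itself.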

Looking forward to your contribution. Feel free to create an issue and/or PR.

Authenticated port knocking

Posted Jul 20, 2024 3:51 UTC (Sat) by mrugiero (guest, #153040) [Link] (1 responses)

> Sure. I'll always be happy to include examples in the documentation. I have no idea what an s6 system is. though :)

It's an alternative init system, supervision-tree based, but with readiness notification and some parts of socket activation. I have a hobby server where I used it for fun and that's the one I'd like to use letmein with.

> Also code changes are ok, as long as they are not too intrusive for the rest of the use cases.

Cool. I don't think they will be needed (and s6 is rather niche, so I wouldn't want to burden you with maintaining code for it), but it's nice to see you're open minded about it if it turns out I'm wrong.

> letmein can run without systemd. If it doesn't see the systemd environment variables, it will create the listening socket on its own. Also see the --no-systemd option. It doesn't create a pidfile or fork into the background like traditional daemons do, though.

No forking needed :)
I'll see if the features it adds are compatible with s6, if not then I'll probably use it without. Is it only to receive the listening socket? I was thinking of notifying readiness as well.

> Looking forward to your contribution. Feel free to create and issue and/or PR.

Thanks! It'll probably be some time till I have free time for that. I will try this weekend but can make no promises.

Authenticated port knocking

Posted Jul 20, 2024 6:51 UTC (Sat) by mb (subscriber, #50428) [Link]

Yes, it receives the systemd socket and then notifies readiness to systemd. But only if it has been started by systemd.

Authenticated port knocking

Posted Jul 2, 2024 14:26 UTC (Tue) by Trelane (subscriber, #56877) [Link] (15 responses)

How does Wireguard help here? It is just a service that listens to the network, no?

WireGuard as defense-in-depth

Posted Jul 2, 2024 15:23 UTC (Tue) by gnoutchd (guest, #121472) [Link] (14 responses)

WireGuard is a VPN implementation with strong public-key authentication. The idea is to configure your network/sshd/iptables/etc. so that you can't reach sshd directly, only via WireGuard. That way, to exploit an sshd bug you'd also have to find a way around WireGuard (find and compromise an authorized peer, find a bug in WireGuard itself, etc.). Defense-in-depth, if you will.

WireGuard as defense-in-depth

Posted Jul 3, 2024 5:32 UTC (Wed) by Trelane (subscriber, #56877) [Link] (13 responses)

But now you just changed where the hole has to be--wireguard instead of sshd. I don't see what you gain.

WireGuard as defense-in-depth

Posted Jul 3, 2024 7:17 UTC (Wed) by Wol (subscriber, #4433) [Link] (12 responses)

> But now you just changed where the hole has to be--wireguard instead of sshd. I don't see what you gain.

Because you misunderstand the problem. He hasn't *changed* where the hole is, he's added a whole new outer wall round the castle. So now you have to find *two* holes, not one.

Before you can exploit sshd (the old situation), you now have to exploit wireguard first - the new outer bailey.

Cheers,
Wol

WireGuard as defense-in-depth

Posted Jul 3, 2024 7:34 UTC (Wed) by mjg59 (subscriber, #23239) [Link] (11 responses)

Wireguard is running on the remote server, so you only need to find a hole in Wireguard to gain access to the remote server. If your argument is that the pre-auth attack surface on Wireguard is smaller than the pre-auth attack surface on sshd then that's a reasonable argument to make given sufficient evidence, but otherwise there isn't an additional wall here. If Wireguard has an exploitable vulnerability, you simply exploit that instead of caring about sshd at all.

WireGuard as defense-in-depth

Posted Jul 3, 2024 9:10 UTC (Wed) by Wol (subscriber, #4433) [Link] (4 responses)

Ah.

As I understood it, the target was the *local* machine, which could only be reached by ssh from the remote machine. So if the remote machine was protected by wireguard, then you would need to compromise each in turn.

Cheers,
Wol

WireGuard as defense-in-depth

Posted Jul 3, 2024 15:18 UTC (Wed) by mjg59 (subscriber, #23239) [Link] (3 responses)

If I've already compromised wireguard, why do I need to compromise sshd? I've presumably already got RCE.

WireGuard as defense-in-depth

Posted Jul 3, 2024 19:05 UTC (Wed) by Wol (subscriber, #4433) [Link]

Maybe I'm dense, and we'll need flussence to explain, but if wireguard is running on the firewall (which presumably has no legitimate reason to initiate connections to internal machines), don't you need some other exploit - for example sshd - to compromise an internal machine?

(Yes, once you're in the firewall, compromising other machines is easier ...)

I'm assuming wireguard and sshd are NOT on the same machine ...

Cheers,
Wol

WireGuard as defense-in-depth

Posted Jul 4, 2024 21:18 UTC (Thu) by mussell (subscriber, #170320) [Link] (1 responses)

Wireguard can run as a userspace daemon which uses TUN/TAP and does not need to run as root. By comparison, OpenSSH needs to run as root with all capabilities, no seccomp filter, and YesNewPrivs due to a combination of how awful Unix authentication is and trying to do privilege separation on its own. In theory it should be possible to run a SSH daemon without root by having systemd create the session and having it pass back ttys similar to how machinectl shell/run0 works, but I highly doubt the OpenSSH devs would implement something that uses systemd.

unprivileged sshd

Posted Jul 5, 2024 4:46 UTC (Fri) by donald.buczek (subscriber, #112892) [Link]

> In theory it should be possible to run a SSH daemon without root by having systemd create the session and having it pass back ttys similar to how machinectl shell/run0 works, but I highly doubt the OpenSSH devs would implement something that uses systemd.

Yes, this already works and sshd doesn't need to be aware. For example, we abuse sshd to offer interactive sessions via our cluster scheduler. sshd is running unprivileged, the network socket is already connected externally (so we run it in inetd mode) and keys are created per session and communicated out of band [^1].

[^1] https://github.molgen.mpg.de/mariux64/mxtools/blob/master...

WireGuard as defense-in-depth

Posted Jul 3, 2024 12:19 UTC (Wed) by dskoll (subscriber, #1630) [Link] (2 responses)

It's true that exploiting Wireguard would be just as good as exploiting sshd, but I suspect there are a lot more SSH scanners prowling the Internet looking for vulnerable servers than there are Wireguard scanners. So in a way, there's a bit of security through obscurity.

WireGuard as defense-in-depth

Posted Jul 5, 2024 1:30 UTC (Fri) by aaronmdjones (subscriber, #119973) [Link] (1 responses)

There cannot be such a WireGuard scanner as you put it. WireGuard does not respond at all, to any unsolicited traffic, unless that traffic resembles a handshake containing its public key. If you don't know a WireGuard device's public key, you can't get it to send anything to you, ever. This is because the public keys are Curve25519 key agreement keys, not Ed25519 signature keys (WireGuard doesn't use any asymmetric signature crypto), so if you don't know its public key, you can't even agree a session negotiation key with it.

WireGuard as defense-in-depth

Posted Jul 8, 2024 4:16 UTC (Mon) by raven667 (subscriber, #5198) [Link]

I don't know anything technical about Wireguard, but that's a neat property. You might still be able to scan for it though, based on the _lack_ of an ICMP port unreachable response, unless you go out of your way to block ICMP, but doing so can have other unintended consequences so more modern firewalling guides don't recommend it the way they used to.

WireGuard as defense-in-depth

Posted Jul 3, 2024 15:10 UTC (Wed) by Trelane (subscriber, #56877) [Link]

Exactly. Thank you.

WireGuard and/or OpenSSH

Posted Jul 3, 2024 16:19 UTC (Wed) by gnoutchd (guest, #121472) [Link]

Ah, fair point. If we run WireGuard on the same host as the sshd, and if we assume WireGuard and OpenSSH are equally likely to have remote-root bugs, then I'd agree that it wouldn't buy you much.

However, WireGuard and sshd need not run on the same host. What if you put WireGuard on a gateway device?

Also, I actually would expect WireGuard to be less prone to security bugs than OpenSSH, for two reasons:

1. WireGuard's protocol and implementation are deliberately very simple, with exactly one cryptosystem and authentication scheme, whereas sshd must carry ~15-20 years of negotiable ciphersuites and authentication schemes.
2. sshd has to interact with stuff like syslog and pam and nss and systemd, and this has caused trouble several times (e.g. xz-utils, this CVE, CVE-2006-5051). WireGuard's integrations appear to be much simpler.

I've not looked closely at the code of either, though, so I could be wrong.

WireGuard as defense-in-depth

Posted Jul 3, 2024 16:41 UTC (Wed) by mb (subscriber, #50428) [Link]

>If your argument is that the pre-auth attack surface on Wireguard is smaller than the pre-auth attack
>surface on sshd then that's a reasonable argument to make given sufficient evidence, but otherwise
>there isn't an additional wall here.

Yes. That's exactly why I use an authenticating knocker instead of a VPN. It's only about reducing the external pre-auth attack surface. The knocker is not as small as I'd like it to be, but there's still room for improvement.

alpine

Posted Jul 2, 2024 6:05 UTC (Tue) by LtWorf (subscriber, #124958) [Link] (1 responses)

Quite annoying to see articles and comments claiming that alpine is not vulnerable when the email is saying it probably is, but they never tested it.

alpine

Posted Jul 2, 2024 6:18 UTC (Tue) by gioele (subscriber, #61675) [Link]

> Quite annoying to see articles and comments claiming that alpine is not vulnerable when the email is saying it probably is, but they never tested it.

It's not only random commenters, musl's maintainer stated:

> OpenSSH sshd on musl-based systems is not vulnerable to RCE via CVE-2024-6387 (regreSSHion).
>
> This is because we do not use localtime in log timestamps and do not use dynamic allocation (because it could fail under memory pressure) for printf formatting.
>
> While the sshd bug is UB (AS-unsafe syslog call from signal context), very deliberate decisions we made for other good reasons reduced the potential impact to deadlock taking a lock.

https://fosstodon.org/@musl/112711796005712271

Spectacularly well written

Posted Jul 2, 2024 11:15 UTC (Tue) by ringerc (subscriber, #3071) [Link] (1 responses)

I am truly impressed by the quality of the writing and explanation in that advisory.

Truly spectacular technical writing.

Spectacularly well written

Posted Jul 2, 2024 16:55 UTC (Tue) by wtarreau (subscriber, #51152) [Link]

I agree, and their advisories generally are of high quality. This one reminded me of some articles published in Phrack Magazine decades ago. These were always a pleasure to discover and read.

timing mitigations?

Posted Jul 2, 2024 16:56 UTC (Tue) by cozzyd (guest, #110972) [Link]

I wonder if it would be helpful to randomize the exact auth timeout a little to avoid potential signal race conditions? Like if the timeout is 600 seconds, randomly choose between 599.5 and 600.5 seconds each time. I would guess that would make it much harder to trigger a race condition like this?
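
To be concrete about what I mean, something along these lines (only a sketch of the arithmetic, nothing from OpenSSH; `jittered_timeout_ms` is a made-up name and the caller is assumed to supply `r` from a decent random source):

```c
/* Return a timeout in milliseconds drawn from
 * [base_ms - jitter_ms, base_ms + jitter_ms], using the caller-supplied
 * random number r. The modulo introduces a slight bias, which is
 * ignored here for simplicity. */
long jittered_timeout_ms(long base_ms, long jitter_ms, unsigned long r)
{
    unsigned long span = 2UL * (unsigned long)jitter_ms + 1UL;
    return base_ms - jitter_ms + (long)(r % span);
}
```

With base_ms = 600000 and jitter_ms = 500 that gives anything from 599.5 to 600.5 seconds. Sub-second precision means arming the timeout with setitimer() or a POSIX timer rather than alarm(), which only takes whole seconds.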