LWN: Comments on "Deferring seccomp decisions to user space" https://lwn.net/Articles/756233/ This is a special feed containing comments posted to the individual LWN article titled "Deferring seccomp decisions to user space". Thu, 16 Oct 2025 09:55:28 +0000 Deferring seccomp decisions to user space https://lwn.net/Articles/764842/ https://lwn.net/Articles/764842/ zlynx <div class="FormattedComment"> I implemented a Go netlink reader for connection tracking. It wasn't hard for the most part (I do wish someone had explicitly written a few comments about data alignment instead of making it implicitly hidden in macros if I remember correctly).<br> <p> I do wish that the netlink formats were better documented.<br> <p> Calling C code from Go causes all sorts of complex interactions with the green threads and garbage collection so it is not a good idea to casually link into CGo.<br> </div> Fri, 14 Sep 2018 14:46:46 +0000 Deferring seccomp decisions to user space https://lwn.net/Articles/764818/ https://lwn.net/Articles/764818/ mathstuf <div class="FormattedComment"> Well, that'd work if Go projects weren't so intent on not using *any* non-Go code in their stacks…&lt;/snark&gt;<br> <p> To not make this just a snark, I'll add a data point. I've seen git-lfs not want to fork out to Git for things like `remote get-url` and rather re-implement `insteadOf` and `pushInsteadOf` yet again. And so git-lfs is still broken with alias remote URLs that differ in push and pull. Attribute reading is also broken in the case of user "[attr]" attributes. Yes, both have issues filed (and I don't know Go (yet?) well enough to fix it myself).<br> <p> I believe the *only* thing they fork for is to find out the version of Git used elsewhere. 
There might be one or two more instances as well, but they're of a similar level of actual functionality sharing.<br> </div> Fri, 14 Sep 2018 13:32:11 +0000 File paths? https://lwn.net/Articles/756685/ https://lwn.net/Articles/756685/ Cyberax <div class="FormattedComment"> Right now BPF syscall filter programs can't access the string arguments at all, so there's no problem.<br> </div> Tue, 05 Jun 2018 21:34:52 +0000 File paths? https://lwn.net/Articles/756673/ https://lwn.net/Articles/756673/ wahern <div class="FormattedComment"> Is that how it works _now_? Is any of that work already in place?<br> <p> </div> Tue, 05 Jun 2018 19:26:00 +0000 Deferring seccomp decisions to user space https://lwn.net/Articles/756583/ https://lwn.net/Articles/756583/ SEJeff <div class="FormattedComment"> We almost had this (a better ptrace, no userspace API on top of it) with utrace, but Andrew Morton (ultimately) shot it down and Linus didn't like it. This caused Roland McGrath to stop working on utrace / uprobes almost entirely.<br> <p> Some light reading:<br> <p> <a href="https://lwn.net/Articles/371210/">https://lwn.net/Articles/371210/</a><br> <a href="https://yarchive.net/comp/linux/utrace.html">https://yarchive.net/comp/linux/utrace.html</a><br> </div> Tue, 05 Jun 2018 00:07:01 +0000 File paths? https://lwn.net/Articles/756570/ https://lwn.net/Articles/756570/ Cyberax <div class="FormattedComment"> No, there's no race condition. The kernel code would have to copy strings into the message sent to the userspace helper.<br> <p> The helper code can then do all the required open/access/stat stuff and return the results as a file descriptor (open) or a static block of data (stat/access).<br> <p> Obviously, copying the parameters will add some overhead, but it should be way less than doing additional ptrace/read_mem calls from the userspace helper.<br> </div> Mon, 04 Jun 2018 21:41:47 +0000 File paths? 
https://lwn.net/Articles/756569/ https://lwn.net/Articles/756569/ wahern <div class="FormattedComment"> Isn't that susceptible to a race condition? systrace (<a href="https://en.wikipedia.org/wiki/Systrace">https://en.wikipedia.org/wiki/Systrace</a>) never saw widespread adoption precisely because of the race condition, both on Linux and on OpenBSD (with an in-kernel implementation). The TOCTTOU race is that a signal handler or thread changes the path between the check and the actual open.<br> <p> The solution is to copy the path or otherwise make it immutable. That's costly, and it's why the seccomp BPF filter originally didn't support processing the file path string. Has that changed?<br> <p> </div> Mon, 04 Jun 2018 21:37:40 +0000 Deferring seccomp decisions to user space https://lwn.net/Articles/756394/ https://lwn.net/Articles/756394/ josh <div class="FormattedComment"> I find myself curious if this could be used to emulate non-existent system calls, or even invent an entirely new syscall interface with arbitrary syscall numbers. The userspace program receives the syscall number and arguments; it could do anything with those.<br> </div> Sun, 03 Jun 2018 22:57:47 +0000 Deferring seccomp decisions to user space https://lwn.net/Articles/756393/ https://lwn.net/Articles/756393/ oscode <div class="FormattedComment"> Thanks for sharing! Your LSM projects look interesting; it's just a shame they can't be dynamically loaded.<br> </div> Sun, 03 Jun 2018 22:47:19 +0000 File paths? https://lwn.net/Articles/756370/ https://lwn.net/Articles/756370/ jhoblitt <div class="FormattedComment"> I wonder if anyone has gathered statistics on syscall distribution for various types of workloads?<br> <p> I suspect that `statfs()` and `access()` are also frequent syscalls with string params. 
File distribution programs, such as HTTP servers, can produce a fairly extreme number of `access()` calls.<br> </div> Sun, 03 Jun 2018 13:21:57 +0000 Deferring seccomp decisions to user space https://lwn.net/Articles/756368/ https://lwn.net/Articles/756368/ jhoblitt <div class="FormattedComment"> The "gain" of using netlink is a standard client lib, such as libnl, could be used instead of every service having a custom interface with semantics that evolve differently than other kernel interfaces over time. Imagine what the state of interoperability would be if most "ReSTful" web APIs used a custom serialization format instead of JSON?<br> </div> Sun, 03 Jun 2018 13:11:28 +0000 File paths? https://lwn.net/Articles/756348/ https://lwn.net/Articles/756348/ Cyberax <div class="FormattedComment"> Still, a special case for automatic transmission of string arguments might make sense. open/stat calls are probably 90% of security-related calls and special-casing their arguments might give a sizable performance benefit.<br> </div> Sun, 03 Jun 2018 05:16:02 +0000 File paths? https://lwn.net/Articles/756350/ https://lwn.net/Articles/756350/ roc <div class="FormattedComment"> rr uses /proc/&lt;pid&gt;/mem to read tracee memory instead of PTRACE_PEEKUSER, even though it's already ptracing, because the former is so much faster. I assume gdb does too.<br> </div> Sun, 03 Jun 2018 03:11:45 +0000 File paths? https://lwn.net/Articles/756347/ https://lwn.net/Articles/756347/ dezgeg <div class="FormattedComment"> Presumably with process_vm_readv(). Still a user/kernel context switch per string but still much, much better than ptrace()...<br> </div> Sun, 03 Jun 2018 01:49:17 +0000 File paths? https://lwn.net/Articles/756346/ https://lwn.net/Articles/756346/ Cyberax <div class="FormattedComment"> How does it access the file paths? 
I guess the filter can just ptrace the requesting process, but that's already piling up overhead on top of the overhead.<br> <p> Perhaps a special case for strings could be added?<br> </div> Sun, 03 Jun 2018 00:47:12 +0000 Deferring seccomp decisions to user space https://lwn.net/Articles/756341/ https://lwn.net/Articles/756341/ TheJH <div class="FormattedComment"> But doing that reasonably safely (without race conditions) is a big PITA, especially if the sandboxed process is multithreaded. If you look at the path argument of an open() call and use that to determine whether the call should be allowed, it's probably safest to do the actual open() in the supervisor process and then install the resulting FD in the sandboxed process.<br> </div> Sat, 02 Jun 2018 20:07:49 +0000 Deferring seccomp decisions to user space https://lwn.net/Articles/756339/ https://lwn.net/Articles/756339/ smurf <div class="FormattedComment"> Wouldn't handling of these calls be a whole lot easier if there was a way to tell the monitored program to proceed with the syscall in question? 
I'd assume that calls like open() or exec() on behalf of the tracee are a major PITA to do correctly – in other words: a security hole in waiting.<br> </div> Sat, 02 Jun 2018 19:14:43 +0000 Deferring seccomp decisions to user space https://lwn.net/Articles/756334/ https://lwn.net/Articles/756334/ rvolgers <div class="FormattedComment"> This seems really nice for the seccomp use case, but it does kind of put the spotlight on how awkward ptrace is in comparison.<br> <p> I really wish we'd one day get a clean file descriptor based debugging API instead of the ptrace pseudo-reparenting and signal abuse nonsense.<br> <p> <p> </div> Sat, 02 Jun 2018 17:27:25 +0000 Deferring seccomp decisions to user space https://lwn.net/Articles/756331/ https://lwn.net/Articles/756331/ skx <p>I have to say I'm interested in seeing how this turns out - at least partially because I wrote a Linux security module which defers checks for <tt>exec</tt> calls to user-space. The code is reasonably clean, and the overhead of having to exec a user-space binary is essentially unnoticeable.</p> <p>The code is here:</p> <ul> <li><a href="https://github.com/skx/linux-security-modules/tree/master/security/can-exec">https://github.com/skx/linux-security-modules/tree/master/security/can-exec</a></li> </ul> <p>BPF has so many uses, and I'm loving the way it is becoming better documented, and more useful. I'm sure it is only a matter of time before it is invoked by Linux security modules.</p> Sat, 02 Jun 2018 16:42:41 +0000 Deferring decisions to userspace? https://lwn.net/Articles/756328/ https://lwn.net/Articles/756328/ corbet That is a good point, something I didn't mention properly in the article. It behaves a lot like <tt>SECCOMP_RET_ERRNO</tt>. I have added a little text to try to fill that in, thanks. 
Sat, 02 Jun 2018 15:22:10 +0000 Deferring seccomp decisions to user space https://lwn.net/Articles/756323/ https://lwn.net/Articles/756323/ brauner <div class="FormattedComment"> This is a much needed patchset and I'm really happy that since the first design discussions<br> at Plumbers last year it has seen rapid development thanks to Tycho. No one has really done<br> a lot of bikeshedding on it which is great!<br> It seems that people didn't really notice how many use cases this will enable once it is merged.<br> If I were one of the gVisor guys I'd take a very close look at this patchset and whether it'd be possible<br> to kick out ptrace.<br> It's excellent that we've managed to decouple this from the eBPF seccomp patchset. The last step<br> is to hopefully not tie this to netlink as this looks like a lot of protocol for not much gain in this<br> case. But we'll see.<br> </div> Sat, 02 Jun 2018 13:04:25 +0000 Deferring decisions to userspace? https://lwn.net/Articles/756321/ https://lwn.net/Articles/756321/ TheJH <div class="FormattedComment"> The article is titled "Deferring seccomp decisions to user space". As far as I can tell, the referenced patchset doesn't actually defer the whole decision; it allows userspace to synchronously handle the syscall and provide a return value, but userspace can't decide to just let the syscall through, it can only emulate it.<br> </div> Sat, 02 Jun 2018 12:54:25 +0000