LWN: Comments on "New AT_ flags for restricting pathname lookup" https://lwn.net/Articles/767547/ This is a special feed containing comments posted to the individual LWN article titled "New AT_ flags for restricting pathname lookup". Wed, 15 Oct 2025 12:38:13 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767806/ https://lwn.net/Articles/767806/ eru Thanks to all for explaining the need and use of the <i>something</i>at() calls. Mon, 08 Oct 2018 04:18:22 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767789/ https://lwn.net/Articles/767789/ rweikusat2 <div class="FormattedComment"> rc = fstatat(dirfd, d_ent-&gt;d_name, &amp;, 0);<br> <p> This should have been<br> <p> rc = fstatat(dirfd, d_ent-&gt;d_name, &amp;st, 0);<br> <p> and was but got deleted when "htmlifying" the source ... :-(<br> </div> Sun, 07 Oct 2018 19:27:35 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767787/ https://lwn.net/Articles/767787/ rweikusat2 Manipulating pathnames means "doing string operations", something that's fairly cumbersome in C. 
For example, consider the following toy program: <pre>
#define _GNU_SOURCE
#include &lt;dirent.h&gt;
#include &lt;errno.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;stdio.h&gt;
#include &lt;sys/stat.h&gt;
#include &lt;unistd.h&gt;

static char *cwd[] = { ".", NULL };

int main(int argc, char **argv)
{
    DIR *dir;
    struct dirent *d_ent;
    struct stat st;
    int dirfd, rc;

    (void)argc;
    ++argv;
    if (!*argv) argv = cwd;    /* no arguments: list the current directory */

    do {
        dirfd = open(*argv, O_RDONLY, 0);
        if (dirfd == -1) {
            perror("open");
            continue;
        }

        dir = fdopendir(dirfd);
        if (!dir) {
            perror("fdopendir");
            close(dirfd);      /* fdopendir failed, so the descriptor is still ours */
            continue;
        }

        printf("-----\nfiles in %s\n-----\n", *argv);

        while ((d_ent = readdir(dir))) {
            /* look the name up relative to the open directory, not the cwd */
            rc = fstatat(dirfd, d_ent-&gt;d_name, &amp;st, 0);
            if (rc == -1) {
                /* an entry may vanish between readdir and fstatat */
                if (errno != ENOENT) perror("fstatat");
                continue;
            }

            if (S_ISREG(st.st_mode))
                printf("%s\t\t%zu bytes\n", d_ent-&gt;d_name, (size_t)st.st_size);
        }

        closedir(dir);         /* also closes dirfd */
    } while (*++argv);

    return 0;
}
</pre> This takes a list of directory pathnames as arguments and prints the names and sizes of all regular files in each of the directories. It uses <em>fstatat</em> because the names returned by <em>readdir</em> are filenames relative to the directory being read. Thanks to the <em>*at</em> call, they can be accessed without doing dynamic string manipulation and buffer management, and also without changing the cwd of the process back and forth for each directory. <p> Also, <em>chdir</em> is basically unusable in multi-threaded processes as it changes the working directory of the process, i.e., it affects all threads, not just the one executing it <strong>and</strong>, as seen by another thread, the cwd change is an unpredictable, asynchronously occurring event. E.g., a thread desiring to create two files in the same directory might end up creating them in different directories. <p> Lastly, the directory a process was started in might have been picked intentionally, e.g., as the location where core dumps should go, and the process shouldn't change it unless there's a very good reason for that (and this should be documented). 
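To make the "no string manipulation" point concrete, here is a minimal sketch of the same size lookup done both ways. The helper names <em>size_by_pathname</em> and <em>size_by_dirfd</em> are hypothetical and not part of the program above; the fixed PATH_MAX buffer is likewise only an assumption for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: size of `name` inside `dir`, found by joining the
 * two strings into a pathname first. Returns -1 on any error, including
 * the case where the joined pathname does not fit into the buffer. */
long long size_by_pathname(const char *dir, const char *name)
{
    char path[PATH_MAX];
    struct stat st;
    int n;

    assert(dir != NULL && name != NULL);

    n = snprintf(path, sizeof path, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;                      /* truncated: the caller has to cope */
    if (stat(path, &st) == -1)
        return -1;
    return (long long)st.st_size;
}

/* Hypothetical helper: the same lookup relative to an already open
 * directory descriptor. No buffer, no joining, no length limit to check. */
long long size_by_dirfd(int dirfd, const char *name)
{
    struct stat st;

    assert(name != NULL);

    if (fstatat(dirfd, name, &st, 0) == -1)
        return -1;
    return (long long)st.st_size;
}
```

Both helpers return the same answer for an ordinary entry. The difference is that the first one needs a buffer and a truncation check, and it resolves the directory name again on every call, while the second one works from the directory that is already open.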
Sun, 07 Oct 2018 17:22:24 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767773/ https://lwn.net/Articles/767773/ cyphar <div class="FormattedComment"> Something like resolveat(2)? The problem is that this would necessarily be conceptually identical to openat(O_PATH). Maybe O_PATH should've been a different syscall but we are mostly stuck with it now, and I think it would be strange to have two methods of opening an O_PATH descriptor. Though, there are some aspects of O_PATH that I think need to be fixed (and would require more convoluted O_ flags -- maybe a new syscall is warranted to fix some of the semantics of O_PATH. I'm not sure.)<br> <p> And remember that the widespread utility of any resolveat(2) syscall would likely require having AT_EMPTY_PATH support for every *at(2) syscall (which is unfortunately far from the case currently).<br> </div> Sun, 07 Oct 2018 04:36:36 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767769/ https://lwn.net/Articles/767769/ judas_iscariote <div class="FormattedComment"> It is quite unfortunate that kernel developers insist on extending openat() with more and more contrived semantics, I wish they just added new syscalls with well defined behaviour.<br> </div> Sun, 07 Oct 2018 00:34:15 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767745/ https://lwn.net/Articles/767745/ wahern <div class="FormattedComment"> Shouldn't it be possible to quiesce the runtime (pause GC, park all other goroutines, and join all kernel threads)? All the machinery in the scheduler must already be there, more or less. 
Maybe some component is currently running in a dedicated thread in an infinite loop, but conceptually it could be refactored to be able to enter and exit its core loop.<br> <p> It might not be particularly efficient and come with a ton of gotchas, but it would at least make some currently impossible things possible, such as using geteuid and forking helper processes. Those things tend to happen early on, anyhow, so performance and other limitations wouldn't matter much.<br> <p> </div> Sat, 06 Oct 2018 01:37:28 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767741/ https://lwn.net/Articles/767741/ nix <div class="FormattedComment"> Generally I do the same thing when debugging eBPF that I do when debugging other programs: printf()! In the case of eBPF you throw in a helper that does a printk() and chuck in calls to the helper liberally. (This is not so useful if you can't modify the eBPF, mind you.)<br> <p> </div> Fri, 05 Oct 2018 22:24:27 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767719/ https://lwn.net/Articles/767719/ Cyberax <div class="FormattedComment"> It's way worse than assembly. With assembly you can typically use debuggers to trace the execution and inspect the environment. Nothing comparable exists for eBPF.<br> </div> Fri, 05 Oct 2018 17:14:36 +0000 Places to block filesystem traversal https://lwn.net/Articles/767690/ https://lwn.net/Articles/767690/ smurf <div class="FormattedComment"> Also, userspace sanitization depends on the fact that no second thread exists that modifies the sanitized path before it's passed to the kernel. 
In-kernel defenses against that sort of thing at least work.<br> </div> Fri, 05 Oct 2018 14:24:40 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767669/ https://lwn.net/Articles/767669/ nix <div class="FormattedComment"> eBPF is a nice thing to have if machine-generated (it's a rather nice and orthogonal assembler, and the ability to add helpers is just a killer feature that I wish real assemblers had!), but it's about as pleasant to debug programs written in it as any other assembler: i.e. fairly easy if you're familiar with the code generator, a nightmare otherwise, doubly so if this is the less regular land of handwritten code, disassembled and devoid of comments.<br> </div> Fri, 05 Oct 2018 12:10:48 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767668/ https://lwn.net/Articles/767668/ nix <div class="FormattedComment"> Others have commented on the problems with chdir(). The problem with using long absolute pathnames is twofold. Firstly, you race with people modifying symlinks and/or renaming directories out from underneath you (*at() can at least reduce this by nailing the walk to specific directory inodes). Secondly, the length of pathnames is capped at pathconf(..., _PC_PATH_MAX), but you can make directory trees of arbitrary depth, with absolute paths much longer than this and indeed longer than the hardware page size. 
Nobody does this manually, but it can and does happen with machine-generated hierarchies, and the deep parts of such hierarchies are *only* traversable via chdir() or the *at() syscalls: while you can compose an absolute path that should reach those parts, the kernel will reject it with -ENAMETOOLONG.<br> <p> So generic code has no choice but to use chdir() or *at() to traverse hierarchies or fail on such deep hierarchies, and generic multithreaded code or library code which might be run in multithreaded contexts has no choice but to use *at().<br> </div> Fri, 05 Oct 2018 12:08:38 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767664/ https://lwn.net/Articles/767664/ pbonzini <div class="FormattedComment"> For one, chdir affects the entire process rather than the current thread only.<br> </div> Fri, 05 Oct 2018 10:47:02 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767660/ https://lwn.net/Articles/767660/ Cyberax <div class="FormattedComment"> I hope so. I've just spent a day debugging an eBPF filter written by somebody else and it's NOT a nice experience at all.<br> <p> Debugging infrastructure is sorely lacking for it.<br> </div> Fri, 05 Oct 2018 07:34:41 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767659/ https://lwn.net/Articles/767659/ flewellyn <div class="FormattedComment"> I believe neilbrown was joking. 
I have no evidence for this, but I am desperately choosing to believe it anyway.<br> </div> Fri, 05 Oct 2018 07:31:40 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767656/ https://lwn.net/Articles/767656/ kostix <div class="FormattedComment"> That wouldn't have helped anyway: the problem with not being able to do the classic fork+exec in Go programs is that the code executing in each of them heavily relies on the live Go runtime (which is linked with/into any compiled Go executable and actually manages the whole lifecycle of the program), and that runtime uses multiple OS threads, both to run the program's goroutines and to do its own chores.<br> <p> Since fork() clones the state of just a single thread (the one that happened to execute that syscall), as soon as control resumes in the child process, there is literally no Go runtime around the goroutine "awoken" in the cloned thread anymore, and as soon as it happens to call anything which would normally reach for the runtime, it is hosed. And normally such a call would happen pretty soon.<br> <p> So basically the only sensible thing one might safely do after forking a process running a Go program is to do a controlled set of preparations and exec().<br> And actually that's what syscall.ForkExec does, with some added complexity stemming from Go having an execution model other than C's ;-)<br> <p> You can look at ForkExec in <a href="https://golang.org/src/syscall/exec_unix.go">https://golang.org/src/syscall/exec_unix.go</a> and then at forkAndExecInChild in <a href="https://golang.org/src/syscall/exec_linux.go">https://golang.org/src/syscall/exec_linux.go</a>; the code is very easy to follow for any programmer with a C background, and it is extensively commented.<br> </div> Fri, 05 Oct 2018 07:13:16 +0000 Places to block filesystem traversal https://lwn.net/Articles/767658/ https://lwn.net/Articles/767658/ epa <div class="FormattedComment"> It’s not just containers. 
Path-traversal bugs are a common exploit in archivers like tar or unzip, where unpacking a malicious archive file overwrites things elsewhere in the filesystem. I imagine web servers might also use this flag as an additional defence to make sure they only serve content from the right directory. If the flag existed on all operating systems, a lot of userspace path sanitizing code could be removed. <br> </div> Fri, 05 Oct 2018 07:11:12 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767655/ https://lwn.net/Articles/767655/ Cyberax <div class="FormattedComment"> You can't, not in a race-free way anyway.<br> </div> Fri, 05 Oct 2018 04:33:52 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767654/ https://lwn.net/Articles/767654/ eru openat() is one of those Linux system calls whose rationale I don't quite understand. It allows opening files relative to a particular directory, but can't you do the same thing by manipulating the path name, or by using chdir() first? Fri, 05 Oct 2018 04:13:36 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767648/ https://lwn.net/Articles/767648/ luto <div class="FormattedComment"> It would be “simple” in the sense that getting the eBPF right would be at least as difficult as getting the kernel code with the AT flags right would be. But with eBPF, no one would ever review it carefully or fix the bugs.<br> <p> eBPF is flexible, but it’s not magic.<br> </div> Thu, 04 Oct 2018 23:55:34 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767646/ https://lwn.net/Articles/767646/ Cyberax <div class="FormattedComment"> No......<br> <p> Please, no more eBPF. 
It never ever works outside of kernel developers' machines.<br> </div> Thu, 04 Oct 2018 23:03:49 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767645/ https://lwn.net/Articles/767645/ neilbrown <div class="FormattedComment"> Surely this could be vastly simplified by allowing an eBPF program to be attached to a file descriptor so that when a path_lookup starts from that file descriptor, the eBPF program is used to vet or modify the lookup of each component.<br> <p> </div> Thu, 04 Oct 2018 22:52:54 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767639/ https://lwn.net/Articles/767639/ Cyberax <div class="FormattedComment"> You can pin a goroutine to a thread using LockOSThread, but it basically locks this thread out of running other goroutines.<br> <p> (Personally, I'd like for them to add goroutine IDs)<br> </div> Thu, 04 Oct 2018 21:45:20 +0000 New AT_ flags for restricting pathname lookup https://lwn.net/Articles/767637/ https://lwn.net/Articles/767637/ wahern <div class="FormattedComment"> I don't understand why the Go team is so resistant to adding the ability to explicitly pin a goroutine to a machine thread. Goroutines are an amazing, almost ideal construct. But there's a very obvious and unresolvable impedance mismatch between how goroutines implement threading (linear flow of logical execution) and how traditional operating systems do. A similar mismatch exists with FFI ABIs (i.e. stack details) and with the blocking semantics of some syscalls. In those cases a goroutine *is* pinned to a machine thread; indeed, the very architecture of the Go runtime (the [G]oroutine, OS [M]achine thread, and [P]rocessor scheduling abstractions) is built around this mismatch. It's inexplicable to me why they refuse to expose the scheduling levers that must necessarily exist.<br> <p> </div> Thu, 04 Oct 2018 21:36:42 +0000