
The trouble with symbolic links


Posted Jul 7, 2022 16:23 UTC (Thu) by nix (subscriber, #2304)
In reply to: The trouble with symbolic links by iabervon
Parent article: The trouble with symbolic links

That requires all userspace programs not only to follow symlinks but to do so recursively (since any component of a path may be a symlink), and to do so in a race-free fashion. They also all need to spot loops.
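To make the amount of work concrete, here is a minimal sketch of what each program (or libc) would have to do, assuming a plain hop limit is an acceptable way to spot loops. The `resolve` name and the `MAX_HOPS` value of 40 are illustrative assumptions, not anything from the proposal. Note that the sketch is deliberately not race-free: the tree can change between the `islink()` check and the `readlink()` call, which is exactly the hard part.

```python
import errno
import os

MAX_HOPS = 40  # assumed limit; chosen here only for illustration


def resolve(path, max_hops=MAX_HOPS):
    """Resolve every symlink in `path` from userspace (NOT race-free)."""
    if not os.path.isabs(path):
        path = os.path.join(os.getcwd(), path)
    # Components still to be walked, left to right.
    pending = [c for c in path.split(os.sep) if c]
    resolved = os.sep  # symlink-free prefix built so far
    hops = 0
    while pending:
        comp = pending.pop(0)
        if comp == ".":
            continue
        if comp == "..":
            # `resolved` contains no symlinks, so its parent is just dirname.
            resolved = os.path.dirname(resolved)
            continue
        candidate = os.path.join(resolved, comp)
        if os.path.islink(candidate):
            hops += 1
            if hops > max_hops:
                # Loop (or just a very long chain): give up like ELOOP.
                raise OSError(errno.ELOOP, os.strerror(errno.ELOOP), path)
            target = os.readlink(candidate)
            if os.path.isabs(target):
                resolved = os.sep
            # Any component of the target may itself be a symlink,
            # so splice it in front of what is left and keep walking.
            pending = [c for c in target.split(os.sep) if c] + pending
        else:
            resolved = candidate
    return resolved
```

Even this toy version has to handle relative targets, `..` inside targets, and loops, and it still leaves every window between two calls open to a concurrent rename.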

How many of them do you imagine will get *that* right, given how many bugs and races there have been in, say, rm -rf, which you'd think is heavily tested? Oh look, a new set of security holes, probably worse than the last lot.

-- N., very heavy user of symlinks, thank god for Nix (the distro, no relation) making sure that not too many break when entire /usr trees are symlink farms



The trouble with symbolic links

Posted Jul 7, 2022 17:38 UTC (Thu) by pbonzini (subscriber, #60935)

At least in theory, libc could do all of that. However, it would also be much slower and more wasteful, because the caching that the kernel does would be split across many userspace programs, and the cache would start cold for every new process.

The trouble with symbolic links

Posted Sep 3, 2022 11:56 UTC (Sat) by nix (subscriber, #2304)

In theory libc could implement its own cross-process cache using shared memory, like nscd. In practice, the mere thought is painful. This is definitely not userspace's job :)

The trouble with symbolic links

Posted Jul 7, 2022 19:33 UTC (Thu) by iabervon (subscriber, #722)

I imagine that either all userspace programs will get recursively following symlinks right, or they'll all get it wrong, since I think it would be done in libc and they'd all use the same implementation. For that matter, I think the implementation should probably be a system call that takes a path that may use symlinks and returns a path that doesn't use symlinks, in which case everyone would end up using the correct implementation that the kernel uses.

The only change I can see to the outcomes of path-based operations from having separate resolve and operate syscalls would be that, if you rename a symlink over a regular file, a poorly timed open() could get ELOOP instead of getting either the original file or the target of the symlink. Replacing one symlink with another would be atomic, and no other replacement is atomic today (that is, you can't replace a directory with a symlink atomically, and you can't replace both a symlink and another file atomically; I guess you might get ELOOP where you could previously only have gotten ENOENT).
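For reference, the atomic symlink-over-symlink replacement mentioned above is usually done by creating the new link under a temporary name and renaming it into place. A minimal sketch, assuming a POSIX filesystem; the `replace_symlink` helper and its temporary-name scheme are made up for illustration:

```python
import os


def replace_symlink(link_path, new_target):
    """Atomically point `link_path` at `new_target` via rename()."""
    # Hypothetical temp name; a real tool would pick something less guessable.
    tmp = f"{link_path}.tmp.{os.getpid()}"
    os.symlink(new_target, tmp)
    try:
        # rename() swaps the directory entry in one step, so a concurrent
        # lookup sees either the old link or the new one, never neither.
        os.replace(tmp, link_path)
    except OSError:
        os.unlink(tmp)
        raise
```

Unlinking the old link and then calling symlink() would leave a window in which the name does not exist at all, which is why the rename is the part that matters.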

As for changes to userspace program code, the only one would be that, if you've explicitly called realpath() yourself and validated the result in some way, you'd call realopen() on the string you validated rather than calling open() and getting another round of symlink resolution that might give a different answer.
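There is no realopen() today, so the following is only a rough userspace approximation of the idea, assuming a platform where os.open() accepts dir_fd, O_DIRECTORY and O_NOFOLLOW. It walks an already-resolved absolute path one component at a time and fails if any component has turned into a symlink since validation. The function name is borrowed from the comment above; everything else is an assumption.

```python
import os


def realopen(resolved_path, flags=os.O_RDONLY):
    """Open an absolute, already-resolved path, refusing every symlink."""
    if not os.path.isabs(resolved_path):
        raise ValueError("expected an absolute, symlink-free path")
    parts = [p for p in resolved_path.split(os.sep) if p]
    if not parts:
        raise ValueError("expected a path below the root")
    dirfd = os.open(os.sep, os.O_RDONLY | os.O_DIRECTORY)
    try:
        for comp in parts[:-1]:
            # O_NOFOLLOW makes the open fail if this component is a symlink.
            nxt = os.open(comp,
                          os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW,
                          dir_fd=dirfd)
            os.close(dirfd)
            dirfd = nxt
        # The final component gets the caller's flags, still without following.
        return os.open(parts[-1], flags | os.O_NOFOLLOW, dir_fd=dirfd)
    finally:
        os.close(dirfd)
```

Usage would look like `fd = realopen(os.path.realpath(name))` after whatever validation you do on the resolved string. Doing this with one open per component is slow and clumsy compared with a single syscall, which is rather the point.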

