
Protocol ossification

Posted Jan 24, 2025 20:45 UTC (Fri) by NYKevin (subscriber, #129325)
In reply to: Protocol ossification by hmh
Parent article: The trouble with the new uretprobes

IMHO this is not unreasonable, but it would be helpful to make an explicit category for syscalls of this nature, with documentation and possibly new seccomp flags for dealing with them.

Actually, I think it would probably be helpful to have multiple categories for syscalls of different types based on what they can do and whether they can affect things outside of the process. A seccomp BPF filter could, hypothetically, receive a bitmask indicating the properties of a given syscall, such as:

* Whether it can read/modify state, and separate flags for the kind of state it reads or modifies. dup2 would be flagged differently from write, because dup2 modifies the process's file descriptor table, while write can modify the filesystem or do IPC (pipes etc.). (No, procfs is not "the filesystem" for the purposes of this discussion.)
* Whether it is considered part of the kernel's userspace API. Yes for most syscalls, no for uretprobe. Denying syscalls where this flag is not set would be considered poor practice, and might result in compatibility issues on the next kernel release, but you can still do it if you really want to.
* Whether it requires at least one capability to call with the specified arguments, for any reason other than filesystem permissions (because checking those would require looking them up, which slows things down a ton, and is also inherently racy).
* Whether it would require a filesystem permission check, if executed by a non-privileged process (i.e. the flag is still set if you are root). Does not indicate what the result of that permission check would have been, only that it would have been done.

The BPF filter could then use those flags to make an informed choice about unrecognized syscalls, and you could even pass some kind of mask to seccomp/prctl to indicate which syscalls you want to filter in the first place.
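To make that a bit more concrete, here is a rough sketch in plain C (not actual BPF) of what such a policy might look like. Everything in it is made up for illustration: the SC_PROP_* bits, the sc_filter_wanted() interest-mask check, and the sc_decide_unknown() policy are all hypothetical names, and nothing like them exists in the kernel's seccomp interface today. The particular policy is only one of many a filter author could choose.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical property bits. None of these exist in any kernel today. */
#define SC_PROP_MODIFIES_PROCESS  (UINT32_C(1) << 0) /* e.g. dup2: fd table */
#define SC_PROP_MODIFIES_EXTERNAL (UINT32_C(1) << 1) /* e.g. write: files, IPC */
#define SC_PROP_USERSPACE_API     (UINT32_C(1) << 2) /* clear for uretprobe */
#define SC_PROP_NEEDS_CAPABILITY  (UINT32_C(1) << 3) /* beyond fs permissions */
#define SC_PROP_FS_PERM_CHECK     (UINT32_C(1) << 4) /* check would be done */

#define SC_PROP_KNOWN (SC_PROP_MODIFIES_PROCESS | SC_PROP_MODIFIES_EXTERNAL | \
                       SC_PROP_USERSPACE_API | SC_PROP_NEEDS_CAPABILITY |     \
                       SC_PROP_FS_PERM_CHECK)

enum sc_action { SC_ALLOW, SC_DENY };

/* Hypothetical interest mask: the filter only runs for syscalls that
 * have at least one of the requested property bits set. */
bool sc_filter_wanted(uint32_t interest, uint32_t props)
{
    return (props & interest) != 0;
}

/* One possible policy for a syscall number the filter does not recognize. */
enum sc_action sc_decide_unknown(uint32_t props)
{
    /* Property bits this filter was not written for: fail closed. */
    if (props & ~SC_PROP_KNOWN)
        return SC_DENY;

    /* Not part of the userspace API: denying it risks breaking on the
     * next kernel release, so let it through. */
    if (!(props & SC_PROP_USERSPACE_API))
        return SC_ALLOW;

    /* Anything privileged or reaching outside the process is denied. */
    if (props & (SC_PROP_NEEDS_CAPABILITY | SC_PROP_MODIFIES_EXTERNAL |
                 SC_PROP_FS_PERM_CHECK))
        return SC_DENY;

    /* Process-local effects only (dup2-like): allow. */
    return SC_ALLOW;
}
```

With this policy, an unrecognized syscall flagged only as SC_PROP_USERSPACE_API | SC_PROP_MODIFIES_PROCESS would be allowed, while one flagged SC_PROP_USERSPACE_API | SC_PROP_MODIFIES_EXTERNAL would be denied, without the filter needing to know the syscall number at all.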

I don't know how backwards compatible this is, or whether there is the will to implement something this complicated. But it would be nice to have.

