Out-of-memory victim selection with BPF
Posted Aug 19, 2023 17:05 UTC (Sat) by vadim (subscriber, #35271)
In reply to: Out-of-memory victim selection with BPF by mb
Parent article: Out-of-memory victim selection with BPF
clone() is Linux-specific and thus non-portable, and much more complicated to use than fork(). I'm sure there's stuff that uses it, but I think most things just won't bother without a good reason.
Posted Aug 19, 2023 17:42 UTC (Sat)
by mb (subscriber, #50428)
[Link] (5 responses)
Well, it blocks until the child calls execve(), which is the only thing the child is supposed to do. That takes a microsecond or so (plus two context switches).
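Assuming the call under discussion here is vfork(), which suspends the parent until the child calls execve() or _exit(), the pattern might look like the minimal sketch below. The helper name run_with_vfork is hypothetical, and the sketch assumes argv[0] is an absolute path.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run argv[0] and return its exit status,
 * or -1 if the child could not be created or did not exit normally. */
int run_with_vfork(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = vfork();    /* parent is suspended until exec or _exit */
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* The child borrows the parent's memory, so it does nothing
         * here except exec (or _exit if the exec fails). */
        execve(argv[0], argv, environ);
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with { "/bin/sh", "-c", "exit 3", NULL } should hand the shell's exit status back to the caller.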
Posted Aug 19, 2023 20:32 UTC (Sat)
by kleptog (subscriber, #1183)
[Link] (4 responses)
Well, processes almost always do other things, like close FDs, set up pipes, change permissions, configure signals, configure network/pid/ipc namespaces, etc. The fact that you need to actually do things between the fork() and execve() is why stuff like posix_spawn() never goes anywhere. There's an awful lot of state that gets inherited and you need to be able to manipulate all of it before starting the new process.
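To make the "things between fork() and execve()" concrete, here is a minimal sketch that covers only a few of the items listed (FDs and signals; no permissions or namespaces). The helper name run_with_stdout_to_file and its parameters are hypothetical, chosen for illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run argv[0] with stdout redirected to out_path.
 * Returns the child's exit status, or -1 on failure. */
int run_with_stdout_to_file(char *const argv[], const char *out_path)
{
    assert(argv != NULL && argv[0] != NULL && out_path != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Between fork() and execve(): adjust inherited kernel state. */
        int fd = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(126);
        if (dup2(fd, STDOUT_FILENO) < 0)    /* set up the redirection */
            _exit(126);
        if (fd != STDOUT_FILENO)
            close(fd);                      /* drop the extra FD */

        signal(SIGPIPE, SIG_DFL);           /* reset a signal disposition */

        sigset_t none;
        sigemptyset(&none);
        sigprocmask(SIG_SETMASK, &none, NULL);  /* unblock all signals */

        execve(argv[0], argv, environ);
        _exit(127);                         /* reached only if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Every step in the child is an ordinary syscall wrapper, which is what makes this pattern so flexible and so hard to replace with a fixed-function spawn call.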
Ideally you'd like a way to create a new (empty) process and be able to manipulate its execution state using the standard syscalls without actually forking and then at the last moment kick it off with the new ELF image directly. Probably some smart cookie has designed such an interface, but I don't see it taking off any time soon.
Maybe an execve() with an io_uring-like list of syscalls to execute in the new process? Or via BPF?
Posted Aug 19, 2023 21:03 UTC (Sat)
by izbyshev (guest, #107996)
[Link]
Technically, all these things are the kernel state, not the libc state, so they can be configured after vfork() via direct syscalls without ever touching libc. But this is rarely a good option for a typical application because there are some footguns with direct syscall usage, as well as with vfork() itself.
> Maybe an execve() with an io_uring-like list of syscalls to execute in the new process? Or via BPF?
This has been discussed: https://lwn.net/Articles/908268. No news after that, I'm afraid.
Posted Aug 20, 2023 3:01 UTC (Sun)
by Cyberax (✭ supporter ✭, #52523)
[Link] (2 responses)
For 99% of cases none of that is needed.
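For that simple case, where the child inherits everything unchanged, a plain posix_spawn() call is probably enough. A minimal sketch, with the hypothetical helper name run_with_posix_spawn and assuming argv[0] is an absolute path:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: spawn argv[0] with no file actions or attributes
 * and return its exit status, or -1 on failure. */
int run_with_posix_spawn(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid;
    /* NULL file actions and NULL attributes: inherit state as-is. */
    if (posix_spawn(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```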
> Maybe an execve() with an io_uring-like list of syscalls to execute in the new process? Or via BPF?
Ideally? Create a process in a suspended state, returning its file descriptor, then poke at it with process management functions that accept FDs, and finally let it continue.
Posted Aug 21, 2023 16:33 UTC (Mon)
by ibukanov (subscriber, #3942)
[Link] (1 responses)
Posted Aug 21, 2023 16:48 UTC (Mon)
by mathstuf (subscriber, #69389)
[Link]