Not need for new syscall

Posted May 30, 2019 21:30 UTC (Thu) by scientes (guest, #83068)
Parent article: A ring buffer for epoll

You just add a flag, and with that flag there is a second syscall argument. Look at futex() and the crazy variable number of arguments. glibc then magically calls it epoll_create2, or whatever. But no need for a new syscall, just a new flag.
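A rough userspace sketch of that pattern, in the spirit of how open() only reads its mode argument when O_CREAT is set. Everything here is hypothetical: demo_epoll_create(), the DEMO_* flag bits, and the ring-size argument are invented for illustration and are not a real kernel or glibc interface.

```c
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical flag bits, invented for this sketch; not real kernel ABI. */
#define DEMO_CLOEXEC   0x1u
#define DEMO_USE_RING  0x2u
#define DEMO_KNOWN     (DEMO_CLOEXEC | DEMO_USE_RING)

/*
 * Hypothetical wrapper: the second argument is only read when
 * DEMO_USE_RING is set. Returns the ring size in use (0 when no ring
 * was requested), or -1 with errno set to EINVAL.
 */
long demo_epoll_create(unsigned int flags, ...)
{
    size_t ring_size = 0;

    /* Reject unknown bits so that new flags can be added safely later. */
    if (flags & ~DEMO_KNOWN) {
        errno = EINVAL;
        return -1;
    }

    if (flags & DEMO_USE_RING) {
        va_list ap;

        va_start(ap, flags);
        ring_size = va_arg(ap, size_t);   /* the flag-gated extra argument */
        va_end(ap);

        if (ring_size == 0) {
            errno = EINVAL;
            return -1;
        }
    }

    return (long)ring_size;
}
```

Old callers keep calling demo_epoll_create(DEMO_CLOEXEC) unchanged, while new callers pass demo_epoll_create(DEMO_USE_RING, (size_t)64).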

Not need for new syscall

Posted May 30, 2019 21:36 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (8 responses)

I don't quite get why people are so opposed to new syscalls.

Not need for new syscall

Posted May 30, 2019 21:50 UTC (Thu) by scientes (guest, #83068) [Link]

I just checked, epoll_create1() checks for unknown flags, so there totally is no need for a new syscall.
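For anyone who wants to see this from userspace, here is a small sketch. The helper name epoll_create1_rejects() is made up for this example; it assumes a Linux system where epoll_create1() is available, and it relies on bit 0x1 not being EPOLL_CLOEXEC on any architecture.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Returns 1 if the kernel rejects the given flags with EINVAL,
 * 0 if it accepts them, and -1 on any other error.
 */
int epoll_create1_rejects(int flags)
{
    int fd = epoll_create1(flags);

    if (fd >= 0) {
        close(fd);          /* flags were accepted; do not leak the fd */
        return 0;
    }
    return errno == EINVAL ? 1 : -1;
}
```

Because unknown bits fail with EINVAL today, a program can later probe for a new flag the same way and fall back when the kernel does not know it.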

Not need for new syscall

Posted May 30, 2019 22:53 UTC (Thu) by cyphar (subscriber, #110703) [Link] (3 responses)

We are running out of syscall space. 5.3 will probably have 434 common syscalls on all architectures, and there are apparently cache-related performance impacts once you pass 512 (on x86 at least). This doesn't mean we should always avoid new syscalls, but rather we should be careful when we add them. If the only user-facing purpose of a new syscall is to add a struct argument then we should look at doing it that way.

Not need for new syscall

Posted May 31, 2019 11:28 UTC (Fri) by epa (subscriber, #39769) [Link] (1 responses)

It seems there are two different issues here. One is the ABI used to call into the kernel on different architectures. That may support a fixed number of 'system call numbers' or have performance reasons to keep it down. The other is the API provided to the C library and by the C library to applications so they can call the familiar named functions like open(2) or kill(2). You could have an operating system running on i386 that used only a single syscall number when calling into the kernel, but still provided the usual POSIX system call names. Is there a reason Linux can't add new "system calls" indefinitely in this way?

Not need for new syscall

Posted May 31, 2019 12:47 UTC (Fri) by smurf (subscriber, #17840) [Link]

Multiplex syscalls are generally frowned upon these days. Indirection eats another register for the "real" syscall number, tracing and syscall filtering get more complicated, … Besides, yes the syscall table would be full after adding the 512th entry, but extending it to 1024 is not exactly rocket science.

Adding a generator for these tables, in order to use a central point of syscall registry instead of the current arch hodgepodge, is certainly possible. Just do it …

Not need for new syscall

Posted Jun 1, 2019 15:24 UTC (Sat) by luto (guest, #39314) [Link]

What’s the issue on x86? As far as I know, the only real issue is running into the silly x32 aliases, but we can easily fix that.

Not need for new syscall

Posted May 31, 2019 6:56 UTC (Fri) by koenkooi (subscriber, #71861) [Link] (2 responses)

My issue with new syscalls is that they usually get added and enabled for a single platform, x86_64, and only added to more platforms months or years after that. This happened with the original epoll and accept4. The issue manifested itself as a 180 second delay during boot due to accept4:

* sys_accept4() was added in 2.6.28
* sys_accept4() was added for ARM in 2.6.36
* (e)glibc built against 2.6.32 headers on an ARM board running 2.6.32

With help from the systemd folks I tracked it down to accept4 missing, so I applied http://lists.infradead.org/pipermail/linux-arm-kernel/201... to the 2.6.32 kernel. Still a 3 minute delay. That's when I realized I needed to build eglibc against the patched 2.6.32 headers as well as patching the kernel. Running a kernel with the new syscall hooked up is not enough!

So every time a new syscall gets proposed that is desired by the base layers in the OS I keep an eye on the ARM syscall list to avoid surprises. Marcin keeps this table up to date: https://fedora.juszkiewicz.com.pl/syscalls.html

System calls and architectures

Posted May 31, 2019 13:50 UTC (Fri) by corbet (editor, #1) [Link] (1 responses)

That's not really a problem with new system calls; it's about how they are implemented in the kernel. The good news is that this situation has gotten a lot better and continues to improve. A lot of the system-call boilerplate is being unified across architectures, and it's increasingly expected that new system calls will be enabled for most or all architectures from the outset.

System calls and architectures

Posted May 31, 2019 14:27 UTC (Fri) by cyphar (subscriber, #110703) [Link]

And new (>403) syscalls now use the same number on all architectures, so in principle there should be no need to rebuild libraries to get a __NR_foobar definition on a given architecture -- libraries should be able to simply do a -ENOSYS check at runtime with a non-arch-specific __NR_foobar value.
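A minimal sketch of such a runtime check, using the generic syscall() wrapper. The helper name syscall_is_wired_up() is invented here. Note the big caveat: this version passes no arguments, so it is only safe for syscalls that take none (or for numbers that are not assigned at all). A real probe would pass deliberately invalid arguments, such as a file descriptor of -1, and distinguish ENOSYS from errors like EBADF. Seccomp filters can also return something other than ENOSYS.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE         /* for the syscall() declaration */
#endif
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Returns 0 if the kernel reports ENOSYS for this syscall number,
 * 1 otherwise. Only call this with numbers whose syscall takes no
 * arguments, since none are passed here.
 */
int syscall_is_wired_up(long nr)
{
    long ret;

    errno = 0;
    ret = syscall(nr);
    return !(ret == -1 && errno == ENOSYS);
}
```

A library would run a check like this once, cache the answer, and take its fallback path when the result is 0.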