Two new ways to read a file quickly

Posted Mar 6, 2020 20:46 UTC (Fri) by axboe (subscriber, #904)
In reply to: Two new ways to read a file quickly by mezcalero
Parent article: Two new ways to read a file quickly

I don't generally disagree with you. 1) The concept isn't finalized at all, and 2) as mentioned at the end, having the kernel manage the allocations was indeed one of the suggestions.

Two new ways to read a file quickly

Posted Mar 6, 2020 21:25 UTC (Fri) by willy (subscriber, #9762) [Link] (10 responses)

I'd like to discuss this on linux-fsdevel, but in lieu of that ...

I think the right way to do this is to have userspace open /dev/null as often as it needs to in order to create the fds it will need. Then use those fd #s in the io_uring calls.
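A minimal sketch of that reservation idea, in Python for brevity. The helper names reserve_fds and install_at are hypothetical, and the comment above does not say how a real file would later land on a reserved number; dup2 is used here only as a stand-in for that step, not as part of any io_uring interface.

```python
import os

def reserve_fds(count):
    """Reserve `count` descriptor numbers by opening /dev/null that many times."""
    return [os.open("/dev/null", os.O_RDONLY) for _ in range(count)]

def install_at(path, reserved_fd):
    """Open `path` and move it onto `reserved_fd`, replacing the placeholder.

    Assumption: dup2 stands in for whatever mechanism would really be used.
    """
    tmp = os.open(path, os.O_RDONLY)
    try:
        os.dup2(tmp, reserved_fd)  # atomically replaces the /dev/null placeholder
    finally:
        os.close(tmp)              # the temporary number is no longer needed
    return reserved_fd
```

Each placeholder here is a real open file, so every reserved number costs a kernel file object until something replaces it.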

Two new ways to read a file quickly

Posted Mar 6, 2020 23:58 UTC (Fri) by jlayton (subscriber, #31672) [Link] (6 responses)

I've been back and forth on it, but I think leaving fd allocation to the kernel is ultimately the right thing to do. Very few applications actually care _what_ fd they end up getting, and trying to do anything else is going to make it hard to eliminate competing users. This functionality will almost certainly end up in certain libraries after all, so you would need a standard allocator of some sort.

Mainly, io_uring needs to be able to specify that a subsequent read (or write, or whatever) use the fd from an open done earlier in the same chain. I think just being able to express "use the fd from last open" would be super useful for about 90% of use cases, and you could always layer on another way to deal with multiple fds later.

Two new ways to read a file quickly

Posted Mar 7, 2020 6:33 UTC (Sat) by ncm (guest, #165) [Link] (5 responses)

In multi-threaded programs (which do exist), the concept of a "last-opened file descriptor" is entirely meaningless. How can anyone think this would be a good idea?

Two new ways to read a file quickly

Posted Mar 7, 2020 10:39 UTC (Sat) by intgr (subscriber, #39733) [Link] (4 responses)

But in the context of a single io_uring command queue, "last-opened file descriptor" seems perfectly well defined.

Two new ways to read a file quickly

Posted Mar 7, 2020 17:33 UTC (Sat) by justincormack (subscriber, #70439) [Link] (3 responses)

Not if you open lots of files asynchronously. You need to identify which open you meant.

Two new ways to read a file quickly

Posted Mar 7, 2020 18:01 UTC (Sat) by nivedita76 (subscriber, #121790) [Link] (2 responses)

A linked sequence of io_uring operations provides that. If you need to operate on lots of files, you have lots of linked sequences, each of which opens one file and operates on it.
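As a rough illustration, here is a toy Python model of that argument. It is not io_uring code: run_chain and the ("open", ...) / ("read", ...) / ("close", ...) tuples are made-up names, and the operations run synchronously. The only point is that "the fd from the last open" is unambiguous when it is scoped to one chain.

```python
import os

def run_chain(ops):
    """Run one linked sequence of operations in order.

    `last_fd` is local to this chain, so other chains cannot disturb it.
    An error stops the rest of the chain, loosely mirroring a broken link.
    """
    last_fd = None
    results = []
    try:
        for op, arg in ops:
            if op == "open":
                last_fd = os.open(arg, os.O_RDONLY)
            elif op == "read":
                results.append(os.read(last_fd, arg))  # uses this chain's fd
            elif op == "close":
                os.close(last_fd)
                last_fd = None
            else:
                raise ValueError(f"unknown op: {op}")
    finally:
        if last_fd is not None:
            os.close(last_fd)  # do not leak the fd if the chain stopped early
    return results
```

Operating on many files then means many independent calls such as run_chain([("open", "/dev/zero"), ("read", 4), ("close", None)]), one per file.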

Two new ways to read a file quickly

Posted Feb 20, 2024 13:40 UTC (Tue) by sammythesnake (guest, #17693) [Link] (1 responses)

Sorry for the thread necromancy, but...

Isn't operating on thousands of files in one io_uring linked sequence exactly the kind of thing some applications would like to do, reducing thousands of syscalls to a couple...?

Would some kind of per-io_uring-linked-sequence "pseudo-FD" make sense? In/alongside open operations, you could provide a number (1, 2, 3...) for each file opened in the sequence that the kernel transparently maps to "real" FDs internally. Later operations could then identify which of the files opened within the sequence should be acted on (e.g. "read the file *this sequence* calls 1", maybe with negative FD numbers...?).

The pFD *could* be sequentially allocated so subsequent calls would simply say "the third one opened" but keeping those straight while editing the sequence would be error-prone, so that's probably not a win over finding a way to nominate a pFD.

Obviously, there are details to sort out like managing the pFD->FD mappings, and getting the API right, but none of that sounds nastier than the other things suggested in this thread (to me, at least - I'm merely a curious bystander!)

This is presumably a very naive question, but can't an io_uring open() operation save the FD it returns in a look-up table to be referenced by later operations? That would seem the "obvious" way to me, but I assume this isn't possible, or this whole thread would be moot...
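The pseudo-FD idea can be sketched as a toy model in Python. Everything here is hypothetical (run_sequence, the tuple layout, the slot numbers); it only shows a per-sequence table mapping caller-chosen numbers to real descriptors, and says nothing about how the kernel would or does implement such a thing.

```python
import os

def run_sequence(ops):
    """Run a sequence whose operations name files by caller-chosen pseudo-FDs.

    ops are tuples: ("open", pfd, path), ("read", pfd, nbytes), ("close", pfd).
    """
    table = {}    # pseudo-FD -> real fd, private to this sequence
    results = {}  # pseudo-FD -> list of byte strings read
    try:
        for op in ops:
            kind, pfd = op[0], op[1]
            if kind == "open":
                if pfd in table:
                    raise ValueError(f"pseudo-FD {pfd} already in use")
                table[pfd] = os.open(op[2], os.O_RDONLY)
            elif kind == "read":
                results.setdefault(pfd, []).append(os.read(table[pfd], op[2]))
            elif kind == "close":
                os.close(table.pop(pfd))
            else:
                raise ValueError(f"unknown op: {kind}")
    finally:
        for fd in table.values():
            os.close(fd)  # release anything the sequence left open
    return results
```

Because the caller picks the numbers, editing the sequence does not renumber later operations, which is the advantage over "the third one opened".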

Two new ways to read a file quickly

Posted Feb 20, 2024 16:32 UTC (Tue) by kleptog (subscriber, #1183) [Link]

I think you have described the descriptorless FDs: https://lwn.net/Articles/863071/

Two new ways to read a file quickly

Posted Mar 7, 2020 13:34 UTC (Sat) by josh (subscriber, #17465) [Link] (2 responses)

I don't think that opening /dev/null is the right approach here, for multiple reasons:

1) The block of reserved fds shouldn't actually consist of open file descriptors that take up resources, especially if we may want to have a block of reserved fds per thread.
2) If the only thing keeping an fd reserved is that it has an open file on it, then once that fd is opened and subsequently closed, it stops being reserved. The fd should stay reserved after being closed.
3) O_SPECIFIC_FD specifically doesn't allow opening "over" an existing open file descriptor the way dup2 does; it'll return -EBUSY. I felt that would be less error-prone, and would help catch races.
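To make point 3 concrete, here is a userspace emulation in Python of the behaviour described above: opening onto a chosen number fails with EBUSY if that number is already open, where dup2 would silently replace it. The names fd_is_open and open_specific_fd are hypothetical, and this is only a sketch: the check-then-open sequence is racy in a threaded program, which a kernel-side flag would presumably avoid.

```python
import errno
import os

def fd_is_open(fd):
    """Return True if `fd` currently refers to an open file."""
    try:
        os.fstat(fd)
    except OSError as e:
        if e.errno == errno.EBADF:
            return False
        raise
    return True

def open_specific_fd(path, flags, target_fd):
    """Open `path` on `target_fd`, refusing to replace an open descriptor."""
    if fd_is_open(target_fd):
        # Unlike dup2, do not overwrite: report the collision instead.
        raise OSError(errno.EBUSY, os.strerror(errno.EBUSY))
    tmp = os.open(path, flags)
    if tmp == target_fd:
        return tmp  # the kernel happened to hand out the wanted number
    try:
        os.dup2(tmp, target_fd)
    finally:
        os.close(tmp)
    return target_fd
```

Point 2 is visible with plain calls as well: once a /dev/null placeholder is closed, its number is simply the lowest free descriptor again, so an unrelated open() can take it.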

Two new ways to read a file quickly

Posted Mar 8, 2020 15:08 UTC (Sun) by pbonzini (subscriber, #60935) [Link] (1 responses)

Regarding 3, I think that's a mistake: O_SPECIFIC_FD to open and dup2 to close seems like a very easy way to manage pre-reserved file descriptors, even without the prctl.

Two new ways to read a file quickly

Posted Mar 8, 2020 15:14 UTC (Sun) by josh (subscriber, #17465) [Link]

Perhaps we could have a specialized "reserved" fd type, and use that instead of opening /dev/null, and only allow "overwriting" that. It would help to have a guaranteed contiguous chunk of fds to allocate out of, though.