Two new ways to read a file quickly

Posted Mar 6, 2020 23:58 UTC (Fri) by jlayton (subscriber, #31672)
In reply to: Two new ways to read a file quickly by willy
Parent article: Two new ways to read a file quickly

I've been back and forth on it, but I think leaving fd allocation to the kernel is ultimately the right thing to do. Very few applications actually care _what_ fd they end up getting, and trying to do anything else is going to make it hard to eliminate competing users. This functionality will almost certainly end up in certain libraries after all so you would need a standard allocator of some sort.

Mainly, io_uring needs to be able to specify that a subsequent read (or write, or whatever) use the fd from an open done earlier in the same chain. I think just being able to express "use the fd from last open" would be super useful for about 90% of use cases, and you could always layer on another way to deal with multiple fds later.

Two new ways to read a file quickly

Posted Mar 7, 2020 6:33 UTC (Sat) by ncm (guest, #165) [Link] (5 responses)

In multi-threaded programs (which do exist), the concept of a "last-opened file descriptor" is entirely meaningless. How can anyone think this would be a good idea?

Two new ways to read a file quickly

Posted Mar 7, 2020 10:39 UTC (Sat) by intgr (subscriber, #39733) [Link] (4 responses)

But in the context of a single io_uring command queue, "last-opened file descriptor" seems perfectly well defined.

Two new ways to read a file quickly

Posted Mar 7, 2020 17:33 UTC (Sat) by justincormack (subscriber, #70439) [Link] (3 responses)

Not if you open lots of files asynchronously. You need to identify which open you meant.

Two new ways to read a file quickly

Posted Mar 7, 2020 18:01 UTC (Sat) by nivedita76 (subscriber, #121790) [Link] (2 responses)

A linked sequence of io_uring operations provides that. If you need to operate on lots of files, you have lots of linked sequences, each of which opens one file and operates on it.

Two new ways to read a file quickly

Posted Feb 20, 2024 13:40 UTC (Tue) by sammythesnake (guest, #17693) [Link] (1 responses)

Sorry for the thread necromancy, but...

Isn't operating on thousands of files in one io_uring linked sequence exactly the kind of thing some applications would like to do, reducing thousands of syscalls to a couple...?

Would some kind of per-io_uring-linked-sequence "pseudo-FD" make sense? In/alongside open operations, you could provide a number (1, 2, 3...) for each file opened in the sequence that the kernel transparently maps to "real" FDs internally. Later operations could then identify which of the files opened within the sequence should be acted on (e.g. "read the file *this sequence* calls "1". Maybe with negative FD numbers...?)

The pFD *could* be sequentially allocated so subsequent calls would simply say "the third one opened" but keeping those straight while editing the sequence would be error-prone, so that's probably not a win over finding a way to nominate a pFD.

Obviously, they're are details to sort out like managing the pFD->FD mappings, and getting the API right, but none of that sounds nastier than the other things suggested in this thread (to me, at least - I'm merely a curious bystander!)

This is presumably a very naive question, but can't an io_uring open() operation save the FD returned it a look-up table to be referenced by later operations - that would seem the "obvious" way to me, but I assume this isn't possible, or this whole thread would be moot...

Two new ways to read a file quickly

Posted Feb 20, 2024 16:32 UTC (Tue) by kleptog (subscriber, #1183) [Link]

I think you have described the descriptorless FDs: https://lwn.net/Articles/863071/