
Two new ways to read a file quickly


Posted Mar 7, 2020 23:02 UTC (Sat) by josh (subscriber, #17465)
In reply to: Two new ways to read a file quickly by adobriyan
Parent article: Two new ways to read a file quickly

It doesn't seem reasonable to put all the blame on userspace when the kernel gives it misleading information.

I wonder if we could enhance statx to have a STATX_SIZE_HINT flag? With that flag, statx could return a new attribute indicating that the file has an unspecified size and should be read in a single read call, along with a hint for a buffer size that's *probably* big enough. That would substantially reduce the number of read calls.

(Also, for future reference, the first statx call is Rust probing to see if the kernel supports statx, and it only happens for the first statx in the program. Likewise, the fcntl checks if the kernel respects O_CLOEXEC, and that only happens on the first open.)
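The "misleading information" is easy to observe from userspace: procfs files typically report a size of 0 from stat even though a read returns data. Below is a minimal sketch of that check, assuming a Linux system with /proc mounted. The helper name `size_vs_read` is hypothetical and not taken from the thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): store the size that fstat()
 * reports and the byte count that a single read() returns.
 * Returns 0 on success, -1 if the file cannot be opened, stat'ed or read.
 */
int size_vs_read(const char *path, long long *stat_size, long long *bytes_read)
{
    char buf[4096];
    struct stat st;

    int fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    ssize_t n = read(fd, buf, sizeof buf);
    close(fd);
    if (n < 0)
        return -1;

    *stat_size = (long long)st.st_size;
    *bytes_read = (long long)n;
    return 0;
}
```

Calling this on /proc/uptime should give a reported size of 0 alongside a non-zero read count, which is why a reader that sizes its buffer from stat has to fall back to growing it.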



Two new ways to read a file quickly

Posted Mar 9, 2020 14:10 UTC (Mon) by walters (subscriber, #7396)

Maybe a simpler change would be for the kernel to cache the *last* size of a file in /proc and report it? Though it might trigger bugs in userspace apps that wouldn't be prepared for the file growing between stat() and read().

Two new ways to read a file quickly

Posted Mar 9, 2020 15:29 UTC (Mon) by mathstuf (subscriber, #69389)

I think any app or library expecting stat to have trustworthy information at read time without some safety net is already a disaster waiting to happen (especially if they're in the /proc hierarchy; the file could fail to *open* after stat returns because the process exited in the meantime).

Two new ways to read a file quickly

Posted Mar 11, 2020 11:51 UTC (Wed) by adobriyan (subscriber, #30858)

The best way to read /proc is to read at least PAGE_SIZE at once, and to interpret a short read as EOF for small files like /proc/uptime or /proc/*/statm, which are bounded in size. For unbounded files (/proc/*/maps), bigger reads should be (PAGE_SIZE << n):

m->buf = seq_buf_alloc(m->size <<= 1);

Most sysfs files are 4KB at most, but binary attributes can be arbitrarily sized.
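The reading strategy described above can be sketched in userspace roughly as follows. This is an illustration under assumptions, not code from the thread: the helper names `read_bounded` and `read_whole` are hypothetical, the page size is taken from `sysconf(_SC_PAGESIZE)`, and the unbounded variant simply doubles its buffer and reads until `read()` returns 0 rather than trusting a short read.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper for small, bounded files (e.g. /proc/uptime):
 * one open, one read. If the result is smaller than cap, the caller
 * treats it as the complete file. Returns bytes read, or -1 on error.
 */
ssize_t read_bounded(const char *path, char *buf, size_t cap)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    ssize_t n;
    do {
        n = read(fd, buf, cap);
    } while (n < 0 && errno == EINTR);

    close(fd);
    return n;
}

/*
 * Hypothetical helper for unbounded files (e.g. /proc/<pid>/maps):
 * start with one page and double the buffer each time it fills up,
 * stopping only when read() reports EOF by returning 0.
 * Returns a malloc'd buffer (caller frees) and sets *len_out,
 * or returns NULL on error.
 */
char *read_whole(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);

    long page = sysconf(_SC_PAGESIZE);
    size_t cap = page > 0 ? (size_t)page : 4096;
    size_t len = 0;

    int fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return NULL;

    char *buf = malloc(cap);
    if (buf == NULL) {
        close(fd);
        return NULL;
    }

    for (;;) {
        ssize_t n = read(fd, buf + len, cap - len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            free(buf);
            close(fd);
            return NULL;
        }
        if (n == 0)
            break;                      /* EOF */
        len += (size_t)n;
        if (len == cap) {               /* buffer full: double it */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                close(fd);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
    }

    close(fd);
    *len_out = len;
    return buf;
}
```

The bounded variant saves the extra `read()` that would otherwise be needed to see EOF. The unbounded variant pays for that final call because files such as /proc/*/maps can return short reads before the end of the file.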


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds