Two new ways to read a file quickly

Posted Mar 8, 2020 7:55 UTC (Sun) by mm7323 (subscriber, #87386)
In reply to: Two new ways to read a file quickly by ploxiln
Parent article: Two new ways to read a file quickly

> Programs like ps, top, grep ...

But is anyone actually complaining or concerned about poor performance of these programs?

Posted Mar 8, 2020 11:29 UTC (Sun) by mpr22 (subscriber, #60784)

top's job is to inspect system performance, so it should itself be performant to minimize its impact on system performance.

And I'm sure plenty of people care about grep running slower than it needs to.

Posted Mar 8, 2020 17:00 UTC (Sun) by mm7323 (subscriber, #87386)

> And I'm sure plenty of people care about grep running slower than it needs to.

grep on large files should be I/O bound, and grep on a small file is surely dominated by process startup time rather than by an extra system call to get file contents into a buffer.

I've also never noticed top negatively impacting system performance; even the busybox version on little embedded systems has never caused me a problem or disappointed.

Generically allowing system-call batching is a good idea, but personally I'm less convinced by esoteric system calls for specific and limited use cases.

Posted Mar 8, 2020 18:06 UTC (Sun) by andresfreund (subscriber, #69562)

I have seen top use a noticeable fraction of CPU time. You're not gonna hit that on a 2 core system with 30 idling processes. But a busy 64 core system with a few thousand processes is a different story.

Posted Mar 8, 2020 23:42 UTC (Sun) by himi (subscriber, #340)

Very much so, and even in less intrusive cases it introduces a spike in instantaneous load that can affect interactive and time-sensitive processes (games, audio, that kind of thing).

And the general principle that measuring a system should have as little impact as possible on the properties being measured definitely holds.

Posted Mar 9, 2020 11:05 UTC (Mon) by Sesse (subscriber, #53779)

grep on a small file, sure, but what about grep on many small files (e.g. grep -r foo /usr/include)?

Posted Mar 9, 2020 22:00 UTC (Mon) by roc (subscriber, #30627)

I grep through large source trees all the time: several gigabytes of data spread over thousands or millions of files, all cached in RAM, so not I/O bound.

Posted Mar 8, 2020 20:02 UTC (Sun) by excors (subscriber, #95769)

Many people do care about grep being slow, and care enough to rewrite it from scratch. Even ack is advertised as being faster than grep (mostly by filtering the list of files to search, e.g. skipping .git directories by default), and ag is advertised as "an order of magnitude faster than ack" (because of multithreading, mmap() instead of read(), faster regex engine, etc), and ripgrep is advertised as being much faster than ag (better multithreading, both mmap() and read() in different situations, better regex engine, etc).

There's no point optimising the kernel for grep itself, because grep could be improved by maybe two orders of magnitude with purely application-level changes; but it might be worth considering whether kernel changes could improve the performance of ripgrep, which has already taken most of the low-hanging fruit.
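
To make the mmap()-versus-read() distinction above concrete, here is a minimal Python sketch of the two access patterns. It is not how ag or ripgrep are implemented; the function names `contains_read` and `contains_mmap` are hypothetical, and the sketch only shows that one approach copies file contents into a user-space buffer while the other searches a mapping of the page cache.

```python
import mmap
import os


def contains_read(path, needle):
    """Search by copying the whole file into a user-space buffer via read()."""
    with open(path, "rb") as f:
        return needle in f.read()


def contains_mmap(path, needle):
    """Search by mapping the file and scanning the mapping directly."""
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            # mmap() of an empty file raises ValueError, so handle it here.
            return needle == b""
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m.find(needle) != -1
```

Which variant is faster depends on file size and on how many files are involved, which is presumably why ripgrep is described above as using both in different situations.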

Posted Mar 8, 2020 23:51 UTC (Sun) by himi (subscriber, #340)

The point of readfile() in the context of ps and the like is that these tools open, read, and close lots of small files. That is a lot of small operations on the filesystem, which on something like a clustered filesystem can translate into many distributed lock and metadata operations, with a major impact on filesystem performance. Using a single syscall won't get rid of all that, but depending on the implementation it could represent a significant improvement.
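
For illustration, the open/read/close pattern described above looks roughly like this in Python, using the thin `os` wrappers around the system calls. The helper names `read_small_file` and `read_many` are hypothetical, and this is only a sketch of the existing per-file pattern, not of the proposed readfile() interface.

```python
import os


def read_small_file(path, bufsize=4096):
    """Read one file with explicit open, read, and close calls."""
    fd = os.open(path, os.O_RDONLY)       # one open per file
    try:
        chunks = []
        while True:
            chunk = os.read(fd, bufsize)  # read until an empty result (EOF)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)                      # one close per file


def read_many(paths):
    """Read many small files; unreadable ones map to None."""
    results = {}
    for path in paths:
        try:
            results[path] = read_small_file(path)
        except OSError:
            # For example, a /proc entry whose process has already exited.
            results[path] = None
    return results
```

Every file costs at least three system calls here (open, one or more reads, close), so a tool scanning thousands of per-process files repeats that overhead thousands of times per refresh.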

Posted Mar 17, 2020 16:38 UTC (Tue) by mebrown (subscriber, #7960)

I am. I have an embedded system that I need to monitor for performance issues, and I'd rather my performance-monitoring tools not cause performance issues themselves!

In practice I observe that current implementations of top use a noticeable percentage of CPU, which can throw off my observations.