Development quote of the week

[Posted December 4, 2024 by jzb]

The biggest problem in benchmarking filesystem I/O is that Linux tries really hard to avoid doing it, aggressively using any spare memory as a filesystem cache. This is why serving static Web traffic out of the filesystem often remains a good idea in 2024; your server will take care of caching the most heavily fetched data in RAM without you having to do cache management, which everyone knows is hard.

I have read of various cache-busting strategies and have never really been convinced that they'll outsmart this aspect of Linux, which was written by people who are way smarter and know way more than I think I do. So Bonnie has always used a brute-force approach: Work on a test file which is much bigger than main memory, so Linux has to do at least some real I/O. Ideally you'd like it to be several times the memory size.

But this has a nasty downside. The computer I'm typing on has 32GB of memory, so I ran Bonnie with a 64G filesize (128G would have been better) and it took 35 minutes to finish. I really don't see any way around this annoyance but I guess it's not a fatal problem.

Oh, and those numbers: Some of them look remarkably big to me. But I'm an old guy with memories of how we had to move the bits back and forth individually back in the day, with electrically-grounded tweezers.

— Tim Bray

isn't fio this?

Posted Dec 5, 2024 2:18 UTC (Thu) by anarcat (subscriber, #66354) [Link] (3 responses)

Isn't fio, written by Linux kernel folks, designed precisely to work around those caching issues?

isn't fio this?

Posted Dec 5, 2024 5:48 UTC (Thu) by adobriyan (subscriber, #30858) [Link] (2 responses)

With fsync and fdatasync options used, kind of, yes.

isn't fio this?

Posted Dec 5, 2024 9:55 UTC (Thu) by k3ninho (subscriber, #50375) [Link] (1 responses)

I take the example fio invocation from the GitLab docs, which uses `--ioengine=libaio --direct=1` for async queuing and O_DIRECT respectively. Do I have this wrong? Is it all about the fsync?

K3n.

isn't fio this?

Posted Dec 5, 2024 13:32 UTC (Thu) by adobriyan (subscriber, #30858) [Link]

The original quote is all about non-O_DIRECT mode. fsync/fdatasync/msync force some I/O before the kernel decides it is a good idea.
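To make the buffered-mode point concrete, here is a small Python sketch of a write that is pushed to storage with `fsync` or `fdatasync` rather than left in the page cache until the kernel chooses to flush it. The function name `write_and_sync` and the `data_only` parameter are made up for this example. It only covers writes: later reads of the same file may still be served from memory.

```python
import os


def write_and_sync(path: str, data: bytes, data_only: bool = False) -> None:
    """Write `data` through the page cache, then force it out to storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        if data_only:
            # Flush file data, skipping metadata not needed to read it back.
            os.fdatasync(fd)
        else:
            # Flush file data and metadata.
            os.fsync(fd)
    finally:
        os.close(fd)
```

Note that `os.fdatasync` is available on Linux but not on every platform Python runs on.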

I'm probably missing something obvious

Posted Dec 5, 2024 11:35 UTC (Thu) by Rigrig (subscriber, #105346) [Link] (2 responses)

Can't you fill up the memory with random data before running the benchmark?

I'm probably missing something obvious

Posted Dec 5, 2024 15:20 UTC (Thu) by philipstorry (subscriber, #45926) [Link] (1 responses)

At a guess - and this is just a guess - it's because that won't be reliable.

The kernel is constantly looking at what's happening and trying to optimise it. If you just allocate a large chunk of memory and then do nothing with it, then at some point during the benchmark run the kernel may realise that memory isn't actually being used, and page it out.

This ruins your benchmark because you suddenly have disk paging in the middle of it. Worse, the exact point at which this happens is well beyond the developer's control. It could happen at very different times during different runs on the same machine, let alone across multiple machines.

The developer of the benchmark ends up second-guessing what the kernel will do, which is dictated by a combination of the kernel developers and the machine's kernel configuration. It is unknowable at the time of writing the benchmark.

So the most reliable strategy is simply, as Tim Bray says, to work with a file that you know is larger than the cache can possibly be. That way you know that you will be hitting the storage at some point, because it is impossible not to.
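For reference, the memory-filling idea from the question might look something like the Python sketch below. The name `allocate_ballast` is hypothetical. Nothing here pins the pages in RAM, so the kernel remains free to swap them out mid-benchmark, which is exactly the unreliability described above.

```python
import mmap
import os


def allocate_ballast(num_bytes: int) -> mmap.mmap:
    """Map anonymous memory and fill every page with random data.

    Writing to each page makes the kernel actually back it with memory.
    The pages are not locked, so they can still be paged out later.
    """
    if num_bytes <= 0:
        raise ValueError("num_bytes must be positive")
    page = mmap.PAGESIZE
    # Round up to a whole number of pages.
    length = -(-num_bytes // page) * page
    ballast = mmap.mmap(-1, length)
    for offset in range(0, length, page):
        ballast[offset:offset + page] = os.urandom(page)
    return ballast
```

The caller has to keep the returned object alive for the duration of the run and call `close()` on it afterwards.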

I'm probably missing something obvious

Posted Dec 5, 2024 23:12 UTC (Thu) by gerdesj (subscriber, #5446) [Link]

I wonder what a virty balloon driver actually does. Insert a balloon driver, inflate balloon, run IO tests, deflate balloon, remove driver.

Perhaps the VirtIO one could be co-opted for this task for physical systems. I have a nagging feeling that you will have to emulate a physical host and make the real one think it is a VM to get this to work!