> I doubt that 5 years from now the norm for *desktops* will be 12 disks
> on a RAID and 8 threads writing on it.
That device used RAID to get the capacity over 16TB, and 512MB of BBWC to keep the IOPS and bandwidth high to/from spinning disk storage. It sustained up to 10,000 IOPS and 600MB/s on some of those workloads. That's the only reason RAID was used.
To put that IO capability in context, my desktop already has 8 processor threads and a pair of SSDs that sustain 70,000 random write IOPS and 500MB/s. That cost me about $1500 a couple of years ago. It has more capability from the IO perspective than the hardware that I ran the tests on (except for capacity). And I do 8-way kernel builds on it, so I'm already writing with 8 threads to the filesystem on my desktop machine.
IOWs, the test system and tests I described are relevant not only to people with current low end servers but also to those with current medium-to-high-end desktops and laptops. As I've said previously, that was the whole point of the exercise - to show how capable cheap, low end storage is these days and how only XFS can make full use of it....
> Now, 3 seconds compared to 2 seconds doesn't sound so bad, but I'd expect
> that to mean also that the operations that now take 2 minutes on my
> ext4 to take 3 minutes on XFS. A 50% difference certainly means it is
> appreciably slower.
You simply can't make that sort of generic extrapolation from one workload to all workloads. Because the main tests I ran reflected one -specific- aspect of metadata workloads (creates in isolation, read in isolation, unlink in isolation) you can use those numbers to get an idea of how a workload that is heavy in those behaviours might perform and scale. But you cannot say that all workloads will end up with a specific performance differential just because one workload had that differential...
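For anyone wanting to get a feel for that kind of workload on their own hardware, the "creates in isolation" phase can be approximated with a trivial parallel script like the one below. This is not the actual test harness I used - just a hypothetical sketch of the workload shape (8 writer threads each creating a batch of empty files, matching the 8-way parallelism mentioned above; directory path and file counts are made up for illustration):

```shell
#!/bin/sh
# Hypothetical sketch of an isolated-create metadata workload:
# 8 parallel writers, each creating FILES_PER_THREAD empty files
# in its own subdirectory. Illustrative only, not a real benchmark.
DIR=${1:-/tmp/createtest}
THREADS=8
FILES_PER_THREAD=100

mkdir -p "$DIR"
for t in $(seq 1 $THREADS); do
    (
        mkdir -p "$DIR/t$t"
        i=1
        while [ $i -le $FILES_PER_THREAD ]; do
            # : > file is a cheap way to create an empty file
            : > "$DIR/t$t/file$i"
            i=$((i + 1))
        done
    ) &
done
wait
echo "created $((THREADS * FILES_PER_THREAD)) files"
```

Run it under time(1) on different filesystems and you get a crude view of how create-heavy metadata performance compares - but, as above, that tells you about create-heavy workloads only, not about everything else.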