BIO vs DIO
Posted Jun 7, 2024 8:13 UTC (Fri) by Homer512 (subscriber, #85295)
In reply to: BIO vs DIO by Paf
Parent article: Measuring and improving buffered I/O
If we could have this happen automatically, it would help so much. For example, we could go back to using standard file formats without having to reimplement them. Right now I'm using a custom TIFF file writer for one application simply because there is no way of getting libtiff to do direct I/O. Same for things like HDF5.
Heck, I can't even use cp or rsync for backups at the moment since, for example, rsync does not go faster than 800 MB/s on that server.
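For readers who have not written one of these by hand, here is a rough sketch of what "doing direct I/O yourself" involves. It is not the custom TIFF writer mentioned above; `write_direct` and `ALIGN` are made-up names, and the 4096-byte alignment is an assumption (the real requirement depends on the device and filesystem). The buffer address and the write length both have to be aligned, so the data is copied into a page-aligned anonymous mapping, padded with zeros, and the padding is trimmed off afterwards. If the filesystem rejects `O_DIRECT` (tmpfs does, for instance), the sketch falls back to an ordinary buffered write.

```python
import mmap
import os

ALIGN = 4096  # assumed alignment; the real requirement is device-specific


def write_direct(path, data, align=ALIGN):
    """Write `data` to `path`, using O_DIRECT where the filesystem allows it.

    Returns True if direct I/O was used, False if it fell back to
    ordinary buffered I/O.
    """
    padded = -(-len(data) // align) * align   # round the length up to `align`
    buf = mmap.mmap(-1, max(padded, align))   # anonymous mappings are page-aligned
    buf[:len(data)] = data                    # the rest of the mapping stays zero

    def _write(extra_flags):
        flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | extra_flags
        fd = os.open(path, flags, 0o644)
        try:
            view = memoryview(buf)            # slicing a view does not copy
            done = 0
            while done < padded:
                done += os.write(fd, view[done:padded])
            os.ftruncate(fd, len(data))       # drop the zero padding
        finally:
            os.close(fd)

    direct_flag = getattr(os, "O_DIRECT", 0)  # not defined on every platform
    if direct_flag:
        try:
            _write(direct_flag)
            return True
        except OSError:
            pass                              # e.g. EINVAL: O_DIRECT unsupported here
    _write(0)                                 # buffered fallback
    return False
```

Calling `write_direct("/some/path/out.bin", payload)` is all there is to it, but that is the point of the complaint: a library like libtiff manages its own buffers and write sizes, so there is no place to plug this in without reimplementing the writer.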
Posted Jun 7, 2024 9:57 UTC (Fri) by Wol (subscriber, #4433)
Exactly the use case - shifting a huge volume of data that is going to be written to disk and that's it. What you do NOT want is Linux sticking it in the disk cache "just in case". And I think the speedup was (low) orders of magnitude.
I don't think it was even buffered I/O - if you can get Linux to disable the disk cache on that machine, see what results you get with standard TIFF, cp, etc.
Cheers,
Wol
Posted Jun 7, 2024 17:29 UTC (Fri) by Paf (subscriber, #91811)
Some of the marketing folks at a Lustre vendor put together something showing improvements in PyTorch checkpointing, for example.
It's something that would presumably be useful in other file systems too, but Lustre is out of tree, so it'll stay within Lustre for now. (It's GPLv2, just out of tree.)
If someone working upstream wants to copy the idea, I won't object. (Would like credit, of course!)