Network block devices and OOM safety
Posted Apr 1, 2005 1:47 UTC (Fri) by giraffedata
In reply to: Network block devices and OOM safety
Parent article: Network block devices and OOM safety
I can confirm that filesystem drivers have this problem. I work on network filesystems, some of them based on iSCSI devices, and memory deadlocks and I have become good friends.
But they're a lot less common in filesystems, which is why people didn't demand a fix to the fundamental problem (network layer being simultaneously above and below the main memory pool) years ago.
Most network filesystem access is with NFS.
NFS in its normal configuration always writes synchronously, so the number of dirty pages outstanding at any moment is very small.
StorNext does relatively little direct network I/O; the majority of its I/O is through block devices. (And since those usually aren't TCP/IP-based devices, the current problem doesn't apply.)
Lustre hasn't seen a large variety of applications; it probably gets lucky.
Most filesystem drivers naturally meter their activity by using the buffer cache, with its somewhat ham-handed limitation on the total amount of memory it's willing to occupy. So long before memory usage gets critical, file-using processes slow down, waiting for new buffers, thereby giving the system a chance to clean out the old ones.
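To make the metering effect concrete, here is a toy sketch in Python. It is not kernel code and not how the buffer cache is actually implemented; the class name, method names, and the limit of two buffers are all made up for illustration. The only point is that a hard cap on buffers makes a writer stall until something has been cleaned.

```python
import threading


class BoundedBufferPool:
    """Hypothetical stand-in for a cache with a hard cap on its buffers."""

    def __init__(self, max_buffers):
        # One semaphore slot per buffer the pool is willing to hand out.
        self._free = threading.Semaphore(max_buffers)

    def get_buffer(self, timeout=None):
        # Blocks the caller while every buffer is in use; this is the throttle.
        # Returns True if a buffer was obtained, False if the wait timed out.
        return self._free.acquire(timeout=timeout)

    def buffer_cleaned(self):
        # Called once a dirty buffer has been written out and can be reused.
        self._free.release()


pool = BoundedBufferPool(max_buffers=2)   # assumed limit, for illustration only
got_first = pool.get_buffer(timeout=0.1)
got_second = pool.get_buffer(timeout=0.1)
got_third = pool.get_buffer(timeout=0.1)  # pool exhausted: the writer stalls
pool.buffer_cleaned()                     # writeback frees one buffer
got_after_clean = pool.get_buffer(timeout=0.1)
```

In the sketch the third request times out instead of waiting forever, just so the example terminates. A real file-using process would simply sleep there until a buffer came back, which is the slowdown described above.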
I work on a filesystem driver that uses its own cache manager instead of the buffer cache -- a cache manager that isn't afraid to use every byte of memory for file cache if that's the optimum use for it. Hence my deep experience with these deadlocks.