The iSCSI memory deadlock problem
Posted Jul 22, 2005 18:57 UTC (Fri) by giraffedata
Parent article: Kernel Summit 2005: Convergence of network and storage paths
There really is no amount of memory you can simply set aside in an emergency pool to avoid these memory deadlocks. You have to reserve memory for a particular thread of execution; the more threads, the more memory. And you have to reserve the memory and other resources in a fixed order. That is, make sure you never need a Level N or Level N+1 resource in order to reach the point where you can release a Level N resource you are already holding.
This is where iSCSI has a special problem. Linux has been designed so that the network is a higher layer than memory management. A network function can request MM services, but an MM function can't request network services. But in iSCSI, MM does in fact need network services (to give you memory, MM has to clean dirty memory, which means it needs block services, which need network services).
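To make the ordering rule concrete, here is a minimal sketch of a checker that rejects out-of-order requests. The level numbers, the class names and the exception are all made up for illustration; nothing like this exists in the kernel under these names.

```python
class OrderViolation(Exception):
    """Raised when a request would break the fixed acquisition order."""


class OrderedHolder:
    """Tracks the resource levels held by one thread of execution.

    Hypothetical convention for this sketch: a higher number is a higher
    layer, and a holder may only request levels strictly below everything
    it already holds.
    """

    def __init__(self):
        self.held = []  # levels currently held, in acquisition order

    def acquire(self, level):
        if self.held and level >= min(self.held):
            raise OrderViolation(
                f"holding level {min(self.held)}, cannot request level {level}"
            )
        self.held.append(level)

    def release(self, level):
        self.held.remove(level)


# Assumed numbering, chosen only to mirror the layering described above.
NETWORK, BLOCK, MM = 2, 1, 0

# A network function asking MM for memory respects the order.
ok = OrderedHolder()
ok.acquire(NETWORK)
ok.acquire(MM)

# MM asking for network services (the iSCSI case) does not.
bad = OrderedHolder()
bad.acquire(MM)
try:
    bad.acquire(NETWORK)
    violated = False
except OrderViolation:
    violated = True
```

The second case is the inversion iSCSI introduces: a holder at the MM level needs something from a level above it before it can let go.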
The only fix is to put the resource grabbing back in order -- make sure a process reserves all of the memory it needs to finish what it's doing at one time, and somehow makes it available to the functions further down its stack that need it. This is a nontrivial extension of various kernel services.
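A rough sketch of what "reserve everything at one time and hand it down the stack" could look like. The `Pool` and `Reservation` classes, the function names and the page counts are all assumptions for illustration, not existing kernel interfaces.

```python
class Pool:
    """A fixed number of pages shared by every thread of execution."""

    def __init__(self, pages):
        self.free = pages

    def reserve(self, pages):
        """Take `pages` all at once, or take nothing and return None."""
        if pages > self.free:
            return None
        self.free -= pages
        return Reservation(self, pages)


class Reservation:
    """Pages already set aside for one piece of work."""

    def __init__(self, pool, pages):
        self.pool = pool
        self.total = pages
        self.unused = pages

    def take(self, pages):
        # Lower layers draw from the reservation, never from the pool.
        if pages > self.unused:
            raise RuntimeError("reservation was sized too small")
        self.unused -= pages

    def release(self):
        self.pool.free += self.total
        self.total = self.unused = 0


def send_over_network(res):
    res.take(2)  # assumed cost of building and sending the packets


def write_block(res):
    res.take(1)  # assumed cost of the block request itself
    send_over_network(res)


def clean_page(pool):
    """Top of the stack: reserve for every layer below before starting."""
    res = pool.reserve(3)  # 1 for block + 2 for network (assumed)
    if res is None:
        return False  # refuse to start rather than stall halfway
    try:
        write_block(res)
    finally:
        res.release()
    return True


big = Pool(4)
started = clean_page(big)

small = Pool(2)
refused = not clean_page(small)
```

The point of the design is that `write_block` and `send_over_network` never go back to the shared pool, so they cannot block on memory once the work has started. The hard part in a real kernel is knowing the number to pass to `reserve`.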
As for the problem that a process' memory requirement changes without any kernel participation when the process dirties memory: The kernel has to make the reservation at the time it adds a dirtyable page to the process' address space.
Throttling, like an arbitrary emergency reserve, mitigates but does not solve the deadlock problem. Throttling is where you choke off new work at its source. It can help performance and fairness. But to avoid deadlock, you have to push the work back from the destination: don't accept any piece of work until you've reserved all the resources needed to guarantee you'll complete it. It has the same slowing effect, but is fundamentally different from throttling.
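The difference can be shown with a toy simulation. This is only a sketch under simplified assumptions: every job needs a fixed number of pages, grabs them one per round, and frees them all when it finishes. The function name, the policy strings and the numbers are invented for the example.

```python
from collections import deque


def simulate(needs, free_pages, policy, cap=2):
    """Return True if every job finishes, False if the jobs get stuck."""
    pending = deque(needs)
    in_flight = []  # each entry is [pages_needed, pages_held]
    while pending or in_flight:
        progressed = False
        # Admission step.
        while pending:
            need = pending[0]
            if policy == "throttle":
                # Limit the number of jobs, without looking at their needs.
                if len(in_flight) >= cap:
                    break
                in_flight.append([need, 0])
            else:
                # "reserve": accept only if the whole need can be set aside.
                if need > free_pages:
                    break
                free_pages -= need
                in_flight.append([need, need])
            pending.popleft()
            progressed = True
        # Work step: each job grabs one more page if it still needs one.
        for job in in_flight:
            if job[1] < job[0] and free_pages > 0:
                free_pages -= 1
                job[1] += 1
                progressed = True
        # Completion step: a job holding everything finishes and frees it.
        for job in [j for j in in_flight if j[1] == j[0]]:
            in_flight.remove(job)
            free_pages += job[0]
            progressed = True
        if not progressed:
            return False  # jobs remain but none can move: deadlock
    return True


throttled = simulate([3, 3], free_pages=4, policy="throttle")
reserved = simulate([3, 3], free_pages=4, policy="reserve")
```

With two jobs of three pages each and four pages free, the throttled run admits both, each ends up holding two pages, and neither can get its third. The reserving run makes the second job wait at the door until the first has finished, so both complete.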