A block layer introduction part 1: the bio layer

Posted Oct 25, 2017 19:12 UTC (Wed) by edos (guest, #116377)
Parent article: A block layer introduction part 1: the bio layer

I didn't completely understand the deadlock described in the article.
How can stacked block devices produce a deadlock based on interdependency? It is still not clear to me.

A block layer introduction part 1: the bio layer

Posted Oct 25, 2017 21:34 UTC (Wed) by neilbrown (subscriber, #359)

A simple, though extremely unlikely, scenario that could cause a deadlock is:
- Suppose I have a RAID1 array where each of the member devices is a RAID0 array with a 4K chunk size.
- An 8K write bio arrives for the RAID1 array. raid1 code allocates two bios from a private pool and sends an 8K bio to each of the RAID0 devices. These two bios get queued by generic_make_request.
- Then generic_make_request starts processing the first RAID0 bio. raid0 code needs to split it into two 4K bios and so allocates a bio from a private pool and submits the new bio and the old bio (now reduced in size) to the underlying devices. These two bios get queued by generic_make_request.
- Then generic_make_request starts processing the second RAID0 bio (newer code will have sorted this to the end of the list, to help avoid the deadlock). Again raid0 code needs to split the bio.

Now, suppose there is no free memory, suppose the private mempool has 16 preallocated entries, and suppose 16 threads all perform exactly this 8K write submission (to different addresses in the RAID1) at the same time.
We will end up with 16 threads all trying to allocate a second bio from the same private pool, while the 16 preallocated entries are each trapped, one per thread, in the generic_make_request queue. The allocations will wait for a previously allocated bio to complete, and those previous bios won't be processed by generic_make_request() until after the allocation completes.
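
The scenario above can be sketched as a small toy model. This is not kernel code: the function `simulate`, the flag `new_bios_first`, and the labels "split", "pooled" and "plain" are all made up for illustration, and the round-robin loop is only an assumed stand-in for "16 threads at the same time". Each simulated thread starts with the two 8K RAID0 bios already queued; splitting one takes an entry from a single shared pool, and that entry is returned only when the resulting bio reaches the head of the thread's queue and completes.

```python
from collections import deque


def simulate(num_threads, pool_size=16, new_bios_first=False):
    """Toy model of the per-thread bio queue. Returns the number of stuck threads."""
    free = pool_size  # entries left in the shared (hypothetical) split pool
    # Each thread has two 8K RAID0 bios queued, both of which need splitting.
    queues = [deque(["split", "split"]) for _ in range(num_threads)]

    while any(queues):
        progressed = False
        for q in queues:  # round-robin: one step per thread per pass
            if not q:
                continue
            if q[0] == "split":
                if free == 0:
                    continue  # allocation would block: this thread waits
                free -= 1
                q.popleft()
                children = ["pooled", "plain"]  # new bio + shrunken old bio
                if new_bios_first:
                    # Assumed model of the newer ordering: handle the
                    # children before the sibling RAID0 bio.
                    q.extendleft(reversed(children))
                else:
                    # Older ordering: children go behind the sibling bio.
                    q.extend(children)
            else:
                bio = q.popleft()  # bio reaches the device and completes
                if bio == "pooled":
                    free += 1  # its pool entry becomes available again
            progressed = True
        if not progressed:
            # Every unfinished thread is waiting on the empty pool.
            return sum(1 for q in queues if q)
    return 0


print(simulate(16))                       # older ordering
print(simulate(16, new_bios_first=True))  # children handled first
```

With the older ordering, every thread holds one pool entry behind a bio that cannot proceed without a second entry, so none of the 16 makes progress. With the children handled first, each entry is returned before the sibling bio needs one.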

There are other scenarios that are more complex, but are likely enough to actually happen in practice.

A block layer introduction part 1: the bio layer

Posted Oct 25, 2017 22:03 UTC (Wed) by Cyberax (✭ supporter ✭, #52523)

I've been looking at the bio layer and I'm wondering if BIO can have a "limp along" mode where it stops all threads and does synchronous submission from one thread? It then can either use a "last reserve" mempool or unsplit pending BIOs.

A block layer introduction part 1: the bio layer

Posted Oct 26, 2017 4:03 UTC (Thu) by neilbrown (subscriber, #359)

What would be the purpose, or value, of this "limp along" mode? I don't understand...

A block layer introduction part 1: the bio layer

Posted Oct 26, 2017 4:06 UTC (Thu) by Cyberax (✭ supporter ✭, #52523)

Crawl out of the out-of-memory situation to resolve the deadlock.

A block layer introduction part 1: the bio layer

Posted Oct 26, 2017 5:57 UTC (Thu) by neilbrown (subscriber, #359)

> Crawl out of the out-of-memory situation to resolve the deadlock.

Surely it is better to design the code to be deadlock-free. It isn't that hard once the problem is understood (and if the problem isn't understood, then a workaround like that might not be a complete solution).

A block layer introduction part 1: the bio layer

Posted Oct 26, 2017 6:04 UTC (Thu) by Cyberax (✭ supporter ✭, #52523)

I'm confused. Can the abovementioned scenario deadlock with the current bio layer?

A block layer introduction part 1: the bio layer

Posted Oct 27, 2017 2:01 UTC (Fri) by neilbrown (subscriber, #359)

> Can the abovementioned scenario deadlock with the current bio layer?

No, hence the parenthetical comment (newer code will have sorted this to the end of the list, to help avoid the deadlock).
Provided that drivers which split bios process only one of them and submit the other directly to generic_make_request(), there should be no deadlock (of this sort).

A block layer introduction part 1: the bio layer

Posted Oct 26, 2017 8:49 UTC (Thu) by edos (guest, #116377)

Nice example, thank you!