The obvious solution is to allow the mm code to move pages that are queued for I/O, temporarily blocking the actual hardware submission, instead of waiting until the hardware completes the I/O.
In addition, the number of pages actually submitted to the hardware at the same time needs to be limited, or the pages must be copied to a compact bounce buffer before being submitted to the hardware, unless they are already contiguous.
Also, with a suitable IOMMU, it should be possible to move pages even while they are actively read by the hardware.
The fix proposed instead seems to make no sense, as it doesn't fix the issue and introduces unpredictable performance degradation.
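
To make the queue-side idea concrete, here is a rough user-space toy model in Python, not kernel code. It only illustrates the two rules suggested above: a page stays movable while its request is merely queued, and only a bounded number of pages are handed to the hardware at once. The names `IoQueue`, `max_in_flight`, `migrate` and so on are made up for this sketch and do not correspond to any kernel API; a real implementation would also need locking around the move, which this single-threaded model leaves out.

```python
from collections import deque


class IoQueue:
    """Toy model: pages queued for I/O stay movable until the hardware owns them."""

    def __init__(self, max_in_flight):
        self.max_in_flight = max_in_flight  # cap on pages pinned by hardware (assumed knob)
        self.queued = deque()               # request ids waiting for submission, still movable
        self.in_flight = set()              # request ids owned by the hardware, pinned
        self.frame_of = {}                  # request id -> physical frame number

    def enqueue(self, req, frame):
        """Queue an I/O request against the page currently in `frame`."""
        self.frame_of[req] = frame
        self.queued.append(req)

    def submit(self):
        """Hand queued requests to the hardware, never exceeding the in-flight cap."""
        started = []
        while self.queued and len(self.in_flight) < self.max_in_flight:
            req = self.queued.popleft()
            self.in_flight.add(req)
            started.append(req)
        return started

    def migrate(self, req, new_frame):
        """Move the page behind `req`; refused once the hardware owns it."""
        if req in self.in_flight:
            return False  # pinned; moving it would need something like IOMMU remapping
        self.frame_of[req] = new_frame  # the queued request simply follows the page
        return True

    def complete(self, req):
        """Hardware finished the request; its page is no longer pinned."""
        self.in_flight.remove(req)
        del self.frame_of[req]


q = IoQueue(max_in_flight=2)
for req, frame in [("a", 10), ("b", 11), ("c", 12)]:
    q.enqueue(req, frame)

first_batch = q.submit()         # only two requests reach the hardware
moved_queued = q.migrate("c", 99)   # still queued, so the page can move
moved_pinned = q.migrate("a", 50)   # already in flight, so it cannot
q.complete("a")
second_batch = q.submit()        # the freed slot lets "c" go out, at its new frame
```

In this model the mm side never has to wait for more than `max_in_flight` pages to complete, which is the point of limiting what is submitted at once; everything else in the queue can be moved immediately.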