LWN: Comments on "Device-to-device memory-transfer offload with P2PDMA" https://lwn.net/Articles/767281/ This is a special feed containing comments posted to the individual LWN article titled "Device-to-device memory-transfer offload with P2PDMA". Device-to-device memory-transfer offload with P2PDMA https://lwn.net/Articles/1001294/ https://lwn.net/Articles/1001294/ sammythesnake <div class="FormattedComment"> <span class="QuotedText">&gt; two functions on the same device cannot do P2P DMA with this patch series if they are plugged directly into the root port</span><br> <p> That seems like a fairly likely use case - passing off some data from one stage of processing to another, so hopefully this restriction is lifted soon. I imagine that's a direction in the developers' sights, though - I'm happy to assume that my negligible level of domain knowledge is outdone by theirs ;-)<br> <p> A couple of possible factors that might make it less of an urgent need occur to me, though; how likely are these, I wonder?<br> <p> 1. How common would it be for these related functions to be plugged into the root, rather than sharing a (device-internal?) bridge? <br> <p> 2. I imagine such devices might simply share the memory between the stages and not need DMA at all for this kind of stage-to-stage handover...?<br> </div> Sat, 07 Dec 2024 11:51:56 +0000 size requirement for pci_p2pdma_add_resource()? https://lwn.net/Articles/992365/ https://lwn.net/Articles/992365/ KCLWN <div class="FormattedComment"> I am calling pci_p2pdma_add_resource() with a portion of a 32MB BAR. I've been successful using size values of 16MB, 28MB, 30MB and 32MB. For size values of 29MB and 31MB, I get the following failure. 
Does the size value to pci_p2pdma_add_resource() need to be multiple of 2MB (large page size)?<br> <p> [ 472.762396] ------------[ cut here ]------------<br> [ 472.762400] Misaligned __add_pages start: 0x600da000 end: 0x600dbeff<br> [ 472.762409] WARNING: CPU: 30 PID: 199 at mm/memory_hotplug.c:395 __add_pages+0x121/0x140<br> [ 472.762420] Modules linked in: dre_drv(OE+) qrtr cfg80211 intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm binfmt_misc irqbypass dax_hmem cxl_acpi rapl cxl_core nls_iso8859_1 ipmi_ssif ast i2c_algo_bit acpi_ipmi i2c_piix4 ccp k10temp ipmi_si ipmi_devintf ipmi_msghandler joydev input_leds mac_hid dm_multipath msr efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 hid_generic rndis_host usbhid cdc_ether usbnet hid mii crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 i40e nvme nvme_core ahci nvme_auth xhci_pci libahci xhci_pci_renesas aesni_intel crypto_simd cryptd [last unloaded: dre_drv(OE)]<br> [ 472.762567] CPU: 30 PID: 199 Comm: kworker/30:0 Tainted: G W OE 6.8.0-45-generic #45-Ubuntu<br> [ 472.762573] Hardware name: Supermicro AS -2025HS-TNR/H13DSH, BIOS 1.6a 03/28/2024<br> [ 472.762576] Workqueue: events work_for_cpu_fn<br> [ 472.762584] RIP: 0010:__add_pages+0x121/0x140<br> [ 472.762591] Code: bc c6 05 aa 6b 5c 01 01 e8 2c e4 f7 fe eb d3 49 8d 4c 24 ff 4c 89 fa 48 c7 c6 70 57 84 bc 48 c7 c7 50 42 e8 bc e8 ef ed ec fe &lt;0f&gt; 0b eb b4 0f b6 f3 48 c7 c7 50 02 84 bd e8 0c e8 6f ff eb b6 66<br> [ 472.762595] RSP: 0018:ff6b7dd90cedfbc0 EFLAGS: 00010246<br> [ 472.762600] RAX: 0000000000000000 RBX: 00000000600da000 RCX: 0000000000000000<br> [ 472.762604] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000<br> [ 472.762607] RBP: ff6b7dd90cedfbf0 R08: 0000000000000000 R09: 0000000000000000<br> [ 472.762609] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000600dbf00<br> [ 472.762612] R13: ff6b7dd90cedfca0 R14: 0000000000000000 R15: 00000000600da000<br> [ 472.762615] FS: 0000000000000000(0000) GS:ff3edc4137a00000(0000) knlGS:0000000000000000<br> [ 472.762619] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033<br> [ 472.762623] CR2: 00007ffcfc44b9c0 CR3: 0000005453a26001 CR4: 0000000000f71ef0<br> [ 472.762626] PKRU: 55555554<br> [ 472.762629] Call Trace:<br> [ 472.762632] &lt;TASK&gt;<br> [ 472.762638] ? show_regs+0x6d/0x80<br> [ 472.762645] ? __warn+0x89/0x160<br> [ 472.762653] ? __add_pages+0x121/0x140<br> [ 472.762659] ? report_bug+0x17e/0x1b0<br> [ 472.762668] ? handle_bug+0x51/0xa0<br> [ 472.762673] ? exc_invalid_op+0x18/0x80<br> [ 472.762678] ? asm_exc_invalid_op+0x1b/0x20<br> [ 472.762688] ? __add_pages+0x121/0x140<br> [ 472.762696] add_pages+0x17/0x70<br> [ 472.762702] arch_add_memory+0x45/0x60<br> [ 472.762708] pagemap_range+0x232/0x420<br> [ 472.762717] memremap_pages+0x10e/0x2a0<br> [ 472.762722] ? srso_alias_return_thunk+0x5/0xfbef5<br> [ 472.762730] devm_memremap_pages+0x22/0x70<br> [ 472.762736] pci_p2pdma_add_resource+0x1c7/0x560<br> [ 472.762744] ? srso_alias_return_thunk+0x5/0xfbef5<br> [ 472.762750] ? DRE_dmDevMemAlloc+0x44a/0x580 [dre_drv]<br> [ 472.762811] DRE_drvProbe+0xc07/0xf30 [dre_drv]<br> [ 472.762852] local_pci_probe+0x44/0xb0<br> [ 472.762859] work_for_cpu_fn+0x17/0x30<br> [ 472.762864] process_one_work+0x16c/0x350<br> [ 472.762872] worker_thread+0x306/0x440<br> [ 472.762881] ? 
__pfx_worker_thread+0x10/0x10<br> [ 472.762887] kthread+0xef/0x120<br> [ 472.762893] ? __pfx_kthread+0x10/0x10<br> [ 472.762899] ret_from_fork+0x44/0x70<br> [ 472.762904] ? __pfx_kthread+0x10/0x10<br> [ 472.762910] ret_from_fork_asm+0x1b/0x30<br> [ 472.762921] &lt;/TASK&gt;<br> [ 472.762924] ---[ end trace 0000000000000000 ]---<br> [ 472.774397] ------------[ cut here ]------------<br> </div> Tue, 01 Oct 2024 03:50:11 +0000 Device-to-device memory-transfer offload with P2PDMA https://lwn.net/Articles/767758/ https://lwn.net/Articles/767758/ mrybczyn <div class="FormattedComment"> Hello Stephen,<br> You're right, there is the dependency on ZONE_DEVICE that I didn't mention, as I think it's not going to matter for most potential users. The addition of support for other architectures and future integration with other subsystems (enabling usage with GPUs...) may be a subject for a follow-up.<br> <p> Cheers<br> Marta<br> </div> Sat, 06 Oct 2018 15:23:05 +0000 Device-to-device memory-transfer offload with P2PDMA https://lwn.net/Articles/767653/ https://lwn.net/Articles/767653/ sbates <div class="FormattedComment"> Hey Marta<br> <p> One thing the article did not comment on is the ARCH-specific nature of P2PDMA. While the framework is ARCH-agnostic, we do rely on devm_memremap_pages(), which relies on ZONE_DEVICE, which *is* ARCH-specific (and in turn relies on MEMORY_HOTPLUG). Right now this includes x86_64 but not (for example) aarch64. Interestingly for some, we are looking at adding ARCH_HAS_ZONE_DEVICE for riscv because we see that architecture as an interesting candidate for P2PDMA. <br> <p> Of course, patches that add ZONE_DEVICE support to other architectures would be very cool.<br> <p> Cheers<br> <p> Stephen<br> </div> Fri, 05 Oct 2018 01:48:18 +0000 P2PDMA vs dmabuf? https://lwn.net/Articles/767652/ https://lwn.net/Articles/767652/ sbates <div class="FormattedComment"> Hey<br> <p> As I understand it, dmabuf is all about exposing these buffers to userspace. P2PDMA is not quite ready to go that far, but as we start looking at userspace interfaces we will definitely look at dmabuf.<br> <p> Oh, and if you want to look at extending P2PDMA to tie into dmabuf, we'd be more than happy to review that work!<br> <p> Cheers<br> <p> Stephen<br> </div> Fri, 05 Oct 2018 01:42:32 +0000 Device-to-device memory-transfer offload with P2PDMA https://lwn.net/Articles/767651/ https://lwn.net/Articles/767651/ sbates <div class="FormattedComment"> Willy<br> <p> Logan just submitted v9 today. Perhaps comment on that with your size_t concerns. All input gratefully received ;-).<br> <p> Stephen<br> </div> Fri, 05 Oct 2018 01:40:20 +0000 Device-to-device memory-transfer offload with P2PDMA https://lwn.net/Articles/767636/ https://lwn.net/Articles/767636/ jgg <div class="FormattedComment"> 'behind the same bridge' is the right language, if a little confusing. It doesn't mean 'behind the last bridge' but simply behind any bridge; i.e. the upstream bridge of a switch is sufficient to satisfy the condition, even though there are later bridges before reaching the device.<br> <p> Behind the same root port (for PCI-E) is not quite the same thing; i.e. two functions on the same device cannot do P2P DMA with this patch series if they are plugged directly into the root port.<br> <p> All that aside, this series does have the requirement that the devices be behind a switch. You can't use it on a GPU and an NVMe drive plugged directly into root ports on your CPU, for instance. 
This greatly limits the utility, and hopefully the requirement will go away eventually, once people figure out how to whitelist root complexes and BIOSes that support this functionality.<br> </div> Thu, 04 Oct 2018 21:13:39 +0000 Device-to-device memory-transfer offload with P2PDMA https://lwn.net/Articles/767625/ https://lwn.net/Articles/767625/ willy <div class="FormattedComment"> Just because a parameter is called 'size' does not mean it should have type 'size_t'. In this case, it's the length of a (subset of a) BAR, and it can easily be 64-bit on a 32-bit kernel. It should probably be phys_addr_t (even though it's a length, not an address).<br> </div> Thu, 04 Oct 2018 15:57:33 +0000 Device-to-device memory-transfer offload with P2PDMA https://lwn.net/Articles/767545/ https://lwn.net/Articles/767545/ mrybczyn <div class="FormattedComment"> Yes, you're right. It would be more accurate to say "behind a host bridge". You will find more about it in the last part of the article, where it talks about the use cases.<br> </div> Wed, 03 Oct 2018 16:25:08 +0000 PCI devices https://lwn.net/Articles/767544/ https://lwn.net/Articles/767544/ mrybczyn <div class="FormattedComment"> In the PCI subsystem it is often understood as covering all variants, currently mainly PCI Express. NVMe drives, for example, are PCI Express only.<br> </div> Wed, 03 Oct 2018 16:22:49 +0000 Device-to-device memory-transfer offload with P2PDMA https://lwn.net/Articles/767509/ https://lwn.net/Articles/767509/ dullfire <div class="FormattedComment"> There seems to be a mistake in the article.<br> While I have not read the patch set, it would not make sense to require "all devices involved are behind the same PCI bridge".<br> I suspect the term "PCI host bridge" was intended (because that would have the effect that the paragraph describes). Furthermore, since in PCIe all devices sit behind their own PCI bridge (devices, not functions; also, as a quick overview, a PCIe switch is made up of a set of PCIe bridges, one for upstream... and one for each downstream port), it would effectively be impossible to have two PCIe devices ever use this functionality, which would render it moot.<br> </div> Wed, 03 Oct 2018 11:38:59 +0000 PCI devices https://lwn.net/Articles/767504/ https://lwn.net/Articles/767504/ epa <div class="FormattedComment"> When the article talks about PCI devices, does it really mean the old-style 32- or 64-bit wide, 33MHz or 66MHz bus? Or should it be taken to include PCI Express (PCIe) as using the same kind of register setup, even though it's electrically rather different?<br> </div> Wed, 03 Oct 2018 10:17:34 +0000 P2PDMA vs dmabuf? https://lwn.net/Articles/767502/ https://lwn.net/Articles/767502/ shalem <div class="FormattedComment"> I wonder how this relates to dmabuf, especially given the comment about using P2P with GPUs, where dmabuf is already used?<br> <p> I guess dmabuf is tied to DMAing from/to main memory? So does P2PDMA allow (through e.g. some simple helpers) using a dmabuf as the source/destination of the P2P transfer?<br> </div> Wed, 03 Oct 2018 08:16:09 +0000 Device-to-device memory-transfer offload with P2PDMA https://lwn.net/Articles/767487/ https://lwn.net/Articles/767487/ sbates <div class="FormattedComment"> <p> Thanks, Marta, for the excellent article summing up where we are with P2PDMA. I also gave a summary talk on P2PDMA at SNIA's Storage Developer Conference in September. 
The slides for that talk should be available at this link: <a href="https://tinyurl.com/y8sazb79">https://tinyurl.com/y8sazb79</a>. You might want to update the article to point to these slides as well as the older ones you mention.<br> <p> Stephen<br> </div> Tue, 02 Oct 2018 21:51:17 +0000
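
To make KCLWN's alignment question above more concrete, here is a minimal, hypothetical provider-side sketch. It is not taken from the patch set or from any real driver: the BAR index, the function name, and the rounding helper are assumptions for illustration. The call trace in that comment shows pci_p2pdma_add_resource() reaching __add_pages() through devm_memremap_pages(), and memory hotplug on x86-64 works in 2MB subsections, which is consistent with only the 2MB-multiple sizes succeeding; treat the 2MB rounding as an observed constraint rather than a documented contract of the API.

#include <linux/align.h>
#include <linux/pci.h>
#include <linux/pci-p2pdma.h>
#include <linux/sizes.h>

/*
 * Hypothetical provider setup: expose part of a 32MB BAR as p2pmem.
 * The 2MB rounding is an assumption based on the "Misaligned __add_pages"
 * warning in the comment above (ZONE_DEVICE pages are hot-added in 2MB
 * subsections on x86-64); it is not documented API behavior.
 */
static int example_setup_p2pmem(struct pci_dev *pdev)
{
	const int bar = 4;			/* assumed BAR index */
	size_t want = 29 * SZ_1M;		/* requested carve-out */
	size_t size = ALIGN_DOWN(want, SZ_2M);	/* 28MB, subsection-aligned */
	int rc;

	/* offset 0 = start of the BAR; a size of 0 would mean the whole BAR */
	rc = pci_p2pdma_add_resource(pdev, bar, size, 0);
	if (rc)
		return rc;

	/* let other drivers discover this memory via pci_p2pmem_find() */
	pci_p2pmem_publish(pdev, true);
	return 0;
}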
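
To illustrate the topology discussion above (the "behind the same bridge" and root-port comments), here is a client-side sketch of asking the P2PDMA core for usable peer memory. The device pointers and the wrapper function are hypothetical; pci_p2pmem_find_many(), pci_alloc_p2pmem(), and pci_dev_put() are the helpers the framework and PCI core provide. The point is that the client does not implement the switch check itself: the core walks the PCI hierarchy and simply returns no provider when the clients only meet at root ports.

#include <linux/kernel.h>
#include <linux/pci.h>
#include <linux/pci-p2pdma.h>

/*
 * Hypothetical client: find a p2pmem provider usable by two DMA-capable
 * devices and allocate a buffer from it.  If no provider sits behind a
 * common switch with both clients, the lookup fails and the caller is
 * expected to fall back to bouncing through system memory.
 */
static void *example_alloc_p2p_buffer(struct device *dma_dev_a,
				      struct device *dma_dev_b,
				      size_t len, struct pci_dev **provider)
{
	struct device *clients[] = { dma_dev_a, dma_dev_b };
	struct pci_dev *p2p_dev;
	void *buf;

	p2p_dev = pci_p2pmem_find_many(clients, ARRAY_SIZE(clients));
	if (!p2p_dev)
		return NULL;		/* no usable p2pmem for this pair */

	buf = pci_alloc_p2pmem(p2p_dev, len);
	if (!buf) {
		pci_dev_put(p2p_dev);	/* the find took a device reference */
		return NULL;
	}

	*provider = p2p_dev;
	return buf;
}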