
iommu/riscv: Add hardware dirty tracking for second-stage domains

From:  fangyu.yu-AT-linux.alibaba.com
To:  joro-AT-8bytes.org, will-AT-kernel.org, robin.murphy-AT-arm.com, pjw-AT-kernel.org, palmer-AT-dabbelt.com, aou-AT-eecs.berkeley.edu, alex-AT-ghiti.fr, tjeznach-AT-rivosinc.com, jgg-AT-ziepe.ca, kevin.tian-AT-intel.com, baolu.lu-AT-linux.intel.com, vasant.hegde-AT-amd.com, anup-AT-brainfault.org, atish.patra-AT-linux.dev, skhawaja-AT-google.com, jgg-AT-nvidia.com
Subject:  [RFC PATCH 00/11] iommu/riscv: Add hardware dirty tracking for second-stage domains
Date:  Tue, 28 Apr 2026 21:13:48 +0800
Message-ID:  <20260428131359.34872-1-fangyu.yu@linux.alibaba.com>
Cc:  guoren-AT-kernel.org, kvm-AT-vger.kernel.org, iommu-AT-lists.linux.dev, kvm-riscv-AT-lists.infradead.org, linux-riscv-AT-lists.infradead.org, linux-kernel-AT-vger.kernel.org, Fangyu Yu <fangyu.yu-AT-linux.alibaba.com>

From: Fangyu Yu <fangyu.yu@linux.alibaba.com>

The RISC-V IOMMU architecture defines an AMO_HWAD capability (Hardware
Access/Dirty update) that allows the IOMMU to atomically set the A/D bits
in second-stage PTEs on DMA access.  When DC.tc.GADE is asserted, the IOMMU
autonomously sets D on the first write to a page mapped by an iohgatp
domain.  This series wires that capability up to the iommufd dirty-tracking
interface (IOMMU_HWPT_SET_DIRTY_TRACKING / IOMMU_HWPT_GET_DIRTY_BITMAP) and
reports IOMMU_CAP_DIRTY_TRACKING.
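
As a rough illustration of the hardware behaviour described above, the
following user-space model walks one DMA access through a second-stage
leaf PTE.  It is a sketch, not driver code: the sketch_* names are
hypothetical, only the V/W/A/D bit positions are taken from the RISC-V
privileged spec, and the choice to report a fault when GADE is clear and
an A/D update would be needed is an assumption about the IOMMU spec that
should be checked against it.  Read permission and the rest of the
translation process are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* RISC-V PTE flag bits (privileged spec layout). */
#define SKETCH_PTE_V (1ULL << 0)
#define SKETCH_PTE_W (1ULL << 2)
#define SKETCH_PTE_A (1ULL << 6)
#define SKETCH_PTE_D (1ULL << 7)

enum sketch_result { SKETCH_OK, SKETCH_FAULT };

/*
 * Hypothetical model of one DMA access through a second-stage leaf PTE.
 * 'gade' mirrors DC.tc.GADE.
 */
static inline enum sketch_result
sketch_dma_access(uint64_t *pte, bool is_write, bool gade)
{
	/* Every access needs A; a write additionally needs D. */
	uint64_t need = SKETCH_PTE_A | (is_write ? SKETCH_PTE_D : 0);

	if (!(*pte & SKETCH_PTE_V) || (is_write && !(*pte & SKETCH_PTE_W)))
		return SKETCH_FAULT;

	if ((*pte & need) == need)
		return SKETCH_OK;	/* bits already set, nothing to update */

	if (!gade)
		return SKETCH_FAULT;	/* assumption: no hardware update, so fault */

	*pte |= need;			/* real hardware does this atomically */
	return SKETCH_OK;
}
```

In this model only the first write to a clean page changes the PTE; later
writes to the same page leave it untouched, which is what lets software
harvest D bits as a dirty log.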

Design notes
------------

* The feature is scoped to second-stage (iohgatp) domains only; these are
  the domains created for KVM / VFIO device pass-through when userspace
  allocates an HWPT with IOMMU_HWPT_ALLOC_NEST_PARENT or
  IOMMU_HWPT_ALLOC_DIRTY_TRACKING.  First-stage (iosatp) domains are not
  touched by this series.

* The page-table side plugs into the existing generic_pt dirty hook
  framework (amdv1 / vtdss style).  RISC-V adds the three required PTE
  ops – is_write_dirty / make_write_clean / make_write_dirty.
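
To make the three PTE ops concrete, here is a minimal user-space sketch
of what they amount to on a RISC-V format PTE.  It assumes each op maps
directly onto the D bit (bit 7 in the privileged spec layout) and uses
C11 atomics because the IOMMU may set D concurrently.  The sketch_*
functions and their raw-pointer signatures are hypothetical; the
in-kernel hooks operate on generic_pt's own state structures, which are
not reproduced here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PTE_D (1ULL << 7)	/* D bit per RISC-V privileged spec */

/* Report whether the page behind this PTE has been written. */
static inline bool sketch_is_write_dirty(_Atomic uint64_t *pte)
{
	return (atomic_load(pte) & SKETCH_PTE_D) != 0;
}

/* Clear D so that the next DMA write is recorded again. */
static inline void sketch_make_write_clean(_Atomic uint64_t *pte)
{
	atomic_fetch_and(pte, ~SKETCH_PTE_D);
}

/* Set D from software, leaving every other PTE bit untouched. */
static inline void sketch_make_write_dirty(_Atomic uint64_t *pte)
{
	atomic_fetch_or(pte, SKETCH_PTE_D);
}
```

The read-modify-write forms matter here: a plain load, mask and store
could overwrite a D (or A) update that the hardware made in between.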

Testing
-------

* Tested on QEMU RISC-V: a virtio-net and an e1000e device were passed
  through to an L2 guest via vfio-pci + iommufd.

* generic_pt KUnit: the existing test_dirty case now runs and passes for
  the RISC-V 64-bit format.

Follow-up work
--------------

* Build a dedicated end-to-end test case that drives the full flow
  (HWPT_ALLOC with DIRTY_TRACKING -> attach -> IOAS_MAP -> generate real
  DMA -> SET_DIRTY_TRACKING -> GET_DIRTY_BITMAP -> verify bitmap against
  expected IOVA footprint) so that the behaviour can be regression-tested
  beyond the KUnit PTE-level coverage.

* If possible, rebase and retest on top of the updated "iommu irqbypass"
  patchset.
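
The "verify bitmap against expected IOVA footprint" step of the first
item could be sketched roughly as below.  This is a stand-alone helper
pair with hypothetical names, not part of the series.  It assumes one bit
per page, with page i stored in bit (i % 64) of 64-bit word (i / 64),
counted from the base IOVA of the queried range; that layout should be
confirmed against the iommufd uAPI before relying on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Mark every page touched by a DMA write of 'len' bytes at 'iova' in the
 * expected bitmap.  'base_iova' is the IOVA that bit 0 stands for.
 */
static inline void sketch_expect_dirty(uint64_t *bitmap, uint64_t base_iova,
				       uint64_t page_size, uint64_t iova,
				       uint64_t len)
{
	uint64_t first, last, i;

	if (len == 0)
		return;

	first = (iova - base_iova) / page_size;
	last = (iova + len - 1 - base_iova) / page_size;
	for (i = first; i <= last; i++)
		bitmap[i / 64] |= 1ULL << (i % 64);
}

/* Compare the expected footprint with the bitmap reported by the kernel. */
static inline bool sketch_bitmaps_match(const uint64_t *expected,
					const uint64_t *reported,
					size_t nwords)
{
	return memcmp(expected, reported, nwords * sizeof(*expected)) == 0;
}
```

For example, a 200-byte write starting 4000 bytes into a 4 KiB page
straddles two pages, so the expected bitmap has bits 0 and 1 set; a
reported bitmap with any other bit set, or either of those missing,
would fail the comparison.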


Fangyu Yu (6):
  iommupt: Add RISC-V Second-stage (iohgatp) page table support
  iommu/riscv: Add domain_alloc_paging_flags for second-stage domain
  iommupt: Don't preset D when RISC-V IOMMU dirty tracking on
  iommu/riscv: Add dirty tracking support for second-stage domains
  iommu/riscv: Add IOTINVAL.GVMA after updating DDT/PDT entries
  iommupt: Add RISC-V dirty tracking PTE ops

Tomasz Jeznach (2):
  iommu/riscv: report iommu capabilities
  RISC-V: KVM: Enable KVM_VFIO interfaces on RISC-V arch

Zong Li (3):
  iommu/riscv: use data structure instead of individual values
  iommu/riscv: support GSCID and GVMA invalidation command
  iommu/riscv: support nested iommu for getting iommu hardware
    information

 arch/riscv/kvm/Kconfig               |   2 +
 drivers/iommu/generic_pt/fmt/riscv.h | 120 ++++++++++++-
 drivers/iommu/riscv/iommu-bits.h     |   7 +
 drivers/iommu/riscv/iommu.c          | 247 +++++++++++++++++++++++----
 include/linux/generic_pt/common.h    |  13 ++
 include/linux/generic_pt/iommu.h     |  17 +-
 include/uapi/linux/iommufd.h         |  18 ++
 7 files changed, 383 insertions(+), 41 deletions(-)

-- 
2.50.1



