Add support for Sub-NUMA cluster (SNC) systems
From: Tony Luck <tony.luck-AT-intel.com>
To: Fenghua Yu <fenghua.yu-AT-intel.com>, Reinette Chatre <reinette.chatre-AT-intel.com>, Maciej Wieczor-Retman <maciej.wieczor-retman-AT-intel.com>, Peter Newman <peternewman-AT-google.com>, James Morse <james.morse-AT-arm.com>, Babu Moger <babu.moger-AT-amd.com>, Drew Fustini <dfustini-AT-baylibre.com>, Dave Martin <Dave.Martin-AT-arm.com>
Subject: [PATCH v17 0/9] Add support for Sub-NUMA cluster (SNC) systems
Date: Fri, 03 May 2024 13:33:16 -0700
Message-ID: <20240503203325.21512-1-tony.luck@intel.com>
Cc: x86-AT-kernel.org, linux-kernel-AT-vger.kernel.org, patches-AT-lists.linux.dev, Tony Luck <tony.luck-AT-intel.com>

Note: Jump straight to patch 7 for the new stuff. Just minor tweaks in
other patches.

This series is based on top of TIP x86/cache branch:
  931be446c6cb ("x86/resctrl: Add tracepoint for llc_occupancy tracking")

The Sub-NUMA cluster feature on some Intel processors partitions the CPUs
that share an L3 cache into two or more sets. This plays havoc with the
Resource Director Technology (RDT) monitoring features. Prior to this
patch Intel has advised that SNC and RDT are incompatible.

Some of these CPUs support an MSR that can partition the RMID counters in
the same way. This allows monitoring features to be used. Legacy
monitoring files provide the sum of counters from each SNC node for
backwards compatibility. Additional files per SNC node provide details
per node.

Cache and memory bandwidth allocation features continue to operate at
the scope of the L3 cache.

Signed-off-by: Tony Luck <tony.luck@intel.com>

---
Changes since v16: https://lore.kernel.org/all/20240312214247.91772-1-tony.l...

Patch 1: Reinette pointed out that rdt_find_domain() no longer returns
         ERR_PTR() but one of the callers was still checking return
         with IS_ERR().

Patch 2: Tip tree added a tracing patch. That needed s/d->id/d->hdr.id/

Patch 3: Reinette: Keep the "RCU" in the kerneldoc description of
         the domain list fields after the split into separate
         ctrl/mon lists.

Patch 4: No change

Patch 5: No change

Patch 6: Drop the change that divided output of the resctrl "size"
         file by the number of SNC domains per L3 cache. Now that
         this series preserves the contents of the legacy llc_occupancy
         files this isn't useful.

Patch 7: NEW in this series. Add per-SNC domain monitor files while
         making the original files sum across SNC nodes.

Patch 8: (formerly 7) No change

Patch 9: (formerly 8) Add documentation for new per-SNC directories
         and files

Tony Luck (9):
  x86/resctrl: Prepare for new domain scope
  x86/resctrl: Prepare to split rdt_domain structure
  x86/resctrl: Prepare for different scope for control/monitor operations
  x86/resctrl: Split the rdt_domain and rdt_hw_domain structures
  x86/resctrl: Add node-scope to the options for feature scope
  x86/resctrl: Introduce snc_nodes_per_l3_cache
  x86/resctrl: Add new monitor files for Sub-NUMA cluster (SNC) monitoring
  x86/resctrl: Sub NUMA Cluster detection and enable
  x86/resctrl: Update documentation with Sub-NUMA cluster changes

 Documentation/arch/x86/resctrl.rst        |  17 +
 include/linux/resctrl.h                   |  89 +++--
 arch/x86/include/asm/msr-index.h          |   1 +
 arch/x86/kernel/cpu/resctrl/internal.h    |  72 ++--
 arch/x86/kernel/cpu/resctrl/core.c        | 430 ++++++++++++++++++----
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  57 +--
 arch/x86/kernel/cpu/resctrl/monitor.c     |  98 +++--
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  26 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 263 ++++++++-----
 9 files changed, 759 insertions(+), 294 deletions(-)

base-commit: 931be446c6cbc15691dd499957e961f4e1d56afb
-- 
2.44.0
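
The cover letter states that the legacy monitoring files report the sum of the counters from each SNC node, while the new per-node files report each node separately. A minimal sketch of that relationship follows. It is not code from this series: `struct snc_node_count` and `l3_legacy_total()` are hypothetical names chosen for illustration, and the real patches read hardware RMID counters rather than an array of values.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical container for one SNC node's counter value.
 * This is an illustration only, not a structure from the resctrl code.
 */
struct snc_node_count {
	int node_id;      /* SNC node number within the L3 cache domain */
	uint64_t value;   /* counter value that a per-node file would report */
};

/*
 * Value a legacy (L3-scope) monitoring file would report under the
 * scheme described in the cover letter: the sum over all SNC nodes
 * that share the L3 cache.
 */
static uint64_t l3_legacy_total(const struct snc_node_count *nodes, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += nodes[i].value;

	return sum;
}
```

With two SNC nodes per L3 cache, each per-node file would show one entry of the array and the legacy file would show the result of `l3_legacy_total()`, so tools that only read the legacy files keep seeing L3-wide totals.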