| From: |
| Dapeng Mi <dapeng1.mi-AT-linux.intel.com> |
| To: |
| Peter Zijlstra <peterz-AT-infradead.org>, Ingo Molnar <mingo-AT-redhat.com>, Arnaldo Carvalho de Melo <acme-AT-kernel.org>, Namhyung Kim <namhyung-AT-kernel.org>, Ian Rogers <irogers-AT-google.com>, Adrian Hunter <adrian.hunter-AT-intel.com>, Alexander Shishkin <alexander.shishkin-AT-linux.intel.com>, Andi Kleen <ak-AT-linux.intel.com>, Eranian Stephane <eranian-AT-google.com> |
| Subject: |
| [Patch v8 00/12] arch-PEBS enabling for Intel platforms |
| Date: |
| Wed, 15 Oct 2025 14:44:10 +0800 |
| Message-ID: |
| <20251015064422.47437-1-dapeng1.mi@linux.intel.com> |
| Cc: |
| linux-kernel-AT-vger.kernel.org, linux-perf-users-AT-vger.kernel.org, Dapeng Mi <dapeng1.mi-AT-intel.com>, Dapeng Mi <dapeng1.mi-AT-linux.intel.com> |
Changes:
v7 -> v8:
* Fix the warning reported by the kernel test robot (Patch 02/12)
* Rebase code to 6.18-rc1.
v6 -> v7:
* Rebase code to the latest tip perf/core tree.
* Opportunistically remove the redundant is_x86_event() prototype.
(Patch 01/12)
* Fix the NULL event access and record loss issues in the PEBS handler.
(Patch 02/12)
* Reset MSR_IA32_PEBS_INDEX at the head of _drain_arch_pebs() instead
of at the end. This prevents already-processed PEBS records from being
processed again in some corner cases such as event throttling.
(Patch 08/12)
v5 -> v6:
* Rebase code to the latest tip perf/core tree + "x86 perf bug fixes and
optimization" patchset
v4 -> v5:
* Rebase code to 6.16-rc3
* Allocate/free the arch-PEBS buffer in the *prepare_cpu/*dead_cpu
callbacks (patch 07/10, Peter)
* Refine code and comments (patch 09/10, Peter)
This patch set introduces architectural PEBS (arch-PEBS) support for Intel
platforms such as Clearwater Forest (CWF) and Panther Lake (PTL). Detailed
information about arch-PEBS can be found in chapter 11,
"architectural PEBS", of "Intel Architecture Instruction Set Extensions
and Future Features".

This patch set does not include SSP and SIMD regs (OPMASK/YMM/ZMM)
sampling support for arch-PEBS, to avoid a dependency on the basic
SIMD regs sampling support patch series [1]. Once basic SIMD regs
sampling is supported, arch-PEBS based SSP and SIMD regs
(OPMASK/YMM/ZMM) sampling will be added in a later patch set.
Tests:
The tests below were run on Clearwater Forest and Panther Lake; no issues
were found.
1. Basic perf counting case.
   perf stat -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles}' sleep 1
2. Basic PMI based perf sampling case.
   perf record -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles}' sleep 1
3. Basic PEBS based perf sampling case.
   perf record -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles}:p' sleep 1
4. PEBS sampling case with basic, GPRs, vector-registers and LBR groups
   perf record -e branches:p -Iax,bx,ip,xmm0 -b -c 10000 sleep 1
5. User space PEBS sampling case with basic, GPRs and LBR groups
   perf record -e branches:p --user-regs=ax,bx,ip -b -c 10000 sleep 1
6. PEBS sampling case with auxiliary (memory info) group
   perf mem record sleep 1
7. PEBS sampling case with counter group
   perf record -e '{branches:p,branches,cycles}:S' -c 10000 sleep 1
8. Perf stat and record test
   perf test 100; perf test 131
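The test sequence above could be replayed with a small wrapper script. The following is only a minimal sketch and is not part of the patch set: the names event_group, run and smoke_tests and the RUN variable are hypothetical, while the perf invocations are copied from the list above. By default it just prints the commands, since actually running them needs perf and PEBS-capable hardware.

```shell
#!/bin/sh
# Sketch: replay the smoke tests listed in the cover letter.
# Dry-run by default; set RUN=1 to really execute the commands.

# Build the event group used by tests 1-3: N copies of "branches"
# followed by cycles, instructions and ref-cycles.
event_group() {
    n=$1
    list=""
    i=0
    while [ "$i" -lt "$n" ]; do
        list="${list}branches,"
        i=$((i + 1))
    done
    printf '{%scycles,instructions,ref-cycles}' "$list"
}

# Print a command; execute it only when RUN=1.
run() {
    printf '+ %s\n' "$*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

smoke_tests() {
    group=$(event_group 8)
    run perf stat -e "$group" sleep 1                      # test 1
    run perf record -e "$group" sleep 1                    # test 2
    run perf record -e "${group}:p" sleep 1                # test 3
    run perf record -e branches:p -Iax,bx,ip,xmm0 -b -c 10000 sleep 1        # test 4
    run perf record -e branches:p --user-regs=ax,bx,ip -b -c 10000 sleep 1   # test 5
    run perf mem record sleep 1                            # test 6
    run perf record -e '{branches:p,branches,cycles}:S' -c 10000 sleep 1     # test 7
    run perf test 100                                      # test 8
    run perf test 131
}

smoke_tests
```

Saved under a name of your choosing, running it with plain sh lists the nine commands, and running it with RUN=1 in the environment executes them in order.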
History:
v7: https://lore.kernel.org/all/20250828013435.1528459-1-dape...
v6: https://lore.kernel.org/all/20250821035805.159494-1-dapen...
v5: https://lore.kernel.org/all/20250623223546.112465-1-dapen...
v4: https://lore.kernel.org/all/20250620103909.1586595-1-dape...
v3: https://lore.kernel.org/all/20250415114428.341182-1-dapen...
v2: https://lore.kernel.org/all/20250218152818.158614-1-dapen...
v1: https://lore.kernel.org/all/20250123140721.2496639-1-dape...
Ref:
[1]: https://lore.kernel.org/all/20250925061213.178796-1-dapen...
Dapeng Mi (12):
perf/x86: Remove redundant is_x86_event() prototype
perf/x86/intel: Fix NULL event access and potential PEBS record loss
perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call
perf/x86/intel: Correct large PEBS flag check
perf/x86/intel: Initialize architectural PEBS
perf/x86/intel/ds: Factor out PEBS record processing code to functions
perf/x86/intel/ds: Factor out PEBS group processing code to functions
perf/x86/intel: Process arch-PEBS records or record fragments
perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR
perf/x86/intel: Update dyn_constraint based on PEBS event precise level
perf/x86/intel: Setup PEBS data configuration and enable legacy groups
perf/x86/intel: Add counter group support for arch-PEBS
 arch/x86/events/core.c            |  21 +-
 arch/x86/events/intel/core.c      | 268 ++++++++++++-
 arch/x86/events/intel/ds.c        | 632 ++++++++++++++++++++++++------
 arch/x86/events/perf_event.h      |  41 +-
 arch/x86/include/asm/intel_ds.h   |  10 +-
 arch/x86/include/asm/msr-index.h  |  20 +
 arch/x86/include/asm/perf_event.h | 116 +++++-
 7 files changed, 963 insertions(+), 145 deletions(-)
base-commit: 3a8660878839faadb4f1a6dd72c3179c1df56787
--
2.34.1