
mm/folio_zero_user: add multi-page clearing

From:  Ingo Molnar <mingo-AT-kernel.org>
To:  Ankur Arora <ankur.a.arora-AT-oracle.com>
Subject:  Re: [PATCH v3 0/4] mm/folio_zero_user: add multi-page clearing
Date:  Mon, 14 Apr 2025 08:36:18 +0200
Message-ID:  <Z_ys4jJ8MQ4-kW8P@gmail.com>
Cc:  linux-kernel-AT-vger.kernel.org, linux-mm-AT-kvack.org, x86-AT-kernel.org, torvalds-AT-linux-foundation.org, akpm-AT-linux-foundation.org, bp-AT-alien8.de, dave.hansen-AT-linux.intel.com, hpa-AT-zytor.com, mingo-AT-redhat.com, luto-AT-kernel.org, peterz-AT-infradead.org, paulmck-AT-kernel.org, rostedt-AT-goodmis.org, tglx-AT-linutronix.de, willy-AT-infradead.org, jon.grimm-AT-amd.com, bharata-AT-amd.com, raghavendra.kt-AT-amd.com, boris.ostrovsky-AT-oracle.com, konrad.wilk-AT-oracle.com


* Ankur Arora <ankur.a.arora@oracle.com> wrote:

> We also see a performance improvement for cases where this optimization
> is unavailable (pg-sz=2MB on AMD, and pg-sz=2MB|1GB on Intel) because
> REP; STOS is typically microcoded, a cost that can now be amortized
> over larger regions, and the hint allows the hardware prefetcher to do
> a better job.
> 
> Milan (EPYC 7J13, boost=0, preempt=full|lazy):
> 
>                  mm/folio_zero_user    x86/folio_zero_user     change
>                   (GB/s  +- stddev)      (GB/s  +- stddev)
> 
>   pg-sz=1GB       16.51  +- 0.54%        42.80  +-  3.48%    + 159.2%
>   pg-sz=2MB       11.89  +- 0.78%        16.12  +-  0.12%    +  35.5%
> 
> Icelakex (Platinum 8358, no_turbo=1, preempt=full|lazy):
> 
>                  mm/folio_zero_user    x86/folio_zero_user     change
>                   (GB/s +- stddev)      (GB/s +- stddev)
> 
>   pg-sz=1GB       8.01  +- 0.24%        11.26 +- 0.48%       + 40.57%
>   pg-sz=2MB       7.95  +- 0.30%        10.90 +- 0.26%       + 37.10%
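
For readers checking the "change" column in the quoted tables: it appears to be the relative gain of the x86/folio_zero_user throughput over the mm/folio_zero_user baseline. The following is only a minimal sketch of that arithmetic, using the rounded GB/s figures quoted above; the helper name relative_change() and the dictionary layout are illustrative assumptions, not part of the patch series or of the original measurement.

```python
def relative_change(baseline_gbps, patched_gbps):
    """Percentage change of the patched throughput relative to the baseline."""
    return (patched_gbps / baseline_gbps - 1.0) * 100.0


# (platform, page size) -> (mm/folio_zero_user GB/s, x86/folio_zero_user GB/s)
# Values are copied from the tables quoted above; the layout is hypothetical.
results = {
    ("Milan", "1GB"): (16.51, 42.80),
    ("Milan", "2MB"): (11.89, 16.12),
    ("Icelakex", "1GB"): (8.01, 11.26),
    ("Icelakex", "2MB"): (7.95, 10.90),
}

for (platform, pg_sz), (base, patched) in results.items():
    change = relative_change(base, patched)
    print(f"{platform:9s} pg-sz={pg_sz}  {change:+7.2f}%")
```

Recomputing from the rounded figures can differ from the posted column in the last digit, presumably because the posted values were derived from unrounded measurements.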

How was this measured? Could you integrate this measurement as a new 
tools/perf/bench/ subcommand so that people can try it on different 
systems, etc.? There's already a 'perf bench mem' subcommand space 
where this feature could be added.

Thanks,

	Ingo


