From: Zhenyu Ye <yezhenyu2-AT-huawei.com>
To: <catalin.marinas-AT-arm.com>, <will-AT-kernel.org>, <suzuki.poulose-AT-arm.com>, <maz-AT-kernel.org>, <steven.price-AT-arm.com>, <guohanjun-AT-huawei.com>, <olof-AT-lixom.net>
Subject: [PATCH v1 0/2] arm64: tlb: add support for TLBI RANGE instructions
Date: Thu, 9 Jul 2020 17:10:52 +0800
Message-ID: <20200709091054.1698-1-yezhenyu2@huawei.com>
Cc: <yezhenyu2-AT-huawei.com>, <linux-arm-kernel-AT-lists.infradead.org>, <linux-kernel-AT-vger.kernel.org>, <linux-arch-AT-vger.kernel.org>, <linux-mm-AT-kvack.org>, <arm-AT-kernel.org>, <xiexiangyou-AT-huawei.com>, <prime.zeng-AT-hisilicon.com>, <zhangshaokun-AT-hisilicon.com>, <kuhn.chenqun-AT-huawei.com>
NOTICE: this series is based on the arm64 for-next/tlbi branch:
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/tlbi
--
ARMv8.4-TLBI provides TLB invalidation instructions that apply to a
range of input addresses. This series adds support for this feature.
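As a rough illustration of what "a range of input addresses" means here: as
far as I understand the Arm architecture, a range TLBI operation describes its
span with two fields, NUM (5 bits) and SCALE (2 bits), and covers
(NUM + 1) * 2^(5*SCALE + 1) pages starting at a base address. The sketch below
is not code from this series, and the helper name range_pages() is
hypothetical; it only shows the arithmetic.

```c
/*
 * Illustrative only: number of pages covered by one range TLBI
 * operation, assuming the (NUM + 1) * 2^(5*SCALE + 1) encoding.
 * num is expected in 0..31 and scale in 0..3.
 */
static unsigned long range_pages(unsigned int num, unsigned int scale)
{
	return (unsigned long)(num + 1) << (5 * scale + 1);
}
```

Under that assumption, SCALE = 0 covers between 2 and 64 pages per
instruction, and SCALE = 3 with NUM = 31 covers 2097152 pages.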
I tested this feature on an FPGA machine whose CPUs support the TLBI range
instructions. As the page num increases, performance improves significantly.
When page num = 256, performance improves by more than 10 times
(145514 vs. 11089).
Below is the test data with stride = PTE:
[page num]  [classic]  [tlbi range]
         1      16051         13524
         2      11366         11146
         3      11582         12171
         4      11694         11101
         5      12138         12267
         6      12290         11105
         7      12400         12002
         8      12837         11097
         9      14791         12140
        10      15461         11087
        16      18233         11094
        32      26983         11079
        64      43840         11092
       128      77754         11098
       256     145514         11089
       512     280932         11111
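To make the ratios in the table easier to check, here is a small sketch that
computes classic / tlbi range for a few of the rows above. It is not part of
the series; the struct and function names are hypothetical, and it assumes
that lower values in both columns are better.

```c
/* One row of the test data above (units as reported in the table). */
struct tlbi_sample {
	unsigned int pages;     /* [page num] */
	unsigned long classic;  /* [classic] */
	unsigned long range;    /* [tlbi range] */
};

/* A subset of the rows from the table, copied verbatim. */
static const struct tlbi_sample samples[] = {
	{   1,  16051, 13524 },
	{  16,  18233, 11094 },
	{  64,  43840, 11092 },
	{ 256, 145514, 11089 },
	{ 512, 280932, 11111 },
};

/* Ratio of the classic result to the TLBI range result. */
static double speedup(const struct tlbi_sample *s)
{
	return (double)s->classic / (double)s->range;
}
```

For the 256-page row this gives a ratio of roughly 13, and for the 512-page
row roughly 25, while the single-page row stays close to 1.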
See more details in:
https://lore.kernel.org/linux-arm-kernel/504c7588-97e5-e0...
--
RFC patches:
- Link: https://lore.kernel.org/linux-arm-kernel/20200708124031.1...
Zhenyu Ye (2):
  arm64: tlb: Detect the ARMv8.4 TLBI RANGE feature
  arm64: tlb: Use the TLBI RANGE feature in arm64

 arch/arm64/include/asm/cpucaps.h  |   3 +-
 arch/arm64/include/asm/sysreg.h   |   3 +
 arch/arm64/include/asm/tlbflush.h | 156 ++++++++++++++++++++++++------
 arch/arm64/kernel/cpufeature.c    |  10 ++
 4 files changed, 141 insertions(+), 31 deletions(-)
--
2.19.1