summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2023-04-17 12:13:35 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2023-04-18 17:05:28 -0700
commite046fe5a36a970bc14fbfbcb2074a48776f6b671 (patch)
tree08aeaa04400e12c301601669f31e4f397623f4b1
parent427fda2c8a4977d9dbd9bc108bbe6e21ec84648d (diff)
downloadlinux-e046fe5a36a970bc14fbfbcb2074a48776f6b671.tar.gz
linux-e046fe5a36a970bc14fbfbcb2074a48776f6b671.tar.bz2
linux-e046fe5a36a970bc14fbfbcb2074a48776f6b671.zip
x86: set FSRS automatically on AMD CPUs that have FSRM
So Intel introduced the FSRS ("Fast Short REP STOS") CPU capability bit, because they seem to have done the (much simpler) REP STOS optimizations separately and later than the REP MOVS one. In contrast, when AMD introduced support for FSRM ("Fast Short REP MOVS"), in the Zen 3 core, it appears to have improved the REP STOS case at the same time, and since the FSRS bit was added by Intel later, it doesn't show up on those AMD Zen 3 cores. And now that we made use of FSRS for the "rep stos" conditional, that made those AMD machines unnecessarily slower. The Intel situation where "rep movs" is fast, but "rep stos" isn't, is just odd. The 'stos' case is a lot simpler with no aliasing, no mutual alignment issues, no complicated cases. So this just sets FSRS automatically when FSRM is available on AMD machines, to get back all the nice REP STOS goodness in Zen 3. Reported-and-tested-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-rw-r--r--arch/x86/kernel/cpu/amd.c4
1 files changed, 4 insertions, 0 deletions
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 95cdd08c4cbb..1547781e505b 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -929,6 +929,10 @@ static void init_amd(struct cpuinfo_x86 *c)
if (c->x86 >= 0x10)
set_cpu_cap(c, X86_FEATURE_REP_GOOD);
+ /* AMD FSRM also implies FSRS */
+ if (cpu_has(c, X86_FEATURE_FSRM))
+ set_cpu_cap(c, X86_FEATURE_FSRS);
+
/* get apicid instead of initial apic id from cpuid */
c->apicid = hard_smp_processor_id();