author Like Xu <likexu@tencent.com> 2023-12-06 11:20:54 +0800
committer Sean Christopherson <seanjc@google.com> 2024-02-26 15:57:22 -0800
commit 812d432373f629eb8d6cb696ea6804fca1534efa
tree cc5bcadaf774573e78a166c1d4bcfa4a0e49615f /arch/x86/kvm/x86.h
parent 4a447b135e45b49101417d54079df25b520299d9
KVM: x86/pmu: Explicitly check NMI from guest to reduce false positives

Explicitly check that the source of the external interrupt is indeed an NMI
in kvm_arch_pmi_in_guest(), which reduces perf-kvm false positive samples
(host samples labelled as guest samples) generated by perf/core NMI mode
if an NMI arrives after VM-Exit, but before kvm_after_interrupt():

# test: perf-record + cpu-cycles:HP (which collects host-only precise samples)
# Symbol                                   Overhead       sys       usr  guest sys  guest usr
# .......................................  ........  ........  ........  .........  .........
#
# Before:
  [g] entry_SYSCALL_64                       24.63%     0.00%     0.00%     24.63%      0.00%
  [g] syscall_return_via_sysret              23.23%     0.00%     0.00%     23.23%      0.00%
  [g] files_lookup_fd_raw                     6.35%     0.00%     0.00%      6.35%      0.00%
# After:
  [k] perf_adjust_freq_unthr_context         57.23%    57.23%     0.00%      0.00%      0.00%
  [k] __vmx_vcpu_run                          4.09%     4.09%     0.00%      0.00%      0.00%
  [k] vmx_update_host_rsp                     3.17%     3.17%     0.00%      0.00%      0.00%

In the above case, perf records the samples labelled '[g]'. The RIPs behind
these bogus samples are queried by perf_instruction_pointer() after
determining whether or not it's in GUEST state, and here's the issue:

If VM-Exit is caused by a non-NMI interrupt (such as hrtimer_interrupt) and
at least one PMU counter is enabled on host, kvm_arch_pmi_in_guest() will
remain true (KVM_HANDLING_IRQ is set) until kvm_after_interrupt().

During this window, if a PMI occurs on host (since the KVM instructions on
host are being executed), the control flow, with the help of the host NMI
context, will be transferred to perf/core to generate performance samples,
thus perf_instruction_pointer() and perf_guest_get_ip() are called.

Since kvm_arch_pmi_in_guest() only checks if there is an interrupt, it may
cause perf/core to mistakenly assume that the source RIP of the host NMI
belongs to the guest world and use perf_guest_get_ip() to get the RIP of
a vCPU that has already exited due to a non-NMI interrupt. Erroneous samples
are recorded and presented to the end-user via perf-report.

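The window described above can be modelled in user space. The following is a minimal sketch, not kernel code: struct model_vcpu and model_pmi_in_guest_old() are hypothetical names invented for illustration, and the predicate only mirrors the changelog's statement that the pre-fix check tests whether any guest interrupt is being handled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same values as the enum shown in the diff; they must be non-zero. */
enum kvm_intr_type {
	KVM_HANDLING_IRQ = 1,
	KVM_HANDLING_NMI,
};

/*
 * Hypothetical stand-in for the per-vCPU state.
 * 0 means "no guest interrupt is currently being handled".
 */
struct model_vcpu {
	int handling_intr_from_guest;
};

/*
 * Model of the pre-fix behaviour (an assumption based on the changelog):
 * any in-flight guest interrupt, IRQ or NMI, makes the check return true.
 */
static bool model_pmi_in_guest_old(const struct model_vcpu *vcpu)
{
	return vcpu != NULL && vcpu->handling_intr_from_guest != 0;
}
```

With handling_intr_from_guest set to KVM_HANDLING_IRQ, the model returns true even though a PMI arriving in that window interrupted host code, which is the false positive the changelog describes.
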
Such false positive samples could be eliminated by explicitly determining
if the exit reason is KVM_HANDLING_NMI.

Note that when VM-exit is indeed triggered by PMI and before HANDLING_NMI
is cleared, it's also still possible that another PMI is generated on host.
Also for perf/core timer mode, the false positives are still possible since
those non-NMI sources of interrupts are not always being used by perf/core.

For events that are host-only, perf/core can and should eliminate false
positives by checking event->attr.exclude_guest, i.e. events that are
configured to exclude KVM guests should never fire in the guest.

Events that are configured to count host and guest are trickier, perhaps
impossible to handle with 100% accuracy? And regardless of what accuracy
is provided by perf/core, improving KVM's accuracy is cheap and easy, with
no real downsides.

Fixes: dd60d217062f ("KVM: x86: Fix perf timer mode IP reporting")
Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20231206032054.55070-1-likexu@tencent.com
[sean: massage changelog, squash !!in_nmi() fixup from Like]
Signed-off-by: Sean Christopherson <seanjc@google.com>

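The diff on this page only removes the enum from x86.h, so the stricter check itself is not shown. As a hedged illustration, one plausible shape of it is sketched below in user-space C. The names struct model_vcpu and model_pmi_in_guest_new(), and the in_nmi parameter (standing in for the kernel's in_nmi()), are assumptions made for this sketch, guided only by the changelog's mention of KVM_HANDLING_NMI and the "!!in_nmi()" fixup.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same values as the enum shown in the diff; they must be non-zero. */
enum kvm_intr_type {
	KVM_HANDLING_IRQ = 1,
	KVM_HANDLING_NMI,
};

/* Hypothetical stand-in for the per-vCPU state; 0 means "not handling". */
struct model_vcpu {
	int handling_intr_from_guest;
};

/*
 * Sketch of a stricter check (an assumption, not the kernel macro):
 * report "in guest" only when the context perf is sampling from matches
 * the kind of interrupt that caused the VM-Exit.
 *   - sampling from NMI context:     guest only if handling a guest NMI
 *   - sampling from non-NMI context: guest only if handling a guest IRQ
 */
static bool model_pmi_in_guest_new(const struct model_vcpu *vcpu, bool in_nmi)
{
	if (vcpu == NULL || vcpu->handling_intr_from_guest == 0)
		return false;

	return in_nmi == (vcpu->handling_intr_from_guest == KVM_HANDLING_NMI);
}
```

Under this model, a host PMI delivered as an NMI while KVM_HANDLING_IRQ is set is no longer attributed to the guest, while a PMI that actually caused the VM-Exit (KVM_HANDLING_NMI) still is. As the changelog notes, a second PMI arriving before HANDLING_NMI is cleared can still be misattributed.
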
Diffstat (limited to 'arch/x86/kvm/x86.h')
-rw-r--r-- arch/x86/kvm/x86.h 6
1 files changed, 0 insertions, 6 deletions
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 2f7e19166658..4dc38092d599 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -431,12 +431,6 @@ static inline bool kvm_notify_vmexit_enabled(struct kvm *kvm)
 	return kvm->arch.notify_vmexit_flags & KVM_X86_NOTIFY_VMEXIT_ENABLED;
 }
 
-enum kvm_intr_type {
-	/* Values are arbitrary, but must be non-zero. */
-	KVM_HANDLING_IRQ = 1,
-	KVM_HANDLING_NMI,
-};
-
 static __always_inline void kvm_before_interrupt(struct kvm_vcpu *vcpu,
 						 enum kvm_intr_type intr)
 {