summaryrefslogtreecommitdiffstats
path: root/arch/arm64/kvm
Commit message (Collapse)AuthorAgeFilesLines
* KVM: arm64: Make ICC_*SGI*_EL1 undef in the absence of a vGICv3Marc Zyngier8 days2-0/+13
| | | | | | | | | | | | | | | | | | On a system with a GICv3, if a guest hasn't been configured with GICv3 and that the host is not capable of GICv2 emulation, a write to any of the ICC_*SGI*_EL1 registers is trapped to EL2. We therefore try to emulate the SGI access, only to hit a NULL pointer as no private interrupt is allocated (no GIC, remember?). The obvious fix is to give the guest what it deserves, in the shape of a UNDEF exception. Reported-by: Alexander Potapenko <glider@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240820100349.3544850-2-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* KVM: arm64: Ensure canonical IPA is hugepage-aligned when handling faultOliver Upton8 days1-1/+8
| | | | | | | | | | | | | | | | | | | | | | | Zenghui reports that VMs backed by hugetlb pages are no longer booting after commit fd276e71d1e7 ("KVM: arm64: nv: Handle shadow stage 2 page faults"). Support for shadow stage-2 MMUs introduced the concept of a fault IPA and canonical IPA to stage-2 fault handling. These are identical in the non-nested case, as the hardware stage-2 context is always that of the canonical IPA space. Both addresses need to be hugepage-aligned when preparing to install a hugepage mapping to ensure that KVM uses the correct GFN->PFN translation and installs that at the correct IPA for the current stage-2. And now I'm feeling thirsty after all this talk of IPAs... Fixes: fd276e71d1e7 ("KVM: arm64: nv: Handle shadow stage 2 page faults") Reported-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240822071710.2291690-1-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* KVM: arm64: vgic: Don't hold config_lock while unregistering redistributorsMarc Zyngier11 days2-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We recently moved the teardown of the vgic part of a vcpu inside a critical section guarded by the config_lock. This teardown phase involves calling into kvm_io_bus_unregister_dev(), which takes the kvm->srcu lock. However, this violates the established order where kvm->srcu is taken on a memory fault (such as an MMIO access), possibly followed by taking the config_lock if the GIC emulation requires mutual exclusion from the other vcpus. It therefore results in a bad lockdep splat, as reported by Zenghui. Fix this by moving the call to kvm_io_bus_unregister_dev() outside of the config_lock critical section. At this stage, there shouln't be any need to hold the config_lock. As an additional bonus, document the ordering between kvm->slots_lock, kvm->srcu and kvm->arch.config_lock so that I cannot pretend I didn't know about those anymore. Fixes: 9eb18136af9f ("KVM: arm64: vgic: Hold config_lock while tearing down a CPU interface") Reported-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Tested-by: Zenghui Yu <yuzenghui@huawei.com> Link: https://lore.kernel.org/r/20240819125045.3474845-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* KVM: arm64: vgic-debug: Don't put unmarked LPIsZenghui Yu11 days1-1/+1
| | | | | | | | | | | | | | | If there were LPIs being mapped behind our back (i.e., between .start() and .stop()), we would put them at iter_unmark_lpis() without checking if they were actually *marked*, which is obviously not good. Switch to use the xa_for_each_marked() iterator to fix it. Cc: stable@vger.kernel.org Fixes: 85d3ccc8b75b ("KVM: arm64: vgic-debug: Use an xarray mark for debug iterator") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240817101541.1664-1-yuzenghui@huawei.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* KVM: arm64: vgic: Hold config_lock while tearing down a CPU interfaceMarc Zyngier2024-08-081-2/+1
| | | | | | | | | | | | | | | Tearing down a vcpu CPU interface involves freeing the private interrupt array. If we don't hold the lock, we may race against another thread trying to configure it. Yeah, fuzzers do wonderful things... Taking the lock early solves this particular problem. Fixes: 03b3d00a70b5 ("KVM: arm64: vgic: Allocate private interrupts on demand") Reported-by: Alexander Potapenko <glider@google.com> Tested-by: Alexander Potapenko <glider@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240808091546.3262111-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* KVM: arm64: Tidying up PAuth code in KVMFuad Tabba2024-08-073-14/+6
| | | | | | | | | | Tidy up some of the PAuth trapping code to clear up some comments and avoid clang/checkpatch warnings. Also, don't bother setting PAuth HCR_EL2 bits in pKVM, since it's handled by the hypervisor. Signed-off-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20240722163311.1493879-1-tabba@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* KVM: arm64: vgic-debug: Exit the iterator properly w/o LPIZenghui Yu2024-08-071-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In case the guest doesn't have any LPI, we previously relied on the iterator setting 'intid = nr_spis + VGIC_NR_PRIVATE_IRQS' && 'lpi_idx = 1' to exit the iterator. But it was broken with commit 85d3ccc8b75b ("KVM: arm64: vgic-debug: Use an xarray mark for debug iterator") -- the intid remains at 'nr_spis + VGIC_NR_PRIVATE_IRQS - 1', and we end up endlessly printing the last SPI's state. Consider that it's meaningless to search the LPI xarray and populate lpi_idx when there is no LPI, let's just skip the process for that case. The result is that * If there's no LPI, we focus on the intid and exit the iterator when it runs out of the valid SPI range. * Otherwise we keep the current logic and let the xarray drive the iterator. Fixes: 85d3ccc8b75b ("KVM: arm64: vgic-debug: Use an xarray mark for debug iterator") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240807052024.2084-1-yuzenghui@huawei.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* KVM: arm64: Enforce dependency on an ARMv8.4-aware toolchainMarc Zyngier2024-08-071-0/+1
| | | | | | | | | | | | | | | | | | With the NV support of TLBI-range operations, KVM makes use of instructions that are only supported by binutils versions >= 2.30. This breaks the build for very old toolchains. Make KVM support conditional on having ARMv8.4 support in the assembler, side-stepping the issue. Fixes: 5d476ca57d7d ("KVM: arm64: nv: Add handling of range-based TLBI operations") Reported-by: Viresh Kumar <viresh.kumar@linaro.org> Suggested-by: Arnd Bergmann <arnd@linaro.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20240807115144.3237260-1-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* KVM: arm64: vgic: fix unexpected unlock sparse warningsSebastian Ott2024-08-022-2/+2
| | | | | | | | | | | | | Get rid of unexpected unlock sparse warnings in vgic code by adding an annotation to vgic_queue_irq_unlock(). arch/arm64/kvm/vgic/vgic.c:334:17: warning: context imbalance in 'vgic_queue_irq_unlock' - unexpected unlock arch/arm64/kvm/vgic/vgic.c:419:5: warning: context imbalance in 'kvm_vgic_inject_irq' - different lock contexts for basic block Signed-off-by: Sebastian Ott <sebott@redhat.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240723101204.7356-4-sebott@redhat.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* KVM: arm64: fix kdoc warnings in W=1 buildsSebastian Ott2024-08-024-11/+17
| | | | | | | | | | Fix kdoc warnings by adding missing function parameter descriptions or by conversion to a normal comment. Signed-off-by: Sebastian Ott <sebott@redhat.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240723101204.7356-3-sebott@redhat.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* KVM: arm64: fix override-init warnings in W=1 buildsSebastian Ott2024-08-023-0/+7
| | | | | | | | | | | | | | Add -Wno-override-init to the build flags for sys_regs.c, handle_exit.c, and switch.c to fix warnings like the following: arch/arm64/kvm/hyp/vhe/switch.c:271:43: warning: initialized field overwritten [-Woverride-init] 271 | [ESR_ELx_EC_CP15_32] = kvm_hyp_handle_cp15_32, | Signed-off-by: Sebastian Ott <sebott@redhat.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240723101204.7356-2-sebott@redhat.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* KVM: arm64: free kvm->arch.nested_mmus with kvfree()Danilo Krummrich2024-08-021-1/+1
| | | | | | | | | | | kvm->arch.nested_mmus is allocated with kvrealloc(), hence free it with kvfree() instead of kfree(). Fixes: 4f128f8e1aaa ("KVM: arm64: nv: Support multiple nested Stage-2 mmu structures") Signed-off-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240723142204.758796-1-dakr@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
* Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2024-07-2021-328/+2397
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull kvm updates from Paolo Bonzini: "ARM: - Initial infrastructure for shadow stage-2 MMUs, as part of nested virtualization enablement - Support for userspace changes to the guest CTR_EL0 value, enabling (in part) migration of VMs between heterogenous hardware - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of the protocol - FPSIMD/SVE support for nested, including merged trap configuration and exception routing - New command-line parameter to control the WFx trap behavior under KVM - Introduce kCFI hardening in the EL2 hypervisor - Fixes + cleanups for handling presence/absence of FEAT_TCRX - Miscellaneous fixes + documentation updates LoongArch: - Add paravirt steal time support - Add support for KVM_DIRTY_LOG_INITIALLY_SET - Add perf kvm-stat support for loongarch RISC-V: - Redirect AMO load/store access fault traps to guest - perf kvm stat support - Use guest files for IMSIC virtualization, when available s390: - Assortment of tiny fixes which are not time critical x86: - Fixes for Xen emulation - Add a global struct to consolidate tracking of host values, e.g. EFER - Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC bus frequency, because TDX - Print the name of the APICv/AVIC inhibits in the relevant tracepoint - Clean up KVM's handling of vendor specific emulation to consistently act on "compatible with Intel/AMD", versus checking for a specific vendor - Drop MTRR virtualization, and instead always honor guest PAT on CPUs that support self-snoop - Update to the newfangled Intel CPU FMS infrastructure - Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it reads '0' and writes from userspace are ignored - Misc cleanups x86 - MMU: - Small cleanups, renames and refactoring extracted from the upcoming Intel TDX support - Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that can't hold leafs SPTEs - Unconditionally drop mmu_lock when allocating TDP MMU page tables for eager page splitting, to avoid stalling vCPUs when splitting huge pages - Bug the VM instead of simply warning if KVM tries to split a SPTE that is non-present or not-huge. KVM is guaranteed to end up in a broken state because the callers fully expect a valid SPTE, it's all but dangerous to let more MMU changes happen afterwards x86 - AMD: - Make per-CPU save_area allocations NUMA-aware - Force sev_es_host_save_area() to be inlined to avoid calling into an instrumentable function from noinstr code - Base support for running SEV-SNP guests. API-wise, this includes a new KVM_X86_SNP_VM type, encrypting/measure the initial image into guest memory, and finalizing it before launching it. Internally, there are some gmem/mmu hooks needed to prepare gmem-allocated pages before mapping them into guest private memory ranges This includes basic support for attestation guest requests, enough to say that KVM supports the GHCB 2.0 specification There is no support yet for loading into the firmware those signing keys to be used for attestation requests, and therefore no need yet for the host to provide certificate data for those keys. To support fetching certificate data from userspace, a new KVM exit type will be needed to handle fetching the certificate from userspace. An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS exit type to handle this was introduced in v1 of this patchset, but is still being discussed by community, so for now this patchset only implements a stub version of SNP Extended Guest Requests that does not provide certificate data x86 - Intel: - Remove an unnecessary EPT TLB flush when enabling hardware - Fix a series of bugs that cause KVM to fail to detect nested pending posted interrupts as valid wake eents for a vCPU executing HLT in L2 (with HLT-exiting disable by L1) - KVM: x86: Suppress MMIO that is triggered during task switch emulation Explicitly suppress userspace emulated MMIO exits that are triggered when emulating a task switch as KVM doesn't support userspace MMIO during complex (multi-step) emulation Silently ignoring the exit request can result in the WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace for some other reason prior to purging mmio_needed See commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write exits if emulator detects exception") for more details on KVM's limitations with respect to emulated MMIO during complex emulator flows Generic: - Rename the AS_UNMOVABLE flag that was introduced for KVM to AS_INACCESSIBLE, because the special casing needed by these pages is not due to just unmovability (and in fact they are only unmovable because the CPU cannot access them) - New ioctl to populate the KVM page tables in advance, which is useful to mitigate KVM page faults during guest boot or after live migration. The code will also be used by TDX, but (probably) not through the ioctl - Enable halt poll shrinking by default, as Intel found it to be a clear win - Setup empty IRQ routing when creating a VM to avoid having to synchronize SRCU when creating a split IRQCHIP on x86 - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag that arch code can use for hooking both sched_in() and sched_out() - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid truncating a bogus value from userspace, e.g. to help userspace detect bugs - Mark a vCPU as preempted if and only if it's scheduled out while in the KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest memory when retrieving guest state during live migration blackout Selftests: - Remove dead code in the memslot modification stress test - Treat "branch instructions retired" as supported on all AMD Family 17h+ CPUs - Print the guest pseudo-RNG seed only when it changes, to avoid spamming the log for tests that create lots of VMs - Make the PMU counters test less flaky when counting LLC cache misses by doing CLFLUSH{OPT} in every loop iteration" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits) crypto: ccp: Add the SNP_VLEK_LOAD command KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops KVM: x86: Replace static_call_cond() with static_call() KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event x86/sev: Move sev_guest.h into common SEV header KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event KVM: x86: Suppress MMIO that is triggered during task switch emulation KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory() KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault" KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory KVM: Document KVM_PRE_FAULT_MEMORY ioctl mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE perf kvm: Add kvm-stat for loongarch64 LoongArch: KVM: Add PV steal time support in guest side ...
| * Merge tag 'kvm-x86-generic-6.11' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini2024-07-161-1/+1
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KVM generic changes for 6.11 - Enable halt poll shrinking by default, as Intel found it to be a clear win. - Setup empty IRQ routing when creating a VM to avoid having to synchronize SRCU when creating a split IRQCHIP on x86. - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag that arch code can use for hooking both sched_in() and sched_out(). - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid truncating a bogus value from userspace, e.g. to help userspace detect bugs. - Mark a vCPU as preempted if and only if it's scheduled out while in the KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest memory when retrieving guest state during live migration blackout. - A few minor cleanups
| | * KVM: Introduce vcpu->wants_to_runDavid Matlack2024-06-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce vcpu->wants_to_run to indicate when a vCPU is in its core run loop, i.e. when the vCPU is running the KVM_RUN ioctl and immediate_exit was not set. Replace all references to vcpu->run->immediate_exit with !vcpu->wants_to_run to avoid TOCTOU races with userspace. For example, a malicious userspace could invoked KVM_RUN with immediate_exit=true and then after KVM reads it to set wants_to_run=false, flip it to false. This would result in the vCPU running in KVM_RUN with wants_to_run=false. This wouldn't cause any real bugs today but is a dangerous landmine. Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20240503181734.1467938-2-dmatlack@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
| * | Merge tag 'kvmarm-6.11' of ↵Paolo Bonzini2024-07-1621-327/+2396
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 changes for 6.11 - Initial infrastructure for shadow stage-2 MMUs, as part of nested virtualization enablement - Support for userspace changes to the guest CTR_EL0 value, enabling (in part) migration of VMs between heterogenous hardware - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of the protocol - FPSIMD/SVE support for nested, including merged trap configuration and exception routing - New command-line parameter to control the WFx trap behavior under KVM - Introduce kCFI hardening in the EL2 hypervisor - Fixes + cleanups for handling presence/absence of FEAT_TCRX - Miscellaneous fixes + documentation updates
| | * \ Merge branch kvm-arm64/nv-tcr2 into kvmarm/nextOliver Upton2024-07-143-11/+54
| | |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * kvm-arm64/nv-tcr2: : Fixes to the handling of TCR_EL1, courtesy of Marc Zyngier : : Series addresses a couple gaps that are present in KVM (from cover : letter): : : - VM configuration: HCRX_EL2.TCR2En is forced to 1, and we blindly : save/restore stuff. : : - trap bit description and routing: none, obviously, since we make a : point in not trapping. KVM: arm64: Honor trap routing for TCR2_EL1 KVM: arm64: Make PIR{,E0}_EL1 save/restore conditional on FEAT_TCRX KVM: arm64: Make TCR2_EL1 save/restore dependent on the VM features KVM: arm64: Get rid of HCRX_GUEST_FLAGS KVM: arm64: Correctly honor the presence of FEAT_TCRX Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | KVM: arm64: Honor trap routing for TCR2_EL1Marc Zyngier2024-06-271-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TCR2_EL1 handling is missing the handling of its trap configuration: - HCRX_EL2.TCR2En must be handled in conjunction with HCR_EL2.{TVM,TRVM} - HFG{R,W}TR_EL2.TCR_EL1 does apply to TCR2_EL1 as well Without these two controls being implemented, it is impossible to correctly route TCR2_EL1 traps. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240625130042.259175-7-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | KVM: arm64: Make PIR{,E0}_EL1 save/restore conditional on FEAT_TCRXMarc Zyngier2024-06-271-10/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As per the architecture, if FEAT_S1PIE is implemented, then FEAT_TCRX must be implemented as well. Take advantage of this to avoid checking for S1PIE when TCRX isn't implemented. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240625130042.259175-6-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | KVM: arm64: Make TCR2_EL1 save/restore dependent on the VM featuresMarc Zyngier2024-06-271-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As for other registers, save/restore of TCR2_EL1 should be gated on the feature being actually present. In the case of a nVHE hypervisor, it is perfectly fine to leave the host value in the register, as HCRX_EL2.TCREn==0 imposes that TCR2_EL1 is treated as 0. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240625130042.259175-4-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | KVM: arm64: Get rid of HCRX_GUEST_FLAGSMarc Zyngier2024-06-271-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | HCRX_GUEST_FLAGS gives random KVM hackers the impression that they can stuff bits in this macro and unconditionally enable features in the guest. In general, this is wrong (we have been there with FEAT_MOPS, and again with FEAT_TCRX). Document that HCRX_EL2.SMPME is an exception rather than the rule, and get rid of HCRX_GUEST_FLAGS. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20240625130042.259175-3-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | KVM: arm64: Correctly honor the presence of FEAT_TCRXMarc Zyngier2024-06-271-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently blindly enable TCR2_EL1 use in a guest, irrespective of the feature set. This is obviously wrong, and we should actually honor the guest configuration and handle the possible trap resulting from the guest being buggy. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20240625130042.259175-2-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | * | | Merge branch kvm-arm64/nv-sve into kvmarm/nextOliver Upton2024-07-148-41/+306
| | |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * kvm-arm64/nv-sve: : CPTR_EL2, FPSIMD/SVE support for nested : : This series brings support for honoring the guest hypervisor's CPTR_EL2 : trap configuration when running a nested guest, along with support for : FPSIMD/SVE usage at L1 and L2. KVM: arm64: Allow the use of SVE+NV KVM: arm64: nv: Add additional trap setup for CPTR_EL2 KVM: arm64: nv: Add trap description for CPTR_EL2 KVM: arm64: nv: Add TCPAC/TTA to CPTR->CPACR conversion helper KVM: arm64: nv: Honor guest hypervisor's FP/SVE traps in CPTR_EL2 KVM: arm64: nv: Load guest FP state for ZCR_EL2 trap KVM: arm64: nv: Handle CPACR_EL1 traps KVM: arm64: Spin off helper for programming CPTR traps KVM: arm64: nv: Ensure correct VL is loaded before saving SVE state KVM: arm64: nv: Use guest hypervisor's max VL when running nested guest KVM: arm64: nv: Save guest's ZCR_EL2 when in hyp context KVM: arm64: nv: Load guest hyp's ZCR into EL1 state KVM: arm64: nv: Handle ZCR_EL2 traps KVM: arm64: nv: Forward SVE traps to guest hypervisor KVM: arm64: nv: Forward FP/ASIMD traps to guest hypervisor Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: Allow the use of SVE+NVOliver Upton2024-06-202-7/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow SVE and NV to mix now that everything is in place to handle it correctly. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-16-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: nv: Add additional trap setup for CPTR_EL2Marc Zyngier2024-06-201-1/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to teach KVM a couple of new tricks. CPTR_EL2 and its VHE accessor CPACR_EL1 need to be handled specially: - CPACR_EL1 is trapped on VHE so that we can track the TCPAC and TTA bits - CPTR_EL2.{TCPAC,E0POE} are propagated from L1 to L2 Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-15-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: nv: Add trap description for CPTR_EL2Marc Zyngier2024-06-201-0/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add trap description for CPTR_EL2.{TCPAC,TAM,E0POE,TTA}. TTA is a bit annoying as it changes location depending on E2H. This forces us to add yet another "complex" trap condition. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-14-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: nv: Honor guest hypervisor's FP/SVE traps in CPTR_EL2Oliver Upton2024-06-201-0/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Start folding the guest hypervisor's FP/SVE traps into the value programmed in hardware. Note that as of writing this is dead code, since KVM does a full put() / load() for every nested exception boundary which saves + flushes the FP/SVE state. However, this will become useful when we can keep the guest's FP/SVE state alive across a nested exception boundary and the host no longer needs to conservatively program traps. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-12-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: nv: Load guest FP state for ZCR_EL2 trapOliver Upton2024-06-202-0/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Round out the ZCR_EL2 gymnastics by loading SVE state in the fast path when the guest hypervisor tries to access SVE state. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-11-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: nv: Handle CPACR_EL1 trapsMarc Zyngier2024-06-201-1/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Handle CPACR_EL1 accesses when running a VHE guest. In order to limit the cost of the emulation, implement it ass a shallow exit. In the other cases: - this is a nVHE L1 which will write to memory, and we don't trap - this is a L2 guest: * the L1 has CPTR_EL2.TCPAC==0, and the L2 has direct register access * the L1 has CPTR_EL2.TCPAC==1, and the L2 will trap, but the handling is defered to the general handling for forwarding Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-10-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: Spin off helper for programming CPTR trapsOliver Upton2024-06-201-24/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A subsequent change to KVM will add preliminary support for merging a guest hypervisor's CPTR traps with that of KVM. Prepare by spinning off a new helper for managing CPTR traps. Avoid reading CPACR_EL1 for the baseline trap config, and start off with the most restrictive set of traps that is subsequently relaxed. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-9-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: nv: Ensure correct VL is loaded before saving SVE stateOliver Upton2024-06-201-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possible that the guest hypervisor has selected a smaller VL than the maximum for its nested guest. As such, ZCR_EL2 may be configured for a different VL when exiting a nested guest. Set ZCR_EL2 (via the EL1 alias) to the maximum VL for the VM before saving SVE state as the SVE save area is dimensioned by the max VL. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-8-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: nv: Use guest hypervisor's max VL when running nested guestOliver Upton2024-06-201-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The max VL for nested guests is additionally constrained by the max VL selected by the guest hypervisor. Use that instead of KVM's max VL when running a nested guest. Note that the guest hypervisor's ZCR_EL2 is sanitised against the VM's max VL at the time of access, so there's no additional handling required at the time of use. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-7-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: nv: Save guest's ZCR_EL2 when in hyp contextOliver Upton2024-06-201-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When running a guest hypervisor, ZCR_EL2 is an alias for the counterpart EL1 state. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-6-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: nv: Load guest hyp's ZCR into EL1 stateOliver Upton2024-06-201-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Load the guest hypervisor's ZCR_EL2 into the corresponding EL1 register when restoring SVE state, as ZCR_EL2 affects the VL in the hypervisor context. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-5-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: nv: Handle ZCR_EL2 trapsOliver Upton2024-06-201-0/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unlike other SVE-related registers, ZCR_EL2 takes a sysreg trap to EL2 when HCR_EL2.NV = 1. KVM still needs to honor the guest hypervisor's trap configuration, which expects an SVE trap (i.e. ESR_EL2.EC = 0x19) when CPTR traps are enabled for the vCPU's current context. Otherwise, if the guest hypervisor has traps disabled, emulate the access by mapping the requested VL into ZCR_EL1. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-4-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: nv: Forward SVE traps to guest hypervisorOliver Upton2024-06-202-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to FPSIMD traps, don't load SVE state if the guest hypervisor has SVE traps enabled and forward the trap instead. Note that ZCR_EL2 will require some special handling, as it takes a sysreg trap to EL2 when HCR_EL2.NV = 1. Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-3-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: nv: Forward FP/ASIMD traps to guest hypervisorJintack Lim2024-06-202-4/+15
| | | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Give precedence to the guest hypervisor's trap configuration when routing an FP/ASIMD trap taken to EL2. Take advantage of the infrastructure for translating CPTR_EL2 into the VHE (i.e. EL1) format and base the trap decision solely on the VHE view of the register. The in-memory value of CPTR_EL2 will always be up to date for the guest hypervisor (more on that later), so just read it directly from memory. Bury all of this behind a macro keyed off of the CPTR bitfield in anticipation of supporting other traps (e.g. SVE). [maz: account for HCR_EL2.E2H when testing for TFP/FPEN, with all the hard work actually being done by Chase Conklin] [ oliver: translate nVHE->VHE format for testing traps; macro for reuse in other CPTR_EL2.xEN fields ] Signed-off-by: Jintack Lim <jintack.lim@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240620164653.1130714-2-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | * | | Merge branch kvm-arm64/el2-kcfi into kvmarm/nextOliver Upton2024-07-149-32/+60
| | |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * kvm-arm64/el2-kcfi: : kCFI support in the EL2 hypervisor, courtesy of Pierre-Clément Tosi : : Enable the usage fo CONFIG_CFI_CLANG (kCFI) for hardening indirect : branches in the EL2 hypervisor. Unlike kernel support for the feature, : CFI failures at EL2 are always fatal. KVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2 KVM: arm64: Introduce print_nvhe_hyp_panic helper arm64: Introduce esr_brk_comment, esr_is_cfi_brk KVM: arm64: VHE: Mark __hyp_call_panic __noreturn KVM: arm64: nVHE: gen-hyprel: Skip R_AARCH64_ABS32 KVM: arm64: nVHE: Simplify invalid_host_el2_vect KVM: arm64: Fix __pkvm_init_switch_pgd call ABI KVM: arm64: Fix clobbered ELR in sync abort/SError Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2Pierre-Clément Tosi2024-06-203-4/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The compiler implements kCFI by adding type information (u32) above every function that might be indirectly called and, whenever a function pointer is called, injects a read-and-compare of that u32 against the value corresponding to the expected type. In case of a mismatch, a BRK instruction gets executed. When the hypervisor triggers such an exception in nVHE, it panics and triggers and exception return to EL1. Therefore, teach nvhe_hyp_panic_handler() to detect kCFI errors from the ESR and report them. If necessary, remind the user that EL2 kCFI is not affected by CONFIG_CFI_PERMISSIVE. Pass $(CC_FLAGS_CFI) to the compiler when building the nVHE hyp code. Use SYM_TYPED_FUNC_START() for __pkvm_init_switch_pgd, as nVHE can't call it directly and must use a PA function pointer from C (because it is part of the idmap page), which would trigger a kCFI failure if the type ID wasn't present. Signed-off-by: Pierre-Clément Tosi <ptosi@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240610063244.2828978-9-ptosi@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: Introduce print_nvhe_hyp_panic helperPierre-Clément Tosi2024-06-201-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a helper to display a panic banner soon to also be used for kCFI failures, to ensure that we remain consistent. Signed-off-by: Pierre-Clément Tosi <ptosi@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240610063244.2828978-8-ptosi@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | arm64: Introduce esr_brk_comment, esr_is_cfi_brkPierre-Clément Tosi2024-06-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As it is already used in two places, move esr_comment() to a header for re-use, with a clearer name. Introduce esr_is_cfi_brk() to detect kCFI BRK syndromes, currently used by early_brk64() but soon to also be used by hypervisor code. Signed-off-by: Pierre-Clément Tosi <ptosi@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240610063244.2828978-7-ptosi@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: VHE: Mark __hyp_call_panic __noreturnPierre-Clément Tosi2024-06-201-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Given that the sole purpose of __hyp_call_panic() is to call panic(), a __noreturn function, give it the __noreturn attribute, removing the need for its caller to use unreachable(). Signed-off-by: Pierre-Clément Tosi <ptosi@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240610063244.2828978-6-ptosi@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: nVHE: gen-hyprel: Skip R_AARCH64_ABS32Pierre-Clément Tosi2024-06-201-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ignore R_AARCH64_ABS32 relocations, instead of panicking, when emitting the relocation table of the hypervisor. The toolchain might produce them when generating function calls with kCFI to represent the 32-bit type ID which can then be resolved across compilation units at link time. These are NOT actual 32-bit addresses and are therefore not needed in the final (runtime) relocation table (which is unlikely to use 32-bit absolute addresses for arm64 anyway). Signed-off-by: Pierre-Clément Tosi <ptosi@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240610063244.2828978-5-ptosi@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: nVHE: Simplify invalid_host_el2_vectPierre-Clément Tosi2024-06-201-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The invalid_host_el2_vect macro is used by EL2{t,h} handlers in nVHE *host* context, which should never run with a guest context loaded. Therefore, remove the superfluous vCPU context check and branch unconditionally to hyp_panic. Signed-off-by: Pierre-Clément Tosi <ptosi@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240610063244.2828978-4-ptosi@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: Fix __pkvm_init_switch_pgd call ABIPierre-Clément Tosi2024-06-202-13/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the mismatch between the (incorrect) C signature, C call site, and asm implementation by aligning all three on an API passing the parameters (pgd and SP) separately, instead of as a bundled struct. Remove the now unnecessary memory accesses while the MMU is off from the asm, which simplifies the C caller (as it does not need to convert a VA struct pointer to PA) and makes the code slightly more robust by offsetting the struct fields from C and properly expressing the call to the C compiler (e.g. type checker and kCFI). Fixes: f320bc742bc2 ("KVM: arm64: Prepare the creation of s1 mappings at EL2") Signed-off-by: Pierre-Clément Tosi <ptosi@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240610063244.2828978-3-ptosi@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: Fix clobbered ELR in sync abort/SErrorPierre-Clément Tosi2024-06-202-2/+11
| | | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the hypervisor receives a SError or synchronous exception (EL2h) while running with the __kvm_hyp_vector and if ELR_EL2 doesn't point to an extable entry, it panics indirectly by overwriting ELR with the address of a panic handler in order for the asm routine it returns to to ERET into the handler. However, this clobbers ELR_EL2 for the handler itself. As a result, hyp_panic(), when retrieving what it believes to be the PC where the exception happened, actually ends up reading the address of the panic handler that called it! This results in an erroneous and confusing panic message where the source of any synchronous exception (e.g. BUG() or kCFI) appears to be __guest_exit_panic, making it hard to locate the actual BRK instruction. Therefore, store the original ELR_EL2 in the per-CPU kvm_hyp_ctxt and point the sysreg to a routine that first restores it to its previous value before running __guest_exit_panic. Fixes: 7db21530479f ("KVM: arm64: Restore hyp when panicking in guest context") Signed-off-by: Pierre-Clément Tosi <ptosi@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240610063244.2828978-2-ptosi@google.com Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | * | | Merge branch kvm-arm64/ctr-el0 into kvmarm/nextOliver Upton2024-07-144-186/+216
| | |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * kvm-arm64/ctr-el0: : Support for user changes to CTR_EL0, courtesy of Sebastian Ott : : Allow userspace to change the guest-visible value of CTR_EL0 for a VM, : so long as the requested value represents a subset of features supported : by hardware. In other words, prevent the VMM from over-promising the : capabilities of hardware. : : Make this happen by fitting CTR_EL0 into the existing infrastructure for : feature ID registers. KVM: selftests: Assert that MPIDR_EL1 is unchanged across vCPU reset KVM: arm64: nv: Unfudge ID_AA64PFR0_EL1 masking KVM: selftests: arm64: Test writes to CTR_EL0 KVM: arm64: rename functions for invariant sys regs KVM: arm64: show writable masks for feature registers KVM: arm64: Treat CTR_EL0 as a VM feature ID register KVM: arm64: unify code to prepare traps KVM: arm64: nv: Use accessors for modifying ID registers KVM: arm64: Add helper for writing ID regs KVM: arm64: Use read-only helper for reading VM ID registers KVM: arm64: Make idregs debugfs iterator search sysreg table directly KVM: arm64: Get sys_reg encoding from descriptor in idregs_debug_show() Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: nv: Unfudge ID_AA64PFR0_EL1 maskingOliver Upton2024-06-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Marc reports that L1 VMs aren't booting with the NV series applied to today's kvmarm/next. After bisecting the issue, it appears that 44241f34fac9 ("KVM: arm64: nv: Use accessors for modifying ID registers") is to blame. Poking around at the issue a bit further, it'd appear that the value for ID_AA64PFR0_EL1 is complete garbage, as 'val' still contains the value we set ID_AA64ISAR1_EL1 to. Fix the read-modify-write pattern to actually use ID_AA64PFR0_EL1 as the starting point. Excuse me as I return to my shame cube. Reported-by: Marc Zyngier <maz@kernel.org> Fixes: 44241f34fac9 ("KVM: arm64: nv: Use accessors for modifying ID registers") Acked-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240621224044.2465901-1-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: rename functions for invariant sys regsSebastian Ott2024-06-201-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Invariant system id registers are populated with host values at initialization time using their .reset function cb. These are currently called get_* which is usually used by the functions implementing the .get_user callback. Change their function names to reset_* to reflect what they are used for. Signed-off-by: Sebastian Ott <sebott@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20240619174036.483943-10-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
| | | * | | KVM: arm64: show writable masks for feature registersSebastian Ott2024-06-201-14/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using ~0UL provide the actual writable mask for non-id feature registers in the output of the KVM_ARM_GET_REG_WRITABLE_MASKS ioctl. This changes the mask for the CTR_EL0 and CLIDR_EL1 registers. Signed-off-by: Sebastian Ott <sebott@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20240619174036.483943-9-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>