summaryrefslogtreecommitdiffstats
path: root/arch/arm64/kvm/hyp
Commit message (Collapse)AuthorAgeFilesLines
* arm64: Use the clearbhb instruction in mitigationsJames Morse2022-02-241-0/+1
| | | | | | | | | | | | | | Future CPUs may implement a clearbhb instruction that is sufficient to mitigate SpectreBHB. CPUs that implement this instruction, but not CSV2.3 must be affected by Spectre-BHB. Add support to use this instruction as the BHB mitigation on CPUs that support it. The instruction is in the hint space, so it will be treated by a NOP as older CPUs. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com>
* arm64: Mitigate spectre style branch history side channelsJames Morse2022-02-241-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | Speculation attacks against some high-performance processors can make use of branch history to influence future speculation. When taking an exception from user-space, a sequence of branches or a firmware call overwrites or invalidates the branch history. The sequence of branches is added to the vectors, and should appear before the first indirect branch. For systems using KPTI the sequence is added to the kpti trampoline where it has a free register as the exit from the trampoline is via a 'ret'. For systems not using KPTI, the same register tricks are used to free up a register in the vectors. For the firmware call, arch-workaround-3 clobbers 4 registers, so there is no choice but to save them to the EL1 stack. This only happens for entry from EL0, so if we take an exception due to the stack access, it will not become re-entrant. For KVM, the existing branch-predictor-hardening vectors are used. When a spectre version of these vectors is in use, the firmware call is sufficient to mitigate against Spectre-BHB. For the non-spectre versions, the sequence of branches is added to the indirect vector. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com>
* arm64: Add percpu vectors for EL1James Morse2022-02-161-2/+8
| | | | | | | | | | | | | | | | | | The Spectre-BHB workaround adds a firmware call to the vectors. This is needed on some CPUs, but not others. To avoid the unaffected CPU in a big/little pair from making the firmware call, create per cpu vectors. The per-cpu vectors only apply when returning from EL0. Systems using KPTI can use the canonical 'full-fat' vectors directly at EL1, the trampoline exit code will switch to this_cpu_vector on exit to EL0. Systems not using KPTI should always use this_cpu_vector. this_cpu_vector will point at a vector in tramp_vecs or __bp_harden_el1_vectors, depending on whether KPTI is in use. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com>
* KVM: arm64: Allow indirect vectors to be used without SPECTRE_V3AJames Morse2022-02-151-1/+3
| | | | | | | | | | | | | | | | | | | | | | | CPUs vulnerable to Spectre-BHB either need to make an SMC-CC firmware call from the vectors, or run a sequence of branches. This gets added to the hyp vectors. If there is no support for arch-workaround-1 in firmware, the indirect vector will be used. kvm_init_vector_slots() only initialises the two indirect slots if the platform is vulnerable to Spectre-v3a. pKVM's hyp_map_vectors() only initialises __hyp_bp_vect_base if the platform is vulnerable to Spectre-v3a. As there are about to more users of the indirect vectors, ensure their entries in hyp_spectre_vector_selector[] are always initialised, and __hyp_bp_vect_base defaults to the regular VA mapping. The Spectre-v3a check is moved to a helper kvm_system_needs_idmapped_vectors(), and merged with the code that creates the hyp mappings. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com>
* KVM: arm64: Workaround Cortex-A510's single-step and PAC trap errataJames Morse2022-02-031-1/+19
| | | | | | | | | | | | | | | | | | | | Cortex-A510's erratum #2077057 causes SPSR_EL2 to be corrupted when single-stepping authenticated ERET instructions. A single step is expected, but a pointer authentication trap is taken instead. The erratum causes SPSR_EL1 to be copied to SPSR_EL2, which could allow EL1 to cause a return to EL2 with a guest controlled ELR_EL2. Because the conditions require an ERET into active-not-pending state, this is only a problem for the EL2 when EL2 is stepping EL1. In this case the previous SPSR_EL2 value is preserved in struct kvm_vcpu, and can be restored. Cc: stable@vger.kernel.org # 53960faf2b73: arm64: Add Cortex-A510 CPU part definition Cc: stable@vger.kernel.org Signed-off-by: James Morse <james.morse@arm.com> [maz: fixup cpucaps ordering] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220127122052.1584324-5-james.morse@arm.com
* KVM: arm64: Avoid consuming a stale esr value when SError occurJames Morse2022-02-031-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | When any exception other than an IRQ occurs, the CPU updates the ESR_EL2 register with the exception syndrome. An SError may also become pending, and will be synchronised by KVM. KVM notes the exception type, and whether an SError was synchronised in exit_code. When an exception other than an IRQ occurs, fixup_guest_exit() updates vcpu->arch.fault.esr_el2 from the hardware register. When an SError was synchronised, the vcpu esr value is used to determine if the exception was due to an HVC. If so, ELR_EL2 is moved back one instruction. This is so that KVM can process the SError first, and re-execute the HVC if the guest survives the SError. But if an IRQ synchronises an SError, the vcpu's esr value is stale. If the previous non-IRQ exception was an HVC, KVM will corrupt ELR_EL2, causing an unrelated guest instruction to be executed twice. Check ARM_EXCEPTION_CODE() before messing with ELR_EL2, IRQs don't update this register so don't need to check. Fixes: defe21f49bc9 ("KVM: arm64: Move PC rollback on SError to HYP") Cc: stable@vger.kernel.org Reported-by: Steven Price <steven.price@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220127122052.1584324-3-james.morse@arm.com
* Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2022-01-283-13/+13
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull kvm fixes from Paolo Bonzini: "Two larger x86 series: - Redo incorrect fix for SEV/SMAP erratum - Windows 11 Hyper-V workaround Other x86 changes: - Various x86 cleanups - Re-enable access_tracking_perf_test - Fix for #GP handling on SVM - Fix for CPUID leaf 0Dh in KVM_GET_SUPPORTED_CPUID - Fix for ICEBP in interrupt shadow - Avoid false-positive RCU splat - Enable Enlightened MSR-Bitmap support for real ARM: - Correctly update the shadow register on exception injection when running in nVHE mode - Correctly use the mm_ops indirection when performing cache invalidation from the page-table walker - Restrict the vgic-v3 workaround for SEIS to the two known broken implementations Generic code changes: - Dead code cleanup" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits) KVM: eventfd: Fix false positive RCU usage warning KVM: nVMX: Allow VMREAD when Enlightened VMCS is in use KVM: nVMX: Implement evmcs_field_offset() suitable for handle_vmread() KVM: nVMX: Rename vmcs_to_field_offset{,_table} KVM: nVMX: eVMCS: Filter out VM_EXIT_SAVE_VMX_PREEMPTION_TIMER KVM: nVMX: Also filter MSR_IA32_VMX_TRUE_PINBASED_CTLS when eVMCS selftests: kvm: check dynamic bits against KVM_X86_XCOMP_GUEST_SUPP KVM: x86: add system attribute to retrieve full set of supported xsave states KVM: x86: Add a helper to retrieve userspace address from kvm_device_attr selftests: kvm: move vm_xsave_req_perm call to amx_test KVM: x86: Sync the states size with the XCR0/IA32_XSS at, any time KVM: x86: Update vCPU's runtime CPUID on write to MSR_IA32_XSS KVM: x86: Keep MSR_IA32_XSS unchanged for INIT KVM: x86: Free kvm_cpuid_entry2 array on post-KVM_RUN KVM_SET_CPUID{,2} KVM: nVMX: WARN on any attempt to allocate shadow VMCS for vmcs02 KVM: selftests: Don't skip L2's VMCALL in SMM test for SVM guest KVM: x86: Check .flags in kvm_cpuid_check_equal() too KVM: x86: Forcibly leave nested virt when SMM state is toggled KVM: SVM: drop unnecessary code in svm_hv_vmcb_dirty_nested_enlightenments() KVM: SVM: hyper-v: Enable Enlightened MSR-Bitmap support for real ...
| * Merge tag 'kvmarm-fixes-5.17-1' of ↵Paolo Bonzini2022-01-283-13/+13
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 5.17, take #1 - Correctly update the shadow register on exception injection when running in nVHE mode - Correctly use the mm_ops indirection when performing cache invalidation from the page-table walker - Restrict the vgic-v3 workaround for SEIS to the two known broken implementations
| | * KVM: arm64: Use shadow SPSR_EL1 when injecting exceptions on !VHEMarc Zyngier2022-01-241-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Injecting an exception into a guest with non-VHE is risky business. Instead of writing in the shadow register for the switch code to restore it, we override the CPU register instead. Which gets overriden a few instructions later by said restore code. The result is that although the guest correctly gets the exception, it will return to the original context in some random state, depending on what was there the first place... Boo. Fix the issue by writing to the shadow register. The original code is absolutely fine on VHE, as the state is already loaded, and writing to the shadow register in that case would actually be a bug. Fixes: bb666c472ca2 ("KVM: arm64: Inject AArch64 exceptions from HYP") Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20220121184207.423426-1-maz@kernel.org
| | * KVM: arm64: vgic-v3: Restrict SEIS workaround to known broken systemsMarc Zyngier2022-01-221-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Contrary to what df652bcf1136 ("KVM: arm64: vgic-v3: Work around GICv3 locally generated SErrors") was asserting, there is at least one other system out there (Cavium ThunderX2) implementing SEIS, and not in an obviously broken way. So instead of imposing the M1 workaround on an innocent bystander, let's limit it to the two known broken Apple implementations. Fixes: df652bcf1136 ("KVM: arm64: vgic-v3: Work around GICv3 locally generated SErrors") Reported-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220122103912.795026-1-maz@kernel.org
| | * KVM: arm64: pkvm: Use the mm_ops indirection for cache maintenanceMarc Zyngier2022-01-141-12/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CMOs issued from EL2 cannot directly use the kernel helpers, as EL2 doesn't have a mapping of the guest pages. Oops. Instead, use the mm_ops indirection to use helpers that will perform a mapping at EL2 and allow the CMO to be effective. Fixes: 25aa28691bb9 ("KVM: arm64: Move guest CMOs to the fault handlers") Reviewed-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220114125038.1336965-1-maz@kernel.org
* | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2022-01-1616-311/+570
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull kvm updates from Paolo Bonzini: "RISCV: - Use common KVM implementation of MMU memory caches - SBI v0.2 support for Guest - Initial KVM selftests support - Fix to avoid spurious virtual interrupts after clearing hideleg CSR - Update email address for Anup and Atish ARM: - Simplification of the 'vcpu first run' by integrating it into KVM's 'pid change' flow - Refactoring of the FP and SVE state tracking, also leading to a simpler state and less shared data between EL1 and EL2 in the nVHE case - Tidy up the header file usage for the nvhe hyp object - New HYP unsharing mechanism, finally allowing pages to be unmapped from the Stage-1 EL2 page-tables - Various pKVM cleanups around refcounting and sharing - A couple of vgic fixes for bugs that would trigger once the vcpu xarray rework is merged, but not sooner - Add minimal support for ARMv8.7's PMU extension - Rework kvm_pgtable initialisation ahead of the NV work - New selftest for IRQ injection - Teach selftests about the lack of default IPA space and page sizes - Expand sysreg selftest to deal with Pointer Authentication - The usual bunch of cleanups and doc update s390: - fix sigp sense/start/stop/inconsistency - cleanups x86: - Clean up some function prototypes more - improved gfn_to_pfn_cache with proper invalidation, used by Xen emulation - add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery - completely remove potential TOC/TOU races in nested SVM consistency checks - update some PMCs on emulated instructions - Intel AMX support (joint work between Thomas and Intel) - large MMU cleanups - module parameter to disable PMU virtualization - cleanup register cache - first part of halt handling cleanups - Hyper-V enlightened MSR bitmap support for nested hypervisors Generic: - clean up Makefiles - introduce CONFIG_HAVE_KVM_DIRTY_RING - optimize memslot lookup using a tree - optimize vCPU array usage by converting to xarray" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (268 commits) x86/fpu: Fix inline prefix warnings selftest: kvm: Add amx selftest selftest: kvm: Move struct kvm_x86_state to header selftest: kvm: Reorder vcpu_load_state steps for AMX kvm: x86: Disable interception for IA32_XFD on demand x86/fpu: Provide fpu_sync_guest_vmexit_xfd_state() kvm: selftests: Add support for KVM_CAP_XSAVE2 kvm: x86: Add support for getting/setting expanded xstate buffer x86/fpu: Add uabi_size to guest_fpu kvm: x86: Add CPUID support for Intel AMX kvm: x86: Add XCR0 support for Intel AMX kvm: x86: Disable RDMSR interception of IA32_XFD_ERR kvm: x86: Emulate IA32_XFD_ERR for guest kvm: x86: Intercept #NM for saving IA32_XFD_ERR x86/fpu: Prepare xfd_err in struct fpu_guest kvm: x86: Add emulation for IA32_XFD x86/fpu: Provide fpu_update_guest_xfd() for IA32_XFD emulation kvm: x86: Enable dynamic xfeatures at KVM_SET_CPUID2 x86/fpu: Provide fpu_enable_guest_xfd_features() for KVM x86/fpu: Add guest support to xfd_enable_feature() ...
| * | Merge tag 'kvmarm-5.17' of ↵Paolo Bonzini2022-01-0716-311/+570
| |\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for Linux 5.16 - Simplification of the 'vcpu first run' by integrating it into KVM's 'pid change' flow - Refactoring of the FP and SVE state tracking, also leading to a simpler state and less shared data between EL1 and EL2 in the nVHE case - Tidy up the header file usage for the nvhe hyp object - New HYP unsharing mechanism, finally allowing pages to be unmapped from the Stage-1 EL2 page-tables - Various pKVM cleanups around refcounting and sharing - A couple of vgic fixes for bugs that would trigger once the vcpu xarray rework is merged, but not sooner - Add minimal support for ARMv8.7's PMU extension - Rework kvm_pgtable initialisation ahead of the NV work - New selftest for IRQ injection - Teach selftests about the lack of default IPA space and page sizes - Expand sysreg selftest to deal with Pointer Authentication - The usual bunch of cleanups and doc update
| | * Merge branch kvm-arm64/misc-5.17 into kvmarm-master/nextMarc Zyngier2022-01-042-5/+5
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * kvm-arm64/misc-5.17: : . : Misc fixes and improvements: : - Add minimal support for ARMv8.7's PMU extension : - Constify kvm_io_gic_ops : - Drop kvm_is_transparent_hugepage() prototype : - Drop unused workaround_flags field : - Rework kvm_pgtable initialisation : - Documentation fixes : - Replace open-coded SCTLR_EL1.EE useage with its defined macro : - Sysreg list selftest update to handle PAuth : - Include cleanups : . KVM: arm64: vgic: Replace kernel.h with the necessary inclusions KVM: arm64: Fix comment typo in kvm_vcpu_finalize_sve() KVM: arm64: selftests: get-reg-list: Add pauth configuration KVM: arm64: Fix comment on barrier in kvm_psci_vcpu_on() KVM: arm64: Fix comment for kvm_reset_vcpu() KVM: arm64: Use defined value for SCTLR_ELx_EE KVM: arm64: Rework kvm_pgtable initialisation KVM: arm64: Drop unused workaround_flags vcpu field Signed-off-by: Marc Zyngier <maz@kernel.org>
| | | * KVM: arm64: Rework kvm_pgtable initialisationMarc Zyngier2021-12-162-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ganapatrao reported that the kvm_pgtable->mmu pointer is more or less hardcoded to the main S2 mmu structure, while the nested code needs it to point to other instances (as we have one instance per nested context). Rework the initialisation of the kvm_pgtable structure so that this assumtion doesn't hold true anymore. This requires some minor changes to the order in which things are initialised (the mmu->arch pointer being the critical one). Reported-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Reviewed-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211129200150.351436-5-maz@kernel.org
| | * | Merge branch kvm-arm64/pkvm-hyp-sharing into kvmarm-master/nextMarc Zyngier2021-12-166-100/+543
| | |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * kvm-arm64/pkvm-hyp-sharing: : . : Series from Quentin Perret, implementing HYP page share/unshare: : : This series implements an unshare hypercall at EL2 in nVHE : protected mode, and makes use of it to unmmap guest-specific : data-structures from EL2 stage-1 during guest tear-down. : Crucially, the implementation of the share and unshare : routines use page refcounts in the host kernel to avoid : accidentally unmapping data-structures that overlap a common : page. : [...] : . KVM: arm64: pkvm: Unshare guest structs during teardown KVM: arm64: Expose unshare hypercall to the host KVM: arm64: Implement do_unshare() helper for unsharing memory KVM: arm64: Implement __pkvm_host_share_hyp() using do_share() KVM: arm64: Implement do_share() helper for sharing memory KVM: arm64: Introduce wrappers for host and hyp spin lock accessors KVM: arm64: Extend pkvm_page_state enumeration to handle absent pages KVM: arm64: pkvm: Refcount the pages shared with EL2 KVM: arm64: Introduce kvm_share_hyp() KVM: arm64: Implement kvm_pgtable_hyp_unmap() at EL2 KVM: arm64: Hook up ->page_count() for hypervisor stage-1 page-table KVM: arm64: Fixup hyp stage-1 refcount KVM: arm64: Refcount hyp stage-1 pgtable pages KVM: arm64: Provide {get,put}_page() stubs for early hyp allocator Signed-off-by: Marc Zyngier <maz@kernel.org>
| | | * | KVM: arm64: Expose unshare hypercall to the hostWill Deacon2021-12-163-0/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce an unshare hypercall which can be used to unmap memory from the hypervisor stage-1 in nVHE protected mode. This will be useful to update the EL2 ownership state of pages during guest teardown, and avoids keeping dangling mappings to unreferenced portions of memory. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211215161232.1480836-14-qperret@google.com
| | | * | KVM: arm64: Implement do_unshare() helper for unsharing memoryWill Deacon2021-12-161-0/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tearing down a previously shared memory region results in the borrower losing access to the underlying pages and returning them to the "owned" state in the owner. Implement a do_unshare() helper, along the same lines as do_share(), to provide this functionality for the host-to-hyp case. Reviewed-by: Andrew Walbran <qwandor@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211215161232.1480836-13-qperret@google.com
| | | * | KVM: arm64: Implement __pkvm_host_share_hyp() using do_share()Will Deacon2021-12-161-88/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __pkvm_host_share_hyp() shares memory between the host and the hypervisor so implement it as an invocation of the new do_share() mechanism. Note that double-sharing is no longer permitted (as this allows us to reduce the number of page-table walks significantly), but is thankfully no longer relied upon by the host. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211215161232.1480836-12-qperret@google.com
| | | * | KVM: arm64: Implement do_share() helper for sharing memoryWill Deacon2021-12-161-0/+237
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By default, protected KVM isolates memory pages so that they are accessible only to their owner: be it the host kernel, the hypervisor at EL2 or (in future) the guest. Establishing shared-memory regions between these components therefore involves a transition for each page so that the owner can share memory with a borrower under a certain set of permissions. Introduce a do_share() helper for safely sharing a memory region between two components. Currently, only host-to-hyp sharing is implemented, but the code is easily extended to handle other combinations and the permission checks for each component are reusable. Reviewed-by: Andrew Walbran <qwandor@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211215161232.1480836-11-qperret@google.com
| | | * | KVM: arm64: Introduce wrappers for host and hyp spin lock accessorsWill Deacon2021-12-161-6/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for adding additional locked sections for manipulating page-tables at EL2, introduce some simple wrappers around the host and hypervisor locks so that it's a bit easier to read and bit more difficult to take the wrong lock (or even take them in the wrong order). Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211215161232.1480836-10-qperret@google.com
| | | * | KVM: arm64: Extend pkvm_page_state enumeration to handle absent pagesWill Deacon2021-12-161-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Explicitly name the combination of SW0 | SW1 as reserved in the pte and introduce a new PKVM_NOPAGE meta-state which, although not directly stored in the software bits of the pte, can be used to represent an entry for which there is no underlying page. This is distinct from an invalid pte, as stage-2 identity mappings for the host are created lazily and so an invalid pte there is the same as a valid mapping for the purposes of ownership information. This state will be used for permission checking during page transitions in later patches. Reviewed-by: Andrew Walbran <qwandor@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211215161232.1480836-9-qperret@google.com
| | | * | KVM: arm64: Implement kvm_pgtable_hyp_unmap() at EL2Will Deacon2021-12-161-0/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement kvm_pgtable_hyp_unmap() which can be used to remove hypervisor stage-1 mappings at EL2. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211215161232.1480836-6-qperret@google.com
| | | * | KVM: arm64: Hook up ->page_count() for hypervisor stage-1 page-tableWill Deacon2021-12-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kvm_pgtable_hyp_unmap() relies on the ->page_count() function callback being provided by the memory-management operations for the page-table. Wire up this callback for the hypervisor stage-1 page-table. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211215161232.1480836-5-qperret@google.com
| | | * | KVM: arm64: Fixup hyp stage-1 refcountQuentin Perret2021-12-161-5/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In nVHE-protected mode, the hyp stage-1 page-table refcount is broken due to the lack of refcount support in the early allocator. Fix-up the refcount in the finalize walker, once the 'hyp_vmemmap' is up and running. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211215161232.1480836-4-qperret@google.com
| | | * | KVM: arm64: Refcount hyp stage-1 pgtable pagesQuentin Perret2021-12-161-20/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To prepare the ground for allowing hyp stage-1 mappings to be removed at run-time, update the KVM page-table code to maintain a correct refcount using the ->{get,put}_page() function callbacks. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211215161232.1480836-3-qperret@google.com
| | | * | KVM: arm64: Provide {get,put}_page() stubs for early hyp allocatorQuentin Perret2021-12-161-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In nVHE protected mode, the EL2 code uses a temporary allocator during boot while re-creating its stage-1 page-table. Unfortunately, the hyp_vmmemap is not ready to use at this stage, so refcounting pages is not possible. That is not currently a problem because hyp stage-1 mappings are never removed, which implies refcounting of page-table pages is unnecessary. In preparation for allowing hypervisor stage-1 mappings to be removed, provide stub implementations for {get,put}_page() in the early allocator. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211215161232.1480836-2-qperret@google.com
| | * | | Merge branch kvm-arm64/pkvm-cleanups-5.17 into kvmarm-master/nextMarc Zyngier2021-12-154-5/+4
| | |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * kvm-arm64/pkvm-cleanups-5.17: : . : pKVM cleanups from Quentin Perret: : : This series is a collection of various fixes and cleanups for KVM/arm64 : when running in nVHE protected mode. The first two patches are real : fixes/improvements, the following two are minor cleanups, and the last : two help satisfy my paranoia so they're certainly optional. : . KVM: arm64: pkvm: Make kvm_host_owns_hyp_mappings() robust to VHE KVM: arm64: pkvm: Stub io map functions KVM: arm64: Make __io_map_base static KVM: arm64: Make the hyp memory pool static KVM: arm64: pkvm: Disable GICv2 support KVM: arm64: pkvm: Fix hyp_pool max order Signed-off-by: Marc Zyngier <maz@kernel.org>
| | | * | | KVM: arm64: Make __io_map_base staticQuentin Perret2021-12-152-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The __io_map_base variable is used at EL2 to track the end of the hypervisor's "private" VA range in nVHE protected mode. However it doesn't need to be used outside of mm.c, so let's make it static to keep all the hyp VA allocation logic in one place. Signed-off-by: Quentin Perret <qperret@google.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211208152300.2478542-5-qperret@google.com
| | | * | | KVM: arm64: Make the hyp memory pool staticQuentin Perret2021-12-152-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The hyp memory pool struct is sized to fit exactly the needs of the hypervisor stage-1 page-table allocator, so it is important it is not used for anything else. As it is currently used only from setup.c, reduce its visibility by marking it static. Signed-off-by: Quentin Perret <qperret@google.com> Reviewed-by: Andrew Walbran <qwandor@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211208152300.2478542-4-qperret@google.com
| | | * | | KVM: arm64: pkvm: Fix hyp_pool max orderQuentin Perret2021-12-151-1/+1
| | | | |/ | | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The EL2 page allocator in protected mode maintains a per-pool max order value to optimize allocations when the memory region it covers is small. However, the max order value is currently under-estimated whenever the number of pages in the region is a power of two. Fix the estimation. Signed-off-by: Quentin Perret <qperret@google.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211208152300.2478542-2-qperret@google.com
| | * | | Merge branch kvm-arm64/hyp-header-split into kvmarm-master/nextMarc Zyngier2021-12-077-167/+14
| | |\ \ \ | | | |_|/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * kvm-arm64/hyp-header-split: : . : Tidy up the header file usage for the nvhe hyp object so : that header files under arch/arm64/kvm/hyp/include are not : included by host code running at EL1. : . KVM: arm64: Move host EL1 code out of hyp/ directory KVM: arm64: Generate hyp_constants.h for the host arm64: Add missing include of asm/cpufeature.h to asm/mmu.h Signed-off-by: Marc Zyngier <maz@kernel.org>
| | | * | KVM: arm64: Move host EL1 code out of hyp/ directoryWill Deacon2021-12-066-167/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kvm/hyp/reserved_mem.c contains host code executing at EL1 and is not linked into the hypervisor object. Move the file into kvm/pkvm.c and rework the headers so that the definitions shared between the host and the hypervisor live in asm/kvm_pkvm.h. Signed-off-by: Will Deacon <will@kernel.org> Tested-by: Fuad Tabba <tabba@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211202171048.26924-4-will@kernel.org
| | | * | KVM: arm64: Generate hyp_constants.h for the hostWill Deacon2021-12-061-0/+10
| | | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to avoid exposing hypervisor (EL2) data structures directly to the host, generate hyp_constants.h to provide constants such as structure sizes to the host without dragging in the definitions themselves. Signed-off-by: Will Deacon <will@kernel.org> Tested-by: Fuad Tabba <tabba@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211202171048.26924-3-will@kernel.org
| | * | KVM: arm64: Stop mapping current thread_info at EL2Marc Zyngier2021-11-223-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we can track an equivalent of TIF_FOREIGN_FPSTATE, drop the mapping of current's thread_info at EL2. Signed-off-by: Marc Zyngier <maz@kernel.org>
| | * | KVM: arm64: Introduce flag shadowing TIF_FOREIGN_FPSTATEMarc Zyngier2021-11-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently have to maintain a mapping the thread_info structure at EL2 in order to be able to check the TIF_FOREIGN_FPSTATE flag. In order to eventually get rid of this, start with a vcpu flag that shadows the thread flag on each entry into the hypervisor. Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
| | * | KVM: arm64: Remove unused __sve_save_stateMarc Zyngier2021-11-221-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we don't have any users left for __sve_save_state, remove it altogether. Should we ever need to save the SVE state from the hypervisor again, we can always re-introduce it. Suggested-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
| | * | KVM: arm64: Get rid of host SVE tracking/savingMarc Zyngier2021-11-221-24/+3
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SVE host tracking in KVM is pretty involved. It relies on a set of flags tracking the ownership of the SVE register, as well as that of the EL0 access. It is also pretty scary: __hyp_sve_save_host() computes a thread_struct pointer and obtains a sve_state which gets directly accessed without further ado, even on nVHE. How can this even work? The answer to that is that it doesn't, and that this is mostly dead code. Closer examination shows that on executing a syscall, userspace loses its SVE state entirely. This is part of the ABI. Another thing to notice is that although the kernel provides helpers such as kernel_neon_begin()/end(), they only deal with the FP/NEON state, and not SVE. Given that you can only execute a guest as the result of a syscall, and that the kernel cannot use SVE by itself, it becomes pretty obvious that there is never any host SVE state to save, and that this code is only there to increase confusion. Get rid of the TIF_SVE tracking and host save infrastructure altogether. Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
* | | Merge tag 'arm64-upstream' of ↵Linus Torvalds2022-01-101-0/+1
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - KCSAN enabled for arm64. - Additional kselftests to exercise the syscall ABI w.r.t. SVE/FPSIMD. - Some more SVE clean-ups and refactoring in preparation for SME support (scalable matrix extensions). - BTI clean-ups (SYM_FUNC macros etc.) - arm64 atomics clean-up and codegen improvements. - HWCAPs for FEAT_AFP (alternate floating point behaviour) and FEAT_RPRESS (increased precision of reciprocal estimate and reciprocal square root estimate). - Use SHA3 instructions to speed-up XOR. - arm64 unwind code refactoring/unification. - Avoid DC (data cache maintenance) instructions when DCZID_EL0.DZP == 1 (potentially set by a hypervisor; user-space already does this). - Perf updates for arm64: support for CI-700, HiSilicon PCIe PMU, Marvell CN10K LLC-TAD PMU, miscellaneous clean-ups. - Other fixes and clean-ups; highlights: fix the handling of erratum 1418040, correct the calculation of the nomap region boundaries, introduce io_stop_wc() mapped to the new DGH instruction (data gathering hint). * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (81 commits) arm64: Use correct method to calculate nomap region boundaries arm64: Drop outdated links in comments arm64: perf: Don't register user access sysctl handler multiple times drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check perf/smmuv3: Fix unused variable warning when CONFIG_OF=n arm64: errata: Fix exec handling in erratum 1418040 workaround arm64: Unhash early pointer print plus improve comment asm-generic: introduce io_stop_wc() and add implementation for ARM64 arm64: Ensure that the 'bti' macro is defined where linkage.h is included arm64: remove __dma_*_area() aliases docs/arm64: delete a space from tagged-address-abi arm64: Enable KCSAN kselftest/arm64: Add pidbench for floating point syscall cases arm64/fp: Add comments documenting the usage of state restore functions kselftest/arm64: Add a test program to exercise the syscall ABI kselftest/arm64: Allow signal tests to trigger from a function kselftest/arm64: Parameterise ptrace vector length information arm64/sve: Minor clarification of ABI documentation arm64/sve: Generalise vector length configuration prctl() for SME arm64/sve: Make sysctl interface for SVE reusable by SME ...
| * | arm64: Enable KCSANKefeng Wang2021-12-141-0/+1
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables KCSAN for arm64, with updates to build rules to not use KCSAN for several incompatible compilation units. Recent GCC version(at least GCC10) made outline-atomics as the default option(unlike Clang), which will cause linker errors for kernel/kcsan/core.o. Disables the out-of-line atomics by no-outline-atomics to fix the linker errors. Meanwhile, as Mark said[1], some latent issues are needed to be fixed which isn't just a KCSAN problem, we make the KCSAN depends on EXPERT for now. Tested selftest and kcsan_test(built with GCC11 and Clang 13), and all passed. [1] https://lkml.kernel.org/r/YadiUPpJ0gADbiHQ@FVFF77S0Q05N Acked-by: Marco Elver <elver@google.com> # kernel/kcsan Tested-by: Joey Gouly <joey.gouly@arm.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Link: https://lore.kernel.org/r/20211211131734.126874-1-wangkefeng.wang@huawei.com [catalin.marinas@arm.com: added comment to justify EXPERT] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
* | KVM: arm64: Move pkvm's special 32bit handling into a generic infrastructureMarc Zyngier2021-11-243-7/+13
| | | | | | | | | | | | | | | | | | | | | | | | Protected KVM is trying to turn AArch32 exceptions into an illegal exception entry. Unfortunately, it does that in a way that is a bit abrupt, and too early for PSTATE to be available. Instead, move it to the fixup code, which is a more reasonable place for it. This will also be useful for the NV code. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
* | KVM: arm64: Save PSTATE early on exitMarc Zyngier2021-11-242-1/+12
|/ | | | | | | | | | | | | | | In order to be able to use primitives such as vcpu_mode_is_32bit(), we need to synchronize the guest PSTATE. However, this is currently done deep into the bowels of the world-switch code, and we do have helpers evaluating this much earlier (__vgic_v3_perform_cpuif_access and handle_aarch32_guest, for example). Move the saving of the guest pstate into the early fixups, which cures the first issue. The second one will be addressed separately. Tested-by: Fuad Tabba <tabba@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
* Merge tag 'kvmarm-fixes-5.16-1' of ↵Paolo Bonzini2021-11-124-5/+15
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master KVM/arm64 fixes for 5.16, take #1 - Fix the host S2 finalization by solely iterating over the memblocks instead of the whole IPA space - Tighten the return value of kvm_vcpu_preferred_target() now that 32bit support is long gone - Make sure the extraction of ESR_ELx.EC is limited to the architected bits - Comment fixups
| * KVM: arm64: Fix host stage-2 finalizationQuentin Perret2021-11-081-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently walk the hypervisor stage-1 page-table towards the end of hyp init in nVHE protected mode and adjust the host page ownership attributes in its stage-2 in order to get a consistent state from both point of views. The walk is done on the entire hyp VA space, and expects to only ever find page-level mappings. While this expectation is reasonable in the half of hyp VA space that maps memory with a fixed offset (see the loop in pkvm_create_mappings_locked()), it can be incorrect in the other half where nothing prevents the usage of block mappings. For instance, on systems where memory is physically aligned at an address that happens to maps to a PMD aligned VA in the hyp_vmemmap, kvm_pgtable_hyp_map() will install block mappings when backing the hyp_vmemmap, which will later cause finalize_host_mappings() to fail. Furthermore, it should be noted that all pages backing the hyp_vmemmap are also mapped in the 'fixed offset range' of the hypervisor, which implies that finalize_host_mappings() will walk both aliases and update the host stage-2 attributes twice. The order in which this happens is unpredictable, though, since the hyp VA layout is highly dependent on the position of the idmap page, hence resulting in a fragile mess at best. In order to fix all of this, let's restrict the finalization walk to only cover memory regions in the 'fixed-offset range' of the hyp VA space and nothing else. This not only fixes a correctness issue, but will also result in a slighlty faster hyp initialization overall. Fixes: 2c50166c62ba ("KVM: arm64: Mark host bss and rodata section as shared") Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211108154636.393384-1-qperret@google.com
| * KVM: arm64: nvhe: Fix a non-kernel-doc commentRandy Dunlap2021-11-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Do not use kernel-doc "/**" notation when the comment is not in kernel-doc format. Fixes this docs build warning: arch/arm64/kvm/hyp/nvhe/sys_regs.c:478: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Handler for protected VM restricted exceptions. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211106032529.15057-1-rdunlap@infradead.org
| * KVM: arm64: Extract ESR_ELx.EC onlyMark Rutland2021-11-082-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since ARMv8.0 the upper 32 bits of ESR_ELx have been RES0, and recently some of the upper bits gained a meaning and can be non-zero. For example, when FEAT_LS64 is implemented, ESR_ELx[36:32] contain ISS2, which for an ST64BV or ST64BV0 can be non-zero. This can be seen in ARM DDI 0487G.b, page D13-3145, section D13.2.37. Generally, we must not rely on RES0 bit remaining zero in future, and when extracting ESR_ELx.EC we must mask out all other bits. All C code uses the ESR_ELx_EC() macro, which masks out the irrelevant bits, and therefore no alterations are required to C code to avoid consuming irrelevant bits. In a couple of places the KVM assembly extracts ESR_ELx.EC using LSR on an X register, and so could in theory consume previously RES0 bits. In both cases this is for comparison with EC values ESR_ELx_EC_HVC32 and ESR_ELx_EC_HVC64, for which the upper bits of ESR_ELx must currently be zero, but this could change in future. This patch adjusts the KVM vectors to use UBFX rather than LSR to extract ESR_ELx.EC, ensuring these are robust to future additions to ESR_ELx. Cc: stable@vger.kernel.org Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211103110545.4613-1-mark.rutland@arm.com
* | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2021-11-0214-181/+1230
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM updates from Paolo Bonzini: "ARM: - More progress on the protected VM front, now with the full fixed feature set as well as the limitation of some hypercalls after initialisation. - Cleanup of the RAZ/WI sysreg handling, which was pointlessly complicated - Fixes for the vgic placement in the IPA space, together with a bunch of selftests - More memcg accounting of the memory allocated on behalf of a guest - Timer and vgic selftests - Workarounds for the Apple M1 broken vgic implementation - KConfig cleanups - New kvmarm.mode=none option, for those who really dislike us RISC-V: - New KVM port. x86: - New API to control TSC offset from userspace - TSC scaling for nested hypervisors on SVM - Switch masterclock protection from raw_spin_lock to seqcount - Clean up function prototypes in the page fault code and avoid repeated memslot lookups - Convey the exit reason to userspace on emulation failure - Configure time between NX page recovery iterations - Expose Predictive Store Forwarding Disable CPUID leaf - Allocate page tracking data structures lazily (if the i915 KVM-GT functionality is not compiled in) - Cleanups, fixes and optimizations for the shadow MMU code s390: - SIGP Fixes - initial preparations for lazy destroy of secure VMs - storage key improvements/fixes - Log the guest CPNC Starting from this release, KVM-PPC patches will come from Michael Ellerman's PPC tree" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits) RISC-V: KVM: fix boolreturn.cocci warnings RISC-V: KVM: remove unneeded semicolon RISC-V: KVM: Fix GPA passed to __kvm_riscv_hfence_gvma_xyz() functions RISC-V: KVM: Factor-out FP virtualization into separate sources KVM: s390: add debug statement for diag 318 CPNC data KVM: s390: pv: properly handle page flags for protected guests KVM: s390: Fix handle_sske page fault handling KVM: x86: SGX must obey the KVM_INTERNAL_ERROR_EMULATION protocol KVM: x86: On emulation failure, convey the exit reason, etc. to userspace KVM: x86: Get exit_reason as part of kvm_x86_ops.get_exit_info KVM: x86: Clarify the kvm_run.emulation_failure structure layout KVM: s390: Add a routine for setting userspace CPU state KVM: s390: Simplify SIGP Set Arch handling KVM: s390: pv: avoid stalls when making pages secure KVM: s390: pv: avoid stalls for kvm_s390_pv_init_vm KVM: s390: pv: avoid double free of sida page KVM: s390: pv: add macros for UVC CC values s390/mm: optimize reset_guest_reference_bit() s390/mm: optimize set_guest_storage_key() s390/mm: no need for pte_alloc_map_lock() if we know the pmd is present ...
| * | Merge tag 'kvmarm-5.16' of ↵Paolo Bonzini2021-10-3114-181/+1230
| |\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for Linux 5.16 - More progress on the protected VM front, now with the full fixed feature set as well as the limitation of some hypercalls after initialisation. - Cleanup of the RAZ/WI sysreg handling, which was pointlessly complicated - Fixes for the vgic placement in the IPA space, together with a bunch of selftests - More memcg accounting of the memory allocated on behalf of a guest - Timer and vgic selftests - Workarounds for the Apple M1 broken vgic implementation - KConfig cleanups - New kvmarm.mode=none option, for those who really dislike us
| | * Merge branch kvm-arm64/pkvm/fixed-features into kvmarm-master/nextMarc Zyngier2021-10-1812-147/+1176
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * kvm-arm64/pkvm/fixed-features: (22 commits) : . : Add the pKVM fixed feature that allows a bunch of exceptions : to either be forbidden or be easily handled at EL2. : . KVM: arm64: pkvm: Give priority to standard traps over pvm handling KVM: arm64: pkvm: Pass vpcu instead of kvm to kvm_get_exit_handler_array() KVM: arm64: pkvm: Move kvm_handle_pvm_restricted around KVM: arm64: pkvm: Consolidate include files KVM: arm64: pkvm: Preserve pending SError on exit from AArch32 KVM: arm64: pkvm: Handle GICv3 traps as required KVM: arm64: pkvm: Drop sysregs that should never be routed to the host KVM: arm64: pkvm: Drop AArch32-specific registers KVM: arm64: pkvm: Make the ERR/ERX*_EL1 registers RAZ/WI KVM: arm64: pkvm: Use a single function to expose all id-regs KVM: arm64: Fix early exit ptrauth handling KVM: arm64: Handle protected guests at 32 bits KVM: arm64: Trap access to pVM restricted features KVM: arm64: Move sanitized copies of CPU features KVM: arm64: Initialize trap registers for protected VMs KVM: arm64: Add handlers for protected VM System Registers KVM: arm64: Simplify masking out MTE in feature id reg KVM: arm64: Add missing field descriptor for MDCR_EL2 KVM: arm64: Pass struct kvm to per-EC handlers KVM: arm64: Move early handlers to per-EC handlers ... Signed-off-by: Marc Zyngier <maz@kernel.org>
| | | * KVM: arm64: pkvm: Give priority to standard traps over pvm handlingMarc Zyngier2021-10-181-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Checking for pvm handling first means that we cannot handle ptrauth traps or apply any of the workarounds (GICv3 or TX2 #219). Flip the order around. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> Link: https://lore.kernel.org/r/20211013120346.2926621-12-maz@kernel.org