path: root/arch
Commit message (Author, Age; files changed, lines -/+)
* x86/speculation: Add Special Register Buffer Data Sampling (SRBDS) mitigation (Mark Gross, 2020-06-11; 5 files changed, -0/+144)
  commit 7e5b3c267d256822407a22fdce6afdf9cd13f9fb upstream.

  SRBDS is an MDS-like speculative side channel that can leak bits from the random number generator (RNG) across cores and threads. New microcode serializes the processor access during the execution of RDRAND and RDSEED. This ensures that the shared buffer is overwritten before it is released for reuse.

  While it is present on all affected CPU models, the microcode mitigation is not needed on models that enumerate ARCH_CAPABILITIES[MDS_NO] in the cases where TSX is not supported or has been disabled with TSX_CTRL.

  The mitigation is activated by default on affected processors and it increases latency for RDRAND and RDSEED instructions. Among other effects this will reduce throughput from /dev/urandom.

  * Enable administrator to configure the mitigation off when desired using either mitigations=off or srbds=off.
  * Export vulnerability status via sysfs
  * Rename file-scoped macros to apply for non-whitelist table initializations.

  [ bp: Massage,
    - s/VULNBL_INTEL_STEPPING/VULNBL_INTEL_STEPPINGS/g,
    - do not read arch cap MSR a second time in tsx_fused_off() - just pass it in,
    - flip check in cpu_set_bug_bits() to save an indentation level,
    - reflow comments.
    jpoimboe: s/Mitigated/Mitigation/ in user-visible strings
    tglx: Dropped the fused off magic for now ]

  Signed-off-by: Mark Gross <mgross@linux.intel.com>
  Signed-off-by: Borislav Petkov <bp@suse.de>
  Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  Reviewed-by: Tony Luck <tony.luck@intel.com>
  Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
  Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
  Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
  [bwh: Backported to 3.16:
   - CPU feature words and bugs are numbered differently
   - Adjust filename for <asm/msr-index.h>]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* x86/cpu: Add 'table' argument to cpu_matches() (Mark Gross, 2020-06-11; 1 file changed, -10/+13)
  commit 93920f61c2ad7edb01e63323832585796af75fc9 upstream.

  To make cpu_matches() reusable for other matching tables, have it take a pointer to a x86_cpu_id table as an argument.

  [ bp: Flip arguments order. ]

  Signed-off-by: Mark Gross <mgross@linux.intel.com>
  Signed-off-by: Borislav Petkov <bp@suse.de>
  Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* x86/cpu: Add a steppings field to struct x86_cpu_id (Mark Gross, 2020-06-11; 2 files changed, -1/+33)
  commit e9d7144597b10ff13ff2264c059f7d4a7fbc89ac upstream.

  Intel uses the same family/model for several CPUs. Sometimes the stepping must be checked to tell them apart.

  On x86 there can be at most 16 steppings. Add a steppings bitmask to x86_cpu_id and a X86_MATCH_VENDOR_FAMILY_MODEL_STEPPING_FEATURE macro and support for matching against family/model/stepping.

  [ bp: Massage.
    tglx: Lightweight variant for backporting ]

  Signed-off-by: Mark Gross <mgross@linux.intel.com>
  Signed-off-by: Borislav Petkov <bp@suse.de>
  Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  Reviewed-by: Tony Luck <tony.luck@intel.com>
  Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

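  [Editor's note: a minimal, hypothetical sketch in plain C of the idea behind a steppings bitmask, not the kernel's actual x86_cpu_id code. A 16-bit mask lets one table entry cover several steppings of the same family/model, since x86 has at most 16 steppings.]

      #include <stdbool.h>
      #include <stdint.h>

      struct cpu_match_entry {
              uint8_t  family;
              uint8_t  model;
              uint16_t steppings;     /* bit N set => stepping N matches */
      };

      static bool entry_matches(const struct cpu_match_entry *e,
                                uint8_t family, uint8_t model, uint8_t stepping)
      {
              return e->family == family &&
                     e->model == model &&
                     (e->steppings & (1u << stepping));
      }
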
* x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping (Jia Zhang, 2020-06-11; 18 files changed, -47/+47)
  commit b399151cb48db30ad1e0e93dd40d68c6d007b637 upstream.

  x86_mask is a confusing name which is hard to associate with the processor's stepping.

  Additionally, correct an indent issue in lib/cpu.c.

  Signed-off-by: Jia Zhang <qianyue.zj@alibaba-inc.com>
  [ Updated it to more recent kernels. ]
  Cc: Linus Torvalds <torvalds@linux-foundation.org>
  Cc: Peter Zijlstra <peterz@infradead.org>
  Cc: Thomas Gleixner <tglx@linutronix.de>
  Cc: bp@alien8.de
  Cc: tony.luck@intel.com
  Link: http://lkml.kernel.org/r/1514771530-70829-1-git-send-email-qianyue.zj@alibaba-inc.com
  Signed-off-by: Ingo Molnar <mingo@kernel.org>
  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  [bwh: Backported to 3.16:
   - Drop changes in arch/x86/lib/cpu.c
   - Adjust filenames, context]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* of: Add OF_DMA_DEFAULT_COHERENT & select it on powerpc (Michael Ellerman, 2020-05-22; 1 file changed, -0/+1)
  commit dabf6b36b83a18d57e3d4b9d50544ed040d86255 upstream.

  There's an OF helper called of_dma_is_coherent(), which checks if a device has a "dma-coherent" property to see if the device is coherent for DMA.

  But on some platforms devices are coherent by default, and on some platforms it's not possible to update existing device trees to add the "dma-coherent" property.

  So add a Kconfig symbol to allow arch code to tell of_dma_is_coherent() that devices are coherent by default, regardless of the presence of the property.

  Select that symbol on powerpc when NOT_COHERENT_CACHE is not set, ie. when the system has a coherent cache.

  Fixes: 92ea637edea3 ("of: introduce of_dma_is_coherent() helper")
  Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de>
  Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
  Signed-off-by: Rob Herring <robh@kernel.org>
  [bwh: Backported to 3.16: adjust context]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* KVM: x86: Protect DR-based index computations from Spectre-v1/L1TF attacks (Marios Pomonis, 2020-05-22; 1 file changed, -2/+6)
  commit ea740059ecb37807ba47b84b33d1447435a8d868 upstream.

  This fixes a Spectre-v1/L1TF vulnerability in __kvm_set_dr() and kvm_get_dr(). Both kvm_get_dr() and kvm_set_dr() (a wrapper of __kvm_set_dr()) are exported symbols so KVM should treat them conservatively from a security perspective.

  Fixes: 020df0794f57 ("KVM: move DR register access handling into generic code")
  Signed-off-by: Nick Finco <nifi@google.com>
  Signed-off-by: Marios Pomonis <pomonis@google.com>
  Reviewed-by: Andrew Honig <ahonig@google.com>
  Reviewed-by: Jim Mattson <jmattson@google.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* KVM: x86: Protect MSR-based index computations from Spectre-v1/L1TF attacks in x86.c (Marios Pomonis, 2020-05-22; 1 file changed, -2/+9)
  commit 6ec4c5eee1750d5d17951c4e1960d953376a0dda upstream.

  This fixes a Spectre-v1/L1TF vulnerability in set_msr_mce() and get_msr_mce(). Both functions contain index computations based on the (attacker-controlled) MSR number.

  Fixes: 890ca9aefa78 ("KVM: Add MCE support")
  Signed-off-by: Nick Finco <nifi@google.com>
  Signed-off-by: Marios Pomonis <pomonis@google.com>
  Reviewed-by: Andrew Honig <ahonig@google.com>
  Reviewed-by: Jim Mattson <jmattson@google.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  [bwh: Backported to 3.16: Add #include <linux/nospec.h>]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

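  [Editor's note: the Spectre-v1/L1TF entries in this series all follow the same hardening pattern. The fragment below is an illustrative sketch only, with made-up function and array names, not one of the actual hunks; it shows the shape of the fix using the array_index_nospec() helper from <linux/nospec.h>, which clamps an attacker-influenced index so the CPU cannot speculatively read out of bounds even if it mispredicts the bounds check.]

      #include <linux/nospec.h>
      #include <linux/types.h>

      /* Illustrative only: clamp the index before the array access. */
      static u64 read_bank_value(const u64 *banks, unsigned int nbanks,
                                 unsigned int idx)
      {
              if (idx >= nbanks)
                      return 0;
              idx = array_index_nospec(idx, nbanks);
              return banks[idx];
      }
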
* kvm: x86: use macros to compute bank MSRs (Chen Yucong, 2020-05-22; 1 file changed, -4/+4)
  commit 81760dccf8d1fe5b128b58736fe3f56a566133cb upstream.

  Avoid open coded calculations for bank MSRs by using well-defined macros that hide the index of higher bank MSRs.

  No semantic changes.

  Signed-off-by: Chen Yucong <slaoub@gmail.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  [bwh: Backported to 3.16: adjust context]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* KVM: x86: Protect kvm_lapic_reg_write() from Spectre-v1/L1TF attacks (Marios Pomonis, 2020-05-22; 1 file changed, -4/+10)
  commit 4bf79cb089f6b1c6c632492c0271054ce52ad766 upstream.

  This fixes a Spectre-v1/L1TF vulnerability in kvm_lapic_reg_write(). This function contains index computations based on the (attacker-controlled) MSR number.

  Fixes: 0105d1a52640 ("KVM: x2apic interface to lapic")
  Signed-off-by: Nick Finco <nifi@google.com>
  Signed-off-by: Marios Pomonis <pomonis@google.com>
  Reviewed-by: Andrew Honig <ahonig@google.com>
  Reviewed-by: Jim Mattson <jmattson@google.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  [bwh: Backported to 3.16:
   - Add #include <linux/nospec.h>
   - Adjust context]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* KVM: x86: Refactor picdev_write() to prevent Spectre-v1/L1TF attacks (Marios Pomonis, 2020-05-22; 1 file changed, -1/+3)
  commit 14e32321f3606e4b0970200b6e5e47ee6f1e6410 upstream.

  This fixes a Spectre-v1/L1TF vulnerability in picdev_write(). It replaces index computations based on the (attacker-controlled) port number with constants through a minor refactoring.

  Fixes: 85f455f7ddbe ("KVM: Add support for in-kernel PIC emulation")
  Signed-off-by: Nick Finco <nifi@google.com>
  Signed-off-by: Marios Pomonis <pomonis@google.com>
  Reviewed-by: Andrew Honig <ahonig@google.com>
  Reviewed-by: Jim Mattson <jmattson@google.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  [bwh: Backported to 3.16: pic_{,un}lock() are called outside the switch]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* KVM: x86: Protect x86_decode_insn from Spectre-v1/L1TF attacks (Marios Pomonis, 2020-05-22; 1 file changed, -3/+9)
  commit 3c9053a2cae7ba2ba73766a34cea41baa70f57f7 upstream.

  This fixes a Spectre-v1/L1TF vulnerability in x86_decode_insn(). kvm_emulate_instruction() (an ancestor of x86_decode_insn()) is an exported symbol, so KVM should treat it conservatively from a security perspective.

  Fixes: 045a282ca415 ("KVM: emulator: implement fninit, fnstsw, fnstcw")
  Signed-off-by: Nick Finco <nifi@google.com>
  Signed-off-by: Marios Pomonis <pomonis@google.com>
  Reviewed-by: Andrew Honig <ahonig@google.com>
  Reviewed-by: Jim Mattson <jmattson@google.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  [bwh: Backported to 3.16: Add #include <linux/nospec.h>]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* KVM: x86: Free wbinvd_dirty_mask if vCPU creation fails (Sean Christopherson, 2020-05-22; 1 file changed, -2/+1)
  commit 16be9ddea268ad841457a59109963fff8c9de38d upstream.

  Free the vCPU's wbinvd_dirty_mask if vCPU creation fails after kvm_arch_vcpu_init(), e.g. when installing the vCPU's file descriptor. Do the freeing by calling kvm_arch_vcpu_free() instead of open coding the freeing.

  This adds a likely superfluous, but ultimately harmless, call to kvmclock_reset(), which only clears vcpu->arch.pv_time_enabled. Using kvm_arch_vcpu_free() allows for additional cleanup in the future.

  Fixes: f5f48ee15c2ee ("KVM: VMX: Execute WBINVD to keep data consistency with assigned devices")
  Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  [bwh: Backported to 3.16: Also delete the preceding fx_free(), since kvm_arch_vcpu_free() calls it.]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* KVM: PPC: Book3S PR: Free shared page if mmu initialization fails (Sean Christopherson, 2020-05-22; 1 file changed, -1/+3)
  commit cb10bf9194f4d2c5d830eddca861f7ca0fecdbb4 upstream.

  Explicitly free the shared page if kvmppc_mmu_init() fails during kvmppc_core_vcpu_create(), as the page is freed only in kvmppc_core_vcpu_free(), which is not reached via kvm_vcpu_uninit().

  Fixes: 96bc451a15329 ("KVM: PPC: Introduce shared page")
  Reviewed-by: Greg Kurz <groug@kaod.org>
  Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
  Acked-by: Paul Mackerras <paulus@ozlabs.org>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* KVM: PPC: Book3S HV: Uninit vCPU if vcore creation fails (Sean Christopherson, 2020-05-22; 1 file changed, -1/+3)
  commit 1a978d9d3e72ddfa40ac60d26301b154247ee0bc upstream.

  Call kvm_vcpu_uninit() if vcore creation fails to avoid leaking any resources allocated by kvm_vcpu_init(), i.e. the vcpu->run page.

  Fixes: 371fefd6f2dc4 ("KVM: PPC: Allow book3s_hv guests to use SMT processor modes")
  Reviewed-by: Greg Kurz <groug@kaod.org>
  Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
  Acked-by: Paul Mackerras <paulus@ozlabs.org>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* KVM: x86/mmu: Apply max PA check for MMIO sptes to 32-bit KVM (Sean Christopherson, 2020-05-22; 1 file changed, -1/+1)
  commit e30a7d623dccdb3f880fbcad980b0cb589a1da45 upstream.

  Remove the bogus 64-bit only condition from the check that disables MMIO spte optimization when the system supports the max PA, i.e. doesn't have any reserved PA bits. 32-bit KVM always uses PAE paging for the shadow MMU, and per Intel's SDM:

      PAE paging translates 32-bit linear addresses to 52-bit physical addresses.

  The kernel's restrictions on max physical addresses are limits on how much memory the kernel can reasonably use, not what physical addresses are supported by hardware.

  Fixes: ce88decffd17 ("KVM: MMU: mmio page fault support")
  Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  [bwh: Backported to 3.16: adjust filename, context]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* x86: kvm: avoid unused variable warning (Arnd Bergmann, 2020-05-22; 1 file changed, -3/+1)
  commit 7288bde1f9df6c1475675419bdd7725ce84dec56 upstream.

  Removing one of the two accesses of the maxphyaddr variable led to a harmless warning:

      arch/x86/kvm/x86.c: In function 'kvm_set_mmio_spte_mask':
      arch/x86/kvm/x86.c:6563:6: error: unused variable 'maxphyaddr' [-Werror=unused-variable]

  Removing the #ifdef seems to be the nicest workaround, as it makes the code look cleaner than adding another #ifdef.

  Fixes: 28a1f3ac1d0c ("kvm: x86: Set highest physical address bits in non-present/reserved SPTEs")
  Signed-off-by: Arnd Bergmann <arnd@arndb.de>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* KVM: nVMX: vmread should not set rflags to specify success in case of #PF (Miaohe Lin, 2020-05-22; 1 file changed, -1/+3)
  commit a4d956b9390418623ae5d07933e2679c68b6f83c upstream.

  In case writing to the vmread destination operand results in a #PF, vmread should not call nested_vmx_succeed() to set rflags to specify success. Similar to what is done in VMPTRST (see handle_vmptrst()).

  Reviewed-by: Liran Alon <liran.alon@oracle.com>
  Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
  Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  [bwh: Backported to 3.16: adjust filename, context]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* KVM: x86: Don't let userspace set host-reserved cr4 bits (Sean Christopherson, 2020-05-22; 1 file changed, -1/+24)
  commit b11306b53b2540c6ba068c4deddb6a17d9f8d95b upstream.

  Calculate the host-reserved cr4 bits at runtime based on the system's capabilities (using logic similar to __do_cpuid_func()), and use the dynamically generated mask for the reserved bit check in kvm_set_cr4() instead of using the static CR4_RESERVED_BITS define. This prevents userspace from "enabling" features in cr4 that are not supported by the system, e.g. by ignoring KVM_GET_SUPPORTED_CPUID and specifying a bogus CPUID for the vCPU.

  Allowing userspace to set unsupported bits in cr4 can lead to a variety of undesirable behavior, e.g. failed VM-Enter, and in general increases KVM's attack surface. A crafty userspace can even abuse CR4.LA57 to induce an unchecked #GP on a WRMSR.

  On a platform without LA57 support:

      KVM_SET_CPUID2   // CPUID_7_0_ECX.LA57 = 1
      KVM_SET_SREGS    // CR4.LA57 = 1
      KVM_SET_MSRS     // KERNEL_GS_BASE = 0x0004000000000000
      KVM_RUN

  leads to a #GP when writing KERNEL_GS_BASE into hardware:

      unchecked MSR access error: WRMSR to 0xc0000102 (tried to write 0x0004000000000000) at rIP: 0xffffffffa00f239a (vmx_prepare_switch_to_guest+0x10a/0x1d0 [kvm_intel])
      Call Trace:
       kvm_arch_vcpu_ioctl_run+0x671/0x1c70 [kvm]
       kvm_vcpu_ioctl+0x36b/0x5d0 [kvm]
       do_vfs_ioctl+0xa1/0x620
       ksys_ioctl+0x66/0x70
       __x64_sys_ioctl+0x16/0x20
       do_syscall_64+0x4c/0x170
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x7fc08133bf47

  Note, the above sequence fails VM-Enter due to invalid guest state. Userspace can allow VM-Enter to succeed (after the WRMSR #GP) by adding a KVM_SET_SREGS w/ CR4.LA57=0 after KVM_SET_MSRS, in which case KVM will technically leak the host's KERNEL_GS_BASE into the guest. But, as KERNEL_GS_BASE is a userspace-defined value/address, the leak is largely benign as a malicious userspace would simply be exposing its own data to the guest, and attacking a benevolent userspace would require multiple bugs in the userspace VMM.

  Cc: Jun Nakajima <jun.nakajima@intel.com>
  Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  [bwh: Backported to 3.16:
   - PKE, LA57, and UMIP are totally unsupported and already included in CR4_RESERVED_BITS
   - Adjust context]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* sparc32: fix struct ipc64_perm type definition (Arnd Bergmann, 2020-05-22; 1 file changed, -11/+11)
  commit 34ca70ef7d3a9fa7e89151597db5e37ae1d429b4 upstream.

  As discussed in the strace issue tracker, it appears that the sparc32 sysvipc support has been broken for the past 11 years. It was however working in compat mode, which is how it must have escaped most of the regular testing.

  The problem is that a cleanup patch inadvertently changed the uid/gid fields in struct ipc64_perm from 32-bit types to 16-bit types in uapi headers.

  Both glibc and uclibc-ng still use the original types, so they should work fine with compat mode, but not natively. Change the definitions to use __kernel_uid32_t and __kernel_gid32_t again.

  Fixes: 83c86984bff2 ("sparc: unify ipcbuf.h")
  Link: https://github.com/strace/strace/issues/116
  Cc: Sam Ravnborg <sam@ravnborg.org>
  Cc: "Dmitry V . Levin" <ldv@altlinux.org>
  Cc: Rich Felker <dalias@libc.org>
  Cc: libc-alpha@sourceware.org
  Signed-off-by: Arnd Bergmann <arnd@arndb.de>
  Signed-off-by: David S. Miller <davem@davemloft.net>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* KVM: arm64: Only sign-extend MMIO up to register width (Christoffer Dall, 2020-05-22; 5 files changed, -4/+20)
  commit b6ae256afd32f96bec0117175b329d0dd617655e upstream.

  On AArch64 you can do a sign-extended load to either a 32-bit or 64-bit register, and we should only sign extend the register up to the width of the register as specified in the operation (by using the 32-bit Wn or 64-bit Xn register specifier).

  As it turns out, the architecture provides this decoding information in the SF ("Sixty-Four" -- how cute...) bit.

  Let's take advantage of this with the usual 32-bit/64-bit header file dance and do the right thing on AArch64 hosts.

  Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
  Signed-off-by: Marc Zyngier <maz@kernel.org>
  Link: https://lore.kernel.org/r/20191212195055.5541-1-christoffer.dall@arm.com
  [bwh: Backported to 3.16:
   - Use ESR_EL2_SF
   - Adjust filename, context]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

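  [Editor's note: as a rough illustration of the decoding rule described above (standalone C, not the KVM backport itself, with made-up names): the loaded value is sign-extended to the access size first, then truncated to 32 bits when the destination is a Wn register, i.e. when the instruction's SF bit is clear.]

      #include <stdint.h>

      static uint64_t extend_mmio_data(uint64_t data, unsigned int len,
                                       int sign_extend, int sixty_four)
      {
              if (sign_extend) {
                      unsigned int shift = 64 - 8 * len;

                      /* sign-extend from the access width (len bytes) */
                      data = (uint64_t)((int64_t)(data << shift) >> shift);
              }
              if (!sixty_four)                /* 32-bit Wn destination */
                      data &= 0xffffffffULL;
              return data;
      }
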
* x86/cpu: Update cached HLE state on write to TSX_CTRL_CPUID_CLEAR (Pawan Gupta, 2020-05-22; 1 file changed, -6/+7)
  commit 5efc6fa9044c3356d6046c6e1da6d02572dbed6b upstream.

  /proc/cpuinfo currently reports Hardware Lock Elision (HLE) feature to be present on boot cpu even if it was disabled during the bootup. This is because cpuinfo_x86->x86_capability HLE bit is not updated after TSX state is changed via the new MSR IA32_TSX_CTRL.

  Update the cached HLE bit also since it is expected to change after an update to CPUID_CLEAR bit in MSR IA32_TSX_CTRL.

  Fixes: 95c5824f75f3 ("x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default")
  Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
  Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
  Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
  Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
  Link: https://lore.kernel.org/r/2529b99546294c893dfa1c89e2b3e46da3369a59.1578685425.git.pawan.kumar.gupta@linux.intel.com
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* efi/x86: Map the entire EFI vendor string before copying it (Ard Biesheuvel, 2020-05-22; 1 file changed, -6/+7)
  commit ffc2760bcf2dba0dbef74013ed73eea8310cc52c upstream.

  Fix a couple of issues with the way we map and copy the vendor string:

  - we map only 2 bytes, which usually works since you get at least a page, but if the vendor string happens to cross a page boundary, a crash will result
  - only call early_memunmap() if early_memremap() succeeded, or we will call it with a NULL address which it doesn't like,
  - while at it, switch to early_memremap_ro(), and array indexing rather than pointer dereferencing to read the CHAR16 characters.

  Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
  Cc: Andy Lutomirski <luto@kernel.org>
  Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
  Cc: Arvind Sankar <nivedita@alum.mit.edu>
  Cc: Matthew Garrett <mjg59@google.com>
  Cc: linux-efi@vger.kernel.org
  Fixes: 5b83683f32b1 ("x86: EFI runtime service support")
  Link: https://lkml.kernel.org/r/20200103113953.9571-5-ardb@kernel.org
  Signed-off-by: Ingo Molnar <mingo@kernel.org>
  [bwh: Backported to 3.16: Keep using early_memremap() since early_memremap_ro() is not defined.]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* efi: Use early_mem*() instead of early_io*() (Daniel Kiper, 2020-05-22; 1 file changed, -14/+14)
  commit abc93f8eb6e46a480485f19256bdbda36ec78a84 upstream.

  Use early_mem*() instead of early_io*() because all mapped EFI regions are memory (usually RAM but they could also be ROM, EPROM, EEPROM, flash, etc.) not I/O regions.

  Additionally, I/O family calls do not work correctly under Xen in our case. early_ioremap() skips the PFN to MFN conversion when building the PTE. Using it for memory will attempt to map the wrong machine frame. However, all artificial EFI structures created under Xen live in dom0 memory and should be mapped/unmapped using early_mem*() family calls which map domain memory.

  Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
  Cc: Leif Lindholm <leif.lindholm@linaro.org>
  Cc: Mark Salter <msalter@redhat.com>
  Signed-off-by: Matt Fleming <matt.fleming@intel.com>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* arch/ia64: Define early_memunmap() (Daniel Kiper, 2020-05-22; 1 file changed, -0/+1)
  commit 4fa62481e231111373418f0d95dd1f24f6e83321 upstream.

  It is odd to use the early_iounmap() function to tear down a mapping created by early_memremap(), even if it works right now, because they belong to different sets of functions. The former is an I/O related function and the latter is memory related.

  So, create an early_memunmap() macro which is really early_iounmap(). This will avoid confusing code readers any longer by mixing functions from different classes. The EFI patches following this one use that functionality.

  Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
  Cc: Tony Luck <tony.luck@intel.com>
  Signed-off-by: Matt Fleming <matt.fleming@intel.com>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* ARM: dts: at91: sama5d3: define clock rate range for tcb1 (Alexandre Belloni, 2020-05-22; 1 file changed, -0/+1)
  commit a7e0f3fc01df4b1b7077df777c37feae8c9e8b6d upstream.

  The clock rate range for the TCB1 clock is missing. Define it in the device tree.

  Reported-by: Karl Rudbæk Olsen <karl@micro-technic.com>
  Fixes: d2e8190b7916 ("ARM: at91/dt: define sama5d3 clocks")
  Link: https://lore.kernel.org/r/20200110172007.1253659-2-alexandre.belloni@bootlin.com
  Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* ARM: dts: at91: sama5d3: fix maximum peripheral clock rates (Alexandre Belloni, 2020-05-22; 3 files changed, -17/+17)
  commit ee0aa926ddb0bd8ba59e33e3803b3b5804e3f5da upstream.

  Currently the maximum rate for peripheral clock is calculated based on a typical 133MHz MCK. The maximum frequency is defined in the datasheet as a ratio to MCK. Some sama5d3 platforms are using a 166MHz MCK. Update the device trees to match the maximum rate based on 166MHz.

  Reported-by: Karl Rudbæk Olsen <karl@micro-technic.com>
  Fixes: d2e8190b7916 ("ARM: at91/dt: define sama5d3 clocks")
  Link: https://lore.kernel.org/r/20200110172007.1253659-1-alexandre.belloni@bootlin.com
  Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
  [bwh: Backported to 3.16: uart0_clk is only defined in sama5d3_uart.dtsi]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* ARM: tegra: Enable PLLP bypass during Tegra124 LP1 (Stephen Warren, 2020-05-22; 1 file changed, -0/+11)
  commit 1a3388d506bf5b45bb283e6a4c4706cfb4897333 upstream.

  For a little over a year, U-Boot has configured the flow controller to perform automatic RAM re-repair on off->on power transitions of the CPU rail[1]. This is mandatory for correct operation of Tegra124. However, RAM re-repair relies on certain clocks, which the kernel must enable and leave running. PLLP is one of those clocks. This clock is shut down during LP1 in order to save power. Enable bypass (which I believe routes osc_div_clk, essentially the crystal clock, to the PLL output) so that this clock signal toggles even though the PLL is not active. This is required so that LP1 power mode (system suspend) operates correctly.

  The bypass configuration must then be undone when resuming from LP1, so that all peripheral clocks run at the expected rate. Without this, many peripherals won't work correctly; for example, the UART baud rate would be incorrect.

  NVIDIA's downstream kernel code only does this if not compiled for Tegra30, so the added code is made conditional upon the chip ID. NVIDIA's downstream code makes this change conditional upon the active CPU cluster. The upstream kernel currently doesn't support cluster switching, so this patch doesn't test the active CPU cluster ID.

  [1] 3cc7942a4ae5 ARM: tegra: implement RAM repair

  Reported-by: Jonathan Hunter <jonathanh@nvidia.com>
  Signed-off-by: Stephen Warren <swarren@nvidia.com>
  Signed-off-by: Thierry Reding <treding@nvidia.com>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* KVM: nVMX: Don't emulate instructions in guest mode (Paolo Bonzini, 2020-04-28; 1 file changed, -1/+1)
  commit 07721feee46b4b248402133228235318199b05ec upstream.

  vmx_check_intercept is not yet fully implemented. To avoid emulating instructions disallowed by the L1 hypervisor, refuse to emulate instructions by default.

  [Made commit, added commit msg - Oliver]
  Signed-off-by: Oliver Upton <oupton@google.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  [bwh: Backported to 3.16: adjust filename, context]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* x86/efistub: Disable paging at mixed mode entry (Ard Biesheuvel, 2020-04-28; 1 file changed, -0/+5)
  commit 4911ee401b7ceff8f38e0ac597cbf503d71e690c upstream.

  The EFI mixed mode entry code goes through the ordinary startup_32() routine before jumping into the kernel's EFI boot code in 64-bit mode. The 32-bit startup code must be entered with paging disabled, but this is not documented as a requirement for the EFI handover protocol, and so we should disable paging explicitly when entering the kernel from 32-bit EFI firmware.

  Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
  Cc: Arvind Sankar <nivedita@alum.mit.edu>
  Cc: Hans de Goede <hdegoede@redhat.com>
  Cc: Linus Torvalds <torvalds@linux-foundation.org>
  Cc: Peter Zijlstra <peterz@infradead.org>
  Cc: Thomas Gleixner <tglx@linutronix.de>
  Cc: linux-efi@vger.kernel.org
  Link: https://lkml.kernel.org/r/20191224132909.102540-4-ardb@kernel.org
  Signed-off-by: Ingo Molnar <mingo@kernel.org>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* kvm: x86: Host feature SSBD doesn't imply guest feature SPEC_CTRL_SSBD (Jim Mattson, 2020-04-28; 1 file changed, -1/+2)
  commit 396d2e878f92ec108e4293f1c77ea3bc90b414ff upstream.

  The host reports support for the synthetic feature X86_FEATURE_SSBD when any of the three following hardware features are set:

      CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31]
      CPUID.80000008H:EBX.AMD_SSBD[bit 24]
      CPUID.80000008H:EBX.VIRT_SSBD[bit 25]

  Either of the first two hardware features implies the existence of the IA32_SPEC_CTRL MSR, but CPUID.80000008H:EBX.VIRT_SSBD[bit 25] does not. Therefore, CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31] should only be set in the guest if CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31] or CPUID.80000008H:EBX.AMD_SSBD[bit 24] is set on the host.

  Fixes: 0c54914d0c52a ("KVM: x86: use Intel speculation bugs and features as derived in generic x86 code")
  Signed-off-by: Jim Mattson <jmattson@google.com>
  Reviewed-by: Jacob Xu <jacobhxu@google.com>
  Reviewed-by: Peter Shier <pshier@google.com>
  Cc: Paolo Bonzini <pbonzini@redhat.com>
  Reported-by: Eric Biggers <ebiggers@kernel.org>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  [bwh: Backported to 3.16: adjust indentation]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* powerpc/irq: fix stack overflow verification (Christophe Leroy, 2020-04-28; 1 file changed, -2/+2)
  commit 099bc4812f09155da77eeb960a983470249c9ce1 upstream.

  Before commit 0366a1c70b89 ("powerpc/irq: Run softirqs off the top of the irq stack"), check_stack_overflow() was called by do_IRQ(), before switching to the irq stack. In that commit, do_IRQ() was renamed __do_irq(), and is now executing on the irq stack, so check_stack_overflow() has just become almost useless.

  Move check_stack_overflow() call in do_IRQ() to do the check while still on the current stack.

  Fixes: 0366a1c70b89 ("powerpc/irq: Run softirqs off the top of the irq stack")
  Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  Link: https://lore.kernel.org/r/e033aa8116ab12b7ca9a9c75189ad0741e3b9b5f.1575872340.git.christophe.leroy@c-s.fr
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

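  [Editor's note: a paraphrased sketch of the ordering being restored, not the actual powerpc diff; the helper and array names are based on the powerpc code of that era but simplified. The point is that the overflow check is only meaningful while still running on the stack that might be overflowing, i.e. before the switch to the per-CPU IRQ stack.]

      void do_IRQ(struct pt_regs *regs)
      {
              check_stack_overflow();         /* still on the interrupted stack */

              /* ...then switch to the per-CPU IRQ stack and run __do_irq() */
              call_do_irq(regs, hardirq_ctx[smp_processor_id()]);
      }
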
* x86/pti/efi: broken conversion from efi to kernel page table (Pavel Tatashin, 2020-04-28; 3 files changed, -2/+15)
  In entry_64.S we have code like this:

      /* Unconditionally use kernel CR3 for do_nmi() */
      /* %rax is saved above, so OK to clobber here */
      ALTERNATIVE "jmp 2f", "movq %cr3, %rax", X86_FEATURE_KAISER
      /* If PCID enabled, NOFLUSH now and NOFLUSH on return */
      ALTERNATIVE "", "bts $63, %rax", X86_FEATURE_PCID
      pushq %rax
      /* mask off "user" bit of pgd address and 12 PCID bits: */
      andq $(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), %rax
      movq %rax, %cr3
      2:
      /* paranoidentry do_nmi, 0; without TRACE_IRQS_OFF */
      call do_nmi

  With this instruction:

      andq $(~(X86_CR3_PCID_ASID_MASK | KAISER_SHADOW_PGD_OFFSET)), %rax

  we unconditionally switch from whatever our CR3 was to the kernel page table. But in arch/x86/platform/efi/efi_64.c we temporarily set a different page table, which does not have the kernel page table at a 0x1000 offset from it. Look in efi_thunk() and efi_thunk_set_virtual_address_map().

  So, while CR3 points to the other page table, we get an NMI interrupt, and clear 0x1000 from CR3, resulting in a bogus CR3 if the 0x1000 bit was set.

  The efi page table comes from realmode/rm/trampoline_64.S:

      arch/x86/realmode/rm/trampoline_64.S
      141         .bss
      142         .balign PAGE_SIZE
      143 GLOBAL(trampoline_pgd)  .space  PAGE_SIZE

  Notice: alignment is PAGE_SIZE, so after applying KAISER_SHADOW_PGD_OFFSET, which equals PAGE_SIZE, we can get a different page table.

  But, even if we fix alignment, here the trampoline binary is later copied into dynamically allocated memory in reserve_real_mode(), so we need to fix that place as well.

  Fixes: f9a1666f97b3 ("KAISER: Kernel Address Isolation")
  Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
  Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  [bwh: Adjust the Fixes field for 3.16]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* x86/microcode/AMD: Add support for fam17h microcode loading (Tom Lendacky, 2020-04-28; 1 file changed, -0/+4)
  commit f4e9b7af0cd58dd039a0fb2cd67d57cea4889abf upstream.

  The size for the Microcode Patch Block (MPB) for an AMD family 17h processor is 3200 bytes. Add a #define for fam17h so that it does not default to 2048 bytes and fail a microcode load/update.

  Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
  Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  Reviewed-by: Borislav Petkov <bp@alien8.de>
  Link: https://lkml.kernel.org/r/20171130224640.15391.40247.stgit@tlendack-t1.amdoffice.net
  Signed-off-by: Ingo Molnar <mingo@kernel.org>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* s390: Fix unmatched preempt_disable() on exit (Ben Hutchings, 2020-02-11; 1 file changed, -1/+1)
  exit_thread_runtime_instr() may return with preemption disabled, leading to the following lockdep splat:

      BUG: sleeping function called from invalid context at kernel/locking/mutex.c:586
      in_atomic(): 1, irqs_disabled(): 0, pid: 565, name: kworker/u2:0
      no locks held by kworker/u2:0/565.
      CPU: 0 PID: 565 Comm: kworker/u2:0 Not tainted 3.16.81-00145-gafe1c874fa44 #1
             00000000025dbbd8 00000000025dbbe8 0000000000000002 0000000000000000
             00000000025dbc78 00000000025dbbf0 00000000025dbbf0 000000000098c55c
             0000000000000000 00000000025d05b8 00000000025d1590 0000000000000000
             0000000000000000 000000000000000c 00000000025dbbd8 0000000000000070
             00000000009b7220 000000000098c55c 00000000025dbbd8 00000000025dbc20
      Call Trace:
      ([<000000000098c4ce>] show_trace+0xb6/0xd8)
       [<000000000098c592>] show_stack+0xa2/0xd8
       [<0000000000992c04>] dump_stack+0xc4/0x118
       [<0000000000191e20>] __might_sleep+0x230/0x238
       [<000000000099fbb0>] mutex_lock_nested+0x48/0x3d8
       [<000000000025e33e>] perf_event_exit_task+0x36/0x398
       [<0000000000158536>] do_exit+0x3ae/0xca0
       [<0000000000175826>] ____call_usermodehelper+0x136/0x148
       [<00000000009a550a>] kernel_thread_starter+0x6/0xc
       [<00000000009a5504>] kernel_thread_starter+0x0/0xc

  This was fixed by commit 8d9047f8b967 "s390/runtime instrumentation: simplify task exit handling" upstream, but that won't apply here.

  Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
  Cc: Vasily Gorbik <gor@linux.ibm.com>
  Cc: Christian Borntraeger <borntraeger@de.ibm.com>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* powerpc: Fix vDSO clock_getres() (Vincenzo Frascino, 2020-02-11; 5 files changed, -5/+14)
  commit 552263456215ada7ee8700ce022d12b0cffe4802 upstream.

  clock_getres in the vDSO library has to preserve the same behaviour of posix_get_hrtimer_res().

  In particular, posix_get_hrtimer_res() does:

      sec = 0;
      ns = hrtimer_resolution;

  and hrtimer_resolution depends on the enablement of the high resolution timers that can happen either at compile or at run time.

  Fix the powerpc vdso implementation of clock_getres keeping a copy of hrtimer_resolution in vdso data and using that directly.

  Fixes: a7f290dad32e ("[PATCH] powerpc: Merge vdso's and add vdso support to 32 bits kernel")
  Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
  Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
  Acked-by: Shuah Khan <skhan@linuxfoundation.org>
  [chleroy: changed CLOCK_REALTIME_RES to CLOCK_HRTIMER_RES]
  Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  Link: https://lore.kernel.org/r/a55eca3a5e85233838c2349783bcb5164dae1d09.1575273217.git.christophe.leroy@c-s.fr
  [bwh: Backported to 3.16:
   - In asm-offsets.c, use DEFINE() instead of OFFSET()
   - Adjust context]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* xtensa: fix TLB sanity checker (Max Filippov, 2020-02-11; 1 file changed, -2/+2)
  commit 36de10c4788efc6efe6ff9aa10d38cb7eea4c818 upstream.

  Virtual and translated addresses retrieved by the xtensa TLB sanity checker must be consistent, i.e. correspond to the same state of the checked TLB entry. KASAN shadow memory is mapped dynamically using auto-refill TLB entries and thus may change TLB state between the virtual and translated address retrieval, resulting in false TLB insanity report.

  Move read_xtlb_translation close to read_xtlb_virtual to make sure that read values are consistent.

  Fixes: a99e07ee5e88 ("xtensa: check TLB sanity on return to userspace")
  Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* x86/PCI: Avoid AMD FCH XHCI USB PME# from D0 defect (Kai-Heng Feng, 2020-02-11; 1 file changed, -0/+11)
  commit 7e8ce0e2b036dbc6617184317983aea4f2c52099 upstream.

  The AMD FCH USB XHCI Controller advertises support for generating PME# while in D0. When in D0, it does signal PME# for USB 3.0 connect events, but not for USB 2.0 or USB 1.1 connect events, which means the controller doesn't wake correctly for those events.

      00:10.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller [1022:7914] (rev 20) (prog-if 30 [XHCI])
              Subsystem: Dell FCH USB XHCI Controller [1028:087e]
              Capabilities: [50] Power Management version 3
                      Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)

  Clear PCI_PM_CAP_PME_D0 in dev->pme_support to indicate the device will not assert PME# from D0 so we don't rely on it.

  Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203673
  Link: https://lore.kernel.org/r/20190902145252.32111-1-kai.heng.feng@canonical.com
  Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
  Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

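  [Editor's note: a hedged sketch of the quirk's shape, reconstructed from the description rather than copied from the patch; the device ID comes from the lspci output above, the function name is made up. dev->pme_support holds the PME capability bits shifted down by PCI_PM_CAP_PME_SHIFT, so the D0 bit is cleared after the same shift.]

      #include <linux/pci.h>

      /* Drop the D0 bit from the advertised PME support so the PCI core
       * never relies on PME# while this controller is in D0. */
      static void quirk_amd_fch_xhci_pme(struct pci_dev *dev)
      {
              dev->pme_support &= ~(PCI_PM_CAP_PME_D0 >> PCI_PM_CAP_PME_SHIFT);
              dev_info(&dev->dev, "PME# from D0 is unreliable, disabling it\n");
      }
      DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x7914, quirk_amd_fch_xhci_pme);
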
* KVM: x86: do not modify masked bits of shared MSRs (Paolo Bonzini, 2020-02-11; 1 file changed, -2/+3)
  commit de1fca5d6e0105c9d33924e1247e2f386efc3ece upstream.

  "Shared MSRs" are guest MSRs that are written to the host MSRs but keep their value until the next return to userspace. They support a mask, so that some bits keep the host value, but this mask is only used to skip an unnecessary MSR write and the value written to the MSR is always the guest MSR.

  Fix this and, while at it, do not update smsr->values[slot].curr if for whatever reason the wrmsr fails. This should only happen due to reserved bits, so the value written to smsr->values[slot].curr will not match when the user-return notifier and the host value will always be restored. However, it is untidy and in rare cases this can actually avoid spurious WRMSRs on return to userspace.

  Reviewed-by: Jim Mattson <jmattson@google.com>
  Tested-by: Jim Mattson <jmattson@google.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES (Paolo Bonzini, 2020-02-11; 1 file changed, -2/+7)
  commit cbbaa2727aa3ae9e0a844803da7cef7fd3b94f2b upstream.

  KVM does not implement MSR_IA32_TSX_CTRL, so it must not be presented to the guests. It is also confusing to have !ARCH_CAP_TSX_CTRL_MSR && !RTM && ARCH_CAP_TAA_NO: lack of MSR_IA32_TSX_CTRL suggests TSX was not hidden (it actually was), yet the value says that TSX is not vulnerable to microarchitectural data sampling. Fix both.

  Tested-by: Jim Mattson <jmattson@google.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  [bwh: Backported to 3.16: adjust context]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* x86/speculation: Fix incorrect MDS/TAA mitigation status (Waiman Long, 2020-02-11; 1 file changed, -2/+15)
  commit 64870ed1b12e235cfca3f6c6da75b542c973ff78 upstream.

  For MDS vulnerable processors with TSX support, enabling either MDS or TAA mitigations will enable the use of VERW to flush internal processor buffers at the right code path. IOW, they are either both mitigated or both not. However, if the command line options are inconsistent, the vulnerabilities sysfs files may not report the mitigation status correctly.

  For example, with only the "mds=off" option:

      vulnerabilities/mds:Vulnerable; SMT vulnerable
      vulnerabilities/tsx_async_abort:Mitigation: Clear CPU buffers; SMT vulnerable

  The mds vulnerabilities file has wrong status in this case. Similarly, the taa vulnerability file will be wrong with mds mitigation on, but taa off.

  Change taa_select_mitigation() to sync up the two mitigation statuses and have them turned off if both "mds=off" and "tsx_async_abort=off" are present.

  Update documentation to emphasize the fact that both "mds=off" and "tsx_async_abort=off" have to be specified together for processors that are affected by both TAA and MDS to be effective.

  [ bp: Massage and add kernel-parameters.txt change too. ]

  Fixes: 1b42f017415b ("x86/speculation/taa: Add mitigation for TSX Async Abort")
  Signed-off-by: Waiman Long <longman@redhat.com>
  Signed-off-by: Borislav Petkov <bp@suse.de>
  Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  Cc: "H. Peter Anvin" <hpa@zytor.com>
  Cc: Ingo Molnar <mingo@redhat.com>
  Cc: Jiri Kosina <jkosina@suse.cz>
  Cc: Jonathan Corbet <corbet@lwn.net>
  Cc: Josh Poimboeuf <jpoimboe@redhat.com>
  Cc: linux-doc@vger.kernel.org
  Cc: Mark Gross <mgross@linux.intel.com>
  Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
  Cc: Peter Zijlstra <peterz@infradead.org>
  Cc: Thomas Gleixner <tglx@linutronix.de>
  Cc: Tim Chen <tim.c.chen@linux.intel.com>
  Cc: Tony Luck <tony.luck@intel.com>
  Cc: Tyler Hicks <tyhicks@canonical.com>
  Cc: x86-ml <x86@kernel.org>
  Link: https://lkml.kernel.org/r/20191115161445.30809-2-longman@redhat.com
  [bwh: Backported to 3.16: adjust filenames]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* ARM: ux500: remove unused regulator data (Arnd Bergmann, 2020-02-11; 2 files changed, -608/+0)
  commit 37dc78d9b17c971479f742d6d08f38d8f2beb688 upstream.

  Most of the board-mop500-regulators.c file is never referenced and can simply be removed.

  Cc: Mark Brown <broonie@kernel.org>
  Signed-off-by: Arnd Bergmann <arnd@arndb.de>
  Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
  [bwh: Backported to 3.16 as dependency of commit 99c4f70df3a6 "regulator: ab8500: Remove AB8505 USB regulator"]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* powerpc: Allow 64bit VDSO __kernel_sync_dicache to work across ranges >4GB (Alastair D'Silva, 2020-02-11; 1 file changed, -2/+2)
  commit f9ec11165301982585e5e5f606739b5bae5331f3 upstream.

  When calling __kernel_sync_dicache with a size >4GB, we were masking off the upper 32 bits, so we would incorrectly flush a range smaller than intended.

  This patch replaces the 32 bit shifts with 64 bit ones, so that the full size is accounted for.

  Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  Link: https://lore.kernel.org/r/20191104023305.9581-3-alastair@au1.ibm.com
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* powerpc: Allow flush_icache_range to work across ranges >4GB (Alastair D'Silva, 2020-02-11; 1 file changed, -2/+2)
  commit 29430fae82073d39b1b881a3cd507416a56a363f upstream.

  When calling flush_icache_range with a size >4GB, we were masking off the upper 32 bits, so we would incorrectly flush a range smaller than intended.

  This patch replaces the 32 bit shifts with 64 bit ones, so that the full size is accounted for.

  Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
  Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  Link: https://lore.kernel.org/r/20191104023305.9581-2-alastair@au1.ibm.com
  [bwh: Backported to 3.16: adjust context]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

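  [Editor's note: the bug class behind the two cache-flush fixes above is easy to demonstrate in isolation. The standalone C program below is illustrative only (numbers chosen arbitrarily, nothing to do with the ppc64 assembly): doing the length arithmetic in 32 bits silently truncates a range larger than 4 GiB, so only part of it would be flushed.]

      #include <stdint.h>
      #include <stdio.h>

      int main(void)
      {
              uint64_t start = 0, stop = 5ULL << 30;  /* a 5 GiB range */
              unsigned int line = 128;                /* cache line size */

              uint32_t bad  = (uint32_t)(stop - start) / line; /* truncated */
              uint64_t good = (stop - start) / line;

              printf("32-bit count: %u lines\n", bad);
              printf("64-bit count: %llu lines\n", (unsigned long long)good);
              return 0;
      }
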
* ARM: tegra: Fix FLOW_CTLR_HALT register clobbering by tegra_resume() (Dmitry Osipenko, 2020-02-11; 1 file changed, -3/+3)
  commit d70f7d31a9e2088e8a507194354d41ea10062994 upstream.

  There is an unfortunate typo in the code that results in writing to FLOW_CTLR_HALT instead of FLOW_CTLR_CSR.

  Acked-by: Peter De Schrijver <pdeschrijver@nvidia.com>
  Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
  Signed-off-by: Thierry Reding <treding@nvidia.com>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* x86/ioapic: Prevent inconsistent state when moving an interrupt (Thomas Gleixner, 2020-02-11; 1 file changed, -3/+6)
  commit df4393424af3fbdcd5c404077176082a8ce459c4 upstream.

  There is an issue with threaded interrupts which are marked ONESHOT and using the fasteoi handler:

      if (IS_ONESHOT())
        mask_irq();
      ....
      cond_unmask_eoi_irq()
        chip->irq_eoi();
          if (setaffinity_pending) {
             mask_ioapic();
             ...
             move_affinity();
             unmask_ioapic();
          }

  So if setaffinity is pending the interrupt will be moved and then unconditionally unmasked at the ioapic level, which is wrong in two aspects:

  1) It should be kept masked up to the point where the threaded handler finished.

  2) The physical chip state and the software masked state are inconsistent

  Guard both the mask and the unmask with a check for the software masked state. If the line is marked masked then the ioapic line is also masked, so both mask_ioapic() and unmask_ioapic() can be skipped safely.

  Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
  Cc: Linus Torvalds <torvalds@linux-foundation.org>
  Cc: Peter Zijlstra <peterz@infradead.org>
  Cc: Sebastian Siewior <bigeasy@linutronix.de>
  Fixes: 3aa551c9b4c4 ("genirq: add threaded interrupt handler support")
  Link: https://lkml.kernel.org/r/20191017101938.321393687@linutronix.de
  Signed-off-by: Ingo Molnar <mingo@kernel.org>
  [bwh: Backported to 3.16: Keep using {,un}mask_ioapic_irq()]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* ARM: dts: s3c64xx: Fix init order of clock providers (Lihua Yao, 2020-02-11; 2 files changed, -0/+8)
  commit d60d0cff4ab01255b25375425745c3cff69558ad upstream.

  fin_pll is the parent of clock-controller@7e00f000; specify the dependency to ensure proper initialization order of clock providers.

  without this patch:

      [    0.000000] S3C6410 clocks: apll = 0, mpll = 0
      [    0.000000]  epll = 0, arm_clk = 0

  with this patch:

      [    0.000000] S3C6410 clocks: apll = 532000000, mpll = 532000000
      [    0.000000]  epll = 24000000, arm_clk = 532000000

  Fixes: 3f6d439f2022 ("clk: reverse default clk provider initialization order in of_clk_init()")
  Signed-off-by: Lihua Yao <ylhuajnu@outlook.com>
  Reviewed-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
  Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* x86: Treat R_X86_64_PLT32 as R_X86_64_PC32 (H.J. Lu, 2020-02-11; 2 files changed, -0/+2)
  commit b21ebf2fb4cde1618915a97cc773e287ff49173e upstream.

  On i386, there are 2 types of PLTs, PIC and non-PIC. PIE and shared objects must use PIC PLT. To use PIC PLT, you need to load _GLOBAL_OFFSET_TABLE_ into EBX first. There is no need for that on x86-64 since x86-64 uses PC-relative PLT.

  On x86-64, for 32-bit PC-relative branches, we can generate PLT32 relocation, instead of PC32 relocation, which can also be used as a marker for 32-bit PC-relative branches. Linker can always reduce PLT32 relocation to PC32 if function is defined locally. Local functions should use PC32 relocation. As far as Linux kernel is concerned, R_X86_64_PLT32 can be treated the same as R_X86_64_PC32 since Linux kernel doesn't use PLT.

  R_X86_64_PLT32 for 32-bit PC-relative branches has been enabled in binutils master branch which will become binutils 2.31.

  [ hjl is working on having better documentation on this all, but a few more notes from him:

    "PLT32 relocation is used as marker for PC-relative branches. Because of EBX, it looks odd to generate PLT32 relocation on i386 when EBX doesn't have GOT.

    As for symbol resolution, PLT32 and PC32 relocations are almost interchangeable. But when linker sees PLT32 relocation against a protected symbol, it can resolved locally at link-time since it is used on a branch instruction. Linker can't do that for PC32 relocation"

    but for the kernel use, the two are basically the same, and this commit gets things building and working with the current binutils master - Linus ]

  Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  [Woody Suwalski: Backported to 3.16]
  Signed-off-by: Woody Suwalski <terraluna977@gmail.com>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* locking/x86: Remove the unused atomic_inc_short() methd (Dmitry Vyukov, 2020-01-11; 2 files changed, -15/+1)
  commit 31b35f6b4d5285a311e10753f4eb17304326b211 upstream.

  It is completely unused and implemented only on x86. Remove it.

  Suggested-by: Mark Rutland <mark.rutland@arm.com>
  Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
  Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
  Cc: Andrew Morton <akpm@linux-foundation.org>
  Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
  Cc: H. Peter Anvin <hpa@zytor.com>
  Cc: Linus Torvalds <torvalds@linux-foundation.org>
  Cc: Peter Zijlstra <peterz@infradead.org>
  Cc: Thomas Gleixner <tglx@linutronix.de>
  Link: http://lkml.kernel.org/r/20170526172900.91058-1-dvyukov@google.com
  Signed-off-by: Ingo Molnar <mingo@kernel.org>
  [bwh: Backported to 3.16 because this function is broken after "x86/atomic: Fix smp_mb__{before,after}_atomic()":
   - Adjust context]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* locking,x86: Kill atomic_or_long() (Peter Zijlstra, 2020-01-11; 1 file changed, -15/+0)
  commit f6b4ecee0eb7bfa66ae8d5652105ed4da53209a3 upstream.

  There are no users, kill it.

  Signed-off-by: Peter Zijlstra <peterz@infradead.org>
  Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
  Cc: Linus Torvalds <torvalds@linux-foundation.org>
  Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  Link: http://lkml.kernel.org/r/20140508135851.768177189@infradead.org
  Signed-off-by: Ingo Molnar <mingo@kernel.org>
  [bwh: Backported to 3.16 because this function is broken after "x86/atomic: Fix smp_mb__{before,after}_atomic()"]
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

* x86/atomic: Fix smp_mb__{before,after}_atomic() (Peter Zijlstra, 2020-01-11; 3 files changed, -10/+10)
  commit 69d927bba39517d0980462efc051875b7f4db185 upstream.

  Recent probing at the Linux Kernel Memory Model uncovered a 'surprise'. Strongly ordered architectures where the atomic RmW primitive implies full memory ordering and smp_mb__{before,after}_atomic() are a simple barrier() (such as x86) fail for:

      *x = 1;
      atomic_inc(u);
      smp_mb__after_atomic();
      r0 = *y;

  Because, while the atomic_inc() implies memory order, it (surprisingly) does not provide a compiler barrier. This then allows the compiler to re-order like so:

      atomic_inc(u);
      *x = 1;
      smp_mb__after_atomic();
      r0 = *y;

  Which the CPU is then allowed to re-order (under TSO rules) like:

      atomic_inc(u);
      r0 = *y;
      *x = 1;

  And this very much was not intended. Therefore strengthen the atomic RmW ops to include a compiler barrier.

  NOTE: atomic_{or,and,xor} and the bitops already had the compiler barrier.

  Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
  Cc: Linus Torvalds <torvalds@linux-foundation.org>
  Cc: Peter Zijlstra <peterz@infradead.org>
  Cc: Thomas Gleixner <tglx@linutronix.de>
  Signed-off-by: Ingo Molnar <mingo@kernel.org>
  Signed-off-by: Jari Ruusu <jari.ruusu@gmail.com>
  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  Signed-off-by: Ben Hutchings <ben@decadent.org.uk>

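  [Editor's note: a hedged sketch of what "include a compiler barrier" means here, not the kernel's actual atomic.h macros. On x86 the LOCK-prefixed read-modify-write already gives the CPU-level ordering; adding a "memory" clobber to the inline asm is what stops the compiler from moving surrounding accesses across it.]

      /* Simplified illustration of a LOCK-prefixed RMW that also acts as
       * a compiler barrier thanks to the "memory" clobber. */
      static inline void my_atomic_inc(int *v)
      {
              asm volatile("lock incl %0"
                           : "+m" (*v)
                           :
                           : "memory");       /* compiler barrier */
      }
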