summaryrefslogtreecommitdiffstats
path: root/arch
Commit message (Collapse)AuthorAgeFilesLines
* arm64: Remove unimplemented syscall log messageMichael Weiser2018-06-161-8/+0
| | | | | | | | | | | | | | | | | | commit 1962682d2b2fbe6cfa995a85c53c069fadda473e upstream. Stop printing a (ratelimited) kernel message for each instance of an unimplemented syscall being called. Userland making an unimplemented syscall is not necessarily misbehaviour and to be expected with a current userland running on an older kernel. Also, the current message looks scary to users but does not actually indicate a real problem nor help them narrow down the cause. Just rely on sys_ni_syscall() to return -ENOSYS. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Michael Weiser <michael.weiser@gmx.de> Signed-off-by: Will Deacon <will.deacon@arm.com> [bwh: Backported to 3.16: Deleted code was slightly different] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* arm64: Disable unhandled signal log messages by defaultMichael Weiser2018-06-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 5ee39a71fd89ab7240c5339d04161c44a8e03269 upstream. aarch64 unhandled signal kernel messages are very verbose, suggesting them to be more of a debugging aid: sigsegv[33]: unhandled level 2 translation fault (11) at 0x00000000, esr 0x92000046, in sigsegv[400000+71000] CPU: 1 PID: 33 Comm: sigsegv Tainted: G W 4.15.0-rc3+ #3 Hardware name: linux,dummy-virt (DT) pstate: 60000000 (nZCv daif -PAN -UAO) pc : 0x4003f4 lr : 0x4006bc sp : 0000fffffe94a060 x29: 0000fffffe94a070 x28: 0000000000000000 x27: 0000000000000000 x26: 0000000000000000 x25: 0000000000000000 x24: 00000000004001b0 x23: 0000000000486ac8 x22: 00000000004001c8 x21: 0000000000000000 x20: 0000000000400be8 x19: 0000000000400b30 x18: 0000000000484728 x17: 000000000865ffc8 x16: 000000000000270f x15: 00000000000000b0 x14: 0000000000000002 x13: 0000000000000001 x12: 0000000000000000 x11: 0000000000000000 x10: 0008000020008008 x9 : 000000000000000f x8 : ffffffffffffffff x7 : 0004000000000000 x6 : ffffffffffffffff x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000004003e4 x2 : 0000fffffe94a1e8 x1 : 000000000000000a x0 : 0000000000000000 Disable them by default, so they can be enabled using /proc/sys/debug/exception-trace. Signed-off-by: Michael Weiser <michael.weiser@gmx.de> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* arm64: do not use print_symbol()Sergey Senozhatsky2018-06-161-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 4ef7963843d3243260aa335dfb9cb2fede06aacf upstream. print_symbol() is a very old API that has been obsoleted by %pS format specifier in a normal printk() call. Replace print_symbol() with a direct printk("%pS") call. Link: http://lkml.kernel.org/r/20171211125025.2270-3-sergey.senozhatsky@gmail.com To: Andrew Morton <akpm@linux-foundation.org> To: Russell King <linux@armlinux.org.uk> To: Catalin Marinas <catalin.marinas@arm.com> To: Mark Salter <msalter@redhat.com> To: Tony Luck <tony.luck@intel.com> To: David Howells <dhowells@redhat.com> To: Yoshinori Sato <ysato@users.sourceforge.jp> To: Guan Xuetao <gxt@mprc.pku.edu.cn> To: Borislav Petkov <bp@alien8.de> To: Greg Kroah-Hartman <gregkh@linuxfoundation.org> To: Thomas Gleixner <tglx@linutronix.de> To: Peter Zijlstra <peterz@infradead.org> To: Vineet Gupta <vgupta@synopsys.com> To: Fengguang Wu <fengguang.wu@intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Petr Mladek <pmladek@suse.com> Cc: LKML <linux-kernel@vger.kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linux-am33-list@redhat.com Cc: linux-sh@vger.kernel.org Cc: linux-edac@vger.kernel.org Cc: x86@kernel.org Cc: linux-snps-arc@lists.infradead.org Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> [pmladek@suse.com: updated commit message] Signed-off-by: Petr Mladek <pmladek@suse.com> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* arm64: traps: Don't print stack or raw PC/LR values in backtracesWill Deacon2018-06-162-50/+4
| | | | | | | | | | | | | | | | | | commit a25ffd3a6302a67814280274d8f1aa4ae2ea4b59 upstream. Printing raw pointer values in backtraces has potential security implications and are of questionable value anyway. This patch follows x86's lead and removes the "Exception stack:" dump from kernel backtraces, as well as converting PC/LR values to symbols such as "sysrq_handle_crash+0x20/0x30". Tested-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com> [bwh: Backported to 3.16: - Deleted code in dump_mem() and dump_backtrace_entry() is a bit different - Leave dump_backtrace() unchanged, since it doesn't use dump_mem()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* arm64: remove __die()'s stack dumpMark Rutland2018-06-161-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit c5bc503cbeee8586395aa541d2b53c69c3dd6930 upstream. Our __die() implementation tries to dump the stack memory, in addition to a backtrace, which is problematic. For contemporary 16K stacks, this can be a lot of data, which can take a long time to dump, and can push other useful context out of the kernel's printk ringbuffer (and/or a user's scrollback buffer on an attached console). Additionally, the code implicitly assumes that the SP is on the task's stack, and tries to dump everything between the SP and the highest task stack address. When the SP points at an IRQ stack (or is corrupted), this makes the kernel attempt to dump vast amounts of VA space. With vmap'd stacks, this may result in erroneous accesses to peripherals. This patch removes the memory dump, leaving us to rely on the backtrace, and other means of dumping stack memory such as kdump. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* x86/speculation: Add <asm/msr-index.h> dependencyPeter Zijlstra2018-06-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit ea00f301285ea2f07393678cd2b6057878320c9d upstream. Joe Konno reported a compile failure resulting from using an MSR without inclusion of <asm/msr-index.h>, and while the current code builds fine (by accident) this needs fixing for future patches. Reported-by: Joe Konno <joe.konno@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arjan@linux.intel.com Cc: bp@alien8.de Cc: dan.j.williams@intel.com Cc: dave.hansen@linux.intel.com Cc: dwmw2@infradead.org Cc: dwmw@amazon.co.uk Cc: gregkh@linuxfoundation.org Cc: hpa@zytor.com Cc: jpoimboe@redhat.com Cc: linux-tip-commits@vger.kernel.org Cc: luto@kernel.org Fixes: 20ffa1caecca ("x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) support") Link: http://lkml.kernel.org/r/20180213132819.GJ25201@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* powerpc/pseries: Add empty update_numa_cpu_lookup_table() for NUMA=nCorentin Labbe2018-06-161-0/+3
| | | | | | | | | | | | | | | | | commit c1e150ceb61e4a585bad156da15c33bfe89f5858 upstream. When CONFIG_NUMA is not set, the build fails with: arch/powerpc/platforms/pseries/hotplug-cpu.c:335:4: error: déclaration implicite de la fonction « update_numa_cpu_lookup_table » So we have to add update_numa_cpu_lookup_table() as an empty function when CONFIG_NUMA is not set. Fixes: 1d9a090783be ("powerpc/numa: Invalidate numa_cpu_lookup_table on cpu remove") Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* ARM: mvebu: Fix broken PL310_ERRATA_753970 selectsUlf Magnusson2018-06-161-2/+2
| | | | | | | | | | | | | | | | | | | commit 8aa36a8dcde3183d84db7b0d622ffddcebb61077 upstream. The MACH_ARMADA_375 and MACH_ARMADA_38X boards select ARM_ERRATA_753970, but it was renamed to PL310_ERRATA_753970 by commit fa0ce4035d48 ("ARM: 7162/1: errata: tidy up Kconfig options for PL310 errata workarounds"). Fix the selects to use the new name. Discovered with the https://github.com/ulfalizer/Kconfiglib/blob/master/examples/list_undefined.py script. Fixes: fa0ce4035d48 ("ARM: 7162/1: errata: tidy up Kconfig options for PL310 errata workarounds" Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* powerpc/numa: Invalidate numa_cpu_lookup_table on cpu removeNathan Fontenot2018-06-163-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | commit 1d9a090783bef19fe8cdec878620d22f05191316 upstream. When DLPAR removing a CPU, the unmapping of the cpu from a node in unmap_cpu_from_node() should also invalidate the CPUs entry in the numa_cpu_lookup_table. There is not a guarantee that on a subsequent DLPAR add of the CPU the associativity will be the same and thus could be in a different node. Invalidating the entry in the numa_cpu_lookup_table causes the associativity to be read from the device tree at the time of the add. The current behavior of not invalidating the CPUs entry in the numa_cpu_lookup_table can result in scenarios where the the topology layout of CPUs in the partition does not match the device tree or the topology reported by the HMC. This bug looks like it was introduced in 2004 in the commit titled "ppc64: cpu hotplug notifier for numa", which is 6b15e4e87e32 in the linux-fullhist tree. Hence tag it for all stable releases. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* arm64: KVM: Increment PC after handling an SMC trapMarc Zyngier2018-06-161-0/+9
| | | | | | | | | | | | | | | | | | | commit f5115e8869e1dfafac0e414b4f1664f3a84a4683 upstream. When handling an SMC trap, the "preferred return address" is set to that of the SMC, and not the next PC (which is a departure from the behaviour of an SMC that isn't trapped). Increment PC in the handler, as the guest is otherwise forever stuck... Fixes: acfb3b883f6d ("arm64: KVM: Fix SMCCC handling of unimplemented SMC/HVC calls") Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* arm: KVM: Fix SMCCC handling of unimplemented SMC/HVC callsMarc Zyngier2018-06-161-2/+11
| | | | | | | | | | | | | | | | | | commit 20e8175d246e9f9deb377f2784b3e7dfb2ad3e86 upstream. KVM doesn't follow the SMCCC when it comes to unimplemented calls, and inject an UNDEF instead of returning an error. Since firmware calls are now used for security mitigation, they are becoming more common, and the undef is counter productive. Instead, let's follow the SMCCC which states that -1 must be returned to the caller when getting an unknown function number. Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> [bwh: Backported to 3.16: Use vcpu_reg() instead of vcpu_set_reg()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* x86/xen: init %gs very early to avoid page faults with stack protectorJuergen Gross2018-06-161-0/+15
| | | | | | | | | | | | | | | | | commit 4f277295e54c5b7340e48efea3fc5cc21a2872b7 upstream. When running as Xen pv guest %gs is initialized some time after C code is started. Depending on stack protector usage this might be too late, resulting in page faults. So setup %gs and MSR_GS_BASE in assembly code already. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Tested-by: Chris Patterson <cjp256@gmail.com> Signed-off-by: Juergen Gross <jgross@suse.com> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* MIPS: TXx9: use IS_BUILTIN() for CONFIG_LEDS_CLASSMatt Redfearn2018-06-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 0cde5b44a30f1daaef1c34e08191239dc63271c4 upstream. When commit b27311e1cace ("MIPS: TXx9: Add RBTX4939 board support") added board support for the RBTX4939, it added a call to led_classdev_register even if the LED class is built as a module. Built-in arch code cannot call module code directly like this. Commit b33b44073734 ("MIPS: TXX9: use IS_ENABLED() macro") subsequently changed the inclusion of this code to a single check that CONFIG_LEDS_CLASS is either builtin or a module, but the same issue remains. This leads to MIPS allmodconfig builds failing when CONFIG_MACH_TX49XX=y is set: arch/mips/txx9/rbtx4939/setup.o: In function `rbtx4939_led_probe': setup.c:(.init.text+0xc0): undefined reference to `of_led_classdev_register' make: *** [Makefile:999: vmlinux] Error 1 Fix this by using the IS_BUILTIN() macro instead. Fixes: b27311e1cace ("MIPS: TXx9: Add RBTX4939 board support") Signed-off-by: Matt Redfearn <matt.redfearn@mips.com> Reviewed-by: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/18544/ Signed-off-by: James Hogan <jhogan@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* KVM: PPC: Book3S PR: Fix svcpu copying with preemption enabledAlexander Graf2018-06-163-18/+12
| | | | | | | | | | | | | | | | commit 07ae5389e98c53bb9e9f308fce9c903bc3ee7720 upstream. When copying between the vcpu and svcpu, we may get scheduled away onto a different host CPU which in turn means our svcpu pointer may change. That means we need to atomically copy to and from the svcpu with preemption disabled, so that all code around it always sees a coherent state. Reported-by: Simon Guo <wei.guo.simon@gmail.com> Fixes: 3d3319b45eea ("KVM: PPC: Book3S: PR: Enable interrupts earlier") Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* s390: fix handling of -1 in set{,fs}[gu]id16 syscallsEugene Syromiatnikov2018-06-161-4/+4
| | | | | | | | | | | | | | | | | commit 6dd0d2d22aa363fec075cb2577ba273ac8462e94 upstream. For some reason, the implementation of some 16-bit ID system calls (namely, setuid16/setgid16 and setfsuid16/setfsgid16) used type cast instead of low2highgid/low2highuid macros for converting [GU]IDs, which led to incorrect handling of value of -1 (which ought to be considered invalid). Discovered by strace test suite. Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* alpha: fix crash if pthread_create races with signal deliveryMikulas Patocka2018-06-161-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 21ffceda1c8b3807615c40d440d7815e0c85d366 upstream. On alpha, a process will crash if it attempts to start a thread and a signal is delivered at the same time. The crash can be reproduced with this program: https://cygwin.com/ml/cygwin/2014-11/msg00473.html The reason for the crash is this: * we call the clone syscall * we go to the function copy_process * copy process calls copy_thread_tls, it is a wrapper around copy_thread * copy_thread sets the tls pointer: childti->pcb.unique = regs->r20 * copy_thread sets regs->r20 to zero * we go back to copy_process * copy process checks "if (signal_pending(current))" and returns -ERESTARTNOINTR * the clone syscall is restarted, but this time, regs->r20 is zero, so the new thread is created with zero tls pointer * the new thread crashes in start_thread when attempting to access tls The comment in the code says that setting the register r20 is some compatibility with OSF/1. But OSF/1 doesn't use the CLONE_SETTLS flag, so we don't have to zero r20 if CLONE_SETTLS is set. This patch fixes the bug by zeroing regs->r20 only if CLONE_SETTLS is not set. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* alpha: fix reboot on Avanti platformMikulas Patocka2018-06-161-1/+2
| | | | | | | | | | | commit 55fc633c41a08ce9244ff5f528f420b16b1e04d6 upstream. We need to define NEED_SRM_SAVE_RESTORE on the Avanti, otherwise we get machine check exception when attempting to reboot the machine. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* MIPS: Fix clean of vmlinuz.{32,ecoff,bin,srec}James Hogan2018-06-161-1/+5
| | | | | | | | | | | | | | | commit 5f2483eb2423152445b39f2db59d372f523e664e upstream. Make doesn't expand shell style "vmlinuz.{32,ecoff,bin,srec}" to the 4 separate files, so none of these files get cleaned up by make clean. List the files separately instead. Fixes: ec3352925b74 ("MIPS: Remove all generated vmlinuz* files on "make clean"") Signed-off-by: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/18491/ Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* mn10300/misalignment: Use SIGSEGV SEGV_MAPERR to report a failed user copyEric W. Biederman2018-06-161-1/+1
| | | | | | | | | | | | | | | | | commit 6ac1dc736b323011a55ecd1fc5897c24c4f77cbd upstream. Setting si_code to 0 is the same a setting si_code to SI_USER which is definitely not correct. With si_code set to SI_USER si_pid and si_uid will be copied to userspace instead of si_addr. Which is very wrong. So fix this by using a sensible si_code (SEGV_MAPERR) for this failure. Fixes: b920de1b77b7 ("mn10300: add the MN10300/AM33 architecture to the kernel") Cc: David Howells <dhowells@redhat.com> Cc: Masakazu Urade <urade.masakazu@jp.panasonic.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* signal/openrisc: Fix do_unaligned_access to send the proper signalEric W. Biederman2018-06-161-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | commit 500d58300571b6602341b041f97c082a461ef994 upstream. While reviewing the signal sending on openrisc the do_unaligned_access function stood out because it is obviously wrong. A comment about an si_code set above when actually si_code is never set. Leading to a random si_code being sent to userspace in the event of an unaligned access. Looking further SIGBUS BUS_ADRALN is the proper pair of signal and si_code to send for an unaligned access. That is what other architectures do and what is required by posix. Given that do_unaligned_access is broken in a way that no one can be relying on it on openrisc fix the code to just do the right thing. Fixes: 769a8a96229e ("OpenRISC: Traps") Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: openrisc@lists.librecores.org Acked-by: Stafford Horne <shorne@gmail.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* crypto: hash - annotate algorithms taking optional keyEric Biggers2018-06-163-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit a208fa8f33031b9e0aba44c7d1b7e68eb0cbd29e upstream. We need to consistently enforce that keyed hashes cannot be used without setting the key. To do this we need a reliable way to determine whether a given hash algorithm is keyed or not. AF_ALG currently does this by checking for the presence of a ->setkey() method. However, this is actually slightly broken because the CRC-32 algorithms implement ->setkey() but can also be used without a key. (The CRC-32 "key" is not actually a cryptographic key but rather represents the initial state. If not overridden, then a default initial state is used.) Prepare to fix this by introducing a flag CRYPTO_ALG_OPTIONAL_KEY which indicates that the algorithm has a ->setkey() method, but it is not required to be called. Then set it on all the CRC-32 algorithms. The same also applies to the Adler-32 implementation in Lustre. Also, the cryptd and mcryptd templates have to pass through the flag from their underlying algorithm. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> [bwh: Backported to 3.16: - Drop changes to nonexistent drivers - There's no CRYPTO_ALG_INTERNAL flag - Adjust filenames] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* arm: spear13xx: Fix spics gpio controller's warningViresh Kumar2018-06-161-1/+1
| | | | | | | | | | | | | | | | commit f8975cb1b8a36d0839b6365235778dd9df1d04ca upstream. This fixes the following warning by also sending the flags argument for gpio controllers: Property 'cs-gpios', cell 6 is not a phandle reference in /ahb/apb/spi@e0100000 Fixes: 8113ba917dfa ("ARM: SPEAr: DT: Update device nodes") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* arm: spear13xx: Fix dmas cellsViresh Kumar2018-06-162-5/+5
| | | | | | | | | | | | | | | commit cdd10409914184c7eee5ae3e11beb890c9c16c61 upstream. The "dmas" cells for the designware DMA controller need to have only 3 properties apart from the phandle: request line, src master and destination master. But the commit 6e8887f60f60 updated it incorrectly while moving from platform code to DT. Fix it. Fixes: 6e8887f60f60 ("ARM: SPEAr13xx: Pass generic DW DMAC platform data from DT") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* arm: spear600: Add missing interrupt-parent of rtcViresh Kumar2018-06-161-0/+1
| | | | | | | | | | | | commit 6ffb5b4f248fe53e0361b8cbc2a523b432566442 upstream. The interrupt-parent of rtc was missing, add it. Fixes: 8113ba917dfa ("ARM: SPEAr: DT: Update device nodes") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* x86/gart: Exclude GART aperture from vmcoreJiri Bohac2018-06-162-2/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 2a3e83c6f96c513f43ce5a8c9034608ea584a255 upstream. On machines where the GART aperture is mapped over physical RAM /proc/vmcore contains the remapped range and reading it may cause hangs or reboots. In the past, the GART region was added into the resource map, implemented by commit 56dd669a138c ("[PATCH] Insert GART region into resource map") However, inserting the iomem_resource from the early GART code caused resource conflicts with some AGP drivers (bko#72201), which got avoided by reverting the patch in commit 707d4eefbdb3 ("Revert [PATCH] Insert GART region into resource map"). This revert introduced the /proc/vmcore bug. The vmcore ELF header is either prepared by the kernel (when using the kexec_file_load syscall) or by the kexec userspace (when using the kexec_load syscall). Since we no longer have the GART iomem resource, the userspace kexec has no way of knowing which region to exclude from the ELF header. Changes from v1 of this patch: Instead of excluding the aperture from the ELF header, this patch makes /proc/vmcore return zeroes in the second kernel when attempting to read the aperture region. This is done by reusing the gart_oldmem_pfn_is_ram infrastructure originally intended to exclude XEN balooned memory. This works for both, the kexec_file_load and kexec_load syscalls. [Note that the GART region is the same in the first and second kernels: regardless whether the first kernel fixed up the northbridge/bios setting and mapped the aperture over physical memory, the second kernel finds the northbridge properly configured by the first kernel and the aperture never overlaps with e820 memory because the second kernel has a fake e820 map created from the crashkernel memory regions. Thus, the second kernel keeps the aperture address/size as configured by the first kernel.] register_oldmem_pfn_is_ram can only register one callback and returns an error if the callback has been registered already. Since XEN used to be the only user of this function, it never checks the return value. Now that we have more than one user, I added a WARN_ON just in case agp, XEN, or any other future user of register_oldmem_pfn_is_ram were to step on each other's toes. Fixes: 707d4eefbdb3 ("Revert [PATCH] Insert GART region into resource map") Signed-off-by: Jiri Bohac <jbohac@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Baoquan He <bhe@redhat.com> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: David Airlie <airlied@linux.ie> Cc: yinghai@kernel.org Cc: joro@8bytes.org Cc: kexec@lists.infradead.org Cc: Borislav Petkov <bp@alien8.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Link: https://lkml.kernel.org/r/20180106010013.73suskgxm7lox7g6@dwarf.suse.cz [bwh: Backported to 3.16: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* signal/sh: Ensure si_signo is initialized in do_divide_errorEric W. Biederman2018-06-161-1/+2
| | | | | | | | | | | | | | commit 0e88bb002a9b2ee8cc3cc9478ce2dc126f849696 upstream. Set si_signo. Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: linux-sh@vger.kernel.org Fixes: 0983b31849bb ("sh: Wire up division and address error exceptions on SH-2A.") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* xtensa: fix futex_atomic_cmpxchg_inatomicMax Filippov2018-06-161-13/+10
| | | | | | | | | | | | | | commit ca47480921587ae30417dd234a9f79af188e3666 upstream. Return 0 if the operation was successful, not the userspace memory value. Check that userspace value equals passed oldval, not itself. Don't update *uval if the value wasn't read from userspace memory. This fixes process hang due to infinite loop in futex_lock_pi. It also fixes a bunch of glibc tests nptl/tst-mutexpi*. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* powerpc/64: Don't trace irqs-off at interrupt return to soft-disabled contextNicholas Piggin2018-06-161-3/+7
| | | | | | | | | | | | | | | | | | | | commit acb1feab320e38588fccc568e3767761f494976f upstream. When an interrupt is returning to a soft-disabled context (which can happen for non-maskable interrupts or synchronous interrupts), it goes through the motions of soft-disabling again, including calling TRACE_DISABLE_INTS (i.e., trace_hardirqs_off()). This is not necessary, because we must already be soft-disabled in the interrupt context, it also may be causing crashes in the irq tracing code to re-enter as an nmi. Replace it with a warning to ensure that soft-interrupts are still disabled. Fixes: 7c0482e3d055 ("powerpc/irq: Fix another case of lazy IRQ state getting out of sync") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* ARM: dts: exynos: Correct Trats2 panel reset lineSimon Shields2018-06-161-1/+1
| | | | | | | | | | | | | | | commit 1b377924841df1e13ab5b225be3a83f807a92b52 upstream. Trats2 uses gpf2-1 as the panel reset GPIO. gpy4-5 was only used on early revisions of the board. Fixes: 420ae8451a22 ("ARM: dts: exynos4412-trats2: add panel node") Signed-off-by: Simon Shields <simon@lineageos.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* x86/speculation: Correct Speculation Control microcode blacklist againDavid Woodhouse2018-06-161-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit d37fc6d360a404b208547ba112e7dabb6533c7fc upstream. Arjan points out that the Intel document only clears the 0xc2 microcode on *some* parts with CPUID 506E3 (INTEL_FAM6_SKYLAKE_DESKTOP stepping 3). For the Skylake H/S platform it's OK but for Skylake E3 which has the same CPUID it isn't (yet) cleared. So removing it from the blacklist was premature. Put it back for now. Also, Arjan assures me that the 0x84 microcode for Kaby Lake which was featured in one of the early revisions of the Intel document was never released to the public, and won't be until/unless it is also validated as safe. So those can change to 0x80 which is what all *other* versions of the doc have identified. Once the retrospective testing of existing public microcodes is done, we should be back into a mode where new microcodes are only released in batches and we shouldn't even need to update the blacklist for those anyway, so this tweaking of the list isn't expected to be a thing which keeps happening. Requested-by: Arjan van de Ven <arjan.van.de.ven@intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arjan.van.de.ven@intel.com Cc: dave.hansen@intel.com Cc: kvm@vger.kernel.org Cc: pbonzini@redhat.com Link: http://lkml.kernel.org/r/1518449255-2182-1-git-send-email-dwmw@amazon.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* x86/speculation: Update Speculation Control microcode blacklistDavid Woodhouse2018-06-161-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 1751342095f0d2b36fa8114d8e12c5688c455ac4 upstream. Intel have retroactively blessed the 0xc2 microcode on Skylake mobile and desktop parts, and the Gemini Lake 0x22 microcode is apparently fine too. We blacklisted the latter purely because it was present with all the other problematic ones in the 2018-01-08 release, but now it's explicitly listed as OK. We still list 0x84 for the various Kaby Lake / Coffee Lake parts, as that appeared in one version of the blacklist and then reverted to 0x80 again. We can change it if 0x84 is actually announced to be safe. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arjan.van.de.ven@intel.com Cc: jmattson@google.com Cc: karahmed@amazon.de Cc: kvm@vger.kernel.org Cc: pbonzini@redhat.com Cc: rkrcmar@redhat.com Cc: sironi@amazon.de Link: http://lkml.kernel.org/r/1518305967-31356-2-git-send-email-dwmw@amazon.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* x86/speculation: Use Indirect Branch Prediction Barrier in context switchTim Chen2018-06-161-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 18bf3c3ea8ece8f03b6fc58508f2dfd23c7711c7 upstream. Flush indirect branches when switching into a process that marked itself non dumpable. This protects high value processes like gpg better, without having too high performance overhead. If done naïvely, we could switch to a kernel idle thread and then back to the original process, such as: process A -> idle -> process A In such scenario, we do not have to do IBPB here even though the process is non-dumpable, as we are switching back to the same process after a hiatus. To avoid the redundant IBPB, which is expensive, we track the last mm user context ID. The cost is to have an extra u64 mm context id to track the last mm we were using before switching to the init_mm used by idle. Avoiding the extra IBPB is probably worth the extra memory for this common scenario. For those cases where tlb_defer_switch_to_init_mm() returns true (non PCID), lazy tlb will defer switch to init_mm, so we will not be changing the mm for the process A -> idle -> process A switch. So IBPB will be skipped for this case. Thanks to the reviewers and Andy Lutomirski for the suggestion of using ctx_id which got rid of the problem of mm pointer recycling. Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ak@linux.intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: linux@dominikbrodowski.net Cc: peterz@infradead.org Cc: bp@alien8.de Cc: luto@kernel.org Cc: pbonzini@redhat.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1517263487-3708-1-git-send-email-dwmw@amazon.co.uk [bwh: Backported to 3.16: Drop the optimisation for switching via the idle task, since we don't have mm_context_t::ctx_id here] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR ↵Paolo Bonzini2018-06-162-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | path as unlikely() commit 946fbbc13dce68902f64515b610eeb2a6c3d7a64 upstream. vmx_vcpu_run() and svm_vcpu_run() are large functions, and giving branch hints to the compiler can actually make a substantial cycle difference by keeping the fast path contiguous in memory. With this optimization, the retpoline-guest/retpoline-host case is about 50 cycles faster. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Jim Mattson <jmattson@google.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: KarimAllah Ahmed <karahmed@amazon.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Link: http://lkml.kernel.org/r/20180222154318.20361-3-pbonzini@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* x86/speculation: Move firmware_restrict_branch_speculation_*() from C to CPPIngo Molnar2018-06-161-13/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | commit d72f4e29e6d84b7ec02ae93088aa459ac70e733b upstream. firmware_restrict_branch_speculation_*() recently started using preempt_enable()/disable(), but those are relatively high level primitives and cause build failures on some 32-bit builds. Since we want to keep <asm/nospec-branch.h> low level, convert them to macros to avoid header hell... Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: arjan.van.de.ven@intel.com Cc: bp@alien8.de Cc: dave.hansen@intel.com Cc: jmattson@google.com Cc: karahmed@amazon.de Cc: kvm@vger.kernel.org Cc: pbonzini@redhat.com Cc: rkrcmar@redhat.com Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* KVM/x86: Remove indirect MSR op calls from SPEC_CTRLPaolo Bonzini2018-06-162-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | commit ecb586bd29c99fb4de599dec388658e74388daad upstream. Having a paravirt indirect call in the IBRS restore path is not a good idea, since we are trying to protect from speculative execution of bogus indirect branch targets. It is also slower, so use native_wrmsrl() on the vmentry path too. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Jim Mattson <jmattson@google.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: KarimAllah Ahmed <karahmed@amazon.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Fixes: d28b387fb74da95d69d2615732f50cceb38e9a4d Link: http://lkml.kernel.org/r/20180222154318.20361-2-pbonzini@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRLKarimAllah Ahmed2018-06-161-0/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit b2ac58f90540e39324e7a29a7ad471407ae0bf48 upstream. [ Based on a patch from Paolo Bonzini <pbonzini@redhat.com> ] ... basically doing exactly what we do for VMX: - Passthrough SPEC_CTRL to guests (if enabled in guest CPUID) - Save and restore SPEC_CTRL around VMExit and VMEntry only if the guest actually used it. Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: kvm@vger.kernel.org Cc: Dave Hansen <dave.hansen@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lkml.kernel.org/r/1517669783-20732-1-git-send-email-karahmed@amazon.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRLKarimAllah Ahmed2018-06-164-4/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit d28b387fb74da95d69d2615732f50cceb38e9a4d upstream. [ Based on a patch from Ashok Raj <ashok.raj@intel.com> ] Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for guests that will only mitigate Spectre V2 through IBRS+IBPB and will not be using a retpoline+IBPB based approach. To avoid the overhead of saving and restoring the MSR_IA32_SPEC_CTRL for guests that do not actually use the MSR, only start saving and restoring when a non-zero is written to it. No attempt is made to handle STIBP here, intentionally. Filtering STIBP may be added in a future patch, which may require trapping all writes if we don't want to pass it through directly to the guest. [dwmw2: Clean up CPUID bits, save/restore manually, handle reset] Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jim Mattson <jmattson@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: kvm@vger.kernel.org Cc: Dave Hansen <dave.hansen@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lkml.kernel.org/r/1517522386-18410-5-git-send-email-karahmed@amazon.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.16: - No support for nested MSR bitmaps - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIESKarimAllah Ahmed2018-06-164-3/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 28c1c9fabf48d6ad596273a11c46e0d0da3e14cd upstream. Intel processors use MSR_IA32_ARCH_CAPABILITIES MSR to indicate RDCL_NO (bit 0) and IBRS_ALL (bit 1). This is a read-only MSR. By default the contents will come directly from the hardware, but user-space can still override it. [dwmw2: The bit in kvm_cpuid_7_0_edx_x86_features can be unconditional] Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: kvm@vger.kernel.org Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lkml.kernel.org/r/1517522386-18410-4-git-send-email-karahmed@amazon.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.16: - Add mapping of the relevant CPUID word - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* KVM/x86: Add IBPB supportAshok Raj2018-06-164-1/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 15d45071523d89b3fb7372e2135fbd72f6af9506 upstream. The Indirect Branch Predictor Barrier (IBPB) is an indirect branch control mechanism. It keeps earlier branches from influencing later ones. Unlike IBRS and STIBP, IBPB does not define a new mode of operation. It's a command that ensures predicted branch targets aren't used after the barrier. Although IBRS and IBPB are enumerated by the same CPUID enumeration, IBPB is very different. IBPB helps mitigate against three potential attacks: * Mitigate guests from being attacked by other guests. - This is addressed by issing IBPB when we do a guest switch. * Mitigate attacks from guest/ring3->host/ring3. These would require a IBPB during context switch in host, or after VMEXIT. The host process has two ways to mitigate - Either it can be compiled with retpoline - If its going through context switch, and has set !dumpable then there is a IBPB in that path. (Tim's patch: https://patchwork.kernel.org/patch/10192871) - The case where after a VMEXIT you return back to Qemu might make Qemu attackable from guest when Qemu isn't compiled with retpoline. There are issues reported when doing IBPB on every VMEXIT that resulted in some tsc calibration woes in guest. * Mitigate guest/ring0->host/ring0 attacks. When host kernel is using retpoline it is safe against these attacks. If host kernel isn't using retpoline we might need to do a IBPB flush on every VMEXIT. Even when using retpoline for indirect calls, in certain conditions 'ret' can use the BTB on Skylake-era CPUs. There are other mitigations available like RSB stuffing/clearing. * IBPB is issued only for SVM during svm_free_vcpu(). VMX has a vmclear and SVM doesn't. Follow discussion here: https://lkml.org/lkml/2018/1/15/146 Please refer to the following spec for more details on the enumeration and control. Refer here to get documentation about mitigations. https://software.intel.com/en-us/side-channel-security-support [peterz: rebase and changelog rewrite] [karahmed: - rebase - vmx: expose PRED_CMD if guest has it in CPUID - svm: only pass through IBPB if guest has it in CPUID - vmx: support !cpu_has_vmx_msr_bitmap()] - vmx: support nested] [dwmw2: Expose CPUID bit too (AMD IBPB only for now as we lack IBRS) PRED_CMD is a write-only MSR] Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: kvm@vger.kernel.org Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Link: http://lkml.kernel.org/r/1515720739-43819-6-git-send-email-ashok.raj@intel.com Link: https://lkml.kernel.org/r/1517522386-18410-3-git-send-email-karahmed@amazon.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.16: - No support for nested MSR bitmaps in VMX - Use literal number for CPU feature word - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* KVM: VMX: make MSR bitmaps per-VCPUPaolo Bonzini2018-06-161-121/+107
| | | | | | | | | | | | | | | | | | | | | | | commit 904e14fb7cb96401a7dc803ca2863fd5ba32ffe6 upstream. Place the MSR bitmap in struct loaded_vmcs, and update it in place every time the x2apic or APICv state can change. This is rare and the loop can handle 64 MSRs per iteration, in a similar fashion as nested_vmx_prepare_msr_bitmap. This prepares for choosing, on a per-VM basis, whether to intercept the SPEC_CTRL and PRED_CMD MSRs. Suggested-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.16: - No support for nested MSR bitmaps - APICv support looked different - We still need to intercept the APIC_ID MSR - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* KVM: VMX: introduce alloc_loaded_vmcsPaolo Bonzini2018-06-161-13/+22
| | | | | | | | | | | | | | | commit f21f165ef922c2146cc5bdc620f542953c41714b upstream. Group together the calls to alloc_vmcs and loaded_vmcs_init. Soon we'll also allocate an MSR bitmap there. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.16: - No loaded_vmcs::shadow_vmcs field to initialise - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* KVM: nVMX: Eliminate vmcs02 poolJim Mattson2018-06-161-123/+22
| | | | | | | | | | | | | | | | | | | | | commit de3a0021a60635de96aa92713c1a31a96747d72c upstream. The potential performance advantages of a vmcs02 pool have never been realized. To simplify the code, eliminate the pool. Instead, a single vmcs02 is allocated per VCPU when the VCPU enters VMX operation. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Mark Kanda <mark.kanda@oracle.com> Reviewed-by: Ameya More <ameya.more@oracle.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.16: - No loaded_vmcs::shadow_vmcs field to initialise - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* KVM: nVMX: mark vmcs12 pages dirty on L2 exitDavid Matlack2018-06-161-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | commit c9f04407f2e0b3fc9ff7913c65fcfcb0a4b61570 upstream. The host physical addresses of L1's Virtual APIC Page and Posted Interrupt descriptor are loaded into the VMCS02. The CPU may write to these pages via their host physical address while L2 is running, bypassing address-translation-based dirty tracking (e.g. EPT write protection). Mark them dirty on every exit from L2 to prevent them from getting out of sync with dirty tracking. Also mark the virtual APIC page and the posted interrupt descriptor dirty when KVM is virtualizing posted interrupt processing. Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.16: - No nested posted interrupt support - No SMM support, so use mark_page_dirty() instead of kvm_vcpu_mark_page_dirty()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* x86/speculation: Use IBRS if available before calling into firmwareDavid Woodhouse2018-06-165-10/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit dd84441a797150dcc49298ec95c459a8891d8bb1 upstream. Retpoline means the kernel is safe because it has no indirect branches. But firmware isn't, so use IBRS for firmware calls if it's available. Block preemption while IBRS is set, although in practice the call sites already had to be doing that. Ignore hpwdt.c for now. It's taking spinlocks and calling into firmware code, from an NMI handler. I don't want to touch that with a bargepole. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: arjan.van.de.ven@intel.com Cc: bp@alien8.de Cc: dave.hansen@intel.com Cc: jmattson@google.com Cc: karahmed@amazon.de Cc: kvm@vger.kernel.org Cc: pbonzini@redhat.com Cc: rkrcmar@redhat.com Link: http://lkml.kernel.org/r/1519037457-7643-2-git-send-email-dwmw@amazon.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org> [bwh: Backported to 3.16: - x86 defines {,__}efi_call_virt() itself; update those definitions - Renumber the feature bit - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on IntelDavid Woodhouse2018-06-162-19/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 7fcae1118f5fd44a862aa5c3525248e35ee67c3b upstream. Despite the fact that all the other code there seems to be doing it, just using set_cpu_cap() in early_intel_init() doesn't actually work. For CPUs with PKU support, setup_pku() calls get_cpu_cap() after c->c_init() has set those feature bits. That resets those bits back to what was queried from the hardware. Turning the bits off for bad microcode is easy to fix. That can just use setup_clear_cpu_cap() to force them off for all CPUs. I was less keen on forcing the feature bits *on* that way, just in case of inconsistencies. I appreciate that the kernel is going to get this utterly wrong if CPU features are not consistent, because it has already applied alternatives by the time secondary CPUs are brought up. But at least if setup_force_cpu_cap() isn't being used, we might have a chance of *detecting* the lack of the corresponding bit and either panicking or refusing to bring the offending CPU online. So ensure that the appropriate feature bits are set within get_cpu_cap() regardless of how many extra times it's called. Fixes: 2961298e ("x86/cpufeatures: Clean up Spectre v2 related CPUID flags") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: karahmed@amazon.de Cc: peterz@infradead.org Cc: bp@alien8.de Link: https://lkml.kernel.org/r/1517322623-15261-1-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* x86/cpufeatures: Clean up Spectre v2 related CPUID flagsDavid Woodhouse2018-06-164-21/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 2961298efe1ea1b6fc0d7ee8b76018fa6c0bcef2 upstream. We want to expose the hardware features simply in /proc/cpuinfo as "ibrs", "ibpb" and "stibp". Since AMD has separate CPUID bits for those, use them as the user-visible bits. When the Intel SPEC_CTRL bit is set which indicates both IBRS and IBPB capability, set those (AMD) bits accordingly. Likewise if the Intel STIBP bit is set, set the AMD STIBP that's used for the generic hardware capability. Hide the rest from /proc/cpuinfo by putting "" in the comments. Including RETPOLINE and RETPOLINE_AMD which shouldn't be visible there. There are patches to make the sysfs vulnerabilities information non-readable by non-root, and the same should apply to all information about which mitigations are actually in use. Those *shouldn't* appear in /proc/cpuinfo. The feature bit for whether IBPB is actually used, which is needed for ALTERNATIVEs, is renamed to X86_FEATURE_USE_IBPB. Originally-by: Borislav Petkov <bp@suse.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ak@linux.intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1517070274-12128-2-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.16: Adjust context and numbering] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) supportDavid Woodhouse2018-06-163-1/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 20ffa1caecca4db8f79fe665acdeaa5af815a24d upstream. Expose indirect_branch_prediction_barrier() for use in subsequent patches. [ tglx: Add IBPB status to spectre_v2 sysfs file ] Co-developed-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-8-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.16: - Renumber the feature bit - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodesDavid Woodhouse2018-06-162-2/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit a5b2966364538a0e68c9fa29bc0a3a1651799035 upstream. This doesn't refuse to load the affected microcodes; it just refuses to use the Spectre v2 mitigation features if they're detected, by clearing the appropriate feature bits. The AMD CPUID bits are handled here too, because hypervisors *may* have been exposing those bits even on Intel chips, for fine-grained control of what's available. It is non-trivial to use x86_match_cpu() for this table because that doesn't handle steppings. And the approach taken in commit bd9240a18 almost made me lose my lunch. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-7-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.16: Add #include <asm/intel-family.h>] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* x86/pti: Mark constant arrays as __initconstArnd Bergmann2018-06-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | commit 4bf5d56d429cbc96c23d809a08f63cd29e1a702e upstream. I'm seeing build failures from the two newly introduced arrays that are marked 'const' and '__initdata', which are mutually exclusive: arch/x86/kernel/cpu/common.c:882:43: error: 'cpu_no_speculation' causes a section type conflict with 'e820_table_firmware_init' arch/x86/kernel/cpu/common.c:895:43: error: 'cpu_no_meltdown' causes a section type conflict with 'e820_table_firmware_init' The correct annotation is __initconst. Fixes: fec9434a12f3 ("x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Thomas Garnier <thgarnie@google.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://lkml.kernel.org/r/20180202213959.611210-1-arnd@arndb.de Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* x86/pti: Do not enable PTI on CPUs which are not vulnerable to MeltdownDavid Woodhouse2018-06-161-5/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit fec9434a12f38d3aeafeb75711b71d8a1fdef621 upstream. Also, for CPUs which don't speculate at all, don't report that they're vulnerable to the Spectre variants either. Leave the cpu_no_meltdown[] match table with just X86_VENDOR_AMD in it for now, even though that could be done with a simple comparison, on the assumption that we'll have more to add. Based on suggestions from Dave Hansen and Alan Cox. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@intel.com> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-6-git-send-email-dwmw@amazon.co.uk Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>