summaryrefslogtreecommitdiffstats
path: root/arch
Commit message (Collapse)AuthorAgeFilesLines
...
| * | | | | parisc: Disable further stack checks when panic occurs during stack checkHelge Deller2017-07-231-1/+7
| | |_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before the irq handler detects a low stack and then panics the kernel, disable further stack checks to avoid recursive panics. Reported-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
* | | | | Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds2017-07-275-23/+80
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull ARM fixes from Russell King: "Two areas addressed by these fixes: - Fixes from Dave Martin for the signal frames that were broken with certain configurations. No one noticed until recently. - More kexec fixes to ensure that the crashkernel region is correctly allocated, and a fix for the location of the device tree when several kexec kernels are loaded" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8687/1: signal: Fix unparseable iwmmxt_sigframe in uc_regspace[] ARM: 8686/1: iwmmxt: Add missing __user annotations to sigframe accessors ARM: kexec: fix failure to boot crash kernel ARM: kexec: avoid allocating crashkernel region outside lowmem
| * | | | | ARM: 8687/1: signal: Fix unparseable iwmmxt_sigframe in uc_regspace[]Dave Martin2017-07-242-17/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In kernels with CONFIG_IWMMXT=y running on non-iWMMXt hardware, the signal frame can be left partially uninitialised in such a way that userspace cannot parse uc_regspace[] safely. In particular, this means that the VFP registers cannot be located reliably in the signal frame when a multi_v7_defconfig kernel is run on the majority of platforms. The cause is that the uc_regspace[] is laid out statically based on the kernel config, but the decision of whether to save/restore the iWMMXt registers must be a runtime decision. To minimise breakage of software that may assume a fixed layout, this patch emits a dummy block of the same size as iwmmxt_sigframe, for non-iWMMXt threads. However, the magic and size of this block are now filled in to help parsers skip over it. A new DUMMY_MAGIC is defined for this purpose. It is probably legitimate (if non-portable) for userspace to manufacture its own sigframe for sigreturn, and there is no obvious reason why userspace should be required to insert a DUMMY_MAGIC block when running on non-iWMMXt hardware, when omitting it has worked just fine forever in other configurations. So in this case, sigreturn does not require this block to be present. Reported-by: Edmund Grimley-Evans <Edmund.Grimley-Evans@arm.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
| * | | | | ARM: 8686/1: iwmmxt: Add missing __user annotations to sigframe accessorsDave Martin2017-07-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | preserve_iwmmxt_context() and restore_iwmmxt_context() lack __user accessors on their arguments pointing to the user signal frame. There does not be appear to be a bug here, but this omission is inconsistent with the crunch and vfp sigframe access functions. This patch adds the annotations, for consistency. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
| * | | | | ARM: kexec: fix failure to boot crash kernelRussell King2017-07-202-5/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When kexec was converted to DTB, the dtb address was passed between machine_kexec_prepare() and machine_kexec() using a static variable. This is bad news if you load a crash kernel followed by a normal kernel or vice versa - the last loaded kernel overwrites the dtb address. This can result in kexec failures, as (eg) we try to boot the crash kernel with the last loaded dtb. For example, with: the crash kernel fails to find the dtb. Avoid this by defining a kimage architecture structure, and store the address to be passed in r2 there, which will either be the ATAGs or the dtb blob. Fixes: 4cabd1d9625c ("ARM: 7539/1: kexec: scan for dtb magic in segments") Fixes: 42d720d1731a ("ARM: kexec: Make .text R/W in machine_kexec") Reported-by: Keerthy <j-keerthy@ti.com> Tested-by: Keerthy <j-keerthy@ti.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
| * | | | | ARM: kexec: avoid allocating crashkernel region outside lowmemRussell King2017-07-201-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allocating the crashkernel region outside lowmem causes the kernel to oops while trying to kexec into the new kernel: Loading crashdump kernel... Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = edd70000 [00000000] *pgd=de19e835 Internal error: Oops: 817 [#2] SMP ARM Modules linked in: ... CPU: 0 PID: 689 Comm: sh Not tainted 4.12.0-rc3-next-20170601-04015-gc3a5a20 Hardware name: Generic DRA74X (Flattened Device Tree) task: edb32f00 task.stack: edf18000 PC is at memcpy+0x50/0x330 LR is at 0xe3c34001 pc : [<c04baf30>] lr : [<e3c34001>] psr: 800c0193 sp : edf19c2c ip : 0a000001 fp : c0553170 r10: c055316e r9 : 00000001 r8 : e3130001 r7 : e4903004 r6 : 0a000014 r5 : e3500000 r4 : e59f106c r3 : e59f0074 r2 : ffffffe8 r1 : c010fb88 r0 : 00000000 Flags: Nzcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none Control: 10c5387d Table: add7006a DAC: 00000051 Process sh (pid: 689, stack limit = 0xedf18218) Stack: (0xedf19c2c to 0xedf1a000) ... [<c04baf30>] (memcpy) from [<c010fae0>] (machine_kexec+0xa8/0x12c) [<c010fae0>] (machine_kexec) from [<c01e4104>] (__crash_kexec+0x5c/0x98) [<c01e4104>] (__crash_kexec) from [<c01e419c>] (crash_kexec+0x5c/0x68) [<c01e419c>] (crash_kexec) from [<c010c5c0>] (die+0x228/0x490) [<c010c5c0>] (die) from [<c011e520>] (__do_kernel_fault.part.0+0x54/0x1e4) [<c011e520>] (__do_kernel_fault.part.0) from [<c082412c>] (do_page_fault+0x1e8/0x400) [<c082412c>] (do_page_fault) from [<c010135c>] (do_DataAbort+0x38/0xb8) [<c010135c>] (do_DataAbort) from [<c0823584>] (__dabt_svc+0x64/0xa0) This is caused by image->control_code_page being a highmem page, so page_address(image->control_code_page) returns NULL. In any case, we don't want the control page to be a highmem page. We already limit the crash kernel region to the top of 32-bit physical memory space. Also limit it to the top of lowmem in physical space. Reported-by: Keerthy <j-keerthy@ti.com> Tested-by: Keerthy <j-keerthy@ti.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
* | | | | | Merge tag 'dma-mapping-4.13-2' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds2017-07-255-14/+41
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull dma mapping fixes from Christoph Hellwig: "split the global dma coherent pool from the per-device pool. This fixes a regression in the earlier 4.13 pull requests where the global pool would override a per-device CMA pool (Vladimir Murzin)" * tag 'dma-mapping-4.13-2' of git://git.infradead.org/users/hch/dma-mapping: ARM: NOMMU: Wire-up default DMA interface dma-coherent: introduce interface for default DMA pool
| * | | | | | ARM: NOMMU: Wire-up default DMA interfaceVladimir Murzin2017-07-201-9/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The way how default DMA pool is exposed has changed and now we need to use dedicated interface to work with it. This patch makes alloc/release operations to use such interface. Since, default DMA pool is not handled by generic code anymore we have to implement our own mmap operation. Tested-by: Andras Szemzo <sza@esh.hu> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | | | | | dma-coherent: introduce interface for default DMA poolVladimir Murzin2017-07-204-5/+5
| | |_|/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Christoph noticed [1] that default DMA pool in current form overload the DMA coherent infrastructure. In reply, Robin suggested [2] to split the per-device vs. global pool interfaces, so allocation/release from default DMA pool is driven by dma ops implementation. This patch implements Robin's idea and provide interface to allocate/release/mmap the default (aka global) DMA pool. To make it clear that existing *_from_coherent routines work on per-device pool rename them to *_from_dev_coherent. [1] https://lkml.org/lkml/2017/7/7/370 [2] https://lkml.org/lkml/2017/7/7/431 Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ralf Baechle <ralf@linux-mips.org> Suggested-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Andras Szemzo <sza@esh.hu> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
* | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2017-07-252-4/+4
|\ \ \ \ \ \ | |_|_|/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "Three bug fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/mm: set change and reference bit on lazy key enablement s390: chp: handle CRW_ERC_INIT for channel-path status change s390/perf: fix problem state detection
| * | | | | s390/mm: set change and reference bit on lazy key enablementChristian Borntraeger2017-07-131-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we enable storage keys for a guest lazily, we reset the ACC and F values. That is correct assuming that these are 0 on a clear reset and the guest obviously has not used any key setting instruction. We also zero out the change and reference bit. This is not correct as the architecture prefers over-indication instead of under-indication for the keyless->keyed transition. This patch fixes the behaviour and always sets guest change and guest reference for all guest storage keys on the keyless -> keyed switch. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | | | s390/perf: fix problem state detectionChristian Borntraeger2017-07-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The P sample bit indicates problem state and not PER. Fixes: commit a752598254 ("s390: rename struct psw_bits members") Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* | | | | | xen/x86: fix cpu hotplugJuergen Gross2017-07-231-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit dc6416f1d711eb4c1726e845d653235dcaae12e1 ("xen/x86: Call cpu_startup_entry(CPUHP_AP_ONLINE_IDLE) from xen_play_dead()") introduced an error leading to a stack overflow of the idle task when a cpu was brought offline/online many times: by calling cpu_startup_entry() instead of returning at the end of xen_play_dead() do_idle() would be entered again and again. Don't use cpu_startup_entry(), but cpuhp_online_idle() instead allowing to return from xen_play_dead(). Cc: <stable@vger.kernel.org> # 4.12 Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
* | | | | | xen/x86: Don't BUG on CPU0 offliningVitaly Kuznetsov2017-07-231-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CONFIG_BOOTPARAM_HOTPLUG_CPU0 allows to offline CPU0 but Xen HVM guests BUG() in xen_teardown_timer(). Remove the BUG_ON(), this is probably a leftover from ancient times when CPU0 hotplug was impossible, it works just fine for HVM. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>
* | | | | | Merge tag 'tty-4.13-rc2' of ↵Linus Torvalds2017-07-227-7/+7
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial fixes from Greg KH: "Here are some small tty and serial driver fixes for 4.13-rc2. Nothing huge at all, a revert of a patch that turned out to break things, a fix up for a new tty ioctl we added in 4.13-rc1 to get the uapi definition correct, and a few minor serial driver fixes for reported issues. All of these have been in linux-next for a while with no reported issues" * tag 'tty-4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: tty: Fix TIOCGPTPEER ioctl definition tty: hide unused pty_get_peer function tty: serial: lpuart: Fix the logic for detecting the 32-bit type UART serial: imx: Prevent TX buffer PIO write when a DMA has been started Revert "serial: imx-serial - move DMA buffer configuration to DT" serial: sh-sci: Uninitialized variables in sysfs files serial: st-asc: Potential error pointer dereference
| * | | | | | tty: Fix TIOCGPTPEER ioctl definitionGleb Fotengauer-Malinovskiy2017-07-177-7/+7
| | |/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This ioctl does nothing to justify an _IOC_READ or _IOC_WRITE flag because it doesn't copy anything from/to userspace to access the argument. Fixes: 54ebbfb16034 ("tty: add TIOCGPTPEER ioctl") Signed-off-by: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org> Acked-by: Aleksa Sarai <asarai@suse.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | | | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2017-07-214-21/+38
|\ \ \ \ \ \ | | |_|_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM fixes from Radim Krčmář: "A bunch of small fixes for x86" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm: x86: hyperv: avoid livelock in oneshot SynIC timers KVM: VMX: Fix invalid guest state detection after task-switch emulation x86: add MULTIUSER dependency for KVM KVM: nVMX: Disallow VM-entry in MOV-SS shadow KVM: nVMX: track NMI blocking state separately for each VMCS KVM: x86: masking out upper bits
| * | | | | kvm: x86: hyperv: avoid livelock in oneshot SynIC timersRoman Kagan2017-07-201-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the SynIC timer message delivery fails due to SINT message slot being busy, there's no point to attempt starting the timer again until we're notified of the slot being released by the guest (via EOM or EOI). Even worse, when a oneshot timer fails to deliver its message, its re-arming with an expiration time in the past leads to immediate retry of the delivery, and so on, without ever letting the guest vcpu to run and release the slot, which results in a livelock. To avoid that, only start the timer when there's no timer message pending delivery. When there is, meaning the slot is busy, the processing will be restarted upon notification from the guest that the slot is released. Signed-off-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
| * | | | | KVM: VMX: Fix invalid guest state detection after task-switch emulationWanpeng Li2017-07-201-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This can be reproduced by EPT=1, unrestricted_guest=N, emulate_invalid_state=Y or EPT=0, the trace of kvm-unit-tests/taskswitch2.flat is like below, it tries to emulate invalid guest state task-switch: kvm_exit: reason TASK_SWITCH rip 0x0 info 40000058 0 kvm_emulate_insn: 42000:0:0f 0b (0x2) kvm_emulate_insn: 42000:0:0f 0b (0x2) failed kvm_inj_exception: #UD (0x0) kvm_entry: vcpu 0 kvm_exit: reason TASK_SWITCH rip 0x0 info 40000058 0 kvm_emulate_insn: 42000:0:0f 0b (0x2) kvm_emulate_insn: 42000:0:0f 0b (0x2) failed kvm_inj_exception: #UD (0x0) ...................... It appears that the task-switch emulation updates rflags (and vm86 flag) only after the segments are loaded, causing vmx->emulation_required to be set, when in fact invalid guest state emulation is not needed. This patch fixes it by updating vmx->emulation_required after the rflags (and vm86 flag) is updated in task-switch emulation. Thanks Radim for moving the update to vmx__set_flags and adding Paolo's suggestion for the check. Suggested-by: Nadav Amit <nadav.amit@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
| * | | | | x86: add MULTIUSER dependency for KVMArnd Bergmann2017-07-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KVM tries to select 'TASKSTATS', which had additional dependencies: warning: (KVM) selects TASKSTATS which has unmet direct dependencies (NET && MULTIUSER) Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
| * | | | | KVM: nVMX: Disallow VM-entry in MOV-SS shadowJim Mattson2017-07-191-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Immediately following MOV-to-SS/POP-to-SS, VM-entry is disallowed. This check comes after the check for a valid VMCS. When this check fails, the instruction pointer should fall through to the next instruction, the ALU flags should be set to indicate VMfailValid, and the VM-instruction error should be set to 26 ("VM entry with events blocked by MOV SS"). Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
| * | | | | KVM: nVMX: track NMI blocking state separately for each VMCSPaolo Bonzini2017-07-191-10/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vmx_recover_nmi_blocking is using a cached value of the guest interruptibility info, which is stored in vmx->nmi_known_unmasked. vmx_recover_nmi_blocking is run for both normal and nested guests, so the cached value must be per-VMCS. This fixes eventinj.flat in a nested non-EPT environment. With EPT it works, because the EPT violation handler doesn't have the vmx->nmi_known_unmasked optimization (it is unnecessary because, unlike vmx_recover_nmi_blocking, it can just look at the exit qualification). Thanks to Wanpeng Li for debugging the testcase and providing an initial patch. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
| * | | | | KVM: x86: masking out upper bitsDan Carpenter2017-07-191-2/+2
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kvm_read_cr3() returns an unsigned long and gfn is a u64. We intended to mask out the bottom 5 bits but because of the type issue we mask the top 32 bits as well. I don't know if this is a real problem, but it causes static checker warnings. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
* | | | | Merge tag 'powerpc-4.13-3' of ↵Linus Torvalds2017-07-2111-34/+102
|\ \ \ \ \ | | |_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "A handful of fixes, mostly for new code: - some reworking of the new STRICT_KERNEL_RWX support to make sure we also remove executable permission from __init memory before it's freed. - a fix to some recent optimisations to the hypercall entry where we were clobbering r12, this was breaking nested guests (PR KVM). - a fix for the recent patch to opal_configure_cores(). This could break booting on bare metal Power8 boxes if the kernel was built without CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG. - .. and finally a workaround for spurious PMU interrupts on Power9 DD2. Thanks to: Nicholas Piggin, Anton Blanchard, Balbir Singh" * tag 'powerpc-4.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Mark __init memory no-execute when STRICT_KERNEL_RWX=y powerpc/mm/hash: Refactor hash__mark_rodata_ro() powerpc/mm/radix: Refactor radix__mark_rodata_ro() powerpc/64s: Fix hypercall entry clobbering r12 input powerpc/perf: Avoid spurious PMU interrupts after idle powerpc/powernv: Fix boot on Power8 bare metal due to opal_configure_cores()
| * | | | powerpc/mm: Mark __init memory no-execute when STRICT_KERNEL_RWX=yMichael Ellerman2017-07-188-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently even with STRICT_KERNEL_RWX we leave the __init text marked executable after init, which is bad. Add a hook to mark it NX (no-execute) before we free it, and implement it for radix and hash. Note that we use __init_end as the end address, not _einittext, because overlaps_kernel_text() uses __init_end, because there are additional executable sections other than .init.text between __init_begin and __init_end. Tested on radix and hash with: 0:mon> p $__init_begin *** 400 exception occurred Fixes: 1e0fc9d1eb2b ("powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | | | powerpc/mm/hash: Refactor hash__mark_rodata_ro()Michael Ellerman2017-07-181-13/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the core logic into a helper, so we can use it for changing other permissions. We also change the logic to align start down, and end up. This means calling the function with a range will expand that range to be at least 1 mmu_linear_psize page in size. We need that so we can use it on __init_begin ... __init_end which is not a full page in size. This should always work for _stext/__init_begin, because we align __init_begin to _stext + 16M in the linker script. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | | | powerpc/mm/radix: Refactor radix__mark_rodata_ro()Michael Ellerman2017-07-181-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the core logic into a helper, so we can use it for changing permissions other than _PAGE_WRITE. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | | | powerpc/64s: Fix hypercall entry clobbering r12 inputNicholas Piggin2017-07-181-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A previous optimisation incorrectly assumed the PAPR hcall does not use r12, and clobbers it upon entry. In fact it is used as an input. This can result in KVM guests crashing (observed with PR KVM). Instead of using r12 to save r13, tihs patch saves r13 in ctr. This is more costly, but not as slow as using the SPRG. Fixes: acd7d8cef0153 ("powerpc/64s: Optimize hypercall/syscall entry") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | | | powerpc/perf: Avoid spurious PMU interrupts after idleNicholas Piggin2017-07-181-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | POWER9 DD2 can see spurious PMU interrupts after state-loss idle in some conditions. A solution is to save and reload MMCR0 over state-loss idle. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Tested-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | | | powerpc/powernv: Fix boot on Power8 bare metal due to opal_configure_cores()Michael Ellerman2017-07-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit 1c0eaf0f56d6 ("powerpc/powernv: Tell OPAL about our MMU mode on POWER9"), we added additional flags to the OPAL call to configure CPUs at boot. These flags only work on Power9 firmwares, and worse can cause boot failures on Power8 machines, so we check for CPU_FTR_ARCH_300 (aka POWER9) before adding the extra flags. Unfortunately we forgot that opal_configure_cores() is called before the CPU feature checks are dynamically patched, meaning the check always returns true. We definitely need to do something to make the CPU feature checks less prone to bugs like this, but for now the minimal fix is to use early_cpu_has_feature(). Reported-and-tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Fixes: 1c0eaf0f56d6 ("powerpc/powernv: Tell OPAL about our MMU mode on POWER9") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | | | | Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2017-07-2116-38/+55
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Half of the fixes are for various build time warnings triggered by randconfig builds. Most (but not all...) were harmless. There's also: - ACPI boundary condition fixes - UV platform fixes - defconfig updates - an AMD K6 CPU init fix - a %pOF printk format related preparatory change - .. and a warning fix related to the tlb/PCID changes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/devicetree: Convert to using %pOF instead of ->full_name x86/platform/uv/BAU: Disable BAU on single hub configurations x86/platform/intel-mid: Fix a format string overflow warning x86/platform: Add PCI dependency for PUNIT_ATOM_DEBUG x86/build: Silence the build with "make -s" x86/io: Add "memory" clobber to insb/insw/insl/outsb/outsw/outsl x86/fpu/math-emu: Avoid bogus -Wint-in-bool-context warning x86/fpu/math-emu: Fix possible uninitialized variable use perf/x86: Shut up false-positive -Wmaybe-uninitialized warning x86/defconfig: Remove stale, old Kconfig options x86/ioapic: Pass the correct data to unmask_ioapic_irq() x86/acpi: Prevent out of bound access caused by broken ACPI tables x86/mm, KVM: Fix warning when !CONFIG_PREEMPT_COUNT x86/platform/uv/BAU: Fix congested_response_us not taking effect x86/cpu: Use indirect call to measure performance in init_amd_k6()
| * | | | | x86/devicetree: Convert to using %pOF instead of ->full_nameRob Herring2017-07-211-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each device node. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: devicetree@vger.kernel.org Link: http://lkml.kernel.org/r/20170718214339.7774-7-robh@kernel.org [ Clarify the error message while at it, as 'node' is ambiguous. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | x86/platform/uv/BAU: Disable BAU on single hub configurationsAndrew Banman2017-07-211-5/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The BAU confers no benefit to a UV system running with only one hub/socket. Permanently disable the BAU driver if there are less than two hubs online to avoid BAU overhead. We have observed failed boots on single-socket UV4 systems caused by BAU that are avoided with this patch. Also, while at it, consolidate initialization error blocks and fix a memory leak. Signed-off-by: Andrew Banman <abanman@hpe.com> Acked-by: Russ Anderson <rja@hpe.com> Acked-by: Mike Travis <mike.travis@hpe.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: tony.ernst@hpe.com Link: http://lkml.kernel.org/r/1500588351-78016-1-git-send-email-abanman@hpe.com [ Minor cleanups. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | x86/platform/intel-mid: Fix a format string overflow warningArnd Bergmann2017-07-201-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have space for exactly three characters for the index in "max7315_%d_base", but as GCC points out having more would cause an string overflow: arch/x86/platform/intel-mid/device_libs/platform_max7315.c: In function 'max7315_platform_data': arch/x86/platform/intel-mid/device_libs/platform_max7315.c:41:26: error: '%d' directive writing between 1 and 11 bytes into a region of size 9 [-Werror=format-overflow=] sprintf(base_pin_name, "max7315_%d_base", nr); ^~~~~~~~~~~~~~~~~ arch/x86/platform/intel-mid/device_libs/platform_max7315.c:41:26: note: directive argument in the range [-2147483647, 2147483647] arch/x86/platform/intel-mid/device_libs/platform_max7315.c:41:3: note: 'sprintf' output between 15 and 25 bytes into a destination of size 17 sprintf(base_pin_name, "max7315_%d_base", nr); This makes it use an snprintf() to truncate the string if that happened rather than overflowing the stack. In practice, this is safe, because there won't be a large number of max7315 devices in the systems, and both the format and the length are defined by the firmware interface. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-9-arnd@arndb.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | x86/platform: Add PCI dependency for PUNIT_ATOM_DEBUGArnd Bergmann2017-07-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The IOSF_MBI option requires PCI support, without it we get a harmless Kconfig warning when it gets selected by PUNIT_ATOM_DEBUG: warning: (X86_INTEL_LPSS && SND_SST_IPC_ACPI && MMC_SDHCI_ACPI && PUNIT_ATOM_DEBUG) selects IOSF_MBI which has unmet direct dependencies (PCI) This adds another dependency to avoid the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-8-arnd@arndb.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | x86/build: Silence the build with "make -s"Arnd Bergmann2017-07-201-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Every kernel build on x86 will result in some output: Setup is 13084 bytes (padded to 13312 bytes). System is 4833 kB CRC 6d35fa35 Kernel: arch/x86/boot/bzImage is ready (#2) This shuts it up, so that 'make -s' is truely silent as long as everything works. Building without '-s' should produce unchanged output. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-6-arnd@arndb.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | x86/io: Add "memory" clobber to insb/insw/insl/outsb/outsw/outslArnd Bergmann2017-07-201-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The x86 version of insb/insw/insl uses an inline assembly that does not have the target buffer listed as an output. This can confuse the compiler, leading it to think that a subsequent access of the buffer is uninitialized: drivers/net/wireless/wl3501_cs.c: In function ‘wl3501_mgmt_scan_confirm’: drivers/net/wireless/wl3501_cs.c:665:9: error: ‘sig.status’ is used uninitialized in this function [-Werror=uninitialized] drivers/net/wireless/wl3501_cs.c:668:12: error: ‘sig.cap_info’ may be used uninitialized in this function [-Werror=maybe-uninitialized] drivers/net/sb1000.c: In function 'sb1000_rx': drivers/net/sb1000.c:775:9: error: 'st[0]' is used uninitialized in this function [-Werror=uninitialized] drivers/net/sb1000.c:776:10: error: 'st[1]' may be used uninitialized in this function [-Werror=maybe-uninitialized] drivers/net/sb1000.c:784:11: error: 'st[1]' may be used uninitialized in this function [-Werror=maybe-uninitialized] I tried to mark the exact input buffer as an output here, but couldn't figure it out. As suggested by Linus, marking all memory as clobbered however is good enough too. For the outs operations, I also add the memory clobber, to force the input to be written to local variables. This is probably already guaranteed by the "asm volatile", but it can't hurt to do this for symmetry. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Borislav Petkov <bp@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Link: http://lkml.kernel.org/r/20170719125310.2487451-5-arnd@arndb.de Link: https://lkml.org/lkml/2017/7/12/605 Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | x86/fpu/math-emu: Avoid bogus -Wint-in-bool-context warningArnd Bergmann2017-07-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gcc-7.1.1 produces this warning: arch/x86/math-emu/reg_add_sub.c: In function 'FPU_add': arch/x86/math-emu/reg_add_sub.c:80:48: error: ?: using integer constants in boolean context [-Werror=int-in-bool-context] This appears to be a bug in gcc-7.1.1, and I have reported it as PR81484. The compiler suggests that code written as if (a & b ? c : d) is usually incorrect and should have been if (a & (b ? c : d)) However, in this case, we correctly write if ((a & b) ? c : d) and should not get a warning for it. This adds a dirty workaround for the problem, adding a comparison with zero inside of the macro. The warning is currently disabled in the kernel, so we may decide not to apply the patch, and instead wait for future gcc releases to fix the problem. On the other hand, it seems to be the only instance of this particular problem. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Bill Metzenthen <billm@melbpc.org.au> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-4-arnd@arndb.de Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81484 Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | x86/fpu/math-emu: Fix possible uninitialized variable useArnd Bergmann2017-07-202-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When building the kernel with "make EXTRA_CFLAGS=...", this overrides the "PARANOID" preprocessor macro defined in arch/x86/math-emu/Makefile, and we run into a build warning: arch/x86/math-emu/reg_compare.c: In function ‘compare_i_st_st’: arch/x86/math-emu/reg_compare.c:254:6: error: ‘f’ may be used uninitialized in this function [-Werror=maybe-uninitialized] This fixes the implementation to work correctly even without the PARANOID flag, and also fixes the Makefile to not use the EXTRA_CFLAGS variable but instead use the ccflags-y variable in the Makefile that is meant for this purpose. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Bill Metzenthen <billm@melbpc.org.au> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-3-arnd@arndb.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | perf/x86: Shut up false-positive -Wmaybe-uninitialized warningArnd Bergmann2017-07-201-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The intialization function checks for various failure scenarios, but unfortunately the compiler gets a little confused about the possible combinations, leading to a false-positive build warning when -Wmaybe-uninitialized is set: arch/x86/events/core.c: In function ‘init_hw_perf_events’: arch/x86/events/core.c:264:3: warning: ‘reg_fail’ may be used uninitialized in this function [-Wmaybe-uninitialized] arch/x86/events/core.c:264:3: warning: ‘val_fail’ may be used uninitialized in this function [-Wmaybe-uninitialized] pr_err(FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n", We can't actually run into this case, so this shuts up the warning by initializing the variables to a known-invalid state. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-2-arnd@arndb.de Link: https://patchwork.kernel.org/patch/9392595/ Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | x86/defconfig: Remove stale, old Kconfig optionsKrzysztof Kozlowski2017-07-202-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove old, dead Kconfig options (in order appearing in this commit): - EXPERIMENTAL is gone since v3.9; - IP_NF_TARGET_ULOG: commit d4da843e6fad ("netfilter: kill remnants of ulog targets"); - USB_LIBUSUAL: commit f61870ee6f8c ("usb: remove libusual"); Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1500526885-4341-1-git-send-email-krzk@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | x86/ioapic: Pass the correct data to unmask_ioapic_irq()Seunghun Han2017-07-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | One of the rarely executed code pathes in check_timer() calls unmask_ioapic_irq() passing irq_get_chip_data(0) as argument. That's wrong as unmask_ioapic_irq() expects a pointer to the irq data of interrupt 0. irq_get_chip_data(0) returns NULL, so the following dereference in unmask_ioapic_irq() causes a kernel panic. The issue went unnoticed in the first place because irq_get_chip_data() returns a void pointer so the compiler cannot do a type check on the argument. The code path was added for machines with broken configuration, but it seems that those machines are either not running current kernels or simply do not longer exist. Hand in irq_get_irq_data(0) as argument which provides the correct data. [ tglx: Rewrote changelog ] Fixes: 4467715a44cc ("x86/irq: Move irq_cfg.irq_2_pin into io_apic.c") Signed-off-by: Seunghun Han <kkamagui@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1500369644-45767-1-git-send-email-kkamagui@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | x86/acpi: Prevent out of bound access caused by broken ACPI tablesSeunghun Han2017-07-201-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The bus_irq argument of mp_override_legacy_irq() is used as the index into the isa_irq_to_gsi[] array. The bus_irq argument originates from ACPI_MADT_TYPE_IO_APIC and ACPI_MADT_TYPE_INTERRUPT items in the ACPI tables, but is nowhere sanity checked. That allows broken or malicious ACPI tables to overwrite memory, which might cause malfunction, panic or arbitrary code execution. Add a sanity check and emit a warning when that triggers. [ tglx: Added warning and rewrote changelog ] Signed-off-by: Seunghun Han <kkamagui@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: security@kernel.org Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: stable@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | x86/mm, KVM: Fix warning when !CONFIG_PREEMPT_COUNTRoman Kagan2017-07-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A recent commit: d6e41f1151fe ("x86/mm, KVM: Teach KVM's VMX code that CR3 isn't a constant") introduced a VM_WARN_ON(!in_atomic()) which generates false positives on every VM entry on !CONFIG_PREEMPT_COUNT kernels. Replace it with a test for preemptible(), which appears to match the original intent and works across different CONFIG_PREEMPT* variations. Signed-off-by: Roman Kagan <rkagan@virtuozzo.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Nadav Amit <namit@vmware.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Cc: linux-mm@kvack.org Fixes: d6e41f1151fe ("x86/mm, KVM: Teach KVM's VMX code that CR3 isn't a constant") Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | x86/platform/uv/BAU: Fix congested_response_us not taking effectJustin Ernst2017-07-161-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bug fix for the BAU tunable congested_cycles not being set to the user defined value. Instead of referencing a global variable when deciding on BAU shutdown, a node will reference its own tunable set value ( cong_response_us). This results in the user set tunable value congested_response_us taking effect correctly. Signed-off-by: Justin Ernst <justin.ernst@hpe.com> Acked-by: Andrew Banman <abanman@hpe.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: mike.travis@hpe.com Cc: sivanich@hpe.com Link: http://lkml.kernel.org/r/1499970803-282432-1-git-send-email-justin.ernst@hpe.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | x86/cpu: Use indirect call to measure performance in init_amd_k6()Mikulas Patocka2017-07-161-0/+1
| | |/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This old piece of code is supposed to measure the performance of indirect calls to determine if the processor is buggy or not, however the compiler optimizer turns it into a direct call. Use the OPTIMIZER_HIDE_VAR() macro to thwart the optimization, so that a real indirect call is generated. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1707110737530.8746@file01.intranet.prod.int.rdu2.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* | | | | Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2017-07-215-16/+202
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Two hw-enablement patches, two race fixes, three fixes for regressions of semantics, plus a number of tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Add proper condition to run sched_task callbacks perf/core: Fix locking for children siblings group read perf/core: Fix scheduling regression of pinned groups perf/x86/intel: Fix debug_store reset field for freq events perf/x86/intel: Add Goldmont Plus CPU PMU support perf/x86/intel: Enable C-state residency events for Apollo Lake perf symbols: Accept zero as the kernel base address Revert "perf/core: Drop kernel samples even though :u is specified" perf annotate: Fix broken arrow at row 0 connecting jmp instruction to its target perf evsel: State in the default event name if attr.exclude_kernel is set perf evsel: Fix attr.exclude_kernel setting for default cycles:p
| * | | | | perf/x86/intel: Add proper condition to run sched_task callbacksJiri Olsa2017-07-213-10/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have 2 functions using the same sched_task callback: - PEBS drain for free running counters - LBR save/store Both of them are called from intel_pmu_sched_task() and either of them can be unwillingly triggered when the other one is configured to run. Let's say there's PEBS drain configured in sched_task callback for the event, but in the callback itself (intel_pmu_sched_task()) we will also run the code for LBR save/restore, which we did not ask for, but the code in intel_pmu_sched_task() does not check for that. This can lead to extra cycles in some perf monitoring, like when we monitor PEBS event without LBR data. # perf record --no-timestamp -c 10000 -e cycles:p ./perf bench sched pipe -l 1000000 (We need PEBS, non freq/non timestamp event to enable the sched_task callback) The perf stat of cycles and msr:write_msr for above command before the change: ... Performance counter stats for './perf record --no-timestamp -c 10000 -e cycles:p \ ./perf bench sched pipe -l 1000000' (5 runs): 18,519,557,441 cycles:k 91,195,527 msr:write_msr 29.334476406 seconds time elapsed And after the change: ... Performance counter stats for './perf record --no-timestamp -c 10000 -e cycles:p \ ./perf bench sched pipe -l 1000000' (5 runs): 18,704,973,540 cycles:k 27,184,720 msr:write_msr 16.977875900 seconds time elapsed There's no affect on cycles:k because the sched_task happens with events switched off, however the msr:write_msr tracepoint counter together with almost 50% of time speedup show the improvement. Monitoring LBR event and having extra PEBS drain processing in sched_task callback showed just a little speedup, because the drain function does not do much extra work in case there is no PEBS data. Adding conditions to recognize the configured work that needs to be done in the x86_pmu's sched_task callback. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170719075247.GA27506@krava Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | perf/x86/intel: Fix debug_store reset field for freq eventsJiri Olsa2017-07-181-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's a bug in PEBs event enabling code, that prevents PEBS freq events to work properly after non freq PEBS event was run. freq events - perf_event_attr::freq set -F <freq> option of perf record PEBS events - perf_event_attr::precise_ip > 0 default for perf record Like in following example with CPU 0 busy, we expect ~10000 samples for following perf tool run: # perf record -F 10000 -C 0 sleep 1 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.640 MB perf.data (10031 samples) ] Everything's fine, but once we run non freq PEBS event like: # perf record -c 10000 -C 0 sleep 1 [ perf record: Woken up 4 times to write data ] [ perf record: Captured and wrote 1.053 MB perf.data (20061 samples) ] the freq events start to fail like this: # perf record -F 10000 -C 0 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.185 MB perf.data (40 samples) ] The issue is in non freq PEBs event initialization of debug_store reset field, which value is used to auto-reload the counter value after PEBS event drain. This value is not being used for PEBS freq events, but once we run non freq event it stays in debug_store data and screws the sample_freq counting for PEBS freq events. Setting the reset field to 0 for freq events. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170714163551.19459-1-jolsa@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * | | | | perf/x86/intel: Add Goldmont Plus CPU PMU supportKan Liang2017-07-183-0/+166
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add perf core PMU support for Intel Goldmont Plus CPU cores: - The init code is based on Goldmont. - There is a new cache event list, based on the Goldmont cache event list. - All four general-purpose performance counters support PEBS. - The first general-purpose performance counter is for reduced skid PEBS mechanism. Using :ppp to indicate the event which want to do reduced skid PEBS. - Goldmont Plus has 4-wide pipeline for Topdown Signed-off-by: Kan Liang <kan.liang@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Link: http://lkml.kernel.org/r/20170712134423.17766-1-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>