summaryrefslogtreecommitdiffstats
path: root/arch/arm64/kernel
Commit message (Collapse)AuthorAgeFilesLines
...
| | * | | | | arm64: move non-entry code out of .entry.textMark Rutland2017-08-081-44/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, cpu_switch_to and ret_from_fork both live in .entry.text, though neither form the critical path for an exception entry. In subsequent patches, we will require that code in .entry.text is part of the critical path for exception entry, for which we can assume certain properties (e.g. the presence of exception regs on the stack). Neither cpu_switch_to nor ret_from_fork will meet these requirements, so we must move them out of .entry.text. To ensure that neither are kprobed after being moved out of .entry.text, we must explicitly blacklist them, requiring a new NOKPROBE() asm helper. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
| | * | | | | arm64: consistently use bl for C exception entryMark Rutland2017-08-081-4/+8
| | | |/ / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In most cases, our exception entry assembly branches to C handlers with a BL instruction, but in cases where we do not expect to return, we use B instead. While this is correct today, it means that backtraces for fatal exceptions miss the entry assembly (as the LR is stale at the point we call C code), while non-fatal exceptions have the entry assembly in the LR. In subsequent patches, we will need the LR to be set in these cases in order to backtrace reliably. This patch updates these sites to use a BL, ensuring consistency, and preparing for backtrace rework. An ASM_BUG() is added after each of these new BLs, which both catches unexpected returns, and ensures that the LR value doesn't point to another function label. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com>
| * | | | | arm64/vdso: Support mremap() for vDSODmitry Safonov2017-08-091-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vDSO VMA address is saved in mm_context for the purpose of using restorer from vDSO page to return to userspace after signal handling. In Checkpoint Restore in Userspace (CRIU) project we place vDSO VMA on restore back to the place where it was on the dump. With the exception for x86 (where there is API to map vDSO with arch_prctl()), we move vDSO inherited from CRIU task to restoree position by mremap(). CRIU does support arm64 architecture, but kernel doesn't update context.vdso pointer after mremap(). Which results in translation fault after signal handling on restored application: https://github.com/xemul/criu/issues/288 Make vDSO code track the VMA address by supplying .mremap() fops the same way it's done for x86 and arm32 by: commit b059a453b1cf ("x86/vdso: Add mremap hook to vm_special_mapping") commit 280e87e98c09 ("ARM: 8683/1: ARM32: Support mremap() for sigpage/vDSO"). Cc: Russell King <rmk+kernel@armlinux.org.uk> Cc: linux-arm-kernel@lists.infradead.org Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Cc: Christopher Covington <cov@codeaurora.org> Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * | | | | arm64: Implement pmem API supportRobin Murphy2017-08-091-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a clean-to-point-of-persistence cache maintenance helper, and wire up the basic architectural support for the pmem driver based on it. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> [catalin.marinas@arm.com: move arch_*_pmem() functions to arch/arm64/mm/flush.c] [catalin.marinas@arm.com: change dmb(sy) to dmb(osh)] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * | | | | arm64: Handle trapped DC CVAPRobin Murphy2017-08-091-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cache clean to PoP is subject to the same access controls as to PoC, so if we are trapping userspace cache maintenance with SCTLR_EL1.UCI, we need to be prepared to handle it. To avoid getting into complicated fights with binutils about ARMv8.2 options, we'll just cheat and use the raw SYS instruction rather than the 'proper' DC alias. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * | | | | arm64: Expose DC CVAP to userspaceRobin Murphy2017-08-092-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ARMv8.2-DCPoP feature introduces persistent memory support to the architecture, by defining a point of persistence in the memory hierarchy, and a corresponding cache maintenance operation, DC CVAP. Expose the support via HWCAP and MRS emulation. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * | | | | arm64: Convert __inval_cache_range() to area-basedRobin Murphy2017-08-091-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __inval_cache_range() is already the odd one out among our data cache maintenance routines as the only remaining range-based one; as we're going to want an invalidation routine to call from C code for the pmem API, let's tweak the prototype and name to bring it in line with the clean operations, and to make its relationship with __dma_inv_area() neatly mirror that of __clean_dcache_area_poc() and __dma_clean_area(). The loop clearing the early page tables gets mildly massaged in the process for the sake of consistency. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * | | | | arm64: Abstract syscallno manipulationDave Martin2017-08-074-13/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The -1 "no syscall" value is written in various ways, shared with the user ABI in some places, and generally obscure. This patch attempts to make things a little more consistent and readable by replacing all these uses with a single #define. A couple of symbolic helpers are provided to clarify the intent further. Because the in-syscall check in do_signal() is changed from >= 0 to != NO_SYSCALL by this patch, different behaviour may be observable if syscallno is set to values less than -1 by a tracer. However, this is not different from the behaviour that is already observable if a tracer sets syscallno to a value >= __NR_(compat_)syscalls. It appears that this can cause spurious syscall restarting, but that is not a new behaviour either, and does not appear harmful. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * | | | | arm64: syscallno is secretly an int, make it officialDave Martin2017-08-075-23/+23
| | |/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The upper 32 bits of the syscallno field in thread_struct are handled inconsistently, being sometimes zero extended and sometimes sign-extended. In fact, only the lower 32 bits seem to have any real significance for the behaviour of the code: it's been OK to handle the upper bits inconsistently because they don't matter. Currently, the only place I can find where those bits are significant is in calling trace_sys_enter(), which may be unintentional: for example, if a compat tracer attempts to cancel a syscall by passing -1 to (COMPAT_)PTRACE_SET_SYSCALL at the syscall-enter-stop, it will be traced as syscall 4294967295 rather than -1 as might be expected (and as occurs for a native tracer doing the same thing). Elsewhere, reads of syscallno cast it to an int or truncate it. There's also a conspicuous amount of code and casting to bodge around the fact that although semantically an int, syscallno is stored as a u64. Let's not pretend any more. In order to preserve the stp x instruction that stores the syscall number in entry.S, this patch special-cases the layout of struct pt_regs for big endian so that the newly 32-bit syscallno field maps onto the low bits of the stored value. This is not beautiful, but benchmarking of the getpid syscall on Juno suggests indicates a minor slowdown if the stp is split into an stp x and stp w. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
* | | | | Merge branch 'x86-syscall-for-linus' of ↵Linus Torvalds2017-09-041-0/+5
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull syscall updates from Ingo Molnar: "Improve the security of set_fs(): we now check the address limit on a number of key platforms (x86, arm, arm64) before returning to user-space - without adding overhead to the typical system call fast path" * 'x86-syscall-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arm64/syscalls: Check address limit on user-mode return arm/syscalls: Check address limit on user-mode return x86/syscalls: Check address limit on user-mode return
| * | | | | arm64/syscalls: Check address limit on user-mode returnThomas Garnier2017-07-081-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ensure the address limit is a user-mode segment before returning to user-mode. Otherwise a process can corrupt kernel-mode memory and elevate privileges [1]. The set_fs function sets the TIF_SETFS flag to force a slow path on return. In the slow path, the address limit is checked to be USER_DS if needed. [1] https://bugs.chromium.org/p/project-zero/issues/detail?id=990 Signed-off-by: Thomas Garnier <thgarnie@google.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Mark Rutland <mark.rutland@arm.com> Cc: kernel-hardening@lists.openwall.com Cc: Will Deacon <will.deacon@arm.com> Cc: David Howells <dhowells@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Petr Mladek <pmladek@suse.com> Cc: Rik van Riel <riel@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Will Drewry <wad@chromium.org> Cc: linux-api@vger.kernel.org Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lkml.kernel.org/r/20170615011203.144108-3-thgarnie@google.com
* | | | | | Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds2017-09-041-0/+2
|\ \ \ \ \ \ | |_|_|_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnad: "The main RCU related changes in this cycle were: - Removal of spin_unlock_wait() - SRCU updates - RCU torture-test updates - RCU Documentation updates - Extend the sys_membarrier() ABI with the MEMBARRIER_CMD_PRIVATE_EXPEDITED variant - Miscellaneous RCU fixes - CPU-hotplug fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits) arch: Remove spin_unlock_wait() arch-specific definitions locking: Remove spin_unlock_wait() generic definitions drivers/ata: Replace spin_unlock_wait() with lock/unlock pair ipc: Replace spin_unlock_wait() with lock/unlock pair exit: Replace spin_unlock_wait() with lock/unlock pair completion: Replace spin_unlock_wait() with lock/unlock pair doc: Set down RCU's scheduling-clock-interrupt needs doc: No longer allowed to use rcu_dereference on non-pointers doc: Add RCU files to docbook-generation files doc: Update memory-barriers.txt for read-to-write dependencies doc: Update RCU documentation membarrier: Provide expedited private command rcu: Remove exports from rcu_idle_exit() and rcu_idle_enter() rcu: Add warning to rcu_idle_enter() for irqs enabled rcu: Make rcu_idle_enter() rely on callers disabling irqs rcu: Add assertions verifying blocked-tasks list rcu/tracing: Set disable_rcu_irq_enter on rcu_eqs_exit() rcu: Add TPS() protection for _rcu_barrier_trace strings rcu: Use idle versions of swait to make idle-hack clear swait: Add idle variants which don't contribute to load average ...
| * | | | | Merge branch 'for-mingo' of ↵Ingo Molnar2017-08-211-0/+2
| |\ \ \ \ \ | | |_|/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Removal of spin_unlock_wait() - SRCU updates - Torture-test updates - Documentation updates - Miscellaneous fixes - CPU-hotplug fixes - Miscellaneous non-RCU fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * | | | membarrier: Provide expedited private commandMathieu Desnoyers2017-08-171-0/+2
| | | |_|/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement MEMBARRIER_CMD_PRIVATE_EXPEDITED with IPIs using cpumask built from all runqueues for which current thread's mm is the same as the thread calling sys_membarrier. It executes faster than the non-expedited variant (no blocking). It also works on NOHZ_FULL configurations. Scheduler-wise, it requires a memory barrier before and after context switching between processes (which have different mm). The memory barrier before context switch is already present. For the barrier after context switch: * Our TSO archs can do RELEASE without being a full barrier. Look at x86 spin_unlock() being a regular STORE for example. But for those archs, all atomics imply smp_mb and all of them have atomic ops in switch_mm() for mm_cpumask(), and on x86 the CR3 load acts as a full barrier. * From all weakly ordered machines, only ARM64 and PPC can do RELEASE, the rest does indeed do smp_mb(), so there the spin_unlock() is a full barrier and we're good. * ARM64 has a very heavy barrier in switch_to(), which suffices. * PPC just removed its barrier from switch_to(), but appears to be talking about adding something to switch_mm(). So add a smp_mb__after_unlock_lock() for now, until this is settled on the PPC side. Changes since v3: - Properly document the memory barriers provided by each architecture. Changes since v2: - Address comments from Peter Zijlstra, - Add smp_mb__after_unlock_lock() after finish_lock_switch() in finish_task_switch() to add the memory barrier we need after storing to rq->curr. This is much simpler than the previous approach relying on atomic_dec_and_test() in mmdrop(), which actually added a memory barrier in the common case of switching between userspace processes. - Return -EINVAL when MEMBARRIER_CMD_SHARED is used on a nohz_full kernel, rather than having the whole membarrier system call returning -ENOSYS. Indeed, CMD_PRIVATE_EXPEDITED is compatible with nohz_full. Adapt the CMD_QUERY mask accordingly. Changes since v1: - move membarrier code under kernel/sched/ because it uses the scheduler runqueue, - only add the barrier when we switch from a kernel thread. The case where we switch from a user-space thread is already handled by the atomic_dec_and_test() in mmdrop(). - add a comment to mmdrop() documenting the requirement on the implicit memory barrier. CC: Peter Zijlstra <peterz@infradead.org> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> CC: Boqun Feng <boqun.feng@gmail.com> CC: Andrew Hunter <ahh@google.com> CC: Maged Michael <maged.michael@gmail.com> CC: gromer@google.com CC: Avi Kivity <avi@scylladb.com> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> CC: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Dave Watson <davejwatson@fb.com>
* | | | | arm64: kaslr: Adjust the offset to avoid Image across alignment boundaryCatalin Marinas2017-08-221-7/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With 16KB pages and a kernel Image larger than 16MB, the current kaslr_early_init() logic for avoiding mappings across swapper table boundaries fails since increasing the offset by kimg_sz just moves the problem to the next boundary. This patch rounds the offset down to (1 << SWAPPER_TABLE_SHIFT) if the Image crosses a PMD_SIZE boundary. Fixes: afd0e5a87670 ("arm64: kaslr: Fix up the kernel image alignment") Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* | | | | arm64: kaslr: ignore modulo offset when validating virtual displacementArd Biesheuvel2017-08-222-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the KASLR setup routine, we ensure that the early virtual mapping of the kernel image does not cover more than a single table entry at the level above the swapper block level, so that the assembler routines involved in setting up this mapping can remain simple. In this calculation we add the proposed KASLR offset to the values of the _text and _end markers, and reject it if they would end up falling in different swapper table sized windows. However, when taking the addresses of _text and _end, the modulo offset (the physical displacement modulo 2 MB) is already accounted for, and so adding it again results in incorrect results. So disregard the modulo offset from the calculation. Fixes: 08cdac619c81 ("arm64: relocatable: deal with physically misaligned ...") Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
* | | | | arm64: fpsimd: Prevent registers leaking across execDave Martin2017-08-221-0/+2
|/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are some tricky dependencies between the different stages of flushing the FPSIMD register state during exec, and these can race with context switch in ways that can cause the old task's regs to leak across. In particular, a context switch during the memset() can cause some of the task's old FPSIMD registers to reappear. Disabling preemption for this small window would be no big deal for performance: preemption is already disabled for similar scenarios like updating the FPSIMD registers in sigreturn. So, instead of rearranging things in ways that might swap existing subtle bugs for new ones, this patch just disables preemption around the FPSIMD state flushing so that races of this type can't occur here. This brings fpsimd_flush_thread() into line with other code paths. Cc: stable@vger.kernel.org Fixes: 674c242c9323 ("arm64: flush FP/SIMD state correctly after execve()") Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* | | / arm64: Use arch_timer_get_rate when trapping CNTFRQ_EL0Marc Zyngier2017-08-011-1/+1
| |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In an ideal world, CNTFRQ_EL0 always contains the timer frequency for the kernel to use. Sadly, we get quite a few broken systems where the firmware authors cannot be bothered to program that register on all CPUs, and rely on DT to provide that frequency. So when trapping CNTFRQ_EL0, make sure to return the actual rate (as known by the kernel), and not CNTFRQ_EL0. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* | | arm64: Convert to using %pOF instead of full_nameRob Herring2017-07-203-19/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each node. Signed-off-by: Rob Herring <robh@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Will Deacon <will.deacon@arm.com>
* | | arm64: traps: disable irq in die()Qiao Zhou2017-07-201-2/+6
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In current die(), the irq is disabled for __die() handle, not including the possible panic() handling. Since the log in __die() can take several hundreds ms, new irq might come and interrupt current die(). If the process calling die() holds some critical resource, and some other process scheduled later also needs it, then it would deadlock. The first panic will not be executed. So here disable irq for the whole flow of die(). Signed-off-by: Qiao Zhou <qiaozhou@asrmicro.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* | Merge tag 'pci-v4.13-changes' of ↵Linus Torvalds2017-07-081-6/+4
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: - add sysfs max_link_speed/width, current_link_speed/width (Wong Vee Khee) - make host bridge IRQ mapping much more generic (Matthew Minter, Lorenzo Pieralisi) - convert most drivers to pci_scan_root_bus_bridge() (Lorenzo Pieralisi) - mutex sriov_configure() (Jakub Kicinski) - mutex pci_error_handlers callbacks (Christoph Hellwig) - split ->reset_notify() into ->reset_prepare()/reset_done() (Christoph Hellwig) - support multiple PCIe portdrv interrupts for MSI as well as MSI-X (Gabriele Paoloni) - allocate MSI/MSI-X vector for Downstream Port Containment (Gabriele Paoloni) - fix MSI IRQ affinity pre/post/min_vecs issue (Michael Hernandez) - test INTx masking during enumeration, not at run-time (Piotr Gregor) - avoid using device_may_wakeup() for runtime PM (Rafael J. Wysocki) - restore the status of PCI devices across hibernation (Chen Yu) - keep parent resources that start at 0x0 (Ard Biesheuvel) - enable ECRC only if device supports it (Bjorn Helgaas) - restore PRI and PASID state after Function-Level Reset (CQ Tang) - skip DPC event if device is not present (Keith Busch) - check domain when matching SMBIOS info (Sujith Pandel) - mark Intel XXV710 NIC INTx masking as broken (Alex Williamson) - avoid AMD SB7xx EHCI USB wakeup defect (Kai-Heng Feng) - work around long-standing Macbook Pro poweroff issue (Bjorn Helgaas) - add Switchtec "running" status flag (Logan Gunthorpe) - fix dra7xx incorrect RW1C IRQ register usage (Arvind Yadav) - modify xilinx-nwl IRQ chip for legacy interrupts (Bharat Kumar Gogada) - move VMD SRCU cleanup after bus, child device removal (Jon Derrick) - add Faraday clock handling (Linus Walleij) - configure Rockchip MPS and reorganize (Shawn Lin) - limit Qualcomm TLP size to 2K (hardware issue) (Srinivas Kandagatla) - support Tegra MSI 64-bit addressing (Thierry Reding) - use Rockchip normal (not privileged) register bank (Shawn Lin) - add HiSilicon Kirin SoC PCIe controller driver (Xiaowei Song) - add Sigma Designs Tango SMP8759 PCIe controller driver (Marc Gonzalez) - add MediaTek PCIe host controller support (Ryder Lee) - add Qualcomm IPQ4019 support (John Crispin) - add HyperV vPCI protocol v1.2 support (Jork Loeser) - add i.MX6 regulator support (Quentin Schulz) * tag 'pci-v4.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (113 commits) PCI: tango: Add Sigma Designs Tango SMP8759 PCIe host bridge support PCI: Add DT binding for Sigma Designs Tango PCIe controller PCI: rockchip: Use normal register bank for config accessors dt-bindings: PCI: Add documentation for MediaTek PCIe PCI: Remove __pci_dev_reset() and pci_dev_reset() PCI: Split ->reset_notify() method into ->reset_prepare() and ->reset_done() PCI: xilinx: Make of_device_ids const PCI: xilinx-nwl: Modify IRQ chip for legacy interrupts PCI: vmd: Move SRCU cleanup after bus, child device removal PCI: vmd: Correct comment: VMD domains start at 0x10000, not 0x1000 PCI: versatile: Add local struct device pointers PCI: tegra: Do not allocate MSI target memory PCI: tegra: Support MSI 64-bit addressing PCI: rockchip: Use local struct device pointer consistently PCI: rockchip: Check for clk_prepare_enable() errors during resume MAINTAINERS: Remove Wenrui Li as Rockchip PCIe driver maintainer PCI: rockchip: Configure RC's MPS setting PCI: rockchip: Reconfigure configuration space header type PCI: rockchip: Split out rockchip_pcie_cfg_configuration_accesses() PCI: rockchip: Move configuration accesses into rockchip_pcie_cfg_atu() ...
| * | arm64: PCI: Drop DT IRQ allocation from pcibios_alloc_irq()Lorenzo Pieralisi2017-07-021-6/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the introduction of struct pci_host_bridge.map_irq pointer it is possible to assign IRQs for all devices originating from a PCI host bridge at probe time; this is implemented through pci_assign_irq() that relies on the struct pci_host_bridge.map_irq pointer to map IRQ for a given device. The benefits this brings are twofold: - the IRQ for a device is assigned once at probe time - the IRQ assignment works also for hotplugged devices With all DT based PCI host bridges converted to the struct pci_host_bridge.{map/swizzle}_irq hooks mechanism the DT IRQ allocation in ARM64 pcibios_alloc_irq() is now redundant and can be removed. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Will Deacon <will.deacon@arm.com>
* | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2017-07-061-0/+21
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM updates from Paolo Bonzini: "PPC: - Better machine check handling for HV KVM - Ability to support guests with threads=2, 4 or 8 on POWER9 - Fix for a race that could cause delayed recognition of signals - Fix for a bug where POWER9 guests could sleep with interrupts pending. ARM: - VCPU request overhaul - allow timer and PMU to have their interrupt number selected from userspace - workaround for Cavium erratum 30115 - handling of memory poisonning - the usual crop of fixes and cleanups s390: - initial machine check forwarding - migration support for the CMMA page hinting information - cleanups and fixes x86: - nested VMX bugfixes and improvements - more reliable NMI window detection on AMD - APIC timer optimizations Generic: - VCPU request overhaul + documentation of common code patterns - kvm_stat improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (124 commits) Update my email address kvm: vmx: allow host to access guest MSR_IA32_BNDCFGS x86: kvm: mmu: use ept a/d in vmcs02 iff used in vmcs12 kvm: x86: mmu: allow A/D bits to be disabled in an mmu x86: kvm: mmu: make spte mmio mask more explicit x86: kvm: mmu: dead code thanks to access tracking KVM: PPC: Book3S: Fix typo in XICS-on-XIVE state saving code KVM: PPC: Book3S HV: Close race with testing for signals on guest entry KVM: PPC: Book3S HV: Simplify dynamic micro-threading code KVM: x86: remove ignored type attribute KVM: LAPIC: Fix lapic timer injection delay KVM: lapic: reorganize restart_apic_timer KVM: lapic: reorganize start_hv_timer kvm: nVMX: Check memory operand to INVVPID KVM: s390: Inject machine check into the nested guest KVM: s390: Inject machine check into the guest tools/kvm_stat: add new interactive command 'b' tools/kvm_stat: add new command line switch '-i' tools/kvm_stat: fix error on interactive command 'g' KVM: SVM: suppress unnecessary NMI singlestep on GIF=0 and nested exit ...
| * \ \ Merge tag 'kvmarm-for-4.13' of ↵Paolo Bonzini2017-06-301-0/+21
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM updates for 4.13 - vcpu request overhaul - allow timer and PMU to have their interrupt number selected from userspace - workaround for Cavium erratum 30115 - handling of memory poisonning - the usual crop of fixes and cleanups Conflicts: arch/s390/include/asm/kvm_host.h
| | * | | arm64: Add workaround for Cavium Thunder erratum 30115David Daney2017-06-151-0/+21
| | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some Cavium Thunder CPUs suffer a problem where a KVM guest may inadvertently cause the host kernel to quit receiving interrupts. Use the Group-0/1 trapping in order to deal with it. [maz]: Adapted patch to the Group-0/1 trapping, reworked commit log Tested-by: Alexander Graf <agraf@suse.de> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
* | | | Merge tag 'arm64-upstream' of ↵Linus Torvalds2017-07-0521-99/+611
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: - RAS reporting via GHES/APEI (ACPI) - Indirect ftrace trampolines for modules - Improvements to kernel fault reporting - Page poisoning - Sigframe cleanups and preparation for SVE context - Core dump fixes - Sparse fixes (mainly relating to endianness) - xgene SoC PMU v3 driver - Misc cleanups and non-critical fixes * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (75 commits) arm64: fix endianness annotation for 'struct jit_ctx' and friends arm64: cpuinfo: constify attribute_group structures. arm64: ptrace: Fix incorrect get_user() use in compat_vfp_set() arm64: ptrace: Remove redundant overrun check from compat_vfp_set() arm64: ptrace: Avoid setting compat FP[SC]R to garbage if get_user fails arm64: fix endianness annotation for __apply_alternatives()/get_alt_insn() arm64: fix endianness annotation in get_kaslr_seed() arm64: add missing conversion to __wsum in ip_fast_csum() arm64: fix endianness annotation in acpi_parking_protocol.c arm64: use readq() instead of readl() to read 64bit entry_point arm64: fix endianness annotation for reloc_insn_movw() & reloc_insn_imm() arm64: fix endianness annotation for aarch64_insn_write() arm64: fix endianness annotation in aarch64_insn_read() arm64: fix endianness annotation in call_undef_hook() arm64: fix endianness annotation for debug-monitors.c ras: mark stub functions as 'inline' arm64: pass endianness info to sparse arm64: ftrace: fix !CONFIG_ARM64_MODULE_PLTS kernels arm64: signal: Allow expansion of the signal frame acpi: apei: check for pending errors when probing GHES entries ...
| * | | | arm64: cpuinfo: constify attribute_group structures.Arvind Yadav2017-06-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | attribute_groups are not supposed to change at runtime. All functions working with attribute_groups provided by <linux/sysfs.h> work with const attribute_group. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | arm64: ptrace: Fix incorrect get_user() use in compat_vfp_set()Dave Martin2017-06-291-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that compat_vfp_get() uses the regset API to copy the FPSCR value out to userspace, compat_vfp_set() looks inconsistent. In particular, compat_vfp_set() will fail if called with kbuf != NULL && ubuf == NULL (which is valid usage according to the regset API). This patch fixes compat_vfp_set() to use user_regset_copyin(), similarly to compat_vfp_get(). This also squashes a sparse warning triggered by the cast that drops __user when calling get_user(). Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | arm64: ptrace: Remove redundant overrun check from compat_vfp_set()Dave Martin2017-06-291-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | compat_vfp_set() checks for userspace trying to write an excessive amount of data to the regset. However this check is conspicuous for its absence from every other _set() in the arm64 ptrace implementation. In fact, the core ptrace_regset() already clamps userspace's iov_len to the regset size before the individual regset .{get,set}() methods get called. This patch removes the redundant check. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | arm64: ptrace: Avoid setting compat FP[SC]R to garbage if get_user failsDave Martin2017-06-291-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If get_user() fails when reading the new FPSCR value from userspace in compat_vfp_get(), then garbage* will be written to the task's FPSR and FPCR registers. This patch prevents this by checking the return from get_user() first. [*] Actually, zero, due to the behaviour of get_user() on error, but that's still not what userspace expects. Fixes: 478fcb2cdb23 ("arm64: Debugging support") Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | arm64: fix endianness annotation for __apply_alternatives()/get_alt_insn()Luc Van Oostenryck2017-06-291-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | get_alt_insn() is used to read and create ARM instructions, which are always stored in memory in little-endian order. These values are thus correctly converted to/from native order when processed but the pointers used to hold the address of these instructions are declared as for native order values. Fix this by declaring the pointers as __le32* instead of u32* and make the few appropriate needed changes like removing the unneeded cast '(u32*)' in front of __ALT_PTR()'s definition. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | arm64: fix endianness annotation in get_kaslr_seed()Luc Van Oostenryck2017-06-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the flattened device tree format, all integer properties are in big-endian order. Here the property "kaslr-seed" is read from the fdt and then correctly converted to native order (via fdt64_to_cpu()) but the pointer used for this is not annotated as being for big-endian. Fix this by declaring the pointer as fdt64_t instead of u64 (fdt64_t being itself typedefed to __be64). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | arm64: fix endianness annotation in acpi_parking_protocol.cLuc Van Oostenryck2017-06-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here both variables 'cpu_id' and 'entry_point' are read via read[lq]_relaxed(), from a little-endian annotated pointer and then used as a native endian value. This is correct since the read[lq]() family of function internally do a little-to-native endian conversion. But in this case, it is wrong to declare these variable as little-endian since there are native ones. Fix this by changing the declaration of these variables as 'u32' or 'u64' instead of '__le32' / '__le64'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | arm64: use readq() instead of readl() to read 64bit entry_pointLuc Van Oostenryck2017-06-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here the entrypoint, declared as a 64 bit integer, is read from a pointer to 64bit integer but the read is done via readl_relaxed() which is for 32bit quantities. All the high bits will thus be lost which change the meaning of the test against zero done later. Fix this by using readq_relaxed() instead as it should be for 64bit quantities. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | arm64: fix endianness annotation for reloc_insn_movw() & reloc_insn_imm()Luc Van Oostenryck2017-06-291-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here the functions reloc_insn_movw() & reloc_insn_imm() are used to read, modify and write back ARM instructions, which are always stored in memory in little-endian order. These values are thus correctly converted to/from native order but the pointers used to hold their addresses are declared as for native order values. Fix this by declaring the pointers as __le32* and remove the casts that are now unneeded. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | arm64: fix endianness annotation for aarch64_insn_write()Luc Van Oostenryck2017-06-291-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | aarch64_insn_write() is used to write an instruction. As on ARM64 in-memory instructions are always stored in little-endian order, this function, taking the instruction opcode in native order, correctly convert it to little-endian before sending it to an helper function __aarch64_insn_write() which will do the effective write. This is all good, but the variable and argument holding the converted value are not annotated for a little-endian value but left for native values. Fix this by adjusting the prototype of the helper and directly using the result of cpu_to_le32() without passing by an intermediate variable (which was not a distinct one but the same as the one holding the native value). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | arm64: fix endianness annotation in aarch64_insn_read()Luc Van Oostenryck2017-06-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function arch64_insn_read() is used to read an instruction. On AM64 instructions are always stored in little-endian order and thus the function correctly do a little-to-native endian conversion to the value just read. However, the variable used to hold the value before the conversion is not declared for a little-endian value but for a native one. Fix this by using the correct type for the declaration: __le32 Note: This only works because the function reading the value, probe_kernel_read((), takes a void pointer and void pointers are endian-agnostic. Otherwise probe_kernel_read() should also be properly annotated (or worse, need to be specialized). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | arm64: fix endianness annotation in call_undef_hook()Luc Van Oostenryck2017-06-291-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here we're reading thumb or ARM instructions, which are always stored in memory in little-endian order. These values are thus correctly converted to native order but the intermediate value should be annotated as for little-endian values. Fix this by declaring the intermediate var as __le32 or __le16. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | arm64: fix endianness annotation for debug-monitors.cLuc Van Oostenryck2017-06-291-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here we're reading thumb or ARM instructions, which are always stored in memory in little-endian order. These values are thus correctly converted to native order but the intermediate value should be annotated as for little-endian values. Fix this by declaring the intermediate var as __le32 or __le16. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | Merge branch 'aarch64/for-next/ras-apei' into aarch64/for-next/coreWill Deacon2017-06-261-1/+3
| |\ \ \ \ | | | | | | | | | | | | | | | | | | Merge in arm64 ACPI RAS support (APEI/GHES) from Tyler Baicar.
| * \ \ \ \ Merge branch 'perf/updates' into aarch64/for-next/coreWill Deacon2017-06-261-1/+1
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge in arm64 perf updates: * xgene system PMUv3 support * 16-bit events for ARMv8.1
| | * | | | | arm64: perf: Extend event config for ARMv8.1Shaokun Zhang2017-05-301-1/+1
| | | |/ / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Perf has supported ARMv8.1 feature with 16-bit evtCount filed [see c210ae8 arm64: perf: Extend event mask for ARMv8.1], event config should be extended to 16-bit too, otherwise, if use -e event_name whose event_code is more than 0x3ff, pmu_config_term will return -EINVAL because function pmu_format_max_value depends on event config. This patch extends event config to 16-bit. Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | | arm64: ftrace: fix !CONFIG_ARM64_MODULE_PLTS kernelsMark Rutland2017-06-231-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a kernel is built without CONFIG_ARM64_MODULE_PLTS, we don't generate the expected branch instruction in ftrace_make_nop(). This means we pass zero (rather than a valid branch) to ftrace_modify_code() as the expected instruction to validate. This causes us to return -EINVAL to the core ftrace code for a valid case, resulting in a splat at boot time. This was an unintended effect of commit: 687644209a6e9557 ("arm64: ftrace: fix building without CONFIG_MODULES") ... which incorrectly moved the generation of the branch instruction into the ifdef for CONFIG_ARM64_MODULE_PLTS. This patch fixes the issue by moving the ifdef inside of the relevant if-else case, and always checking that the branch is in range, regardless of CONFIG_ARM64_MODULE_PLTS. This ensures that we generate the expected branch instruction, and also improves our sanity checks. For consistency, both ftrace_make_nop() and ftrace_make_call() are updated with this pattern. Fixes: 687644209a6e9557 ("arm64: ftrace: fix building without CONFIG_MODULES") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | | arm64: signal: Allow expansion of the signal frameDave Martin2017-06-231-18/+177
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch defines an extra_context signal frame record that can be used to describe an expanded signal frame, and modifies the context block allocator and signal frame setup and parsing code to create, populate, parse and decode this block as necessary. To avoid abuse by userspace, parse_user_sigframe() attempts to ensure that: * no more than one extra_context is accepted; * the extra context data is a sensible size, and properly placed and aligned. The extra_context data is required to start at the first 16-byte aligned address immediately after the dummy terminator record following extra_context in rt_sigframe.__reserved[] (as ensured during signal delivery). This serves as a sanity-check that the signal frame has not been moved or copied without taking the extra data into account. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> [will: add __force annotation when casting extra_datap to __user pointer] Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | | arm64: dump cpu_hwcaps at panic timeMark Rutland2017-06-221-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When debugging a kernel panic(), it can be useful to know which CPU features have been detected by the kernel, as some code paths can depend on these (and may have been patched at runtime). This patch adds a notifier to dump the detected CPU caps (as a hex string) at panic(), when we log other information useful for debugging. On a Juno R1 system running v4.12-rc5, this looks like: [ 615.431249] Kernel panic - not syncing: Fatal exception in interrupt [ 615.437609] SMP: stopping secondary CPUs [ 615.441872] Kernel Offset: disabled [ 615.445372] CPU features: 0x02086 [ 615.448522] Memory Limit: none A developer can decode this by looking at the corresponding <asm/cpucaps.h> bits. For example, the above decodes as: * bit 1: ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE * bit 2: ARM64_WORKAROUND_845719 * bit 7: ARM64_WORKAROUND_834220 * bit 13: ARM64_HAS_32BIT_EL0 Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Steve Capper <steve.capper@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | | arm64: ptrace: Flush user-RW TLS reg to thread_struct before readingDave Martin2017-06-222-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When reading current's user-writable TLS register (which occurs when dumping core for native tasks), it is possible that userspace has modified it since the time the task was last scheduled out. The new TLS register value is not guaranteed to have been written immediately back to thread_struct in this case. As a result, a coredump can capture stale data for this register. Reading the register for a stopped task via ptrace is unaffected. For native tasks, this patch explicitly flushes the TPIDR_EL0 register back to thread_struct before dumping when operating on current, thus ensuring that coredump contents are up to date. For compat tasks, the TLS register is not user-writable and so cannot be out of sync, so no flush is required in compat_tls_get(). Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | | arm64: ptrace: Flush FPSIMD regs back to thread_struct before readingDave Martin2017-06-221-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When reading the FPSIMD state of current (which occurs when dumping core), it is possible that userspace has modified the FPSIMD registers since the time the task was last scheduled out. Such changes are not guaranteed to be reflected immedately in thread_struct. As a result, a coredump can contain stale values for these registers. Reading the registers of a stopped task via ptrace is unaffected. This patch explicitly flushes the CPU state back to thread_struct before dumping when operating on current, thus ensuring that coredump contents are up to date. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | | arm64: ptrace: Fix VFP register dumping in compat coredumpsDave Martin2017-06-221-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, VFP registers are omitted from coredumps for compat processes, due to a bug in the REGSET_COMPAT_VFP regset implementation. compat_vfp_get() needs to transfer non-contiguous data from thread_struct.fpsimd_state, and uses put_user() to handle the offending trailing word (FPSCR). This fails when copying to a kernel address (i.e., kbuf && !ubuf), which is what happens when dumping core. As a result, the ELF coredump core code silently omits the NT_ARM_VFP note from the dump. It would be possible to work around this with additional special case code for the put_user(), but since user_regset_copyout() is explicitly designed to handle this scenario it is cleaner to port the put_user() to a user_regset_copyout() call, which this patch does. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | | arm64: signal: factor out signal frame record allocationDave Martin2017-06-201-7/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch factors out the allocator for signal frame optional records into a separate function, to ensure consistency and facilitate later expansion. No overrun checking is currently done, because the allocation is in user memory and anyway the kernel never tries to allocate enough space in the signal frame yet for an overrun to occur. This behaviour will be refined in future patches. The approach taken in this patch to allocation of the terminator record is not very clean: this will also be replaced in subsequent patches. For future extension, a comment is added in sigcontext.h documenting the current static allocations in __reserved[]. This will be important for determining under what circumstances userspace may or may not see an expanded signal frame. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| * | | | | arm64: signal: factor frame layout and population into separate passesDave Martin2017-06-201-23/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for expanding the signal frame, this patch refactors the signal frame setup code in setup_sigframe() into two separate passes. The first pass, setup_sigframe_layout(), determines the size of the signal frame and its internal layout, including the presence and location of optional records. The resulting knowledge is used to allocate and locate the user stack space required for the signal frame and to determine which optional records to include. The second pass, setup_sigframe(), is called once the stack frame is allocated in order to populate it with the necessary context information. As a result of these changes, it becomes more natural to represent locations in the signal frame by a base pointer and an offset, since the absolute address of each location is not known during the layout pass. To be more consistent with this logic, parse_user_sigframe() is refactored to describe signal frame locations in a similar way. This change has no effect on the signal ABI, but will make it easier to expand the signal frame in future patches. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>