summaryrefslogtreecommitdiffstats
path: root/arch/powerpc/kernel/irq.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'powerpc-5.3-1' of ↵Linus Torvalds2019-07-131-3/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Notable changes: - Removal of the NPU DMA code, used by the out-of-tree Nvidia driver, as well as some other functions only used by drivers that haven't (yet?) made it upstream. - A fix for a bug in our handling of hardware watchpoints (eg. perf record -e mem: ...) which could lead to register corruption and kernel crashes. - Enable HAVE_ARCH_HUGE_VMAP, which allows us to use large pages for vmalloc when using the Radix MMU. - A large but incremental rewrite of our exception handling code to use gas macros rather than multiple levels of nested CPP macros. And the usual small fixes, cleanups and improvements. Thanks to: Alastair D'Silva, Alexey Kardashevskiy, Andreas Schwab, Aneesh Kumar K.V, Anju T Sudhakar, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Cédric Le Goater, Christian Lamparter, Christophe Leroy, Christophe Lombard, Christoph Hellwig, Daniel Axtens, Denis Efremov, Enrico Weigelt, Frederic Barrat, Gautham R. Shenoy, Geert Uytterhoeven, Geliang Tang, Gen Zhang, Greg Kroah-Hartman, Greg Kurz, Gustavo Romero, Krzysztof Kozlowski, Madhavan Srinivasan, Masahiro Yamada, Mathieu Malaterre, Michael Neuling, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nishad Kamdar, Oliver O'Halloran, Qian Cai, Ravi Bangoria, Sachin Sant, Sam Bobroff, Satheesh Rajendran, Segher Boessenkool, Shaokun Zhang, Shawn Anastasio, Stewart Smith, Suraj Jitindar Singh, Thiago Jung Bauermann, YueHaibing" * tag 'powerpc-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (163 commits) powerpc/powernv/idle: Fix restore of SPRN_LDBAR for POWER9 stop state. powerpc/eeh: Handle hugepages in ioremap space ocxl: Update for AFU descriptor template version 1.1 powerpc/boot: pass CONFIG options in a simpler and more robust way powerpc/boot: add {get, put}_unaligned_be32 to xz_config.h powerpc/irq: Don't WARN continuously in arch_local_irq_restore() powerpc/module64: Use symbolic instructions names. powerpc/module32: Use symbolic instructions names. powerpc: Move PPC_HA() PPC_HI() and PPC_LO() to ppc-opcode.h powerpc/module64: Fix comment in R_PPC64_ENTRY handling powerpc/boot: Add lzo support for uImage powerpc/boot: Add lzma support for uImage powerpc/boot: don't force gzipped uImage powerpc/8xx: Add microcode patch to move SMC parameter RAM. powerpc/8xx: Use IO accessors in microcode programming. powerpc/8xx: replace #ifdefs by IS_ENABLED() in microcode.c powerpc/8xx: refactor programming of microcode CPM params. powerpc/8xx: refactor printing of microcode patch name. powerpc/8xx: Refactor microcode write powerpc/8xx: refactor writing of CPM microcode arrays ...
| * powerpc/irq: Don't WARN continuously in arch_local_irq_restore()Michael Ellerman2019-07-101-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CONFIG_PPC_IRQ_SOFT_MASK_DEBUG is enabled (uncommon), we have a series of WARN_ON's in arch_local_irq_restore(). These are "should never happen" conditions, but if they do happen they can flood the console and render the system unusable. So switch them to WARN_ON_ONCE(). Fixes: e2b36d591720 ("powerpc/64: Don't trace code that runs with the soft irq mask unreconciled") Fixes: 9b81c0211c24 ("powerpc/64s: make PACA_IRQ_HARD_DIS track MSR[EE] closely") Fixes: 7c0482e3d055 ("powerpc/irq: Fix another case of lazy IRQ state getting out of sync") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190708061046.7075-1-mpe@ellerman.id.au
* | treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152Thomas Gleixner2019-05-301-5/+1
|/ | | | | | | | | | | | | | | | | | | | | Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* powerpc/64: Don't trace code that runs with the soft irq mask unreconciledNicholas Piggin2019-05-031-10/+3
| | | | | | | | | | | | | | | | | | | | | | | | "Reconciling" in terms of interrupt handling, is to bring the soft irq mask state in to synch with the hardware, after an interrupt causes MSR[EE] to be cleared (while the soft mask may be enabled, and hard irqs not marked disabled). General kernel code should not be called while unreconciled, because local_irq_disable, etc. manipulations can cause surprising irq traces, and it's fragile because the soft irq code does not really expect to be called in this situation. When exiting from an interrupt, MSR[EE] is cleared to prevent races, but soft irq state is enabled for the returned-to context, so this is now an unreconciled state. restore_math is called in this state, and that can be ftraced, and the ftrace subsystem disables local irqs. Mark restore_math and its callees as notrace. Restore a sanity check in the soft irq code that had to be disabled for this case, by commit 4da1f79227ad4 ("powerpc/64: Disable irq restore warning for now"). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/irq: drop __irq_offset_valueChristophe Leroy2019-05-031-3/+0
| | | | | | | | | | | This patch drops__irq_offset_value which has not been used since commit 9c4cb8251513 ("powerpc: Remove use of CONFIG_PPC_MERGE") This removes a sparse warning. Fixes: 9c4cb8251513 ("powerpc: Remove use of CONFIG_PPC_MERGE") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: clean stack pointers namingChristophe Leroy2019-02-231-10/+7
| | | | | | | | | Some stack pointers used to also be thread_info pointers and were called tp. Now that they are only stack pointers, rename them sp. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: regain entire stack spaceChristophe Leroy2019-02-231-10/+9
| | | | | | | | | | | | | | | | | | | thread_info is not anymore in the stack, so the entire stack can now be used. There is also no risk anymore of corrupting task_cpu(p) with a stack overflow so the patch removes the test. When doing this, an explicit test for NULL stack pointer is needed in validate_sp() as it is not anymore implicitely covered by the sizeof(thread_info) gap. In the meantime, with the previous patch all pointers to the stacks are not anymore pointers to thread_info so this patch changes them to void* Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Activate CONFIG_THREAD_INFO_IN_TASKChristophe Leroy2019-02-231-78/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch activates CONFIG_THREAD_INFO_IN_TASK which moves the thread_info into task_struct. Moving thread_info into task_struct has the following advantages: - It protects thread_info from corruption in the case of stack overflows. - Its address is harder to determine if stack addresses are leaked, making a number of attacks more difficult. This has the following consequences: - thread_info is now located at the beginning of task_struct. - The 'cpu' field is now in task_struct, and only exists when CONFIG_SMP is active. - thread_info doesn't have anymore the 'task' field. This patch: - Removes all recopy of thread_info struct when the stack changes. - Changes the CURRENT_THREAD_INFO() macro to point to current. - Selects CONFIG_THREAD_INFO_IN_TASK. - Modifies raw_smp_processor_id() to get ->cpu from current without including linux/sched.h to avoid circular inclusion and without including asm/asm-offsets.h to avoid symbol names duplication between ASM constants and C constants. - Modifies klp_init_thread_info() to take a task_struct pointer argument. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Add task_stack.h to livepatch.h to fix build fails] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Don't use CURRENT_THREAD_INFO to find the stackChristophe Leroy2019-02-231-1/+1
| | | | | | | | | | | A few places use CURRENT_THREAD_INFO, or the C version, to find the stack. This will no longer work with THREAD_INFO_IN_TASK so change them to find the stack in other ways. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Split out of larger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/irq: use memblock functions returning virtual addressChristophe Leroy2019-02-231-5/+0
| | | | | | | | | | | | | Since only the virtual address of allocated blocks is used, lets use functions returning directly virtual address. Those functions have the advantage of also zeroing the block. Suggested-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/irq: drop arch_early_irq_init()Christophe Leroy2019-01-141-5/+0
| | | | | | | | | | arch_early_irq_init() does nothing different than the weak arch_early_irq_init() in kernel/softirq.c Fixes: 089fb442f301 ("powerpc: Use ARCH_IRQ_INIT_FLAGS") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/64: Disable irq restore warning for nowMichael Ellerman2018-08-071-3/+10
| | | | | | | | | | | | | | | | | We recently added a warning in arch_local_irq_restore() to check that the soft masking state matches reality. Unfortunately it trips in a few places, which are not entirely trivial to fix. The key problem is if we're doing function_graph tracing of restore_math(), the warning pops and then seems to recurse. It's not entirely clear because the system continuously oopses on all CPUs, with the output interleaved and unreadable. It's also been observed on a G5 coming out of idle. Until we can fix those cases disable the warning for now. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/64s: make PACA_IRQ_HARD_DIS track MSR[EE] closelyNicholas Piggin2018-07-241-10/+24
| | | | | | | | | | | | | | | | | | | When the masked interrupt handler clears MSR[EE] for an interrupt in the PACA_IRQ_MUST_HARD_MASK set, it does not set PACA_IRQ_HARD_DIS. This makes them get out of synch. With that taken into account, it's only low level irq manipulation (and interrupt entry before reconcile) where they can be out of synch. This makes the code less surprising. It also allows the IRQ replay code to rely on the IRQ_HARD_DIS value and not have to mtmsrd again in this case (e.g., for an external interrupt that has been masked). The bigger benefit might just be that there is not such an element of surprise in these two bits of state. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/time: account broadcast timer event interrupts separatelyNicholas Piggin2018-06-031-0/+6
| | | | | | | | | These are not local timer interrupts but IPIs. It's good to be able to see how timer offloading is behaving, so split these out into their own category. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/tau: Synchronize function prototypes and bodyMathieu Malaterre2018-05-251-1/+1
| | | | | | | | | | | | | | | | | | Some function prototypes and body for Thermal Assist Units were not in sync. Update the function definition to match the existing function declaration found in `setup-common.c`, changing an `int` return type to a `u32` return type. Move the prototypes to a header file. Fix the following warnings, treated as error with W=1: arch/powerpc/kernel/tau_6xx.c:257:5: error: no previous prototype for ‘cpu_temp_both’ [-Werror=missing-prototypes] arch/powerpc/kernel/tau_6xx.c:262:5: error: no previous prototype for ‘cpu_temp’ [-Werror=missing-prototypes] arch/powerpc/kernel/tau_6xx.c:267:5: error: no previous prototype for ‘tau_interrupts’ [-Werror=missing-prototypes] Compile tested with CONFIG_TAU_INT. Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/64s: Fix lost pending interrupt due to race causing lost update to ↵Nicholas Piggin2018-03-231-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | irq_happened force_external_irq_replay() can be called in the do_IRQ path with interrupts hard enabled and soft disabled if may_hard_irq_enable() set MSR[EE]=1. It updates local_paca->irq_happened with a load, modify, store sequence. If a maskable interrupt hits during this sequence, it will go to the masked handler to be marked pending in irq_happened. This update will be lost when the interrupt returns and the store instruction executes. This can result in unpredictable latencies, timeouts, lockups, etc. Fix this by ensuring hard interrupts are disabled before modifying irq_happened. This could cause any maskable asynchronous interrupt to get lost, but it was noticed on P9 SMP system doing RDMA NVMe target over 100GbE, so very high external interrupt rate and high IPI rate. The hang was bisected down to enabling doorbell interrupts for IPIs. These provided an interrupt type that could run at high rates in the do_IRQ path, stressing the race. Fixes: 1d607bb3bd60 ("powerpc/irq: Add mechanism to force a replay of interrupts") Cc: stable@vger.kernel.org # v4.8+ Reported-by: Carol L. Soto <clsoto@us.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Add new kconfig CONFIG_PPC_IRQ_SOFT_MASK_DEBUGMadhavan Srinivasan2018-01-191-2/+2
| | | | | | | | | | | New Kconfig is added "CONFIG_PPC_IRQ_SOFT_MASK_DEBUG" to add WARN_ON to alert the invalid transitions. Also moved the code under the CONFIG_TRACE_IRQFLAGS in arch_local_irq_restore() to new Kconfig. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> [mpe: Fix name of CONFIG option in change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/64s: Add support to mask perf interrupts and replay themMadhavan Srinivasan2018-01-191-1/+6
| | | | | | | | | | | | | | | | Two new bit mask field "IRQ_DISABLE_MASK_PMU" is introduced to support the masking of PMI and "IRQ_DISABLE_MASK_ALL" to aid interrupt masking checking. Couple of new irq #defs "PACA_IRQ_PMI" and "SOFTEN_VALUE_0xf0*" added to use in the exception code to check for PMI interrupts. In the masked_interrupt handler, for PMIs we reset the MSR[EE] and return. In the __check_irq_replay(), replay the PMI interrupt by calling performance_monitor_common handler. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/64: Rename soft_enabled to irq_soft_maskMadhavan Srinivasan2018-01-191-18/+5
| | | | | | | | Rename the paca->soft_enabled to paca->irq_soft_mask as it is no longer used as a flag for interrupt state, but a mask. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/64: Change soft_enabled from flag to bitmaskMadhavan Srinivasan2018-01-191-3/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "paca->soft_enabled" is used as a flag to mask some of interrupts. Currently supported flags values and their details: soft_enabled MSR[EE] 0 0 Disabled (PMI and HMI not masked) 1 1 Enabled "paca->soft_enabled" is initialized to 1 to make the interripts as enabled. arch_local_irq_disable() will toggle the value when interrupts needs to disbled. At this point, the interrupts are not actually disabled, instead, interrupt vector has code to check for the flag and mask it when it occurs. By "mask it", it update interrupt paca->irq_happened and return. arch_local_irq_restore() is called to re-enable interrupts, which checks and replays interrupts if any occured. Now, as mentioned, current logic doesnot mask "performance monitoring interrupts" and PMIs are implemented as NMI. But this patchset depends on local_irq_* for a successful local_* update. Meaning, mask all possible interrupts during local_* update and replay them after the update. So the idea here is to reserve the "paca->soft_enabled" logic. New values and details: soft_enabled MSR[EE] 1 0 Disabled (PMI and HMI not masked) 0 1 Enabled Reason for the this change is to create foundation for a third mask value "0x2" for "soft_enabled" to add support to mask PMIs. When ->soft_enabled is set to a value "3", PMI interrupts are mask and when set to a value of "1", PMI are not mask. With this patch also extends soft_enabled as interrupt disable mask. Current flags are renamed from IRQ_[EN?DIS}ABLED to IRQS_ENABLED and IRQS_DISABLED. Patch also fixes the ptrace call to force the user to see the softe value to be alway 1. Reason being, even though userspace has no business knowing about softe, it is part of pt_regs. Like-wise in signal context. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/64: Move set_soft_enabled() and renameMadhavan Srinivasan2018-01-191-10/+4
| | | | | | | | | | | | | | | Move set_soft_enabled() from powerpc/kernel/irq.c to asm/hw_irq.c, to encourage updates to paca->soft_enabled done via these access function. Add "memory" clobber to hint compiler since paca->soft_enabled memory is the target here. Renaming it as soft_enabled_set() will make namespaces works better as prefix than a postfix when new soft_enabled manipulation functions are introduced. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/64: Add #defines for paca->soft_enabled flagsMadhavan Srinivasan2018-01-191-4/+5
| | | | | | | | | | | Two #defines IRQS_ENABLED and IRQS_DISABLED are added to be used when updating paca->soft_enabled. Replace the hardcoded values used when updating paca->soft_enabled with IRQ_(EN|DIS)ABLED #define. No logic change. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/64: Fix latency tracing for lazy irq replayNicholas Piggin2017-11-061-0/+9
| | | | | | | | | | | | | When returning from an exception to a soft-enabled context, pending IRQs are replayed but IRQ tracing is not reset, so a number of them can get chained together into the same IRQ-disabled trace. Fix this by having __check_irq_replay re-set IRQ trace. This is conceptually where we respond to the next interrupt, so it fits the semantics of the IRQ tracer. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* KVM: PPC: Book3S HV: Handle host system reset in guest modeNicholas Piggin2017-11-061-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the host takes a system reset interrupt while a guest is running, the CPU must exit the guest before processing the host exception handler. After this patch, taking a sysrq+x with a CPU running in a guest gives a trace like this: cpu 0x27: Vector: 100 (System Reset) at [c000000fdf5776f0] pc: c008000010158b80: kvmppc_run_core+0x16b8/0x1ad0 [kvm_hv] lr: c008000010158b80: kvmppc_run_core+0x16b8/0x1ad0 [kvm_hv] sp: c000000fdf577850 msr: 9000000002803033 current = 0xc000000fdf4b1e00 paca = 0xc00000000fd4d680 softe: 3 irq_happened: 0x01 pid = 6608, comm = qemu-system-ppc Linux version 4.14.0-rc7-01489-g47e1893a404a-dirty #26 SMP [c000000fdf577a00] c008000010159dd4 kvmppc_vcpu_run_hv+0x3dc/0x12d0 [kvm_hv] [c000000fdf577b30] c0080000100a537c kvmppc_vcpu_run+0x44/0x60 [kvm] [c000000fdf577b60] c0080000100a1ae0 kvm_arch_vcpu_ioctl_run+0x118/0x310 [kvm] [c000000fdf577c00] c008000010093e98 kvm_vcpu_ioctl+0x530/0x7c0 [kvm] [c000000fdf577d50] c000000000357bf8 do_vfs_ioctl+0xd8/0x8c0 [c000000fdf577df0] c000000000358448 SyS_ioctl+0x68/0x100 [c000000fdf577e30] c00000000000b220 system_call+0x58/0x6c --- Exception: c01 (System Call) at 00007fff76868df0 SP (7fff7069baf0) is in userspace Fixes: e36d0a2ed5 ("powerpc/powernv: Implement NMI IPI with OPAL_SIGNAL_SYSTEM_RESET") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/64s: Implement system reset idle wakeup reasonNicholas Piggin2017-10-041-3/+38
| | | | | | | | | | | | | | | It is possible to wake from idle due to a system reset exception, in which case the CPU takes a system reset interrupt to wake from idle, with system reset as the wakeup reason. The regular (not idle wakeup) system reset interrupt handler must be invoked in this case, otherwise the system reset interrupt is lost. Handle the system reset interrupt immediately after CPU state has been restored. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/64s: Merge HV and non-HV paths for doorbell IRQ replayNicholas Piggin2017-08-231-2/+0
| | | | | | | | | | This results in smaller code, and fewer branches. This relies on the fact that both the 0xe80 and 0xa00 handlers call the same upper level code, namely doorbell_exception(). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Mention we rely on the implementation of the 0xe80/0xa00 handlers] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/64: Cleanup __check_irq_replay()Nicholas Piggin2017-08-231-22/+23
| | | | | | | | | Move the clearing of irq_happened bits into the condition where they were found to be set. This reduces instruction count slightly, and reduces stores into irq_happened. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* Merge branch 'fixes' into nextMichael Ellerman2017-08-231-1/+14
|\ | | | | | | | | | | There's a non-trivial dependency between some commits we want to put in next and the KVM prefetch work around that went into fixes. So merge fixes into next.
| * powerpc/64: Fix __check_irq_replay missing decrementer interruptNicholas Piggin2017-08-041-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the decrementer wraps again and de-asserts the decrementer exception while hard-disabled, __check_irq_replay() has a test to notice the wrap when interrupts are re-enabled. The decrementer check must be done when clearing the PACA_IRQ_HARD_DIS flag, not when the PACA_IRQ_DEC flag is tested. Previously this worked because the decrementer interrupt was always the first one checked after clearing the hard disable flag, but HMI check was moved ahead of that, which introduced this bug. This can cause a missed decrementer interrupt if we soft-disable interrupts then take an HMI which is recorded in irq_happened, then hard-disable interrupts for > 4s to wrap the decrementer. Fixes: e0e0d6b7390b ("powerpc/64: Replay hypervisor maintenance interrupt first") Cc: stable@vger.kernel.org # v4.9+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc/8xx: Getting rid of remaining use of CONFIG_8xxChristophe Leroy2017-08-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Two config options exist to define powerpc MPC8xx: * CONFIG_PPC_8xx * CONFIG_8xx arch/powerpc/platforms/Kconfig.cputype has contained the following comment about CONFIG_8xx item for some years: "# this is temp to handle compat with arch=ppc" arch/powerpc is now the only place with remaining use of CONFIG_8xx: get rid of them. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Add irq accounting for watchdog interruptsNicholas Piggin2017-08-101-0/+10
| | | | | | | | | | | | | | | | | | | | This adds an irq counter for the watchdog soft-NMI. This interrupt only fires when interrupts are soft-disabled, so it will not increment much even when the watchdog is running. However it's useful for debugging and sanity checking. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Add irq accounting for system reset interruptsNicholas Piggin2017-08-101-0/+6
|/ | | | | Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/64s/idle: Process interrupts from system reset wakeupNicholas Piggin2017-06-191-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | When the CPU wakes from low power state, it begins at the system reset interrupt with the exception that caused the wakeup encoded in SRR1. Today, powernv idle wakeup ignores the wakeup reason (except a special case for HMI), and the regular interrupt corresponding to the exception will fire after the idle wakeup exits. Change this to replay the interrupt from the idle wakeup before interrupts are hard-enabled. Test on POWER8 of context_switch selftests benchmark with polling idle disabled (e.g., always nap, giving cross-CPU IPIs) gives the following results: original wakeup direct Different threads, same core: 315k/s 264k/s Different cores: 235k/s 242k/s There is a slowdown for doorbell IPI (same core) case because system reset wakeup does not clear the message and the doorbell interrupt fires again needlessly. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/64s/idle: Move soft interrupt mask logic into C codeNicholas Piggin2017-06-191-1/+32
| | | | | | | | | | | | This simplifies the asm and fixes irq-off tracing over sleep instructions. Also move powersave_nap check for POWER8 into C code, and move PSSCR register value calculation for POWER9 into C. Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* Revert "powerpc: Handle simultaneous interrupts at once"Michael Ellerman2017-06-151-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 45cb08f4791ce6a15c54598b4cb73db4b4b8294f. For some reason this is causing IRQ problems on Freescale Book3E machines, eg on my p5020ds: irq 25: nobody cared (try booting with the "irqpoll" option) CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc3-gcc-6.3.1-00037-g45cb08f4791c #624 Call Trace: [c0000000fffdbb10] [c00000000049962c] .dump_stack+0xa8/0xe8 (unreliable) [c0000000fffdbba0] [c0000000000babf4] .__report_bad_irq+0x54/0x140 [c0000000fffdbc40] [c0000000000bb11c] .note_interrupt+0x324/0x380 [c0000000fffdbd00] [c0000000000b7110] .handle_irq_event_percpu+0x68/0x88 [c0000000fffdbd90] [c0000000000b718c] .handle_irq_event+0x5c/0xa8 [c0000000fffdbe10] [c0000000000bc01c] .handle_fasteoi_irq+0xe4/0x298 [c0000000fffdbe90] [c0000000000b59c4] .generic_handle_irq+0x50/0x74 [c0000000fffdbf10] [c0000000000075d8] .__do_irq+0x74/0x1f0 [c0000000fffdbf90] [c0000000000189f8] .call_do_irq+0x14/0x24 [c0000000f7173060] [c0000000000077e4] .do_IRQ+0x90/0x120 [c0000000f7173100] [c00000000001d93c] exc_0x500_common+0xfc/0x100 --- interrupt: 501 at .prepare_to_wait_event+0xc/0x14c LR = .fsl_elbc_run_command+0xc8/0x23c [c0000000f71734d0] [c00000000065f418] .nand_reset+0xb8/0x168 [c0000000f7173560] [c00000000065fec4] .nand_scan_ident+0x2b0/0x1638 [c0000000f7173650] [c000000000666cd8] .fsl_elbc_nand_probe+0x34c/0x5f0 ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300) [c0000000f7173750] [c0000000005a3c60] .platform_drv_probe+0x64/0xb0 [c0000000f71737d0] [c0000000005a12e0] .really_probe+0x290/0x334 [c0000000f7173870] [c0000000005a14a0] .__driver_attach+0x11c/0x120 [c0000000f7173900] [c00000000059e6a0] .bus_for_each_dev+0x98/0xfc [c0000000f71739a0] [c0000000005a0b3c] .driver_attach+0x34/0x4c [c0000000f7173a20] [c0000000005a04b0] .bus_add_driver+0x1ac/0x2e0 [c0000000f7173ac0] [c0000000005a2170] .driver_register+0x94/0x160 [c0000000f7173b40] [c0000000005a3be0] .__platform_driver_register+0x60/0x7c [c0000000f7173bc0] [c000000000d6aab4] .fsl_elbc_nand_driver_init+0x24/0x38 [c0000000f7173c30] [c000000000001934] .do_one_initcall+0x68/0x1b8 [c0000000f7173d00] [c000000000d210f8] .kernel_init_freeable+0x260/0x338 [c0000000f7173db0] [c0000000000021b0] .kernel_init+0x20/0xe70 [c0000000f7173e30] [c0000000000009bc] .ret_from_kernel_thread+0x58/0x9c handlers: [<c000000000ed85c8>] .fsl_lbc_ctrl_irq Disabling IRQ #25 Ben also had concerns with the implementation being potentially slow on some PICs, so revert it for now. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Handle simultaneous interrupts at onceChristophe Leroy2017-06-021-1/+5
| | | | | | | | | | | | | It often happens to have simultaneous interrupts, for instance when having double Ethernet attachment. With the current implementation, we suffer the cost of kernel entry/exit for each interrupt. This patch introduces a loop in __do_irq() to handle all interrupts at once before returning. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* Merge branch 'topic/xive' (early part) into nextMichael Ellerman2017-04-121-40/+0
|\ | | | | | | | | | | This merges the arch part of the XIVE support, leaving the final commit with the KVM specific pieces dangling on the branch for Paul to merge via the kvm-ppc tree.
| * powerpc/smp: Remove migrate_irq() custom implementationBenjamin Herrenschmidt2017-04-071-40/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some powerpc platforms use this to move IRQs away from a CPU being unplugged. This function has several bugs such as not taking the right locks or failing to NULL check pointers. There's a new generic function doing exactly the same thing without all the bugs, so let's use it instead. mpe: The obvious place for the select of GENERIC_IRQ_MIGRATION is on HOTPLUG_CPU, but that doesn't work. On some configs PM_SLEEP_SMP will select HOTPLUG_CPU even though its dependencies are not met, which means the select of GENERIC_IRQ_MIGRATION doesn't happen. That leads to the build breaking. Fix it by moving the select of GENERIC_IRQ_MIGRATION to SMP. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* | powerpc: Remove unnecessary includes of asm/debug.hMichael Ellerman2017-04-111-1/+0
|/ | | | | | | These files don't seem to have any need for asm/debug.h, now that all it includes are the debugger hooks and breakpoint definitions. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* Replace <asm/uaccess.h> with <linux/uaccess.h> globallyLinus Torvalds2016-12-241-1/+1
| | | | | | | | | | | | | This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* powerpc: Remove all usages of NO_IRQMichael Ellerman2016-09-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NO_IRQ has been == 0 on powerpc for just over ten years (since commit 0ebfff1491ef ("[POWERPC] Add new interrupt mapping core and change platforms to use it")). It's also 0 on most other arches. Although it's fairly harmless, every now and then it causes confusion when a driver is built on powerpc and another arch which doesn't define NO_IRQ. There's at least 6 definitions of NO_IRQ in drivers/, at least some of which are to work around that problem. So we'd like to remove it. This is fairly trivial in the arch code, we just convert: if (irq == NO_IRQ) to if (!irq) if (irq != NO_IRQ) to if (irq) irq = NO_IRQ; to irq = 0; return NO_IRQ; to return 0; And a few other odd cases as well. At least for now we keep the #define NO_IRQ, because there is driver code that uses NO_IRQ and the fixes to remove those will go via other trees. Note we also change some occurrences in PPC sound drivers, drivers/ps3, and drivers/macintosh. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/64: Replay hypervisor maintenance interrupt firstNicholas Piggin2016-09-201-5/+9
| | | | | | | | | | The HMI (Hypervisor Maintenance Interrupt) is defined by the architecture to be higher priority than other maskable interrupts, so replay it first, as a best-effort to replay according to hardware priorities. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/sparse: Add more assembler prototypesDaniel Axtens2016-09-131-0/+1
| | | | | | | | Another set of things that are only called from assembler and so need prototypes to keep sparse happy. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Move cpu_has_feature() to a separate fileKevin Hao2016-08-011-0/+1
| | | | | | | | | | | | | | | | | We plan to use jump label for cpu_has_feature(). In order to implement this we need to include the linux/jump_label.h in asm/cputable.h. Unfortunately if we do that it leads to an include loop. The root of the problem seems to be that reg.h needs cputable.h (for CPU_FTRs), and then cputable.h via jump_label.h eventually pulls in hw_irq.h which needs reg.h (for MSR_EE). So move cpu_has_feature() to a separate file on its own. Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [mpe: Rename to cpu_has_feature.h and flesh out change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/irq: Add mechanism to force a replay of interruptsBenjamin Herrenschmidt2016-07-171-0/+15
| | | | | | | | | | | | | Calling this function with interrupts soft-disabled will cause a replay of the external interrupt vector when they are re-enabled. This will be used by the OPAL XICS backend (and latter by the native XIVE code) to handle EOI signaling that there are more interrupts to fetch from the hardware since the hardware won't issue another HW interrupt in that case. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Fix typo in comment reference to CONFIG_TRACE_IRQFLAGSAndrew Donnellan2016-07-081-1/+1
| | | | | Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/livepatch: Add livepatch stack to struct thread_infoMichael Ellerman2016-04-141-0/+3
| | | | | | | | | | | | | | | In order to support live patching we need to maintain an alternate stack of TOC & LR values. We use the base of the stack for this, and store the "live patch stack pointer" in struct thread_info. Unlike the other fields of thread_info, we can not statically initialise that value, so it must be done at run time. This patch just adds the code to support that, it is not enabled until the next patch which actually adds live patch support. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Balbir Singh <bsingharora@gmail.com>
* powerpc, irq: Use access helper irq_data_get_affinity_mask()Jiang Liu2015-09-151-1/+1
| | | | | | | | | | | | Use access helper irq_data_get_affinity_mask() so we can move the affinity mask to irq_common_data. Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1433145945-789-25-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* powerpc: Remove superfluous bootmem includesAnton Blanchard2014-11-101-1/+0
| | | | | | | | Lots of places included bootmem.h even when not using bootmem. Signed-off-by: Anton Blanchard <anton@samba.org> Tested-by: Emil Medve <Emilian.Medve@Freescale.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc: Replace __get_cpu_var usesChristoph Lameter2014-11-031-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This still has not been merged and now powerpc is the only arch that does not have this change. Sorry about missing linuxppc-dev before. V2->V2 - Fix up to work against 3.18-rc1 __get_cpu_var() is used for multiple purposes in the kernel source. One of them is address calculation via the form &__get_cpu_var(x). This calculates the address for the instance of the percpu variable of the current processor based on an offset. Other use cases are for storing and retrieving data from the current processors percpu area. __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment. __get_cpu_var() is defined as : __get_cpu_var() always only does an address determination. However, store and retrieve operations could use a segment prefix (or global register on other platforms) to avoid the address calculation. this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use optimized assembly code to read and write per cpu variables. This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided and less registers are used when code is generated. At the end of the patch set all uses of __get_cpu_var have been removed so the macro is removed too. The patch set includes passes over all arches as well. Once these operations are used throughout then specialized macros can be defined in non -x86 arches as well in order to optimize per cpu access by f.e. using a global register that may be set to the per cpu base. Transformations done to __get_cpu_var() 1. Determine the address of the percpu instance of the current processor. DEFINE_PER_CPU(int, y); int *x = &__get_cpu_var(y); Converts to int *x = this_cpu_ptr(&y); 2. Same as #1 but this time an array structure is involved. DEFINE_PER_CPU(int, y[20]); int *x = __get_cpu_var(y); Converts to int *x = this_cpu_ptr(y); 3. Retrieve the content of the current processors instance of a per cpu variable. DEFINE_PER_CPU(int, y); int x = __get_cpu_var(y) Converts to int x = __this_cpu_read(y); 4. Retrieve the content of a percpu struct DEFINE_PER_CPU(struct mystruct, y); struct mystruct x = __get_cpu_var(y); Converts to memcpy(&x, this_cpu_ptr(&y), sizeof(x)); 5. Assignment to a per cpu variable DEFINE_PER_CPU(int, y) __get_cpu_var(y) = x; Converts to __this_cpu_write(y, x); 6. Increment/Decrement etc of a per cpu variable DEFINE_PER_CPU(int, y); __get_cpu_var(y)++ Converts to __this_cpu_inc(y) Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> Signed-off-by: Christoph Lameter <cl@linux.com> [mpe: Fix build errors caused by set/or_softirq_pending(), and rework assignment in __set_breakpoint() to use memcpy().] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>