summaryrefslogtreecommitdiffstats
path: root/arch/powerpc
Commit message (Collapse)AuthorAgeFilesLines
* powerpc/64s/interrupt: Fix lost interrupts when returning to soft-masked contextNicholas Piggin2022-10-211-2/+13
| | | | | | | | | | | | | | | | | | | | | | commit a4cb3651a174366cc85a677da9e3681fbe97fdae upstream. It's possible for an interrupt returning to an irqs-disabled context to lose a pending soft-masked irq because it branches to part of the exit code for irqs-enabled contexts, which is meant to clear only the PACA_IRQS_HARD_DIS flag from PACAIRQHAPPENED by zeroing the byte. This just looks like a simple thinko from a recent commit (if there was no hard mask pending, there would be no reason to clear it anyway). This also adds comment to the code that actually does need to clear the flag. Fixes: e485f6c751e0a ("powerpc/64/interrupt: Fix return to masked context after hard-mask irq becomes pending") Reported-by: Sachin Sant <sachinp@linux.ibm.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221013064418.1311104-1-npiggin@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* powerpc/pseries/vas: Pass hw_cpu_id to node associativity HCALLHaren Myneni2022-10-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit f3e5d9e53e74d77e711a2c90a91a8b0836a9e0b3 ] Generally the hypervisor decides to allocate a window on different VAS instances. But if user space wishes to allocate on the current VAS instance where the process is executing, the kernel has to pass associativity domain IDs to allocate VAS window HCALL. To determine the associativity domain IDs for the current CPU, smp_processor_id() is passed to node associativity HCALL which may return H_P2 (-55) error during DLPAR CPU event. This is because Linux CPU numbers (smp_processor_id()) are not the same as the hypervisor's view of CPU numbers. Fix the issue by passing hard_smp_processor_id() with VPHN_FLAG_VCPU flag (PAPR 14.11.6.1 H_HOME_NODE_ASSOCIATIVITY). Fixes: b22f2d88e435 ("powerpc/pseries/vas: Integrate API with open/close windows") Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Haren Myneni <haren@linux.ibm.com> [mpe: Update change log to mention Linux vs HV CPU numbers] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/55380253ea0c11341824cd4c0fc6bbcfc5752689.camel@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
* powerpc/kprobes: Fix null pointer reference in arch_prepare_kprobe()Li Huafei2022-10-211-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 97f88a3d723162781d6cbfdc7b9617eefab55b19 ] I found a null pointer reference in arch_prepare_kprobe(): # echo 'p cmdline_proc_show' > kprobe_events # echo 'p cmdline_proc_show+16' >> kprobe_events Kernel attempted to read user page (0) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x00000000 Faulting instruction address: 0xc000000000050bfc Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: CPU: 0 PID: 122 Comm: sh Not tainted 6.0.0-rc3-00007-gdcf8e5633e2e #10 NIP: c000000000050bfc LR: c000000000050bec CTR: 0000000000005bdc REGS: c0000000348475b0 TRAP: 0300 Not tainted (6.0.0-rc3-00007-gdcf8e5633e2e) MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 88002444 XER: 20040006 CFAR: c00000000022d100 DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 0 ... NIP arch_prepare_kprobe+0x10c/0x2d0 LR arch_prepare_kprobe+0xfc/0x2d0 Call Trace: 0xc0000000012f77a0 (unreliable) register_kprobe+0x3c0/0x7a0 __register_trace_kprobe+0x140/0x1a0 __trace_kprobe_create+0x794/0x1040 trace_probe_create+0xc4/0xe0 create_or_delete_trace_kprobe+0x2c/0x80 trace_parse_run_command+0xf0/0x210 probes_write+0x20/0x40 vfs_write+0xfc/0x450 ksys_write+0x84/0x140 system_call_exception+0x17c/0x3a0 system_call_vectored_common+0xe8/0x278 --- interrupt: 3000 at 0x7fffa5682de0 NIP: 00007fffa5682de0 LR: 0000000000000000 CTR: 0000000000000000 REGS: c000000034847e80 TRAP: 3000 Not tainted (6.0.0-rc3-00007-gdcf8e5633e2e) MSR: 900000000280f033 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 44002408 XER: 00000000 The address being probed has some special: cmdline_proc_show: Probe based on ftrace cmdline_proc_show+16: Probe for the next instruction at the ftrace location The ftrace-based kprobe does not generate kprobe::ainsn::insn, it gets set to NULL. In arch_prepare_kprobe() it will check for: ... prev = get_kprobe(p->addr - 1); preempt_enable_no_resched(); if (prev && ppc_inst_prefixed(ppc_inst_read(prev->ainsn.insn))) { ... If prev is based on ftrace, 'ppc_inst_read(prev->ainsn.insn)' will occur with a null pointer reference. At this point prev->addr will not be a prefixed instruction, so the check can be skipped. Check if prev is ftrace-based kprobe before reading 'prev->ainsn.insn' to fix this problem. Fixes: b4657f7650ba ("powerpc/kprobes: Don't allow breakpoints on suffixes") Signed-off-by: Li Huafei <lihuafei1@huawei.com> [mpe: Trim oops] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220923093253.177298-1-lihuafei1@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
* powerpc: Fix SPE Power ISA properties for e500v1 platformsPali Rohár2022-10-215-4/+55
| | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 37b9345ce7f4ab17538ea62def6f6d430f091355 ] Commit 2eb28006431c ("powerpc/e500v2: Add Power ISA properties to comply with ePAPR 1.1") introduced new include file e500v2_power_isa.dtsi and should have used it for all e500v2 platforms. But apparently it was used also for e500v1 platforms mpc8540, mpc8541, mpc8555 and mpc8560. e500v1 cores compared to e500v2 do not support double precision floating point SPE instructions. Hence power-isa-sp.fd should not be set on e500v1 platforms, which is in e500v2_power_isa.dtsi include file. Fix this issue by introducing a new e500v1_power_isa.dtsi include file and use it in all e500v1 device tree files. Fixes: 2eb28006431c ("powerpc/e500v2: Add Power ISA properties to comply with ePAPR 1.1") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220902212103.22534-1-pali@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
* powerpc/64/interrupt: Fix return to masked context after hard-mask irq ↵Nicholas Piggin2022-10-212-13/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | becomes pending [ Upstream commit e485f6c751e0a969327336c635ca602feea117f0 ] If a synchronous interrupt (e.g., hash fault) is taken inside an irqs-disabled region which has MSR[EE]=1, then an asynchronous interrupt that is PACA_IRQ_MUST_HARD_MASK (e.g., PMI) is taken inside the synchronous interrupt handler, then the synchronous interrupt will return with MSR[EE]=1 and the asynchronous interrupt fires again. If the asynchronous interrupt is a PMI and the original context does not have PMIs disabled (only Linux IRQs), the asynchronous interrupt will fire despite having the PMI marked soft pending. This can confuse the perf code and cause warnings. This patch changes the interrupt return so that irqs-disabled MSR[EE]=1 contexts will be returned to with MSR[EE]=0 if a PACA_IRQ_MUST_HARD_MASK interrupt has become pending in the meantime. The longer explanation for what happens: 1. local_irq_disable() 2. Hash fault interrupt fires, do_hash_fault handler runs 3. interrupt_enter_prepare() sets IRQS_ALL_DISABLED 4. interrupt_enter_prepare() sets MSR[EE]=1 5. PMU interrupt fires, masked handler runs 6. Masked handler marks PMI pending 7. Masked handler returns with PACA_IRQ_HARD_DIS set, MSR[EE]=0 8. do_hash_fault interrupt return handler runs 9. interrupt_exit_kernel_prepare() clears PACA_IRQ_HARD_DIS 10. interrupt returns with MSR[EE]=1 11. PMU interrupt fires, perf handler runs Fixes: 4423eb5ae32e ("powerpc/64/interrupt: make normal synchronous interrupts enable MSR[EE] if possible") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926054305.2671436-4-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
* powerpc/64: mark irqs hard disabled in boot pacaNicholas Piggin2022-10-211-1/+3
| | | | | | | | | | | | | | [ Upstream commit 799f7063c7645f9a751d17f5dfd73b952f962cd2 ] This prevents interrupts in early boot (e.g., program check) from enabling MSR[EE], potentially causing endian mismatch or other crashes when reporting early boot traps. Fixes: 4423eb5ae32ec ("powerpc/64/interrupt: make normal synchronous interrupts enable MSR[EE] if possible") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926054305.2671436-3-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
* powerpc/64/interrupt: Fix false warning in context tracking due to idle stateNicholas Piggin2022-10-211-1/+2
| | | | | | | | | | | | | | [ Upstream commit 56adbb7a8b6cc7fc9b940829c38494e53c9e57d1 ] Commit 171476775d32 ("context_tracking: Convert state to atomic_t") added a CONTEXT_IDLE state which can be encountered by interrupts from kernel mode in the idle thread, causing a false positive warning. Fixes: 171476775d32 ("context_tracking: Convert state to atomic_t") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926054305.2671436-2-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
* powerpc/64s: Fix GENERIC_CPU build flags for PPC970 / G5Nicholas Piggin2022-10-211-1/+1
| | | | | | | | | | | | | | | | | [ Upstream commit 58ec7f06b74e0d6e76c4110afce367c8b5f0837d ] Big-endian GENERIC_CPU supports 970, but builds with -mcpu=power5. POWER5 is ISA v2.02 whereas 970 is v2.01 plus Altivec. 2.02 added the popcntb instruction which a compiler might use. Use -mcpu=power4. Fixes: 471d7ff8b51b ("powerpc/64s: Remove POWER4 support") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921014103.587954-1-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
* powerpc: Fix fallocate and fadvise64_64 compat parameter combinationRohan McLure2022-10-213-15/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 016ff72bd2090903715c0f9422a44afbb966f4ee ] As reported[1] by Arnd, the arch-specific fadvise64_64 and fallocate compatibility handlers assume parameters are passed with 32-bit big-endian ABI. This affects the assignment of odd-even parameter pairs to the high or low words of a 64-bit syscall parameter. Fix fadvise64_64 fallocate compat handlers to correctly swap upper/lower 32 bits conditioned on endianness. A future patch will replace the arch-specific compat fallocate with an asm-generic implementation. This patch is intended for ease of back-port. [1]: https://lore.kernel.org/all/be29926f-226e-48dc-871a-e29a54e80583@www.fastmail.com/ Fixes: 57f48b4b74e7 ("powerpc/compat_sys: swap hi/lo parts of 64-bit syscall args in LE mode") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rohan McLure <rmclure@linux.ibm.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220921065605.1051927-9-rmclure@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
* powerpc: dts: turris1x.dts: Fix labels in DSA cpu port nodesPali Rohár2022-10-211-2/+2
| | | | | | | | | | | | | [ Upstream commit 8bf056f57f1d16c561e43f9af37301f23990cd21 ] DSA cpu port node has to be marked with "cpu" label. So fix it for both cpu port nodes. Fixes: 54c15ec3b738 ("powerpc: dts: Add DTS file for CZ.NIC Turris 1.x routers") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220827131538.14577-1-pali@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
* powerpc: dts: turris1x.dts: Fix NOR partitions labelsPali Rohár2022-10-211-5/+5
| | | | | | | | | | | | | | | | | [ Upstream commit c9986f0aefd1ae22fe9cf794d49699643f1e268b ] Partition partition@20000 contains generic kernel image and it does not have to be used only for rescue purposes. Partition partition@1c0000 contains bootable rescue system and partition partition@340000 contains factory image/data for restoring to NAND. So change partition labels to better fit their purpose by removing possible misleading substring "rootfs" from these labels. Fixes: 54c15ec3b738 ("powerpc: dts: Add DTS file for CZ.NIC Turris 1.x routers") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220830225500.8856-1-pali@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
* powerpc/powernv: add missing of_node_put() in opal_export_attrs()Zheng Yongjun2022-10-211-0/+1
| | | | | | | | | | | | | [ Upstream commit 71a92e99c47900cc164620948b3863382cec4f1a ] After using 'np' returned by of_find_node_by_path(), of_node_put() need be called to decrease the refcount. Fixes: 11fe909d2362 ("powerpc/powernv: Add OPAL exports attributes to sysfs") Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220906141703.118192-1-zhengyongjun3@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
* powerpc/pci_dn: Add missing of_node_put()Liang He2022-10-211-0/+1
| | | | | | | | | | | | | | | [ Upstream commit 110a1fcb6c4d55144d8179983a475f17a1d6f832 ] In pci_add_device_node_info(), use of_node_put() to drop the reference to 'parent' returned by of_get_parent() to keep refcount balance. Fixes: cca87d303c85 ("powerpc/pci: Refactor pci_dn") Co-authored-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Liang He <windhl@126.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Link: https://lore.kernel.org/r/20220701131750.240170-1-windhl@126.com Signed-off-by: Sasha Levin <sashal@kernel.org>
* powerpc/sysdev/fsl_msi: Add missing of_node_put()Liang He2022-10-211-0/+2
| | | | | | | | | | | | | | [ Upstream commit def435c04ee984a5f9ed2711b2bfe946936c6a21 ] In fsl_setup_msi_irqs(), use of_node_put() to drop the reference returned by of_parse_phandle(). Fixes: 895d603f945ba ("powerpc/fsl_msi: add support for the fsl, msi property in PCI nodes") Co-authored-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Liang He <windhl@126.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220704145233.278539-1-windhl@126.com Signed-off-by: Sasha Levin <sashal@kernel.org>
* powerpc/math_emu/efp: Include module.hNathan Chancellor2022-10-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit cfe0d370e0788625ce0df3239aad07a2506c1796 ] When building with a recent version of clang, there are a couple of errors around the call to module_init(): arch/powerpc/math-emu/math_efp.c:927:1: error: type specifier missing, defaults to 'int'; ISO C99 and later do not support implicit int [-Wimplicit-int] module_init(spe_mathemu_init); ^ int arch/powerpc/math-emu/math_efp.c:927:13: error: a parameter list without types is only allowed in a function definition module_init(spe_mathemu_init); ^ 2 errors generated. module_init() is a macro, which is not getting expanded because module.h is not included in this file. Add the include so that the macro can expand properly, clearing up the build failure. Fixes: ac6f120369ff ("powerpc/85xx: Workaroudn e500 CPU erratum A005") [chleroy: added fixes tag] Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Link: https://lore.kernel.org/r/8403854a4c187459b2f4da3537f51227b70b9223.1662134272.git.christophe.leroy@csgroup.eu Signed-off-by: Sasha Levin <sashal@kernel.org>
* powerpc/configs: Properly enable PAPR_SCM in pseries_defconfigMichael Ellerman2022-10-211-0/+1
| | | | | | | | | | | | | | [ Upstream commit aa398d88aea4ec863bd7aea35d5035a37096dc59 ] My commit to add PAPR_SCM to pseries_defconfig failed to add the required dependencies, meaning the driver doesn't get built. Add the required LIBNVDIMM=m. Fixes: d6481a7195df ("powerpc/configs: Add PAPR_SCM to pseries_defconfig") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220901014253.252927-1-mpe@ellerman.id.au Signed-off-by: Sasha Levin <sashal@kernel.org>
* powerpc/boot: Explicitly disable usage of SPE instructionsPali Rohár2022-10-211-0/+1
| | | | | | | | | | | | | | | commit 110a58b9f91c66f743c01a2c217243d94c899c23 upstream. uImage boot wrapper should not use SPE instructions, like kernel itself. Boot wrapper has already disabled Altivec and VSX instructions but not SPE. Options -mno-spe and -mspe=no already set when compilation of kernel, but not when compiling uImage wrapper yet. Fix it. Cc: stable@vger.kernel.org Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220827134454.17365-1-pali@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* powerpc/Kconfig: Fix non existing CONFIG_PPC_FSL_BOOKEChristophe Leroy2022-10-211-1/+1
| | | | | | | | | | | | | commit d1203f32d86987a3ccd7de9ba2448ba12d86d125 upstream. CONFIG_PPC_FSL_BOOKE doesn't exist. Should be CONFIG_FSL_BOOKE. Fixes: 49e3d8ea6248 ("powerpc/fsl_booke: Enable STRICT_KERNEL_RWX") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/828f6a64eeb51ce9abfa1d4e84c521a02fecebb8.1663606875.git.christophe.leroy@csgroup.eu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Revert "powerpc/rtas: Implement reentrant rtas call"Nathan Lynch2022-10-155-99/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit f88aabad33ea22be2ce1c60d8901942e4e2a9edb upstream. At the time this was submitted by Leonardo, I confirmed -- or thought I had confirmed -- with PowerVM partition firmware development that the following RTAS functions: - ibm,get-xive - ibm,int-off - ibm,int-on - ibm,set-xive were safe to call on multiple CPUs simultaneously, not only with respect to themselves as indicated by PAPR, but with arbitrary other RTAS calls: https://lore.kernel.org/linuxppc-dev/875zcy2v8o.fsf@linux.ibm.com/ Recent discussion with firmware development makes it clear that this is not true, and that the code in commit b664db8e3f97 ("powerpc/rtas: Implement reentrant rtas call") is unsafe, likely explaining several strange bugs we've seen in internal testing involving DLPAR and LPM. These scenarios use ibm,configure-connector, whose internal state can be corrupted by the concurrent use of the "reentrant" functions, leading to symptoms like endless busy statuses from RTAS. Fixes: b664db8e3f97 ("powerpc/rtas: Implement reentrant rtas call") Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Laurent Dufour <laurent.dufour@fr.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220907220111.223267-1-nathanl@linux.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Merge tag 'mm-hotfixes-stable-2022-09-26' of ↵Linus Torvalds2022-09-261-9/+0
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull last (?) hotfixes from Andrew Morton: "26 hotfixes. 8 are for issues which were introduced during this -rc cycle, 18 are for earlier issues, and are cc:stable" * tag 'mm-hotfixes-stable-2022-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (26 commits) x86/uaccess: avoid check_object_size() in copy_from_user_nmi() mm/page_isolation: fix isolate_single_pageblock() isolation behavior mm,hwpoison: check mm when killing accessing process mm/hugetlb: correct demote page offset logic mm: prevent page_frag_alloc() from corrupting the memory mm: bring back update_mmu_cache() to finish_fault() frontswap: don't call ->init if no ops are registered mm/huge_memory: use pfn_to_online_page() in split_huge_pages_all() mm: fix madivse_pageout mishandling on non-LRU page powerpc/64s/radix: don't need to broadcast IPI for radix pmd collapse flush mm: gup: fix the fast GUP race against THP collapse mm: fix dereferencing possible ERR_PTR vmscan: check folio_test_private(), not folio_get_private() mm: fix VM_BUG_ON in __delete_from_swap_cache() tools: fix compilation after gfp_types.h split mm/damon/dbgfs: fix memory leak when using debugfs_lookup() mm/migrate_device.c: copy pte dirty bit to page mm/migrate_device.c: add missing flush_cache_page() mm/migrate_device.c: flush TLB while holding PTL x86/mm: disable instrumentations of mm/pgprot.c ...
| * powerpc/64s/radix: don't need to broadcast IPI for radix pmd collapse flushYang Shi2022-09-261-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The IPI broadcast is used to serialize against fast-GUP, but fast-GUP will move to use RCU instead of disabling local interrupts in fast-GUP. Using an IPI is the old-styled way of serializing against fast-GUP although it still works as expected now. And fast-GUP now fixed the potential race with THP collapse by checking whether PMD is changed or not. So IPI broadcast in radix pmd collapse flush is not necessary anymore. But it is still needed for hash TLB. Link: https://lkml.kernel.org/r/20220907180144.555485-2-shy828301@gmail.com Suggested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Yang Shi <shy828301@gmail.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | Merge tag 'powerpc-6.0-5' of ↵Linus Torvalds2022-09-091-1/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Michael Ellerman: - Fix crashes on bare metal due to the new plkps driver trying to probe and call the hypervisor on non-pseries machines. Thanks to Nathan Chancellor and Dan Horák. * tag 'powerpc-6.0-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/pseries: Fix plpks crash on non-pseries
| * | powerpc/pseries: Fix plpks crash on non-pseriesMichael Ellerman2022-09-081-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As reported[1] by Nathan, the recently added plpks driver will crash if it's built into the kernel and booted on a non-pseries machine, eg powernv: kernel BUG at arch/powerpc/kernel/syscall.c:39! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV ... NIP system_call_exception+0x90/0x3d0 LR system_call_common+0xec/0x250 Call Trace: 0xc0000000035c3e10 (unreliable) system_call_common+0xec/0x250 --- interrupt: c00 at plpar_hcall+0x38/0x60 NIP: c0000000000e4300 LR: c00000000202945c CTR: 0000000000000000 REGS: c0000000035c3e80 TRAP: 0c00 Not tainted (6.0.0-rc4) MSR: 9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR: 28000284 XER: 00000000 ... NIP plpar_hcall+0x38/0x60 LR pseries_plpks_init+0x64/0x23c --- interrupt: c00 On powernv Linux is the hypervisor, so a hypercall just ends up going to the syscall path, which BUGs if the syscall (hypercall) didn't come from userspace. The fix is simply to not probe the plpks driver on non-pseries machines. [1] https://lore.kernel.org/linuxppc-dev/Yxe06fbq18Wv9y3W@dev-arch.thelio-3990X/ Fixes: 2454a7af0f2a ("powerpc/pseries: define driver for Platform KeyStore") Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Dan Horák <dan@danny.cz> Reviewed-by: Dan Horák <dan@danny.cz> Tested-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20220907065038.1604504-1-mpe@ellerman.id.au
* | | Merge tag 'asm-generic-fixes-6.0-rc4' of ↵Linus Torvalds2022-09-091-2/+2
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull SOFTIRQ_ON_OWN_STACK rework from Arnd Bergmann: "Just one fixup patch, reworking the softirq_on_own_stack logic for preempt-rt kernels as discussed in https://lore.kernel.org/all/CAHk-=wgZSD3W2y6yczad2Am=EfHYyiPzTn3CfXxrriJf9i5W5w@mail.gmail.com/" * tag 'asm-generic-fixes-6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: asm-generic: Conditionally enable do_softirq_own_stack() via Kconfig.
| * | asm-generic: Conditionally enable do_softirq_own_stack() via Kconfig.Sebastian Andrzej Siewior2022-09-051-2/+2
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the CONFIG_PREEMPT_RT symbol from the ifdef around do_softirq_own_stack() and move it to Kconfig instead. Enable softirq stacks based on SOFTIRQ_ON_OWN_STACK which depends on HAVE_SOFTIRQ_ON_OWN_STACK and its default value is set to !PREEMPT_RT. This ensures that softirq stacks are not used on PREEMPT_RT and avoids a 'select' statement on an option which has a 'depends' statement. Link: https://lore.kernel.org/YvN5E%2FPrHfUhggr7@linutronix.de Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
* | powerpc/papr_scm: Ensure rc is always initialized in papr_scm_pmu_register()Nathan Chancellor2022-09-021-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clang warns: arch/powerpc/platforms/pseries/papr_scm.c:492:6: warning: variable 'rc' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized] if (!p->stat_buffer_len) ^~~~~~~~~~~~~~~~~~~ arch/powerpc/platforms/pseries/papr_scm.c:523:64: note: uninitialized use occurs here dev_info(&p->pdev->dev, "nvdimm pmu didn't register rc=%d\n", rc); ^~ include/linux/dev_printk.h:150:67: note: expanded from macro 'dev_info' dev_printk_index_wrap(_dev_info, KERN_INFO, dev, dev_fmt(fmt), ##__VA_ARGS__) ^~~~~~~~~~~ include/linux/dev_printk.h:110:23: note: expanded from macro 'dev_printk_index_wrap' _p_func(dev, fmt, ##__VA_ARGS__); \ ^~~~~~~~~~~ arch/powerpc/platforms/pseries/papr_scm.c:492:2: note: remove the 'if' if its condition is always false if (!p->stat_buffer_len) ^~~~~~~~~~~~~~~~~~~~~~~~ arch/powerpc/platforms/pseries/papr_scm.c:484:8: note: initialize the variable 'rc' to silence this warning int rc, nodeid; ^ = 0 1 warning generated. The call to papr_scm_pmu_check_events() was eliminated but a return code was not added to the if statement. Add the same return code from papr_scm_pmu_check_events() for this condition so there is no more warning. Fixes: 9b1ac04698a4 ("powerpc/papr_scm: Fix nvdimm event mappings") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://github.com/ClangBuiltLinux/linux/issues/1701 Link: https://lore.kernel.org/r/20220830151256.1473169-1-nathan@kernel.org
* | Revert "powerpc/irq: Don't open code irq_soft_mask helpers"Michael Ellerman2022-09-021-7/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit ef5b570d3700fbb8628a58da0487486ceeb713cd. Zhouyi reported that commit is causing crashes when running rcutorture with KASAN enabled: BUG: using smp_processor_id() in preemptible [00000000] code: rcu_torture_rea/100 caller is rcu_preempt_deferred_qs_irqrestore+0x74/0xed0 CPU: 4 PID: 100 Comm: rcu_torture_rea Tainted: G W 5.19.0-rc5-next-20220708-dirty #253 Call Trace: dump_stack_lvl+0xbc/0x108 (unreliable) check_preemption_disabled+0x154/0x160 rcu_preempt_deferred_qs_irqrestore+0x74/0xed0 __rcu_read_unlock+0x290/0x3b0 rcu_torture_read_unlock+0x30/0xb0 rcutorture_one_extend+0x198/0x810 rcu_torture_one_read+0x58c/0xc90 rcu_torture_reader+0x12c/0x360 kthread+0x1e8/0x220 ret_from_kernel_thread+0x5c/0x64 KASAN will generate instrumentation instructions around the WRITE_ONCE(local_paca->irq_soft_mask, mask): 0xc000000000295cb0 <+0>: addis r2,r12,774 0xc000000000295cb4 <+4>: addi r2,r2,16464 0xc000000000295cb8 <+8>: mflr r0 0xc000000000295cbc <+12>: bl 0xc00000000008bb4c <mcount> 0xc000000000295cc0 <+16>: mflr r0 0xc000000000295cc4 <+20>: std r31,-8(r1) 0xc000000000295cc8 <+24>: addi r3,r13,2354 0xc000000000295ccc <+28>: mr r31,r13 0xc000000000295cd0 <+32>: std r0,16(r1) 0xc000000000295cd4 <+36>: stdu r1,-48(r1) 0xc000000000295cd8 <+40>: bl 0xc000000000609b98 <__asan_store1+8> 0xc000000000295cdc <+44>: nop 0xc000000000295ce0 <+48>: li r9,1 0xc000000000295ce4 <+52>: stb r9,2354(r31) 0xc000000000295ce8 <+56>: addi r1,r1,48 0xc000000000295cec <+60>: ld r0,16(r1) 0xc000000000295cf0 <+64>: ld r31,-8(r1) 0xc000000000295cf4 <+68>: mtlr r0 If there is a context switch before "stb r9,2354(r31)", r31 may not equal to r13, in such case, irq soft mask will not work. The usual solution of marking the code ineligible for instrumentation forces the code out-of-line, which we would prefer to avoid. Christophe proposed a partial revert, but Nick raised some concerns with that. So for now do a full revert. Reported-by: Zhouyi Zhou <zhouzhouyi@gmail.com> [mpe: Construct change log based on Zhouyi's original report] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220831131052.42250-1-mpe@ellerman.id.au
* | powerpc: Fix hard_irq_disable() with sanitizerChristophe Leroy2022-08-311-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | As reported by Zhouyi Zhou, WRITE_ONCE() is not atomic as expected when KASAN or KCSAN are compiled in. Fix it by re-implementing it using inline assembly. Fixes: 077fc62b2b66 ("powerpc/irq: remove inline assembly in hard_irq_disable macro") Reported-by: Zhouyi Zhou <zhouzhouyi@gmail.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a8298991b3df049a54ee8e558838e34265812014.1661272586.git.christophe.leroy@csgroup.eu
* | powerpc/rtas: Fix RTAS MSR[HV] handling for CellMichael Ellerman2022-08-261-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The semi-recent changes to MSR handling when entering RTAS (firmware) cause crashes on IBM Cell machines. An example trace: kernel tried to execute user page (2fff01a8) - exploit attempt? (uid: 0) BUG: Unable to handle kernel instruction fetch Faulting instruction address: 0x2fff01a8 Oops: Kernel access of bad area, sig: 11 [#1] BE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=4 NUMA Cell Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 6.0.0-rc2-00433-gede0a8d3307a #207 NIP: 000000002fff01a8 LR: 0000000000032608 CTR: 0000000000000000 REGS: c0000000015236b0 TRAP: 0400 Tainted: G W (6.0.0-rc2-00433-gede0a8d3307a) MSR: 0000000008001002 <ME,RI> CR: 00000000 XER: 20000000 ... NIP 0x2fff01a8 LR 0x32608 Call Trace: 0xc00000000143c5f8 (unreliable) .rtas_call+0x224/0x320 .rtas_get_boot_time+0x70/0x150 .read_persistent_clock64+0x114/0x140 .read_persistent_wall_and_boot_offset+0x24/0x80 .timekeeping_init+0x40/0x29c .start_kernel+0x674/0x8f0 start_here_common+0x1c/0x50 Unlike PAPR platforms where RTAS is only used in guests, on the IBM Cell machines Linux runs with MSR[HV] set but also uses RTAS, provided by SLOF. Fix it by copying the MSR[HV] bit from the MSR value we've just read using mfmsr into the value used for RTAS. It seems like we could also fix it using an #ifdef CELL to set MSR[HV], but that doesn't work because it's possible to build a single kernel image that runs on both Cell native and pseries. Fixes: b6b1c3ce06ca ("powerpc/rtas: Keep MSR[RI] set when calling RTAS") Cc: stable@vger.kernel.org # v5.19+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Jordan Niethe <jniethe5@gmail.com> Link: https://lore.kernel.org/r/20220823115952.1203106-2-mpe@ellerman.id.au
* | Revert "powerpc: Remove unused FW_FEATURE_NATIVE references"Michael Ellerman2022-08-261-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 79b74a68486765a4fe685ac4069bc71366c538f5. It broke booting on IBM Cell machines when the kernel is also built with CONFIG_PPC_PS3=y. That's because FW_FEATURE_NATIVE_ALWAYS = 0 does have an important effect, which is to clear the PS3 ALWAYS features from FW_FEATURE_ALWAYS. Note that CONFIG_PPC_NATIVE has since been renamed CONFIG_PPC_HASH_MMU_NATIVE. Fixes: 79b74a684867 ("powerpc: Remove unused FW_FEATURE_NATIVE references") Cc: stable@vger.kernel.org # v5.17+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220823115952.1203106-1-mpe@ellerman.id.au
* | powerpc: align syscall table for ppc32Masahiro Yamada2022-08-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Christophe Leroy reported that commit 7b4537199a4a ("kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS") broke mpc85xx_defconfig + CONFIG_RELOCATABLE=y. LD vmlinux SYSMAP System.map SORTTAB vmlinux CHKREL vmlinux WARNING: 451 bad relocations c0b312a9 R_PPC_UADDR32 .head.text-0x3ff9ed54 c0b312ad R_PPC_UADDR32 .head.text-0x3ffac224 c0b312b1 R_PPC_UADDR32 .head.text-0x3ffb09f4 c0b312b5 R_PPC_UADDR32 .head.text-0x3fe184dc c0b312b9 R_PPC_UADDR32 .head.text-0x3fe183a8 ... The compiler emits a bunch of R_PPC_UADDR32, which is not supported by arch/powerpc/kernel/reloc_32.S. The reason is there exists an unaligned symbol. $ powerpc-linux-gnu-nm -n vmlinux ... c0b31258 d spe_aligninfo c0b31298 d __func__.0 c0b312a9 D sys_call_table c0b319b8 d __func__.0 Commit 7b4537199a4a is not the root cause. Even before that, I can reproduce the same issue for mpc85xx_defconfig + CONFIG_RELOCATABLE=y + CONFIG_MODVERSIONS=n. It is just that nobody noticed because when CONFIG_MODVERSIONS is enabled, a __crc_* symbol inserted before sys_call_table was hiding the unalignment issue. Adding alignment to the syscall table for ppc32 fixes the issue. Cc: stable@vger.kernel.org Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Trim change log discussion, add Cc stable] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/lkml/38605f6a-a568-f884-f06f-ea4da5b214f0@csgroup.eu/ Link: https://lore.kernel.org/r/20220820165129.1147589-1-masahiroy@kernel.org
* | powerpc/pci: Enable PCI domains in /proc when PCI bus numbers are not uniquePali Rohár2022-08-251-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On 32-bit powerpc systems with more PCIe controllers and more PCI domains, where on more PCI domains are same PCI numbers, when kernel is compiled with CONFIG_PROC_FS=y and CONFIG_PPC_PCI_BUS_NUM_DOMAIN_DEPENDENT=y options, kernel prints "proc_dir_entry 'pci/01' already registered" error message. proc_dir_entry 'pci/01' already registered WARNING: CPU: 0 PID: 1 at fs/proc/generic.c:377 proc_register+0x1a8/0x1ac ... NIP proc_register+0x1a8/0x1ac LR proc_register+0x1a8/0x1ac Call Trace: proc_register+0x1a8/0x1ac (unreliable) _proc_mkdir+0x78/0xa4 pci_proc_attach_device+0x11c/0x168 pci_proc_init+0x80/0x98 do_one_initcall+0x80/0x284 kernel_init_freeable+0x1f4/0x2a0 kernel_init+0x24/0x150 ret_from_kernel_thread+0x5c/0x64 This regression started appearing after commit 566356813082 ("powerpc/pci: Add config option for using all 256 PCI buses") in case in each mPCIe slot is connected PCIe card and therefore PCI bus 1 is populated in for every PCIe controller / PCI domain. The reason is that PCI procfs code expects that when PCI bus numbers are not unique across all PCI domains, function pci_proc_domain() returns true for domain dependent buses. Fix this issue by setting PCI_ENABLE_PROC_DOMAINS and PCI_COMPAT_DOMAIN_0 flags for 32-bit powerpc code when CONFIG_PPC_PCI_BUS_NUM_DOMAIN_DEPENDENT is enabled. Same approach is already implemented for 64-bit powerpc code (where PCI bus numbers are always domain dependent). Fixes: 566356813082 ("powerpc/pci: Add config option for using all 256 PCI buses") Signed-off-by: Pali Rohár <pali@kernel.org> [mpe: Trim change log oops message] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220820115113.30581-1-pali@kernel.org
* | powerpc/papr_scm: Fix nvdimm event mappingsKajol Jain2022-08-231-61/+27
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 4c08d4bbc089 ("powerpc/papr_scm: Add perf interface support") added performance monitoring support for papr-scm nvdimm devices via perf interface. Commit also added an array in papr_scm_priv structure called "nvdimm_events_map", which got filled based on the result of H_SCM_PERFORMANCE_STATS hcall. Currently there is an assumption that the order of events in the stats buffer, returned by the hypervisor is same. And order also happens to matches with the events specified in nvdimm driver code. But this assumption is not documented in Power Architecture Platform Requirements (PAPR) document. Although the order of events happens to be same on current generation od system, but it might not be true in future generation systems. Fix the issue, by adding a static mapping for nvdimm events to corresponding stat-id, and removing the dynamic map from papr_scm_priv structure. Also remove the function papr_scm_pmu_check_events from papr_scm.c file, as we no longer need to copy stat-ids dynamically. Fixes: 4c08d4bbc089 ("powerpc/papr_scm: Add perf interface support") Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220804074852.55157-1-kjain@linux.ibm.com
* Merge tag 'powerpc-6.0-3' of ↵Linus Torvalds2022-08-201-6/+10
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix atomic sleep warnings at boot due to get_phb_number() taking a mutex with a spinlock held on some machines. - Add missing PMU selftests to .gitignores. Thanks to Guenter Roeck and Russell Currey. * tag 'powerpc-6.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: selftests/powerpc: Add missing PMU selftests to .gitignores powerpc/pci: Fix get_phb_number() locking
| * powerpc/pci: Fix get_phb_number() lockingMichael Ellerman2022-08-151-6/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The recent change to get_phb_number() causes a DEBUG_ATOMIC_SLEEP warning on some systems: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper preempt_count: 1, expected: 0 RCU nest depth: 0, expected: 0 1 lock held by swapper/1: #0: c157efb0 (hose_spinlock){+.+.}-{2:2}, at: pcibios_alloc_controller+0x64/0x220 Preemption disabled at: [<00000000>] 0x0 CPU: 0 PID: 1 Comm: swapper Not tainted 5.19.0-yocto-standard+ #1 Call Trace: [d101dc90] [c073b264] dump_stack_lvl+0x50/0x8c (unreliable) [d101dcb0] [c0093b70] __might_resched+0x258/0x2a8 [d101dcd0] [c0d3e634] __mutex_lock+0x6c/0x6ec [d101dd50] [c0a84174] of_alias_get_id+0x50/0xf4 [d101dd80] [c002ec78] pcibios_alloc_controller+0x1b8/0x220 [d101ddd0] [c140c9dc] pmac_pci_init+0x198/0x784 [d101de50] [c140852c] discover_phbs+0x30/0x4c [d101de60] [c0007fd4] do_one_initcall+0x94/0x344 [d101ded0] [c1403b40] kernel_init_freeable+0x1a8/0x22c [d101df10] [c00086e0] kernel_init+0x34/0x160 [d101df30] [c001b334] ret_from_kernel_thread+0x5c/0x64 This is because pcibios_alloc_controller() holds hose_spinlock but of_alias_get_id() takes of_mutex which can sleep. The hose_spinlock protects the phb_bitmap, and also the hose_list, but it doesn't need to be held while get_phb_number() calls the OF routines, because those are only looking up information in the device tree. So fix it by having get_phb_number() take the hose_spinlock itself, only where required, and then dropping the lock before returning. pcibios_alloc_controller() then needs to take the lock again before the list_add() but that's safe, the order of the list is not important. Fixes: 0fe1e96fef0a ("powerpc/pci: Prefer PCI domain assignment via DT 'linux,pci-domain' and alias") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220815065550.1303620-1-mpe@ellerman.id.au
* | KVM: Rename mmu_notifier_* to mmu_invalidate_*Chao Peng2022-08-197-15/+15
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | The motivation of this renaming is to make these variables and related helper functions less mmu_notifier bound and can also be used for non mmu_notifier based page invalidation. mmu_invalidate_* was chosen to better describe the purpose of 'invalidating' a page that those variables are used for. - mmu_notifier_seq/range_start/range_end are renamed to mmu_invalidate_seq/range_start/range_end. - mmu_notifier_retry{_hva} helper functions are renamed to mmu_invalidate_retry{_hva}. - mmu_notifier_count is renamed to mmu_invalidate_in_progress to avoid confusion with mn_active_invalidate_count. - While here, also update kvm_inc/dec_notifier_count() to kvm_mmu_invalidate_begin/end() to match the change for mmu_notifier_count. No functional change intended. Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Message-Id: <20220816125322.1110439-3-chao.p.peng@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Merge tag 'powerpc-6.0-2' of ↵Linus Torvalds2022-08-146-30/+25
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Ensure we never emit lwarx with EH=1 on 32-bit, because some 32-bit CPUs trap on it rather than ignoring it as they should. - Fix ftrace when building with clang, which was broken by some refactoring. - A couple of other minor fixes. Thanks to Christophe Leroy, Naveen N. Rao, Nick Desaulniers, Ondrej Mosnacek, Pali Rohár, Russell Currey, and Segher Boessenkool. * tag 'powerpc-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/kexec: Fix build failure from uninitialised variable powerpc/ppc-opcode: Fix PPC_RAW_TW() powerpc64/ftrace: Fix ftrace for clang builds powerpc: Make eh value more explicit when using lwarx powerpc: Don't hide eh field of lwarx behind a macro powerpc: Fix eh field when calling lwarx on PPC32
| * powerpc/kexec: Fix build failure from uninitialised variableRussell Currey2022-08-101-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | clang 14 won't build because ret is uninitialised and can be returned if both prop and fdtprop are NULL. Drop the ret variable and return an error in that failure case. Fixes: b1fc44eaa9ba ("pseries/iommu/ddw: Fix kdump to work in absence of ibm,dma-window") Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220810054331.373761-1-ruscur@russell.cc
| * powerpc/ppc-opcode: Fix PPC_RAW_TW()Christophe Leroy2022-08-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PPC_RAW_TW() is erroneously defined with base code 0x7f000008 instead of 0x7c000008. That's invisible because its only user is PPC_RAW_TRAP() which is 0x7fe00008, but fix it anyway to avoid any risk of future bug. Fixes: d00d762daf12 ("powerpc/ppc-opcode: Define and use PPC_RAW_TRAP() and PPC_RAW_TW()") Reported-by: Naveen N. Rao <naveen.n.rao@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/eca9251f1e1f82c4c46ec6380ddb28356ab3fdfe.1659527244.git.christophe.leroy@csgroup.eu
| * powerpc64/ftrace: Fix ftrace for clang buildsNaveen N. Rao2022-08-101-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clang doesn't support -mprofile-kernel ABI, so guard the checks against CONFIG_DYNAMIC_FTRACE_WITH_REGS, rather than the elf ABI version. Fixes: 23b44fc248f4 ("powerpc/ftrace: Make __ftrace_make_{nop/call}() common to PPC32 and PPC64") Cc: stable@vger.kernel.org # v5.19+ Reported-by: Nick Desaulniers <ndesaulniers@google.com> Reported-by: Ondrej Mosnacek <omosnacek@gmail.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Tested-by: Ondrej Mosnacek <omosnacek@gmail.com> Acked-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://github.com/llvm/llvm-project/issues/57031 Link: https://github.com/ClangBuiltLinux/linux/issues/1682 Link: https://lore.kernel.org/r/20220809095907.418764-1-naveen.n.rao@linux.vnet.ibm.com
| * powerpc: Make eh value more explicit when using lwarxChristophe Leroy2022-08-101-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Just like the first patch of this series, define a local 'eh' in order to make the code clearer. And IS_ENABLED() returns either 1 or 0 so no need to do IS_ENABLED(CONFIG_PPC64) ? 1 : 0. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Use symbolic names, use 'n' constraint per Segher] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/629befaa2d05e2922346e58a383886510d6af55a.1659430931.git.christophe.leroy@csgroup.eu
| * powerpc: Don't hide eh field of lwarx behind a macroChristophe Leroy2022-08-102-12/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The eh field must remain 0 for PPC32 and is only used by PPC64. Don't hide that behind a macro, just leave the responsibility to the user. At the time being, the only users of PPC_RAW_L{WDQ}ARX are setting the eh field to 0, so the special handling of __PPC_EH is useless. Just take the value given by the caller. Same for DEFINE_TESTOP(), don't do special handling in that macro, ensure the caller hands over the proper eh value. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Use 'n' constraint per Segher] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8b9c8a1a14f9143552a85fcbf96698224a8c2469.1659430931.git.christophe.leroy@csgroup.eu
| * powerpc: Fix eh field when calling lwarx on PPC32Christophe Leroy2022-08-101-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 9401f4e46cf6 ("powerpc: Use lwarx/ldarx directly instead of PPC_LWARX/LDARX macros") properly handled the eh field of lwarx in asm/bitops.h but failed to clear it for PPC32 in asm/simple_spinlock.h So, do as in arch_atomic_try_cmpxchg_lock(), set it to 1 if PPC64 but set it to 0 if PPC32. For that use IS_ENABLED(CONFIG_PPC64) which returns 1 when CONFIG_PPC64 is set and 0 otherwise. Fixes: 9401f4e46cf6 ("powerpc: Use lwarx/ldarx directly instead of PPC_LWARX/LDARX macros") Cc: stable@vger.kernel.org # v5.15+ Reported-by: Pali Rohár <pali@kernel.org> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Pali Rohár <pali@kernel.org> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> [mpe: Use symbolic names, use 'n' constraint per Segher] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a1176e19e627dd6a1b8d24c6c457a8ab874b7d12.1659430931.git.christophe.leroy@csgroup.eu
* | Merge tag 'cxl-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxlLinus Torvalds2022-08-101-0/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull cxl updates from Dan Williams: "Compute Express Link (CXL) updates for 6.0: - Introduce a 'struct cxl_region' object with support for provisioning and assembling persistent memory regions. - Introduce alloc_free_mem_region() to accompany the existing request_free_mem_region() as a method to allocate physical memory capacity out of an existing resource. - Export insert_resource_expand_to_fit() for the CXL subsystem to late-publish CXL platform windows in iomem_resource. - Add a polled mode PCI DOE (Data Object Exchange) driver service and use it in cxl_pci to retrieve the CDAT (Coherent Device Attribute Table)" * tag 'cxl-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (74 commits) cxl/hdm: Fix skip allocations vs multiple pmem allocations cxl/region: Disallow region granularity != window granularity cxl/region: Fix x1 interleave to greater than x1 interleave routing cxl/region: Move HPA setup to cxl_region_attach() cxl/region: Fix decoder interleave programming Documentation: cxl: remove dangling kernel-doc reference cxl/region: describe targets and nr_targets members of cxl_region_params cxl/regions: add padding for cxl_rr_ep_add nested lists cxl/region: Fix IS_ERR() vs NULL check cxl/region: Fix region reference target accounting cxl/region: Fix region commit uninitialized variable warning cxl/region: Fix port setup uninitialized variable warnings cxl/region: Stop initializing interleave granularity cxl/hdm: Fix DPA reservation vs cxl_endpoint_decoder lifetime cxl/acpi: Minimize granularity for x1 interleaves cxl/region: Delete 'region' attribute from root decoders cxl/acpi: Autoload driver for 'cxl_acpi' test devices cxl/region: decrement ->nr_targets on error in cxl_region_attach() cxl/region: prevent underflow in ways_to_cxl() cxl/region: uninitialized variable in alloc_hpa() ...
| * | powerpc/mm: Export memory_add_physaddr_to_nid() for modulesMichael Ellerman2022-07-291-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The cxl_pmem module wants to call memory_add_physaddr_to_nid(), so export the symbol. Link: http://lore.kernel.org/r/87sfmkbfyg.fsf@mpe.ellerman.id.au Fixes: 04ad63f086d1 ("cxl/region: Introduce cxl_pmem_region objects") Reported-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
* | | Merge tag 'bitmap-6.0-rc1' of https://github.com/norov/linuxLinus Torvalds2022-08-072-8/+9
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull bitmap updates from Yury Norov: - fix the duplicated comments on bitmap_to_arr64() (Qu Wenruo) - optimize out non-atomic bitops on compile-time constants (Alexander Lobakin) - cleanup bitmap-related headers (Yury Norov) - x86/olpc: fix 'logical not is only applied to the left hand side' (Alexander Lobakin) - lib/nodemask: inline wrappers around bitmap (Yury Norov) * tag 'bitmap-6.0-rc1' of https://github.com/norov/linux: (26 commits) lib/nodemask: inline next_node_in() and node_random() powerpc: drop dependency on <asm/machdep.h> in archrandom.h x86/olpc: fix 'logical not is only applied to the left hand side' lib/cpumask: move some one-line wrappers to header file headers/deps: mm: align MANITAINERS and Docs with new gfp.h structure headers/deps: mm: Split <linux/gfp_types.h> out of <linux/gfp.h> headers/deps: mm: Optimize <linux/gfp.h> header dependencies lib/cpumask: move trivial wrappers around find_bit to the header lib/cpumask: change return types to unsigned where appropriate cpumask: change return types to bool where appropriate lib/bitmap: change type of bitmap_weight to unsigned long lib/bitmap: change return types to bool where appropriate arm: align find_bit declarations with generic kernel iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE) lib/test_bitmap: test the tail after bitmap_to_arr64() lib/bitmap: fix off-by-one in bitmap_to_arr64() lib: test_bitmap: add compile-time optimization/evaluations assertions bitmap: don't assume compiler evaluates small mem*() builtins calls net/ice: fix initializing the bitmap in the switch code bitops: let optimize out non-atomic bitops on compile-time constants ...
| * | | powerpc: drop dependency on <asm/machdep.h> in archrandom.hYury Norov2022-08-012-8/+13
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | archrandom.h includes <asm/machdep.h> to refer ppc_md. This causes circular header dependency, if generic nodemask.h includes random.h: In file included from include/linux/cred.h:16, from include/linux/seq_file.h:13, from arch/powerpc/include/asm/machdep.h:6, from arch/powerpc/include/asm/archrandom.h:5, from include/linux/random.h:109, from include/linux/nodemask.h:97, from include/linux/list_lru.h:12, from include/linux/fs.h:13, from include/linux/compat.h:17, from arch/powerpc/kernel/asm-offsets.c:12: include/linux/sched.h:1203:9: error: unknown type name 'nodemask_t' 1203 | nodemask_t mems_allowed; | ^~~~~~~~~~ Fix it by removing <asm/machdep.h> dependency from archrandom.h Now as arch_get_random_seed_long() moved to c-file, and not exported, it's not available for modules. As Michael Ellerman says: I think we actually don't need it exported to modules, I think it's a private detail of the RNG <-> architecture interface, not something that modules should be calling. CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Michael Ellerman <mpe@ellerman.id.au> CC: Paul Mackerras <paulus@samba.org> CC: Rasmus Villemoes <linux@rasmusvillemoes.dk> CC: Stephen Rothwell <sfr@canb.auug.org.au> CC: linuxppc-dev@lists.ozlabs.org Suggested-by: Michael Ellerman <mpe@ellerman.id.au> (for non-exporting) Acked-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Yury Norov <yury.norov@gmail.com>
* | | Merge tag 'mm-nonmm-stable-2022-08-06-2' of ↵Linus Torvalds2022-08-071-7/+0
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc updates from Andrew Morton: "Updates to various subsystems which I help look after. lib, ocfs2, fatfs, autofs, squashfs, procfs, etc. A relatively small amount of material this time" * tag 'mm-nonmm-stable-2022-08-06-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (72 commits) scripts/gdb: ensure the absolute path is generated on initial source MAINTAINERS: kunit: add David Gow as a maintainer of KUnit mailmap: add linux.dev alias for Brendan Higgins mailmap: update Kirill's email profile: setup_profiling_timer() is moslty not implemented ocfs2: fix a typo in a comment ocfs2: use the bitmap API to simplify code ocfs2: remove some useless functions lib/mpi: fix typo 'the the' in comment proc: add some (hopefully) insightful comments bdi: remove enum wb_congested_state kernel/hung_task: fix address space of proc_dohung_task_timeout_secs lib/lzo/lzo1x_compress.c: replace ternary operator with min() and min_t() squashfs: support reading fragments in readahead call squashfs: implement readahead squashfs: always build "file direct" version of page actor Revert "squashfs: provide backing_dev_info in order to disable read-ahead" fs/ocfs2: Fix spelling typo in comment ia64: old_rr4 added under CONFIG_HUGETLB_PAGE proc: fix test for "vsyscall=xonly" boot option ...
| * | profile: setup_profiling_timer() is moslty not implementedBen Dooks2022-07-291-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The setup_profiling_timer() is mostly un-implemented by many architectures. In many places it isn't guarded by CONFIG_PROFILE which is needed for it to be used. Make it a weak symbol in kernel/profile.c and remove the 'return -EINVAL' implementations from the kenrel. There are a couple of architectures which do return 0 from the setup_profiling_timer() function but they don't seem to do anything else with it. To keep the /proc compatibility for now, leave these for a future update or removal. On ARM, this fixes the following sparse warning: arch/arm/kernel/smp.c:793:5: warning: symbol 'setup_profiling_timer' was not declared. Should it be static? Link: https://lkml.kernel.org/r/20220721195509.418205-1-ben-linux@fluff.org Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | | Merge tag 'powerpc-6.0-1' of ↵Linus Torvalds2022-08-06194-1863/+3363
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Add support for syscall stack randomization - Add support for atomic operations to the 32 & 64-bit BPF JIT - Full support for KASAN on 64-bit Book3E - Add a watchdog driver for the new PowerVM hypervisor watchdog - Add a number of new selftests for the Power10 PMU support - Add a driver for the PowerVM Platform KeyStore - Increase the NMI watchdog timeout during live partition migration, to avoid timeouts due to increased memory access latency - Add support for using the 'linux,pci-domain' device tree property for PCI domain assignment - Many other small features and fixes Thanks to Alexey Kardashevskiy, Andy Shevchenko, Arnd Bergmann, Athira Rajeev, Bagas Sanjaya, Christophe Leroy, Erhard Furtner, Fabiano Rosas, Greg Kroah-Hartman, Greg Kurz, Haowen Bai, Hari Bathini, Jason A. Donenfeld, Jason Wang, Jiang Jian, Joel Stanley, Juerg Haefliger, Kajol Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Masahiro Yamada, Maxime Bizon, Miaoqian Lin, Murilo Opsfelder Araújo, Nathan Lynch, Naveen N. Rao, Nayna Jain, Nicholas Piggin, Ning Qiang, Pali Rohár, Petr Mladek, Rashmica Gupta, Sachin Sant, Scott Cheloha, Segher Boessenkool, Stephen Rothwell, Uwe Kleine-König, Wolfram Sang, Xiu Jianfeng, and Zhouyi Zhou. * tag 'powerpc-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (191 commits) powerpc/64e: Fix kexec build error EDAC/ppc_4xx: Include required of_irq header directly powerpc/pci: Fix PHB numbering when using opal-phbid powerpc/64: Init jump labels before parse_early_param() selftests/powerpc: Avoid GCC 12 uninitialised variable warning powerpc/cell/axon_msi: Fix refcount leak in setup_msi_msg_address powerpc/xive: Fix refcount leak in xive_get_max_prio powerpc/spufs: Fix refcount leak in spufs_init_isolated_loader powerpc/perf: Include caps feature for power10 DD1 version powerpc: add support for syscall stack randomization powerpc: Move system_call_exception() to syscall.c powerpc/powernv: rename remaining rng powernv_ functions to pnv_ powerpc/powernv/kvm: Use darn for H_RANDOM on Power9 powerpc/powernv: Avoid crashing if rng is NULL selftests/powerpc: Fix matrix multiply assist test powerpc/signal: Update comment for clarity powerpc: make facility_unavailable_exception 64s powerpc/platforms/83xx/suspend: Remove write-only global variable powerpc/platforms/83xx/suspend: Prevent unloading the driver powerpc/platforms/83xx/suspend: Reorder to get rid of a forward declaration ...