summaryrefslogtreecommitdiffstats
path: root/arch
Commit message (Collapse)AuthorAgeFilesLines
...
* | powerpc/uaccess: Use flexible addressing with __put_user()/__get_user()Christophe Leroy2020-09-021-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the time being, __put_user()/__get_user() and friends only use D-form addressing, with 0 offset. Ex: lwz reg1, 0(reg2) Give the compiler the opportunity to use other adressing modes whenever possible, to get more optimised code. Hereunder is a small exemple: struct test { u32 item1; u16 item2; u8 item3; u64 item4; }; int set_test_user(struct test __user *from, struct test __user *to) { int err; u32 item1; u16 item2; u8 item3; u64 item4; err = __get_user(item1, &from->item1); err |= __get_user(item2, &from->item2); err |= __get_user(item3, &from->item3); err |= __get_user(item4, &from->item4); err |= __put_user(item1, &to->item1); err |= __put_user(item2, &to->item2); err |= __put_user(item3, &to->item3); err |= __put_user(item4, &to->item4); return err; } Before the patch: 00000df0 <set_test_user>: df0: 94 21 ff f0 stwu r1,-16(r1) df4: 39 40 00 00 li r10,0 df8: 93 c1 00 08 stw r30,8(r1) dfc: 93 e1 00 0c stw r31,12(r1) e00: 7d 49 53 78 mr r9,r10 e04: 80 a3 00 00 lwz r5,0(r3) e08: 38 e3 00 04 addi r7,r3,4 e0c: 7d 46 53 78 mr r6,r10 e10: a0 e7 00 00 lhz r7,0(r7) e14: 7d 29 33 78 or r9,r9,r6 e18: 39 03 00 06 addi r8,r3,6 e1c: 7d 46 53 78 mr r6,r10 e20: 89 08 00 00 lbz r8,0(r8) e24: 7d 29 33 78 or r9,r9,r6 e28: 38 63 00 08 addi r3,r3,8 e2c: 7d 46 53 78 mr r6,r10 e30: 83 c3 00 00 lwz r30,0(r3) e34: 83 e3 00 04 lwz r31,4(r3) e38: 7d 29 33 78 or r9,r9,r6 e3c: 7d 43 53 78 mr r3,r10 e40: 90 a4 00 00 stw r5,0(r4) e44: 7d 29 1b 78 or r9,r9,r3 e48: 38 c4 00 04 addi r6,r4,4 e4c: 7d 43 53 78 mr r3,r10 e50: b0 e6 00 00 sth r7,0(r6) e54: 7d 29 1b 78 or r9,r9,r3 e58: 38 e4 00 06 addi r7,r4,6 e5c: 7d 43 53 78 mr r3,r10 e60: 99 07 00 00 stb r8,0(r7) e64: 7d 23 1b 78 or r3,r9,r3 e68: 38 84 00 08 addi r4,r4,8 e6c: 93 c4 00 00 stw r30,0(r4) e70: 93 e4 00 04 stw r31,4(r4) e74: 7c 63 53 78 or r3,r3,r10 e78: 83 c1 00 08 lwz r30,8(r1) e7c: 83 e1 00 0c lwz r31,12(r1) e80: 38 21 00 10 addi r1,r1,16 e84: 4e 80 00 20 blr After the patch: 00000dbc <set_test_user>: dbc: 39 40 00 00 li r10,0 dc0: 7d 49 53 78 mr r9,r10 dc4: 80 03 00 00 lwz r0,0(r3) dc8: 7d 48 53 78 mr r8,r10 dcc: a1 63 00 04 lhz r11,4(r3) dd0: 7d 29 43 78 or r9,r9,r8 dd4: 7d 48 53 78 mr r8,r10 dd8: 88 a3 00 06 lbz r5,6(r3) ddc: 7d 29 43 78 or r9,r9,r8 de0: 7d 48 53 78 mr r8,r10 de4: 80 c3 00 08 lwz r6,8(r3) de8: 80 e3 00 0c lwz r7,12(r3) dec: 7d 29 43 78 or r9,r9,r8 df0: 7d 43 53 78 mr r3,r10 df4: 90 04 00 00 stw r0,0(r4) df8: 7d 29 1b 78 or r9,r9,r3 dfc: 7d 43 53 78 mr r3,r10 e00: b1 64 00 04 sth r11,4(r4) e04: 7d 29 1b 78 or r9,r9,r3 e08: 7d 43 53 78 mr r3,r10 e0c: 98 a4 00 06 stb r5,6(r4) e10: 7d 23 1b 78 or r3,r9,r3 e14: 90 c4 00 08 stw r6,8(r4) e18: 90 e4 00 0c stw r7,12(r4) e1c: 7c 63 53 78 or r3,r3,r10 e20: 4e 80 00 20 blr Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c27bc4e598daf3bbb225de7a1f5c52121cf1e279.1597235091.git.christophe.leroy@csgroup.eu
* | powerpc: Remove flush_instruction_cache() on 8xxChristophe Leroy2020-09-021-7/+0
| | | | | | | | | | | | | | | | flush_instruction_cache() is never used on 8xx, remove it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/245cabd8f291facac8c8c5fd370e361a69e02860.1597384145.git.christophe.leroy@csgroup.eu
* | powerpc: unrel_branch_check.sh: enable the use of llvm-objdump v9, 10 or 11Stephen Rothwell2020-09-021-5/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, using llvm-objtool, this script just silently succeeds without actually do the intended checking. So this updates it to work properly. Firstly, llvm-objdump does not add target symbol names to the end of branches in its asm output, so we have to drop the branch to __start_initialization_multiplatform using its address. Secondly, v9 and 10 specify branch targets as .+<offset>, so we convert those to actual addresses. Thirdly, v10 and 11 error out on a vmlinux if given the -R option complaining that it is "not a dynamic object". The -R does not make any difference to the asm output, so remove it. Lastly, v11 produces asm that is very similar to Gnu objtool (at least as far as branches are concerned), so no further changes are necessary to make it work. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200812081036.7969-3-sfr@canb.auug.org.au
* | powerpc: unrel_branch_check.sh: use nm to find symbol valueStephen Rothwell2020-09-022-9/+6
| | | | | | | | | | | | | | | | | | This is considerably faster then parsing the objdump asm output. It will also make the enabling of llvm-objdump a little easier. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200812081036.7969-2-sfr@canb.auug.org.au
* | powerpc: unrel_branch_check.sh: exit silently for early errorsStephen Rothwell2020-09-021-1/+4
| | | | | | | | | | | | | | | | | | If we can't find the address of __end_interrupts, then we still exit successfully as that is the current behaviour. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200811140435.20957-8-sfr@canb.auug.org.au
* | powerpc: unrel_branch_check.sh: fix up the file headerStephen Rothwell2020-09-021-11/+4
| | | | | | | | | | | | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200811140435.20957-7-sfr@canb.auug.org.au
* | powerpc: unrel_branch_check.sh: simplify and tidy up the final loopStephen Rothwell2020-09-021-16/+10
| | | | | | | | | | | | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200811140435.20957-6-sfr@canb.auug.org.au
* | powerpc: unrel_branch_check.sh: convert grep | sed | awk to just sedStephen Rothwell2020-09-021-10/+20
| | | | | | | | | | | | | | | | | | | | Also start using sed -E and make all the separate expressions into a single one with comments. Pull the stripping of condition registers back into the sed command. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200811140435.20957-5-sfr@canb.auug.org.au
* | powerpc: unrel_branch_check.sh: simplify objdump's asm outputStephen Rothwell2020-09-021-6/+6
| | | | | | | | | | | | | | | | | | | | | | We don't use the raw hex instruction dump, so elide it and adjust the following expressions. Also use \s instead of [[:space:]] everywhere. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200811140435.20957-4-sfr@canb.auug.org.au
* | powerpc: unrel_branch_check.sh: simplify and combine some executionsStephen Rothwell2020-09-021-14/+11
| | | | | | | | | | | | | | | | | | | | Also some minor style changes. There should still be no change in behaviour. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200811140435.20957-3-sfr@canb.auug.org.au
* | powerpc: unrel_branch_check.sh: fix shellcheck complaintsStephen Rothwell2020-09-021-6/+7
| | | | | | | | | | | | | | | | No functional change Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200811140435.20957-2-sfr@canb.auug.org.au
* | pseries/drmem: don't cache node id in drmem_lmb structScott Cheloha2020-09-023-34/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At memory hot-remove time we can retrieve an LMB's nid from its corresponding memory_block. There is no need to store the nid in multiple locations. Note that lmb_to_memblock() uses find_memory_block() to get the corresponding memory_block. As find_memory_block() runs in sub-linear time this approach is negligibly slower than what we do at present. In exchange for this lookup at hot-remove time we no longer need to call memory_add_physaddr_to_nid() during drmem_init() for each LMB. On powerpc, memory_add_physaddr_to_nid() is a linear search, so this spares us an O(n^2) initialization during boot. On systems with many LMBs that initialization overhead is palpable and disruptive. For example, on a box with 249854 LMBs we're seeing drmem_init() take upwards of 30 seconds to complete: [ 53.721639] drmem: initializing drmem v2 [ 80.604346] watchdog: BUG: soft lockup - CPU#65 stuck for 23s! [swapper/0:1] [ 80.604377] Modules linked in: [ 80.604389] CPU: 65 PID: 1 Comm: swapper/0 Not tainted 5.6.0-rc2+ #4 [ 80.604397] NIP: c0000000000a4980 LR: c0000000000a4940 CTR: 0000000000000000 [ 80.604407] REGS: c0002dbff8493830 TRAP: 0901 Not tainted (5.6.0-rc2+) [ 80.604412] MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 44000248 XER: 0000000d [ 80.604431] CFAR: c0000000000a4a38 IRQMASK: 0 [ 80.604431] GPR00: c0000000000a4940 c0002dbff8493ac0 c000000001904400 c0003cfffffede30 [ 80.604431] GPR04: 0000000000000000 c000000000f4095a 000000000000002f 0000000010000000 [ 80.604431] GPR08: c0000bf7ecdb7fb8 c0000bf7ecc2d3c8 0000000000000008 c00c0002fdfb2001 [ 80.604431] GPR12: 0000000000000000 c00000001e8ec200 [ 80.604477] NIP [c0000000000a4980] hot_add_scn_to_nid+0xa0/0x3e0 [ 80.604486] LR [c0000000000a4940] hot_add_scn_to_nid+0x60/0x3e0 [ 80.604492] Call Trace: [ 80.604498] [c0002dbff8493ac0] [c0000000000a4940] hot_add_scn_to_nid+0x60/0x3e0 (unreliable) [ 80.604509] [c0002dbff8493b20] [c000000000087c10] memory_add_physaddr_to_nid+0x20/0x60 [ 80.604521] [c0002dbff8493b40] [c0000000010d4880] drmem_init+0x25c/0x2f0 [ 80.604530] [c0002dbff8493c10] [c000000000010154] do_one_initcall+0x64/0x2c0 [ 80.604540] [c0002dbff8493ce0] [c0000000010c4aa0] kernel_init_freeable+0x2d8/0x3a0 [ 80.604550] [c0002dbff8493db0] [c000000000010824] kernel_init+0x2c/0x148 [ 80.604560] [c0002dbff8493e20] [c00000000000b648] ret_from_kernel_thread+0x5c/0x74 [ 80.604567] Instruction dump: [ 80.604574] 392918e8 e9490000 e90a000a e92a0000 80ea000c 1d080018 3908ffe8 7d094214 [ 80.604586] 7fa94040 419d00dc e9490010 714a0088 <2faa0008> 409e00ac e9490000 7fbe5040 [ 89.047390] drmem: 249854 LMB(s) With a patched kernel on the same machine we're no longer seeing the soft lockup. drmem_init() now completes in negligible time, even when the LMB count is large. Fixes: b2d3b5ee66f2 ("powerpc/pseries: Track LMB nid instead of using device tree") Signed-off-by: Scott Cheloha <cheloha@linux.ibm.com> Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200811015115.63677-1-cheloha@linux.ibm.com
* | powerpc: Rewrite FSL_BOOKE flush_cache_instruction() in CChristophe Leroy2020-09-022-22/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | Nothing prevents flush_cache_instruction() from being writen in C. Do it to improve readability and maintainability. This function is only use by low level callers, it is not intended to be used by module. Don't export it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/f989eff8296800c427622c0985384148404e4f0b.1597384512.git.christophe.leroy@csgroup.eu
* | powerpc: Rewrite 4xx flush_cache_instruction() in CChristophe Leroy2020-09-022-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | Nothing prevents flush_cache_instruction() from being writen in C. Do it to improve readability and maintainability. This function is very small and isn't called from assembly, make it static inline in asm/cacheflush.h Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/93d93fc69b4b3ad3ceba2fc0756333c0c0245bb7.1597384512.git.christophe.leroy@csgroup.eu
* | powerpc: Move flush_instruction_cache() prototype in asm/cacheflush.hChristophe Leroy2020-09-023-1/+3
| | | | | | | | | | | | | | | | | | | | | | flush_instruction_cache() belongs to the cache flushing function family. Move its prototype in asm/cacheflush.h Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/993445b5227e8ca2f0e38bcc9ea3dfea6e865920.1597384512.git.christophe.leroy@csgroup.eu
* | powerpc: Remove flush_instruction_cache for book3s/32Christophe Leroy2020-09-021-10/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The only callers of flush_instruction_cache() are: arch/powerpc/kernel/swsusp_booke.S: bl flush_instruction_cache arch/powerpc/mm/nohash/40x.c: flush_instruction_cache(); arch/powerpc/mm/nohash/44x.c: flush_instruction_cache(); arch/powerpc/mm/nohash/fsl_booke.c: flush_instruction_cache(); arch/powerpc/platforms/44x/machine_check.c: flush_instruction_cache(); arch/powerpc/platforms/44x/machine_check.c: flush_instruction_cache(); This function is not used by book3s/32, drop it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/50098f49877cea0f46730a9df82dcabf84160e4b.1597384512.git.christophe.leroy@csgroup.eu
* | powerpc/pseries: explicitly reschedule during drmem_lmb list traversalNathan Lynch2020-09-021-1/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The drmem lmb list can have hundreds of thousands of entries, and unfortunately lookups take the form of linear searches. As long as this is the case, traversals have the potential to monopolize the CPU and provoke lockup reports, workqueue stalls, and the like unless they explicitly yield. Rather than placing cond_resched() calls within various for_each_drmem_lmb() loop blocks in the code, put it in the iteration expression of the loop macro itself so users can't omit it. Introduce a drmem_lmb_next() iteration helper function which calls cond_resched() at a regular interval during array traversal. Each iteration of the loop in DLPAR code paths can involve around ten RTAS calls which can each take up to 250us, so this ensures the check is performed at worst every few milliseconds. Fixes: 6c6ea53725b3 ("powerpc/mm: Separate ibm, dynamic-memory data from DT format") Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200813151131.2070161-1-nathanl@linux.ibm.com
* | powerpc: Drop _nmask_and_or_msr()Christophe Leroy2020-09-024-16/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | _nmask_and_or_msr() is only used at two places to set MSR_IP. The SYNC is unnecessary as the users are not PowerPC 601. Can be easily writen in C. Do it, and drop _nmask_and_or_msr() Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c2d2b8dfb8dd677026b26dffc8d31070c38a6b89.1597388079.git.christophe.leroy@csgroup.eu
* | powerpc: Use simple i2c probe functionStephen Kitt2020-09-022-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The i2c probe functions here don't use the id information provided in their second argument, so the single-parameter i2c probe function ("probe_new") can be used instead. This avoids scanning the identifier tables during probes. Signed-off-by: Stephen Kitt <steve@sk2.org> Acked-by: Wolfram Sang <wsa@kernel.org> Reviewed-by: Luca Ceresoli <luca@lucaceresoli.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200807152713.381588-1-steve@sk2.org
* | powerpc/pseries: new lparcfg key/value pair: partition_affinity_scoreScott Cheloha2020-09-021-0/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The H_GetPerformanceCounterInfo (GPCI) PHYP hypercall has a subcall, Affinity_Domain_Info_By_Partition, which returns, among other things, a "partition affinity score" for a given LPAR. This score, a value on [0-100], represents the processor-memory affinity for the LPAR in question. A score of 0 indicates the worst possible affinity while a score of 100 indicates perfect affinity. The score can be used to reason about performance. This patch adds the score for the local LPAR to the lparcfg procfile under a new 'partition_affinity_score' key. Signed-off-by: Scott Cheloha <cheloha@linux.ibm.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Acked-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200727184605.2945095-2-cheloha@linux.ibm.com
* | powerpc/perf: consolidate GPCI hcall structs into asm/hvcall.hScott Cheloha2020-09-023-36/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | The H_GetPerformanceCounterInfo (GPCI) hypercall input/output structs are useful to modules outside of perf/, so move them into asm/hvcall.h to live alongside the other powerpc hypercall structs. Leave the perf-specific GPCI stuff in perf/hv-gpci.h. Signed-off-by: Scott Cheloha <cheloha@linux.ibm.com> Acked-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200727184605.2945095-1-cheloha@linux.ibm.com
* | powerpc: drop hard_reset_now() and poweroff_now() declarationChristophe Leroy2020-09-021-2/+0
| | | | | | | | | | | | | | | | Those function have never existed. Drop their declaration. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/edcdd72a36495d25213c0256c8022367458e0d19.1596716418.git.christophe.leroy@csgroup.eu
* | powerpc/fpu: Drop cvt_fd() and cvt_df()Christophe Leroy2020-09-022-17/+0
| | | | | | | | | | | | | | | | | | | | Those two functions have been unused since commit identified below. Drop them. Fixes: 31bfdb036f12 ("powerpc: Use instruction emulation infrastructure to handle alignment faults") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d5641ada199b8dd2af16ad00a66084cf974f2704.1596716418.git.christophe.leroy@csgroup.eu
* | powerpc/irq: Drop forward declaration of struct irqactionChristophe Leroy2020-09-021-1/+0
| | | | | | | | | | | | | | | | | | | | Since the commit identified below, the forward declaration of struct irqaction is useless. Drop it. Fixes: b709c0832824 ("ppc64: move stack switching up in interrupt processing") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e0bcdabac45fcd26c02d7df273bd4a5827c6033d.1596716375.git.christophe.leroy@csgroup.eu
* | powerpc/hwirq: Remove stale forward irq_chip declarationChristophe Leroy2020-09-021-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | Since commit identified below, the forward declaration of struct irq_chip is useless (was struct hw_interrupt_type at that time) Remove it, together with the associated comment. Fixes: c0ad90a32fb6 ("[PATCH] genirq: add ->retrigger() irq op to consolidate hw_irq_resend()") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/fbe58d27cf128d5fe581e4510ded8701858f268e.1596716328.git.christophe.leroy@csgroup.eu
* | powerpc/32s: Fix assembler warning about r0Christophe Leroy2020-09-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The assembler says: arch/powerpc/kernel/head_32.S:1095: Warning: invalid register expression It's objecting to the use of r0 as the RA argument. That's because when RA = 0 the literal value 0 is used, rather than the content of r0, making the use of r0 in the source potentially confusing. Fix it to use a literal 0, the generated code is identical. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2b69ac8e1cddff6f808fc7415907179eab4aae9e.1596693679.git.christophe.leroy@csgroup.eu
* | powerpc/nx: Don't pack struct coprocessor_request_blockOliver O'Halloran2020-08-251-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Building with W=1 results in the following warning: In file included from arch/powerpc/platforms/powernv/vas-fault.c:16: ./arch/powerpc/include/asm/icswx.h:159:1: error: alignment 1 of ‘struct coprocessor_request_block’ is less than 16 [-Werror=packed-not-aligned] 159 | } __packed; | ^ ./arch/powerpc/include/asm/icswx.h:159:1: error: alignment 1 of ‘struct coprocessor_request_block’ is less than 16 [-Werror=packed-not-aligned] ./arch/powerpc/include/asm/icswx.h:159:1: error: alignment 1 of ‘struct coprocessor_request_block’ is less than 16 [-Werror=packed-not-aligned] ./arch/powerpc/include/asm/icswx.h:159:1: error: alignment 1 of ‘struct coprocessor_request_block’ is less than 16 [-Werror=packed-not-aligned] cc1: all warnings being treated as errors This happens because coprocessor_request_block includes several sub-structures with an alignment specified using the __aligned(XX) attribute. The problem comes from coprocessor_request_block having the __packed attribute. Packing the structure causes the preferred alignment of the nested structures to be ignored and we get the warnings as a result. This isn't a problem in practice since the struct is defined with explicit padding in the form of reserved fields, but we'd like to get rid of the spurious warnings. The simplest solution is to remove the packed attribute and use a BUILD_BUG_ON() to ensure the struct is the correct (expected by HW) size compile time. Also add a __aligned(128) to the request block structure since Book4 for P8 suggests the HW requires it to be aligned to a 128 byte boundary. There's a similar requirement for P9 since the COPY and PASTE instructions used to invoke VAS/NX accelerators operates on a cache line boundary. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200804005410.146094-7-oohall@gmail.com
* | powerpc/powernv: Fix spurious kerneldoc warnings in opal-prd.cOliver O'Halloran2020-08-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Comments opening with /** are parsed by kerneldoc and this causes the following warning to be printed: arch/powerpc/platforms/powernv/opal-prd.c:31: warning: cannot understand function prototype: 'struct opal_prd_msg_queue_item ' opal_prd_mesg_queue_item is an internal data structure so there's no real need for it to be documented at all. Fix up the comment to squash the warning. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200804005410.146094-5-oohall@gmail.com
* | powerpc/powernv: Staticify functions without prototypesOliver O'Halloran2020-08-253-8/+7
| | | | | | | | | | | | | | | | | | There's a few scattered in the powernv platform. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200804005410.146094-4-oohall@gmail.com
* | powerpc/powernv: Include asm/powernv.h from the local powernv.hOliver O'Halloran2020-08-252-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The asm/powernv.h header provides prototypes for functions which need to be called by non-powernv platform code. Also include it in the powernv.h that's local to the platform directory to squash some warnings about non-static functions missing prototypes. Also include powernv.h since from opal-memcons.c since it has the prototypes for the memcons wrangling functions which are used for the opal and ultravisor msglog. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200804005410.146094-3-oohall@gmail.com
* | powerpc/powernv/smp: Fix spurious DBG() warningOliver O'Halloran2020-08-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When building with W=1 we get the following warning: arch/powerpc/platforms/powernv/smp.c: In function ‘pnv_smp_cpu_kill_self’: arch/powerpc/platforms/powernv/smp.c:276:16: error: suggest braces around empty body in an ‘if’ statement [-Werror=empty-body] 276 | cpu, srr1); | ^ cc1: all warnings being treated as errors The full context is this block: if (srr1 && !generic_check_cpu_restart(cpu)) DBG("CPU%d Unexpected exit while offline srr1=%lx!\n", cpu, srr1); When building with DEBUG undefined DBG() expands to nothing and GCC emits the warning due to the lack of braces around an empty statement. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200804005410.146094-2-oohall@gmail.com
* | powerpc/oprofile: fix spelling mistake "contex" -> "context"Colin Ian King2020-08-251-1/+1
| | | | | | | | | | | | | | | | There is a spelling mistake in a pr_debug message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200804174316.402425-1-colin.king@canonical.com
* | powerpc/vmemmap: Don't warn if we don't find a mapping vmemmap list entryAneesh Kumar K.V2020-08-251-3/+1
| | | | | | | | | | | | | | | | | | Now that we are handling vmemmap list allocation failure correctly, don't WARN in section deactivate when we don't find a mapping vmemmap list entry. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200731113500.248306-2-aneesh.kumar@linux.ibm.com
* | powerpc/vmemmap: Fix memory leak with vmemmap list allocation failures.Aneesh Kumar K.V2020-08-251-7/+28
| | | | | | | | | | | | | | | | | | | | If we fail to allocate vmemmap list, we don't keep track of allocated vmemmap block buf. Hence on section deactivate we skip vmemmap block buf free. This results in memory leak. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200731113500.248306-1-aneesh.kumar@linux.ibm.com
* | powerpc/powernv: Remove set but not used variable 'parent'zhengbin2020-08-251-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix gcc '-Wunused-but-set-variable' warning: arch/powerpc/platforms/powernv/pci-ioda.c: In function pnv_ioda_configure_pe: arch/powerpc/platforms/powernv/pci-ioda.c:867:18: warning: variable parent set but not used [-Wunused-but-set-variable] It is not used since commit b131a8425c34 ("powerpc/powernv: Set PELTV for compound PEs") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1574144074-142032-6-git-send-email-zhengbin13@huawei.com
* | powerpc/perf: Remove set but not used variable 'target'zhengbin2020-08-251-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix gcc '-Wunused-but-set-variable' warning: arch/powerpc/perf/imc-pmu.c: In function trace_imc_event_init: arch/powerpc/perf/imc-pmu.c:1292:22: warning: variable target set but not used [-Wunused-but-set-variable] It is introduced by commit 012ae244845f ("powerpc/perf: Trace imc PMU functions"), but never used, so remove it. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1574144074-142032-3-git-send-email-zhengbin13@huawei.com
* | powerpc/fadump: Remove set but not used variable 'elf'zhengbin2020-08-251-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix gcc '-Wunused-but-set-variable' warning: arch/powerpc/kernel/fadump.c: In function fadump_update_elfcore_header: arch/powerpc/kernel/fadump.c:790:17: warning: variable elf set but not used [-Wunused-but-set-variable] It is introduced by commit ebaeb5ae2437 ("fadump: Convert firmware-assisted cpu state dump data into elf notes."), but never used, so remove it. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1574144074-142032-2-git-send-email-zhengbin13@huawei.com
* | powerc/dtc/t1024rdb: remove interrupts propertyBiwen Li2020-08-251-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the interrupt pin for RTC DS1339 is not connected to the CPU on T1024RDB, remove the interrupt property from the device tree. This also fix the following warning for hwclock.util-linux: $ hwclock.util-linux hwclock.util-linux: select() to /dev/rtc0 to wait for clock tick timed out Signed-off-by: Biwen Li <biwen.li@nxp.com> Acked-by: Li Yang <leoyang.li@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200527034228.23793-2-biwen.li@oss.nxp.com
* | powerpc/dts/t4240rdb: remove interrupts propertyBiwen Li2020-08-251-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the interrupt pin for RTC DS1374 is not connected to the CPU on T4240RDB, remove the interrupt property from the device tree. This also fix the following warning for hwclock.util-linux: $ hwclock.util-linux hwclock.util-linux: select() to /dev/rtc0 to wait for clock tick timed out Signed-off-by: Biwen Li <biwen.li@nxp.com> Acked-by: Li Yang <leoyang.li@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200527034228.23793-1-biwen.li@oss.nxp.com
* | ocxl: Remove custom service to allocate interruptsFrederic Barrat2020-08-252-33/+0
| | | | | | | | | | | | | | | | | | | | | | We now allocate interrupts through xive directly. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200403153838.29224-5-fbarrat@linux.ibm.com
* | powerpc/icp-hv: Fix missing of_node_put() in success pathNicholas Mc Guire2020-08-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | Both of_find_compatible_node() and of_find_node_by_type() will return a refcounted node on success - thus for the success path the node must be explicitly released with a of_node_put(). Fixes: 0b05ac6e2480 ("powerpc/xics: Rewrite XICS driver") Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1530691407-3991-1-git-send-email-hofrat@osadl.org
* | powerpc/pseries: Fix missing of_node_put() in rng_init()Nicholas Mc Guire2020-08-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | The call to of_find_compatible_node() returns a node pointer with refcount incremented thus it must be explicitly decremented here before returning. Fixes: a489043f4626 ("powerpc/pseries: Implement arch_get_random_long() based on H_RANDOM") Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1530522496-14816-1-git-send-email-hofrat@osadl.org
* | Merge tag 'powerpc-5.9-3' of ↵Linus Torvalds2020-08-2320-24/+160
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Add perf support for emitting extended registers for power10. - A fix for CPU hotplug on pseries, where on large/loaded systems we may not wait long enough for the CPU to be offlined, leading to crashes. - Addition of a raw cputable entry for Power10, which is not required to boot, but is required to make our PMU setup work correctly in guests. - Three fixes for the recent changes on 32-bit Book3S to move modules into their own segment for strict RWX. - A fix for a recent change in our powernv PCI code that could lead to crashes. - A change to our perf interrupt accounting to avoid soft lockups when using some events, found by syzkaller. - A change in the way we handle power loss events from the hypervisor on pseries. We no longer immediately shut down if we're told we're running on a UPS. - A few other minor fixes. Thanks to Alexey Kardashevskiy, Andreas Schwab, Aneesh Kumar K.V, Anju T Sudhakar, Athira Rajeev, Christophe Leroy, Frederic Barrat, Greg Kurz, Kajol Jain, Madhavan Srinivasan, Michael Neuling, Michael Roth, Nageswara R Sastry, Oliver O'Halloran, Thiago Jung Bauermann, Vaidyanathan Srinivasan, Vasant Hegde. * tag 'powerpc-5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/perf/hv-24x7: Move cpumask file to top folder of hv-24x7 driver powerpc/32s: Fix module loading failure when VMALLOC_END is over 0xf0000000 powerpc/pseries: Do not initiate shutdown when system is running on UPS powerpc/perf: Fix soft lockups due to missed interrupt accounting powerpc/powernv/pci: Fix possible crash when releasing DMA resources powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death powerpc/32s: Fix is_module_segment() when MODULES_VADDR is defined powerpc/kasan: Fix KASAN_SHADOW_START on BOOK3S_32 powerpc/fixmap: Fix the size of the early debug area powerpc/pkeys: Fix build error with PPC_MEM_KEYS disabled powerpc/kernel: Cleanup machine check function declarations powerpc: Add POWER10 raw mode cputable entry powerpc/perf: Add extended regs support for power10 platform powerpc/perf: Add support for outputting extended regs in perf intr_regs powerpc: Fix P10 PVR revision in /proc/cpuinfo for SMT4 cores
| * powerpc/perf/hv-24x7: Move cpumask file to top folder of hv-24x7 driverKajol Jain2020-08-211-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 792f73f747b8 ("powerpc/hv-24x7: Add sysfs files inside hv-24x7 device to show cpumask") added cpumask file as part of hv-24x7 driver inside the interface folder. The cpumask file is supposed to be in the top folder of the PMU driver in order to make hotplug work. This patch fixes that issue and creates new group 'cpumask_attr_group' to add cpumask file and make sure it added in top folder. command:# cat /sys/devices/hv_24x7/cpumask 0 Fixes: 792f73f747b8 ("powerpc/hv-24x7: Add sysfs files inside hv-24x7 device to show cpumask") Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200821080610.123997-1-kjain@linux.ibm.com
| * powerpc/32s: Fix module loading failure when VMALLOC_END is over 0xf0000000Christophe Leroy2020-08-211-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In is_module_segment(), when VMALLOC_END is over 0xf0000000, ALIGN(VMALLOC_END, SZ_256M) has value 0. In that case, addr >= ALIGN(VMALLOC_END, SZ_256M) is always true then is_module_segment() always returns false. Use (ALIGN(VMALLOC_END, SZ_256M) - 1) which will have value 0xffffffff and will be suitable for the comparison. Fixes: c49643319715 ("powerpc/32s: Only leave NX unset on segments used for modules") Reported-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/09fc73fe9c7423c6b4cf93f93df9bb0ed8eefab5.1597994047.git.christophe.leroy@csgroup.eu
| * powerpc/pseries: Do not initiate shutdown when system is running on UPSVasant Hegde2020-08-201-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As per PAPR we have to look for both EPOW sensor value and event modifier to identify the type of event and take appropriate action. In LoPAPR v1.1 section 10.2.2 includes table 136 "EPOW Action Codes": SYSTEM_SHUTDOWN 3 The system must be shut down. An EPOW-aware OS logs the EPOW error log information, then schedules the system to be shut down to begin after an OS defined delay internal (default is 10 minutes.) Then in section 10.3.2.2.8 there is table 146 "Platform Event Log Format, Version 6, EPOW Section", which includes the "EPOW Event Modifier": For EPOW sensor value = 3 0x01 = Normal system shutdown with no additional delay 0x02 = Loss of utility power, system is running on UPS/Battery 0x03 = Loss of system critical functions, system should be shutdown 0x04 = Ambient temperature too high All other values = reserved We have a user space tool (rtas_errd) on LPAR to monitor for EPOW_SHUTDOWN_ON_UPS. Once it gets an event it initiates shutdown after predefined time. It also starts monitoring for any new EPOW events. If it receives "Power restored" event before predefined time it will cancel the shutdown. Otherwise after predefined time it will shutdown the system. Commit 79872e35469b ("powerpc/pseries: All events of EPOW_SYSTEM_SHUTDOWN must initiate shutdown") changed our handling of the "on UPS/Battery" case, to immediately shutdown the system. This breaks existing setups that rely on the userspace tool to delay shutdown and let the system run on the UPS. Fixes: 79872e35469b ("powerpc/pseries: All events of EPOW_SYSTEM_SHUTDOWN must initiate shutdown") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [mpe: Massage change log and add PAPR references] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200820061844.306460-1-hegdevasant@linux.vnet.ibm.com
| * powerpc/perf: Fix soft lockups due to missed interrupt accountingAthira Rajeev2020-08-201-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Performance monitor interrupt handler checks if any counter has overflown and calls record_and_restart() in core-book3s which invokes perf_event_overflow() to record the sample information. Apart from creating sample, perf_event_overflow() also does the interrupt and period checks via perf_event_account_interrupt(). Currently we record information only if the SIAR (Sampled Instruction Address Register) valid bit is set (using siar_valid() check) and hence the interrupt check. But it is possible that we do sampling for some events that are not generating valid SIAR, and hence there is no chance to disable the event if interrupts are more than max_samples_per_tick. This leads to soft lockup. Fix this by adding perf_event_account_interrupt() in the invalid SIAR code path for a sampling event. ie if SIAR is invalid, just do interrupt check and don't record the sample information. Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1596717992-7321-1-git-send-email-atrajeev@linux.vnet.ibm.com
| * powerpc/powernv/pci: Fix possible crash when releasing DMA resourcesFrederic Barrat2020-08-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a typo introduced during recent code cleanup, which could lead to silently not freeing resources or an oops message (on PCI hotplug or CAPI reset). Only impacts ioda2, the code path for ioda1 is correct. Fixes: 01e12629af4e ("powerpc/powernv/pci: Add explicit tracking of the DMA setup state") Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200819130741.16769-1-fbarrat@linux.ibm.com
| * powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU deathMichael Roth2020-08-181-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For a power9 KVM guest with XIVE enabled, running a test loop where we hotplug 384 vcpus and then unplug them, the following traces can be seen (generally within a few loops) either from the unplugged vcpu: cpu 65 (hwid 65) Ready to die... Querying DEAD? cpu 66 (66) shows 2 list_del corruption. next->prev should be c00a000002470208, but was c00a000002470048 ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:56! Oops: Exception in kernel mode, sig: 5 [#1] LE SMP NR_CPUS=2048 NUMA pSeries Modules linked in: fuse nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 ... CPU: 66 PID: 0 Comm: swapper/66 Kdump: loaded Not tainted 4.18.0-221.el8.ppc64le #1 NIP: c0000000007ab50c LR: c0000000007ab508 CTR: 00000000000003ac REGS: c0000009e5a17840 TRAP: 0700 Not tainted (4.18.0-221.el8.ppc64le) MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28000842 XER: 20040000 ... NIP __list_del_entry_valid+0xac/0x100 LR __list_del_entry_valid+0xa8/0x100 Call Trace: __list_del_entry_valid+0xa8/0x100 (unreliable) free_pcppages_bulk+0x1f8/0x940 free_unref_page+0xd0/0x100 xive_spapr_cleanup_queue+0x148/0x1b0 xive_teardown_cpu+0x1bc/0x240 pseries_mach_cpu_die+0x78/0x2f0 cpu_die+0x48/0x70 arch_cpu_idle_dead+0x20/0x40 do_idle+0x2f4/0x4c0 cpu_startup_entry+0x38/0x40 start_secondary+0x7bc/0x8f0 start_secondary_prolog+0x10/0x14 or on the worker thread handling the unplug: pseries-hotplug-cpu: Attempting to remove CPU <NULL>, drc index: 1000013a Querying DEAD? cpu 314 (314) shows 2 BUG: Bad page state in process kworker/u768:3 pfn:95de1 cpu 314 (hwid 314) Ready to die... page:c00a000002577840 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0 flags: 0x5ffffc00000000() raw: 005ffffc00000000 5deadbeef0000100 5deadbeef0000200 0000000000000000 raw: 0000000000000000 0000000000000000 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount Modules linked in: kvm xt_CHECKSUM ipt_MASQUERADE xt_conntrack ... CPU: 0 PID: 548 Comm: kworker/u768:3 Kdump: loaded Not tainted 4.18.0-224.el8.bz1856588.ppc64le #1 Workqueue: pseries hotplug workque pseries_hp_work_fn Call Trace: dump_stack+0xb0/0xf4 (unreliable) bad_page+0x12c/0x1b0 free_pcppages_bulk+0x5bc/0x940 page_alloc_cpu_dead+0x118/0x120 cpuhp_invoke_callback.constprop.5+0xb8/0x760 _cpu_down+0x188/0x340 cpu_down+0x5c/0xa0 cpu_subsys_offline+0x24/0x40 device_offline+0xf0/0x130 dlpar_offline_cpu+0x1c4/0x2a0 dlpar_cpu_remove+0xb8/0x190 dlpar_cpu_remove_by_index+0x12c/0x150 dlpar_cpu+0x94/0x800 pseries_hp_work_fn+0x128/0x1e0 process_one_work+0x304/0x5d0 worker_thread+0xcc/0x7a0 kthread+0x1ac/0x1c0 ret_from_kernel_thread+0x5c/0x80 The latter trace is due to the following sequence: page_alloc_cpu_dead drain_pages drain_pages_zone free_pcppages_bulk where drain_pages() in this case is called under the assumption that the unplugged cpu is no longer executing. To ensure that is the case, and early call is made to __cpu_die()->pseries_cpu_die(), which runs a loop that waits for the cpu to reach a halted state by polling its status via query-cpu-stopped-state RTAS calls. It only polls for 25 iterations before giving up, however, and in the trace above this results in the following being printed only .1 seconds after the hotplug worker thread begins processing the unplug request: pseries-hotplug-cpu: Attempting to remove CPU <NULL>, drc index: 1000013a Querying DEAD? cpu 314 (314) shows 2 At that point the worker thread assumes the unplugged CPU is in some unknown/dead state and procedes with the cleanup, causing the race with the XIVE cleanup code executed by the unplugged CPU. Fix this by waiting indefinitely, but also making an effort to avoid spurious lockup messages by allowing for rescheduling after polling the CPU status and printing a warning if we wait for longer than 120s. Fixes: eac1e731b59ee ("powerpc/xive: guest exploitation of the XIVE interrupt controller") Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Tested-by: Greg Kurz <groug@kaod.org> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Reviewed-by: Greg Kurz <groug@kaod.org> [mpe: Trim oopses in change log slightly for readability] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200811161544.10513-1-mdroth@linux.vnet.ibm.com
| * powerpc/32s: Fix is_module_segment() when MODULES_VADDR is definedChristophe Leroy2020-08-181-0/+7
| | | | | | | | | | | | | | | | | | | | When MODULES_VADDR is defined, is_module_segment() shall check the address against it instead of checking agains VMALLOC_START. Fixes: 6ca055322da8 ("powerpc/32s: Use dedicated segment for modules with STRICT_KERNEL_RWX") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/07884ed033c31e074747b7eb8eaa329d15db07ec.1596641219.git.christophe.leroy@csgroup.eu