MIPS: CPS: Handle cores not powering down more gracefully

If we get into a state where a core that ought to power down isn't doing so, the current result is that another CPU gets stuck inside cps_cpu_die(), waiting for the CPU that ought to be powering down to do so. The best case scenario is that we then trigger RCU stall messages or lockup messages, but neither makes it particularly clear what's happening.

Handle this more gracefully by introducing a timeout beyond which we warn the user that the core didn't power down & stop waiting for it. This at least allows the CPU running cps_cpu_die() to continue normally and, presuming the CPU that powered back up is doing nothing harmful, the system will hopefully continue functioning as normal.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16197/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
author: Paul Burton <paul.burton@imgtec.com> 2017-06-02 14:48:54 -0700
committer: Ralf Baechle <ralf@linux-mips.org> 2017-06-29 02:42:28 +0200
commit: 4ad755c9e39c0eeae16f96b97602f1954f582c66 (patch)
tree: a6770d2f6964e0aaa36c80ac55b4bce2eb6faf78 /arch
parent: 5570ba2ee920de4e7760a2802b842771845b2c32 (diff)
1 files changed, 24 insertions, 3 deletions
diff --git a/arch/mips/kernel/smp-cps.c b/arch/mips/kernel/smp-cps.c
index 90ecd099c4b0..f832e99ad4c3 100644
--- a/arch/mips/kernel/smp-cps.c
+++ b/arch/mips/kernel/smp-cps.c
@@ -490,6 +490,7 @@ static void cps_cpu_die(unsigned int cpu)
 {
 	unsigned core = cpu_data[cpu].core;
 	unsigned int vpe_id = cpu_vpe_id(&cpu_data[cpu]);
+	ktime_t fail_time;
 	unsigned stat;
 	int err;
 
@@ -516,6 +517,7 @@ static void cps_cpu_die(unsigned int cpu)
 		 * state, the latter happening when a JTAG probe is connected
 		 * in which case the CPC will refuse to power down the core.
 		 */
+		fail_time = ktime_add_ms(ktime_get(), 2000);
 		do {
 			mips_cm_lock_other(core, 0);
 			mips_cpc_lock_other(core);
@@ -523,9 +525,28 @@ static void cps_cpu_die(unsigned int cpu)
 			stat &= CPC_Cx_STAT_CONF_SEQSTATE_MSK;
 			mips_cpc_unlock_other();
 			mips_cm_unlock_other();
-		} while (stat != CPC_Cx_STAT_CONF_SEQSTATE_D0 &&
-			 stat != CPC_Cx_STAT_CONF_SEQSTATE_D2 &&
-			 stat != CPC_Cx_STAT_CONF_SEQSTATE_U2);
+
+			if (stat == CPC_Cx_STAT_CONF_SEQSTATE_D0 ||
+			    stat == CPC_Cx_STAT_CONF_SEQSTATE_D2 ||
+			    stat == CPC_Cx_STAT_CONF_SEQSTATE_U2)
+				break;
+
+			/*
+			 * The core ought to have powered down, but didn't &
+			 * now we don't really know what state it's in. It's
+			 * likely that its _pwr_up pin has been wired to logic
+			 * 1 & it powered back up as soon as we powered it
+			 * down...
+			 *
+			 * The best we can do is warn the user & continue in
+			 * the hope that the core is doing nothing harmful &
+			 * might behave properly if we online it later.
+			 */
+			if (WARN(ktime_after(ktime_get(), fail_time),
+				 "CPU%u hasn't powered down, seq. state %u\n",
+				 cpu, stat >> CPC_Cx_STAT_CONF_SEQSTATE_SHF))
+				break;
+		} while (1);
 
 		/* Indicate the core is powered off */
 		bitmap_clear(core_power, core, 1);
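
For readers who want to try the pattern outside the kernel, the bounded poll loop introduced above can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: clock_gettime(CLOCK_MONOTONIC) stands in for ktime_get(), fprintf() stands in for WARN(), and the names now_ms(), poll_with_timeout(), probe_fn, probe_never() and probe_countdown() are hypothetical helpers that do not appear in smp-cps.c.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical callback: returns true once the awaited state is reached. */
typedef bool (*probe_fn)(void *ctx);

/* Monotonic time in milliseconds; userspace stand-in for ktime_get(). */
static int64_t now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Poll probe() until it reports success or timeout_ms has elapsed.
 * As in the patch, the state is checked first and the deadline second,
 * so a state reached on the final iteration still counts as success.
 * Returns true on success, false (after printing a warning) on timeout.
 */
static bool poll_with_timeout(probe_fn probe, void *ctx, int64_t timeout_ms)
{
	int64_t fail_time;

	assert(probe != NULL);
	fail_time = now_ms() + timeout_ms;

	for (;;) {
		if (probe(ctx))
			return true;

		if (now_ms() > fail_time) {
			fprintf(stderr,
				"warning: state not reached after %lld ms\n",
				(long long)timeout_ms);
			return false;
		}
	}
}

/* Example probe that never succeeds, like a core that will not power down. */
static bool probe_never(void *ctx)
{
	(void)ctx;
	return false;
}

/* Example probe that succeeds once the counter behind ctx reaches zero. */
static bool probe_countdown(void *ctx)
{
	int *remaining = ctx;

	return --(*remaining) <= 0;
}
```

Calling poll_with_timeout(probe_never, NULL, 20) gives up after roughly 20 ms and returns false, while a probe_countdown() counter started at 3 succeeds on the third poll. The caller then decides how to proceed after a timeout; in the patch, cps_cpu_die() simply carries on and marks the core as powered off.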