author    | Valentin Schneider <valentin.schneider@arm.com> | 2022-01-27 15:40:59 +0000
committer | Greg Kroah-Hartman <gregkh@linuxfoundation.org> | 2022-04-08 14:23:11 +0200
commit    | 1e0e63ad6243f378e8476b42412f83c1be6ebbed (patch)
tree      | 2f8e42cb5a2fe228f2ea5eab27f8adcab5709194 /kernel
parent    | b7aec0843e2883b4bd5bed46f70b0febb046f9cc (diff)
sched/rt: Plug rt_mutex_setprio() vs push_rt_task() race
[ Upstream commit 49bef33e4b87b743495627a529029156c6e09530 ]
John reported that push_rt_task() can end up invoking
find_lowest_rq(rq->curr) when curr is not an RT task (in this case a CFS
one), which causes mayhem down convert_prio().
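For context, convert_prio() in kernel/sched/cpupri.c only maps RT-range priorities (plus a couple of sentinels) to a cpupri index. A minimal userspace sketch of that mapping, with constants and switch shape assumed from the cpupri code of that era rather than quoted verbatim, shows how a CFS priority such as the default 120 matches no case at all:

#include <stdio.h>

/* Assumed values, mirroring the kernel's cpupri constants of that era. */
#define MAX_RT_PRIO	100
#define CPUPRI_INVALID	(-1)
#define CPUPRI_NORMAL	0
#define CPUPRI_HIGHER	100

/* Sketch of convert_prio(): only RT priorities (0..99), the stop-class
 * pseudo-priority and the invalid sentinel have a defined mapping. */
static int convert_prio(int prio)
{
	int cpupri = CPUPRI_INVALID;	/* the in-kernel version leaves this uninitialized */

	switch (prio) {
	case CPUPRI_INVALID:
		cpupri = CPUPRI_INVALID;
		break;
	case 0 ... 98:			/* RT prio 0..98 (GCC case-range extension) */
		cpupri = MAX_RT_PRIO - 1 - prio;
		break;
	case MAX_RT_PRIO - 1:		/* RT prio 99 */
		cpupri = CPUPRI_NORMAL;
		break;
	case MAX_RT_PRIO:		/* stop class */
		cpupri = CPUPRI_HIGHER;
		break;
	}
	/* A CFS prio such as 120 matches no case: in the kernel, garbage out. */
	return cpupri;
}

int main(void)
{
	printf("RT prio 10   -> cpupri %d\n", convert_prio(10));	/* 89 */
	printf("CFS prio 120 -> cpupri %d\n", convert_prio(120));	/* unmapped */
	return 0;
}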
This can happen when current gets demoted to e.g. CFS when releasing an
rt_mutex, and the local CPU gets hit with an rto_push_work irqwork before
getting the chance to reschedule. Exactly who triggers this work isn't
entirely clear to me - switched_from_rt() only invokes rt_queue_pull_task()
if there are no RT tasks on the local RQ, which means the local CPU can't
be in the rto_mask.
My current suspected sequence is something along the lines of the below,
with the demoted task being current.
mark_wakeup_next_waiter()
  rt_mutex_adjust_prio()
    rt_mutex_setprio() // deboost originally-CFS task
      check_class_changed()
        switched_from_rt() // Only rt_queue_pull_task() if !rq->rt.rt_nr_running
        switched_to_fair() // Sets need_resched
      __balance_callbacks() // if pull_rt_task(), tell_cpu_to_push() can't select local CPU per the above
      raw_spin_rq_unlock(rq)

// need_resched is set, so task_woken_rt() can't
// invoke push_rt_tasks(). Best I can come up with is
// local CPU has rt_nr_migratory >= 2 after the demotion, so stays
// in the rto_mask, and then:

<some other CPU running rto_push_irq_work_func() queues rto_push_work on this CPU>

push_rt_task()
  // breakage follows here as rq->curr is CFS
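To make the "task_woken_rt() can't invoke push_rt_tasks()" step concrete, here is a tiny userspace mock-up (hypothetical struct and function names modelled on task_woken_rt()'s guard, not the kernel's actual types) of how a set need_resched on the demoted current closes the wakeup-time push path:

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical mock of the state task_woken_rt() consults. */
struct task {
	int prio;		/* lower value = higher priority */
	bool need_resched;
	int nr_cpus_allowed;
	bool is_rt;
};

/* Modelled on task_woken_rt()'s condition: no push is attempted when the
 * running task already has need_resched set. */
static bool wakeup_would_push(const struct task *curr, const struct task *woken)
{
	return !curr->need_resched &&
	       woken->nr_cpus_allowed > 1 &&
	       curr->is_rt &&
	       curr->prio <= woken->prio;
}

int main(void)
{
	/* current was just deboosted to CFS; switched_to_fair() set need_resched */
	struct task curr  = { .prio = 120, .need_resched = true,  .nr_cpus_allowed = 4, .is_rt = false };
	struct task woken = { .prio = 50,  .need_resched = false, .nr_cpus_allowed = 4, .is_rt = true  };

	/* Prints "no": the push is left to the rto_push_work irqwork instead. */
	printf("push from wakeup? %s\n", wakeup_would_push(&curr, &woken) ? "yes" : "no");
	return 0;
}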
Move the existing rq->curr vs next-pushable-task priority check so it happens
before we get anywhere near find_lowest_rq(). While at it, add an explicit
check of rq->curr's sched_class prior to invoking find_lowest_rq(rq->curr).
Align the DL logic so it too reschedules regardless of next_task's
migratability.
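Condensed from the rt.c hunks below (bodies elided in comments, so this is an illustrative sketch rather than compilable code), the retry path in push_rt_task() ends up ordered like this:

retry:
	/*
	 * 1) Moved up: if next_task outranks rq->curr (always true when curr
	 *    was just demoted to CFS), reschedule and bail out before any
	 *    find_lowest_rq() call can see a non-RT curr.
	 */
	if (unlikely(next_task->prio < rq->curr->prio)) {
		resched_curr(rq);
		return 0;
	}

	if (is_migration_disabled(next_task)) {
		/* ... */
		/*
		 * 2) New: never hand a non-RT rq->curr to find_lowest_rq().
		 *    Stoppers masquerade as SCHED_FIFO, hence the sched_class
		 *    test instead of rt_task().
		 */
		if (rq->curr->sched_class != &rt_sched_class)
			return 0;

		cpu = find_lowest_rq(rq->curr);
		/* ... try to push curr away via the stopper ... */
	}
	/* ... WARN_ON / get_task_struct() path continues as before ... */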
Fixes: a7c81556ec4d ("sched: Fix migrate_disable() vs rt/dl balancing")
Reported-by: John Keeping <john@metanate.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: John Keeping <john@metanate.com>
Link: https://lore.kernel.org/r/20220127154059.974729-1-valentin.schneider@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Diffstat (limited to 'kernel')
-rw-r--r-- | kernel/sched/deadline.c | 12
-rw-r--r-- | kernel/sched/rt.c       | 32
2 files changed, 28 insertions(+), 16 deletions(-)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index e94314633b39..1f811b375bf0 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2145,12 +2145,6 @@ static int push_dl_task(struct rq *rq)
 		return 0;
 
 retry:
-	if (is_migration_disabled(next_task))
-		return 0;
-
-	if (WARN_ON(next_task == rq->curr))
-		return 0;
-
 	/*
 	 * If next_task preempts rq->curr, and rq->curr
 	 * can move away, it makes sense to just reschedule
@@ -2163,6 +2157,12 @@ retry:
 		return 0;
 	}
 
+	if (is_migration_disabled(next_task))
+		return 0;
+
+	if (WARN_ON(next_task == rq->curr))
+		return 0;
+
 	/* We might release rq lock */
 	get_task_struct(next_task);
 
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 54f9bb3f1560..2758cf5f7987 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1900,6 +1900,16 @@ static int push_rt_task(struct rq *rq, bool pull)
 		return 0;
 
 retry:
+	/*
+	 * It's possible that the next_task slipped in of
+	 * higher priority than current. If that's the case
+	 * just reschedule current.
+	 */
+	if (unlikely(next_task->prio < rq->curr->prio)) {
+		resched_curr(rq);
+		return 0;
+	}
+
 	if (is_migration_disabled(next_task)) {
 		struct task_struct *push_task = NULL;
 		int cpu;
@@ -1907,6 +1917,18 @@ retry:
 		if (!pull || rq->push_busy)
 			return 0;
 
+		/*
+		 * Invoking find_lowest_rq() on anything but an RT task doesn't
+		 * make sense. Per the above priority check, curr has to
+		 * be of higher priority than next_task, so no need to
+		 * reschedule when bailing out.
+		 *
+		 * Note that the stoppers are masqueraded as SCHED_FIFO
+		 * (cf. sched_set_stop_task()), so we can't rely on rt_task().
+		 */
+		if (rq->curr->sched_class != &rt_sched_class)
+			return 0;
+
 		cpu = find_lowest_rq(rq->curr);
 		if (cpu == -1 || cpu == rq->cpu)
 			return 0;
@@ -1931,16 +1953,6 @@ retry:
 	if (WARN_ON(next_task == rq->curr))
 		return 0;
 
-	/*
-	 * It's possible that the next_task slipped in of
-	 * higher priority than current. If that's the case
-	 * just reschedule current.
-	 */
-	if (unlikely(next_task->prio < rq->curr->prio)) {
-		resched_curr(rq);
-		return 0;
-	}
-
 	/* We might release rq lock */
 	get_task_struct(next_task);