diff options
author | Steven Rostedt <rostedt@goodmis.org> | 2008-11-26 21:04:24 -0500 |
---|---|---|
committer | Ingo Molnar <mingo@elte.hu> | 2008-11-27 10:29:52 +0100 |
commit | 4cd4262034849da01eb88659af677b69f8169f06 (patch) | |
tree | eaab94e7fd4a436bcead7efd6684405252f336c4 | |
parent | ee2f6cc7f9ea2542ad46070ed62ba7aa04d08871 (diff) | |
download | linux-stable-4cd4262034849da01eb88659af677b69f8169f06.tar.gz linux-stable-4cd4262034849da01eb88659af677b69f8169f06.tar.bz2 linux-stable-4cd4262034849da01eb88659af677b69f8169f06.zip |
sched: prevent divide by zero error in cpu_avg_load_per_task
Impact: fix divide by zero crash in scheduler rebalance irq
While testing the branch profiler, I hit this crash:
divide error: 0000 [#1] PREEMPT SMP
[...]
RIP: 0010:[<ffffffff8024a008>] [<ffffffff8024a008>] cpu_avg_load_per_task+0x50/0x7f
[...]
Call Trace:
<IRQ> <0> [<ffffffff8024fd43>] find_busiest_group+0x3e5/0xcaa
[<ffffffff8025da75>] rebalance_domains+0x2da/0xa21
[<ffffffff80478769>] ? find_next_bit+0x1b2/0x1e6
[<ffffffff8025e2ce>] run_rebalance_domains+0x112/0x19f
[<ffffffff8026d7c2>] __do_softirq+0xa8/0x232
[<ffffffff8020ea7c>] call_softirq+0x1c/0x3e
[<ffffffff8021047a>] do_softirq+0x94/0x1cd
[<ffffffff8026d5eb>] irq_exit+0x6b/0x10e
[<ffffffff8022e6ec>] smp_apic_timer_interrupt+0xd3/0xff
[<ffffffff8020e4b3>] apic_timer_interrupt+0x13/0x20
The code for cpu_avg_load_per_task has:
if (rq->nr_running)
rq->avg_load_per_task = rq->load.weight / rq->nr_running;
The runqueue lock is not held here, and there is nothing that prevents
the rq->nr_running from going to zero after it passes the if condition.
The branch profiler simply made the race window bigger.
This patch saves off the rq->nr_running to a local variable and uses that
for both the condition and the division.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
-rw-r--r-- | kernel/sched.c | 5 |
1 files changed, 3 insertions, 2 deletions
diff --git a/kernel/sched.c b/kernel/sched.c index 9b1e79371c20..700aa9a1413f 100644 --- a/kernel/sched.c +++ b/kernel/sched.c @@ -1453,9 +1453,10 @@ static int task_hot(struct task_struct *p, u64 now, struct sched_domain *sd); static unsigned long cpu_avg_load_per_task(int cpu) { struct rq *rq = cpu_rq(cpu); + unsigned long nr_running = rq->nr_running; - if (rq->nr_running) - rq->avg_load_per_task = rq->load.weight / rq->nr_running; + if (nr_running) + rq->avg_load_per_task = rq->load.weight / nr_running; else rq->avg_load_per_task = 0; |