summaryrefslogtreecommitdiffstats
path: root/arch/powerpc
diff options
context:
space:
mode:
authorMilton Miller <miltonm@bga.com>2011-05-24 20:34:18 +0000
committerBenjamin Herrenschmidt <benh@kernel.crashing.org>2011-05-26 13:38:59 +1000
commit9b7882515864117d0015a3484c0ba0eee6713de9 (patch)
tree727b76078328927ba87f060b7bb57219ead3f314 /arch/powerpc
parent2e455257d143f54b44701e947a092d513889d01c (diff)
downloadlinux-9b7882515864117d0015a3484c0ba0eee6713de9.tar.gz
linux-9b7882515864117d0015a3484c0ba0eee6713de9.tar.bz2
linux-9b7882515864117d0015a3484c0ba0eee6713de9.zip
powerpc/irq: Protect irq_radix_revmap_lookup against irq_free_virt
The radix-tree code uses call_rcu when freeing internal elements. We must protect against the elements being freed while we traverse the tree, even if the returned pointer will still be valid. While preparing a patch to expand the context in which irq_radix_revmap_lookup will be called, I realized that the radix tree was not locked. When asked For a normal call_rcu usage, is it allowed to read the structure in irq_enter / irq_exit, without additional rcu_read_lock? Could an element freed with call_rcu advance with the cpu still between irq_enter/irq_exit (and irq_disabled())? Paul McKenney replied: Absolutely illegal to do so. OK for call_rcu_sched(), but a flaming bug for call_rcu(). And thank you very much for finding this!!! Further analysis: In the current CONFIG_TREE_RCU implementation. CONFIG_TREE_PREEMPT_RCU (and CONFIG_TINY_PREEMPT_RCU) uses explicit counters. These counters are reflected from per-CPU to global in the scheduling-clock-interrupt handler, so disabling irq does prevent the grace period from completing. But there are real-time implementations (such as the one use by the Concurrent guys) where disabling irq does -not- prevent the grace period from completing. While an alternative fix would be to switch radix-tree to rcu_sched, I don't want to audit the other users of radix trees (nor put alternative freeing in the library). The normal overhead for rcu_read_lock and unlock are a local counter increment and decrement. This does not show up in the rcu lockdep because in 2.6.34 commit 2676a58c98 (radix-tree: Disable RCU lockdep checking in radix tree) deemed it too hard to pass the condition of the protecting lock to the library. Signed-off-by: Milton Miller <miltonm@bga.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Diffstat (limited to 'arch/powerpc')
-rw-r--r--arch/powerpc/kernel/irq.c7
1 files changed, 5 insertions, 2 deletions
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index ac4d29119f3e..6cb3fcd7fc37 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -893,10 +893,13 @@ unsigned int irq_radix_revmap_lookup(struct irq_host *host,
return irq_find_mapping(host, hwirq);
/*
- * No rcu_read_lock(ing) needed, the ptr returned can't go under us
- * as it's referencing an entry in the static irq_map table.
+ * The ptr returned references the static global irq_map.
+ * but freeing an irq can delete nodes along the path to
+ * do the lookup via call_rcu.
*/
+ rcu_read_lock();
ptr = radix_tree_lookup(&host->revmap_data.tree, hwirq);
+ rcu_read_unlock();
/*
* If found in radix tree, then fine.