diff options
author | Peter Zijlstra <peterz@infradead.org> | 2016-05-20 18:04:36 +0200 |
---|---|---|
committer | Greg Kroah-Hartman <gregkh@linuxfoundation.org> | 2016-06-01 12:17:02 -0700 |
commit | acc55cf5339a99389669139f83463ada2b1a6559 (patch) | |
tree | 8d5909bcb5d6f327ba50c699eb1b1a5ee2c0b7b2 | |
parent | 5637fc0d84fc4dd499b4f59ea2390175ca154906 (diff) | |
download | linux-stable-acc55cf5339a99389669139f83463ada2b1a6559.tar.gz linux-stable-acc55cf5339a99389669139f83463ada2b1a6559.tar.bz2 linux-stable-acc55cf5339a99389669139f83463ada2b1a6559.zip |
locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()
commit 54cf809b9512be95f53ed4a5e3b631d1ac42f0fa upstream.
Similar to commits:
51d7d5205d33 ("powerpc: Add smp_mb() to arch_spin_is_locked()")
d86b8da04dfa ("arm64: spinlock: serialise spin_unlock_wait against concurrent lockers")
qspinlock suffers from the fact that the _Q_LOCKED_VAL store is
unordered inside the ACQUIRE of the lock.
And while this is not a problem for the regular mutual exclusive
critical section usage of spinlocks, it breaks creative locking like:
spin_lock(A) spin_lock(B)
spin_unlock_wait(B) if (!spin_is_locked(A))
do_something() do_something()
In that both CPUs can end up running do_something at the same time,
because our _Q_LOCKED_VAL store can drop past the spin_unlock_wait()
spin_is_locked() loads (even on x86!!).
To avoid making the normal case slower, add smp_mb()s to the less used
spin_unlock_wait() / spin_is_locked() side of things to avoid this
problem.
Reported-and-tested-by: Davidlohr Bueso <dave@stgolabs.net>
Reported-by: Giovanni Gherdovich <ggherdovich@suse.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-rw-r--r-- | include/asm-generic/qspinlock.h | 27 |
1 files changed, 26 insertions, 1 deletions
diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h index 39e1cb201b8e..332da3ad8eb5 100644 --- a/include/asm-generic/qspinlock.h +++ b/include/asm-generic/qspinlock.h @@ -28,7 +28,30 @@ */ static __always_inline int queued_spin_is_locked(struct qspinlock *lock) { - return atomic_read(&lock->val); + /* + * queued_spin_lock_slowpath() can ACQUIRE the lock before + * issuing the unordered store that sets _Q_LOCKED_VAL. + * + * See both smp_cond_acquire() sites for more detail. + * + * This however means that in code like: + * + * spin_lock(A) spin_lock(B) + * spin_unlock_wait(B) spin_is_locked(A) + * do_something() do_something() + * + * Both CPUs can end up running do_something() because the store + * setting _Q_LOCKED_VAL will pass through the loads in + * spin_unlock_wait() and/or spin_is_locked(). + * + * Avoid this by issuing a full memory barrier between the spin_lock() + * and the loads in spin_unlock_wait() and spin_is_locked(). + * + * Note that regular mutual exclusion doesn't care about this + * delayed store. + */ + smp_mb(); + return atomic_read(&lock->val) & _Q_LOCKED_MASK; } /** @@ -108,6 +131,8 @@ static __always_inline void queued_spin_unlock(struct qspinlock *lock) */ static inline void queued_spin_unlock_wait(struct qspinlock *lock) { + /* See queued_spin_is_locked() */ + smp_mb(); while (atomic_read(&lock->val) & _Q_LOCKED_MASK) cpu_relax(); } |