summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorAl Viro <viro@zeniv.linux.org.uk>2018-05-20 16:46:23 -0400
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>2018-05-30 07:51:47 +0200
commitfbcede36bbfd69a9df511a22d14bccaaf36b412a (patch)
tree1cd98fb902ea531ff61dc2465e6f7d28e0065d65
parentb9659ff375cbb965bbfa3cbce1336b917aa0831e (diff)
downloadlinux-stable-fbcede36bbfd69a9df511a22d14bccaaf36b412a.tar.gz
linux-stable-fbcede36bbfd69a9df511a22d14bccaaf36b412a.tar.bz2
linux-stable-fbcede36bbfd69a9df511a22d14bccaaf36b412a.zip
aio: fix io_destroy(2) vs. lookup_ioctx() race
commit baf10564fbb66ea222cae66fbff11c444590ffd9 upstream. kill_ioctx() used to have an explicit RCU delay between removing the reference from ->ioctx_table and percpu_ref_kill() dropping the refcount. At some point that delay had been removed, on the theory that percpu_ref_kill() itself contained an RCU delay. Unfortunately, that was the wrong kind of RCU delay and it didn't care about rcu_read_lock() used by lookup_ioctx(). As the result, we could get ctx freed right under lookup_ioctx(). Tejun has fixed that in a6d7cff472e ("fs/aio: Add explicit RCU grace period when freeing kioctx"); however, that fix is not enough. Suppose io_destroy() from one thread races with e.g. io_setup() from another; CPU1 removes the reference from current->mm->ioctx_table[...] just as CPU2 has picked it (under rcu_read_lock()). Then CPU1 proceeds to drop the refcount, getting it to 0 and triggering a call of free_ioctx_users(), which proceeds to drop the secondary refcount and once that reaches zero calls free_ioctx_reqs(). That does INIT_RCU_WORK(&ctx->free_rwork, free_ioctx); queue_rcu_work(system_wq, &ctx->free_rwork); and schedules freeing the whole thing after RCU delay. In the meanwhile CPU2 has gotten around to percpu_ref_get(), bumping the refcount from 0 to 1 and returned the reference to io_setup(). Tejun's fix (that queue_rcu_work() in there) guarantees that ctx won't get freed until after percpu_ref_get(). Sure, we'd increment the counter before ctx can be freed. Now we are out of rcu_read_lock() and there's nothing to stop freeing of the whole thing. Unfortunately, CPU2 assumes that since it has grabbed the reference, ctx is *NOT* going away until it gets around to dropping that reference. The fix is obvious - use percpu_ref_tryget_live() and treat failure as miss. It's not costlier than what we currently do in normal case, it's safe to call since freeing *is* delayed and it closes the race window - either lookup_ioctx() comes before percpu_ref_kill() (in which case ctx->users won't reach 0 until the caller of lookup_ioctx() drops it) or lookup_ioctx() fails, ctx->users is unaffected and caller of lookup_ioctx() doesn't see the object in question at all. Cc: stable@kernel.org Fixes: a6d7cff472e "fs/aio: Add explicit RCU grace period when freeing kioctx" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-rw-r--r--fs/aio.c4
1 files changed, 2 insertions, 2 deletions
diff --git a/fs/aio.c b/fs/aio.c
index c3ace7833a03..4e23958c2509 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1087,8 +1087,8 @@ static struct kioctx *lookup_ioctx(unsigned long ctx_id)
ctx = rcu_dereference(table->table[id]);
if (ctx && ctx->user_id == ctx_id) {
- percpu_ref_get(&ctx->users);
- ret = ctx;
+ if (percpu_ref_tryget_live(&ctx->users))
+ ret = ctx;
}
out:
rcu_read_unlock();