diff options
author | Tariq Saeed <tariq.x.saeed@oracle.com> | 2014-08-06 16:03:58 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2014-08-06 18:01:13 -0700 |
commit | bba1cb17d9adb8c074f138c1cc039fb2ee8c964c (patch) | |
tree | ba3766311c7dffa2be7c013b9dc3585653bdbd48 /fs/ocfs2 | |
parent | 7567c148835f7ac16edf662d65816640cf8e20bb (diff) | |
download | linux-bba1cb17d9adb8c074f138c1cc039fb2ee8c964c.tar.gz linux-bba1cb17d9adb8c074f138c1cc039fb2ee8c964c.tar.bz2 linux-bba1cb17d9adb8c074f138c1cc039fb2ee8c964c.zip |
ocfs2: race between umount and unfinished remastering during recovery
Orabug: 19074140
When umount is issued during recovery on the new master that has not
finished remastering locks, it triggers BUG() in
dlm_send_mig_lockres_msg(). Here is the situation:
1) node A has a lock on resource X mastered by node B.
2) node B dies -> node A sets recovering flag for res X
3) Node C becomes the new master for resources owned by the
dead node and is remastering locks of the dead node but
has not finished the remastering process yet.
4) umount is issued on node C.
5) During processing of umount, ignoring unfished recovery,
node C attempts to migrate resource X to node A.
6) node A finds res X in DLM_LOCK_RES_RECOVERING state, considers
it a logic error and sends back -EFAULT.
7) node C asserts BUG() upon seeing EFAULT resp from node B.
Fix is to delay migrating res X till remastering is finished at which
point recovering flag will be cleared on both A and C.
Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'fs/ocfs2')
-rw-r--r-- | fs/ocfs2/dlm/dlmmaster.c | 4 |
1 files changed, 4 insertions, 0 deletions
diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c index 82abf0cc9a12..3ec906ef5d9a 100644 --- a/fs/ocfs2/dlm/dlmmaster.c +++ b/fs/ocfs2/dlm/dlmmaster.c @@ -2405,6 +2405,10 @@ static int dlm_is_lockres_migrateable(struct dlm_ctxt *dlm, if (res->state & DLM_LOCK_RES_MIGRATING) return 0; + /* delay migration when the lockres is in RECOCERING state */ + if (res->state & DLM_LOCK_RES_RECOVERING) + return 0; + if (res->owner != dlm->node_num) return 0; |