summaryrefslogtreecommitdiffstats
path: root/fs/gfs2
diff options
context:
space:
mode:
authorXue jiufei <xuejiufei@huawei.com>2014-06-04 16:06:14 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2014-06-04 16:53:54 -0700
commit01c6222f876062355599e5a63560c514b6de25d2 (patch)
treebf29b34aac8ea3a95fefc0e11873ea3a77ffa858 /fs/gfs2
parenta9e9acaeb0a981a6dfa54b32dd756103aeefa6a7 (diff)
downloadlinux-01c6222f876062355599e5a63560c514b6de25d2.tar.gz
linux-01c6222f876062355599e5a63560c514b6de25d2.tar.bz2
linux-01c6222f876062355599e5a63560c514b6de25d2.zip
ocfs2/dlm: disallow node joining when recovery is on going
We found a race situation when dlm recovery and node joining occurs simultaneously if the network state is bad. N1 N4 start joining dlm and send query join to all live nodes set joining node to N1, return OK send query join to other live nodes and it may take a while call dlm_send_join_assert() to send assert join message when N2 is down, so keep trying to send message to N2 until find N2 is down send assert join message to N3, but connection is down with N3, so it may take a while become the recovery master for N2 and send begin reco message to other nodes in domain map but no N1 connection with N3 is rebuild, then send assert join to N4 call dlm_assert_joined_handler(), add N1 to domain_map dlm recovery done, send finalize message to nodes in domain map, including N1 receiving finalize message, trigger the BUG() because recovery master mismatch. Signed-off-by: joyce.xue <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'fs/gfs2')
0 files changed, 0 insertions, 0 deletions