diff options
author | Xue jiufei <xuejiufei@huawei.com> | 2013-09-11 14:19:46 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2013-09-11 15:56:31 -0700 |
commit | 98ac9125c5afed8c5d2e4c5824988f8ad51814e1 (patch) | |
tree | 123fa8641217bf5891caa2179146ec0a54de0dbd /fs/ocfs2/dlm | |
parent | f17c20dd2ec81e8ff328b81bc847da9429d0975b (diff) | |
download | linux-98ac9125c5afed8c5d2e4c5824988f8ad51814e1.tar.gz linux-98ac9125c5afed8c5d2e4c5824988f8ad51814e1.tar.bz2 linux-98ac9125c5afed8c5d2e4c5824988f8ad51814e1.zip |
ocfs2: dlm_request_all_locks() should deal with the status sent from target node
dlm_request_all_locks() should deal with the status sent from target node
if DLM_LOCK_REQUEST_MSG is sent successfully, or recovery master will fall
into endless loop, waiting for other nodes to send locks and
DLM_RECO_DATA_DONE_MSG to me.
NodeA NodeB
selected as recovery master
dlm_remaster_locks()
->dlm_request_all_locks()
send DLM_LOCK_REQUEST_MSG to nodeA
It happened that NodeA cannot alloc memory when it processes this
message. dlm_request_all_locks_handler() do not queue
dlm_request_all_locks_worker and returns -ENOMEM. It will never send
locks and DLM_RECO_DATA_DONE_MSG to NodeB.
NodeB do not deal with the status
sent from nodeA, and will fall in
endless loop waiting for the
recovery state of NodeA to be
changed.
Signed-off-by: joyce <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Jeff Liu <jeff.liu@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'fs/ocfs2/dlm')
-rw-r--r-- | fs/ocfs2/dlm/dlmrecovery.c | 5 |
1 files changed, 4 insertions, 1 deletions
diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c index 773bd32bfd8c..f94550218152 100644 --- a/fs/ocfs2/dlm/dlmrecovery.c +++ b/fs/ocfs2/dlm/dlmrecovery.c @@ -787,6 +787,7 @@ static int dlm_request_all_locks(struct dlm_ctxt *dlm, u8 request_from, { struct dlm_lock_request lr; int ret; + int status; mlog(0, "\n"); @@ -800,13 +801,15 @@ static int dlm_request_all_locks(struct dlm_ctxt *dlm, u8 request_from, // send message ret = o2net_send_message(DLM_LOCK_REQUEST_MSG, dlm->key, - &lr, sizeof(lr), request_from, NULL); + &lr, sizeof(lr), request_from, &status); /* negative status is handled by caller */ if (ret < 0) mlog(ML_ERROR, "%s: Error %d send LOCK_REQUEST to node %u " "to recover dead node %u\n", dlm->name, ret, request_from, dead_node); + else + ret = status; // return from here, then // sleep until all received or error return ret; |