diff options
author | Or Gerlitz <ogerlitz@voltaire.com> | 2010-05-05 17:31:44 +0300 |
---|---|---|
committer | Roland Dreier <rolandd@cisco.com> | 2010-05-12 09:30:44 -0700 |
commit | 39ff05dbbbdb082bbabf06206c56b3cd4ef73904 (patch) | |
tree | 85466e1e75d632b33a294dea436fad2f3233fe52 /drivers/infiniband/ulp/iser/iscsi_iser.h | |
parent | d265b9808272c9f25e1c36d3fb5ddb466efd90e9 (diff) | |
download | linux-39ff05dbbbdb082bbabf06206c56b3cd4ef73904.tar.gz linux-39ff05dbbbdb082bbabf06206c56b3cd4ef73904.tar.bz2 linux-39ff05dbbbdb082bbabf06206c56b3cd4ef73904.zip |
IB/iser: Enhance disconnection logic for multi-pathing
The iser connection teardown flow isn't over until the underlying
Connection Manager (e.g the IB CM) delivers a disconnected or timeout
event through the RDMA-CM. When the remote (target) side isn't
reachable, e.g when some HW e.g port/hca/switch isn't functioning or
taken down administratively, the CM timeout flow is used and the event
may be generated only after relatively long time -- on the order of
tens of seconds.
The current iser code exposes this possibly long delay to higher
layers, specifically to the iscsid daemon and iscsi kernel stack. As a
result, the iscsi stack doesn't respond well: this low-level CM delay
is added to the fail-over time under HA schemes such as the one
provided by DM multipath through the multipathd(8) service.
This patch enhances the reference counting scheme on iser's IB
connections so that the disconnect flow initiated by iscsid from user
space (ep_disconnect) doesn't wait for the CM to deliver the
disconnect/timeout event. (The connection teardown isn't done from
iser's view point until the event is delivered)
The iser ib (rdma) connection object is destroyed when its reference
count reaches zero. When this happens on the RDMA-CM callback
context, extra care is taken so that the RDMA-CM does the actual
destroying of the associated ID, since doing it in the callback is
prohibited.
The reference count of iser ib connection normally reaches three,
where the <ref, deref> relations are
1. conn <init, terminate>
2. conn <bind, stop/destroy>
3. cma id <create, disconnect/error/timeout callbacks>
With this patch, multipath fail-over time is about 30 seconds, while
without this patch, multipath fail-over time is about 130 seconds.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Diffstat (limited to 'drivers/infiniband/ulp/iser/iscsi_iser.h')
-rw-r--r-- | drivers/infiniband/ulp/iser/iscsi_iser.h | 3 |
1 files changed, 1 insertions, 2 deletions
diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h index 53da74b45c75..f1df01567bb6 100644 --- a/drivers/infiniband/ulp/iser/iscsi_iser.h +++ b/drivers/infiniband/ulp/iser/iscsi_iser.h @@ -247,7 +247,6 @@ struct iser_conn { struct rdma_cm_id *cma_id; /* CMA ID */ struct ib_qp *qp; /* QP */ struct ib_fmr_pool *fmr_pool; /* pool of IB FMRs */ - int disc_evt_flag; /* disconn event delivered */ wait_queue_head_t wait; /* waitq for conn/disconn */ int post_recv_buf_count; /* posted rx count */ atomic_t post_send_buf_count; /* posted tx count */ @@ -321,7 +320,7 @@ void iser_conn_init(struct iser_conn *ib_conn); void iser_conn_get(struct iser_conn *ib_conn); -void iser_conn_put(struct iser_conn *ib_conn); +int iser_conn_put(struct iser_conn *ib_conn, int destroy_cma_id_allowed); void iser_conn_terminate(struct iser_conn *ib_conn); |