author: Doug Ledford <dledford@redhat.com> 2019-02-09 12:50:02 -0500
committer: Doug Ledford <dledford@redhat.com> 2019-02-09 12:50:02 -0500
commit: 416fbc1bbaa51742f7c49ff0578a594c2082c73b
tree: d74d2b91aa5cb1c8509c22bceb51badf5f17bcfe
parent: db421a54996c602503204345171c662e65f20527
parent: 885c5807fa0ccf206dc7a92fa83e4991120c1fd9
Merge branch 'hfi1-tid' into wip/dl-for-next

Omni-Path TID RDMA Feature

Intel Omni-Path (OPA) TID RDMA support is a feature that accelerates
data movement between two OPA nodes through the IB Verbs interface. It
improves RDMA READ/WRITE performance by delivering the data payload
directly to a user buffer without any software copying.

Architecture
=============
The TID RDMA protocol is implemented at the hfi1 driver level and is
therefore transparent to the ULPs. It is designed to facilitate the
data transactions for two specific RDMA requests:
  - RDMA READ;
  - RDMA WRITE.

Previously, when a verbs data packet was received at the destination
(requester side for RDMA READ and responder side for RDMA WRITE), the
data payload was copied to the user buffer by software, which
significantly reduced performance for large requests.

Internally, hfi1 converts qualified RDMA READ/WRITE requests into TID
RDMA READ/WRITE requests when the requests are posted to the hfi1
driver. Non-qualified RDMA requests are handled by the normal RDMA
protocol.

For TID RDMA requests, hardware resources (hardware flow and TID
entries) are allocated on the destination side (the requester side for
TID RDMA READ and the responder side for TID RDMA WRITE). The
information for these resources is conveyed to the data source side
(the responder side for TID RDMA READ and the requester side for TID
RDMA WRITE) and embedded in data packets. When data packets are
received by the destination, hardware delivers the data payload to the
destination buffer without involving software, which improves
performance.

Details
=======
RDMA READ/WRITE requests are qualified by the following:
  - Total data length >= 256K;
  - Total data length is a multiple of 4K pages.

Additional qualifications are enforced for the destination buffers:

For RDMA READ:
  - Each destination sge buffer is 4K aligned;
  - Each destination sge buffer is a multiple of 4K pages.

For RDMA WRITE:
  - The destination number is 4K aligned.
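
As an illustration only, the qualification rules listed above can be
sketched in C. This is not the hfi1 implementation: the struct
sge_desc type, the helper names, and the constants are hypothetical,
and the sketch assumes that "the destination number" for RDMA WRITE
refers to the destination (remote) address.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TID_RDMA_MIN_LEN (256u * 1024u) /* "Total data length >= 256K" */
#define TID_PAGE_SIZE    4096u          /* one 4K page */

/* Hypothetical stand-in for one destination scatter/gather element. */
struct sge_desc {
	uint64_t addr;   /* start address of the buffer */
	uint32_t length; /* buffer length in bytes */
};

/* True when v is a multiple of 4K (also used as the alignment test). */
static bool is_4k_multiple(uint64_t v)
{
	return (v & (TID_PAGE_SIZE - 1)) == 0;
}

/* Rules common to READ and WRITE: length >= 256K and a multiple of 4K. */
static bool len_qualifies(uint32_t total_len)
{
	return total_len >= TID_RDMA_MIN_LEN && is_4k_multiple(total_len);
}

/* RDMA READ: every destination sge is 4K aligned and a multiple of 4K. */
static bool read_qualifies(const struct sge_desc *sge, size_t n,
			   uint32_t total_len)
{
	if (!len_qualifies(total_len))
		return false;
	for (size_t i = 0; i < n; i++) {
		if (!is_4k_multiple(sge[i].addr) ||
		    !is_4k_multiple(sge[i].length))
			return false;
	}
	return true;
}

/* RDMA WRITE: destination address is 4K aligned (assumed meaning). */
static bool write_qualifies(uint64_t remote_addr, uint32_t total_len)
{
	return len_qualifies(total_len) && is_4k_multiple(remote_addr);
}
```

A request that fails any of these checks would, per the description
above, simply be handled by the normal RDMA protocol.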

In addition, in an OPA fabric, some nodes may support TID RDMA while
others may not. As such, it is important for the two nodes in a
transaction to exchange information about the features they support.
This discovery mechanism is called OPA Feature Negotiation (OPFN) and
is described in detail in the patch series. Through OPFN, two nodes
can find out whether they both support TID RDMA and subsequently
convert RDMA requests into TID RDMA requests.

* hfi1-tid: (46 commits)
  IB/hfi1: Prioritize the sending of ACK packets
  IB/hfi1: Add static trace for TID RDMA WRITE protocol
  IB/hfi1: Enable TID RDMA WRITE protocol
  IB/hfi1: Add interlock between TID RDMA WRITE and other requests
  IB/hfi1: Add TID RDMA WRITE functionality into RDMA verbs
  IB/hfi1: Add the dual leg code
  IB/hfi1: Add the TID second leg ACK packet builder
  IB/hfi1: Add the TID second leg send packet builder
  IB/hfi1: Resend the TID RDMA WRITE DATA packets
  IB/hfi1: Add a function to receive TID RDMA RESYNC packet
  IB/hfi1: Add a function to build TID RDMA RESYNC packet
  IB/hfi1: Add TID RDMA retry timer
  IB/hfi1: Add a function to receive TID RDMA ACK packet
  IB/hfi1: Add a function to build TID RDMA ACK packet
  IB/hfi1: Add a function to receive TID RDMA WRITE DATA packet
  IB/hfi1: Add a function to build TID RDMA WRITE DATA packet
  IB/hfi1: Add a function to receive TID RDMA WRITE response
  IB/hfi1: Add TID resource timer
  IB/hfi1: Add a function to build TID RDMA WRITE response
  IB/hfi1: Add functions to receive TID RDMA WRITE request
  ...

Signed-off-by: Doug Ledford <dledford@redhat.com>

Diffstat (limited to 'include/rdma/rdmavt_qp.h')
-rw-r--r-- include/rdma/rdmavt_qp.h | 20
1 files changed, 19 insertions, 1 deletions
diff --git a/include/rdma/rdmavt_qp.h b/include/rdma/rdmavt_qp.h
index cbafb1878669..f0fbd4063fef 100644
--- a/include/rdma/rdmavt_qp.h
+++ b/include/rdma/rdmavt_qp.h
@@ -174,6 +174,7 @@ struct rvt_swqe {
 	u32 lpsn; /* last packet sequence number */
 	u32 ssn; /* send sequence number */
 	u32 length; /* total length of data in sg_list */
+	void *priv; /* driver dependent field */
 	struct rvt_sge sg_list[0];
 };
@@ -235,6 +236,7 @@ struct rvt_ack_entry {
 	u32 lpsn;
 	u8 opcode;
 	u8 sent;
+	void *priv;
 };
 #define RC_QP_SCALING_INTERVAL 5
@@ -244,6 +246,7 @@ struct rvt_ack_entry {
 #define RVT_OPERATION_ATOMIC_SGE 0x00000004
 #define RVT_OPERATION_LOCAL 0x00000008
 #define RVT_OPERATION_USE_RESERVE 0x00000010
+#define RVT_OPERATION_IGN_RNR_CNT 0x00000020
 #define RVT_OPERATION_MAX (IB_WR_RESERVED10 + 1)
@@ -373,6 +376,7 @@ struct rvt_qp {
 	u8 s_rnr_retry; /* requester RNR retry counter */
 	u8 s_num_rd_atomic; /* number of RDMA read/atomic pending */
 	u8 s_tail_ack_queue; /* index into s_ack_queue[] */
+	u8 s_acked_ack_queue; /* index into s_ack_queue[] */
 	struct rvt_sge_state s_ack_rdma_sge;
 	struct timer_list s_timer;
@@ -629,6 +633,16 @@ __be32 rvt_compute_aeth(struct rvt_qp *qp);
 void rvt_get_credit(struct rvt_qp *qp, u32 aeth);
 /**
+ * rvt_restart_sge - rewind the sge state for a wqe
+ * @ss: the sge state pointer
+ * @wqe: the wqe to rewind
+ * @len: the data length from the start of the wqe in bytes
+ *
+ * Returns the remaining data length.
+ */
+u32 rvt_restart_sge(struct rvt_sge_state *ss, struct rvt_swqe *wqe, u32 len);
+
+/**
  * @qp - the qp pair
  * @len - the length
  *
@@ -676,7 +690,11 @@ enum hrtimer_restart rvt_rc_rnr_retry(struct hrtimer *t);
 void rvt_add_rnr_timer(struct rvt_qp *qp, u32 aeth);
 void rvt_del_timers_sync(struct rvt_qp *qp);
 void rvt_stop_rc_timers(struct rvt_qp *qp);
-void rvt_add_retry_timer(struct rvt_qp *qp);
+void rvt_add_retry_timer_ext(struct rvt_qp *qp, u8 shift);
+static inline void rvt_add_retry_timer(struct rvt_qp *qp)
+{
+	rvt_add_retry_timer_ext(qp, 0);
+}
 void rvt_copy_sge(struct rvt_qp *qp, struct rvt_sge_state *ss,
 		  void *data, u32 length,
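
The kernel-doc comment for rvt_restart_sge() in the diff above says the
function rewinds the sge state for a wqe to a byte offset and returns
the remaining data length. A minimal user-space sketch of that idea
follows. It is not the rdmavt implementation: the struct sge,
struct sge_state, and struct wqe types and the restart_sge() helper are
simplified, hypothetical stand-ins, and the sketch assumes the wqe has
at least one sge and clamps len to the wqe length.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified scatter/gather element. */
struct sge {
	uint8_t *vaddr;  /* current position in the buffer */
	uint32_t length; /* bytes left in this element */
};

/* Hypothetical cursor over a wqe's sg_list. */
struct sge_state {
	const struct sge *sg_list; /* next element to load */
	struct sge cur;            /* element being consumed */
	size_t num_sge;            /* elements left, including cur */
	uint32_t total_len;        /* total wqe length */
};

/* Hypothetical work request: total length plus its sg_list. */
struct wqe {
	uint32_t length;
	size_t num_sge;
	const struct sge *sg_list;
};

/* Reset ss to the start of wqe, skip len bytes, return what remains. */
static uint32_t restart_sge(struct sge_state *ss, const struct wqe *wqe,
			    uint32_t len)
{
	uint32_t skip;

	if (len > wqe->length) /* assumption: clamp out-of-range offsets */
		len = wqe->length;

	ss->cur = wqe->sg_list[0];
	ss->sg_list = wqe->sg_list + 1;
	ss->num_sge = wqe->num_sge;
	ss->total_len = wqe->length;

	skip = len;
	while (skip > 0 && ss->num_sge > 0) {
		uint32_t step = skip < ss->cur.length ? skip : ss->cur.length;

		ss->cur.vaddr += step;
		ss->cur.length -= step;
		skip -= step;
		/* Current element exhausted: move on to the next one. */
		if (ss->cur.length == 0 && --ss->num_sge > 0)
			ss->cur = *ss->sg_list++;
	}
	return wqe->length - len;
}
```

With a two-element list of 1000 and 2000 bytes, rewinding to offset
1500 leaves the cursor 500 bytes into the second element and returns
1500 bytes remaining.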