summaryrefslogtreecommitdiffstats
path: root/drivers/infiniband
Commit message (Collapse)AuthorAgeFilesLines
* RDAM/bnxt_re: Use tlv apis while processing the slow path commandsSelvin Xavier2023-04-041-11/+11
| | | | | | | | | | | Use the new TLV APIs for existing slow path commands. The TLV APIs will be used to populate extended headers for some of the Firmware commands, which will be introduced in the patches that follow. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1680169540-10029-7-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* RDMA/bnxt_re: RoCE slow path TLV supportSelvin Xavier2023-04-041-0/+162
| | | | | | | | | Header file to support TLV encapsulated commands. These functions will be used by the driver in the follow up patches. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1680169540-10029-6-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* RDMA/bnxt_re: Reduce number of argumets to control path command APIsSelvin Xavier2023-04-044-138/+199
| | | | | | | | | | | Reducing the number of arguments to bnxt_qplib_rcfw_send_message by enclosing all its arguments into a command message structure. Use the same struct while passing the command information to send_message. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1680169540-10029-5-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* RDMA/bnxt_re: Convert RCFW_CMD_PREP macro to static inline functionSelvin Xavier2023-04-044-69/+98
| | | | | | | | | | Convert RCFW_CMD_PREP macro to static inline function. Also, remove the cmd_flags passed as none of the functions are using it. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1680169540-10029-4-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* RDMA/bnxt_re: Remove HW queue mapping from RoCE DriverSelvin Xavier2023-04-043-93/+0
| | | | | | | | | bnxt_en driver does the queue mapping for RoCE traffic. Removing the queue mapping from RoCE driver. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1680169540-10029-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* RDMA/bnxt_re: Update HW interface headersSelvin Xavier2023-04-042-2957/+4228
| | | | | | | | | | | | | Updating the HW structures to the latest version. This is copied from the code maintained internally. No functionality changes in this patch. Code is re-organized to match the file maintained in the internal tree. Also, New HW interface structures are added, which will be used by the drivers in future. CC: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1680169540-10029-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* RDMA/siw: Remove namespace check from siw_netdev_event()Tetsuo Handa2023-04-031-3/+0
| | | | | | | | | | | | | | | | syzbot is reporting that siw_netdev_event(NETDEV_UNREGISTER) cannot destroy siw_device created after unshare(CLONE_NEWNET) due to net namespace check. It seems that this check was by error there and should be removed. Reported-by: syzbot <syzbot+5e70d01ee8985ae62a3b@syzkaller.appspotmail.com> Link: https://syzkaller.appspot.com/bug?extid=5e70d01ee8985ae62a3b Suggested-by: Jason Gunthorpe <jgg@ziepe.ca> Suggested-by: Leon Romanovsky <leon@kernel.org> Fixes: bdcf26bf9b3a ("rdma/siw: network and RDMA core interface") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Link: https://lore.kernel.org/r/a44e9ac5-44e2-d575-9e30-02483cc7ffd1@I-love.SAKURA.ne.jp Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
* RDMA/cma: Remove NULL check before dev_{put, hold}Yang Li2023-04-031-4/+2
| | | | | | | | | | | | | | | The call netdev_{put, hold} of dev_{put, hold} will check NULL, so there is no need to check before using dev_{put, hold}, remove it to silence the warnings: ./drivers/infiniband/core/cma.c:713:2-9: WARNING: NULL check before dev_{put, hold} functions is not needed. ./drivers/infiniband/core/cma.c:2433:2-9: WARNING: NULL check before dev_{put, hold} functions is not needed. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4668 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20230331010633.63261-1-yang.lee@linux.alibaba.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* IB/qib: Remove unused cnt variableTom Rix2023-04-031-5/+4
| | | | | | | | | | | | | | | | | clang with W=1 reports drivers/infiniband/hw/qib/qib_file_ops.c:487:20: error: variable 'cnt' set but not used [-Werror,-Wunused-but-set-variable] u32 tid, ctxttid, cnt, limit, tidcnt; ^ drivers/infiniband/hw/qib/qib_file_ops.c:1771:9: error: variable 'cnt' set but not used [-Werror,-Wunused-but-set-variable] int i, cnt = 0, maxtid = ctxt_tidbase + dd->rcvtidcnt; ^ This variable is not used so remove it. Signed-off-by: Tom Rix <trix@redhat.com> Link: https://lore.kernel.org/r/20230330235800.1845815-1-trix@redhat.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* RDMA/mlx5: Remove unused num_alloc_xa_entries variableTom Rix2023-04-031-2/+0
| | | | | | | | | | | | | clang with W=1 reports drivers/infiniband/hw/mlx5/devx.c:1996:6: error: variable 'num_alloc_xa_entries' set but not used [-Werror,-Wunused-but-set-variable] int num_alloc_xa_entries = 0; ^ This variable is not used so remove it. Signed-off-by: Tom Rix <trix@redhat.com> Link: https://lore.kernel.org/r/20230330153607.1838750-1-trix@redhat.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* IB/iser: remove redundant new lineMax Gurtovoy2023-04-031-1/+0
| | | | | | | | | | This commit doesn't change any logic. Reviewed-by: Sergey Gorenko <sergeygo@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Link: https://lore.kernel.org/r/20230330131333.37900-3-mgurtovoy@nvidia.com Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Leon Romanovsky <leon@kernel.org>
* IB/iser: centralize setting desc type and done callbackMax Gurtovoy2023-04-031-7/+9
| | | | | | | | | | | Move this common logic into iser_create_send_desc instead of duplicating the code. Reviewed-by: Sergey Gorenko <sergeygo@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Link: https://lore.kernel.org/r/20230330131333.37900-2-mgurtovoy@nvidia.com Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Leon Romanovsky <leon@kernel.org>
* IB/iser: remove unused macrosMax Gurtovoy2023-04-031-6/+0
| | | | | | | | | | The removed macros are old leftovers. Reviewed-by: Sergey Gorenko <sergeygo@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Link: https://lore.kernel.org/r/20230330131333.37900-1-mgurtovoy@nvidia.com Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Leon Romanovsky <leon@kernel.org>
* RDMA/rxe: Clean kzalloc failure pathsLeon Romanovsky2023-03-302-23/+9
| | | | | | | | | | | | | | There is no need to print any debug messages after failure to allocate memory, because kernel will print OOM dumps anyway. Together with removal of these messages, remove useless goto jumps. Fixes: 5bf944f24129 ("RDMA/rxe: Add error messages") Reported-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/all/ea43486f-43dd-4054-b1d5-3a0d202be621@kili.mountain Link: https://lore.kernel.org/r/d3cedf723b84e73e8062a67b7489d33802bafba2.1680113597.git.leon@kernel.org Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
* RDMA/rxe: Remove tasklet call from rxe_cq.cBob Pearson2023-03-293-33/+3
| | | | | | | | | | | | | | | | | | Remove the tasklet call in rxe_cq.c and also the is_dying in the cq struct. There is no reason for the rxe driver to defer the call to the cq completion handler by scheduling a tasklet. rxe_cq_post() is not called in a hard irq context. The rxe driver currently is incorrect because the tasklet call is made without protecting the cq pointer with a reference from having the underlying memory freed before the deferred routine is called. Executing the comp_handler inline fixes this problem. Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Link: https://lore.kernel.org/r/20230327215643.10410-1-rpearsonhpe@gmail.com Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
* RDMA/ocrdma: remove unused discard_cnt variableTom Rix2023-03-291-2/+0
| | | | | | | | | | | | | clang with W=1 reports drivers/infiniband/hw/ocrdma/ocrdma_verbs.c:1592:6: error: variable 'discard_cnt' set but not used [-Werror,-Wunused-but-set-variable] int discard_cnt = 0; ^ This variable is not used so remove it. Signed-off-by: Tom Rix <trix@redhat.com> Link: https://lore.kernel.org/r/20230326120959.1351948-1-trix@redhat.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* RDMA/bnxt_re: remove unused num_srqne_processed and num_cqne_processed variablesTom Rix2023-03-291-10/+4
| | | | | | | | | | | | | | | | | | clang with W=1 reports drivers/infiniband/hw/bnxt_re/qplib_fp.c:303:6: error: variable 'num_srqne_processed' set but not used [-Werror,-Wunused-but-set-variable] int num_srqne_processed = 0; ^ drivers/infiniband/hw/bnxt_re/qplib_fp.c:304:6: error: variable 'num_cqne_processed' set but not used [-Werror,-Wunused-but-set-variable] int num_cqne_processed = 0; ^ These variables are not used so remove them. Signed-off-by: Tom Rix <trix@redhat.com> Link: https://lore.kernel.org/r/20230325140559.1336056-1-trix@redhat.com Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
* RDMA/usnic: Remove redundant pci_clear_masterCai Huoqing2023-03-291-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | Remove pci_clear_master to simplify the code, the bus-mastering is also cleared in do_pci_disable_device, like this: ./drivers/pci/pci.c:2197 static void do_pci_disable_device(struct pci_dev *dev) { u16 pci_command; pci_read_config_word(dev, PCI_COMMAND, &pci_command); if (pci_command & PCI_COMMAND_MASTER) { pci_command &= ~PCI_COMMAND_MASTER; pci_write_config_word(dev, PCI_COMMAND, pci_command); } pcibios_disable_device(dev); }. And dev->is_busmaster is set to 0 in pci_disable_device. Signed-off-by: Cai Huoqing <cai.huoqing@linux.dev> Link: https://lore.kernel.org/r/20230323115742.13836-1-cai.huoqing@linux.dev Signed-off-by: Leon Romanovsky <leon@kernel.org>
* RDMA/mlx5: Expand switchdev Q-counters to expose representor statisticsPatrisious Haddad2023-03-291-29/+142
| | | | | | | | | | | | | Previously for switchdev only per device counters were supported. Currently we allocate counters for switchdev per port, which also includes the ports that belong to VF representors in order to expose them to users through the rdma tool, allowing the host to track the VFs statistics through their representors counters. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Link: https://lore.kernel.org/r/ea31e1103c125cd27931ba213f307cde30d2eaed.1679566038.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
* RDMA/bnxt_re: Add resize_cq supportSelvin Xavier2023-03-295-0/+163
| | | | | | | | | | | | | | | | | | Add resize_cq verb support for user space CQs. Resize operation for kernel CQs are not supported now. Driver should free the current CQ only after user library polls for all the completions and switch to new CQ. So after the resize_cq is returned from the driver, user library polls for existing completions and store it as temporary data. Once library reaps all completions in the current CQ, it invokes the ibv_cmd_poll_cq to inform the driver about the resize_cq completion. Adding a check for user CQs in driver's poll_cq and complete the resize operation for user CQs. Updating uverbs_cmd_mask with poll_cq to support this. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1678868215-23626-1-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* RDMA/erdma: Use fixed hardware page sizeCheng Xu2023-03-242-8/+13
| | | | | | | | | | Hardware's page size is 4096, but the kernel's page size may vary. Driver should use hardware's page size when communicating with hardware. Fixes: 155055771704 ("RDMA/erdma: Add verbs implementation") Link: https://lore.kernel.org/r/20230307102924.70577-2-chengyou@linux.alibaba.com Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/rxe: Rewrite rxe_task.cBob Pearson2023-03-242-56/+218
| | | | | | | | | | | | | | | | | | | | | | | | | This patch is a major rewrite of the tasklet routines in rxe_task.c. The main motivation for this is the realization that the code violates the safety of the qp pointer by correct reference counting. When a tasklet is scheduled from a verbs API the calling thread has a valid reference to the qp and schedules the tasklet to run at a later time carrying a pointer to the qp. Once the calling code returns however the qp can be destroyed at any time. In order to correct this a reference to the qp must be taken when the task is scheduled and held until it finishes running. This is complicated by the tasklet library not alwys running a task that is scheduled depending on whether someone else has scheduled it. This patch moves the logic for deciding whether to run or schedule a task outside of do_task() and guarantees that there is only one copy of the task scheduled or running at a time. Secondly the separate flags controlling teardown and draining of the task are included in the task state machine and all references to the state are protected by spinlocks to avoid consistency and memory barrier issues. Link: https://lore.kernel.org/r/20230304174533.11296-9-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/rxe: Make tasks schedule each otherBob Pearson2023-03-242-6/+6
| | | | | | | | | | | Replace rxe_run_task() by rxe_sched_task() when tasks call each other. These are not performance critical and mainly involve error paths but they run the risk of causing deadlocks. Link: https://lore.kernel.org/r/20230304174533.11296-8-rpearsonhpe@gmail.com Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/rxe: Remove __rxe_do_task()Bob Pearson2023-03-243-58/+17
| | | | | | | | | | | | | | | | | | | | | The subroutine __rxe_do_task is not thread safe and it has no way to guarantee that the tasks, which are designed with the assumption that they are non-reentrant, are not reentered. All of its uses are non-performance critical. This patch replaces calls to __rxe_do_task with calls to rxe_sched_task. It also removes irrelevant or unneeded if tests. Instead of calling the task machinery a single call to the tasklet function (rxe_requester, etc.) is sufficient to draing the queues if task execution has been disabled or stopped. Together these changes allow the removal of __rxe_do_task. Link: https://lore.kernel.org/r/20230304174533.11296-7-rpearsonhpe@gmail.com Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/rxe: Remove qp reference counting in tasksBob Pearson2023-03-243-14/+0
| | | | | | | | | | | | | | | | | | | Currently each of the three tasklets requester, completer and responder in the rxe driver take and release a reference to the qp argument at the beginning and end of the subroutines. The caller passing in the qp argument should be responsible for holding a reference to qp so these are not required. Further doing so breaks the qp cleanup code in rxe_qp_do_cleanup which calls these routines after all the references have been dropped so they cannot drain the packet and work request queues as intended. In fact if these routines are deferred by calling tasklet_schedule there is no guarantee that the calling code does have a qp reference. That is a bug in rxe_task.c which will be fixed later in this series. Link: https://lore.kernel.org/r/20230304174533.11296-6-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/rxe: Cleanup error state handling in rxe_comp.cBob Pearson2023-03-243-23/+61
| | | | | | | | | | Cleanup the handling of qp in the error state, reset state and during rxe_qp_do_cleanup. Make the same as rxe_resp.c Link: https://lore.kernel.org/r/20230304174533.11296-5-rpearsonhpe@gmail.com Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/rxe: Cleanup reset state handling in rxe_resp.cBob Pearson2023-03-242-51/+57
| | | | | | | | | | | | | Cleanup the handling of qp in the error state, reset state and during rxe_qp_do_cleanup. The error state does about the same thing as the others but has code spread all over. This patch combines them in a cleaner way. Link: https://lore.kernel.org/r/20230304174533.11296-4-rpearsonhpe@gmail.com Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/rxe: Convert tasklet args to queue pairsBob Pearson2023-03-246-18/+17
| | | | | | | | | | | | Originally is was thought that the tasklet machinery in rxe_task.c would be used in other applications but that has not happened for years. This patch replaces the 'void *arg' by struct 'rxe_qp *qp' in the parameters to the tasklet calls. This change will have no affect on performance but may make the code a little clearer. Link: https://lore.kernel.org/r/20230304174533.11296-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/rxe: Add error messagesBob Pearson2023-03-245-241/+609
| | | | | | | | | | | | | | This patch adds error and debug messages so that every interaction with rdma-core through a verbs API call or a completion error return will generate at least one error message backed up by debug messages with more detail. With dynamic debugging one can follow up after seeing an error message by turning on the appropriate debug messages. Link: https://lore.kernel.org/r/20230303221623.8053-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/rxe: Extend dbg log messages to err and infoBob Pearson2023-03-242-3/+47
| | | | | | | | | | Extend the dbg log messages (e.g. rxe_dbg_xxx) to include err and info types. rxe.c is modified to use these new log messages as examples. Link: https://lore.kernel.org/r/20230303221623.8053-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/rxe: Change rxe_dbg to rxe_dbg_devBob Pearson2023-03-249-24/+25
| | | | | | | | | | | Replace the name rxe_dbg with rxe_dbg_dev which better matches the remaining rxe_dbg_xxx macros for debug messages with a rxe device parameter. Reuse the name rxe_dbg for debug messages which do not have a rxe device parameter. Link: https://lore.kernel.org/r/20230303221623.8053-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/rxe: Replace exists by rxe in rxe.cBob Pearson2023-03-241-6/+6
| | | | | | | | | | | | | | 'exists' looks like a boolean. This patch replaces it by the normal name used for the rxe device, 'rxe', which should be a little less confusing. The second rxe_dbg() message is incorrect since rxe is known to be NULL and this will cause a seg fault if this message were ever sent. Replace it by pr_debug for the moment. Fixes: c6aba5ea0055 ("RDMA/rxe: Replace pr_xxx by rxe_dbg_xxx in rxe.c") Link: https://lore.kernel.org/r/20230303221623.8053-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/core: Fix multiple -Warray-bounds warningsGustavo A. R. Silva2023-03-231-9/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GCC-13 (and Clang)[1] does not like to access a partially allocated object, since it cannot reason about it for bounds checking. In this case 140 bytes are allocated for an object of type struct ib_umad_packet: packet = kzalloc(sizeof(*packet) + IB_MGMT_RMPP_HDR, GFP_KERNEL); However, notice that sizeof(*packet) is only 104 bytes: struct ib_umad_packet { struct ib_mad_send_buf * msg; /* 0 8 */ struct ib_mad_recv_wc * recv_wc; /* 8 8 */ struct list_head list; /* 16 16 */ int length; /* 32 4 */ /* XXX 4 bytes hole, try to pack */ struct ib_user_mad mad __attribute__((__aligned__(8))); /* 40 64 */ /* size: 104, cachelines: 2, members: 5 */ /* sum members: 100, holes: 1, sum holes: 4 */ /* forced alignments: 1, forced holes: 1, sum forced holes: 4 */ /* last cacheline: 40 bytes */ } __attribute__((__aligned__(8))); and 36 bytes extra bytes are allocated for a flexible-array member in struct ib_user_mad: include/rdma/ib_mad.h: 120 enum { ... 123 IB_MGMT_RMPP_HDR = 36, ... } struct ib_user_mad { struct ib_user_mad_hdr hdr; /* 0 64 */ /* --- cacheline 1 boundary (64 bytes) --- */ __u64 data[] __attribute__((__aligned__(8))); /* 64 0 */ /* size: 64, cachelines: 1, members: 2 */ /* forced alignments: 1 */ } __attribute__((__aligned__(8))); So we have sizeof(*packet) + IB_MGMT_RMPP_HDR == 140 bytes Then the address of the flex-array member (for which only 36 bytes were allocated) is casted and copied into a pointer to struct ib_rmpp_mad, which, in turn, is of size 256 bytes: rmpp_mad = (struct ib_rmpp_mad *) packet->mad.data; struct ib_rmpp_mad { struct ib_mad_hdr mad_hdr; /* 0 24 */ struct ib_rmpp_hdr rmpp_hdr; /* 24 12 */ u8 data[220]; /* 36 220 */ /* size: 256, cachelines: 4, members: 3 */ }; The thing is that those 36 bytes allocated for flex-array member data in struct ib_user_mad onlly account for the size of both struct ib_mad_hdr and struct ib_rmpp_hdr, but nothing is left for array u8 data[220]. So, the compiler is legitimately complaining about accessing an object for which not enough memory was allocated. Apparently, the only members of struct ib_rmpp_mad that are relevant (that are actually being used) in function ib_umad_write() are mad_hdr and rmpp_hdr. So, instead of casting packet->mad.data to (struct ib_rmpp_mad *) create a new structure struct ib_rmpp_mad_hdr { struct ib_mad_hdr mad_hdr; struct ib_rmpp_hdr rmpp_hdr; } __packed; and cast packet->mad.data to (struct ib_rmpp_mad_hdr *). Notice that IB_MGMT_RMPP_HDR == sizeof(struct ib_rmpp_mad_hdr) == 36 bytes Refactor the rest of the code, accordingly. Fix the following warnings seen under GCC-13 and -Warray-bounds: drivers/infiniband/core/user_mad.c:564:50: warning: array subscript ‘struct ib_rmpp_mad[0]’ is partly outside array bounds of ‘unsigned char[140]’ [-Warray-bounds=] drivers/infiniband/core/user_mad.c:566:42: warning: array subscript ‘struct ib_rmpp_mad[0]’ is partly outside array bounds of ‘unsigned char[140]’ [-Warray-bounds=] drivers/infiniband/core/user_mad.c:618:25: warning: array subscript ‘struct ib_rmpp_mad[0]’ is partly outside array bounds of ‘unsigned char[140]’ [-Warray-bounds=] drivers/infiniband/core/user_mad.c:622:44: warning: array subscript ‘struct ib_rmpp_mad[0]’ is partly outside array bounds of ‘unsigned char[140]’ [-Warray-bounds=] Link: https://github.com/KSPP/linux/issues/273 Link: https://godbolt.org/z/oYWaGM4Yb [1] Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/ZBpB91qQcB10m3Fw@work Signed-off-by: Leon Romanovsky <leon@kernel.org>
* Enable IB out-of-order by default in mlx5Leon Romanovsky2023-03-231-0/+8
|\ | | | | | | | | | | | | | | | | | | This series from Or changes default of IB out-of-order feature and allows to the RDMA users to decide if they need to wait for completion for all segments or it is enough to wait for last segment completion only. Thanks Signed-off-by: Leon Romanovsky <leon@kernel.org>
| * RDMA/mlx5: Disable out-of-order in integrity enabled QPsOr Har-Toov2023-03-231-0/+8
| | | | | | | | | | | | | | | | | | | | | | Set retry_mode to GO_BACK_N when qp is created with INTEGRITY_EN flag because out-of-order is not supported when doing HW offload of signature operations. Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/362de42cdc7a541afa5b1fd0ec6ae706061764a2.1679230449.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
* | RDMA/efa: Add data polling capability feature bitYonatan Nachum2023-03-222-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add feature bit to existing device caps field. EFA supports data polling of 128 bytes blocks. The flag indicates that the NIC guarentees that a 128 byte aligned block is written in order, ie that observing the last 8 bits of the block mean the prior 127 bytes are also written. It is useful for "last data polling" acceleration techniques. Link: https://lore.kernel.org/r/20230219081328.10419-1-mrgolin@amazon.com Reviewed-by: Yehuda Yitschak <yehuday@amazon.com> Reviewed-by: Yossi Leybovich <sleybo@amazon.com> Signed-off-by: Yonatan Nachum <ynachum@amazon.com> Signed-off-by: Michael Margolin <mrgolin@amazon.com> Acked-by: Gal Pressman <gal.pressman@linux.dev> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/erdma: Minor refactor of device init flowCheng Xu2023-03-222-32/+34
| | | | | | | | | | | | | | | | | | | | After necessary configuration, driver should wait hardware finishing initialization. The wait sets at CMDQ related function though it has nothing to do with CMDQ. Refactor this part to make code cleaner. Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Link: https://lore.kernel.org/r/20230322093319.84045-4-chengyou@linux.alibaba.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* | RDMA/erdma: Eliminate unnecessary casting of EQ doorbellsCheng Xu2023-03-223-8/+6
| | | | | | | | | | | | | | | | | | | | Using void * to define EQ doorbell pointer can eliminate unnecessary casting when performing assignment. Also rename *db_addr* to *db* for a shorter name. Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Link: https://lore.kernel.org/r/20230322093319.84045-3-chengyou@linux.alibaba.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* | RDMA/erdma: Unify byte ordering APIs usageCheng Xu2023-03-223-13/+11
| | | | | | | | | | | | | | | | | | | | Replace __be32_to_cpu/__cpu_to_be16 with be32_to_cpu/cpu_to_be16. And use be32_to_cpu_array to copy and swap byte order to hide the loop. Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Link: https://lore.kernel.org/r/20230322093319.84045-2-chengyou@linux.alibaba.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* | IB/qib: Drop redundant pci_enable_pcie_error_reporting()Bjorn Helgaas2023-03-191-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pci_enable_pcie_error_reporting() enables the device to send ERR_* Messages. Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core does this for all devices during enumeration, so the driver doesn't need to do it itself. Remove the redundant pci_enable_pcie_error_reporting() call from the driver. Note that this only controls ERR_* Messages from the device. An ERR_* Message may cause the Root Port to generate an interrupt, depending on the AER Root Error Command register managed by the AER service driver. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
* | IB/hfi1: Drop redundant pci_enable_pcie_error_reporting()Bjorn Helgaas2023-03-191-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pci_enable_pcie_error_reporting() enables the device to send ERR_* Messages. Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"), the PCI core does this for all devices during enumeration, so the driver doesn't need to do it itself. Remove the redundant pci_enable_pcie_error_reporting() call from the driver. Note that this only controls ERR_* Messages from the device. An ERR_* Message may cause the Root Port to generate an interrupt, depending on the AER Root Error Command register managed by the AER service driver. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
* | RDMA/mlx5: Coding style fix reported by checkpatchRohit Chavan2023-03-191-5/+4
| | | | | | | | | | | | | | | | | | Block comments should align the * on each line on line 2849 Avoid line continuations in quoted strings on line 3848 Signed-off-by: Rohit Chavan <roheetchavan@gmail.com> Link: https://lore.kernel.org/r/20230319100847.5566-1-roheetchavan@gmail.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* | RDMA/irdma: Refactor PBLE functionsSindhu Devale2023-03-193-32/+31
| | | | | | | | | | | | | | | | | | | | | | Refactor PBLE functions using a bit mask to represent the PBLE level desired versus 2 parameters use_pble and lvl_one_only which makes the code confusing. Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Link: https://lore.kernel.org/r/20230315145305.955-5-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* | RDMA/irdma: Change name of interruptsMichal Swiatkowski2023-03-192-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add more information in interrupt names. Before this patch it was: irdma CEQ CEQ ... Now: irdma-0000:18:00.0-AEQ irdma-0000:18:00.0-CEQ-0 irdma-0000:18:00.0-CEQ-1 ... Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Suggested-by: Piotr Raczynski <piotr.raczynski@intel.com> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145305.955-4-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* | RDMA/irdma: Remove a redundant irdma_arp_table() callTatyana Nikolova2023-03-191-4/+0
| | | | | | | | | | | | | | | | | | | | | | Remove a redundant function call in irdma_modify_qp_roce, since irdma_arp_table() with IRDMA_ARP_RESOLVE action is called after the if/else ipv check as part of irdma_add_arp(). Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145305.955-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* | RDMA/irdma: Refactor HW statisticsKrzysztof Czurylo2023-03-199-580/+362
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor HW statistics which, - Unifies HW statistics support for all HW generations. - Unifies support of 32- and 64-bit counters. - Removes duplicated code and simplifies implementation. - Fixes roll-over handling. - Removes unneeded last_hw_stats. With new implementation, there is no separate handling and no separate arrays for 32- and 64-bit counters (offsets, regs, values). Instead, there is a HW stats map array for each HW revision, which defines HW-specific width and location of each counter in the statistics buffer. Once the statistics are gathered (either via CQP op, or by reading HW registers), counter values are extracted from the statistics buffer using the stats map and the delta between the last and new values is computed. Finally, the counter values in rdma_hw_stats are incremented by those deltas. From the OS perspective, all the counters are 64-bit and their order in rdma_hw_stats->value[] array, as well as in irdma_hw_stat_names[], is the same for all HW gens. New statistics should always be added at the end. Signed-off-by: Krzysztof Czurylo <krzysztof.czurylo@intel.com> Signed-off-by: Youvaraj Sagar <youvaraj.sagar@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145305.955-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* | RDMA/qib: Remove deprecated kmap() callIra Weiny2023-03-191-8/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kmap() has been deprecated in favor of the kmap_local_page() call. kmap_local_page() is thread local. In the sdma coalesce case the page allocated is potentially free'ed in a different context through qib_sdma_get_complete() -> qib_user_sdma_make_progress(). The use of kmap_local_page() is inappropriate in this call path. However, the page is allocated using GFP_KERNEL and will never be from highmem. Remove the use of kmap calls and use page_address() in this case. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20230217-kmap-qib-v1-1-e5a6fde167e0@intel.com Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
* | RDMA/mlx4: Prevent shift wrapping in set_user_sq_size()Dan Carpenter2023-03-191-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | The ucmd->log_sq_bb_count variable is controlled by the user so this shift can wrap. Fix it by using check_shl_overflow() in the same way that it was done in commit 515f60004ed9 ("RDMA/hns: Prevent undefined behavior in hns_roce_set_user_sq_size()"). Fixes: 839041329fd3 ("IB/mlx4: Sanity check userspace send queue sizes") Signed-off-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/r/a8dfbd1d-c019-4556-930b-bab1ded73b10@kili.mountain Signed-off-by: Leon Romanovsky <leon@kernel.org>
* | RDMA/hns: Add new command to support query vf capsYixing Liu2023-03-142-164/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | The current resource query for vf caps is driven by the driver, which is unreasonable. This patch adds a new command HNS_ROCE_OPC_QUERY_VF_CAPS_NUM to support obtaining vf caps information from firmware. Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com> Link: https://lore.kernel.org/r/20230304091555.2241298-2-xuhaoyue1@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
* | RDMA/rdmavt: Delete unnecessary NULL checkNatalia Petrova2023-03-141-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no need to check 'rdi->qp_dev' for NULL. The field 'qp_dev' is created in rvt_register_device() which will fail if the 'qp_dev' allocation fails in rvt_driver_qp_init(). Overwise this pointer doesn't changed and passed to rvt_qp_exit() by the next step. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 0acb0cc7ecc1 ("IB/rdmavt: Initialize and teardown of qpn table") Signed-off-by: Natalia Petrova <n.petrova@fintech.ru> Link: https://lore.kernel.org/r/20230303124408.16685-1-n.petrova@fintech.ru Signed-off-by: Leon Romanovsky <leon@kernel.org>