summaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/core/mad.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds2020-08-061-15/+15
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull rdma updates from Jason Gunthorpe: "A quiet cycle after the larger 5.8 effort. Substantially cleanup and driver work with a few smaller features this time. - Driver updates for hfi1, rxe, mlx5, hns, qedr, usnic, bnxt_re - Removal of dead or redundant code across the drivers - RAW resource tracker dumps to include a device specific data blob for device objects to aide device debugging - Further advance the IOCTL interface, remove the ability to turn it off. Add QUERY_CONTEXT, QUERY_MR, and QUERY_PD commands - Remove stubs related to devices with no pkey table - A shared CQ scheme to allow multiple ULPs to share the CQ rings of a device to give higher performance - Several more static checker, syzkaller and rare crashers fixed" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (121 commits) RDMA/mlx5: Fix flow destination setting for RDMA TX flow table RDMA/rxe: Remove pkey table RDMA/umem: Add a schedule point in ib_umem_get() RDMA/hns: Fix the unneeded process when getting a general type of CQE error RDMA/hns: Fix error during modify qp RTS2RTS RDMA/hns: Delete unnecessary memset when allocating VF resource RDMA/hns: Remove redundant parameters in set_rc_wqe() RDMA/hns: Remove support for HIP08_A RDMA/hns: Refactor hns_roce_v2_set_hem() RDMA/hns: Remove redundant hardware opcode definitions RDMA/netlink: Remove CAP_NET_RAW check when dump a raw QP RDMA/include: Replace license text with SPDX tags RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wq RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting RDMA/cma: Execute rdma_cm destruction from a handler properly RDMA/cma: Remove unneeded locking for req paths RDMA/cma: Using the standard locking pattern when delivering the removal event RDMA/cma: Simplify DEVICE_REMOVAL for internal_id RDMA/efa: Add EFA 0xefa1 PCI ID RDMA/efa: User/kernel compatibility handshake mechanism ...
| * IB/mad: Change atomics to refcount APIShay Drory2020-06-241-8/+8
| | | | | | | | | | | | | | | | | | | | | | The refcount API provides better safety than atomics API. Therefore, change atomic functions to refcount functions. Link: https://lore.kernel.org/r/20200621104738.54850-4-leon@kernel.org Signed-off-by: Shay Drory <shayd@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * IB/mad: Issue complete whenever decrements agent refcountShay Drory2020-06-241-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | Replace calls of atomic_dec() to mad_agent_priv->refcount with calls to deref_mad_agent() in order to issue complete. Most likely the refcount is > 1 at these points, but it is difficult to prove. Performance is not important on these paths, so be obviously correct. Link: https://lore.kernel.org/r/20200621104738.54850-3-leon@kernel.org Signed-off-by: Shay Drory <shayd@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | IB/mad: Fix use after free when destroying MAD agentShay Drory2020-06-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, when RMPP MADs are processed while the MAD agent is destroyed, it could result in use after free of rmpp_recv, as decribed below: cpu-0 cpu-1 ----- ----- ib_mad_recv_done() ib_mad_complete_recv() ib_process_rmpp_recv_wc() unregister_mad_agent() ib_cancel_rmpp_recvs() cancel_delayed_work() process_rmpp_data() start_rmpp() queue_delayed_work(rmpp_recv->cleanup_work) destroy_rmpp_recv() free_rmpp_recv() cleanup_work()[1] spin_lock_irqsave(&rmpp_recv->agent->lock) <-- use after free [1] cleanup_work() == recv_cleanup_handler Fix it by waiting for the MAD agent reference count becoming zero before calling to ib_cancel_rmpp_recvs(). Fixes: 9a41e38a467c ("IB/mad: Use IDR for agent IDs") Link: https://lore.kernel.org/r/20200621104738.54850-2-leon@kernel.org Signed-off-by: Shay Drory <shayd@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* | RDMA/mad: Fix possible memory leak in ib_mad_post_receive_mads()Fan Guo2020-06-191-0/+1
|/ | | | | | | | | | | | If ib_dma_mapping_error() returns non-zero value, ib_mad_post_receive_mads() will jump out of loops and return -ENOMEM without freeing mad_priv. Fix this memory-leak problem by freeing mad_priv in this case. Fixes: 2c34e68f4261 ("IB/mad: Check and handle potential DMA mapping errors") Link: https://lore.kernel.org/r/20200612063824.180611-1-guofan5@huawei.com Signed-off-by: Fan Guo <guofan5@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA: Allow ib_client's to fail when add() is calledJason Gunthorpe2020-05-061-4/+13
| | | | | | | | | | | | | | | | | | | | | | | | | When a client is added it isn't allowed to fail, but all the client's have various failure paths within their add routines. This creates the very fringe condition where the client was added, failed during add and didn't set the client_data. The core code will then still call other client_data centric ops like remove(), rename(), get_nl_info(), and get_net_dev_by_params() with NULL client_data - which is confusing and unexpected. If the add() callback fails, then do not call any more client ops for the device, even remove. Remove all the now redundant checks for NULL client_data in ops callbacks. Update all the add() callbacks to return error codes appropriately. EOPNOTSUPP is used for cases where the ULP does not support the ib_device - eg because it only works with IB. Link: https://lore.kernel.org/r/20200421172440.387069-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/mad: Remove snoop interfaceMaor Gottlieb2020-05-061-233/+5
| | | | | | | | | Snoop interface is not used. Remove it. Link: https://lore.kernel.org/r/20200413132408.931084-1-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA: Change MAD processing function to remove extra casting and parameterLeon Romanovsky2019-11-121-6/+6
| | | | | | | | | | | | All users of process_mad() converts input pointers from ib_mad_hdr to be ib_mad, update the function declaration to use ib_mad directly. Also remove not used input MAD size parameter. Link: https://lore.kernel.org/r/20191029062745.7932-17-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Tested-By: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/mad: Delete never implemented functionsLeon Romanovsky2019-11-061-19/+0
| | | | | | | | Delete never implemented and used MAD functions. Link: https://lore.kernel.org/r/20191029062745.7932-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/mad: Fix use-after-free in ib mad completion handlingJack Morgenstein2019-08-011-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We encountered a use-after-free bug when unloading the driver: [ 3562.116059] BUG: KASAN: use-after-free in ib_mad_post_receive_mads+0xddc/0xed0 [ib_core] [ 3562.117233] Read of size 4 at addr ffff8882ca5aa868 by task kworker/u13:2/23862 [ 3562.118385] [ 3562.119519] CPU: 2 PID: 23862 Comm: kworker/u13:2 Tainted: G OE 5.1.0-for-upstream-dbg-2019-05-19_16-44-30-13 #1 [ 3562.121806] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014 [ 3562.123075] Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core] [ 3562.124383] Call Trace: [ 3562.125640] dump_stack+0x9a/0xeb [ 3562.126911] print_address_description+0xe3/0x2e0 [ 3562.128223] ? ib_mad_post_receive_mads+0xddc/0xed0 [ib_core] [ 3562.129545] __kasan_report+0x15c/0x1df [ 3562.130866] ? ib_mad_post_receive_mads+0xddc/0xed0 [ib_core] [ 3562.132174] kasan_report+0xe/0x20 [ 3562.133514] ib_mad_post_receive_mads+0xddc/0xed0 [ib_core] [ 3562.134835] ? find_mad_agent+0xa00/0xa00 [ib_core] [ 3562.136158] ? qlist_free_all+0x51/0xb0 [ 3562.137498] ? mlx4_ib_sqp_comp_worker+0x1970/0x1970 [mlx4_ib] [ 3562.138833] ? quarantine_reduce+0x1fa/0x270 [ 3562.140171] ? kasan_unpoison_shadow+0x30/0x40 [ 3562.141522] ib_mad_recv_done+0xdf6/0x3000 [ib_core] [ 3562.142880] ? _raw_spin_unlock_irqrestore+0x46/0x70 [ 3562.144277] ? ib_mad_send_done+0x1810/0x1810 [ib_core] [ 3562.145649] ? mlx4_ib_destroy_cq+0x2a0/0x2a0 [mlx4_ib] [ 3562.147008] ? _raw_spin_unlock_irqrestore+0x46/0x70 [ 3562.148380] ? debug_object_deactivate+0x2b9/0x4a0 [ 3562.149814] __ib_process_cq+0xe2/0x1d0 [ib_core] [ 3562.151195] ib_cq_poll_work+0x45/0xf0 [ib_core] [ 3562.152577] process_one_work+0x90c/0x1860 [ 3562.153959] ? pwq_dec_nr_in_flight+0x320/0x320 [ 3562.155320] worker_thread+0x87/0xbb0 [ 3562.156687] ? __kthread_parkme+0xb6/0x180 [ 3562.158058] ? process_one_work+0x1860/0x1860 [ 3562.159429] kthread+0x320/0x3e0 [ 3562.161391] ? kthread_park+0x120/0x120 [ 3562.162744] ret_from_fork+0x24/0x30 ... [ 3562.187615] Freed by task 31682: [ 3562.188602] save_stack+0x19/0x80 [ 3562.189586] __kasan_slab_free+0x11d/0x160 [ 3562.190571] kfree+0xf5/0x2f0 [ 3562.191552] ib_mad_port_close+0x200/0x380 [ib_core] [ 3562.192538] ib_mad_remove_device+0xf0/0x230 [ib_core] [ 3562.193538] remove_client_context+0xa6/0xe0 [ib_core] [ 3562.194514] disable_device+0x14e/0x260 [ib_core] [ 3562.195488] __ib_unregister_device+0x79/0x150 [ib_core] [ 3562.196462] ib_unregister_device+0x21/0x30 [ib_core] [ 3562.197439] mlx4_ib_remove+0x162/0x690 [mlx4_ib] [ 3562.198408] mlx4_remove_device+0x204/0x2c0 [mlx4_core] [ 3562.199381] mlx4_unregister_interface+0x49/0x1d0 [mlx4_core] [ 3562.200356] mlx4_ib_cleanup+0xc/0x1d [mlx4_ib] [ 3562.201329] __x64_sys_delete_module+0x2d2/0x400 [ 3562.202288] do_syscall_64+0x95/0x470 [ 3562.203277] entry_SYSCALL_64_after_hwframe+0x49/0xbe The problem was that the MAD PD was deallocated before the MAD CQ. There was completion work pending for the CQ when the PD got deallocated. When the mad completion handling reached procedure ib_mad_post_receive_mads(), we got a use-after-free bug in the following line of code in that procedure: sg_list.lkey = qp_info->port_priv->pd->local_dma_lkey; (the pd pointer in the above line is no longer valid, because the pd has been deallocated). We fix this by allocating the PD before the CQ in procedure ib_mad_port_open(), and deallocating the PD after freeing the CQ in procedure ib_mad_port_close(). Since the CQ completion work queue is flushed during ib_free_cq(), no completions will be pending for that CQ when the PD is later deallocated. Note that freeing the CQ before deallocating the PD is the practice in the ULPs. Fixes: 4be90bc60df4 ("IB/mad: Remove ib_get_dma_mr calls") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190801121449.24973-1-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/MAD: Add SMP details to MAD tracingIra Weiny2019-03-271-0/+8
| | | | | | | | Decode more information from the packet and include it in the trace. Reviewed-by: "Ruhl, Michael J" <michael.j.ruhl@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/MAD: Add agent trace pointsIra Weiny2019-03-271-0/+4
| | | | | | | | | | Trace agent details when agents are [un]registered. In addition, report agent details on send/recv. Reviewed-by: "Ruhl, Michael J" <michael.j.ruhl@intel.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/MAD: Add recv path trace pointIra Weiny2019-03-271-0/+3
| | | | | | | | | Trace received MAD details. Reviewed-by: "Ruhl, Michael J" <michael.j.ruhl@intel.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/MAD: Add send path trace pointsIra Weiny2019-03-271-1/+32
| | | | | | | | | | | | Use the standard Linux trace mechanism to trace MADs being sent. 4 trace points are added, when the MAD is posted to the qp, when the MAD is completed, if a MAD is resent, and when the MAD completes in error. Reviewed-by: "Ruhl, Michael J" <michael.j.ruhl@intel.com> Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/mad: Convert ib_mad_clients to XArrayMatthew Wilcox2019-03-261-25/+14
| | | | | | | | | Pull the allocation function out into its own function to reduce the length of ib_register_mad_agent() a little and keep all the allocation logic together. Signed-off-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA: Add and use rdma_for_each_portJason Gunthorpe2019-02-191-2/+2
| | | | | | | | We have many loops iterating over all of the end port numbers on a struct ib_device, simplify them with a for_each helper. Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA: Start use ib_device_opsKamal Heib2018-12-121-12/+10
| | | | | | | Make all the required change to start use the ib_device_ops structure. Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/core: Annotate timeout as unsigned longLeon Romanovsky2018-10-161-1/+1
| | | | | | | | | | | | | | | The ucma users supply timeout in u32 format, it means that any number with most significant bit set will be converted to negative value by various rdma_*, cma_* and sa_query functions, which treat timeout as int. In the lowest level, the timeout is converted back to be unsigned long. Remove this ambiguous conversion by updating all function signatures to receive unsigned long. Reported-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* RDMA/core: Rate limit MAD error messagesParav Pandit2018-09-061-35/+37
| | | | | | | | | | | | | While registering a mad agent, a user space can trigger various errors and flood the logs. Therefore, decrease verbosity and rate limit such error messages. While we are at it, use __func__ to print function name. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/core: Fail early if unsupported QP is providedParav Pandit2018-09-061-0/+4
| | | | | | | | | | | When requested QP type is not supported for a {device, port}, return the error right away before validating all parameters during mad agent registration time. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/core: Add an unbound WQ type to the new CQ APIJack Morgenstein2018-09-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | The upstream kernel commit cited below modified the workqueue in the new CQ API to be bound to a specific CPU (instead of being unbound). This caused ALL users of the new CQ API to use the same bound WQ. Specifically, MAD handling was severely delayed when the CPU bound to the WQ was busy handling (higher priority) interrupts. This caused a delay in the MAD "heartbeat" response handling, which resulted in ports being incorrectly classified as "down". To fix this, add a new "unbound" WQ type to the new CQ API, so that users have the option to choose either a bound WQ or an unbound WQ. For MADs, choose the new "unbound" WQ. Fixes: b7363e67b23e ("IB/device: Convert ib-comp-wq to be CPU-bound") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.m> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/core: Simplify ib_post_(send|recv|srq_recv)() callsBart Van Assche2018-07-241-9/+5
| | | | | | | | Instead of declaring and passing a dummy 'bad_wr' pointer, pass NULL as third argument to ib_post_(send|recv|srq_recv)(). Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/mad: Use IDR for agent IDswilly@infradead.org2018-06-181-32/+51
| | | | | | | | | | | | | | | | | | | | | | | Allocate agent IDs from a global IDR instead of an atomic variable. This eliminates the possibility of reusing an ID which is already in use after 4 billion registrations. We limit the assigned ID to be less than 2^24 as the mlx4 driver uses the most significant byte of the agent ID to store the slave number. Users unlucky enough to see a collision between agent numbers and slave numbers see messages like: mlx4_ib: egress mad has non-null tid msb:1 class:4 slave:0 and the MAD layer stops working. We look up the agent under protection of the RCU lock, which means we have to free the agent using kfree_rcu, and only increment the reference counter if it is not 0. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Reported-by: Hans Westgaard Ry <hans.westgaard.ry@oracle.com> Acked-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Tested-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/mad: Agent registration is process context onlyMatthew Wilcox2018-06-181-7/+9
| | | | | | | | | | Document this (it's implicitly true due to sleeping operations already in use in both registration and deregistration). Use this fact to use spin_lock_irq instead of spin_lock_irqsave. This improves performance slightly. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB: Replace ib_query_gid/ib_get_cached_gid with rdma_query_gidParav Pandit2018-06-181-2/+2
| | | | | | | | | | | | | If the gid_attr argument is NULL then the functions behave identically to rdma_query_gid. ib_query_gid just calls ib_get_cached_gid, so everything can be consolidated to one function. Now that all callers either use rdma_query_gid() or ib_get_cached_gid(), ib_query_gid() API is removed. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* RDMA/mad: Convert BUG_ONs to error flowsLeon Romanovsky2018-06-011-4/+7
| | | | | | | Let's perform checks in-place instead of BUG_ONs. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* RDMA/mad: Delete inaccessible BUG_ONLeon Romanovsky2018-06-011-1/+0
| | | | | | | | There is no need to check existence of mad_queue, because we already did pointer dereference before call to dequeue_mad(). Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Make ib_mad_client_id atomicHåkon Bugge2018-04-301-2/+2
| | | | | | | | | | | | | | | | | | | Currently, the kernel protects access to the agent ID allocator on a per port basis using a spinlock, so it is impossible for two apps/threads on the same port to get the same TID, but it is entirely possible for two threads on different ports to end up with the same TID. As this can be confusing (regardless of it being legal according to the IB Spec 1.3, C13-18.1.1, in section 13.4.6.4 - TransactionID usage), and as the rdma-core user space API for /dev/umad devices implies unique TIDs even across ports, make the TID an atomic type so that no two allocations, regardless of port number, will be the same. Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* drivers: infiniband: remove duplicate includesPravin Shedge2017-12-221-1/+0
| | | | | | | | These duplicate includes have been found with scripts/checkincludes.pl but they have been removed manually to avoid removing false positives. Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
* IB/core: Avoid crash on pkey enforcement failed in received MADsParav Pandit2017-11-101-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Below kernel crash is observed when Pkey security enforcement fails on received MADs. This issue is reported in [1]. ib_free_recv_mad() accesses the rmpp_list, whose initialization is needed before accessing it. When security enformcent fails on received MADs, MAD processing avoided due to security checks failed. OpenSM[3770]: SM port is down kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 kernel: IP: ib_free_recv_mad+0x44/0xa0 [ib_core] kernel: PGD 0 kernel: P4D 0 kernel: kernel: Oops: 0002 [#1] SMP kernel: CPU: 0 PID: 2833 Comm: kworker/0:1H Tainted: P IO 4.13.4-1-pve #1 kernel: Hardware name: Dell XS23-TY3 /9CMP63, BIOS 1.71 09/17/2013 kernel: Workqueue: ib-comp-wq ib_cq_poll_work [ib_core] kernel: task: ffffa069c6541600 task.stack: ffffb9a729054000 kernel: RIP: 0010:ib_free_recv_mad+0x44/0xa0 [ib_core] kernel: RSP: 0018:ffffb9a729057d38 EFLAGS: 00010286 kernel: RAX: ffffa069cb138a48 RBX: ffffa069cb138a10 RCX: 0000000000000000 kernel: RDX: ffffb9a729057d38 RSI: 0000000000000000 RDI: ffffa069cb138a20 kernel: RBP: ffffb9a729057d60 R08: ffffa072d2d49800 R09: ffffa069cb138ae0 kernel: R10: ffffa069cb138ae0 R11: ffffa072b3994e00 R12: ffffb9a729057d38 kernel: R13: ffffa069d1c90000 R14: 0000000000000000 R15: ffffa069d1c90880 kernel: FS: 0000000000000000(0000) GS:ffffa069dba00000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: 0000000000000008 CR3: 00000011f51f2000 CR4: 00000000000006f0 kernel: Call Trace: kernel: ib_mad_recv_done+0x5cc/0xb50 [ib_core] kernel: __ib_process_cq+0x5c/0xb0 [ib_core] kernel: ib_cq_poll_work+0x20/0x60 [ib_core] kernel: process_one_work+0x1e9/0x410 kernel: worker_thread+0x4b/0x410 kernel: kthread+0x109/0x140 kernel: ? process_one_work+0x410/0x410 kernel: ? kthread_create_on_node+0x70/0x70 kernel: ? SyS_exit_group+0x14/0x20 kernel: ret_from_fork+0x25/0x30 kernel: RIP: ib_free_recv_mad+0x44/0xa0 [ib_core] RSP: ffffb9a729057d38 kernel: CR2: 0000000000000008 [1] : https://www.spinics.net/lists/linux-rdma/msg56190.html Fixes: 47a2b338fe63 ("IB/core: Enforce security on management datagrams") Cc: stable@vger.kernel.org # 4.13+ Signed-off-by: Parav Pandit <parav@mellanox.com> Reported-by: Chris Blake <chrisrblake93@gmail.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Enforce security on management datagramsDaniel Jurgens2017-05-231-8/+44
| | | | | | | | | | | | | | | | | | | | | | | | | Allocate and free a security context when creating and destroying a MAD agent. This context is used for controlling access to PKeys and sending and receiving SMPs. When sending or receiving a MAD check that the agent has permission to access the PKey for the Subnet Prefix of the port. During MAD and snoop agent registration for SMI QPs check that the calling process has permission to access the manage the subnet and register a callback with the LSM to be notified of policy changes. When notificaiton of a policy change occurs recheck permission and set a flag indicating sending and receiving SMPs is allowed. When sending and receiving MADs check that the agent has access to the SMI if it's on an SMI QP. Because security policy can change it's possible permission was allowed when creating the agent, but no longer is. Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Acked-by: Doug Ledford <dledford@redhat.com> [PM: remove the LSM hook init code] Signed-off-by: Paul Moore <paul@paul-moore.com>
* IB/core: Use rdma_ah_attr accessor functionsDasaratharaman Chandramouli2017-05-011-8/+13
| | | | | | | | | | | | Modify core and driver components to use accessor functions introduced to access individual fields of rdma_ah_attr Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Rename ib_query_ah to rdma_query_ahDasaratharaman Chandramouli2017-05-011-1/+1
| | | | | | | | | | | Rename ib_query_ah to rdma_query_ah so its in sync with the rename of the ib address handle attribute Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: Rename struct ib_ah_attr to rdma_ah_attrDasaratharaman Chandramouli2017-05-011-1/+1
| | | | | | | | | | | | | This patch simply renames struct ib_ah_attr to rdma_ah_attr as these fields specify attributes that are not necessarily specific to IB. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Don Hiatt <don.hiatt@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: change the return type to voidZhu Yanjun2017-04-251-2/+1
| | | | | | | | | | | | | The function ib_unregister_mad_agent always returns zero. And this returned value is not checked. As such, chane the return type to void. CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/mad: Add port_num to error messageYuval Shaia2017-01-241-1/+3
| | | | | | | | | | Print the invalid port number to ease troubleshooting. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
*-. Merge branches 'misc', 'qedr', 'reject-helpers', 'rxe' and 'srp' into merge-testDoug Ledford2016-12-141-3/+3
|\ \
| * | IB/mad: Fix an array index checkBart Van Assche2016-12-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The array ib_mad_mgmt_class_table.method_table has MAX_MGMT_CLASS (80) elements. Hence compare the array index with that value instead of with IB_MGMT_MAX_METHODS (128). This patch avoids that Coverity reports the following: Overrunning array class->method_table of 80 8-byte elements at element index 127 (byte offset 1016) using index convert_mgmt_class(mad_hdr->mgmt_class) (which evaluates to 127). Fixes: commit b7ab0b19a85f ("IB/mad: Verify mgmt class in received MADs") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: <stable@vger.kernel.org> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | IB/mad: Eliminate redundant SM class version defines for OPAHal Rosenstock2016-12-141-2/+2
| |/ | | | | | | | | | | | | | | and rename class version define to indicate SM rather than SMP or SMI Signed-off-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* / IB/mad: Remove debug prints after allocation failureLeon Romanovsky2016-12-031-34/+6
|/ | | | | | | | | The prints after [k|v][m|z|c]alloc() functions are not needed, because in case of failure, allocator will print their internal error prints anyway. Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/mad: Remove deprecated create_singlethread_workqueueBhaktipriya Shridhar2016-10-071-1/+1
| | | | | | | | | | | The workqueue "ib_nl" queues work items &ib_nl_timed_work and &mad_agent_priv->local_work. It has been identity converted. WQ_MEM_RECLAIM has been set to ensure forward progress under memory pressure. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/core: add support to create a unsafe global rkey to ib_create_pdChristoph Hellwig2016-09-231-1/+1
| | | | | | | | | | | | | Instead of exposing ib_get_dma_mr to ULPs and letting them use it more or less unchecked, this moves the capability of creating a global rkey into the RDMA core, where it can be easily audited. It also prints a warning everytime this feature is used as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/mad: Fix indentationBart Van Assche2016-06-061-3/+3
| | | | | | | | | | | | Make indentation consistent. Detected by smatch. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hal Rosenstock <hal@mellanox.com> Cc: Ira Weiny <ira.weiny@intel.com> Reviewed-By: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/MAD: Integrate ib_mad module into ib_core moduleMark Bloch2016-05-241-10/+3
| | | | | | | | Consolidate ib_mad into ib_core, this commit eliminates ib_mad.ko and makes it part of ib_core.ko Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/mad: use CQ abstractionChristoph Hellwig2016-01-191-104/+58
| | | | | | | | | | | Remove the local workqueue to process mad completions and use the CQ API instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/mad: pass ib_mad_send_buf explicitly to the recv_handlerChristoph Hellwig2016-01-191-8/+10
| | | | | | | | | | Stop abusing wr_id and just pass the parameter explicitly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/mad: Ensure fairness in ib_mad_completion_handlerDean Luick2015-12-241-0/+18
| | | | | | | | | | | | | | | | | | | | It was found that when a process was rapidly sending MADs other processes could be hung in their unregister calls. This would happen when process A was injecting packets fast enough that the single threaded workqueue was never exiting ib_mad_completion_handler. Therefore when process B called flush_workqueue via the unregister call it would hang until process A stopped sending MADs. The fix is to periodically reschedule ib_mad_completion_handler after processing a large number of completions. The number of completions chosen was decided based on the defaults for the recv queue size. However, it was kept fixed such that increasing those queue sizes would not adversely affect fairness in the future. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/mad: Require CM send method for everything except ClassPortInfoHal Rosenstock2015-12-081-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | Receipt of CM MAD with other than the Send method for an attribute other than the ClassPortInfo attribute is invalid. CM attributes other than ClassPortInfo only use the send method. The SRP initiator does not maintain a timeout policy for CM connect requests relies on the CM layer to do that. The result was that the SRP initiator hung as the connect request never completed. A new SRP target has been observed to respond to Send CM REQ with GetResp of CM REQ with bad status. This is non conformant with IBA spec but exposes a vulnerability in the current MAD/CM code which will respond to the incoming GetResp of CM REQ as if it was a valid incoming Send of CM REQ rather than tossing this on the floor. It also causes the MAD layer not to retransmit the original REQ even though it has not received a REP. Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* Merge branch 'wr-cleanup' into k.o/for-4.4Doug Ledford2015-10-281-20/+20
|\
| * IB: split struct ib_send_wrChristoph Hellwig2015-10-081-20/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch split up struct ib_send_wr so that all non-trivial verbs use their own structure which embedds struct ib_send_wr. This dramaticly shrinks the size of a WR for most common operations: sizeof(struct ib_send_wr) (old): 96 sizeof(struct ib_send_wr): 48 sizeof(struct ib_rdma_wr): 64 sizeof(struct ib_atomic_wr): 96 sizeof(struct ib_ud_wr): 88 sizeof(struct ib_fast_reg_wr): 88 sizeof(struct ib_bind_mw_wr): 96 sizeof(struct ib_sig_handover_wr): 80 And with Sagi's pending MR rework the fast registration WR will also be down to a reasonable size: sizeof(struct ib_fastreg_wr): 64 Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [srp, srpt] Reviewed-by: Chuck Lever <chuck.lever@oracle.com> [sunrpc] Tested-by: Haggai Eran <haggaie@mellanox.com> Tested-by: Sagi Grimberg <sagig@mellanox.com> Tested-by: Steve Wise <swise@opengridcomputing.com>